Pixel Phenology Pipeline
Prerequisite: This pipeline reads datacubes produced by the netCDF Datacube Pipeline. Run that pipeline first to produce
*_datacube.ncfiles, then pointPIXEL_INPUT_DATACUBESat those files or their parent directory.
The pixel_phenology pipeline reads those per-pixel datacubes and computes 19
phenological metrics for every pixel on the same spatial grid. The output is one
CF-1.8 NetCDF per (VI, region) containing one spatial layer per metric, plus a
summary CSV with spatial statistics per metric.
When to Use This Pipeline
Use the pixel phenology pipeline when you need:
Spatially explicit phenological metric maps (one value per 30-m pixel per metric)
Per-pixel interannual variability (std, CV, peak range) across a multi-year record
Bimodality detection (n_peaks, peak separation, valley depth) mapped across a region
Use the phenology pipeline when you need ROI-mean time series, smoothed curves aggregated over a region, or interactive HTML plots.
Selecting the Pipeline
This is a two-step workflow. Set PIPELINE in config.local.env:
# Step 1 — produce the input datacubes:
PIPELINE="netcdf_datacube"
# Step 2 — compute per-pixel metric maps from those datacubes:
PIPELINE="pixel_phenology"
# Separate workflow — ROI-mean time series, metrics, plots:
PIPELINE="phenology"
See Overview — Setup for the full config file model.
Processing Model
For each input datacube ({VI}_{region_label}_datacube.nc):
1. Open with xarray (lazy); apply optional --start-date / --end-date filter
Warn if uncompressed array size > 8 GB
2. Load full (time, y, x) array into RAM as float32
3. Precompute λ D^T D Whittaker penalty matrix once
(all pixels share the same n_days — computed once, reused for every pixel)
4. ThreadPoolExecutor: dispatch y-row chunks (50 rows per chunk)
Each thread, for every pixel in its chunk:
a. Apply valid-range mask (vi_min, vi_max → NaN)
b. Skip pixel if valid obs < min_valid_obs → all metrics = NaN
c. Map valid observations onto the daily grid (NaN → weight 0)
d. Solve (W + λ D^T D) z = W y [Whittaker smooth]
e. For each year with ≥ min_valid_obs_per_year observations:
peak NDVI + DOY, integrated NDVI, floor, ceiling,
green-up rate, season length, bimodality
f. Aggregate across years: mean / std for annual metrics
g. Compute whole-series metrics: CV, interannual peak range + std
5. Assemble (19, n_y, n_x) output array from all chunks
6. Write {VI}_{region_label}_pixel_metrics.nc (CF-1.8, zlib complevel=4)
7. Write {VI}_{region_label}_pixel_metrics_summary.csv
Threading rationale: scipy.sparse.linalg.spsolve releases the Python GIL, so
ThreadPoolExecutor achieves true multi-core parallelism while all threads share the
single in-memory array without inter-process serialisation overhead.
Memory note: The full (time, y, x) float32 array is loaded into RAM once. Use
--start-date / --end-date to reduce the time axis if the datacube is large.
Input
PIXEL_INPUT_DATACUBES / --input-datacubes accepts two forms:
Form |
Example |
|---|---|
Directory — all |
|
File path(s) — space-separated list of individual files |
|
When a directory is given, files are discovered with find … -name "*_datacube.nc" | sort
and an error is raised if none are found.
VI and region_label are parsed from each filename:
{VI}_{region_label}_datacube.nc → first underscore-separated token = VI, remainder = region_label.
Whittaker Smoothing
The pipeline uses the Whittaker smoother exclusively. The penalty matrix
λ D^T D is precomputed once for the full time axis and reused for every pixel,
making per-pixel solve cost dominated by the sparse linear system rather than matrix
construction.
The smoother solves:
(W + λ D^T D) z = W y
where W is the diagonal weight matrix (1 = observed, 0 = gap), D is the
2nd-order difference matrix, and λ is the smoothing strength.
|
CLI |
Default |
Effect |
|---|---|---|---|
|
— |
— |
Tight; follows observations closely |
|
— |
✓ default |
Balanced smoothing |
|
— |
— |
Very smooth; coarse biome-level characterisation |
If the solver fails for a pixel, that pixel falls back to a zero-filled result and its metrics are set to NaN.
Metrics (19)
All metrics are derived from the per-pixel Whittaker-smoothed daily series. Floor and ceiling are computed from the curve’s annual minimum and maximum — no biome-specific DOY windows are used.
Annual metrics (mean + std aggregated across years)
Metric band |
Description |
|---|---|
|
Mean annual peak VI value |
|
Interannual std of peak VI |
|
Mean day-of-year of annual peak |
|
Interannual std of peak DOY |
|
Mean integrated VI (trapezoidal, NDVI-days per year) |
|
Interannual std of integrated VI |
|
Mean green-up rate: |
|
Interannual std of green-up rate |
|
Mean annual curve minimum (dry-season trough) |
|
Mean annual curve maximum (= peak VI) |
|
Mean days above |
|
Interannual std of season length |
|
Mean number of peaks per year detected by |
|
Mean DOY distance between the two tallest peaks (NaN if n_peaks < 2) |
|
Mean |
|
Mean normalised trough depth: |
Whole-series metrics (single value per pixel)
Metric band |
Description |
|---|---|
|
Coefficient of variation of raw (unsmoothed) valid observations across the full record |
|
|
|
Std of annual peak VI across all years |
Observation thresholds
Parameter |
|
CLI flag |
Default |
Effect |
|---|---|---|---|---|
Total obs |
|
|
|
Pixel skipped entirely (all metrics = NaN) if total valid obs < N |
Per-year obs |
|
|
|
Annual window skipped if obs < N; does not contribute to mean/std |
Bimodality parameters
Parameter |
|
CLI flag |
Default |
Description |
|---|---|---|---|---|
Peak prominence |
|
|
|
Minimum NDVI prominence for a peak to be counted |
Peak min distance |
|
|
|
Minimum separation in days between detected peaks |
Season threshold |
|
|
|
Amplitude fraction above floor for season-length calculation |
Output
File Naming
File |
Location |
|---|---|
|
|
|
|
|
|
|
|
|
|
The overview PNG and HTML are generated by default. Disable individually via PIXEL_OVERVIEW_FIGURE=false / PIXEL_OVERVIEW_HTML=false in config.local.env, or --no-overview-figure / --no-overview-html on the CLI.
File Structure Example
pixel_metrics/ ← PIXEL_OUTPUT_DIR
├── pixel_phenology_20260320_153100.log
├── G5_1/
│ ├── NDVI_G5_1_pixel_metrics.nc
│ ├── NDVI_G5_1_pixel_metrics_summary.csv
│ ├── NDVI_G5_1_pixel_metrics_overview.png
│ └── NDVI_G5_1_pixel_metrics_overview.html
└── G5_12/
├── NDVI_G5_12_pixel_metrics.nc
├── NDVI_G5_12_pixel_metrics_summary.csv
├── NDVI_G5_12_pixel_metrics_overview.png
└── NDVI_G5_12_pixel_metrics_overview.html
Output NetCDF Structure
Each output file is a CF-1.8 compliant netCDF4 with:
Dimensions:
y— northing in meters (UTM), same grid as input datacubex— easting in meters (UTM), same grid as input datacube
Variables:
One float32
(y, x)variable per metric — 19 total, zlib-compressed (level 4)spatial_ref— scalar CRS container (copied from input datacube)
Global attributes:
Attribute |
Description |
|---|---|
|
|
|
Creation timestamp |
|
Region label |
|
VI name (e.g. |
|
Absolute path to the input datacube |
|
λ value used for smoothing |
|
Bimodality prominence threshold |
|
Bimodality minimum peak separation |
|
Season-length amplitude fraction |
|
Minimum obs threshold used |
|
Present only when |
Summary CSV
{VI}_{region_label}_pixel_metrics_summary.csv contains one row per metric with spatial
statistics computed across all non-NaN pixels:
Column |
Description |
|---|---|
|
Metric name |
|
Spatial mean across valid pixels |
|
Spatial std |
|
5th percentile |
|
Median |
|
95th percentile |
|
Count of pixels with a non-NaN value for this metric |
CLI Reference
python src/pixel_phenology_extract.py --help
Argument |
|
Default |
Description |
|---|---|---|---|
|
|
(required) |
Path(s) to |
|
|
|
Root output directory |
|
|
|
Whittaker smoothing strength λ (10–1000; higher = smoother) |
|
|
|
Min valid obs over full record; fewer → pixel set to NaN |
|
|
|
Min valid obs per annual window; fewer → that year skipped |
|
|
|
Min NDVI prominence for bimodality peak detection |
|
|
|
Min separation in days between detected peaks |
|
|
|
Amplitude fraction above floor for season-length calculation |
|
|
|
Valid range for NDVI |
|
|
|
Valid range for EVI2 |
|
|
|
Valid range for NIRv |
|
|
|
Number of parallel threads for pixel processing |
|
|
— |
Include only time steps on or after this date |
|
|
— |
Include only time steps on or before this date |
|
|
(overview generated) |
Skip the print-quality 19-panel overview PNG |
|
|
(overview generated) |
Skip the interactive Plotly HTML overview |
|
— |
|
|