Module 01 — The Analysis Pipeline | EUDR Forest Analyzer Course

Trace an Upload from Browser to Verdict

Follow the journey of a GeoJSON file from the moment a user clicks "Upload" to the final compliance verdict. Each step below maps to a real stage in the backend code.

Step 1

Upload a Geospatial File

The user sends a file to POST /api/analysis/upload. The FileProcessor takes over: it validates the geometry, auto-repairs self-intersecting polygons using make_valid(), extracts individual features from the collection, and calculates each plot's area in hectares.

Supported formats include GeoJSON, KML, and Shapefile (ZIP). The parsed geometries are stored in a PostGIS database with full coordinate reference system metadata.

Expected result: Upload ID returned, plots stored in PostGIS database.
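
The area calculation can be illustrated without any GIS dependency. The sketch below is not the FileProcessor's actual code: it assumes plots arrive as GeoJSON Polygon features in longitude/latitude order and approximates area on a sphere. The names ring_area_m2 and plot_areas_ha are hypothetical, and a production pipeline would more likely delegate this to PostGIS or a projection library.

```python
import math

EARTH_RADIUS_M = 6_378_137.0  # WGS84 equatorial radius, used as a spherical approximation


def ring_area_m2(ring):
    """Approximate area (m^2) enclosed by a closed ring of [lon, lat] positions."""
    total = 0.0
    for p1, p2 in zip(ring, ring[1:]):
        lon1, lat1 = math.radians(p1[0]), math.radians(p1[1])
        lon2, lat2 = math.radians(p2[0]), math.radians(p2[1])
        total += (lon2 - lon1) * (2 + math.sin(lat1) + math.sin(lat2))
    return abs(total) * EARTH_RADIUS_M ** 2 / 2


def plot_areas_ha(feature_collection):
    """Return one area in hectares per Polygon feature, subtracting any holes."""
    areas = []
    for feature in feature_collection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] != "Polygon":
            continue  # MultiPolygon and other types are out of scope for this sketch
        exterior, *holes = geometry["coordinates"]
        area_m2 = ring_area_m2(exterior) - sum(ring_area_m2(h) for h in holes)
        areas.append(area_m2 / 10_000)  # 1 ha = 10,000 m^2
    return areas


# A 0.001 x 0.001 degree square at the equator, roughly 111 m on each side.
square = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {},
        "geometry": {
            "type": "Polygon",
            "coordinates": [[[0, 0], [0.001, 0], [0.001, 0.001], [0, 0.001], [0, 0]]],
        },
    }],
}
areas = plot_areas_ha(square)
```

For the sample square this yields a little over 1.2 hectares, which matches the expected footprint of a plot about 111 m on each side.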

Step 2

Trigger the Analysis

The client calls POST /api/analysis/analyze/{upload_id}. The system creates an instance of ForestAnalyzerWithAlerts, the main analysis engine that coordinates all downstream checks.

First, it detects the country from the plot's coordinates using bounding-box lookups or reverse geocoding. The country determines the baseline risk level — high-risk countries (Brazil, Indonesia, DRC, etc.) start with a +20 point penalty in the risk score.

Expected result: Analysis initiated, country identified, baseline risk level assigned.
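
A bounding-box lookup of the kind described above might look like the following sketch. The box coordinates are rough, illustrative approximations rather than the system's lookup table, and the function names are hypothetical. Because neighboring countries' boxes overlap, the sketch returns None whenever the match is not unique, which is where a reverse-geocoding fallback would take over.

```python
# Rough, illustrative bounding boxes as (min_lon, min_lat, max_lon, max_lat).
# These are approximations for demonstration only.
COUNTRY_BOXES = {
    "BR": (-74.0, -33.8, -34.7, 5.3),
    "ID": (95.0, -11.0, 141.0, 6.1),
}
HIGH_RISK = {"BR", "ID"}  # subset of the high-risk countries named in this module


def centroid(ring):
    """Mean of the ring's distinct vertices (the repeated closing vertex is skipped)."""
    points = ring[:-1]
    lon = sum(p[0] for p in points) / len(points)
    lat = sum(p[1] for p in points) / len(points)
    return lon, lat


def detect_country(ring):
    """Return a country code if exactly one box contains the centroid, else None."""
    lon, lat = centroid(ring)
    matches = [
        code
        for code, (min_lon, min_lat, max_lon, max_lat) in COUNTRY_BOXES.items()
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
    ]
    # None signals "ambiguous or unknown": fall back to reverse geocoding.
    return matches[0] if len(matches) == 1 else None


def baseline_penalty(country_code):
    """+20 for high-risk countries, 0 otherwise (low-risk handling omitted here)."""
    return 20 if country_code in HIGH_RISK else 0


plot_ring = [[-60.0, -3.0], [-59.99, -3.0], [-59.99, -2.99], [-60.0, -2.99], [-60.0, -3.0]]
country = detect_country(plot_ring)
penalty = baseline_penalty(country)
```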

Step 3

Query Forest Coverage

The system queries Google Earth Engine (GEE) for Hansen Global Forest Change data (dataset UMD/hansen/global_forest_change_2023_v1_11). It compares the 2020 baseline forest cover against the current state and calculates the percentage of forest lost within the plot boundary.

This is a pixel-level analysis at 30-meter resolution. Each pixel encodes the year of tree cover loss (2001–2023), allowing the system to distinguish pre-cutoff loss from post-cutoff loss.

Expected result: Forest coverage percentage calculated, loss areas identified with year-of-loss attribution.
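
The year-of-loss attribution can be sketched on a plain list of pixel values. This is an illustration only: it assumes the pixels inside the plot have already been extracted, that they use the Hansen loss-year encoding (0 for no loss, 1 to 23 for 2001 to 2023), and that percentages are taken relative to all pixels in the plot. The function name and the nominal 0.09 ha pixel area are assumptions, not values from the backend.

```python
CUTOFF_YEAR = 2020      # the EUDR cutoff is 31 December 2020
PIXEL_AREA_HA = 0.09    # nominal area of one 30 m x 30 m pixel


def summarize_loss(loss_year_codes):
    """Split loss pixels into pre- and post-cutoff groups.

    Codes: 0 = no loss, 1..23 = loss first detected in 2001..2023.
    """
    total = len(loss_year_codes)
    if total == 0:
        raise ValueError("plot contains no pixels")
    pre = sum(1 for c in loss_year_codes if c > 0 and 2000 + c <= CUTOFF_YEAR)
    post = sum(1 for c in loss_year_codes if 2000 + c > CUTOFF_YEAR)
    return {
        "pre_cutoff_loss_pct": 100.0 * pre / total,
        "post_cutoff_loss_pct": 100.0 * post / total,
        "post_cutoff_loss_ha": post * PIXEL_AREA_HA,
    }


# 100 pixels: 90 intact, 6 lost in 2015, 4 lost in 2022.
pixels = [0] * 90 + [15] * 6 + [22] * 4
summary = summarize_loss(pixels)
```

Only the four pixels lost in 2022 count toward post-cutoff loss here; the six lost in 2015 are reported separately.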

Step 4

Fetch Deforestation Alerts

Two independent alert systems are queried concurrently using asyncio.gather():

GLAD — Optical detection via Landsat imagery at 30m resolution. Global coverage since 2001.
RADD — Radar detection via Sentinel-1 SAR at 10m resolution. Works through cloud cover, day or night. Covers humid tropics since 2019.

Because the two queries hit independent GEE datasets, running them concurrently brings the wait down to roughly the duration of the slower query rather than the sum of both. When both systems detect loss in the same area, confidence is significantly higher.

Expected result: GLAD and RADD alert data retrieved, cross-validated where overlapping.

Step 5

Calculate Risk Score

The risk score is a composite value out of 100, built from five components:

Country risk: +20 (high-risk), 0 (standard), or -20 (low-risk)
Forest loss: up to +10 points based on loss percentage
Post-cutoff alert flag: +25 points if any GLAD or RADD alert is dated after 2020-12-31
Post-cutoff alert area: up to +25 points, scaled by the area of detected deforestation
Data uncertainty: +5 points for measurement and model error margin

A score above 70 flags the plot as non-compliant.

Expected result: Numeric risk score computed, ready for verdict determination.
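
The scoring arithmetic can be sketched as a single function. The point values come from the list above; everything else is an assumption made for illustration: the linear scaling, the saturation parameters loss_cap_pct and area_cap_ha, the clamping to the 0 to 100 range, and the function name itself.

```python
COUNTRY_POINTS = {"high": 20, "standard": 0, "low": -20}


def risk_score(country_risk, loss_pct, post_cutoff_alert, alert_area_ha,
               loss_cap_pct=10.0, area_cap_ha=5.0):
    """Combine the documented components into a score clamped to 0..100.

    loss_cap_pct and area_cap_ha are hypothetical saturation points at which
    the scaled components reach their maximum.
    """
    score = COUNTRY_POINTS[country_risk]
    score += 10 * min(loss_pct / loss_cap_pct, 1.0)             # forest loss, up to +10
    if post_cutoff_alert:
        score += 25                                             # post-cutoff alert flag
        score += 25 * min(alert_area_ha / area_cap_ha, 1.0)     # alert area, up to +25
    score += 5                                                  # data uncertainty
    return max(0.0, min(100.0, float(score)))


worst_case = risk_score("high", loss_pct=10.0, post_cutoff_alert=True, alert_area_ha=5.0)
quiet_plot = risk_score("standard", loss_pct=1.0, post_cutoff_alert=False, alert_area_ha=0.0)
```

With every component at its maximum, the listed points add up to 20 + 10 + 25 + 25 + 5 = 85.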

Step 6

Deliver the Verdict

The final ComplianceStatus is determined:

COMPLIANT — all checks pass, no post-cutoff deforestation detected
NON_COMPLIANT — one or more thresholds exceeded (forest loss > 2%, risk score > 70, or post-cutoff alerts found)
NEEDS_REVIEW — borderline results that require human judgment

Results are persisted to the analysis_results table and alert records to deforestation_alerts. An enhanced PDF report is generated with satellite imagery, NDVI change maps, and land cover classification.

Expected result: Compliance verdict stored, PDF report available for download.
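
The verdict logic can be sketched as follows. The three status names and the NON_COMPLIANT thresholds come from this step; the source does not define "borderline", so the REVIEW_MARGIN rule used for NEEDS_REVIEW is purely a hypothetical placeholder, as is the function name.

```python
from enum import Enum


class ComplianceStatus(Enum):
    COMPLIANT = "COMPLIANT"
    NON_COMPLIANT = "NON_COMPLIANT"
    NEEDS_REVIEW = "NEEDS_REVIEW"


LOSS_THRESHOLD_PCT = 2.0
RISK_THRESHOLD = 70.0
REVIEW_MARGIN = 0.1  # hypothetical: values within 10% below a threshold are borderline


def determine_status(forest_loss_pct, score, has_post_cutoff_alert):
    """Apply the three checks; any single failure yields NON_COMPLIANT."""
    if (has_post_cutoff_alert
            or forest_loss_pct > LOSS_THRESHOLD_PCT
            or score > RISK_THRESHOLD):
        return ComplianceStatus.NON_COMPLIANT
    borderline = (
        forest_loss_pct >= LOSS_THRESHOLD_PCT * (1 - REVIEW_MARGIN)
        or score >= RISK_THRESHOLD * (1 - REVIEW_MARGIN)
    )
    return ComplianceStatus.NEEDS_REVIEW if borderline else ComplianceStatus.COMPLIANT


clean = determine_status(0.5, 30.0, False)
alerted = determine_status(0.5, 30.0, True)
near_limit = determine_status(1.9, 30.0, False)
```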

Common Tasks

How to upload and analyze a plot

Upload your geospatial file (GeoJSON, KML, or Shapefile ZIP) via POST /api/analysis/upload. The response includes an upload_id.
Trigger the analysis by calling POST /api/analysis/analyze/{upload_id}. This starts the full pipeline: forest coverage, GLAD alerts, RADD alerts, and risk scoring.
Poll for completion with GET /api/analysis/status/{analysis_id} until the status is completed.
Retrieve the results with GET /api/analysis/results/{analysis_id}. The response includes compliance status, risk score, forest loss percentage, and alert details.
Download the PDF report from GET /api/reports/enhanced-pdf/{upload_id} for a full visual report with satellite imagery.
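
The polling step above can be wrapped in a small helper. This is a sketch under assumptions: fetch_status is a hypothetical callable that would issue GET /api/analysis/status/{analysis_id} and return the status string, and the interval and timeout defaults are arbitrary. Only the "completed" value comes from the steps above; the intermediate status names in the demo are invented for the stub.

```python
import time


def poll_until_completed(fetch_status, interval_s=2.0, timeout_s=300.0, sleep=time.sleep):
    """Call fetch_status() until it returns 'completed' or the timeout elapses.

    Returns the number of status checks that were made.
    """
    deadline = time.monotonic() + timeout_s
    checks = 0
    while True:
        checks += 1
        if fetch_status() == "completed":
            return checks
        if time.monotonic() >= deadline:
            raise TimeoutError(f"analysis not completed after {timeout_s} s")
        sleep(interval_s)


# Stub standing in for the real HTTP call; the first two values are made up.
_responses = iter(["queued", "running", "completed"])
checks_made = poll_until_completed(lambda: next(_responses), sleep=lambda seconds: None)
```

Passing sleep as a parameter keeps the helper testable without real delays; a real client would leave the default in place.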

How to check if a plot is compliant

Look at the compliance_status field in the analysis results. It will be one of: COMPLIANT, NON_COMPLIANT, or NEEDS_REVIEW.
Check the risk_score value. Anything above 70 indicates non-compliance.
Inspect the alerts array for any entries with dates after the EUDR cutoff (2020-12-31). A single post-cutoff alert is sufficient to trigger NON_COMPLIANT status.
Review the forest_loss_percentage. Loss exceeding 2% of the plot area also triggers non-compliance.
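
The checks above can be combined into a helper that explains why a result failed. The field names compliance_status, risk_score, alerts, and forest_loss_percentage are taken from the steps above; the shape of each alert entry, with an ISO-formatted "date" key, is an assumption, as is the function name.

```python
from datetime import date

EUDR_CUTOFF = date(2020, 12, 31)


def non_compliance_reasons(result):
    """Return human-readable reasons a result would be NON_COMPLIANT (empty if none)."""
    reasons = []
    if result["risk_score"] > 70:
        reasons.append("risk score above 70")
    if result["forest_loss_percentage"] > 2:
        reasons.append("forest loss above 2% of plot area")
    # Assumes each alert carries an ISO date string under the key "date".
    post_cutoff = [
        alert for alert in result["alerts"]
        if date.fromisoformat(alert["date"]) > EUDR_CUTOFF
    ]
    if post_cutoff:
        reasons.append(f"{len(post_cutoff)} alert(s) after 2020-12-31")
    return reasons


sample_result = {
    "compliance_status": "NON_COMPLIANT",
    "risk_score": 42,
    "forest_loss_percentage": 0.8,
    "alerts": [{"date": "2019-06-01"}, {"date": "2021-03-15"}],
}
reasons = non_compliance_reasons(sample_result)
```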

How to use the async queue for large batches

Submit the plot for asynchronous processing via POST /api/queue/submit. The response includes a queue_id.
Poll the job status at GET /api/queue/status/{queue_id}. Status progresses through: PENDING → PROCESSING → COMPLETED (or FAILED).
When the job completes, results are automatically emailed to the registered user with a PDF download link.
For batch jobs, a combined summary email is sent with individual PDF attachments (Enterprise tier) or a single summary PDF (Free tier).
Check overall queue statistics at GET /api/queue/stats to monitor throughput and pending jobs.
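
A client that tracks queue jobs may want to know when to stop polling. The sketch below models the status progression described above as a small state table. The four status names come from the steps above; the assumption that FAILED is only reachable from PROCESSING, and the helper names, are illustrative.

```python
from enum import Enum


class QueueStatus(Enum):
    PENDING = "PENDING"
    PROCESSING = "PROCESSING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


# Assumed transition table: PENDING -> PROCESSING -> COMPLETED or FAILED.
ALLOWED_TRANSITIONS = {
    QueueStatus.PENDING: {QueueStatus.PROCESSING},
    QueueStatus.PROCESSING: {QueueStatus.COMPLETED, QueueStatus.FAILED},
    QueueStatus.COMPLETED: set(),
    QueueStatus.FAILED: set(),
}


def is_terminal(status):
    """A job in a terminal state will not change again, so polling can stop."""
    return not ALLOWED_TRANSITIONS[status]


def can_transition(current, new):
    """True if moving from current to new follows the documented progression."""
    return new in ALLOWED_TRANSITIONS[current]


stop_polling = is_terminal(QueueStatus("COMPLETED"))
```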

Why the Pipeline Works This Way

Defense in Depth

The system does not rely on a single metric. Instead, three independent checks must all pass for a plot to be considered compliant:

Forest loss percentage — quantitative measure of canopy change
Risk score threshold — composite score incorporating country risk, alert severity, and data quality
Post-cutoff alert check — any deforestation detected after 31 December 2020

🔒

Think of it like a bank vault

A modern bank vault has three independent locks: a combination dial, a physical key, and a biometric scanner. Each one independently prevents unauthorized access. Even if one mechanism is compromised or malfunctions, the other two still protect the contents. The analysis pipeline works the same way — forest loss percentage, risk score, and post-cutoff alerts are three independent barriers. A plot must clear all three to be deemed compliant.

Why Concurrent Alert Fetching Matters

GLAD and RADD query entirely independent datasets on GEE. Neither depends on the other's result. This makes them a textbook case for concurrency.

Python

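import asyncio

async def analyze_alerts(geometry):
    glad_task = get_glad_alerts(geometry)  # coroutine object; nothing runs yet
    radd_task = get_radd_alerts(geometry)  # coroutine object; nothing runs yet

    glad, radd = await asyncio.gather(  # schedule both and wait for both results
        glad_task,
        radd_task
    )
    return glad, radd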

What this does

Lines 4–5: Calling the two async functions creates two coroutine objects, one for GLAD (optical, Landsat) and one for RADD (radar, Sentinel-1). Neither starts executing yet.

Lines 7–10: asyncio.gather() schedules both coroutines on the event loop and waits for both to finish. While GLAD waits for its GEE response, RADD can be processing its request (and vice versa).

Result: Total wall-clock time is roughly max(glad_time, radd_time) instead of glad_time + radd_time. For typical GEE queries of 3–5 seconds each, this saves 3–5 seconds per analysis.
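
The timing claim can be checked with a self-contained demo. The two stub coroutines below are stand-ins for the real GLAD and RADD queries, with asyncio.sleep() simulating network latency; their names, delays, and return values are made up for illustration.

```python
import asyncio
import time


async def fake_glad_alerts(geometry):
    await asyncio.sleep(0.2)  # simulated GEE latency
    return {"source": "GLAD", "alerts": []}


async def fake_radd_alerts(geometry):
    await asyncio.sleep(0.3)  # simulated GEE latency
    return {"source": "RADD", "alerts": []}


async def timed_gather(geometry):
    """Run both stubs concurrently and report the elapsed wall-clock time."""
    start = time.perf_counter()
    glad, radd = await asyncio.gather(
        fake_glad_alerts(geometry),
        fake_radd_alerts(geometry),
    )
    return glad, radd, time.perf_counter() - start


glad, radd, elapsed = asyncio.run(timed_gather({"type": "Polygon", "coordinates": []}))
print(f"elapsed: {elapsed:.2f} s")
```

The elapsed time should land close to the slower stub's 0.3 seconds rather than the 0.5 seconds a sequential run would take. Results come back in the order the coroutines were passed to gather(), regardless of which finished first.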

The EUDR Cutoff Date

The entire regulation hinges on a single date: 31 December 2020. Under EU Regulation 2023/1115, commodities placed on the EU market must not have been produced on land that was deforested after this cutoff.

Any deforestation detected after this date — whether by GLAD's optical analysis or RADD's radar detection — automatically renders the plot NON_COMPLIANT. Deforestation that occurred before the cutoff is noted in the report but does not affect compliance status.

This binary rule is why the post-cutoff alert check exists as an independent barrier in the pipeline. Even if forest loss does not exceed 2% and the risk score does not exceed 70, a single confirmed alert dated 1 January 2021 or later is enough to fail the plot.

Quick Reference

Compliance Thresholds

Check	Threshold	Effect
Forest loss	`> 2%`	NON_COMPLIANT
Risk score	`> 70`	NON_COMPLIANT
Alert after cutoff	`after 2020-12-31`	NON_COMPLIANT
All checks pass	—	COMPLIANT

API Endpoints

Method	Path	Description
`POST`	`/api/analysis/upload`	Upload a geospatial file (GeoJSON, KML, Shapefile)
`POST`	`/api/analysis/analyze/{id}`	Start EUDR compliance analysis for an upload
`GET`	`/api/analysis/status/{id}`	Check analysis completion status
`GET`	`/api/analysis/results/{id}`	Retrieve full analysis results

Risk Score Components

Factor	Points	Condition
Country (high-risk)	`+20`	Plot located in BR, ID, CD, PE, CO, BO, VE, or MY
Country (standard)	`0`	Country not classified as high or low risk
Country (low-risk)	`-20`	Plot in a country with strong forest governance
Post-cutoff alert	`+25`	Any GLAD or RADD alert dated after 2020-12-31
Alert area factor	`up to +25`	Scaled by area of detected deforestation
Forest loss	`up to +10`	Scaled by the percentage of canopy loss within the plot
Data uncertainty	`+5`	Margin for satellite measurement and model error
