Architecture & Extension
The full picture and how you'd modify this system.
See the Full Architecture
This diagram shows every major component of the EUDR Forest Analyzer and how they connect. Components are grouped into four tiers: Browser, API Layer (FastAPI routers), Services (business logic), and External Services.
Key Design Decisions
Provider Abstraction
All satellite data access goes through a factory pattern. Switching from Google Earth Engine to Copernicus means changing one environment variable, not rewriting services.
Async Queue
Long-running analyses are offloaded to a background worker via a database-backed queue. The user gets an immediate response with a job ID, then polls or receives an email when results are ready.
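The queue idea can be sketched as follows. This is a minimal illustration, not the project's code: it uses an in-memory SQLite table as a stand-in for the real database, and the table name `jobs`, the `PENDING` status, and all function names are hypothetical.

```python
import sqlite3
import uuid


def create_queue(conn):
    # Hypothetical schema: one row per queued analysis.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id TEXT PRIMARY KEY, payload TEXT NOT NULL, status TEXT NOT NULL)"
    )


def enqueue_analysis(conn, payload):
    """Store the job and return its ID immediately; no analysis runs here."""
    job_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO jobs (id, payload, status) VALUES (?, ?, 'PENDING')",
        (job_id, payload),
    )
    conn.commit()
    return job_id


def claim_next_job(conn):
    """Called by the background worker: take the oldest pending job, if any."""
    row = conn.execute(
        "SELECT id, payload FROM jobs WHERE status = 'PENDING' "
        "ORDER BY rowid LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE jobs SET status = 'PROCESSING' WHERE id = ?", (row[0],))
    conn.commit()
    return row


def get_status(conn, job_id):
    """What a polling client would ask for, given the job ID."""
    row = conn.execute("SELECT status FROM jobs WHERE id = ?", (job_id,)).fetchone()
    return row[0] if row else None
```

The request handler only calls `enqueue_analysis()`, which is why the user gets a response right away; the slow work happens after the worker claims the row.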
Dual Alert Systems
GLAD (optical, 30m) and RADD (radar, 10m) run in parallel via asyncio.gather(). Radar works through clouds; optical has longer history. Cross-validation increases confidence when both detect loss.
Self-Healing
The stuck_analysis_fixer runs every minute and detects analyses that have stayed in the PROCESSING state beyond the timeout. It automatically retries each one up to 3 times before marking it as ERROR, which prevents silent failures.
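The retry rule can be sketched as a small pure function. The `Analysis` dataclass, the `PENDING` requeue status, and the 30-minute timeout are assumptions made for illustration; only the PROCESSING and ERROR states and the limit of 3 retries come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_RETRIES = 3                   # from the text: retry up to 3 times
TIMEOUT = timedelta(minutes=30)   # assumed value; the real timeout is not stated


@dataclass
class Analysis:
    id: int
    status: str
    started_at: datetime
    retry_count: int = 0


def fix_stuck_analyses(analyses, now):
    """Requeue or fail every analysis stuck in PROCESSING past the timeout."""
    for analysis in analyses:
        stuck = (
            analysis.status == "PROCESSING"
            and now - analysis.started_at > TIMEOUT
        )
        if not stuck:
            continue
        if analysis.retry_count < MAX_RETRIES:
            analysis.retry_count += 1
            analysis.status = "PENDING"   # hypothetical state: back in the queue
        else:
            analysis.status = "ERROR"     # retries exhausted, surface the failure
    return analyses
```

A scheduler would call `fix_stuck_analyses()` once per minute with the rows currently in the database.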
Extension Recipes
Five practical recipes for extending the system. Each follows existing patterns so your changes integrate cleanly.
How to add a new satellite data provider
Example: adding Planet Labs as a data source alongside GEE, Planetary Computer, and CDSE.
- Create backend/providers/planet_provider.py. Import and extend the DataProvider base class from providers/base.py.
- Implement all 6 required methods: initialize(), get_forest_loss_alerts(), get_radar_alerts(), get_land_cover(), get_satellite_imagery(), and get_ndvi_analysis(). Each must accept the same geometry and date parameters as the base interface.
- Open backend/providers/factory.py. Add a new case to the provider selection logic: if provider == "planet": return PlanetProvider().
- Set DATA_PROVIDER=planet in your .env file. The factory will now create your provider automatically.
- No changes are needed in services, API endpoints, or the frontend. The abstraction layer ensures all existing code works with any provider that implements the interface.
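The steps above can be sketched as follows. The method names and the DATA_PROVIDER variable come from the recipe; the parameter lists, the placeholder return values, and the factory function name `create_provider` are assumptions, since the real signatures in providers/base.py and factory.py are not shown here.

```python
import os
from abc import ABC, abstractmethod


class DataProvider(ABC):
    """Stand-in for the base class in providers/base.py; signatures are assumed."""

    @abstractmethod
    def initialize(self): ...

    @abstractmethod
    def get_forest_loss_alerts(self, geometry, start_date, end_date): ...

    @abstractmethod
    def get_radar_alerts(self, geometry, start_date, end_date): ...

    @abstractmethod
    def get_land_cover(self, geometry, start_date, end_date): ...

    @abstractmethod
    def get_satellite_imagery(self, geometry, start_date, end_date): ...

    @abstractmethod
    def get_ndvi_analysis(self, geometry, start_date, end_date): ...


class PlanetProvider(DataProvider):
    """Placeholder bodies; a real provider would call the Planet API here."""

    def initialize(self):
        self.ready = True

    def get_forest_loss_alerts(self, geometry, start_date, end_date):
        return []

    def get_radar_alerts(self, geometry, start_date, end_date):
        return []

    def get_land_cover(self, geometry, start_date, end_date):
        return {}

    def get_satellite_imagery(self, geometry, start_date, end_date):
        return []

    def get_ndvi_analysis(self, geometry, start_date, end_date):
        return {}


def create_provider(name=None):
    """Hypothetical factory: pick the provider named by DATA_PROVIDER."""
    name = (name or os.environ.get("DATA_PROVIDER", "")).lower()
    if name == "planet":
        return PlanetProvider()
    raise ValueError(f"Unknown data provider: {name!r}")
```

Because `DataProvider` is abstract, forgetting one of the six methods fails at instantiation time rather than in the middle of an analysis.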
How to add a new alert source
Example: adding a hypothetical FORMA alert system alongside GLAD and RADD.
- Create backend/services/forma_alert_service.py following the pattern of glad_alert_service.py. Implement a method that accepts a geometry and date range and returns alert data.
- Open backend/services/forest_analyzer_with_alerts.py. In the _analyze_alerts() method, find the asyncio.gather() call where glad_task and radd_task are defined.
- Add your new task: forma_task = self.forma_service.get_alerts(geometry, start_date, end_date). Add it to the asyncio.gather(glad_task, radd_task, forma_task) call.
- Process the FORMA results the same way GLAD and RADD results are processed: check for alerts after the EUDR cutoff date and add them to the combined summary.
- Add alert API endpoints in backend/api/alerts.py for querying FORMA alerts directly (optional, for UI display).
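A minimal sketch of running three alert sources concurrently and combining the results. The stub service class, the alert dictionary shape, and the summary keys are hypothetical; the real _analyze_alerts() method may structure its output differently. The cutoff constant is the EUDR cutoff date of 31 December 2020.

```python
import asyncio
from datetime import date

EUDR_CUTOFF = date(2020, 12, 31)


class StubAlertService:
    """Hypothetical stand-in for the GLAD, RADD, or FORMA service classes."""

    def __init__(self, source, alert_dates):
        self.source = source
        self._alert_dates = alert_dates

    async def get_alerts(self, geometry, start_date, end_date):
        await asyncio.sleep(0)  # placeholder for a real network call
        return [
            {"source": self.source, "date": d}
            for d in self._alert_dates
            if start_date <= d <= end_date
        ]


async def analyze_alerts(services, geometry, start_date, end_date):
    """Query every source concurrently and count alerts after the cutoff."""
    tasks = [s.get_alerts(geometry, start_date, end_date) for s in services]
    results = await asyncio.gather(*tasks)  # results keep the order of tasks

    summary = {}
    for service, alerts in zip(services, results):
        post_cutoff = [a for a in alerts if a["date"] > EUDR_CUTOFF]
        summary[service.source] = len(post_cutoff)
    # Agreement between independent sources raises confidence in a detection.
    summary["sources_detecting_loss"] = sum(
        1 for s in services if summary[s.source] > 0
    )
    return summary
```

Adding a source then amounts to appending one more service to the list, which mirrors adding `forma_task` to the existing `asyncio.gather()` call.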
How to add a new report section
Example: adding a water risk section to the enhanced PDF report.
- Open backend/services/report_generator.py. Create a new method _draw_water_risk_section(self, canvas, data).
- Use ReportLab canvas methods to draw your section: canvas.drawString() for text, canvas.drawImage() for maps, canvas.setFont() for styling. Follow existing methods like _draw_biodiversity_section() for layout patterns.
- Handle page breaks by checking the current Y position. If your content won't fit, call canvas.showPage() and reset headers.
- In the main generate_compliance_report() method, add a call to your new method at the desired position in the report sequence.
- If your section needs data, fetch it in the service layer (e.g., enhanced_analysis_service.py) and pass it through the existing data dictionary.
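The drawing and page-break steps can be sketched as below. This is an illustration under assumptions: the layout constants, the extra `y` parameter, and the shape of `data` are hypothetical, and the function is written standalone rather than as a method of the real report generator. It only uses canvas methods that ReportLab's Canvas provides (setFont, drawString, showPage).

```python
LINE_HEIGHT = 14   # hypothetical layout constants, in points
LEFT_MARGIN = 50
TOP_Y = 800
BOTTOM_Y = 50


def ensure_space(canvas, y, needed):
    """Start a new page if `needed` points will not fit above the bottom margin."""
    if y - needed < BOTTOM_Y:
        canvas.showPage()
        return TOP_Y
    return y


def draw_water_risk_section(canvas, data, y):
    """Draw a heading plus one 'label: value' line per entry; return the new y."""
    y = ensure_space(canvas, y, 2 * LINE_HEIGHT)
    canvas.setFont("Helvetica-Bold", 14)
    canvas.drawString(LEFT_MARGIN, y, "Water Risk")
    y -= 2 * LINE_HEIGHT

    for label, value in data.items():
        y = ensure_space(canvas, y, LINE_HEIGHT)
        # showPage() resets the font, so set it again before each line.
        canvas.setFont("Helvetica", 10)
        canvas.drawString(LEFT_MARGIN, y, f"{label}: {value}")
        y -= LINE_HEIGHT
    return y
```

Returning the updated `y` lets the caller place the next section directly below this one. In the real generator the canvas would be a `reportlab.pdfgen.canvas.Canvas` instance, and page headers would be redrawn after each `showPage()`.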
How to add a new API entity
Example: adding a "Certifications" entity to track sustainability certifications for plots.
- Define the data model in backend/models/. Create a new file or add to __init__.py: a dataclass with fields like id, plot_id, certification_type, issued_date, expiry_date.
- Add the database table in backend/utils/database.py. Add a CREATE TABLE IF NOT EXISTS certifications (...) statement to the schema setup function.
- Create the API router in backend/api/certifications.py. Define CRUD endpoints: POST /api/certifications, GET /api/certifications/{id}, GET /api/certifications/plot/{plot_id}, etc.
- Register the router in backend/app.py: import your router and add app.include_router(certifications_router, prefix="/api/certifications").
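The first two steps (model and table) can be sketched as follows. The field names come from the recipe; the column types, the helper function names, and the use of an in-memory SQLite database in place of PostgreSQL are assumptions made to keep the example self-contained. The router and its registration are omitted.

```python
import sqlite3
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Certification:
    id: Optional[int]          # None until the database assigns one
    plot_id: int
    certification_type: str
    issued_date: date
    expiry_date: date


# Column types are assumed; dates are stored as ISO strings in this sketch.
SCHEMA = """
CREATE TABLE IF NOT EXISTS certifications (
    id INTEGER PRIMARY KEY,
    plot_id INTEGER NOT NULL,
    certification_type TEXT NOT NULL,
    issued_date TEXT NOT NULL,
    expiry_date TEXT NOT NULL
)
"""


def insert_certification(conn, cert):
    """Insert the row and return a copy of the model with its new id."""
    cursor = conn.execute(
        "INSERT INTO certifications "
        "(plot_id, certification_type, issued_date, expiry_date) "
        "VALUES (?, ?, ?, ?)",
        (cert.plot_id, cert.certification_type,
         cert.issued_date.isoformat(), cert.expiry_date.isoformat()),
    )
    conn.commit()
    return Certification(cursor.lastrowid, cert.plot_id, cert.certification_type,
                         cert.issued_date, cert.expiry_date)


def list_for_plot(conn, plot_id):
    """Backs an endpoint like GET /api/certifications/plot/{plot_id}."""
    rows = conn.execute(
        "SELECT id, plot_id, certification_type, issued_date, expiry_date "
        "FROM certifications WHERE plot_id = ? ORDER BY id",
        (plot_id,),
    ).fetchall()
    return [
        Certification(r[0], r[1], r[2],
                      date.fromisoformat(r[3]), date.fromisoformat(r[4]))
        for r in rows
    ]
```

Because the statement uses CREATE TABLE IF NOT EXISTS, running the schema setup on every startup is safe.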
How to deploy to production
The system includes Docker deployment files in the deploy/ directory.
- Configure all environment variables in .env for production: real GEE service account credentials, production PostgreSQL host and password, SMTP credentials for email, and MinIO endpoint.
- Set USE_REAL_FOREST_DATA=true and provide the GEE service account key file. Without this, the system uses simulated forest data.
- Run docker-compose up -d from the deploy/ directory. This starts the FastAPI backend, PostgreSQL+PostGIS database, and MinIO storage.
- Verify the deployment by checking http://your-host:8000/api/docs for the Swagger UI and running a test analysis upload.
- For GitHub Container Registry (ghcr.io) deployments, the deploy config is pre-configured. Push your image and update the compose file with your registry path.
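The verification step can be automated with a small smoke check. This is a sketch, not part of the project: the function name `docs_reachable` and the injectable `opener` parameter are hypothetical, and `http://your-host:8000` is the placeholder host from the step above.

```python
from urllib.error import URLError
from urllib.request import urlopen


def docs_reachable(base_url, opener=urlopen, timeout=5):
    """Return True if GET {base_url}/api/docs answers with HTTP 200."""
    try:
        with opener(f"{base_url}/api/docs", timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False
```

After `docker-compose up -d`, calling `docs_reachable("http://your-host:8000")` gives a quick yes or no before you run a full test analysis upload. The `opener` argument exists only so the check can be exercised without a live server.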
Architectural Insights
Layers of Abstraction
The system is organized into four clean layers. Each layer only communicates with its immediate neighbors, never skipping levels. The API layer never directly queries Google Earth Engine -- it calls a service, which calls a provider. This makes the system testable (mock any layer) and swappable (replace any layer without touching the others).
This layering means you can test business logic without a database (mock the data access layer), swap GEE for Planetary Computer without touching services (swap the provider), or replace FastAPI with another framework without changing any business logic.
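A minimal sketch of what "mock any layer" looks like in practice. The service class `ForestLossService`, its `total_loss_ha()` method, and the `area_ha` field are hypothetical names chosen for illustration; only get_forest_loss_alerts() is taken from the provider interface described earlier.

```python
class FakeProvider:
    """Test double: returns canned alerts instead of calling a satellite API."""

    def get_forest_loss_alerts(self, geometry, start_date, end_date):
        return [
            {"date": "2021-03-01", "area_ha": 1.5},
            {"date": "2021-06-15", "area_ha": 2.0},
        ]


class ForestLossService:
    """Hypothetical service: depends only on the provider interface."""

    def __init__(self, provider):
        self.provider = provider

    def total_loss_ha(self, geometry, start_date, end_date):
        alerts = self.provider.get_forest_loss_alerts(geometry, start_date, end_date)
        return sum(alert["area_ha"] for alert in alerts)


# The business logic runs with no network access and no credentials.
service = ForestLossService(FakeProvider())
total = service.total_loss_ha(None, "2021-01-01", "2021-12-31")
```

The service never learns which provider it was given, which is what allows the provider to be swapped or faked without touching the service code.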
Why Not Microservices?
All code runs in one process. The team is small, the deployment is simple, and the services share data through function calls (fast, type-safe) instead of HTTP calls (slow, fragile, need serialization). Microservices add network complexity, distributed tracing, service discovery, and deployment orchestration that only pay off at much larger scale. A well-structured monolith with clean layer boundaries can be split into microservices later if needed -- the boundaries are already there in the code.
Extension Points
The system was designed to be extended at specific, well-defined points. Each extension point has a clear pattern to follow, so new features integrate cleanly without modifying existing code:
- New data providers -- implement the DataProvider interface and register in the factory
- New alert sources -- add a task to the asyncio.gather() call in _analyze_alerts()
- New report sections -- add a _draw method and call it from generate_compliance_report()
- New API entities -- create a router and register it in app.py
- New subscription features -- add entries to the feature flag dictionary in get_features()
- New background jobs -- start them in the startup_event() in app.py
Architecture Reference
System Layers
| Layer | Location | Responsibility | Example |
|---|---|---|---|
| HTTP | backend/api/ | Parse requests, validate input, route to services | analysis.py |
| Business Logic | backend/services/ | Domain rules, analysis, scoring | forest_analyzer_with_alerts.py |
| Data Models | backend/models/ | Define data shapes and enums | ComplianceStatus, RiskLevel |
| Data Access | backend/utils/ | Database connections, queries | database.py |
| External | backend/providers/ | Third-party API adapters | gee_provider.py |
Extension Points
| What to Add | Where | Pattern |
|---|---|---|
| New data provider | providers/ + factory.py | Implement DataProvider interface |
| New alert source | services/ + _analyze_alerts() | Add to asyncio.gather() |
| New report section | report_generator.py | Add _draw method + call it |
| New API entity | api/ + models/ + app.py | Create router + register |
| New subscription feature | models/user.py | Add to get_features() dict |
| New background job | services/ + app.py startup | Start in startup_event() |
Tech Stack
| Technology | Purpose | Why Chosen |
|---|---|---|
| FastAPI | Web framework | Async support, auto-generated docs, type validation |
| PostgreSQL + PostGIS | Database | Spatial queries on geographic data |
| Google Earth Engine | Satellite data | Petabyte-scale datasets, cloud processing |
| ReportLab | PDF generation | Full control over layout and imagery |
| JWT (python-jose) | Authentication | Stateless tokens, refresh support |
| psycopg2 | DB driver | Connection pooling, dict cursors |
| Shapely | Geometry processing | Validate/repair polygons |
Course Complete
You have worked through all seven modules of the EUDR Forest Analyzer course. You now understand how the system is built, how data flows through it, and how to extend it for new requirements.
Here is what you covered: