ETL Pipeline & Load-bearing Capacity Visualisation
Feb 2026 – Jun 2026 · HU University of Applied Sciences Utrecht
Overview
A terrain trafficability tool built for the Dutch defense sector. Given real-time and forecast weather data combined with a national soil-type map, the system calculates per-grid-cell drivability scores for military vehicles across the Veluwe region. Commanders can use the map to assess which vehicles can operate where, based on current ground conditions.
Approach
The backend is a Go service split into two stages: an ETL pipeline and a GIS calculation
engine. Three independent pipelines ingest data from the KNMI weather-station API, a
Sentinel satellite soil-saturation S3 bucket, and the KNMI GRIB forecast model. Each
pipeline implements shared Extractor, Transform[T], and Loader[T] interfaces, so the ETL runner treats
every source identically. After the loaders finish, the calculator sweeps a fixed lat/lon
grid over the Veluwe, computes a weighted draagkracht score per cell from rainfall, sunshine,
and temperature coefficients that vary per USCS soil type, then maps each score to a five-tier
drivability enum through a VCI lookup table. Results are written to PostGIS and exposed as
WMS map tiles via GeoServer, with a lightweight Gin API serving the frontend.
Highlights
- The grid computation spawns one goroutine per latitude row. Because the weather data is read-only and rows are independent, concurrency is free: no locks needed, and the full Veluwe grid finishes in a fraction of the serial time.
results.drivabilityis range-partitioned by month. Every WMS tile query filters on a singlevalid_timeandvci, so PostGIS prunes to one partition and hits the composite GIST index rather than scanning the whole table.- Loaders stage data via
COPYinto a temp table before upserting to the target, making bulk inserts as fast as possible while still preserving idempotency across repeated ETL runs.
Outcome
The full pipeline (extract, transform, load, and calculate) runs every 8 hours in production via a Docker scheduler container. The draagkracht formula is structurally complete but uses placeholder coefficients pending real values from the product owner, so scores are not yet operationally meaningful. Given more time, I'd switch the rainfall input from daily KNMI data to the hourly GRIB dataset so the 28-hour rolling window is actually accurate.
Stack
Role
Backend Developer · Scrum Master
Links
Source code is private, but certain code can be shared on request.