← back Stef van Dijk

ETL Pipeline & Load-bearing Capacity Visualisation


Feb 2026 – Jun 2026 · HU University of Applied Sciences Utrecht

Overview


A terrain trafficability tool built for the Dutch defense sector. Given real-time and forecast weather data combined with a national soil-type map, the system calculates per-grid-cell drivability scores for military vehicles across the Veluwe region. Commanders can use the map to assess which vehicles can operate where, based on current ground conditions.

Approach


The backend is a Go service split into two stages: an ETL pipeline and a GIS calculation engine. Three independent pipelines ingest data from the KNMI weather-station API, a Sentinel satellite soil-saturation S3 bucket, and the KNMI GRIB forecast model. Each pipeline implements shared Extractor, Transform[T], and Loader[T] interfaces, so the ETL runner treats every source identically. After the loaders finish, the calculator sweeps a fixed lat/lon grid over the Veluwe, computes a weighted draagkracht score per cell from rainfall, sunshine, and temperature coefficients that vary per USCS soil type, then maps each score to a five-tier drivability enum through a VCI lookup table. Results are written to PostGIS and exposed as WMS map tiles via GeoServer, with a lightweight Gin API serving the frontend.

Highlights


  • The grid computation spawns one goroutine per latitude row. Because the weather data is read-only and rows are independent, concurrency is free: no locks needed, and the full Veluwe grid finishes in a fraction of the serial time.
  • results.drivability is range-partitioned by month. Every WMS tile query filters on a single valid_time and vci, so PostGIS prunes to one partition and hits the composite GIST index rather than scanning the whole table.
  • Loaders stage data via COPY into a temp table before upserting to the target, making bulk inserts as fast as possible while still preserving idempotency across repeated ETL runs.

Outcome


The full pipeline (extract, transform, load, and calculate) runs every 8 hours in production via a Docker scheduler container. The draagkracht formula is structurally complete but uses placeholder coefficients pending real values from the product owner, so scores are not yet operationally meaningful. Given more time, I'd switch the rainfall input from daily KNMI data to the hourly GRIB dataset so the 28-hour rolling window is actually accurate.

Stack

GoPostGISGeoServerDockerTypeScriptSvelteTailwind CSSGitGitHub ActionsGeoServerAgentic WorkflowsLLM AgentsScrumPostgreSQL

Role

Backend Developer · Scrum Master

Links

Source code is private, but certain code can be shared on request.