Large-scale GeoAI deployment for a national mapping programme, England
A national agency responsible for maintaining authoritative geospatial records of built infrastructure in England and Wales required a step-change in the speed and consistency of building footprint extraction. Their existing process relied on a combination of manual digitisation and semi-automated feature extraction tools that were both time-intensive and prone to variation between operators. With housing stock expanding and planning reform placing new demands on the accuracy of building inventories, the agency needed a scalable, reproducible automated pipeline that could process millions of structures at the accuracy levels required for official national datasets.

The principal technical challenge was ensuring that the model performed consistently across highly varied urban morphologies, from dense Victorian terraces in northern cities to sparse, dispersed rural settlements in the south-west and Wales, while maintaining the spatial precision standards mandated for national reference data.
VE3 developed a cloud-native building footprint extraction pipeline built around a Mask R-CNN instance segmentation architecture, trained on a stratified sample of aerial imagery tiles drawn from across the national coverage area. The training corpus was deliberately designed to capture the full diversity of English and Welsh building typologies, including terraced housing, semi-detached suburban stock, detached rural dwellings, industrial units, agricultural buildings, and complex multi-wing institutional structures. Pre-processing involved reprojecting all input imagery and vector reference data into British National Grid (EPSG:27700) and harmonising tile boundaries to eliminate seam artefacts at processing block edges.
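As a rough illustration of the stratified sampling described above, the following sketch draws the same fraction of tiles from each building-typology stratum. The tile catalogue, the typology labels, and the `stratified_sample` helper are hypothetical; the actual strata and weighting used in the project are not stated here.

```python
import random
from collections import defaultdict

def stratified_sample(tiles, fraction, seed=0):
    """Draw roughly `fraction` of the tiles from each stratum (at least one per stratum)."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    by_stratum = defaultdict(list)
    for tile_id, stratum in tiles:
        by_stratum[stratum].append(tile_id)
    sample = {}
    for stratum, ids in sorted(by_stratum.items()):
        k = max(1, round(len(ids) * fraction))
        sample[stratum] = sorted(rng.sample(ids, k))
    return sample

# Hypothetical tile catalogue: (tile_id, dominant building typology).
tiles = (
    [(f"T{i:03d}", "terraced") for i in range(60)]
    + [(f"R{i:03d}", "rural_detached") for i in range(30)]
    + [(f"I{i:03d}", "industrial") for i in range(10)]
)
sample = stratified_sample(tiles, fraction=0.1)
```

Sampling per stratum, rather than uniformly over all tiles, keeps rarer typologies such as industrial units represented in the training corpus even when they make up a small share of the coverage area.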
The pipeline integrated directly with the client's existing Esri-based GIS environment through a custom PostGIS interface, meaning all outputs were immediately queryable within their operational tooling without intermediate file conversion steps. GPU-accelerated inference was deployed on AWS EC2 instances using a Kubernetes orchestration layer that autoscaled compute resources based on the processing queue depth, enabling sustained throughput of over 100,000 structures per day during peak processing windows. Each extracted footprint was accompanied by a confidence score and a set of QA flags indicating cases where the model's certainty fell below predefined thresholds, allowing the client's GIS team to prioritise manual review effort on the highest-risk outputs rather than reviewing the entire dataset.
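The confidence-based review triage described above might look something like the sketch below. The threshold values, flag names, and the `Footprint` record are assumptions made for illustration; the text only states that thresholds were predefined.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds; the real values are not given in this document.
REVIEW_THRESHOLD = 0.80
VERY_LOW_THRESHOLD = 0.50

@dataclass
class Footprint:
    footprint_id: str
    confidence: float
    qa_flags: list = field(default_factory=list)

def apply_qa_flags(footprints):
    """Flag low-confidence footprints and return the review queue, lowest confidence first."""
    for fp in footprints:
        if fp.confidence < VERY_LOW_THRESHOLD:
            fp.qa_flags.append("VERY_LOW_CONFIDENCE")
        elif fp.confidence < REVIEW_THRESHOLD:
            fp.qa_flags.append("LOW_CONFIDENCE")
    flagged = [fp for fp in footprints if fp.qa_flags]
    return sorted(flagged, key=lambda fp: fp.confidence)

batch = [Footprint("a", 0.97), Footprint("b", 0.62), Footprint("c", 0.41), Footprint("d", 0.85)]
review_queue = apply_qa_flags(batch)
```

Here only footprints "c" and "b" enter the review queue, in that order, while "a" and "d" pass through unflagged. Sorting by ascending confidence is one simple way to put the highest-risk outputs in front of reviewers first.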
Topology correction was applied as a post-processing step, using GEOS-based geometry validation routines to close gaps, resolve overlapping polygon boundaries, and align building outlines to the underlying OS MasterMap reference geometry. This was critical for ensuring downstream usability of the extracted footprints in planning and infrastructure applications, where topological integrity is a hard requirement.
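To give a sense of what aligning outlines to reference geometry involves, here is a deliberately simplified, pure-Python sketch of vertex snapping. It is not the GEOS-based routine used in the pipeline: the `snap_to_reference` and `close_ring` helpers, the 0.5 m tolerance, and the coordinates are all hypothetical.

```python
import math

def close_ring(ring):
    """Repeat the first vertex at the end if the ring is not already closed."""
    return ring if ring[0] == ring[-1] else ring + [ring[0]]

def snap_to_reference(ring, reference_vertices, tolerance):
    """Move each vertex onto the nearest reference vertex lying within `tolerance` metres."""
    snapped = []
    for x, y in ring:
        nearest = min(reference_vertices, key=lambda r: math.hypot(r[0] - x, r[1] - y))
        if math.hypot(nearest[0] - x, nearest[1] - y) <= tolerance:
            snapped.append(nearest)
        else:
            snapped.append((x, y))  # too far from any reference vertex: leave as extracted
    return close_ring(snapped)

# Hypothetical coordinates in metres (British National Grid eastings, northings).
extracted = [(400000.2, 300000.1), (400010.1, 299999.8), (400009.9, 300008.3), (400000.0, 300008.0)]
reference = [(400000.0, 300000.0), (400010.0, 300000.0), (400010.0, 300008.0), (400000.0, 300008.0)]
aligned = snap_to_reference(extracted, reference, tolerance=0.5)
```

In this example every extracted vertex lies within tolerance of a reference corner, so the aligned ring coincides with the reference outline. Naive snapping of this kind can collapse neighbouring vertices onto the same point, which is one reason a production workflow would follow it with full geometry validation.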
One of the most significant challenges was handling the prevalence of terraced and semi-detached properties in urban areas, where shared party walls mean that individual building footprints cannot be separated using height or spectral information alone. VE3's approach incorporated adjacency pattern learning into the Mask R-CNN architecture, training the model on manually verified instance masks that encoded the correct separation lines between adjoining structures. Greenhouses, garden rooms, and outbuildings presented a secondary challenge; these were addressed through a combination of minimum area thresholds and geometric compactness filters applied in post-processing, calibrated against OS MasterMap descriptive attributes where available.
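A minimal sketch of the area and compactness filtering described above is shown below. The compactness measure is not named in this document; the Polsby-Popper score (4πA / P²) is used here as one common choice, and the 12 m² and 0.3 thresholds are placeholder values rather than the calibrated ones.

```python
import math

def ring_area_perimeter(ring):
    """Shoelace area and perimeter of a simple polygon given as unclosed (x, y) vertices in metres."""
    area2 = 0.0
    perimeter = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        area2 += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(area2) / 2.0, perimeter

def polsby_popper(ring):
    """Compactness score: 1.0 for a circle, lower for elongated or ragged shapes."""
    area, perimeter = ring_area_perimeter(ring)
    return 4.0 * math.pi * area / perimeter ** 2 if perimeter else 0.0

def keep_as_building(ring, min_area_m2=12.0, min_compactness=0.3):
    """Keep a footprint only if it is large enough and compact enough (thresholds are assumptions)."""
    area, _ = ring_area_perimeter(ring)
    return area >= min_area_m2 and polsby_popper(ring) >= min_compactness

house = [(0, 0), (8, 0), (8, 6), (0, 6)]    # 48 m2, compact
shed = [(0, 0), (2, 0), (2, 3), (0, 3)]     # 6 m2, below the area threshold
strip = [(0, 0), (20, 0), (20, 1), (0, 1)]  # 20 m2 but very elongated
results = [keep_as_building(r) for r in (house, shed, strip)]
```

With these placeholder thresholds the house is kept, the shed is rejected on area, and the thin strip is rejected on compactness even though it passes the area test, which is why the two filters are applied together.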
For large complex structures such as schools, hospitals, and warehouses — which frequently exhibit highly irregular plan forms — the pipeline applied a hierarchical segmentation strategy, first identifying the primary building envelope and then resolving internal courtyards and covered linkways as separate geometries where appropriate.
Independent accuracy assessment against a stratified random sample of 50,000 manually digitised ground-truth footprints yielded an overall Intersection over Union (IoU) score of 91.3%, with urban residential stock achieving 92.8% IoU and rural dispersed buildings achieving 87.4% IoU. The lower rural figure was driven primarily by the greater variability in agricultural building forms rather than by any systematic model failure. The pipeline processed the full 4.2 million structure dataset within the agreed 18-month contract period, with the final production run completing two weeks ahead of schedule.
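For readers less familiar with the metric, IoU is the area shared by the predicted and ground-truth footprints divided by the area covered by either. The sketch below computes it on small binary masks; whether the assessment used raster or vector overlap, and how per-footprint scores were aggregated into the headline figures, is not specified here, so the `iou` helper and the masks are purely illustrative.

```python
def iou(pred_mask, truth_mask):
    """Intersection over Union of two equally sized binary masks (lists of rows of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred_mask, truth_mask):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Two empty masks agree perfectly by convention.
    return intersection / union if union else 1.0

# Hypothetical 3x4 pixel masks: the prediction is shifted one column left of the truth.
pred = [[1, 1, 1, 0],
        [1, 1, 1, 0],
        [0, 0, 0, 0]]
truth = [[0, 1, 1, 1],
         [0, 1, 1, 1],
         [0, 0, 0, 0]]
score = iou(pred, truth)  # 4 shared pixels / 8 covered pixels = 0.5
```

A one-pixel shift on a footprint this small already halves the score, which illustrates why IoU is a demanding measure for small or irregular structures.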
This engagement directly validates VE3's core capability for the VOA Dwelling Area Estimates requirement. The project demonstrates production-proven performance on building footprint extraction at the scale of 4+ million structures using the same OS MasterMap and aerial imagery data stack that underpins the VOA methodology. The accuracy figures, QA flagging approach, and GPU-accelerated processing architecture described here map directly onto the technical pipeline proposed for the VOA trial, giving confidence that the 50–60k property scope of the initial LA trial can be delivered well within the proposed two-month development and training window.