Since 2016, Whythawk has made more than 4,000 Freedom of Information requests and curated over 20 million records on individual commercial locations in England and Wales.
We have supported the Greater London Authority (GLA), the Ministry of Housing, Communities and Local Government (MHCLG), University College London (UCL), and the universities of Leeds, Northumbria and Warwick, as well as research groups such as Centre for Cities, Centre for London and the Consumer Data Research Centre (CDRC).
Our data and analysis have informed studies of the COVID lockdown period and the Levelling Up economic recovery response, as well as research into meanwhile use for empty shops, business energy consumption, the impact of rates on business vacancy, and business activity clustering maps.
openLocal tracks the history of all types of business units across England and Wales, irrespective of their proximity to active high streets or town centres. We integrate a wide variety of source data imported from thousands of openly licensed datasets published as spreadsheets by local and national government.
Data are assembled via a combination of machine-learning techniques – including regression analysis, natural language processing and pattern-matching – into a single, unified geospatial database supporting research requirements for complex queries. All sources are automatically imported and processed, save for local rates data which are processed manually and algorithmically by our data wranglers.
All our data are available under a Creative Commons Attribution Licence ensuring you can easily share and reuse our work.
Our challenge, as our database has grown to nearly a terabyte over the last 10 years, has been to improve our methods for rapidly and effectively restructuring and integrating data into accessible reports for researchers. In 2025, we began a major infrastructure redevelopment project to rebuild our existing systems.
Our solution
We have undertaken a complete refactoring and optimisation of the data integration and validation processes, as well as splitting the workflow into three dedicated applications:
- Transformer: produces standardised structured data from the 300 local authorities in England and Wales we track. This is served by our independent whyqd data wrangling application.
- Integrator: produces structured and integrated data, validated against multiple test requirements.
- Explorer: a researcher-facing application permitting complex queries and structured data downloads.
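To make the division of labour concrete, the three stages can be sketched roughly as follows. This is a minimal illustration only, not our production code: the `RatesRecord` fields, the source column names (`Prop Ref`, `RV`, `Liable From`) and the validation checks are all hypothetical, and the sketch does not use the whyqd API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RatesRecord:
    """Hypothetical standardised row; field names are illustrative only."""
    authority: str
    property_ref: str
    rateable_value: int
    occupied_from: date


def transform(raw_row: dict, authority: str) -> RatesRecord:
    """Transformer stage: map one authority's (assumed) columns onto a shared schema."""
    return RatesRecord(
        authority=authority,
        property_ref=str(raw_row["Prop Ref"]).strip(),
        rateable_value=int(str(raw_row["RV"]).replace(",", "")),
        occupied_from=date.fromisoformat(raw_row["Liable From"]),
    )


def validate(record: RatesRecord) -> list[str]:
    """Integrator stage: return the failed checks for a record (empty means valid)."""
    errors = []
    if not record.property_ref:
        errors.append("missing property reference")
    if record.rateable_value < 0:
        errors.append("negative rateable value")
    if record.occupied_from > date.today():
        errors.append("occupation date in the future")
    return errors


def integrate(records: list[RatesRecord]) -> list[RatesRecord]:
    """Integrator stage: keep only the records that pass every check."""
    return [r for r in records if not validate(r)]


def explore(records: list[RatesRecord], authority: str) -> list[RatesRecord]:
    """Explorer stage: a simple filtered query over integrated records."""
    return [r for r in records if r.authority == authority]
```

Keeping each stage as a separate function with a plain data structure in between mirrors the idea of independent applications that share only a common schema.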
In addition, we are setting up a standalone, optimised server dedicated to the database, which all other applications can access as appropriate.
Our objective is an integrated service that provides the typical features of a Valuation Office Agency explorer, extended with our unique ratepayer and rates relief data.
Outcomes
As at the end of 2025, the core report-building functionality is complete and already in use by both commercial and academic researchers, and we are developing the visual explorer application to go live in Q2 2026.