- We assess the regional transferability of household classifiers based on datasets from seven utility companies in two European countries
- Models that predict residential household characteristics can be trained in one region and applied to another with only a small loss in performance
Creating machine learning models is time-consuming and costly: ample training data must be collected, prepared, and cleaned; the statistical models (the classifiers) must be generated and tuned; and finally a rigorous evaluation is necessary. For machine learning models that predict household characteristics, it is still unclear how well they perform when applied to households from other regions. Transferability to other regions, however, would ease their practical application.
Based on datasets from seven utility companies in Germany and Switzerland that contain information on customer location and annual electricity consumption over several years, we trained several classifiers to predict household characteristics (household type, number of residents, heating type, etc.) and tested the resulting models with data from other regions. We also investigated to what extent data from other utility companies improve such models and to what extent governmental statistical data support their transferability.
We found that predictive models trained with data from one region can be applied to data from another region at the cost of a decrease in accuracy of about 4%. Using training data from multiple utility companies helps to improve the performance of such models, whereas governmental statistical data improve the models under study only marginally.
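The cross-region evaluation described above (train on one utility's customers, test on another's) can be illustrated with a minimal sketch. This is not the classifiers or features used in the study. It assumes a single feature (annual consumption in kWh), a simple nearest-centroid rule, and made-up toy data. The names `utility_a`, `utility_b`, `train_centroids`, and `transfer_matrix` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def train_centroids(rows):
    """Fit a nearest-centroid model: mean consumption per household label."""
    by_label = defaultdict(list)
    for consumption_kwh, label in rows:
        by_label[label].append(consumption_kwh)
    return {label: mean(values) for label, values in by_label.items()}


def predict(centroids, consumption_kwh):
    """Return the label whose centroid is closest to the given consumption."""
    return min(centroids, key=lambda label: abs(centroids[label] - consumption_kwh))


def accuracy(centroids, rows):
    """Share of rows whose predicted label matches the true label."""
    hits = sum(predict(centroids, c) == label for c, label in rows)
    return hits / len(rows)


def transfer_matrix(regions):
    """Train on each region and test on every region.

    Returns a dict mapping (train_region, test_region) to accuracy.
    Entries with train_region == test_region are measured on the training
    data itself, so they are optimistic in-region baselines.
    """
    result = {}
    for train_name, train_rows in regions.items():
        model = train_centroids(train_rows)
        for test_name, test_rows in regions.items():
            result[(train_name, test_name)] = accuracy(model, test_rows)
    return result


# Made-up toy data: (annual consumption in kWh, household type).
regions = {
    "utility_a": [(1500, "single"), (1800, "single"), (4200, "family"), (4800, "family")],
    "utility_b": [(1700, "single"), (2100, "single"), (3050, "family"), (5200, "family")],
}

matrix = transfer_matrix(regions)
for (train_name, test_name), acc in sorted(matrix.items()):
    print(f"train={train_name} test={test_name} accuracy={acc:.2f}")
```

Comparing the off-diagonal entries (different train and test region) with the in-region entries gives the kind of accuracy loss the study reports. A realistic setup would use held-out data within each region instead of scoring on the training rows.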
Hopf, K., Riechel, S., Sodenkamp, M., & Staake, T. (2017). Predictive Customer Data Analytics – The Value of Public Statistical Data and the Geographic Model Transferability. In ICIS 2017 Proceedings. Seoul, South Korea: AIS electronic library.
This project has been funded in part by the Swiss Federal Office of Energy (Grant numbers SI/501053-01, SI/501202-01) and the Eureka member countries and European Union (EUROSTARS Grant number E!9859 - BENgine II).
Konstantin Hopf, Mariya Sodenkamp, Thorsten Staake.
Harvesting Open Data for Household Classification
Testing the Transferability of Household Classifiers
Identifying Target Customers for Cross Selling: The Case of Biogas
For further information, or if you have any questions, please do not hesitate to contact us.