Transferability of Household Classification Models and the Value of Governmental Statistical Data

Research Highlights

- We assess the regional transferability of household classifiers based on datasets from seven utility companies in two European countries
- Models that predict residential household characteristics can be trained in one region and applied to another with only a small loss in performance

 

Challenge

Creating machine learning models is time-consuming and costly: ample training data must be collected, prepared, and cleaned; the statistical models (the classifiers) must be generated and tuned; and finally a rigorous evaluation is necessary. For machine learning models that predict household characteristics, it is still unclear how well they perform when applied to households from other regions. Transferability to other regions, however, would ease their practical application.

 

Approach

Based on datasets from seven utility companies in Germany and Switzerland that contain information on customer location and annual electricity consumption over several years, we trained several classifiers to predict household characteristics (household type, number of residents, heating type, etc.) and tested the resulting models with data from other regions. We also investigated to what extent data from other utility companies improve such models and to what extent governmental statistical data support their transferability. 
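The train-in-one-region, test-in-another setup described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the single feature (annual consumption in kWh), the two-class label, the regional `shift`, and all function names are hypothetical, and the nearest-centroid classifier is a simple stand-in rather than one of the classifiers used in the project.

```python
import random
from statistics import mean


def make_region(rng, n, shift):
    """Generate n synthetic households as (annual_kwh, label) pairs.

    Label 0/1 is a hypothetical binary household characteristic;
    `shift` models a region-specific offset in consumption (assumption).
    """
    households = []
    for _ in range(n):
        label = rng.randint(0, 1)
        base = 2000 if label == 0 else 4000
        households.append((rng.gauss(base + shift, 600), label))
    return households


def fit_centroids(train):
    """Nearest-centroid model: mean annual consumption per class."""
    classes = {label for _, label in train}
    return {c: mean([kwh for kwh, label in train if label == c]) for c in classes}


def predict(centroids, kwh):
    """Assign the class whose centroid is closest to the observed consumption."""
    return min(centroids, key=lambda c: abs(kwh - centroids[c]))


def accuracy(centroids, test):
    """Share of households whose predicted class matches the true label."""
    return sum(predict(centroids, kwh) == label for kwh, label in test) / len(test)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
region_a_train = make_region(rng, 500, shift=0)
region_a_test = make_region(rng, 500, shift=0)
region_b_test = make_region(rng, 500, shift=300)  # a different, shifted region

model = fit_centroids(region_a_train)
acc_within = accuracy(model, region_a_test)    # tested in the training region
acc_transfer = accuracy(model, region_b_test)  # tested in another region

print(f"within-region accuracy: {acc_within:.3f}")
print(f"cross-region accuracy:  {acc_transfer:.3f}")
print(f"difference:             {acc_within - acc_transfer:+.3f}")
```

Comparing the two accuracies gives a rough measure of the transfer loss. Pooling training data from several regions would amount to concatenating their training lists before calling `fit_centroids`.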

Results

We found that predictive models trained with data from one region can be applied to data from another region at the cost of a decrease in accuracy of about 4%. Using training data from multiple utility companies helps to improve the performance of such models, whereas governmental statistical data only marginally improve the models under study.

Selected publications

Hopf, K., Riechel, S., Sodenkamp, M., & Staake, T. (2017). Predictive Customer Data Analytics – The Value of Public Statistical Data and the Geographic Model Transferability. In ICIS 2017 Proceedings. Seoul, South Korea: AIS electronic library.

 

Funding

This project has been funded in part by the Swiss Federal Office of Energy (Grant numbers SI/501053-01, SI/501202-01) and by the Eureka member countries and the European Union (EUROSTARS Grant number E!9859 - BENgine II).

Date: 2015-2017

 

Team

Konstantin Hopf, Mariya Sodenkamp, Thorsten Staake.  



