An integrated framework of feature engineering and machine learning for large-scale energy anomaly detection
Journal article
Publication Details
Author list: Thanyapisit Buaprakhong, Varintorn Sithisint, Awirut Phusaensaart, Sinthon Wilke, Thatsamaphon Boonchuntuk, Thittaporn Ganokratanaa*, Mahasak Ketcham
Publisher: Tech Science Press
Publication year: 2026
Journal acronym: Energy Engineering
ISSN: 01998595, 15460118
URL: https://www.techscience.com/energy/online/detail/25698
Languages: English-United States (EN-US)
Abstract
The rapid digitalization of the energy sector has led to the deployment of large-scale smart metering systems that generate high-frequency time series data, creating new opportunities and challenges for energy anomaly detection. Accurate identification of anomalous patterns in building energy consumption is essential for optimizing operations, improving energy efficiency, and supporting grid reliability. This study investigates advanced feature engineering and machine learning modeling techniques for large-scale time series anomaly detection in building energy systems. Expanding upon previous benchmark frameworks, we introduce additional features such as oil price indices and solar cycle indicators, including sunset and sunrise times, to enhance the contextual understanding of consumption patterns. Our comparative modeling approach encompasses an extensive suite of algorithms, including KNeighborsUnif, KNeighborsDist, LightGBMXT, LightGBM, RandomForestMSE, CatBoost, ExtraTreesMSE, NeuralNetFastAI, XGBoost, NeuralNetTorch, and LightGBMLarge. Data preprocessing includes rigorous handling of missing values and normalization, while feature engineering focuses on temporal, environmental, and value-change attributes. The models are evaluated on a comprehensive dataset of smart meter readings, with performance assessed using metrics such as the Area Under the Receiver Operating Characteristic Curve (AUC-ROC). The results demonstrate that the integration of diverse exogenous variables and a hybrid ensemble of traditional tree-based and neural network models can significantly improve anomaly detection performance. This work provides new insights into the design of robust, scalable, and generalizable frameworks for energy anomaly detection in complex, real-world settings.
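As an illustration only, the temporal and value-change attributes and the AUC-ROC metric mentioned in the abstract could be sketched as follows. This is not the authors' pipeline: the function names `build_features` and `auc_roc`, the feature names, the toy readings, and the use of the absolute change as an anomaly score are all assumptions made for this example, and it uses only the Python standard library.

```python
from datetime import datetime, timedelta


def build_features(timestamps, readings):
    """Return one feature dict per reading (hypothetical feature names)."""
    rows = []
    for i, (ts, value) in enumerate(zip(timestamps, readings)):
        # The first reading has no predecessor, so its change is defined as 0.
        prev = readings[i - 1] if i > 0 else value
        rows.append({
            "hour": ts.hour,                       # temporal attribute
            "day_of_week": ts.weekday(),           # Monday = 0
            "is_weekend": int(ts.weekday() >= 5),
            "delta": value - prev,                 # value-change attribute
            "abs_delta": abs(value - prev),
        })
    return rows


def auc_roc(labels, scores):
    """AUC-ROC as the probability that a random anomaly outscores a random
    normal reading, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy hourly readings with one injected spike (made-up data, not from the study).
start = datetime(2024, 1, 1, 0, 0)
timestamps = [start + timedelta(hours=h) for h in range(6)]
readings = [10.0, 10.5, 10.2, 25.0, 10.4, 10.1]
labels = [0, 0, 0, 1, 0, 0]

features = build_features(timestamps, readings)
scores = [row["abs_delta"] for row in features]  # naive score, for illustration
print(auc_roc(labels, scores))
```

In a setting like the one the abstract describes, features of this kind would be passed to the listed models rather than scored directly, and the pairwise AUC computation above would be replaced by a library routine on large datasets, since it is quadratic in the number of samples.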