Abstract:

Urban-scale energy modeling is hindered by insufficient building information, including footprint, conditioned area, age, and type. These parameters are crucial inputs to the simulation and optimization processes on which the modeling depends. When audit-quality data is unavailable, prototypical building energy models, based on building surveys and code requirements at the time of construction, are frequently used to infer internal building characteristics. However, even local data sources such as tax assessors’ data use their own land use or parcel codes, which do not correspond directly to the standard building types used in energy simulations and can be challenging to map to these prototypical buildings. In this study, we apply and cross-validate several machine learning algorithms to automate the mapping from general building descriptions to standardized building types, as defined by the U.S. Department of Energy (DOE), a key step in accurately estimating building energy profiles at scale. The XGBoost algorithm outperformed the others, achieving an F1 score, precision, and recall of 92.8%, 93.4%, and 93.0%, respectively. These results highlight the potential of advanced machine learning techniques to bridge the data gap in urban-scale energy modeling and suggest a path toward improving the resolution and accuracy of large building energy datasets.
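To make the reported metrics concrete, the following is a minimal sketch of how precision, recall, and F1 might be computed for a multiclass building-type mapping. It assumes support-weighted averaging across classes; the abstract does not state which averaging scheme was used. The function name `weighted_prf`, the label strings, and the toy predictions are hypothetical and are not taken from the study.

```python
from collections import Counter


def weighted_prf(y_true, y_pred):
    """Support-weighted precision, recall, and F1 for multiclass labels.

    Assumption: per-class scores are averaged with weights proportional
    to each class's share of the true labels.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    support = Counter(y_true)      # true instances per class
    predicted = Counter(y_pred)    # predicted instances per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    n = len(y_true)

    precision = recall = f1 = 0.0
    for label, count in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return precision, recall, f1


# Made-up labels for illustration only (not data from the study).
y_true = ["SmallOffice"] * 4 + ["Warehouse"] * 3 + ["StandaloneRetail"] * 3
y_pred = [
    "SmallOffice", "SmallOffice", "SmallOffice", "Warehouse",
    "Warehouse", "Warehouse", "Warehouse",
    "StandaloneRetail", "StandaloneRetail", "SmallOffice",
]

precision, recall, f1 = weighted_prf(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

On this toy data the weighted F1 (about 0.797) falls below both precision (0.825) and recall (0.800). The same ordering appears in the reported 92.8%, 93.4%, and 93.0%, which is possible under class-averaged metrics but does not by itself confirm which averaging the study used.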