An LLM-based Framework with Retrieval-Augmented Generation for Building Code Interpretation

Fan Yang; Yiru Hou; Jiansong Zhang

Abstract:

Building codes are inherently complex and challenging to understand because of their detailed and interrelated safety requirements, technical language, and regional variations. Traditional manual approaches, such as consulting building code experts, are effective but constrained by high labor costs and by frequent updates to the codes. The advent of large language models (LLMs) offers a promising way to automate the interpretation of building codes. However, LLMs are prone to hallucination and can produce unrelated or incorrect information. In this paper, the authors propose a method that combines LLMs with retrieval-augmented generation (RAG) techniques to develop a "building code expert" capable of accurately answering user queries about building codes. To validate the approach, a dataset of 150 records, each consisting of a query, an answer, and the relevant context, was extracted from Chapters 5 and 10 of the International Building Code 2015. Experimental results demonstrate that the proposed method outperforms the state-of-the-art question-answering framework. This research represents a notable step forward in leveraging artificial intelligence (AI) to improve accessibility, accuracy, and efficiency in understanding building codes, with potential applications such as compliance checking, automated design validation, and risk assessment.
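
As a rough illustration of the retrieve-then-generate pattern described above, the following Python sketch shows one way a record (query, answer, context) might be represented and how a retrieved passage could be placed into an LLM prompt. It is not the authors' implementation: the `Record` class, the bag-of-words cosine retriever, the prompt wording, and the placeholder passages are all assumptions made for illustration, and the LLM call itself is omitted.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical record layout: a query, its reference answer, and the supporting context."""
    query: str
    answer: str
    context: str


def tokenize(text: str) -> Counter:
    """Return lowercase bag-of-words counts for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, passages: list[str], k: int = 1) -> list[str]:
    """Rank passages by similarity to the query and keep the top k.

    Assumption: a simple lexical retriever stands in for whatever
    retriever a real RAG system would use (e.g. dense embeddings).
    """
    q = tokenize(query)
    ranked = sorted(passages, key=lambda p: cosine(q, tokenize(p)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, contexts: list[str]) -> str:
    """Assemble a prompt that grounds the LLM in the retrieved context."""
    joined = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Placeholder passages for illustration only; not quoted from any building code.
passages = [
    "Placeholder passage about allowable building height and number of stories.",
    "Placeholder passage about minimum width of exit stairways.",
    "Placeholder passage about fire sprinkler system installation.",
]
query = "What is the minimum width of exit stairways?"
top = retrieve(query, passages, k=1)
prompt = build_prompt(query, top)  # this string would be sent to an LLM
```

Grounding the prompt in retrieved code text, rather than relying on the model's parametric memory alone, is the mechanism by which RAG is intended to reduce hallucinated answers.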