Publications / 2022 Proceedings of the 39th ISARC, Bogotá, Colombia

Information Extraction from Text Documents for the Semantic Enrichment of Building Information Models of Bridges

Phillip Schönfelder, Tariq Al-Wesabi, Andreas Bach and Markus König
Pages 175-182 (2022 Proceedings of the 39th ISARC, Bogotá, Colombia, ISBN 978-952-69524-2-0, ISSN 2413-5844)

Most innovative approaches to the retrospective generation of building information models for existing structures deal with geometry extraction from point clouds or engineering drawings. However, many building-specific or object-specific attributes for the enrichment of building models cannot be inferred from these geometric and visual data sources, and thus their acquisition requires the analysis of textual building documentation. One such type of document is the structural bridge record, which includes specifications regarding the materials used, location, structural health, modifications, and administrative data. These documents are semi-structured and hardly allow robust information extraction based on traditional programming, since implementing such an approach would result in a complex nesting of conditional clauses that is not guaranteed to remain effective for future versions of the document structure. Therefore, a data-driven approach is adopted for the information extraction. This paper demonstrates an end-to-end semantic enrichment method, taking a bridge status report as input and feeding structured object parameters directly to the building information modeling software for the enrichment of the model. The proposed method requires little user interaction and achieves production-ready accuracy. It is tested on an as-built model of an actual bridge and shows promising results.

Keywords: Building information modeling; Information extraction; Semantic enrichment; Natural language processing; Named entity recognition; Machine learning