Publications / 2024 Proceedings of the 41st ISARC, Lille, France

Evaluation of Mapping Computer Vision Segmentation from Reality Capture to Schedule Activities for Construction Monitoring in the Absence of Detailed BIM

Juan D. Nunez-Morales, Yoonhwa Jung, Mani Golparvar-Fard
Pages 847-854 (2024 Proceedings of the 41st ISARC, Lille, France, ISBN 978-0-6458322-1-1, ISSN 2413-5844)
Abstract:

Over the past few years, research has focused on leveraging computer vision for construction progress monitoring, particularly on comparing construction photologs against Building Information Modeling (BIM), with or without schedule data. The practical application of these techniques, together with the large number of startups offering hybrid AI and human-in-the-loop progress-monitoring services, has revealed several gaps: 1) current BIM-driven projects do not have model disciplines at the right level of maturity and Level of Development; 2) definitions of the states of work in progress that are detectable from images are not formalized; 3) poor schedule quality and a lack of frequent progress updates challenge the incorporation of detailed 4D BIM for progress tracking. This work addresses these gaps by exploring the requirements for mapping modern computer vision techniques for object segmentation to construction schedule activities, thereby automating progress-monitoring applications without BIM as a baseline. The approach utilizes reality-mapping practices to offer time machines for construction progress, organizing photologs over space and time. Additionally, this work shows how Large Language Models can structure schedule activity descriptions as <Uniformat Object Classification, Location> pairs, focusing on how vision and language models can be trained separately with limited annotated data. The ASTM Uniformat classification is utilized to map triangulated object segments from images to color-coded 3D point clouds aligned with schedule activities, without the need for image-language feature alignment. Exemplary results from tying new transformer-based models with few-shot learning are shown, and the requirements for full-scale implementation are discussed.

Keywords: Automated Progress Monitoring, Artificial Intelligence, Computer Vision, Natural Language Processing
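The abstract's idea of structuring a free-text schedule activity description into a <Uniformat Object Classification, Location> pair can be sketched as follows. This is an illustrative toy, not the authors' implementation: the paper trains a language model with limited annotated data, whereas here a hypothetical keyword-to-code table and a simple location pattern stand in for that model. The Uniformat codes shown follow ASTM Uniformat II (e.g., B1010 Floor Construction).

```python
import re

# Hypothetical keyword-to-Uniformat II table (stand-in for a trained language model).
UNIFORMAT_KEYWORDS = {
    "footing": "A1010",   # Standard Foundations
    "slab": "B1010",      # Floor Construction
    "roof deck": "B1020", # Roof Construction
    "drywall": "C1010",   # Partitions
}

def structure_activity(description: str):
    """Map an activity description to a (Uniformat code, location) pair, if possible."""
    text = description.lower()
    # Hypothetical location pattern: "Level 2", "Zone B", "Floor 3", etc.
    loc = re.search(r"(level|zone|area|floor)\s+\w+", text)
    for keyword, code in UNIFORMAT_KEYWORDS.items():
        if keyword in text:
            return code, loc.group(0).title() if loc else "Unspecified"
    return None  # no confident mapping; a trained model would generalize further

print(structure_activity("Pour concrete slab - Level 2"))  # → ('B1010', 'Level 2')
```

Each resulting pair could then color-code the triangulated object segments in the 3D point cloud by activity, as the abstract describes, without aligning image and language features.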