More details, including all training code and model weights, will be released upon acceptance of the paper. Thank you for your interest!
Abstract
The adoption of artificial intelligence in the Architecture, Engineering, and Construction (AEC) domain is hindered by a reliance on task-specific models that fail to generalize across diverse geometric applications. To address this limitation, this paper introduces a point cloud-based foundation model for 3D Building Information Modeling (BIM) geometry, pre-trained via a Latent-Euclidean Joint Embedding Predictive Architecture on individual BIM objects. By enforcing predictive consistency between global object context and local topological details within a regularized latent space, the proposed model extracts robust semantic features while suppressing low-level geometric noise. Extensive evaluations demonstrate that the learned representations generalize across multiple downstream tasks, achieving competitive performance in standard and fine-grained object classification, semantic segmentation via transfer learning, in-distribution part segmentation on BIM objects and out-of-distribution part segmentation on computer-aided design (CAD) objects, and zero-shot tasks including shape retrieval and anomaly detection. These results establish the model as a foundation for diverse applications involving 3D BIM geometry.
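To illustrate how zero-shot shape retrieval can work on top of frozen embeddings, the following is a minimal sketch, not the paper's implementation. It assumes each BIM object has already been encoded into a fixed-length vector and ranks gallery objects by cosine similarity to a query. The function names, the three-dimensional toy embeddings, and the object labels are all hypothetical; the actual encoder, embedding size, and similarity measure used in the paper are not specified here.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(
    query: Sequence[float],
    gallery: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k gallery entries most similar to the query embedding."""
    scored = [(name, cosine_similarity(query, emb)) for name, emb in gallery.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical embeddings standing in for the output of a pre-trained encoder.
gallery = {
    "door": [0.9, 0.1, 0.0],
    "window": [0.7, 0.7, 0.1],
    "pipe": [0.0, 0.2, 0.95],
}
query = [1.0, 0.2, 0.0]

ranked = retrieve(query, gallery, top_k=2)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Under the same assumption, a simple anomaly score could be derived from the similarity to the nearest gallery neighbour, with low similarity flagging an unusual object. Whether the paper uses this criterion is not stated in the abstract.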