Dispatches from the Road with Richard Zhang


We caught up with Richard Zhang, Augmenta's VP of AI and R&D, after a few weeks on the road presenting at 3DV and CDFAM — two conferences where the hardest problems in spatial AI and computational design are being worked out. Here's what he brought back.

You just got back from presenting at 3DV and CDFAM. These aren't construction conferences. Who's in the room at events like these, and what's the broader conversation around 3D generative AI right now?

3DV and CDFAM are very different conferences: the former is an academic gathering and the latter an industry-focused event. I gave my 3DV talk at the Area Chair Workshop, which hosted leading computer vision researchers. Later, I was also part of a panel discussion on Foundation Models involving thought leaders in the field. At CDFAM, which was single-track, the audience was predominantly from industry, with many start-up companies present. Most conversations surrounding 3D Generative AI focused on the data challenge, surrogate models, finding the right data model or representations, as well as the contrast between generic and bespoke Foundation Models. Interestingly, the question of "Is 3D even needed?" kept coming up.

What is the research community focused on that industry hasn't caught up to yet? Where is the leading edge of this work headed?

To me, the most obvious contrast is that most companies are fully content to be mere consumers of the latest and shiniest AI models. Obviously, they are not under the burden of advancing the state of the art. The research community is dutifully taking on the challenge of identifying the weaknesses of the existing models and finding innovative ways to improve them. A leading thesis is that contemporary large models lack the basic spatial and physical precision or intelligence needed to solve real problems in the physical world, and that this is an issue scaling alone will not resolve. There has been much talk about developing bespoke (i.e., specialist) models.

When you look at how other industries have gone through design automation — manufacturing, aerospace, automotive — what does that tell you about where construction is headed, and roughly how fast?

I think the construction industry is at a crossroads. By scale, construction dwarfs the aerospace and automotive sectors, even when combined, and still outsizes manufacturing, yet it remains the laggard in AI adoption. Our industry also faces a unique challenge: the cost of design automation does not amortize over high volumes. As my CEO, Francesco Iorio, puts it: you can sell a million copies of an iPhone design, but no two buildings are ever meant to be identical. Complexities such as this should not be a deterrent to AI adoption, but a catalyst. AI must accelerate in construction to unlock productivity, sustainability, and waste reduction. All signals I can gather suggest that 2026 will represent an inflection point: the year we shift from merely consuming AI to fundamentally innovating with it.

You've spent time as an Amazon Scholar and have deep academic roots in computer graphics and spatial AI. What pulled you toward construction? Most researchers working in this space aren't thinking about conduit routing.

Honestly, I did not come to Augmenta with a full appreciation of the difficulty and relevance of the MEP (mechanical, electrical, and plumbing) problem, especially when all I could think of about plumbing was that our plumber only ever came to our house to fix what was under the sink or behind the toilet. But the research I had been doing in the years prior, especially in spatial AI and looking forward to functional AI, made me gravitate strongly toward applying AI to the physical domain. At the same time, I was also increasingly aware that in terms of data scale and model capabilities, the challenges at the scene (indoor room to building) level are significantly greater than those at the object level. Architecture and construction clearly offer the greatest challenge of all. Researchers like myself always welcome a challenge.

Your talk framed AI for the built environment around "functional" design, not just generative design. Can you explain the difference, and why it matters for actually delivering a building?

Being "generative" is the process, while being "functional" is the goal. Why would you want to produce a 3D design? Unlike image or video generation, it is not only for one's viewing pleasure! A 3D design, such as one of a building, is meant to be used or physically constructed, which again leads back to its use in the real world. We want to automate the design of a fully engineered 3D building that is ready for construction.

There's a well-known gap between what AI can do in a research lab and what actually holds up on a live project. Where does that gap most often break down, and what does it take to close it?

The gap is not specific to AI models. In research, a good outcome typically ends up in a high-quality publication, where success is measured by whether it outperforms state-of-the-art baselines, not by whether it delivers a product to real customers. What I learned from my time at Amazon was that an algorithm achieving 90% accuracy on a well-known benchmark is not the final answer. Every customer-facing result must be rigorously QA-ed, and the error tolerance is practically zero. That was when I developed not only an appreciation for, but a strong gravitation toward, human-centred workflows such as active learning. The stakes differ by domain, of course. In e-commerce, a misinformed customer is unhappy; in construction, errors are far less tolerable, as they can lead to serious safety consequences and significant financial loss. This is why I believe AI models in construction should not seek full automation. They must work alongside designers in an iterative workflow, with fast surrogate models providing meaningful feedback, while continuously learning from every correction and decision made along the way.

What's the hardest technical problem in automating design for the built environment that most people outside the research field don't fully appreciate?

MEP, for sure. To my knowledge, most research in AI and visual computing has focused on the architectural (e.g., facade modeling) and structural (e.g., floorplans) elements of buildings. MEP, which turns out to be the hardest problem in practice in terms of coordination and time consumption, is almost never touched.

AI skepticism is real in construction. You hear it from VDC teams, detailers, project managers. What's your honest response when someone says the output won't meet their project requirements?

AI is here to stay, and in 2026, it is already becoming a way of life. While AI models may never fully automate your construction projects to meet all of your requirements, they will become indispensable tools to reduce project completion time, cost, and waste, while improving sustainability and design quality. In a competitive market, clients will always gravitate toward contractors who can demonstrate these advantages. The choice is not whether AI will reshape construction, but whether you will be among those who shape how it does. I believe that every company will adopt AI, and those that do not embrace it now will find themselves falling behind, not gradually, but decisively.

Non-residential construction — commercial, institutional, healthcare, mission critical — is where Augmenta is focused. Why is that the hard problem, and what makes it different from other building types?

Non-residential buildings are larger, more complex, and more varied in function than their residential counterparts. They do not follow cookie-cutter design patterns, so the cost of designing each one cannot be amortized over high volumes the way a repeated residential floor plan can. Every hospital, data center, or school is essentially bespoke, with different site conditions, programmatic requirements, regulatory constraints, and MEP configurations. This is precisely where the economic case for automation is strongest, since there is no other lever to pull on design cost.

What does a "foundation model for construction" actually mean in practice? For a contractor focused on their next job, why should they care?

At a high level, you can think of such a foundation model as a ChatGPT or Claude that answers questions and completes tasks specific to construction. It will be a bespoke, or specialist, model, not a generic one. The most critical difference, though, for the foundation model that I want to build, is that it goes far beyond question answering. My model must be able to create 3D designs, reason about them, and alter them on demand for a contractor. All the construction knowledge, in textual form, must be well aligned with its spatial manifestations.

After being on the road and talking to people at the frontier of this work, what are you most focused on bringing back to what Augmenta is building next?

A stronger conviction that we are on the right path, starting with addressing the data scarcity and access problem through generative design. At the same time, we must work on spatial alignment with current LLMs.

If you had to give one piece of advice to a BIM or VDC manager sorting through the AI tools landing in their inbox every week, what would you tell them to pay attention to — and what would you tell them to ignore?

My honest advice: ignore tools that merely wrap existing LLMs in a construction-themed interface. If a tool's core capability is something you could approximate yourself with a few API calls and some prompt engineering, it is not worth your money. The real challenge in construction is spatial and functional AI: the ability to reason in 3D, generate valid and constructible designs, and edit them intelligently. That capability cannot be conjured from a general-purpose language model. So when evaluating any new tool, ask one simple question: can it actually work in 3D space, or does it only talk about it? The answer will tell you everything.

Author
Augmenta

Augmenta is building the foundation model for construction, powered by a frontier spatial AI engine purpose-built for the physical world. It solves what general-purpose AI cannot touch: generating precise, code-compliant, physically valid building designs composed of millions of components across multiple engineering disciplines in continuous 3D space. Augmenta’s commercially validated AI currently automates electrical system design, the hardest discipline, at data center scale. Founded by pioneers of Generative Design at Autodesk, Augmenta is based in Toronto, Canada.
