Building a canonical data model (CDM) with an LLM requires careful calibration of context — too little causes hallucination, too much causes drift. Three approaches were tested: a 20-question Q&A method (too noisy), business scenario descriptions (scope creep across departments), and an intent-first minimal approach (company name + goal). The third approach proved most effective, bootstrapping a focused ontology from just three inputs. The key insight is that a clear, scoped use case produces a useful model, while vague or broad input produces an unfocused one. The CDM toolkit is part of the dltHub AI Workbench, integrating with Claude Code, Cursor, and Codex.
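The intent-first idea can be sketched in a few lines: instead of a long Q&A or a sprawling scenario document, the LLM is handed a small, explicitly scoped set of inputs. This is a hypothetical illustration, not the dltHub toolkit's actual API; the field names and prompt wording are assumptions.

```python
# Hypothetical sketch of the intent-first approach: bootstrap the CDM
# prompt from a few scoped inputs instead of a broad scenario dump.
# Field names are illustrative, not part of any real toolkit API.
from dataclasses import dataclass

@dataclass
class ModelIntent:
    company: str   # who the model is for
    goal: str      # the single use case the CDM should serve
    scope: str     # explicit boundary to prevent cross-department drift

def build_prompt(intent: ModelIntent) -> str:
    """Render the minimal inputs into a focused prompt for the LLM."""
    return (
        f"Company: {intent.company}\n"
        f"Goal: {intent.goal}\n"
        f"Scope: {intent.scope}\n"
        "Propose a canonical data model (entities, attributes, "
        "relationships) covering ONLY this scope."
    )

prompt = build_prompt(ModelIntent(
    company="Acme Logistics",
    goal="Track shipment lifecycle from order to delivery",
    scope="Operations only; exclude finance and HR",
))
print(prompt)
```

The explicit `scope` field mirrors the article's key insight: a clear, scoped use case is what keeps the resulting ontology focused.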
Table of contents
- Approach 1: Let the User Talk — The 20 Questions Method
- Approach 2: Business Scenarios — Right Instinct, Wrong Boundary
- Approach 3: Start with Intent
- The Takeaway