Building a canonical data model (CDM) with an LLM requires careful calibration of context — too little causes hallucination, too much causes drift. Three approaches were tested: a 20-question Q&A method (too noisy), business scenario descriptions (scope creep across departments), and an intent-first minimal approach (company name + goal). The third approach proved most effective, bootstrapping a focused ontology from just three inputs. The key insight is that a clear, scoped use case produces a useful model, while vague or broad input produces an unfocused one. The CDM toolkit is part of the dltHub AI Workbench, integrating with Claude Code, Cursor, and Codex.
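The intent-first idea can be sketched in a few lines: instead of a long Q&A or a sprawling scenario document, the LLM is handed a small, explicitly scoped set of inputs. This is a hypothetical illustration, not the dltHub toolkit's actual API; the field names and prompt wording are assumptions.

```python
# Hypothetical sketch of the intent-first approach: bootstrap the CDM
# prompt from a few scoped inputs instead of a broad scenario dump.
# Field names are illustrative, not part of any real toolkit API.
from dataclasses import dataclass

@dataclass
class ModelIntent:
    company: str   # who the model is for
    goal: str      # the single use case the CDM should serve
    scope: str     # explicit boundary to prevent cross-department drift

def build_prompt(intent: ModelIntent) -> str:
    """Render the minimal inputs into a focused prompt for the LLM."""
    return (
        f"Company: {intent.company}\n"
        f"Goal: {intent.goal}\n"
        f"Scope: {intent.scope}\n"
        "Propose a canonical data model (entities, attributes, "
        "relationships) covering ONLY this scope."
    )

prompt = build_prompt(ModelIntent(
    company="Acme Logistics",
    goal="Track shipment lifecycle from order to delivery",
    scope="Operations only; exclude finance and HR",
))
print(prompt)
```

The explicit `scope` field mirrors the article's key insight: a clear, scoped use case is what keeps the resulting ontology focused.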
Table of contents
- Approach 1: Let the User Talk — The 20 Questions Method
- Approach 2: Business Scenarios — Right Instinct, Wrong Boundary
- Approach 3: Start with Intent
- The Takeaway