LinkedIn developed DataHub, an open-source metadata catalog platform that evolved through three generations. The first generation was a monolithic application, which was later split into a metadata service with an API. The third generation emphasizes a stream-based, real-time architecture and decentralization, enabling efficient, trustworthy metadata handling. DataHub supports multiple APIs, real-time metadata changes, and federated metadata services, which makes it reliable and well suited to large enterprises.

8m read time · From blog.det.life
Table of contents
The Metadata Models
The Metadata Store
Ingestion Framework
GraphQL API
User Interface
