DataHub: The Metadata Platform Developed at LinkedIn
LinkedIn developed DataHub, an open-source metadata catalog platform whose architecture evolved through three generations. The first generation was a monolithic application, which the second generation split into a metadata service with an API. The third generation emphasizes a stream-based, real-time, decentralized architecture that handles metadata efficiently and keeps it trustworthy. DataHub supports multiple APIs, real-time metadata changes, and federated metadata services, which makes it reliable enough for large enterprises.
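
To make the stream-based idea concrete, here is a minimal Python sketch of the pattern: producers publish metadata changes to a log, and downstream consumers such as a search index build their state from that log. It is not DataHub's actual API. The names `MetadataChange`, `ChangeLog`, and `MetadataIndex`, and the `dataset:page_views` identifier, are hypothetical, and an in-memory list stands in for a durable event stream.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataChange:
    """One change to one aspect of one entity (hypothetical event shape)."""
    entity: str  # e.g. a dataset identifier
    aspect: str  # e.g. "owner" or "schema"
    value: str


class ChangeLog:
    """In-memory stand-in for a durable, ordered event stream."""

    def __init__(self):
        self._events = []
        self._subscribers = []

    def subscribe(self, handler):
        # Register a callable that receives every future event.
        self._subscribers.append(handler)

    def publish(self, event):
        # Append to the log first, then push to live subscribers.
        self._events.append(event)
        for handler in self._subscribers:
            handler(event)

    def replay(self, handler):
        # Feed the full history to a consumer, e.g. to rebuild an index.
        for event in self._events:
            handler(event)


class MetadataIndex:
    """Consumer that keeps only the latest value per (entity, aspect)."""

    def __init__(self):
        self.latest = defaultdict(dict)

    def apply(self, event):
        self.latest[event.entity][event.aspect] = event.value


log = ChangeLog()
index = MetadataIndex()
log.subscribe(index.apply)

# Two teams update the same aspect; the index reflects the most recent change.
log.publish(MetadataChange("dataset:page_views", "owner", "team-a"))
log.publish(MetadataChange("dataset:page_views", "owner", "team-b"))

# A consumer added later can catch up by replaying the log.
rebuilt = MetadataIndex()
log.replay(rebuilt.apply)
```

Because the log is the source of truth in this sketch, any number of independent consumers can subscribe to it or replay it without coordinating with the producers. That is one way to read the decentralization the third generation aims for.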
