Best of Data Management 2025

  1.
    Article
    System Design Codex · 1y

    15 Must-Know Elements of System Design

    A well-designed system draws on a set of architectural elements to coordinate distributed components, scale, manage services, route network traffic efficiently, and store data. Key components include distributed message queues, caching, task schedulers, content delivery networks, consistent hashing, service discovery, DNS, load balancers, API gateways, databases, object storage, sharding, replication, and monitoring tools. These elements help improve performance, manage traffic, and keep the system resilient and fault tolerant.

  2.
    Article
    theburningmonk.com · 1y

    Bye bye schema coupling, hello semantic coupling

    The post introduces semantic coupling as a way to manage schema evolution in event-driven architectures, in place of traditional schema coupling. Instead of versioning or altering event schemas, producers attach semantic tags to fields, so consumers depend on the meaning of the data rather than its representation. This approach prevents breaking changes and removes the need for event versioning, which makes schema transitions smoother. Tools like Orbital and Taxi can facilitate these transformations by decoupling the data representation from consumers.
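
    As a rough illustration of the idea in the summary above, the sketch below has a consumer look up a field by its semantic tag rather than by its name. It is plain Python and does not use Orbital or Taxi; the `Field` class, the `read_by_meaning` helper, and the `acme.*` tag names are all hypothetical, chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str      # producer-chosen field name (the representation)
    semantic: str  # hypothetical semantic tag (the meaning)
    value: object


def read_by_meaning(event, semantic):
    """Return the value of the single field carrying the given semantic tag."""
    matches = [f.value for f in event if f.semantic == semantic]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one field tagged {semantic!r}")
    return matches[0]


# Two shapes of the same event: the producer renamed its fields,
# but the semantic tags stayed the same.
order_old = [
    Field("cust_id", "acme.CustomerId", "c-42"),
    Field("total", "acme.OrderTotal", 1999),
]
order_new = [
    Field("customerId", "acme.CustomerId", "c-42"),
    Field("amountCents", "acme.OrderTotal", 1999),
]

# The consumer only names the meaning it needs, so both shapes work.
for event in (order_old, order_new):
    customer = read_by_meaning(event, "acme.CustomerId")
```

    Under these assumptions, renaming a field is not a breaking change for the consumer, because the consumer never references the field name.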

  3.
    Article
    Community Picks · 1y

    BigDataBoutique/awesome-opensearch: A curated list of links and resources all about Opensearch. Maintained by the Opensearch experts at BigData Boutique (makers of Pulse for Opensearch)

    The 'awesome-opensearch' resource collection is maintained by BigData Boutique. It gathers links, tools, and articles related to OpenSearch, including official documentation, community forums, migration guides, and cost optimization tips. Contributions to the repository are encouraged, and guidelines are provided for adding content.

  4.
    Article
    Decube · 1y

    Introducing Decube's Public API

    Decube has released its Public API to streamline data governance workflows. The API facilitates bulk management of glossaries, manual lineages, and user groups, enhancing efficiency and scalability. It also ensures full accountability through secure audit logging. Upcoming features include data quality scores and monitor configuration, furthering Decube's mission to empower data teams.

  5.
    Article
    Micro SaaS Examples · 1y

    Bricks: The AI Spreadsheet That Simplifies Data Management

    Bricks is an AI-powered spreadsheet that simplifies data management tasks with natural language formulas, smart tables, and AI-powered charts and dashboards. It is easy to use, saves time by automating repetitive tasks, and offers a live presentation mode for seamless updates. Aimed at a broad range of users, Bricks lets them focus on insights and decision-making rather than technical complexity.

  6.
    Article
    thoughtbot · 45w

    The hard truth about soft deletion

    Soft deletion marks records as deleted without removing them from the database, preserving data for recovery and compliance. However, it introduces significant complexity including the need to consistently exclude deleted records from queries, handle dependent record deletion properly, update indexes and constraints, and manage potential conflicts during restoration. The author suggests evaluating alternatives like better backups, improved UI design, or accepting the tradeoffs based on specific business needs rather than implementing soft deletion by default.
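
    The complexity described above can be seen in a small sketch. It uses Python's standard `sqlite3` module with an in-memory database; the `users` table, the `deleted_at` column, and the helper names are assumptions made for this example and are not taken from the article. It shows three of the points from the summary: queries must filter out deleted rows, a uniqueness constraint has to be limited to live rows, and restoring a row can conflict with a row created in the meantime.

```python
import sqlite3


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (
            id INTEGER PRIMARY KEY,
            email TEXT NOT NULL,
            deleted_at TEXT  -- NULL means the row is live
        );
        -- Partial unique index: uniqueness is enforced for live rows only.
        CREATE UNIQUE INDEX users_live_email
            ON users (email) WHERE deleted_at IS NULL;
        """
    )
    return conn


def soft_delete(conn, user_id):
    # Mark the row as deleted instead of removing it.
    conn.execute(
        "UPDATE users SET deleted_at = datetime('now') "
        "WHERE id = ? AND deleted_at IS NULL",
        (user_id,),
    )


def live_emails(conn):
    # Every read has to remember this filter.
    rows = conn.execute(
        "SELECT email FROM users WHERE deleted_at IS NULL ORDER BY id"
    )
    return [row[0] for row in rows]


def restore(conn, user_id):
    # Restoring fails if a live row now holds the same email.
    try:
        conn.execute(
            "UPDATE users SET deleted_at = NULL WHERE id = ?", (user_id,)
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

    In this sketch, if a user is soft-deleted and a new account is then created with the same email, `restore` on the old row returns `False`, which is the kind of restoration conflict the summary mentions.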

  7.
    Article
    Decube · 1y

    S3 Tables with Apache Iceberg: Manage Data at Scale

    Discover how integrating S3 Tables with Apache Iceberg can enhance your data management strategy, providing reliable and scalable systems. Learn about key components like the Iceberg catalog and table, and understand the benefits of using Apache Iceberg with Amazon S3, including improved data scalability, reliability, and cost-efficiency. Explore best practices for managing large-scale deployments, optimizing resources, and ensuring secure data governance.

  8.
    Article
    Community Picks · 1y

    DuckDB Database File as a New Standard for Sharing Data?

    DuckDB offers a simplified approach to data sharing by encapsulating multiple tables into a single database file. This reduces compatibility issues and eliminates the need for packaging files into tar/zip archives. Tests showed that DuckDB handles numerical data more efficiently than PostgreSQL, while string data storage initially appeared less efficient but improved with larger datasets.