Best of Observability — 2024

  1.
    Article
    Foojay.io·2y

    Be a Better Java Developer: 19 Tips for Staying Ahead in 2024

    Tips for becoming a better Java developer, including upgrading Java versions, learning Kotlin, exploring other languages/frameworks, and staying updated with the Java community.

  2.
    Video
    Community Picks·1y

    12 Logging BEST Practices in 12 minutes

    Effective logging is crucial for troubleshooting and maintaining system health. Key practices include having a clear plan, understanding log levels (info, warning, error, fatal), using structured logging, capturing detailed log entries, implementing log sampling to reduce storage costs, using canonical log lines, centralizing logs, setting retention policies, securing sensitive data, and choosing efficient logging libraries to minimize performance impact. Additionally, metrics should be used alongside logs for real-time monitoring.
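
Two of the practices named above, structured logging and log sampling, can be sketched with Python's standard `logging` module. This is a minimal illustration rather than anything taken from the video: the class names, the `fields` convention for extra data, and the rule of sampling only records below WARNING are all assumptions.

```python
import json
import logging
import random
from typing import Optional


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line (structured logging)."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Assumed convention: callers pass extra={"fields": {...}}.
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry, sort_keys=True)


class SamplingFilter(logging.Filter):
    """Keep every WARNING-and-above record, but only a fraction of the rest."""

    def __init__(self, rate: float, rng: Optional[random.Random] = None) -> None:
        super().__init__()
        self.rate = rate
        self.rng = rng or random.Random()

    def filter(self, record: logging.LogRecord) -> bool:
        if record.levelno >= logging.WARNING:
            return True
        return self.rng.random() < self.rate


def build_logger(name: str, rate: float) -> logging.Logger:
    """Wire the formatter and the sampling filter onto a stderr handler."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    handler.addFilter(SamplingFilter(rate))
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers = [handler]
    logger.propagate = False
    return logger
```

With `build_logger("checkout", 0.1)`, roughly one in ten INFO records would be emitted while every warning and error is kept, which is one way to trade storage cost against detail.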

  3.
    Article
    swizec.com·2y

    90% of performance is data access patterns

    Optimizing your data access patterns can significantly enhance application performance. A real-world example illustrates how identifying and removing an outdated session middleware fetching patient appointment data reduced CPU usage by 66% and halved the latency of the slowest API requests. This case underscores the evolving nature of systems and the critical role of observability in catching inefficiencies.

  4.
    Article
    Community Picks·2y

    Become a Better Java Developer: 19 Tips for Staying Ahead in 2024

    Tips for becoming a better Java developer, including upgrading Java versions, learning Kotlin, exploring other languages/frameworks, understanding Loom and Structured Concurrency, getting coverage from Oracle, learning Groovy and Scala, practicing Continuous Feedback, using Ktor, building side projects, focusing on observability, staying connected with the Java community, reading professional developer blogs, following influencers on social media, and signing up for a Java articles reading list.

  5.
    Article
    Quastor Daily·2y

    How Booking.com Processes Millions of Events Every Second

    Booking.com relies on events as the foundation for their observability system, using them to capture detailed, contextual information. They handle tens of millions of events every second and are looking to migrate to OpenTelemetry for its language support, extensibility, and wide vendor support.

  6.
    Article
    Community Picks·2y

    The State of Data Engineering 2024

    The 2024 State of Data Engineering report discusses the influence of GenAI on software infrastructure, the expansion of product offerings due to the economic downturn, and the impact of open table formats and their catalogs in the data lake industry. It also highlights the importance of data version control and observability in AI/ML systems.

  7.
    Article
    gitconnected·2y

    Why Use GoFr for Golang Backend?

    GoFr is a Golang framework designed for accelerated microservice development. It offers built-in observability tools such as health-check and heartbeat URLs, metrics, and structured logging. GoFr supports multiple data sources including MySQL, PostgreSQL, and MQTT, and simplifies REST API design, logging, metrics, tracing, and CORS configuration. Its compatibility with Kubernetes and minimal code for route registration enable developers to focus on business logic while ensuring the application is production-ready.

  8.
    Article
    Bits and Pieces·2y

    10 Challenges In Implementing Microservices

    Implementing microservices is challenging, but the common problems have known solutions. Domain-Driven Design (DDD) and Event-Driven Architecture (EDA) can help manage complexity. Proper service discovery and communication mechanisms are important for large-scale applications. Data management and consistency can be addressed through strategies like CQRS and the Saga pattern. Deployment and DevOps automation can streamline the process. Monitoring and observability are essential for performance insights. Service resilience and fault tolerance can be achieved through circuit breakers and health checks. Security measures like authentication, secure communication, input validation, and data encryption are crucial. Effective team organization and communication are necessary for collaboration. Versioning and compatibility can be managed using semantic versioning and API versioning. Scalability can be achieved through horizontal scaling and container orchestration.
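
The circuit breaker mentioned above can be sketched in a few lines of Python. This is an illustrative outline, not the article's code: the class name, the `max_failures` and `reset_after` parameters, and the single trial call after the cool-down are assumptions chosen to keep the example small.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Stop calling a failing dependency until a cool-down has elapsed."""

    def __init__(
        self,
        max_failures: int,
        reset_after: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after  # cool-down in seconds
        self.clock = clock              # injectable so tests control time
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: call skipped")
            # Cool-down elapsed: allow one trial call (half-open state).
            # A single failure now reopens the circuit immediately.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        return result
```

Wrapping each outbound request as `breaker.call(fetch, url)` (with `fetch` standing in for any client function) lets callers fail fast while the downstream service recovers, instead of piling up timeouts.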

  9.
    Article
    Awesome Java Newsletter·2y

    Structured logging in Spring Boot 3.4

    Structured logging in Spring Boot 3.4 allows logs to be written in well-defined, machine-readable formats such as JSON. This enables powerful search and analytics capabilities. It supports the Elastic Common Schema (ECS) and Logstash formats and allows for custom formats. Developers can add additional fields to logs for better filtering and analysis. Logs can be output to the console or written to a file for different use cases.

  10.
    Article
    Community Picks·1y

    5 Tips for Structured Logging in Spring Boot 3.4

    Spring Boot 3.4 introduces native support for structured logging using formats like Elastic Common Schema (ECS) and Logstash. This reduces dependency management and enhances observability in applications by capturing logs in a structured manner. The post emphasizes the importance of logs in understanding application behavior, particularly in distributed systems, and highlights best practices for logging. It also underscores the role of tools like OpenTelemetry and Digma in achieving comprehensive observability.

  11.
    Article
    Last9·2y

    Golang Logging: A Comprehensive Guide for Developers

    Logging in Go is essential for debugging and maintaining application performance. While the standard log package offers basic functionality, third-party libraries like Zerolog and Zap provide advanced features like structured logging and configurable log levels. Implementing best practices, such as avoiding sensitive data in logs and using context-rich messages, can significantly enhance log analysis and troubleshooting. Integrating with observability platforms like ELK can further improve monitoring capabilities in production environments.

  12.
    Article
    Quastor Daily·2y

    How Booking.com Processes Millions of Events Every Second

    Booking.com processes millions of events every second using their observability system based on key-value pairs. They rely on events to capture detailed information and generate metrics, logs, and traces. The challenges they face include maintaining proprietary event libraries and custom conversion code.

  13.
    Article
    Community Picks·2y

    How Meta Achieves 99.99999999% Cache Consistency 🎯

    Meta has developed a system to achieve 99.99999999% cache consistency, essential for scaling distributed systems. Their observability solution uses Polaris to monitor and detect cache inconsistencies, and a tracing library to log data changes during race conditions. This approach allows querying the database at controlled intervals to prevent overload and find inconsistencies quickly. With these techniques, only 1 out of 10 billion cache writes becomes inconsistent.
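
The two figures in this summary describe the same rate, which a short exact-arithmetic check makes explicit. The function name is hypothetical, and `Fraction` is used only to avoid floating-point rounding at ten decimal places.

```python
from fractions import Fraction


def inconsistent_writes(total_writes: int, consistency_percent: str) -> Fraction:
    """Expected number of inconsistent writes at a given consistency percentage."""
    failure_rate = 1 - Fraction(consistency_percent) / 100
    return total_writes * failure_rate


# 99.99999999% consistency over 10 billion writes leaves exactly one bad write.
print(inconsistent_writes(10_000_000_000, "99.99999999"))
```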

  14.
    Article
    The New Stack·2y

    What Is OpenTelemetry? The Ultimate Guide

    OpenTelemetry is a vendor-neutral, standardized approach to capturing observability data like metrics, logs, and traces. It allows integration of various observability tools into a unified system, providing a consistent framework for capturing and analyzing telemetry data. Key components include the API/SDKs, automatic instrumentation agents, and the widely used OpenTelemetry Collector. Fundamental to observability are the golden signals: latency, traffic, errors, and saturation, which are essential for effective system performance analysis. OpenTelemetry's future development aims to enhance client-side instrumentation and extend its capabilities across diverse platforms.

  15.
    Article
    Netflix TechBlog·1y

    Title Launch Observability at Netflix Scale

    Netflix manages over a thousand global content launches each month and faces significant challenges in ensuring the success and discoverability of each title. This post discusses the operational demands of a personalization system, highlighting the need for scalable solutions to automate operations. Two primary options are explored: log processing and observability endpoints, each with its benefits and tradeoffs. Real-time monitoring and proactive issue detection are key strategies in enhancing Netflix's ability to manage title launches effectively.

  16.
    Article
    Medium·2y

    Why use GoFr for Golang Backend?

    GoFr is an opinionated Go Framework for accelerated microservice development. It provides features like REST API principles, logging, metrics and tracing, CORS configuration, dynamic log level, Kubernetes compatibility, database integration, and IoT integration. GoFr simplifies API design and allows for easy monitoring and observability of applications.

  17.
    Article
    Community Picks·2y

    Kotlin & Modern Java: The 17 Differences

    This post explores 17 differences between Kotlin and modern Java, including functional programming, null safety, observability, syntax, checked exceptions, coroutines support, data classes, type inference, extension functions, smart casts, constructors, ternary operator, primitive types, string templates, operator overloading, wildcard types vs. declaration-site variance, and cross-platform and multiplatform development.

  18.
    Article
    Community Picks·2y

    How to profile a performance issue using Spring Boot profiling tools

    This post explores the importance of profiling performance issues in Spring Boot applications and highlights the scenarios where profiling is crucial. It also discusses the built-in monitoring and observability tools in Spring Boot and introduces Digma as a tool for early detection and profiling of performance issues.

  19.
    Article
    .NET Blog·2y

    Introducing ASP.NET Core metrics and Grafana dashboards in .NET 8

    ASP.NET Core in .NET 8 introduces metrics and Grafana dashboards. Metrics can be used to monitor app activity and can be displayed on a dashboard or trigger real-time alerts. There are options available for using these metrics, including the .NET Aspire dashboard and ASP.NET Core Grafana dashboards.

  20.
    Article
    gitconnected·2y

    Choosing the Right Go Framework: GoFr vs Fiber

    This post compares two popular Go web frameworks, GoFr and Fiber, by implementing a basic API using both frameworks and evaluating their performance and usability. GoFr simplifies database management and exhibits higher performance compared to Fiber. It offers a robust choice for developers looking for a framework that balances performance with functionality.

  21.
    Article
    Community Picks·2y

    11 Non-AI Java Trends: From GraalVM to Spring Modulith

    Java developers are showing interest in several non-AI trends. These include GraalVM for improving runtime efficiency, Data-Oriented Programming (DOP) for treating data as first-class citizens, the debate between Quarkus and Spring Boot, in-depth details of Spring Internals, Kotlin integration with Spring Boot, and security improvements with Spring Security 3 and OAuth2. Additionally, discussions focus on the benefits of Spring Data JDBC, alternatives to Thymeleaf templates, the evolving Spring Modulith architecture, enhanced runtime efficiency with virtual threads and CRaC in Java 21, and observability improvements introduced in Spring Boot 3.2 and 3.3.

  22.
    Article
    Last9·2y

    Top Observability Best Practices for Microservices in 2024

    Microservices architecture provides benefits such as agility, scalability, and flexibility but also introduces complexity that can be managed with the right observability tools. Key best practices, such as standardization, data retention, anomaly detection, root cause analysis, and ongoing optimization, are crucial for effective monitoring and management. Challenges include handling data volume, latency in data processing, skillset requirements, and cost management. Advanced techniques like distributed context propagation and intelligent sampling in tracing, centralized error management, and optimizing telemetry data can enhance observability and system reliability.

  23.
    Article
    Last9·1y

    Python Logging with Structlog: A Comprehensive Guide

    Enhance your Python logging with structlog, a library that creates structured, readable, and machine-friendly logs. structlog helps in preserving context, improving readability and analysis, and providing customizable pipelines for log processing. It integrates easily with existing frameworks like Python's built-in logging module and supports high-throughput systems with features like asynchronous and buffered logging. structlog also works well with microservices architectures and observability tools, ensuring your logs are actionable and insightful.

  24.
    Article
    Grafana Labs·1y

    Prometheus 3.0 and OpenTelemetry: a practical guide to storing and querying OTel data

    Prometheus 3.0 aims to improve integration with OpenTelemetry by addressing challenges such as resource attributes, UTF-8 support, and temporalities. The Prometheus 3.0 release includes features like promoting resource attributes to metric labels, a new `info` PromQL function, and stable OTLP support for easier data ingestion and querying. Users can also utilize the delta-to-cumulative processor in the OTel Collector for better data handling. Future developments will focus on enhancing interoperability and scalability.
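
The temporality issue comes down to how a counter is reported: delta temporality sends the change since the last report, while Prometheus counters are cumulative running totals. The sketch below shows only that idea on a plain list of numbers. It is not the Collector processor's implementation, which also has to handle per-series state, resets, and out-of-order data; the function names are hypothetical.

```python
from itertools import accumulate
from typing import List


def delta_to_cumulative(deltas: List[float]) -> List[float]:
    """Turn per-interval increments into a running total."""
    return list(accumulate(deltas))


def cumulative_to_delta(totals: List[float]) -> List[float]:
    """Inverse view: recover per-interval increments from running totals."""
    previous = [0.0] + list(totals[:-1])
    return [current - prev for current, prev in zip(totals, previous)]
```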

  25.
    Article
    Zalando·2y

    Node.js and the tale of worker threads

    A Node.js service faced performance issues due to improper handling of worker threads, which caused high resource consumption and server instability within a Kubernetes environment. Spawning multiple workers per CPU core instead of per allocated resource, and aggressively restarting them on errors, created a positive feedback loop that overwhelmed both the campaign and translation services. Investigation revealed that limiting worker threads and allocating resources properly could resolve the issue, highlighting the importance of optimized worker management and enhanced observability in production environments.
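
The sizing mistake described here, counting host cores rather than the CPU actually allocated to the container, can be illustrated with a small helper. This is a sketch in Python rather than the article's Node.js, and it is not Zalando's fix: the function name and parameters are hypothetical, and it assumes the container's limit is expressed as a Linux CFS quota and period in microseconds.

```python
from typing import Optional


def worker_count(
    host_cores: int,
    cpu_quota_us: Optional[int],
    cpu_period_us: int = 100_000,
) -> int:
    """Pick a worker count from the CPU limit, not from the host's core count."""
    if cpu_quota_us is None or cpu_quota_us <= 0:
        # No limit configured: fall back to the cores the host reports.
        return max(1, host_cores)
    # Whole CPUs granted by the quota, e.g. 200_000 / 100_000 = 2.
    allocated = cpu_quota_us // cpu_period_us
    # Never exceed the host, and always keep at least one worker.
    return max(1, min(host_cores, allocated))
```

On a 64-core node with a limit of two CPUs, `worker_count(64, 200_000)` gives 2 workers instead of 64, which avoids oversubscribing the container.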