Best of Observability — July 2024

1
Article
swizec.com·2y
90% of performance is data access patterns
Optimizing your data access patterns can significantly enhance application performance. A real-world example illustrates how identifying and removing an outdated session middleware fetching patient appointment data reduced CPU usage by 66% and halved the latency of the slowest API requests. This case underscores the evolving nature of systems and the critical role of observability in catching inefficiencies.
214
2
2
Article
Community Picks·2y
How Meta Achieves 99.99999999% Cache Consistency 🎯
Meta has developed a system to achieve 99.99999999% cache consistency, essential for scaling distributed systems. They use an observability solution featuring Polaris to monitor and detect cache inconsistencies and a tracing library to log data changes during race conditions. This approach allows querying the database at controlled intervals to prevent overload and find inconsistencies quickly. These techniques ensure that only 1 in 10 billion cache writes becomes inconsistent.
62
2
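The arithmetic behind the ten-nines figure in the Meta summary above can be checked with a short sketch. The function name is hypothetical and is not taken from the article; exact fractions are used so that floating-point rounding does not blur the result.

```python
from fractions import Fraction


def inconsistent_write_ratio(consistency_percent: str) -> Fraction:
    """Return the fraction of cache writes allowed to be inconsistent.

    The percentage is passed as a string so Fraction can parse it exactly.
    """
    return 1 - Fraction(consistency_percent) / 100


ratio = inconsistent_write_ratio("99.99999999")
# Invert the ratio to get "one inconsistent write per N writes".
writes_per_inconsistency = 1 / ratio
print(writes_per_inconsistency)  # 10000000000
```

Ten nines leave a budget of one inconsistent write per 10,000,000,000 writes, which matches the "1 in 10 billion" figure in the summary.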
3
Article
Zalando·2y
Node.js and the tale of worker threads
A Node.js service faced performance issues due to improper handling of worker threads, which caused high resource consumption and server instability within a Kubernetes environment. Spawning multiple workers per CPU core instead of per allocated resource, and aggressively restarting them on errors, created a positive feedback loop that overwhelmed both the campaign and translation services. Investigation revealed that limiting worker threads and allocating resources properly could resolve the issue, highlighting the importance of optimized worker management and enhanced observability in production environments.
29
3
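The worker-sizing problem in the Zalando summary above can be illustrated with a minimal sketch. It is written in Python for consistency with the other examples here, not in the Node.js the article discusses, and every name, default, and the backoff policy is an assumption made for illustration, not the fix the article describes.

```python
import math
from typing import Optional


def worker_count(cpu_limit: Optional[float], host_cores: int, max_workers: int = 8) -> int:
    """Size the pool from the container's CPU allocation, not the host core count.

    cpu_limit is the CPU allocation of the container (for example 2.0 CPUs),
    or None when no limit is set. max_workers is a hypothetical hard cap.
    """
    budget = cpu_limit if cpu_limit is not None else host_cores
    # Always keep at least one worker, never exceed the cap.
    return max(1, min(max_workers, math.floor(budget)))


def restart_delay_seconds(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff so a crashing worker is not restarted in a tight loop."""
    return min(cap, base * (2 ** attempt))


# A pod limited to 2 CPUs on a 64-core node gets 2 workers, not 64.
print(worker_count(cpu_limit=2.0, host_cores=64))  # 2
print(restart_delay_seconds(3))  # 4.0
```

Capping the pool by the allocated CPU and spacing out restarts removes the two ingredients of the feedback loop named in the summary: too many workers for the available resources and immediate restarts on error.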
4
Article
Community Picks·2y
What are the Four Golden Signals and Why Do They Matter?
The Four Golden Signals – latency, traffic, errors, and saturation – are essential insights for understanding the health and performance of modern cloud applications. Originating from Site Reliability Engineering (SRE) practices at Google, these signals help interpret complex observability data, allowing SREs to improve system reliability, perform root cause analysis, optimize performance, and enhance user experience. Various monitoring tools, including infrastructure monitoring, APM, and synthetic monitoring, are utilized to measure these signals effectively. The methodology contrasts with other frameworks like RED and USE, offering a comprehensive approach to performance management.
18
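As a rough illustration of the Four Golden Signals described above, the sketch below derives all four from a window of request records. The record shape, the nearest-rank p95, the choice of HTTP 5xx as "error", and saturation as in-flight requests over capacity are all assumptions for this example, not definitions from the article.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Request:
    duration_ms: float
    status: int  # HTTP status code


def golden_signals(
    requests: List[Request],
    window_seconds: float,
    in_flight: int,
    capacity: int,
) -> Dict[str, float]:
    """Compute latency, traffic, errors, and saturation for one time window."""
    if not requests:
        raise ValueError("need at least one request in the window")

    durations = sorted(r.duration_ms for r in requests)
    # Nearest-rank 95th percentile, using integer math to avoid float rounding.
    rank = max(1, (95 * len(durations) + 99) // 100)
    server_errors = sum(1 for r in requests if r.status >= 500)

    return {
        "latency_p95_ms": durations[rank - 1],
        "traffic_rps": len(requests) / window_seconds,
        "error_rate": server_errors / len(requests),
        "saturation": in_flight / capacity,
    }


sample = [Request(100.0, 200)] * 19 + [Request(900.0, 500)]
signals = golden_signals(sample, window_seconds=10.0, in_flight=30, capacity=40)
print(signals)
```

In practice these values would come from a monitoring system, not from raw lists, but the sketch shows how the four signals answer different questions about the same traffic.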
5
Article
Machine Learning News·2y
Meet Laminar AI: A Developer Platform that Combines Orchestration, Evaluations, Data, and Observability to Empower AI Developers to Ship Reliable LLM Applications 10x Faster
Laminar AI is a developer platform designed to streamline the creation of reliable LLM (Large Language Model) applications by integrating orchestration, evaluations, data management, and observability. It offers a graphical user interface for building dynamic graph-based LLM apps that seamlessly integrate with local code. Key features include semantic search across datasets, support for various models, and a user-friendly interface for constructing and testing pipelines. Laminar AI aims to speed up development times by providing a unified and efficient environment for managing LLM applications.
16
1
6
Article
Community Picks·2y
Design and write code more efficiently by understanding the system flows
Leveraging code usage analytics, developers can significantly enhance their understanding of system flows, resulting in more efficient code design and implementation. Digma provides observability overlays within the IDE, offering automatic views of runtime dependencies and connections between code and traces. It also facilitates navigation of asset trees and identification of dead code through features like the 'Never reached' code lens.
16
7
Article
Grafana Labs·2y
Getting started with Grafana: best practices to design your first dashboard
Observability is crucial for understanding and optimizing complex systems, and Grafana dashboards are a powerful tool for achieving this. Key principles for effective dashboard design include knowing your audience, leveraging visual hierarchies, and selecting appropriate metrics. Techniques like the RED and USE methods can help prioritize which metrics to display. Always iterate on your designs based on feedback and changing requirements. For more detailed guidance, check out the provided on-demand webinar and additional resources.
16
8
Article
Planet Python·2y
PyCoder’s Weekly
Learn to create GUI applications with Python and PyQt by building a desktop calculator in a new video course. Umbra Space's new dataset offers satellite-based radar images for visualizing and annotating shipping data. Pydantic's new observability platform, Logfire, helps monitor your app with ease. Explore modern good practices for Python development and check out tutorials on building a guitar synthesizer using Python, understanding Python's security model, and creating high-quality README files for your projects.
16
9
Article
eBPF·2y
Observability Cost-Savings and eBPF Goodness with Groundcover
Groundcover is an innovative, cloud-native platform that leverages eBPF to offer a new model for observability, promising reduced costs and complexity for monitoring, logging, and tracing in Kubernetes environments. The product requires only one agent per host and retains all data within clusters for full observability and efficient APM. The discussion delves into its deployment, architecture, and underlying technology.
15
10
Article
Grafana Labs·2y
CI/CD observability: A rich, new opportunity for OpenTelemetry
Continuous integration and deployment (CI/CD) are central to modern software delivery, but observability in these processes remains limited. OpenTelemetry (OTel) is changing this by enabling deeper visibility throughout the whole CI/CD pipeline, from building and testing to deploying. Shifting observability focus 'left' helps detect and address issues early, increasing efficiency and reducing downtime. The introduction of new semantic conventions and Special Interest Groups (SIGs) for CI/CD observability marks a significant step forward in this area.
14
11
Article
Grafana Labs·2y
How to customize your Loki deployment with Ansible
Discover how to deploy Grafana Loki using Ansible, including both default and customized configurations. Learn about different deployment methods, the Ansible Loki Role, and how to configure Grafana to explore your logs. The role supports major Linux distributions and offers flexibility for various environments. The post also provides links to further documentation and examples of dashboards using Loki.
13
1
12
Article
Grafana Labs·2y
A complete guide to LLM observability with OpenTelemetry and Grafana Cloud
This guide explains the importance of observability for LLM applications and how tools like OpenTelemetry, Grafana Cloud, and OpenLIT can simplify monitoring. It breaks down how LLMs work, the significance of tracking metrics like request frequency, response times, and costs, and offers a step-by-step tutorial for setting up automatic instrumentation. It also highlights how to use Grafana dashboards to visualize key signals and optimize LLM performance.
11
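The three LLM metrics named in the summary above (request frequency, response times, and costs) can be sketched without any tooling. This example does not use OpenTelemetry, Grafana Cloud, or OpenLIT; the record shape and the per-token prices are hypothetical values chosen only to make the arithmetic visible.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical prices in USD per 1,000 tokens; real prices vary by model.
PROMPT_PRICE_PER_1K = 0.5
COMPLETION_PRICE_PER_1K = 1.5


@dataclass
class LLMCall:
    latency_s: float
    prompt_tokens: int
    completion_tokens: int


def call_cost_usd(call: LLMCall) -> float:
    """Cost of one call under the hypothetical prices above."""
    return (
        call.prompt_tokens / 1000 * PROMPT_PRICE_PER_1K
        + call.completion_tokens / 1000 * COMPLETION_PRICE_PER_1K
    )


def summarize(calls: List[LLMCall], window_seconds: float) -> Dict[str, float]:
    """Aggregate request frequency, average response time, and total cost."""
    if not calls:
        raise ValueError("need at least one call in the window")
    return {
        "requests_per_minute": len(calls) / window_seconds * 60,
        "avg_latency_s": sum(c.latency_s for c in calls) / len(calls),
        "total_cost_usd": sum(call_cost_usd(c) for c in calls),
    }


calls = [LLMCall(1.0, 1000, 2000), LLMCall(3.0, 1000, 0)]
summary = summarize(calls, window_seconds=60.0)
print(summary)
```

An instrumentation library would record the same quantities per request automatically; the sketch only shows what is being measured.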
13
Article
Foojay.io·2y
Continuous Feedback Free Udemy Course: Additional Coupons Available
Extra coupons are available for the free Continuous Feedback Udemy course, which focuses on analyzing observability data like logs, traces, and metrics. The course covers tracing fundamentals, using observability data for code improvement, and OSS tools for runtime analysis. It's aimed at Java/Kotlin developers looking to enhance their development skills.
10
14
Article
Last9·2y
The most important aspect of software monitoring
Successful software monitoring hinges on culture and collaboration, bolstered by good instrumentation. Instrumentation involves adding code and tools to capture detailed information about an app's behavior, crucial for diagnosing and optimizing performance. Key challenges include incomplete coverage, manual effort, and lack of standardization. To improve, use automatic instrumentation tools, standardize practices, ensure lightweight performance, and treat it as an ongoing process. While automatic instrumentation offers benefits, it requires careful management to avoid excess data and costs. Consulting experts can help manage this complexity.
10
1
