Best of ELK 2024

  1.
    Article
    Hacker News·2y

    Making a Postgres query 1,000 times faster

    The author describes optimizing a Postgres query to run 1,000 times faster. The query took longer each time it was executed because it processed every row in the table and applied a filter instead of an index condition. Switching to row constructor comparisons significantly improved its performance.

  2.
    Article
    freeCodeCamp·1y

    Learn Elasticsearch with a Comprehensive Beginner-Friendly Course

    Master search functionality in modern applications by learning Elasticsearch. This beginner-friendly course on freeCodeCamp.org's YouTube channel covers Elasticsearch fundamentals such as index management, document storage, text analysis, and search API. You'll also dive into advanced topics like semantic search and pipelines. Apply your skills in a real-world project by building a search engine for NASA's Astronomy Picture of the Day dataset. The 5-hour course is practical, accessible, and ideal for developers, data scientists, and tech enthusiasts.

  3.
    Article
    ByteByteGo·1y

    How Tinder Recommends To 75 Million Users with Geosharding

    Tinder has improved its recommendation engine for over 75 million users by implementing geosharding, where user data is divided into geographically bound shards. This approach enhances performance, reduces latency, and improves scalability. The system leverages tools like Google's S2 Library and Apache Kafka, and addresses consistency challenges and traffic imbalances by using smart load balancing and dynamic adjustments. As a result, Tinder can manage 20 times more computations efficiently while maintaining low latency.

  4.
    Article
    ByteByteGo·2y

    EP112: What is a deadlock?

    A deadlock occurs when transactions are waiting for each other to release locks on resources. It can be prevented through resource ordering, timeouts, and the Banker's Algorithm. Database management systems have algorithms for detecting deadlocks and selecting victims.

  5.
    Article
    Awesome Java Newsletter·2y

    Structured logging in Spring Boot 3.4

    Structured logging in Spring Boot 3.4 allows logs to be written in well-defined, machine-readable formats such as JSON. This enables powerful search and analytics capabilities. It supports the Elastic Common Schema (ECS) and Logstash formats and allows for custom formats. Developers can add additional fields to logs for better filtering and analysis. Logs can be output to the console or written to a file for different use cases.

  6.
    Article
    The New Stack·2y

    Elasticsearch Was Great, But Vector Databases Are the Future

    Keyword matching, represented by Elasticsearch, has been the standard for information retrieval systems. However, as AI-powered semantic search technology advances, vector databases are becoming central to a new era of search. Combining both approaches, hybrid search uses a mix of vector and traditional search methods, balancing semantic relevance with exact keyword matching. Milvus is highlighted as a vector database offering efficiencies and performance improvements over Elasticsearch, particularly in handling dense and sparse vectors. This unified approach simplifies infrastructure and enhances search capabilities, making vector databases a promising solution for future advanced search needs.

  7.
    Article
    InfoSec Write-ups·2y

    Building an Integrated Threat Intelligence Platform Using Python and Kibana

    The post discusses the creation of a comprehensive Threat Intelligence Platform (TIP) using Python, Elasticsearch, and Kibana. Key features include breach monitoring, subdomain enumeration, phishing domain detection, GitHub leak searches, IOC integration, dark web monitoring, and HTTP header analysis. The system uses Python scripts for data collection, Elasticsearch for data storage, and Kibana for visualization. The post emphasizes ethical considerations, including data privacy, legality, and secure coding practices.

  8.
    Article
    elastic·2y

    Elasticsearch is open source, again

    Elasticsearch and Kibana are open source again with the addition of the AGPL license option. Elastic believes this move will reduce confusion and strengthen its open-source commitment. The decision comes three years after Elastic changed the license over issues with AWS, a change that ultimately resolved market confusion and bolstered the AWS partnership. The existing licenses (ELv2 and SSPL) remain in place, giving users more choices.

  9.
    Article
    Trendyol Tech·2y

    Turning Millions of Kafka Events Into Meaningful Reports for Sellers

    Trendyol’s Export Center developed the Seller Reporting API to transform Kafka event data into insightful reports for sellers. They used Elasticsearch for data storage and its date histogram aggregation to handle time-based data. The implementation indexes order events and then queries the indexed data to build detailed reports. These reports meet sellers' needs for data over various periods and compare sales across different regions and currencies.

  10.
    Video
    codeHeim·2y

    #46 Golang - Full-Text Search with Elasticsearch with Golang

    Learn how to integrate Elasticsearch with a Golang application for full-text search. The guide walks through the setup of an Elasticsearch client, indexing documents, and executing search queries using Golang. It also includes examples and code snippets to help you implement search functionality in a Gin Gonic-based application.

  11.
    Article
    Netflix TechBlog·2y

    Introducing Netflix TimeSeries Data Abstraction Layer

    Netflix has introduced a TimeSeries Data Abstraction Layer designed to handle vast amounts of temporal data with millisecond access latency. Key features include efficient data partitioning, flexible storage integration (using Apache Cassandra and Elasticsearch), and scalability to manage high-throughput, immutable temporal event data. This abstraction layer optimizes storage and query efficiency, addressing issues like global read/write operations, tunable configurations, bursty traffic management, and cost efficiency. It plays a vital role in various Netflix services like user interaction tracking, feature rollout analysis, and asset impression tracking.

  12.
    Video
    Dreams of Code·2y

    ElasticSearch returning to open source is a big deal.

    Elasticsearch has announced a return to open source, transitioning to the AGPL license. This move reverses the 2021 decision to adopt a dual license model over concerns about AWS's business practices. The shift signals a potential end to the recent trend of companies moving away from open source and could indicate a market shift. Despite positive sentiment from open-source enthusiasts, the company's stock dropped by 25% in after-hours trading.

  13.
    Article
    Product Hunt·2y

    RepoCloud - One-click deploy 200+ SaaS alternatives, elastic autoscaling

    RepoCloud.io offers 1-click deployments of over 200 popular open-source SaaS alternative applications with elastic autoscaling and lower costs compared to major cloud hosts.

  14.
    Article
    Trendyol Tech·1y

    Optimizing Elasticsearch with Custom Routing and Handling Routing Value Changes

    This piece discusses implementing custom routing to improve search performance in an integration between Couchbase and Elasticsearch. It highlights the benefits of routing queries based on itemNumbers, which reduces query scope, speeds up search operations, and uses resources efficiently. The post explains how to handle changes to itemNumbers so that documents remain correctly routed. Load testing showed significant performance gains with custom routing, including faster query response times and higher query volumes.

  15.
    Article
    Hacker News·2y

    Full Text Search over Postgres: Elasticsearch vs. Alternatives

    In the quest for a full text search (FTS) solution over data in Postgres, companies often compare Elasticsearch and native Postgres FTS. Postgres FTS is simple, requires no additional infrastructure, and excels at real-time search but falls short in features and performance over large datasets. Conversely, Elasticsearch offers a comprehensive feature set and high performance but involves significant operational overhead and costs. Alternatives like Algolia, Meilisearch, and Typesense provide specialized solutions but aren't tailored for Postgres. A new contender, ParadeDB, aims to combine the benefits of both approaches by providing advanced FTS capabilities within Postgres.

  16.
    Article
    BigData Boutique blog·2y

    OpenSearch Data Migration from Elasticsearch - The Guide

    Learn how to migrate from Elasticsearch to OpenSearch with minimal downtime and no data loss. This guide covers upgrading your Elasticsearch version, setting up your OpenSearch cluster, checking plugin compatibility, backing up data, and planning the transition period. It also discusses methods for data migration, ensuring data integrity, and post-migration tasks including verifying data accuracy and updating applications to work with OpenSearch.

  17.
    Article
    Baeldung·2y

    Logstash vs. Kafka

    Logstash and Kafka are powerful tools for managing real-time data streams, with Logstash specializing in data processing and Kafka excelling in distributed event streaming. Logstash is ideal for transforming log data and forwarding it to various outputs, while Kafka is designed for high-throughput, fault-tolerant message delivery. This post compares their components in depth, gives command-line examples, and discusses how the two can work together to build robust data pipelines.

  18.
    Article
    BigData Boutique blog·2y

    Elasticsearch and OpenSearch Query Limits

    Elasticsearch and OpenSearch are robust search and analytics engines with several query limits to ensure performance and resource efficiency. They include limits on result size, max clause count, field data, query throughput, and complex joins. Understanding and configuring these limits is essential to maintain optimal performance and avoid errors.

  19.
    Article
    gitconnected·2y

    Python and LLM for Stock Market Analysis Part IV — ElasticSearch for Stock Symbol/Ticker accuracy

    This post discusses using Elasticsearch to obtain accurate stock symbols/tickers in stock market analysis. It explains the limitations of relying on LLM/NLP models alone and introduces Elasticsearch as an alternative. It also provides a step-by-step guide to setting up Elasticsearch, indexing stock data, and integrating it with the Yahoo Finance API for symbol lookup. The post highlights the benefits of Elasticsearch's fuzzy search feature and addresses potential issues with symbol identification.

  20.
    Article
    BigData Boutique blog·2y

    Elasticsearch Performance and Cost Efficiency on Elastic Cloud and On-Prem

    Discover essential strategies to optimize Elasticsearch performance and cost efficiency for both Elastic Cloud and on-premises deployments. Key tactics include scaling up vs. scaling out, data tiering, continuous monitoring of critical metrics, efficient shard distribution, and advanced query optimization techniques. Participants in a recent webinar hosted by BigData Boutique and Elastic learned how to enhance their Elasticsearch setups for optimal performance and cost-effectiveness.

  21.
    Article
    Towards Dev·2y

    Getting Started with a Basic Elastic SIEM Lab: A Step-by-Step Guide

    Setting up a basic SIEM lab using Elastic on a Kali VM can help entry-level professionals gain practical experience in log management and security monitoring. Key steps include creating an Elastic account, setting up a Kali VM, collecting logs, performing Nmap scans, creating dashboards, and establishing alerts. This hands-on approach helps in understanding network traffic and identifying potential threats effectively.

  22.
    Article
    InfoQ·2y

    Netflix Uses Elasticsearch Percolate Queries to Implement Reverse Searches Efficiently

    Netflix engineers use Elasticsearch Percolate Queries to implement reverse searches efficiently, allowing dynamic subscription and notification scenarios without direct associations between subscribers and entities.