google/langextract: A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.

LangExtract is a Python library from Google that uses large language models (LLMs) to extract structured information from unstructured text. It grounds each extraction in the source by mapping it to its exact location in the original text, and it can render the results as an interactive HTML visualization. The library supports several LLMs, including Gemini and local models via Ollama. It handles long documents through optimized chunking and parallel processing, needs only a few-shot set of examples to set up, and includes specialized applications for medical text such as medication extraction and radiology report structuring.
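To make the idea of source grounding concrete, here is a minimal standard-library sketch. It is not LangExtract's API or its alignment algorithm: the `GroundedExtraction` record, the `ground` helper, and the sample extractions are all hypothetical, and the matching is a plain case-insensitive substring search. It only illustrates what it means to attach character offsets to each extracted span so the span can be traced back to, and highlighted in, the source text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GroundedExtraction:
    """Hypothetical record: an extracted span plus its location in the source."""
    extraction_class: str
    extraction_text: str
    start: Optional[int]  # inclusive character offset, None if not found
    end: Optional[int]    # exclusive character offset, None if not found


def ground(source: str, extraction_class: str, extraction_text: str) -> GroundedExtraction:
    """Locate extraction_text in source with a case-insensitive substring search.

    Assumes lowercasing does not change string length (true for ASCII text).
    """
    idx = source.lower().find(extraction_text.lower())
    if idx == -1:
        # The model produced text that does not appear in the source.
        return GroundedExtraction(extraction_class, extraction_text, None, None)
    return GroundedExtraction(
        extraction_class, extraction_text, idx, idx + len(extraction_text)
    )


source = "Patient was given 250 mg amoxicillin twice daily."

# Stand-in for (class, text) pairs returned by an LLM; invented for illustration.
candidates = [("dosage", "250 mg"), ("medication", "Amoxicillin"), ("route", "oral")]

grounded = [ground(source, cls, text) for cls, text in candidates]

for g in grounded:
    if g.start is not None:
        # Slicing the source with the stored offsets recovers the original span.
        print(g.extraction_class, repr(source[g.start:g.end]), (g.start, g.end))
    else:
        print(g.extraction_class, "not grounded in source")
```

An extraction that cannot be located, like the invented "oral" above, is left without offsets rather than dropped, which makes ungrounded model output easy to spot.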
