pg_infer 1.0.0 is a new PostgreSQL 18+ extension that embeds small transformer language model internals — gate activations, feature labels, learned associations, and embeddings — directly into PostgreSQL as SQL-queryable relations and a custom index access method. Unlike pgvector (which stores user-supplied embeddings) or RAG pipelines (which call external services), pg_infer stores the model itself in WAL-logged 8KB pages and exposes it as a first-class planner operator. The `<~>` operator is index-backed and composes with WHERE, JOIN, aggregation, and partitioning. It targets CPU-only hardware using BitNet b1.58 ternary-weight transformers and OpenBLAS, making inference viable on existing PostgreSQL replica hosts without GPUs. Functions like `describe(entity)`, `walk(prompt)`, and `implies(a, b)` expose the model's learned knowledge directly in SQL. The project builds on the LARQL project's vindex format and gate-KNN algorithm.
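
To illustrate why ternary weights suit CPU-only hosts, here is a minimal sketch of absmean ternary quantization as described for BitNet b1.58, written in plain Python. It is not pg_infer's code: the function names `absmean_ternary` and `ternary_matvec` are hypothetical, and the announcement says nothing about how the extension lays out or multiplies weights beyond naming OpenBLAS. The sketch only shows that each weight becomes -1, 0, or +1, so a matrix-vector product needs additions and subtractions plus a single scale factor.

```python
def absmean_ternary(weights, eps=1e-8):
    """Quantize a 2-D list of floats to {-1, 0, +1} using absmean scaling.

    Illustrative only: scale by the mean absolute weight, round to the
    nearest integer, then clip to the range [-1, 1].
    """
    flat = [abs(w) for row in weights for w in row]
    scale = sum(flat) / len(flat) + eps
    quantized = [
        [max(-1, min(1, round(w / scale))) for w in row]
        for row in weights
    ]
    return quantized, scale


def ternary_matvec(quantized, scale, x):
    """Multiply a ternary matrix by a vector without any weight multiplies."""
    out = []
    for row in quantized:
        acc = 0.0
        for t, xi in zip(row, x):
            if t == 1:
                acc += xi      # +1 weight: add the input
            elif t == -1:
                acc -= xi      # -1 weight: subtract the input
            # 0 weight: contributes nothing
        out.append(acc * scale)  # apply the shared scale once per row
    return out


if __name__ == "__main__":
    w = [[0.9, -0.05, -1.2], [0.4, 0.0, -0.7]]
    q, s = absmean_ternary(w)
    print(q, s)
    print(ternary_matvec(q, s, [1.0, 2.0, 3.0]))
```

A production kernel would pack the ternary values into a few bits each and vectorize the accumulation; the loop above is kept naive for readability.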

From postgresql.org (5 min read)
Table of contents
Quick example
What pg_infer does that other extensions do not
CPU inference, BitNet, and idle-cluster compute
A few queries that are uniquely pg_infer
Acknowledgements
A note on stability and feedback
Links
