GitHub - antoinezambelli/forge: A Python framework for self-hosted LLM tool-calling and multi-step agentic workflows

Forge is a Python framework (published on PyPI as forge-guardrails) that adds a reliability layer to self-hosted LLM tool-calling and multi-step agentic workflows. It combines guardrails (rescue parsing, retry nudges, step enforcement), VRAM-aware context management, and tiered compaction so that small local models (~8B parameters) reach competitive performance. It offers three usage modes: a WorkflowRunner for full agentic loops, composable guardrails middleware for custom orchestration, and a drop-in OpenAI-compatible proxy server. Supported backends are Ollama, llama-server (llama.cpp), Llamafile, and Anthropic. The top self-hosted configuration (Ministral-3 8B Instruct Q8 on llama-server) scores 86.5% on forge's 26-scenario eval suite. The framework is backed by a published IEEE paper that includes an ablation study.
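To make "rescue parsing" and "retry nudges" concrete, here is a minimal, generic sketch of the two ideas. It is not forge's code or API: the function names `rescue_tool_call` and `retry_nudge`, and the `{"name": ..., "arguments": {...}}` tool-call shape, are assumptions chosen for illustration. The first function tries to recover a tool call that a small model buried in free-form text; the second builds the corrective message to send back when nothing can be recovered.

```python
import json
from typing import Any, Optional


def rescue_tool_call(text: str, known_tools: set[str]) -> Optional[dict[str, Any]]:
    """Return the first JSON object in `text` that looks like a call to a known tool.

    Hypothetical helper: assumes tool calls have the shape
    {"name": <str>, "arguments": <dict>}.
    """
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting exactly at `pos`
            # and ignores whatever trailing text follows it.
            obj, _ = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            obj = None
        if (
            isinstance(obj, dict)
            and isinstance(obj.get("name"), str)
            and obj["name"] in known_tools
            and isinstance(obj.get("arguments"), dict)
        ):
            return obj
        # Not a usable tool call; try the next opening brace.
        pos = text.find("{", pos + 1)
    return None


def retry_nudge(known_tools: set[str]) -> str:
    """Build a corrective message to send when no tool call could be rescued."""
    names = ", ".join(sorted(known_tools))
    return (
        "No valid tool call was found in your last reply. "
        "Reply with a single JSON object with 'name' and 'arguments' keys. "
        f"Available tools: {names}."
    )


if __name__ == "__main__":
    tools = {"get_weather"}
    reply = 'Sure, calling it now: {"name": "get_weather", "arguments": {"city": "Paris"}}'
    call = rescue_tool_call(reply, tools)
    print(call if call is not None else retry_nudge(tools))
```

In a loop of this kind, the caller would typically attempt the rescue first and fall back to the nudge only when it returns `None`, capping the number of retries so a model that never complies cannot stall the workflow.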

#python

#agentic-ai

May 19•6m read time•From github.com

Table of contents

Requirements, Install, Quick Start, Proxy Server, Backends, Running Tests, Eval Harness, Project Structure, Documentation, Paper, License
