PrivateGPT turns local inference into a real application backend. It sits above any OpenAI-compatible model server and provides the higher-level capabilities modern AI products need: messages, model selection, file ingestion, retrieval with citations, tool use, database querying, CSV and tabular analysis, web search and extraction, MCP, skills, code execution, custom tools, token counting, embeddings, and async workflows.
The goal of the project is to bring a Claude-style application API to your own infrastructure, so you can build private AI products without depending on cloud AI APIs.
Get PrivateGPT running in under 5 minutes.
Startup flow, profiles, and env vars.
Guided examples over the API.
Explore all REST endpoints in detail.
Ready to test PrivateGPT UI.
Join the community.
Running a model locally is only the first step, but it is not enough.
To build useful AI applications you need a set of high-level building blocks. PrivateGPT provides that layer as an open-source API following the Claude API model, covering:
PrivateGPT does not run models itself. It connects to an external OpenAI-compatible inference server via OPENAI_API_BASE. Any server that implements /v1/chat/completions and /v1/models works — local or remote.
PrivateGPT ships with a built-in UI for testing purposes, available at /ui.
PrivateGPT does not replace local inference providers. It complements them.
Projects like Ollama, LM Studio, vLLM, and llama.cpp make it possible to run and serve models locally. They answer the question: how do I run a model?
PrivateGPT answers the next question: how do I build a useful AI application on top of that model?
Use both together: run your model with the inference server you prefer, then use PrivateGPT as the Claude-style backend for your application. See Providers for a comparison of common local inference setups.
Onyx and Open WebUI are valuable projects solving a different problem — they are app-first experiences focused on chat and enterprise search.
PrivateGPT is API-first. It is not trying to be a final workspace or ChatGPT-style interface. It gives developers the standardized local backend underneath those products: a Claude API-compatible layer for messages, files, retrieval, citations, tools, data analysis, MCP, skills, and custom tools.
The lightweight UI included with PrivateGPT exists to help you test the API and explore capabilities. The API is the actual product.
PrivateGPT follows the Claude API model as the clearest reference for modern AI application APIs. The goal is full feature coverage where it makes sense for a local, open-source application layer.
✅ Supported · ⚠️ Partial / in progress · ❌ Not supported
Contributions are especially welcome in areas marked ⚠️.
PrivateGPT is maintained by the team behind Zylon.
PrivateGPT is the open-source application API layer — messages, ingestion, tools, retrieval, citations, database access, tabular analysis, MCP, skills, and custom tools.
Zylon turns that layer into a complete production platform for regulated organizations: integrated inference server, Kubernetes deployment, API gateway, Dev platform, Workspace application for end users, Agents builder, LDAP, SIEM audit logs, SharePoint/Confluence/FTP connectors, and more than 20 production services packaged together.
Use PrivateGPT if you want the open-source local AI application layer and developer API. Use Zylon if you need the full enterprise platform around it. Learn more at zylon.ai or book a demo.