PrivateGPT connects to any OpenAI-compatible LLM server and exposes a private, self-hosted AI API. This guide gets you from zero to a running server in four steps.
Start your LLM server (for example Ollama, LM Studio, LlamaCPP, or vLLM). When PrivateGPT starts, it auto-discovers all models the server makes available.
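To check which models the server would offer before starting PrivateGPT, you can query it directly. The sketch below assumes the server implements the OpenAI-style model list endpoint (GET /models under the API base), which OpenAI-compatible servers generally do. It is not PrivateGPT's own discovery code, and the helper names are hypothetical.

```python
import json
import urllib.request


def parse_model_ids(payload: dict) -> list:
    """Extract model IDs from an OpenAI-style model list response."""
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def list_models(base_url: str, timeout: float = 5.0) -> list:
    """Fetch GET {base_url}/models and return the model IDs it reports."""
    url = base_url.rstrip("/") + "/models"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_model_ids(json.load(response))
```

For a default local Ollama install, calling list_models("http://localhost:11434/v1") should return the names of the models you have pulled. Other servers use their own host and port.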
Ollama does not expose a tokenizer endpoint. PrivateGPT falls back to approximate token counting, which may affect context-window management. See Ollama limitations.
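To illustrate why approximate counting matters for context-window management, here is a minimal sketch of a character-based estimate. The 4-characters-per-token ratio is a common rule of thumb for English text and is an assumption here. The source does not say which approximation PrivateGPT uses, and the function names are hypothetical.

```python
import math


def approx_token_count(text: str, chars_per_token: float = 4.0) -> int:
    """Estimate tokens from character length (a heuristic, not a tokenizer)."""
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


def fits_context(prompt: str, context_window: int,
                 reserve_for_output: int = 512,
                 safety_margin: float = 0.1) -> bool:
    """Check whether the estimated prompt size leaves room for the reply.

    The safety margin inflates the estimate to absorb heuristic error.
    """
    estimated = math.ceil(approx_token_count(prompt) * (1 + safety_margin))
    return estimated + reserve_for_output <= context_window
```

Because the estimate can be off in either direction, especially for code or non-English text, leaving headroom below the model's context limit is the safer choice when the server provides no tokenizer.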
Point PrivateGPT at your LLM and embedding servers by setting OPENAI_API_BASE and OPENAI_EMBEDDING_API_BASE. Models are discovered automatically, so no config file is needed.
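As a rough sketch, the two variables can be prepared and validated in Python before launching PrivateGPT. This treats them as environment variables, which their names suggest but this page does not spell out. The helper name is hypothetical, and the launch command is left out because this page does not specify it.

```python
import os
from typing import Dict, Optional
from urllib.parse import urlparse


def build_privategpt_env(llm_base: str, embedding_base: str,
                         base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with both endpoint variables set."""
    for name, url in (("OPENAI_API_BASE", llm_base),
                      ("OPENAI_EMBEDDING_API_BASE", embedding_base)):
        parsed = urlparse(url)
        # Reject values that are not absolute http(s) URLs.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"{name} must be an http(s) URL, got {url!r}")
    env = dict(os.environ if base_env is None else base_env)
    env["OPENAI_API_BASE"] = llm_base
    env["OPENAI_EMBEDDING_API_BASE"] = embedding_base
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run together with whatever command starts PrivateGPT in your installation. For a local Ollama server, http://localhost:11434/v1 is its default OpenAI-compatible base URL. Whether PrivateGPT expects the /v1 suffix is an assumption to verify against the provider documentation.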
If startup succeeds, PrivateGPT will be available on port 8080.
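One way to confirm the server is up is to wait until port 8080 accepts TCP connections. The sketch below does only that. It does not verify that the process listening is PrivateGPT, and the function name is hypothetical.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling wait_for_port("localhost", 8080) after launching PrivateGPT returns True as soon as the port is reachable, or False if nothing is listening within the timeout.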
Navigate to http://localhost:8080/ui in your browser.
The API is available at http://localhost:8080 and follows the Anthropic API spec. See the API Reference for all endpoints.
Run PrivateGPT with Docker for a fully isolated, production-ready setup.
Install from source with core, add extras only when needed, and use detailed model configuration.
Compare Ollama, LM Studio, LlamaCPP, and vLLM — feature matrix and limitations.
Explore all REST endpoints and start building your application.