Quickstart
PrivateGPT connects to any OpenAI-compatible LLM server and exposes a private, self-hosted AI API. This guide gets you from zero to a running server in three steps.
Start your LLM server
Start the LLM server you plan to use. PrivateGPT auto-discovers all available models on startup.
Ollama
LM Studio
LlamaCPP Server
vLLM
Ollama does not expose a tokenizer endpoint. PrivateGPT falls back to approximate token counting, which may affect context-window management. See Ollama limitations.
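To illustrate what approximate token counting means for context-window management, here is a minimal sketch. It assumes a rule of thumb of roughly four characters per token and a hypothetical safety margin. This page does not describe the heuristic PrivateGPT actually uses, so the function names and constants below are illustrative only.

```python
import math

# Assumption: ~4 characters per token, a common rule of thumb for English text.
# This is NOT taken from PrivateGPT; it only illustrates the idea of a fallback.
CHARS_PER_TOKEN = 4


def approx_token_count(text: str, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Estimate the number of tokens from the character length alone."""
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


def fits_context(
    prompt: str,
    context_window: int,
    reserve_for_output: int = 512,
    safety_margin: float = 0.1,
) -> bool:
    """Check the estimated prompt size against a reduced token budget.

    The safety margin shrinks the budget because the estimate can be off
    in either direction when no real tokenizer is available.
    """
    budget = (context_window - reserve_for_output) * (1 - safety_margin)
    return approx_token_count(prompt) <= budget
```

Because the estimate can undercount tokens for code or non-English text, leaving headroom like this is one way to reduce the risk of overflowing the model's context window.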
Run PrivateGPT
Point PrivateGPT at your servers with OPENAI_API_BASE and OPENAI_EMBEDDING_API_BASE. Models are discovered automatically, so no config file is needed.
macOS / Linux
Windows (PowerShell)
Windows (CMD)
If startup succeeds, PrivateGPT will be available on port 8080.
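If startup fails, a quick pre-flight check of the two environment variables can help narrow down the cause. The sketch below assumes each base URL already includes the `/v1` prefix (for example `http://localhost:11434/v1` for a default Ollama install) and that the server implements the standard OpenAI-style `GET /models` listing. The helper names are hypothetical and are not part of PrivateGPT.

```python
import json
import os
import urllib.request


def models_url(base_url: str) -> str:
    """Build the model-listing URL for an OpenAI-compatible base URL."""
    return base_url.rstrip("/") + "/models"


def parse_model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style {"data": [{"id": ...}]} payload."""
    return [model["id"] for model in payload.get("data", [])]


def list_models(base_url: str, timeout: float = 5.0) -> list[str]:
    """Ask the server which models it serves."""
    with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
        return parse_model_ids(json.load(resp))


if __name__ == "__main__":
    # Variable names come from this guide; everything else is illustrative.
    for var in ("OPENAI_API_BASE", "OPENAI_EMBEDDING_API_BASE"):
        base = os.environ.get(var)
        if not base:
            print(f"{var} is not set")
            continue
        try:
            print(f"{var} -> {list_models(base)}")
        except (OSError, ValueError) as exc:
            print(f"{var}: could not list models at {base}: {exc}")
```

An empty list or a connection error here suggests the problem lies with the LLM server or the URL rather than with PrivateGPT itself.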
Open the UI
Navigate to http://localhost:8080/ui in your browser.
The API is available at http://localhost:8080 and follows the Anthropic API spec. See the API Reference for all endpoints.
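Before opening the UI or calling the API from a script, it can be useful to wait until something is actually listening on port 8080. The following is a minimal sketch using only the Python standard library. The function name and timeout values are assumptions, and it checks only that the TCP port accepts connections, not that PrivateGPT has finished loading.

```python
import socket
import time


def wait_for_port(
    host: str = "localhost",
    port: int = 8080,
    timeout: float = 30.0,
    interval: float = 0.5,
) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means some process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port()` returns `True` once the default address from this guide accepts connections, and `False` if nothing is listening after 30 seconds.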
What’s next?
If you plan to use database querying or web search tools, review the dependency guides in Database Tools and Web Tools to install the required drivers, OS libraries, and browser dependencies.
Run PrivateGPT with Docker for a fully isolated, production-ready setup.
Install from source with core, add extras only when needed, and use detailed model configuration.
Compare Ollama, LM Studio, LlamaCPP, and vLLM: feature matrix and limitations.
Explore all REST endpoints and start building your application.

