Quickstart | PrivateGPT

PrivateGPT connects to any OpenAI-compatible LLM server and exposes a private, self-hosted AI API. This guide gets you from zero to a running server in four steps.

Prerequisites: You need an OpenAI-compatible LLM server running locally. Pick one from the Providers page — Ollama is the easiest way to start.

Install PrivateGPT

Linux (separate instructions exist for macOS and Windows):

$ # Install uv first
$ curl -LsSf https://astral.sh/uv/install.sh | sh
$ 
$ # Then install PrivateGPT
$ uv tool install --python 3.11 \
>   --find-links https://wheels.privategpt.dev/packages/ \
>   "private-gpt[core]"
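Before moving on, it can help to confirm that the installed command is reachable from your shell. The sketch below is not part of PrivateGPT: `require_cmd` is a hypothetical helper built on the POSIX `command -v` builtin, and it assumes the tool installs an executable named `private-gpt`, as the serve command later in this guide suggests.

```shell
# Hypothetical helper (not part of PrivateGPT): succeed only if a command is on PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Usage after the install step:
#   require_cmd uv && require_cmd private-gpt
```

If `private-gpt` is reported missing right after installation, running `uv tool update-shell` and opening a new terminal usually puts uv's tool directory on your PATH.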

Start your LLM server

Start the LLM server before launching PrivateGPT. PrivateGPT auto-discovers all available models on startup.

Ollama (LM Studio, LlamaCPP Server, and vLLM are also supported):

$ # Example: pull a model and start the server
$ ollama pull qwen3.5:35b          # LLM (~24 GB)
$ ollama pull mxbai-embed-large   # Embeddings (~670 MB)
$ 
$ # Start the server (runs on port 11434)
$ ollama serve

Ollama does not expose a tokenizer endpoint. PrivateGPT falls back to approximate token counting, which may affect context-window management. See Ollama limitations.
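Because PrivateGPT discovers models from the server, one way to preview what it is likely to find is to query the server's model list yourself. The sketch below assumes the server implements the OpenAI-style `GET /v1/models` endpoint and, by default, that Ollama is listening on port 11434. `models_url` and `list_models` are hypothetical helper names, not PrivateGPT or Ollama commands.

```shell
# Hypothetical helpers; assumes an OpenAI-style GET /v1/models endpoint.
models_url() {
  # $1 = host (default: localhost), $2 = port (default: 11434, Ollama's port)
  printf 'http://%s:%s/v1/models\n' "${1:-localhost}" "${2:-11434}"
}

list_models() {
  # -f fails on HTTP errors; -sS hides the progress bar but keeps error messages.
  curl -fsS "$(models_url "$@")"
}

# Usage, with the server running:
#   list_models                  # Ollama on localhost:11434
#   list_models localhost 8000   # another server, on a port you chose
```

If the models you pulled appear in the JSON response, the server should be ready for the next step.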

Run PrivateGPT

Point PrivateGPT at your LLM server with OPENAI_API_BASE and at your embeddings server with OPENAI_EMBEDDING_API_BASE. Models are discovered automatically, so no config file is needed.

macOS / Linux (Windows PowerShell and CMD use their own syntax for setting environment variables):

$ OPENAI_API_BASE=http://localhost:<llm-port>/v1 \
>   OPENAI_EMBEDDING_API_BASE=http://localhost:<embedding-port>/v1 \
>   private-gpt serve

If startup succeeds, PrivateGPT will be available on port 8080.
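For the Ollama setup from the previous step, both port placeholders can point at the same server. This is a sketch under two assumptions: a single Ollama instance on its default port 11434 serves both the LLM and the embedding model, and it exposes its OpenAI-compatible API under `/v1`. Adjust the URLs if your servers differ.

```shell
# Assumption: one Ollama server on localhost:11434 serves both models.
OLLAMA_URL="http://localhost:11434/v1"

# Export so the variables are visible to private-gpt when it starts.
export OPENAI_API_BASE="$OLLAMA_URL"
export OPENAI_EMBEDDING_API_BASE="$OLLAMA_URL"

# Then, in the same shell:
#   private-gpt serve
```

Exporting the variables once, rather than prefixing the command, keeps them set for later restarts in the same terminal session.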

Open the UI

Navigate to http://localhost:8080/ui in your browser.

The API is available at http://localhost:8080 and follows the Anthropic API spec. See the API Reference for all endpoints.
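In scripts, you may want to wait until the UI responds before opening it or sending requests. The sketch below polls a URL with curl until it returns a successful HTTP status. `wait_for_http` is a hypothetical helper, and the default of 30 attempts with a one-second pause is an arbitrary choice.

```shell
# Hypothetical helper: poll a URL until it answers with a successful HTTP status.
wait_for_http() {
  wait_url=$1          # URL to poll
  wait_tries=${2:-30}  # maximum number of attempts (arbitrary default)
  wait_i=0
  while [ "$wait_i" -lt "$wait_tries" ]; do
    # -f fails on HTTP errors, -s is silent, -o /dev/null discards the body.
    if curl -fs -o /dev/null "$wait_url"; then
      return 0
    fi
    wait_i=$((wait_i + 1))
    sleep 1
  done
  return 1
}

# Usage, after starting PrivateGPT:
#   wait_for_http http://localhost:8080/ui && echo "PrivateGPT UI is up"
```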

What’s next?

If you plan to use database querying or web search tools, review the dependency guides in Database Tools and Web Tools to install the required drivers, OS libraries, and browser dependencies.

Docker install

Run PrivateGPT with Docker for a fully isolated, production-ready setup.

Local with uv

Install from source with core, add extras only when needed, and use detailed model configuration.

Inference Providers

Compare Ollama, LM Studio, LlamaCPP, and vLLM — feature matrix and limitations.

API Reference

Explore all REST endpoints and start building your application.