Quickstart

PrivateGPT connects to any OpenAI-compatible LLM server and exposes a private, self-hosted AI API. This guide gets you from zero to a running server in four steps.

Prerequisites: You need an OpenAI-compatible LLM server installed locally (you start it in step 2). Pick one from the Providers page; Ollama is the easiest way to start.

1. Install PrivateGPT

# Install uv first
$ curl -LsSf https://astral.sh/uv/install.sh | sh

# Then install PrivateGPT
$ uv tool install --python 3.11 \
>   --find-links https://wheels.privategpt.dev/packages/ \
>   "private-gpt[core]"
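
To confirm the install worked, a quick check that both executables are on your PATH can help. This is a minimal sketch: `require_tool` is a hypothetical helper, and it assumes the package installs an executable named `private-gpt` (the name used by the serve command later in this guide).

```shell
# Hypothetical helper: report whether an executable is on PATH.
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Check the two tools this step installs; a miss is reported, not fatal.
require_tool uv || true
require_tool private-gpt || true
```

If `private-gpt` is reported missing right after installation, opening a new shell or running `uv tool update-shell` (which adds uv's tool directory to PATH) is a reasonable first thing to try.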

2. Start your LLM server

Start your LLM server before launching PrivateGPT. PrivateGPT discovers all available models when it starts.

# Example: pull a model and start the server
$ ollama pull qwen3.5:35b        # LLM (~24 GB)
$ ollama pull mxbai-embed-large  # Embeddings (~670 MB)

# Start the server (runs on port 11434)
$ ollama serve
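
Because PrivateGPT relies on model discovery, it can be worth checking what the server reports before moving on. The sketch below assumes the server implements the OpenAI-style `GET /v1/models` endpoint (Ollama does) and that `curl` is installed; `models_url` and `list_models` are hypothetical helper names.

```shell
# Build the model-listing URL for an OpenAI-compatible server on a local port.
# Assumption: the server implements GET /v1/models.
models_url() {
  printf 'http://localhost:%s/v1/models\n' "$1"
}

# Print the raw JSON model list, or a hint if the server is unreachable.
list_models() {
  curl -fsS --max-time 5 "$(models_url "$1")" || {
    echo "no response on port $1 (is the server running?)" >&2
    return 1
  }
}

# Usage (Ollama's port from the example above):
#   list_models 11434
```

With Ollama, `ollama list` shows the pulled models as well, without going through the HTTP API.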

Ollama does not expose a tokenizer endpoint. PrivateGPT falls back to approximate token counting, which may affect context-window management. See Ollama limitations.
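
To illustrate why approximate counting can drift from real token counts, here is a rough estimator based on the common rule of thumb of about four characters per token for English text. This is an assumption for illustration only; it is not PrivateGPT's actual fallback algorithm, and `approx_tokens` is a hypothetical name.

```shell
# Rough token estimate: ceil(characters / 4).
# Assumption: ~4 characters per token; real tokenizers vary by model,
# language, and content (code and non-English text often differ a lot).
approx_tokens() {
  echo $(( (${#1} + 3) / 4 ))
}

approx_tokens "How do I start the server?"   # 26 characters, prints 7
```

An estimate like this can be off in either direction, so leaving some headroom below the model's context limit is a sensible precaution.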

3. Run PrivateGPT

Point PrivateGPT at your servers with OPENAI_API_BASE and OPENAI_EMBEDDING_API_BASE, replacing <llm-port> and <embedding-port> with the ports your servers listen on. Models are discovered automatically, so no config file is needed.

$ OPENAI_API_BASE=http://localhost:<llm-port>/v1 \
>   OPENAI_EMBEDDING_API_BASE=http://localhost:<embedding-port>/v1 \
>   private-gpt serve

Once startup succeeds, PrivateGPT listens on port 8080.
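
As a concrete instance of the command in this step, the sketch below fills in the port placeholders for the Ollama setup from step 2. It assumes a single Ollama instance on port 11434 serves both the LLM and the embedding model; `base_url` is a hypothetical helper, not part of PrivateGPT.

```shell
# Hypothetical helper: build an OpenAI-style base URL for a local port.
base_url() {
  printf 'http://localhost:%s/v1\n' "$1"
}

# Assumption: one Ollama instance serves both models on port 11434,
# so both variables point at the same base URL.
export OPENAI_API_BASE="$(base_url 11434)"
export OPENAI_EMBEDDING_API_BASE="$(base_url 11434)"

echo "LLM:        $OPENAI_API_BASE"
echo "Embeddings: $OPENAI_EMBEDDING_API_BASE"

# Then start PrivateGPT from the same shell:
#   private-gpt serve
```

Exporting the variables instead of prefixing the command keeps them set for the rest of the shell session, which is convenient when restarting the server repeatedly. If the LLM and the embedding model run on different servers, pass a different port to each call.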

4. Open the UI

Navigate to http://localhost:8080/ui in your browser.
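
If the page does not load, it can help to check from the terminal whether anything is answering at that address. This is a minimal sketch, assuming `curl` is installed; `wait_for_http` is a hypothetical helper, and treating any HTTP response as "up" is a simplification.

```shell
# Hypothetical helper: poll a URL until it answers or attempts run out.
# Any HTTP response (even an error status) counts as "up", because curl
# without -f fails only on connection-level errors.
wait_for_http() {
  url=$1
  attempts=${2:-30}
  i=1
  while [ "$i" -le "$attempts" ]; do
    if curl -sS -o /dev/null --max-time 2 "$url" 2>/dev/null; then
      echo "up: $url"
      return 0
    fi
    # Pause between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep 1
    fi
    i=$((i + 1))
  done
  echo "no response from $url after $attempts attempts" >&2
  return 1
}

# Usage:
#   wait_for_http http://localhost:8080/ui
```

A failure here usually means PrivateGPT has not finished starting or exited during startup, so the terminal running `private-gpt serve` is the first place to look.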

The API is available at http://localhost:8080 and follows the Anthropic API spec. See the API Reference for all endpoints.


What’s next?

If you plan to use database querying or web search tools, review the dependency guides in Database Tools and Web Tools to install the required drivers, OS libraries, and browser dependencies.