Model Configuration | PrivateGPT

Use a model profile when you need more detailed control over model behavior than auto-discovery provides.

This workflow lets you configure model-specific settings such as:

context_window
tokenizer
tool support
reasoning support
image support
sampling parameters

Use it when you want PrivateGPT to know the exact limits and capabilities of each model, or when you need to override what your provider exposes automatically.

This workflow is supported from the source-based "Local with uv" install and has three steps:

Generate settings-model.yaml from your running LLM server.
Edit the generated profile.
Start PrivateGPT with PGPT_PROFILES=model.

Generate a model profile

Generate a profile from the models exposed by your OpenAI-compatible server:

macOS / Linux

OPENAI_API_BASE=http://localhost:11434/v1 \
  make auto-discover-models
# or directly:
OPENAI_API_BASE=http://localhost:11434/v1 \
  uv run python scripts/auto_discover_models.py --out settings-model.yaml

This creates settings-model.yaml with all discovered models as a starting point for detailed configuration.

Note: complete the "Local with uv" install first. Local tokenizer support requires the private-gpt[tokenizer-local] or private-gpt[core] extra.
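To preview which model IDs a generated profile could start from, a minimal sketch follows. It assumes the server implements the standard OpenAI-compatible model listing (GET on `/models` under the API base), which returns a JSON object with a `data` array whose entries carry an `id`. The helper names and the skeleton fields are illustrative only; this is not the logic of scripts/auto_discover_models.py, which may detect considerably more.

```python
import json
import urllib.request


def model_ids(payload: dict) -> list[str]:
    """Extract model IDs from an OpenAI-style model listing payload."""
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def skeleton_entries(ids: list[str]) -> list[dict]:
    """Build bare profile entries (hypothetical shape); other fields are edited by hand."""
    return [{"name": model_id, "mode": "openai"} for model_id in ids]


def fetch_listing(base_url: str) -> dict:
    """Fetch the model listing from an OpenAI-compatible server."""
    url = f"{base_url.rstrip('/')}/models"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

With a server running, `skeleton_entries(model_ids(fetch_listing("http://localhost:11434/v1")))` would return one bare entry per exposed model.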

Edit model settings

Open settings-model.yaml and adjust the fields you care about. This is where you explicitly define how PrivateGPT should treat each model. Example:

llm:
  default_model: qwen3.5:35b

embedding:
  default_model: mxbai-embed-large

models:
  - name: qwen3.5:35b
    type: llm
    mode: openai
    context_window: 32768
    tokenizer: Qwen/Qwen3.5-35B-A3B
    support_tools: true
    support_reasoning: true
    support_image: 0
    sampling_params:
      temperature: 0.6
      top_p: 0.95
      top_k: 20
      min_p: 0.0

  - name: mxbai-embed-large
    type: embedding
    mode: openai
    context_window: 512
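Before starting PrivateGPT with a hand-edited profile, a quick consistency check can catch typos. The sketch below works on the profile as a plain Python dict (for example, after loading the YAML with a parser of your choice). The rules it applies are assumptions inferred from the example above, not PrivateGPT's own validation: each `default_model` should name an entry under `models` with the matching `type`, `context_window` should be a positive integer, and `support_image` should be a non-negative count.

```python
def check_profile(profile: dict) -> list[str]:
    """Return human-readable problems found in a model profile dict."""
    problems = []
    models = {m.get("name"): m for m in profile.get("models", [])}

    # Assumed rule: each default_model points at a declared model of the same type.
    for section in ("llm", "embedding"):
        default = profile.get(section, {}).get("default_model")
        if default is None:
            continue
        if default not in models:
            problems.append(f"{section}.default_model '{default}' is not listed under models")
        elif models[default].get("type") != section:
            problems.append(f"'{default}' is not declared with type: {section}")

    for name, model in models.items():
        window = model.get("context_window")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(window, int) or isinstance(window, bool) or window <= 0:
            problems.append(f"{name}: context_window should be a positive integer")
        images = model.get("support_image", 0)
        if not isinstance(images, int) or isinstance(images, bool) or images < 0:
            problems.append(f"{name}: support_image should be a count >= 0")
    return problems


example = {
    "llm": {"default_model": "qwen3.5:35b"},
    "embedding": {"default_model": "mxbai-embed-large"},
    "models": [
        {"name": "qwen3.5:35b", "type": "llm", "mode": "openai",
         "context_window": 32768, "support_image": 0},
        {"name": "mxbai-embed-large", "type": "embedding", "mode": "openai",
         "context_window": 512},
    ],
}
```

Calling `check_profile(example)` returns an empty list when nothing looks wrong; each string in a non-empty result describes one problem.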

Key fields reference

| Field | Description |
| --- | --- |
| `context_window` | Maximum tokens the model can process. Set explicitly to avoid overflow. |
| `support_tools` | Enable function and tool calling. Use the specific tool extra you need, or `private-gpt[tools]` as the bundle fallback. `private-gpt[core]` also includes that bundle. |
| `tokenizer` | HuggingFace repo ID for exact token counting (for example `Qwen/Qwen3.5-35B-A3B`). Requires `private-gpt[tokenizer-local]` or `private-gpt[core]`. Falls back to a character-based estimate if omitted. |
| `support_reasoning` | Enable extended thinking or reasoning mode. |
| `support_image` | Number of images per request the model accepts (`0` = disabled). |
| `sampling_params.temperature` | Randomness (`0` = deterministic, `1` = more creative). |
| `sampling_params.top_p` | Nucleus sampling probability mass. |
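To illustrate why `context_window` and `tokenizer` matter together, here is a rough sketch of a character-based fit check. The ratio of 4 characters per token is an assumption (a common rule of thumb for English text), not the value PrivateGPT uses for its fallback estimate, and the function names are hypothetical.

```python
import math

# Assumption: rough average for English text; real tokenizers vary by model and language.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Rough token count derived from character length, rounded up."""
    return math.ceil(len(text) / chars_per_token)


def fits_context(prompt: str, context_window: int, reserved_for_output: int = 0) -> bool:
    """True if the estimated prompt tokens plus the output reserve fit the window."""
    return estimate_tokens(prompt) + reserved_for_output <= context_window
```

An estimate like this can be off by a wide margin for code or non-English text, which is the case for configuring an exact `tokenizer` when prompts run close to the limit.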

Run with the profile

Once settings-model.yaml exists, start PrivateGPT with PGPT_PROFILES=model.

macOS / Linux

OPENAI_API_BASE=http://localhost:11434/v1 \
  PGPT_PROFILES=model \
  uv run python -m private_gpt

PGPT_PROFILES=model tells PrivateGPT to load settings-model.yaml on top of the base config. Profile files follow the naming convention settings-{name}.yaml.
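The naming convention and the "on top of" layering can be sketched as follows. Several details here are assumptions for illustration rather than confirmed behavior: that the base file is called `settings.yaml`, that `PGPT_PROFILES` may hold a comma-separated list, and that nested mappings are merged recursively with the profile winning on conflicts.

```python
def profile_files(profiles: str, base: str = "settings.yaml") -> list[str]:
    """Map a PGPT_PROFILES value to the settings files it names, base first."""
    names = [p.strip() for p in profiles.split(",") if p.strip()]
    return [base] + [f"settings-{name}.yaml" for name in names]


def merge(base: dict, override: dict) -> dict:
    """Recursively overlay override on base; override wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged
```

Under these assumptions, `profile_files("model")` yields the base file followed by `settings-model.yaml`, and a key set in the profile replaces the same key from the base while untouched base keys remain.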