LM Studio | PrivateGPT

LM Studio is a desktop application for discovering, downloading, and running GGUF models locally. Its built-in local server exposes an OpenAI-compatible API with full tokenizer support.

Capabilities with PrivateGPT

Capability	Status
Model discovery (`/v1/models`)	✅
Tokenizer endpoint (`/tokenize`)	✅
Embeddings	✅
Tool / function calling	✅ model-dependent
Structured output	❌
Streaming	✅
Vision / image input	✅ model-dependent

Setup

Install LM Studio

Download and install from lmstudio.ai. Available for macOS, Windows, and Linux.

Download models

Open LM Studio.
Go to the Discover tab (magnifying glass icon).
Search for a model. Example:
- LLM: search unsloth Qwen3.5-35B-A3B and pick a Q4 quantization (~18 GB)
- Embeddings: search mxbai-embed-large
Click the model and select a quantization (Q4_K_M is a good default).
Click Download.

Start the local server

Click the Developer tab (left sidebar, </> icon).
Select your downloaded model from the dropdown.
Click Start Server.

The default server address is http://localhost:1234.

To serve an embeddings model simultaneously, scroll down in the Developer panel and load a second model under “Embedding model”.

Run PrivateGPT

Package install

Docker

uv (local)

$ OPENAI_API_BASE=http://localhost:1234/v1 private-gpt serve

Advanced profile example

1 # settings-model.yaml
2 llm:
3   default_model: qwen3-35b-a3b-q4_k_m
4 
5 embedding:
6   default_model: mxbai-embed-large-v1
7 
8 models:
9   - name: qwen3-35b-a3b-q4_k_m
10     type: llm
11     mode: openai
12     context_window: 32768
13     tokenizer: Qwen/Qwen3.5-35B-A3B
14     support_tools: true
15     support_reasoning: true
16     sampling_params:
17       temperature: 0.6
18       top_p: 0.95
19       top_k: 20
20       min_p: 0.0
21 
22   - name: mxbai-embed-large-v1
23     type: embedding
24     mode: openai
25     context_window: 512

Generate this automatically (with LM Studio server running):

$ OPENAI_API_BASE=http://localhost:1234/v1 \
>   uv run python scripts/auto_discover_models.py --out settings-model.yaml

Troubleshooting

CORS errors from the browser

Enable CORS in LM Studio: Developer → Server Settings → Enable CORS.

Model name doesn’t match

LM Studio uses the file name as the model ID. Check the exact name with:

$ curl http://localhost:1234/v1/models

Use the id field from the response as the model name in your profile.