Getting started

Main Concepts

PrivateGPT is a service that wraps a set of AI RAG primitives in a comprehensive set of APIs, providing a private, secure, customizable, and easy-to-use GenAI development framework.

It uses FastAPI and LlamaIndex as its core frameworks. Those can be customized by changing the codebase itself.

It supports a variety of LLM providers, embeddings providers, and vector stores, both local and remote. Those can be easily changed without changing the codebase.

Support for Different Setups

Setup configurations available

You get to decide the setup for these 3 main components:

  • LLM: the large language model provider used for inference. It can be local or remote, including OpenAI.
  • Embeddings: the embeddings provider used to encode the input, the documents and the users’ queries. Like the LLM, it can be local or remote, including OpenAI.
  • Vector store: the store used to index and retrieve the documents.

There is an extra component that can be enabled or disabled: the UI. It is a Gradio UI that lets you interact with the API in a more user-friendly way.

Setups and Dependencies

Your setup is the combination of the options you choose for each component. You’ll find recommended setups in the installation section.

PrivateGPT uses poetry to manage its dependencies. You can install the dependencies for a given setup by running poetry install --extras "<extra1> <extra2>...", where each extra corresponds to one of the options available for a component. For example, to install the dependencies for a local setup with the UI, Qdrant as the vector database, Ollama as the LLM and HuggingFace as local embeddings, you would run:

poetry install --extras "ui vector-stores-qdrant llms-ollama embeddings-huggingface"

Refer to the installation section for more details.
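
As a rough illustration of how extras map onto the components, the following Python sketch assembles the install command from one choice per component. The helper build_install_command and its parameters are hypothetical and not part of PrivateGPT; only extra names that appear in this section are used, and the helper does not check that an extra actually exists.

```python
import shlex


def build_install_command(llm, embeddings, vector_store, ui=True):
    """Return the `poetry install` command for one choice per component.

    Hypothetical helper for illustration only. Extra names are passed
    through unchanged, so they must match the extras the project defines.
    """
    extras = []
    if ui:
        extras.append("ui")  # the optional Gradio UI
    extras += [vector_store, llm, embeddings]
    # shlex.join quotes the space-separated extras list as a single argument.
    return shlex.join(["poetry", "install", "--extras", " ".join(extras)])


print(build_install_command(
    llm="llms-ollama",
    embeddings="embeddings-huggingface",
    vector_store="vector-stores-qdrant",
))
```

The printed command matches the example in this section, except that the extras list is wrapped in single quotes rather than double quotes, which a POSIX shell treats the same way here.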

Setups and Configuration

PrivateGPT uses YAML to define its configuration in files named settings-<profile>.yaml. Different configuration files can be created in the root directory of the project. At startup, PrivateGPT loads the configuration for the profile specified in the PGPT_PROFILES environment variable. For example, running:

PGPT_PROFILES=ollama make run

will load the configuration from settings.yaml and settings-ollama.yaml.

  • settings.yaml is always loaded and contains the default configuration.
  • settings-ollama.yaml is loaded if the ollama profile is specified in the PGPT_PROFILES environment variable. It can override configuration from the default settings.yaml.
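
The layering described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not PrivateGPT's actual loader: it assumes that multiple profiles are given as a comma-separated list, that later files override earlier ones key by key, and it uses hypothetical, already-parsed settings dictionaries in place of real YAML files. The function names settings_files and deep_merge are made up for this example.

```python
def settings_files(profiles_env):
    """List the settings files to load, in order, for a PGPT_PROFILES value.

    Assumption: several profiles may be given as a comma-separated list.
    """
    profiles = [p.strip() for p in profiles_env.split(",") if p.strip()]
    # settings.yaml always comes first; profile files follow in order.
    return ["settings.yaml"] + [f"settings-{p}.yaml" for p in profiles]


def deep_merge(base, override):
    """Return a new dict in which values from `override` win over `base`.

    Nested dicts are merged key by key (an assumed merge strategy).
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical parsed contents; the keys and values are illustrative only.
default_settings = {"llm": {"mode": "default-mode", "max_tokens": 256},
                    "ui": {"enabled": True}}
ollama_settings = {"llm": {"mode": "ollama"}}

print(settings_files("ollama"))
print(deep_merge(default_settings, ollama_settings))
```

In this sketch the profile file only needs to contain the keys it changes; everything else falls through from the defaults.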

About Fully Local Setups

In order to run PrivateGPT in a fully local setup, you will need to run the LLM, Embeddings and Vector Store locally.

Vector stores

The vector stores supported (Qdrant, ChromaDB and Postgres) run locally by default.

Embeddings

For local Embeddings there are two options:

  • (Recommended) You can use the ‘ollama’ option in PrivateGPT, which will connect to your local Ollama instance. Ollama greatly simplifies the installation of local models.
  • You can use the ‘embeddings-huggingface’ option in PrivateGPT, which will use HuggingFace.

For HuggingFace embeddings to work (the second option), you need to download the embeddings model to the models folder. You can do so by running the setup script:

poetry run python scripts/setup

LLM

For local LLM there are two options:

  • (Recommended) You can use the ‘ollama’ option in PrivateGPT, which will connect to your local Ollama instance. Ollama greatly simplifies the installation of local LLMs.
  • You can use the ‘llms-llama-cpp’ option in PrivateGPT, which will use LlamaCPP. It works well on Mac most of the time (it leverages the Metal GPU), but it can be tricky on certain Linux and Windows distributions, depending on the GPU. The installation document contains guides and troubleshooting.

For a LlamaCPP-powered LLM to work (the second option), you need to download the LLM model to the models folder. You can do so by running the setup script:

poetry run python scripts/setup
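
To summarize when the setup script is needed, here is a small Python sketch that encodes only the two cases named in this section (the ‘embeddings-huggingface’ and ‘llms-llama-cpp’ options). The function needs_setup_script is hypothetical and is not part of PrivateGPT; whether other options require additional steps is outside what this section states.

```python
# Options that, per this section, need models downloaded to the models folder.
EXTRAS_NEEDING_SETUP = {"embeddings-huggingface", "llms-llama-cpp"}


def needs_setup_script(extras):
    """Return True if any selected option relies on locally downloaded models."""
    return bool(EXTRAS_NEEDING_SETUP & set(extras))


# HuggingFace embeddings are selected, so the setup script should be run.
print(needs_setup_script(
    ["ui", "vector-stores-qdrant", "llms-ollama", "embeddings-huggingface"]))

# Neither of the two options is selected.
print(needs_setup_script(["ui", "vector-stores-qdrant", "llms-ollama"]))
```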