Troubleshooting
Downloading Gated and Private Models
Many models are gated or private, requiring special access to use them. Follow these steps to gain access and set up your environment for using these models.
Accessing Gated Models
- Request Access: Request access to the gated model on its Hugging Face model page.
- Generate a Token: Once you have access, generate an access token in your Hugging Face account settings.
- Set the Token: Add the generated token to your `settings.yaml` file, or alternatively set the `HF_TOKEN` environment variable (see the sketch below).
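A minimal sketch of both options. The `huggingface.access_token` key follows PrivateGPT's settings layout, but treat the exact key as an assumption and verify it against the `settings.yaml` shipped with your version:

```yaml
# settings.yaml — token used to download gated/private models
# (assumption: recent PrivateGPT versions read huggingface.access_token)
huggingface:
  access_token: <your-hf-token>
```

```bash
# Alternatively, export the token as an environment variable
export HF_TOKEN=<your-hf-token>
```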
Tokenizer Setup
PrivateGPT uses Hugging Face's AutoTokenizer (from the transformers library) to tokenize input text accurately. It connects to the Hugging Face Hub to download the appropriate tokenizer for the specified model.
Configuring the Tokenizer
- Specify the Model: In your `settings.yaml` file, specify the model whose tokenizer you want to use (see the sketch after this list).
- Set Access Token for Gated Models: If you are using a gated model, ensure the `access_token` is set as described in the previous section.

This configuration ensures that PrivateGPT can download and use the correct tokenizer for the model you are working with.
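A minimal sketch of the tokenizer setting; the `llm.tokenizer` key follows PrivateGPT's settings layout, and the Mistral model name is only an illustrative value:

```yaml
# settings.yaml — tokenizer to download from the Hugging Face Hub
# (model name is an example; use the model you actually run)
llm:
  tokenizer: mistralai/Mistral-7B-Instruct-v0.2
```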
Embedding dimensions mismatch
If you encounter an error message like `Embedding dimensions mismatch`, it is likely because the dimension of the configured embedding model does not match the dimension of the vectors already stored in your vector store. To resolve this issue, ensure that the embedding model and the stored embeddings have the same vector dimension.
By default, PrivateGPT uses `nomic-embed-text` embeddings, which have a vector dimension of 768. If you are using a different embedding model, ensure that the configured vector dimension matches the model's output dimension (see the sketch below).
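For reference, a sketch of the corresponding default setting; the key names are assumed from PrivateGPT 0.6.0's settings layout and should be verified against your installed version:

```yaml
# settings.yaml — default embedding dimension in 0.6.0+ (assumed layout)
embedding:
  embed_dim: 768   # nomic-embed-text produces 768-dimensional vectors
```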
In versions prior to 0.6.0, the default embedding model was `BAAI/bge-small-en-v1.5` in the `huggingface` setup. If you plan to reuse embeddings generated with that model, update the `settings.yaml` file to point back to it (see the sketch below).
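A sketch of the settings to revert to the old model; the `embedding_hf_model_name` key is assumed from PrivateGPT's `huggingface` setup, and the 384 dimension matches the published BGE model card, but verify both against your version:

```yaml
# settings.yaml — reuse embeddings generated before 0.6.0
# (assumption: the huggingface setup reads embedding_hf_model_name)
embedding:
  embed_dim: 384   # BAAI/bge-small-en-v1.5 produces 384-dim vectors
huggingface:
  embedding_hf_model_name: BAAI/bge-small-en-v1.5
```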
Building Llama-cpp with NVIDIA GPU support
Out-of-memory error
If you encounter an out-of-memory error while running `llama-cpp` with CUDA, you can try the following steps to resolve the issue (thanks to MarioRossiGithub for providing this solution):

- Set the required environment variable before launching (see the sketch after this list).
- Run PrivateGPT.
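A hedged sketch of the two steps. `GGML_CUDA_NO_PINNED` is a real llama.cpp CUDA option that reduces pinned-memory pressure and is offered here only as an assumption, not necessarily the contributor's original fix, while `PGPT_PROFILES=local make run` is PrivateGPT's documented local launch command:

```bash
# Assumption: the variable below is one common llama.cpp mitigation,
# not necessarily the exact environment variable from the original fix.
export GGML_CUDA_NO_PINNED=1   # disable pinned host memory in the CUDA backend

# Run PrivateGPT with the local profile
PGPT_PROFILES=local make run
```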