Given a text, returns the most relevant chunks from the ingested documents. The returned information can be used to generate prompts that can then be passed to the /completions or /chat/completions APIs. Note: this is usually a very fast API, because only the Embeddings model is involved, not the LLM. The returned information contains the relevant chunk text together with the source document it comes from, as well as a score that can be used to compare different results.
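
As a minimal sketch, the endpoint could be called from Python like this. The endpoint path (/chunks), the local server address, and the response field names (data, score, text) are assumptions for illustration; check your deployment's API reference for the exact values.

```python
import requests

BASE_URL = "http://localhost:8001"  # assumed local deployment address

# Ask for the chunks most relevant to a query text.
# The /chunks path is an assumption; substitute your actual route.
response = requests.post(f"{BASE_URL}/chunks", json={"text": "How is data ingested?"})
response.raise_for_status()

# Each result is assumed to carry the chunk text, its source document,
# and a relevance score for comparing results.
for chunk in response.json()["data"]:
    print(chunk["score"], chunk["text"][:80])
```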
The maximum number of chunks to return is set using the limit param.
Previous and next chunks (the pieces of text that appear immediately before or after a chunk in its source document) can be fetched using the prev_next_chunks field.
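
A sketch combining both parameters, under the same assumptions as above (endpoint path and server address are placeholders); here prev_next_chunks is assumed to take the number of surrounding chunks to fetch per result:

```python
import requests

BASE_URL = "http://localhost:8001"  # assumed local deployment address

# Return at most 5 chunks, each with the 2 pieces of text that precede
# and follow it in the source document (count semantics assumed).
response = requests.post(
    f"{BASE_URL}/chunks",  # path assumed, as above
    json={
        "text": "How is data ingested?",
        "limit": 5,
        "prev_next_chunks": 2,
    },
)
response.raise_for_status()
print(response.json())
```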
The documents used for retrieval can be filtered by passing the IDs of the documents to use in the context_filter. Ingested document IDs can be found using the /ingest/list endpoint. If you want all ingested documents to be used, remove context_filter altogether.
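
A sketch of restricting retrieval to specific documents. The same path and address assumptions apply; the response shape of /ingest/list and the docs_ids key inside context_filter are also assumptions for illustration:

```python
import requests

BASE_URL = "http://localhost:8001"  # assumed local deployment address

# List the ingested documents and collect their IDs.
ingested = requests.get(f"{BASE_URL}/ingest/list").json()
doc_ids = [doc["doc_id"] for doc in ingested["data"]]  # field names assumed

# Restrict retrieval to those documents via context_filter.
response = requests.post(
    f"{BASE_URL}/chunks",  # path assumed, as above
    json={
        "text": "How is data ingested?",
        "context_filter": {"docs_ids": doc_ids},  # key name assumed
    },
)
response.raise_for_status()
```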