Given a text, returns the most relevant chunks from the ingested documents.
The returned information can be used to generate prompts that can be
passed to /completions or /chat/completions APIs. Note: this API is usually very
fast, because only the Embeddings model is involved, not the LLM. The
returned information contains the relevant chunk text together with the source
document it comes from, as well as a score that can be used to
compare different results.
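As an illustration, the following is a minimal sketch of calling this endpoint with Python's requests library. The base URL, the /v1/chunks path, and the exact response field names (data, text, score, document, doc_id) are assumptions and may differ in your deployment; only the concepts (chunk text, source document, relevance score) come from the description above.

```python
import requests

BASE_URL = "http://localhost:8001"  # hypothetical local deployment

# Ask for the most relevant chunks for a given text (path assumed).
response = requests.post(
    f"{BASE_URL}/v1/chunks",
    json={"text": "What is the refund policy?"},
)
response.raise_for_status()

# Assumed response shape: a list of chunks, each carrying its text,
# a relevance score, and the source document it came from.
for chunk in response.json().get("data", []):
    print(f"score={chunk['score']:.3f}  doc={chunk['document']['doc_id']}")
    print(chunk["text"][:200])
```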
The maximum number of chunks to return is set using the limit param.
Previous and next chunks (the pieces of text that appear immediately before or
after the matched chunk in the document) can be fetched using the
prev_next_chunks field.
The documents to search can be restricted using the context_filter, passing
the IDs of the documents to include. Ingested document IDs can be found using
the /ingest/list endpoint. If you want all ingested documents to be used,
remove context_filter altogether.
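The sketch below combines these parameters: it first lists ingested documents via the /ingest/list endpoint and then restricts retrieval to a subset of them, also requesting one neighbouring chunk on each side. The parameter names limit, prev_next_chunks, and context_filter come from the description above; the /v1 path prefix and payload field names such as doc_id and docs_ids are assumptions about the request and response shapes.

```python
import requests

BASE_URL = "http://localhost:8001"  # hypothetical local deployment

# Fetch the IDs of all ingested documents (path prefix and response shape assumed).
docs = requests.get(f"{BASE_URL}/v1/ingest/list").json().get("data", [])
doc_ids = [d["doc_id"] for d in docs]

# Retrieve at most 4 chunks, each with one previous and one next chunk,
# restricted to the first two ingested documents.
payload = {
    "text": "What is the refund policy?",
    "limit": 4,
    "prev_next_chunks": 1,
    "context_filter": {"docs_ids": doc_ids[:2]},  # omit to search all documents
}
chunks = requests.post(f"{BASE_URL}/v1/chunks", json=payload).json().get("data", [])
for chunk in chunks:
    print(chunk["score"], chunk["text"][:120])
```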