Given a list of messages comprising a conversation, return a response. Optionally include an initial `role: system` message to influence the way the LLM answers.

If `use_context` is set to `true`, the model will use context coming from the ingested documents to create the response. The documents being used can be filtered with `context_filter` by passing the IDs of the documents to use. Ingested document IDs can be found using the `/ingest/list` endpoint. If you want all ingested documents to be used, omit `context_filter` altogether.
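As an illustration, here is a minimal sketch of a context-aware request in Python. The base URL, the `/v1/...` route prefixes, the `docs_ids` filter key, and the shape of the `/ingest/list` response are assumptions, not confirmed by this reference; adjust them to match your deployment.

```python
import requests

API_URL = "http://localhost:8001"  # assumed base URL; adjust to your deployment

# List previously ingested documents to obtain their IDs
# (response shape assumed: a "data" array of docs carrying a "doc_id" field).
ingested = requests.get(f"{API_URL}/v1/ingest/list").json()
doc_ids = [doc["doc_id"] for doc in ingested["data"]]

body = {
    "messages": [
        {"role": "system", "content": "Answer strictly from the provided context."},
        {"role": "user", "content": "Summarize the key findings."},
    ],
    "use_context": True,
    # Restrict retrieval to specific documents; omit context_filter
    # entirely to let the model use all ingested documents.
    "context_filter": {"docs_ids": doc_ids[:2]},
}
response = requests.post(f"{API_URL}/v1/chat/completions", json=body).json()
print(response["choices"][0]["message"]["content"])
```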
When using `include_sources: true`, the API will return the source Chunks used to create the response, which come from the context provided.
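Continuing the sketch above, a request with `include_sources: true` could be handled as follows. Where the source Chunks are attached in the response body is an assumption (here, on each choice); inspect the raw JSON from your deployment to confirm the actual placement.

```python
body["include_sources"] = True
response = requests.post(f"{API_URL}/v1/chat/completions", json=body).json()

# Placement of the source Chunks is an assumption; adjust the key if your
# deployment attaches them elsewhere in the response.
for source in response["choices"][0].get("sources", []):
    print(source)
```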
When using `stream: true`, the API will return data chunks following OpenAI's streaming format. Each chunk is an inference result that carries the source of the message: the `role` is either `assistant` (a response generated by the AI) or `system` (a default response, not AI generated). Either the `delta` or the `message` field will be present, but never both: streaming chunks carry a `delta`, while a non-streaming response carries a full `message`. The sources used are returned when context retrieval was enabled.
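Continuing the same sketch, the stream can be consumed as server-sent events. The `data: ...` line framing and the `data: [DONE]` end-of-stream sentinel follow OpenAI's streaming convention, which this endpoint is documented to mirror; treat the exact framing as an assumption to verify against your deployment.

```python
import json

body["stream"] = True
with requests.post(f"{API_URL}/v1/chat/completions", json=body, stream=True) as resp:
    for line in resp.iter_lines():
        # SSE framing assumed: each event is a line of the form "data: {...}".
        if not line or not line.startswith(b"data: "):
            continue
        payload = line[len(b"data: "):]
        if payload == b"[DONE]":  # OpenAI's end-of-stream sentinel
            break
        chunk = json.loads(payload)
        # Streaming chunks carry a delta, never a full message.
        print(chunk["choices"][0]["delta"].get("content", ""), end="", flush=True)
print()
```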