Contextual Completions

Completion Stream

POST

We recommend most users use our Chat completions API.

Given a prompt, the model will return one predicted completion.

Optionally include a system_prompt to influence the way the LLM answers.

If use_context is set to true, the model will use context coming from the ingested documents to create the response. The documents used can be filtered by passing their IDs in the context_filter; the IDs of ingested documents can be retrieved from the /ingest/list endpoint. If you want all ingested documents to be used, omit context_filter altogether.
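For illustration, a minimal Python sketch that fetches the IDs of ingested documents and builds a context_filter from them. The base URL, the /v1 path prefix, the response shape of the list endpoint, and the docs_ids field name inside context_filter are assumptions, not guaranteed by this reference.

import requests

BASE_URL = "http://localhost:8001"  # assumed default; adjust to your deployment

# Discover ingested document IDs (path prefix and response shape assumed).
listing = requests.get(f"{BASE_URL}/v1/ingest/list").json()
doc_ids = [doc["doc_id"] for doc in listing["data"]]

# Restrict context retrieval to those documents (field name assumed).
context_filter = {"docs_ids": doc_ids}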

When using 'include_sources': true, the API will return the source Chunks used to create the response, which come from the context provided.

When using 'stream': true, the API will return data chunks following OpenAI’s streaming model:

{"id":"12345","object":"completion.chunk","created":1694268190,
"model":"private-gpt","choices":[{"index":0,"delta":{"content":"Hello"},
"finish_reason":null}]}

Request

This endpoint expects an object.
prompt (string, required)
stream (boolean, required; must be true for this streaming endpoint)
system_prompt (string, optional)
use_context (boolean, optional)
context_filter (object, optional)
include_sources (boolean, optional)

Response

This endpoint returns a stream of objects with the following fields.
id (string)
created (integer)
model (any)
choices (list of objects)

Response from the AI.

Either delta or message will be present in each choice, but never both. The sources used will be returned when context retrieval was enabled (see the sketch after this list).

object (enum, optional)
Allowed values: completion, completion.chunk
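To make the delta-or-message contract concrete, a small helper sketch. The message payload shape and the name and placement of the sources field are assumptions based on the description above, not a documented schema.

def read_choice(choice: dict) -> tuple[str, list]:
    """Extract the text and any sources from one choice object."""
    # delta (streaming) and message are mutually exclusive.
    part = choice.get("delta") or choice.get("message") or {}
    text = part.get("content") or ""
    # Sources appear only when context retrieval was enabled (field name assumed).
    sources = choice.get("sources") or []
    return text, sources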