- Combines vector search, optional knowledge graph integration, and LLM generation
- Automatically cites sources with unique citation identifiers
- Supports both streaming and non-streaming responses
- Compatible with various LLM providers (OpenAI, Anthropic, etc.)
- Web search integration for up-to-date information
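The request flow above can be sketched as a small helper that assembles the request body. The field names mirror the parameters described later in this reference (`query`, `search_mode`, `rag_generation_config`); the endpoint URL and auth header in the comment are deployment-specific placeholders, not part of this API's documented surface.

```python
def build_rag_request(query, search_mode="custom", stream=False):
    """Assemble a request body for the RAG endpoint.

    A minimal sketch: only the parameters covered in this reference
    are included; real deployments may accept additional fields.
    """
    return {
        "query": query,
        "search_mode": search_mode,
        "rag_generation_config": {"stream": stream},
    }

# Sending it (hypothetical base URL; requires an OAuth 2.0 access token):
# import requests
# resp = requests.post(
#     "https://api.example.com/rag",
#     headers={"Authorization": f"Bearer {token}"},
#     json=build_rag_request("What is RAG?"),
# )
```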
Supported model providers for `rag_generation_config`:
- OpenAI models (default)
- Anthropic Claude models (requires ANTHROPIC_API_KEY)
- Local models via Ollama
- Any provider supported by LiteLLM
When `stream: true` is set, the endpoint returns Server-Sent Events with the following event types:
- `search_results`: Initial search results from your documents
- `message`: Partial tokens as they're generated
- `citation`: Citation metadata when sources are referenced
- `final_answer`: Complete answer with structured citations
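Consuming the stream means splitting it into SSE frames and dispatching on the event type. A minimal parser for a single frame is sketched below; it assumes the common `event: <type>` / `data: <json>` layout and does not handle multi-line `data:` fields, which the full SSE format allows.

```python
import json

def parse_sse_event(raw: str):
    """Parse one SSE frame into (event_type, decoded_data).

    Sketch only: handles single-line 'event:' and 'data:' fields,
    which covers the four event types listed above.
    """
    event, data = None, None
    for line in raw.strip().splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data = json.loads(line[len("data:"):].strip())
    return event, data

frame = 'event: message\ndata: {"delta": "Hel"}'
# parse_sse_event(frame) → ("message", {"delta": "Hel"})
```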
Authorizations
The access token received from the authorization server in the OAuth 2.0 flow.
Body
The user's question
Default value of `custom` allows full control over search settings. Pre-configured search modes:
- `basic`: A simple semantic-based search.
- `advanced`: A more powerful hybrid search combining semantic and full-text.
- `custom`: Full control via `search_settings`.

If `filters` or `limit` are provided alongside `basic` or `advanced`, they will override the default settings for that mode.
Available options: `basic`, `advanced`, `custom`
The search configuration object. If `search_mode` is `custom`, these settings are used as-is. For `basic` or `advanced`, these settings will override the default mode configuration. Common overrides include `filters` to narrow results and `limit` to control how many results are returned.
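A request body combining a pre-configured mode with overrides might look like the sketch below. The filter-operator syntax (`$eq`) and the document-ID value are illustrative assumptions; consult your deployment for the exact filter grammar it accepts.

```python
# 'advanced' mode with overrides: filters and limit replace the mode's
# defaults, while the rest of the advanced configuration is kept.
body = {
    "query": "What did the report conclude?",
    "search_mode": "advanced",
    "search_settings": {
        "filters": {"document_id": {"$eq": "doc-123"}},  # hypothetical filter syntax
        "limit": 10,
    },
}
```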
Configuration for RAG generation
Optional custom prompt to override the default.
Include document titles in responses when available
Include web search results in the context provided to the LLM.
Response
200