The R2R RAG Query endpoint performs Retrieval-Augmented Generation by combining semantic search, graph-enhanced context, and LLM generation. It supports streaming, source citation, and multiple model providers for contextually accurate AI responses.
| Provider | Description |
|---|---|
| OpenAI | Default provider supporting GPT-based models (gpt-4o, gpt-4o-mini, etc.). |
| Anthropic | Supports Claude models (requires ANTHROPIC_API_KEY). |
| Ollama | Enables local model execution via Ollama runtime. |
| LiteLLM | Provides access to additional supported model providers. |
Search settings from the `/search` endpoint can be reused here, including filters, hybrid search, and graph-enhanced retrieval.
Generation is configured through the `rag_generation_config` object.
Example:
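A minimal sketch of such a configuration, built as a plain Python dictionary and serialized to JSON. The helper name `make_generation_config` and the default values are illustrative assumptions, not part of the R2R API. The model name is taken from the provider table above, and the exact model identifier format your deployment expects may differ.

```python
import json


def make_generation_config(model, temperature=0.0, max_tokens=512, stream=False):
    """Build a rag_generation_config dictionary (hypothetical helper).

    Only the four fields described in this section are included.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be a positive integer")
    return {
        "model": model,              # model used for generation
        "temperature": temperature,  # 0 is deterministic, higher is more random
        "max_tokens": max_tokens,    # maximum output length
        "stream": stream,            # True enables token streaming
    }


# Example values are assumptions chosen for illustration.
config = make_generation_config("gpt-4o-mini", temperature=0.2, stream=True)
config_json = json.dumps(config, indent=2)
```

The resulting `config_json` string is the kind of object that would be sent as the `rag_generation_config` field of the request body.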
- `model`: Specifies the model used for generation.
- `temperature`: Controls output randomness (0 for deterministic, 1 for creative).
- `max_tokens`: Sets maximum output length.
- `stream`: Enables or disables token streaming for real-time responses.

When `stream: true` is enabled, the API emits Server-Sent Events (SSE) during processing.

| Event Type | Description |
|---|---|
| search_results | Contains the initial search results from your documents. |
| message | Streams partial tokens as the model generates them. |
| citation | Emits citation metadata when a source is referenced. |
| final_answer | Contains the complete, generated response with structured citations. |
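
The event types above can be consumed with a small SSE parser. The sketch below follows the generic SSE wire format (`event:` and `data:` fields, with a blank line ending each event) and uses only the standard library. The function name `iter_sse_events` and the payloads in `SAMPLE_STREAM` are made up for illustration; the real payload structure of each event is not described here and is treated as an opaque string.

```python
from typing import Iterable, Iterator, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_type, data) pairs from an iterable of SSE lines."""
    event_type = "message"  # SSE default when no "event:" field is given
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data_lines:
                yield event_type, "\n".join(data_lines)
            event_type, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment or keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # strip the single optional leading space
        if field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)
    # An event not terminated by a blank line is discarded.


# Hypothetical stream with placeholder payloads, for illustration only.
SAMPLE_STREAM = (
    "event: search_results\n"
    'data: {"placeholder": true}\n'
    "\n"
    "event: message\n"
    "data: Hel\n"
    "\n"
    "event: message\n"
    "data: lo\n"
    "\n"
    "event: final_answer\n"
    "data: Hello\n"
    "\n"
)

events = list(iter_sse_events(SAMPLE_STREAM.splitlines()))
partial_text = "".join(data for kind, data in events if kind == "message")
```

In a real client the lines would come from the HTTP response body rather than a string, and each `data` value would be decoded according to the event type.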
The access token received from the authorization server in the OAuth 2.0 flow.
The user's question
Pre-configured search mode. The default value, `custom`, allows full control over search settings.

- `basic`: A simple semantic-based search.
- `advanced`: A more powerful hybrid search combining semantic and full-text.
- `custom`: Full control via `search_settings`.

If `filters` or `limit` are provided alongside `basic` or `advanced`, they override the default settings for that mode.

Allowed values: `basic`, `advanced`, `custom`

The search configuration object. If `search_mode` is `custom`, these settings are used as-is. For `basic` or `advanced`, these settings override the default mode configuration. Common overrides include `filters` to narrow results and `limit` to control how many results are returned.
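
The override behavior described above can be sketched as a dictionary merge. This is a client-side illustration of the semantics only: the server performs the actual resolution, and the per-mode defaults in `HYPOTHETICAL_MODE_DEFAULTS` (including the `use_hybrid_search` key) are invented placeholders, not R2R's real defaults.

```python
from typing import Any, Dict, Optional

VALID_MODES = ("basic", "advanced", "custom")

# Placeholder defaults for illustration; the real values are server-defined.
HYPOTHETICAL_MODE_DEFAULTS: Dict[str, Dict[str, Any]] = {
    "basic": {"use_hybrid_search": False, "limit": 10},
    "advanced": {"use_hybrid_search": True, "limit": 10},
}


def resolve_search_settings(
    search_mode: str = "custom",
    search_settings: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return the effective settings for a mode plus user overrides."""
    if search_mode not in VALID_MODES:
        raise ValueError(f"search_mode must be one of {VALID_MODES}")
    overrides = dict(search_settings or {})
    if search_mode == "custom":
        return overrides  # used as-is
    # For basic/advanced, user-supplied keys win over the mode defaults.
    return {**HYPOTHETICAL_MODE_DEFAULTS[search_mode], **overrides}


effective = resolve_search_settings(
    "advanced", {"limit": 3, "filters": {"example_field": "example_value"}}
)
```

Here `effective` keeps the mode's hybrid-search default while `limit` and `filters` come from the caller.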
Configuration for RAG generation
Optional custom prompt to override default
Include document titles in responses when available
Include web search results provided to the LLM.
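
Putting the parameters together, a request could be assembled as in the sketch below. Several details are assumptions, not taken from this page: the endpoint URL is passed in by the caller, the access token is sent as an `Authorization: Bearer` header, and the question is sent under a `query` key. The request object is only constructed, not sent.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def build_rag_request(
    endpoint_url: str,
    access_token: str,
    query: str,
    search_mode: str = "custom",
    search_settings: Optional[Dict[str, Any]] = None,
    rag_generation_config: Optional[Dict[str, Any]] = None,
) -> urllib.request.Request:
    """Construct (but do not send) a POST request for a RAG query."""
    body: Dict[str, Any] = {"query": query, "search_mode": search_mode}
    if search_settings is not None:
        body["search_settings"] = search_settings
    if rag_generation_config is not None:
        body["rag_generation_config"] = rag_generation_config
    return urllib.request.Request(
        url=endpoint_url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",  # assumed header scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder URL and token, for illustration only.
request = build_rag_request(
    "https://example.invalid/rag",
    "YOUR_ACCESS_TOKEN",
    "What does the report conclude?",
    search_mode="basic",
    search_settings={"limit": 5},
)
```

Passing `request` to `urllib.request.urlopen` would send it; optional fields are omitted from the body when not provided so that server defaults apply.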