Overview

What Is RAG on R2R?

Retrieval + Generation Workflow: RAG operates in two phases:
- Retrieve relevant document chunks via semantic search or knowledge graph lookup.
- Generate coherent, contextually accurate responses from those retrieved chunks and your customized prompts.

Why It Matters: Because answers draw on your actual content, whether structured or unstructured, RAG responses are grounded in source material, which reduces hallucinations and improves trustworthiness.
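
The two phases can be illustrated with a small, self-contained sketch. This is a toy model and not how R2R is implemented: the embeddings are hand-made three-dimensional vectors, and the `retrieve` and `build_prompt` helpers are hypothetical names introduced only for illustration. In a real deployment the API computes embeddings, performs the search, and calls the language model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, chunks, top_k=2):
    """Phase 1: rank (text, embedding) pairs by similarity to the query."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]

def build_prompt(query, retrieved):
    """Phase 2 input: combine the retrieved chunks and the question."""
    context = "\n".join(retrieved)
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical, hand-made embeddings; a real system computes these with a model.
chunks = [
    ("RAG combines retrieval with generation.", [0.9, 0.1, 0.0]),
    ("Bananas are yellow.", [0.0, 0.1, 0.9]),
    ("Retrieved chunks ground the answer.", [0.8, 0.3, 0.1]),
]
top = retrieve([1.0, 0.2, 0.0], chunks, top_k=2)
prompt = build_prompt("What is RAG?", top)
print(prompt)
```

The unrelated chunk is ranked last and never reaches the prompt, which is the sense in which retrieval grounds the generation step.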

Core Components

Component	Description
Documents & Chunks	Ingested files or text are segmented into Chunks—the basis for retrieval.
Indices	Vector indices enable fast similarity search over chunk embeddings.
Graphs	Knowledge graphs capture the entities and relationships extracted from your documents, enabling navigation across related concepts.
Prompts	Prompt templates shape the generation step, with type-safe inputs and version control.
System Endpoints	Provide health checks, diagnostics, and monitoring for your RAG pipeline.

Getting Started

To get started with the R2R API, you’ll need to:

- Install R2R in your environment.
- Run the server with python -m r2r.serve, or customize the FastAPI application for production settings.

For detailed installation and setup instructions, refer to the R2R Installation Guide.

Authentication

API keys

IO Intelligence APIs authenticate requests using API keys. You can generate API keys from your user account.

Always treat your API key as a secret! Do not share it or expose it in client-side code (e.g., browsers or mobile apps). Instead, store it securely in an environment variable or a key management service on your backend server.

Include the API key in an Authorization HTTP header for all API requests:

Authorization: Bearer $IOINTELLIGENCE_API_KEY
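
As a minimal sketch of the advice above, the key can be read from an environment variable on the backend and turned into request headers. The helper name `auth_headers` is hypothetical; the environment variable name matches the one used in the header example.

```python
import os

def auth_headers(env_var="IOINTELLIGENCE_API_KEY"):
    """Build request headers from an API key stored in an environment variable."""
    api_key = os.environ.get(env_var)
    if not api_key:
        # Fail early rather than sending an unauthenticated request.
        raise RuntimeError(f"Set {env_var} before calling the API")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Keeping the lookup in one function means the key never appears in source code, and a missing key is reported before any network call is made.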

Examples for RAG Workflows

Step 1: Search for relevant chunks (Retrieval)

curl -X POST https://api.intelligence.io.solutions/api/r2r/v3/retrieval/search \
  -H "Authorization: Bearer $IOINTELLIGENCE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is Retrieval-Augmented Generation?",
    "top_k": 5
}'
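
The same search call can be made from Python using only the standard library. This is a sketch under assumptions: the URL, the query and top_k fields, and the headers are taken from the curl example, while the helper names `build_search_request` and `search` are hypothetical. The response is returned as parsed JSON without assuming its shape.

```python
import json
import os
import urllib.request

SEARCH_URL = "https://api.intelligence.io.solutions/api/r2r/v3/retrieval/search"

def build_search_request(query, top_k=5, api_key=None):
    """Prepare (but do not send) the POST request for the search endpoint."""
    api_key = api_key or os.environ.get("IOINTELLIGENCE_API_KEY", "")
    body = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        SEARCH_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

def search(query, top_k=5):
    """Send the request and return the parsed JSON response."""
    request = build_search_request(query, top_k)
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)
```

Calling `search("What is Retrieval-Augmented Generation?")` performs the network request; building the request separately makes it possible to inspect the URL, headers, and body before sending anything.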

Step 2: Generate a response using RAG

Assuming you've retrieved relevant chunks and want to pass them as context (the chunk text shown here is only a placeholder):

curl -X POST https://api.intelligence.io.solutions/api/r2r/v3/rag/generate \
  -H "Authorization: Bearer $IOINTELLIGENCE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt_name": "default_rag",
    "inputs": {
      "query": "What is Retrieval-Augmented Generation?",
      "context": "Chunk 1 text\nChunk 2 text\nChunk 3 text"
    }
}'
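
Building the request body in code avoids hand-escaping the context string. The sketch below assumes the retrieved chunk texts are already available as a list of strings; the helper name `build_generate_payload` is hypothetical, and the field names follow the curl example. JSON does not allow comments, so any notes about the data belong in the surrounding code and not in the payload.

```python
import json

def build_generate_payload(query, chunk_texts, prompt_name="default_rag"):
    """Join chunk texts with newlines and wrap them in the request body."""
    context = "\n".join(chunk_texts)
    return {
        "prompt_name": prompt_name,
        "inputs": {"query": query, "context": context},
    }

payload = build_generate_payload(
    "What is Retrieval-Augmented Generation?",
    ["Chunk 1 text", "Chunk 2 text", "Chunk 3 text"],  # placeholder chunk text
)
# json.dumps escapes the newlines, so the serialized body is valid JSON.
body = json.dumps(payload)
print(body)
```

The serialized `body` can then be sent as the POST data to the generate endpoint with the same authorization headers used for search.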

Token Quotas & Usage

Each account has daily usage limits that depend on the model and request volume. See the IO Intelligence API Quotas page for up-to-date limits.

Next Steps

Explore the API reference for detailed guides:

Retrieval – perform semantic and hybrid search across ingested data
Documents – management and metadata
Graphs – entity extraction and knowledge graphs
Indices – create and configure embeddings
Chunks – ingest, list, search
Users – manage API users, authentication, and access control
Collections – group related documents and control indexing scope
Conversations – manage chat sessions, history, and context retention
Prompts – template definition and versioning
System – health and diagnostics
