
What Is RAG on R2R?

Retrieval and Generation Workflow:

RAG operates in two phases:
  1. Retrieve relevant document chunks via semantic search or knowledge graph lookup.
  2. Generate coherent and contextually accurate responses using the retrieved chunks and your customized prompts.
  • Why It Matters: By drawing on real structured or unstructured content, RAG systems ground their answers in source material, which reduces hallucinations and improves trustworthiness.
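
The two phases can be illustrated with a toy sketch. It is not how R2R works internally: word counts stand in for a real embedding model, and the generation step is reduced to assembling the prompt that a language model would receive. All function names here are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Phase 1: rank chunks by similarity to the query and keep the best top_k."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:top_k]


def build_prompt(query: str, context_chunks: list[str]) -> str:
    """Phase 2 input: combine the retrieved chunks and the query into one prompt."""
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


chunks = [
    "RAG retrieves document chunks before generating an answer.",
    "Vector indices enable fast similarity search.",
    "The weather today is sunny.",
]
question = "How does RAG generate an answer?"
top = retrieve(question, chunks, top_k=1)
prompt = build_prompt(question, top)
```

Here the first chunk is the only one sharing words with the question, so it is the one that ends up in the prompt.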

Core Components

Component           Description
Documents & Chunks  Ingested files or text are segmented into chunks, the basis for retrieval.
Indices             Vector indices enable fast similarity search over chunk embeddings.
Graphs              Knowledge graphs extract entities and relationships, enabling navigation between related concepts.
Prompts             Prompt templates shape the generation step, with type-safe inputs and version control.
System Endpoints    Provide health checks, diagnostics, and monitoring for your RAG pipeline.

Getting Started

To get started with the R2R APIs, you will need to:
  • Install R2R in your environment.
  • Run the server with python -m r2r.serve, or configure FastAPI settings for production use.
For detailed installation and setup instructions, refer to the R2R Installation Guide.

Authentication

API keys

IO Intelligence APIs authenticate requests using API keys. You can generate API keys from your account.
Always treat your API key as a secret. Do not share it or expose it in client-side code (e.g., browsers or mobile apps). Instead, store it securely in an environment variable or a key management service on your backend server.
Include the API key in an Authorization HTTP header for all API requests:
Authorization: Bearer $IOINTELLIGENCE_API_KEY
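
As a minimal sketch of the advice above, a backend service could read the key from the environment and build the headers once. The helper name auth_headers is hypothetical; only the environment variable name and the header format come from this page.

```python
import os


def auth_headers(env_var: str = "IOINTELLIGENCE_API_KEY") -> dict[str, str]:
    """Build request headers from an API key stored in an environment variable."""
    api_key = os.environ.get(env_var)
    if not api_key:
        # Fail early with a clear message instead of sending an unauthenticated request.
        raise RuntimeError(f"Set {env_var} before making API requests")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Reading the key at call time keeps it out of source control and makes a missing key an explicit error.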

Example of a RAG Workflow

Step 1: Retrieve relevant chunks

curl -X POST https://api.intelligence.io.solutions/api/r2r/v3/retrieval/search \
  -H "Authorization: Bearer $IOINTELLIGENCE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is Retrieval-Augmented Generation?",
    "top_k": 5
  }'
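
The same search request can be sketched in Python using only the standard library. The function name build_search_request and the placeholder key are hypothetical; the URL, headers, and body fields mirror the curl command. The request is built but not sent here, and the shape of the response is not assumed.

```python
import json
import urllib.request

SEARCH_URL = "https://api.intelligence.io.solutions/api/r2r/v3/retrieval/search"


def build_search_request(query: str, top_k: int, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the search endpoint."""
    body = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        SEARCH_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "placeholder-key" is a stand-in; load the real key from an environment variable.
request = build_search_request("What is Retrieval-Augmented Generation?", 5, "placeholder-key")
# To send it: urllib.request.urlopen(request), then parse the JSON response.
```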

Step 2: Generate a response

Assuming you have retrieved relevant chunks and want to pass them as context (the chunk text below is a placeholder):
curl -X POST https://api.intelligence.io.solutions/api/r2r/v3/rag/generate \
  -H "Authorization: Bearer $IOINTELLIGENCE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt_name": "default_rag",
    "inputs": {
      "query": "What is Retrieval-Augmented Generation?",
      "context": "Chunk 1 text\nChunk 2 text\nChunk 3 text"
    }
  }'
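
When the chunks come from code rather than being typed by hand, it is safer to let a JSON library join and escape them. The sketch below assumes the retrieved chunks are already available as a list of strings; the function name build_generate_payload is hypothetical, and the payload fields mirror the curl example.

```python
import json


def build_generate_payload(
    query: str, chunks: list[str], prompt_name: str = "default_rag"
) -> str:
    """Join chunk texts with newlines and serialize the request body as JSON."""
    # Drop empty chunks and surrounding whitespace before joining.
    context = "\n".join(chunk.strip() for chunk in chunks if chunk.strip())
    payload = {
        "prompt_name": prompt_name,
        "inputs": {"query": query, "context": context},
    }
    # json.dumps escapes the newlines, so the body stays valid JSON.
    return json.dumps(payload)


body = build_generate_payload(
    "What is Retrieval-Augmented Generation?",
    ["Chunk 1 text", "Chunk 2 text", "Chunk 3 text"],
)
```

The resulting string can be passed as the request body in place of the hand-written JSON.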

Token Quotas & Usage

Each account has daily usage limits based on model and request volume. Refer to the IO Intelligence Payments page for further information.

Next Steps

Explore the following API references for more detailed guides:
  • Retrieval – Perform semantic and hybrid search across ingested data
  • Documents – Manage documents and their metadata.
  • Graphs – Entity extraction and knowledge graphs.
  • Indices – Create and configure embeddings.
  • Chunks – Ingest, list and search documents.
  • Users – Manage API users, authentication, and access control.
  • Collections – Group related documents and control indexing scope.
  • Conversations – Manage chat sessions, history, and context retention.
  • Prompts – Template definition and versioning.
  • System – Health and diagnostics.