A Document in R2R is the system’s digital representation of any ingested content—including PDFs, text files, web pages, images, and audio. It serves as the central container for all downstream data objects such as Chunks, Entities, and Relationships, forming the foundation for R2R’s knowledge processing pipeline. Documents transform raw content into structured, searchable, and analyzable knowledge that powers Retrieval-Augmented Generation (RAG) and agentic workflows.Documentation Index
Fetch the complete documentation index at: https://io.net/docs/llms.txt
Use this file to discover all available pages before exploring further.
Key Processes
Documents in R2R support several key stages of processing:- Ingestion — Accepts multiple input formats (
.pdf,.docx,.txt,.png,.mp3, etc.) via file upload, raw text, or predefined chunks. - Chunking — Splits document content into smaller, retrievable Chunks for semantic search and analysis.
- Metadata & Collections — Associates documents with descriptive metadata (e.g., title, source) and organizes them into Collections for access control and sharing.
- Enrichment (Optional) — Extracts Entities and Relationships to build knowledge graphs or generates embeddings for semantic search.
- Status Tracking — Monitors ingestion, enrichment, and extraction progress for transparency and error handling.
API Endpoints
| Method | Endpoint | Description |
|---|---|---|
POST | /documents | Ingest new information (file, text, or chunks) as a document. |
GET | /documents | List existing documents with pagination and filtering. |
GET | /documents/ | Retrieve metadata, ingestion status, or details for a specific document. |
GET | /documents//download | Download the original source file of a document. |
GET | /documents//chunks | List the text Chunks generated from a document’s content. |
PATCH | /documents//metadata | Add or update metadata for a document. |
PUT | /documents//metadata | Replace all metadata for a document. |
DELETE | /documents/ | Delete a document and its associated data. |
DELETE | /documents/by-filter | Delete multiple documents that match a filter. |
POST | /documents/search | Search across generated document summaries. |
GET | /documents/download_zip | Download multiple original document files as a zip archive. |
POST | /documents//extract | Start entity and relationship extraction for a document. |
GET | /documents//entities | List Entities identified within a document. |
GET | /documents//relationship | List Relationships identified within a document. |
POST | /documents//deduplicate | Start entity deduplication for a document. |