Key Processes
Documents in R2R support several key stages of processing:
- Ingestion: Accepts multiple input formats (.pdf, .docx, .txt, .png, .mp3, etc.) via file upload, raw text, or predefined chunks.
- Chunking: Splits document content into smaller, retrievable Chunks for semantic search and analysis.
- Metadata & Collections: Associates documents with descriptive metadata (e.g., title, source) and organizes them into Collections for access control and sharing.
- Enrichment (Optional): Extracts Entities and Relationships to build knowledge graphs or generates embeddings for semantic search.
- Status Tracking: Monitors ingestion, enrichment, and extraction progress for transparency and error handling.
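To make the chunking stage concrete, here is a minimal sketch of a fixed-size splitter with overlap. It illustrates the general idea only and is not R2R's actual chunking strategy; the function name `chunk_text` and its `size` and `overlap` parameters are hypothetical.

```python
def chunk_text(text: str, size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into windows of `size` characters that share `overlap` characters.

    Illustrative only: a real ingestion pipeline may split on sentences,
    tokens, or document structure instead of raw character counts.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")

    step = size - overlap  # how far the window advances each iteration
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the window already reached the end of the text
        start += step
    return chunks


example_chunks = chunk_text("abcdefghij", size=4, overlap=1)
print(example_chunks)
```

The overlap keeps a little shared context between neighbouring chunks, which can help retrieval when a relevant passage straddles a chunk boundary.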
API Endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /documents | Ingest new information (file, text, or chunks) as a document. |
| GET | /documents | List existing documents with pagination and filtering. |
| GET | /documents/{id} | Retrieve metadata, ingestion status, or details for a specific document. |
| GET | /documents/{id}/download | Download the original source file of a document. |
| GET | /documents/{id}/chunks | List the text Chunks generated from a document’s content. |
| PATCH | /documents/{id}/metadata | Add or update metadata for a document. |
| PUT | /documents/{id}/metadata | Replace all metadata for a document. |
| DELETE | /documents/{id} | Delete a document and its associated data. |
| DELETE | /documents/by-filter | Delete multiple documents that match a filter. |
| POST | /documents/search | Search across generated document summaries. |
| GET | /documents/download_zip | Download multiple original document files as a zip archive. |
| POST | /documents/{id}/extract | Start entity and relationship extraction for a document. |
| GET | /documents/{id}/entities | List Entities identified within a document. |
| GET | /documents/{id}/relationships | List Relationships identified within a document. |
| POST | /documents/{id}/deduplicate | Start entity deduplication for a document. |
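As a rough illustration of how these endpoints might be called, the sketch below builds (but does not send) HTTP requests using only the Python standard library. Several details are assumptions rather than facts from this page: the base URL `http://localhost:7272/v3`, the `{id}` placeholder standing for the document identifier in the path, the `offset` and `limit` query parameter names, the bearer-token header, and the JSON body passed to the metadata endpoint. The helper `build_request` is hypothetical and is not part of any R2R SDK.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode

# Assumption: a locally running server; change this to match your deployment.
BASE_URL = "http://localhost:7272/v3"


def build_request(method, path, *, doc_id=None, params=None, body=None, token=None):
    """Return an unsent urllib Request for one of the document endpoints."""
    if "{id}" in path:
        if doc_id is None:
            raise ValueError("this path needs a document id")
        # Percent-encode the id so it is safe to place in a URL path segment.
        path = path.replace("{id}", quote(doc_id, safe=""))

    url = BASE_URL + path
    if params:
        url += "?" + urlencode(params)

    headers = {"Accept": "application/json"}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if token:
        # Assumption: the server accepts bearer-token authentication.
        headers["Authorization"] = f"Bearer {token}"

    return urllib.request.Request(url, data=data, headers=headers, method=method)


# List documents with pagination (parameter names are assumed).
list_req = build_request("GET", "/documents", params={"offset": 0, "limit": 10})

# List the chunks of one document ("my-document-id" is a placeholder value).
chunks_req = build_request("GET", "/documents/{id}/chunks", doc_id="my-document-id")

# Add or update metadata; the body shape here is hypothetical.
patch_req = build_request(
    "PATCH",
    "/documents/{id}/metadata",
    doc_id="my-document-id",
    body={"title": "Quarterly report"},
)

print(list_req.get_method(), list_req.full_url)
print(chunks_req.get_method(), chunks_req.full_url)
print(patch_req.get_method(), patch_req.full_url)
```

To actually send a request, pass it to `urllib.request.urlopen(...)` against a running server. Ingestion via `POST /documents` is left out of the sketch because file uploads typically need multipart form encoding, which depends on the server's expected fields.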