Supported Models for Image Processing
| Model Name | Capabilities |
|---|---|
| meta-llama/Llama-3.2-90B-Vision-Instruct | Multi-modal vision model supporting image understanding. |
| Qwen/Qwen2-VL-7B-Instruct | Supports both text and image-based inputs for AI interactions. |
Sending an Image via API Request
The API supports two methods for sending an image:
- Passing an Image URL (recommended for publicly hosted images)
- Sending a Base64 Encoded Image (for local images)
The image URL must be publicly accessible. Private or authentication-required URLs will not work.
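For the image-URL method, the request body might look like the following minimal sketch. It assumes an OpenAI-compatible chat-completions schema; the endpoint placeholder and helper name are hypothetical, while the model name comes from the table above.

```python
import json

# Hypothetical endpoint placeholder; substitute your actual base URL.
API_URL = "https://YOUR_IONET_ENDPOINT/v1/chat/completions"

def build_image_url_payload(image_url: str, prompt: str) -> dict:
    """Build a chat-completions payload with one text part and one image part."""
    return {
        # Model name taken from the supported-models table above.
        "model": "meta-llama/Llama-3.2-90B-Vision-Instruct",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

payload = build_image_url_payload(
    "https://example.com/cat.png", "What is in this image?"
)
print(json.dumps(payload, indent=2))
```

POST this body with your usual HTTP client and an `Authorization: Bearer <API_KEY>` header; remember that the URL must be publicly reachable.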
Image Input Requirements
To ensure successful processing, images must meet the following requirements:

| Requirement | Details |
|---|---|
| Format | JPEG, PNG, WEBP, or GIF (static) |
| Max File Size | 20MB |
| Resolution | At least 512×512 pixels (recommended) |
| Max Dimensions | 4096×4096 pixels |
| Accessibility | If using a URL, ensure it is publicly accessible |
| Multi-Image Support | Up to 10 images per request |
Best Practices
- Optimize File Size: The limit is 20MB, but smaller files (1-5MB) process faster.
- Use Clear Images: Avoid blurry or low-resolution images for better AI analysis.
- Ensure Public URLs: If passing a URL, test it in a browser to confirm that it is accessible.
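The format and file-size requirements above can be pre-checked locally before sending a request. The sketch below uses only the standard library; the helper name is hypothetical, and dimension checks (512×512 minimum, 4096×4096 maximum) are omitted because they require an imaging library such as Pillow.

```python
import os

# Accepted formats and 20MB limit taken from the requirements table above.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".gif"}
MAX_BYTES = 20 * 1024 * 1024

def precheck_image(path: str) -> list[str]:
    """Return a list of problems found; an empty list means these checks pass."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or 'no extension'}")
    size = os.path.getsize(path)
    if size > MAX_BYTES:
        problems.append(f"file too large: {size} bytes (max {MAX_BYTES})")
    elif size > 5 * 1024 * 1024:
        problems.append("consider shrinking below 5MB for faster processing")
    return problems

# Demo with tiny placeholder files (not real images).
with open("sample.png", "wb") as f:
    f.write(b"\x89PNG placeholder")
with open("sample.bmp", "wb") as f:
    f.write(b"x")

print(precheck_image("sample.png"))
print(precheck_image("sample.bmp"))
```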
Expected API Response
Upon successful submission, the API returns a structured response with AI-generated insights based on the image.

Common Issues & Troubleshooting
| Issue | Possible Cause | Solution |
|---|---|---|
| "An image? I'm in text format, so I can't see it…" | The model does not support image input. | Use one of the supported vision models listed above. |
| "Invalid image format" | The image was not encoded properly. | Convert the image to base64 before sending. |
| "Unauthorized" | The API key is missing or incorrect. | Check that your API key is valid and correctly formatted. |
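The "convert image to base64" fix for local files can be sketched as follows. The helper name is hypothetical; the resulting data URL goes in the same slot as a public image URL, assuming an OpenAI-compatible request schema.

```python
import base64

def to_data_url(path: str, mime: str = "image/png") -> str:
    """Encode a local image file as a base64 data URL."""
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{encoded}"

# Demo with a tiny placeholder file (not a real image).
with open("pixel.png", "wb") as f:
    f.write(b"\x89PNG")

data_url = to_data_url("pixel.png")
print(data_url)
```

Match the `mime` argument to the actual file type (e.g. `image/jpeg` for JPEGs), and keep the encoded payload under the 20MB limit noted above.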