- Basic: Explore and learn with free, light daily access.
- Professional: Create and collaborate with higher daily usage.
- Developer: Build and deploy continuously with frequent refreshes.
- Pay-As-You-Go: Scale instantly, paying only for what you use. This plan is enabled by default for all users.
Plan Overview
Each plan includes a fixed daily or hourly allowance that refreshes automatically, so you can focus on your work instead of tracking tokens or costs.
| Plan | Usage | Refresh cycle | Ideal for |
|---|---|---|---|
| Basic | Light daily access. | Once every 24 hours. | Getting started, casual use. |
| Professional | Approximately 5× more daily usage than Basic. | Once every 24 hours. | Creators and power users. |
| Developer | Approximately 10× more daily usage than Professional. | Every 8 hours (3× per day). | Builders, teams, and automations. |
| Pay-As-You-Go (PAYG) | Continuous access. | No refresh; pay only for what you use. | Teams exceeding plan limits or needing flexible scaling. |
How Usage Works
- Basic and Professional plans refresh once every 24 hours for predictable, worry-free access.
- Developer plans refresh every 8 hours, designed for continuous work or API usage.
- If you hit your allowance, your access will pause until the next refresh, unless you have IO Credits.
- IO Credits (Pay-As-You-Go) allow instant continuation beyond limits, charging per request through your connected payment method.
More on Pay-As-You-Go
- Billed directly to your IO Credits balance.
- Includes the same tools and models as subscription plans.
- The Developer plan offers roughly a 10% discount compared to PAYG for consistent high-volume users.
- PAYG billing stops once your plan allowance refreshes.
- Enables precise accounting of model-level usage and costs.
For reference, see the Model Rate Overview below for the latest per-model pricing structure.
Upgrade Path
- Basic: Explore the basics with a daily refresh and light usage.
- Professional: Unlock all models and creative workflows.
- Developer: Get API access, hourly refreshes, and better cost efficiency.
- Pay-As-You-Go: Scale instantly when you exceed your plan limits.
Model Rate Overview
Choose the model that best fits your workflow, from lightweight reasoning to complex multimodal generation. Each model consumes credits based on its scale, capability, and computational cost. The table below shows the current per-model usage rates available across all IO Intelligence plans. These rates apply to both Unified Chat and API requests and reflect approximate costs. All prices are subject to change as model performance, infrastructure efficiency, and market rates evolve.
| Model | Input ($) | Output ($) | Note |
|---|---|---|---|
| moonshotai/Kimi-K2-Thinking | $0.55 | $2.25 | Per 1M tokens. |
| zai-org/GLM-4.6 | $0.40 | $1.75 | Per 1M tokens. |
| moonshotai/Kimi-K2-Instruct-0905 | $0.39 | $1.90 | Per 1M tokens. |
| deepseek-ai/DeepSeek-R1-0528 | $0.40 | $1.75 | Per 1M tokens. |
| meta-llama/Llama-3.3-70B-Instruct | $0.13 | $0.38 | Per 1M tokens. |
| mistralai/Magistral-Small-2506 | $0.50 | $1.50 | Per 1M tokens. |
| mistralai/Devstral-Small-2505 | $0.05 | $0.22 | Per 1M tokens. |
| Qwen/Qwen2.5-VL-32B-Instruct | $0.05 | $0.22 | Per 1M tokens. |
| meta-llama/Llama-3.2-90B-Vision-Instruct | $0.35 | $0.40 | Per 1M tokens. |
| openai/gpt-oss-120b | $0.04 | $0.40 | Per 1M tokens. |
| Qwen/Qwen3-235B-A22B-Thinking-2507 | $0.11 | $0.60 | Per 1M tokens. |
| openai/gpt-oss-20b | $0.03 | $0.14 | Per 1M tokens. |
| mistralai/Mistral-Nemo-Instruct-2407 | $0.02 | $0.04 | Per 1M tokens. |
| Qwen/Qwen3-Next-80B-A3B-Instruct | $0.10 | $0.80 | Per 1M tokens. |
| meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | $0.15 | $0.60 | Per 1M tokens. |
| mistralai/Mistral-Large-Instruct-2411 | $2.00 | $6.00 | Per 1M tokens. |
| Intel/Qwen3-Coder-480B-A35B-Instruct-int4-mixed-ar | $0.22 | $0.95 | Per 1M tokens. |
| BAAI/bge-multilingual-gemma2 | $0.01 | — | Input-based cost only; per 1M tokens. |
| speaches-ai/piper-en_US-lessac-medium | $0.62 | — | Input-based cost; per 1M tokens. |
| speaches-ai/piper-en_US-amy-low | $0.62 | — | Input-based cost; per 1M tokens. |
| speaches-ai/Kokoro-82M-v1.0-ONNX-fp16 | $0.62 | — | Input-based cost; per 1M tokens. |
| speaches-ai/piper-en_US-ryan-high | $0.62 | — | Input-based cost; per 1M tokens. |
| deepdml/faster-whisper-large-v3-turbo-ct2 | $0.04 | — | Input-based cost; per 1 hour. |
| Systran/faster-whisper-large-v3 | $0.04 | — | Input-based cost; per 1 hour. |
| guillaumekln/faster-whisper-small.en | $0.04 | — | Input-based cost; per 1 hour. |
| guillaumekln/faster-whisper-medium.en | $0.04 | — | Input-based cost; per 1 hour. |
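To make the arithmetic concrete, here is a minimal sketch of how the cost of a single request can be estimated from the rates above. The meta-llama/Llama-3.3-70B-Instruct rates are taken from the table; the token counts and the estimate_cost helper are purely illustrative.
```python
# Illustrative cost estimate for one request, using the per-1M-token rates
# listed above for meta-llama/Llama-3.3-70B-Instruct ($0.13 in / $0.38 out).
# Token counts are hypothetical; actual metering and rounding are set by the platform.

INPUT_RATE_PER_M = 0.13   # $ per 1M input tokens
OUTPUT_RATE_PER_M = 0.38  # $ per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the approximate USD cost of a single request."""
    return (input_tokens / 1_000_000) * INPUT_RATE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_RATE_PER_M

# Example: a 12,000-token prompt that produces a 1,500-token completion
print(f"${estimate_cost(12_000, 1_500):.6f}")  # ≈ $0.002130
```
The same calculation applies to any model in the table; only the two rate constants change.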
To verify the latest model pricing, use the GET /models API endpoint. The response includes detailed pricing information for each available model. The fields "input_token_price" and "output_token_price" represent the respective costs per token for input and output usage. For implementation details and the full endpoint specification, refer to: GET /models API Documentation.
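As an illustration, a minimal sketch of calling the endpoint and reading those two fields might look like the following; the base URL, the Bearer-token header, and the "data" wrapper in the response are assumptions here, so rely on the GET /models API Documentation for the authoritative request format.
```python
# Minimal sketch of fetching current per-model pricing via GET /models.
# The base URL, auth header, and response layout below are assumptions;
# consult the GET /models API Documentation for the exact specification.
import os
import requests

BASE_URL = "https://api.intelligence.io.solutions/api/v1"  # assumed base URL

resp = requests.get(
    f"{BASE_URL}/models",
    headers={"Authorization": f"Bearer {os.environ['IO_API_KEY']}"},
    timeout=30,
)
resp.raise_for_status()

for model in resp.json().get("data", []):
    # "input_token_price" and "output_token_price" are the pricing fields
    # described above; print them alongside the model identifier.
    print(model.get("id"), model.get("input_token_price"), model.get("output_token_price"))
```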
FAQs
What do I do when I hit my daily limit?
You can buy IO Credits or wait for your daily limit to refresh.
If you already have credits, they will automatically cover additional usage with no interruptions.
How does the limit work across different models?
IO Intelligence uses a shared credit pool system.
Credits can be spent on any model, with each consuming credits at a different rate depending on its complexity.
Does my daily limit include both Chat and API calls?
Yes. Chat interactions count toward your API quota and contribute to your daily limit.