IO Intelligence supports models that can “think” before answering, exposing their step-by-step reasoning alongside the final response. You control this behavior with a single, OpenAI-compatible reasoning field on every chat completion request.
Enable or disable reasoning
Add a `reasoning` object to your `/v1/chat/completions` payload:
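For example, a request body enabling medium-effort reasoning might look like this (model name illustrative):

```json
{
  "model": "deepseek-ai/DeepSeek-V3.2",
  "messages": [{ "role": "user", "content": "Why is the sky blue?" }],
  "reasoning": { "effort": "medium" }
}
```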
| `effort` value | Behavior |
|---|---|
| `"none"` | Disable reasoning |
| `"low"` | Enable reasoning, lighter thinking |
| `"medium"` | Enable reasoning, balanced thinking |
| `"high"` | Enable reasoning, deeper thinking |
| (field omitted) | Use the model’s default behavior |
Quickstart
Enable reasoning (curl)
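A sketch, assuming a placeholder base URL (`https://api.example.com`) and an `IO_API_KEY` environment variable; substitute your real IO Intelligence endpoint and key:

```shell
curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer $IO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/DeepSeek-V3.2",
    "messages": [{"role": "user", "content": "Why is the sky blue?"}],
    "reasoning": {"effort": "high"}
  }'
```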
Disable reasoning (curl)
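The same sketch with reasoning switched off; only the `reasoning` object changes (placeholder base URL and `IO_API_KEY` as above):

```shell
curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer $IO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/DeepSeek-V3.2",
    "messages": [{"role": "user", "content": "Why is the sky blue?"}],
    "reasoning": {"effort": "none"}
  }'
```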
Python (OpenAI SDK)
Node.js (OpenAI SDK)
Reading the response
When reasoning is enabled, the model’s chain of thought is returned in `choices[0].message.reasoning_content`, and reasoning tokens are reported in `usage.completion_tokens_details.reasoning_tokens`:
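An abridged, illustrative response shape, with a sketch of pulling out both fields (all values made up):

```python
# Abridged response body; only the fields discussed here are shown.
response = {
    "choices": [{
        "message": {
            "content": "Rayleigh scattering favors shorter wavelengths.",
            "reasoning_content": "The user asks why the sky is blue...",
        }
    }],
    "usage": {
        "completion_tokens": 180,
        "completion_tokens_details": {"reasoning_tokens": 120},
    },
}

message = response["choices"][0]["message"]
answer = message["content"]
thinking = message.get("reasoning_content")  # None/absent when disabled
spent = response["usage"]["completion_tokens_details"]["reasoning_tokens"]
print(spent)  # 120
```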
"effort": "none"):
reasoning_contentisnullor omittedreasoning_tokensis0(or near-zero for models that always think internally)- The visible answer is in
contentonly
Streaming
Reasoning works with `"stream": true` as well. While the model is thinking, you’ll receive delta events with a `reasoning_content` field; once thinking finishes, regular `content` deltas follow.
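A sketch of separating the two delta types, using simulated stream events whose shape mirrors the description above (values illustrative):

```python
# Simulated stream: reasoning deltas arrive first, then content deltas.
chunks = [
    {"choices": [{"delta": {"reasoning_content": "Consider Rayleigh "}}]},
    {"choices": [{"delta": {"reasoning_content": "scattering..."}}]},
    {"choices": [{"delta": {"content": "The sky is blue because "}}]},
    {"choices": [{"delta": {"content": "of Rayleigh scattering."}}]},
]

thinking, answer = [], []
for chunk in chunks:
    delta = chunk["choices"][0]["delta"]
    if delta.get("reasoning_content"):
        thinking.append(delta["reasoning_content"])
    if delta.get("content"):
        answer.append(delta["content"])

print("".join(answer))  # The sky is blue because of Rayleigh scattering.
```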
Backwards compatibility
- Requests without the `reasoning` field continue to behave exactly as before.
- The legacy top-level `reasoning_effort: "low" | "medium" | "high"` string is still accepted.
You can safely roll out the new field at your own pace.
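Both spellings request the same behavior; a sketch with illustrative payloads showing that migrating is just moving the string into the object:

```python
legacy = {
    "model": "moonshotai/Kimi-K2.6",
    "messages": [{"role": "user", "content": "Hi"}],
    "reasoning_effort": "low",       # legacy top-level string
}

current = {
    "model": "moonshotai/Kimi-K2.6",
    "messages": [{"role": "user", "content": "Hi"}],
    "reasoning": {"effort": "low"},  # new object form
}

migrated = {k: v for k, v in legacy.items() if k != "reasoning_effort"}
migrated["reasoning"] = {"effort": legacy["reasoning_effort"]}
print(migrated == current)  # True
```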
Errors
| Status | Meaning |
|---|---|
| 400 | A small number of always-reasoning models reject effort: "none". Use a different effort or omit the field. |
| 402 | The selected model requires a higher IO Intelligence tier. |
| 429 | You’ve been rate-limited. Back off and retry. |
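For 429s, a minimal backoff sketch; `send` stands in for any hypothetical transport callable that returns `(status, body)`:

```python
import time

def call_with_backoff(send, max_retries=3, base_delay=0.5):
    """Retry `send` on HTTP 429 with exponential backoff."""
    delay = base_delay
    for attempt in range(max_retries + 1):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_retries:
            time.sleep(delay)
            delay *= 2
    return status, body

# Simulated transport: rate-limited once, then succeeds.
responses = iter([(429, None), (200, {"ok": True})])
status, body = call_with_backoff(lambda: next(responses), base_delay=0.01)
print(status)  # 200
```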
Models Enabled
Reasoning can be enabled or disabled for the following models:

| Model | Enable | Disable | Notes |
|---|---|---|---|
| moonshotai/Kimi-K2.6 | ✓ | ✓ | Clean toggle |
| openai/gpt-oss-120b | ✓ | ✓ | Always reasons internally; `effort: "none"` suppresses `reasoning_content` only |
| deepseek-ai/DeepSeek-V4-Pro | ✓ | ✓ | Clean toggle |
| deepseek-ai/DeepSeek-V4-Flash | ✓ | ✓ | |
| deepseek-ai/DeepSeek-V3.2 | ✓ | ✓ | Doesn’t always engage reasoning for trivial prompts |
| MiniMaxAI/MiniMax-M2.7 | ✓ | N/A | Always-on reasoning by design; `effort: "none"` is a no-op for this model |
| google/gemma-4-26b-a4b-it | — | — | Reasoning routing in progress; check release notes before relying on it |
FAQs
Do I need different code for different models?

No. Use `reasoning.effort` everywhere; IO Intelligence handles the rest.

What does each effort level cost me in tokens?

Reasoning tokens are billed as completion tokens. `none` skips them entirely; `low`/`medium`/`high` each produce progressively more reasoning tokens. Exact budgets vary by model.
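As an illustration of the billing math (numbers made up), reasoning tokens are counted inside `completion_tokens`:

```python
usage = {
    "completion_tokens": 180,  # visible answer + reasoning combined
    "completion_tokens_details": {"reasoning_tokens": 120},
}

reasoning = usage["completion_tokens_details"]["reasoning_tokens"]
visible = usage["completion_tokens"] - reasoning
print(visible)  # 60 tokens in the visible answer
```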
Can I hide the chain of thought from end users while still using it internally?

Yes. Just don’t render `reasoning_content` in your UI; it’s returned as a separate field so you can choose whether to show it.

Does this work with tool calls / function calling?

Yes. Reasoning happens before tool selection; the chain of thought explains why the model chose a particular tool call.