Quick Answer
The NVIDIA H100 GPU costs $1.49-$2.20 per GPU-hour on io.net, depending on whether you choose the PCIe ($1.49/hr) or SXM ($2.20/hr) variant. That is roughly 68-72% less than AWS ($4.99-$6.98/hr) and Azure ($5.40-$7.20/hr), and 54-56% less than CoreWeave ($3.39-$4.76/hr). The H100 is NVIDIA's flagship data center GPU, offering up to 3x faster training than the A100 and up to 9x faster inference through its Transformer Engine. For large-scale LLM training, an 8-GPU H100 SXM cluster on io.net costs $17.60/hr vs. $55.84/hr on AWS, a savings of $38.24/hr or $27,533/month when run 24/7.
H100 Pricing Across All Major Providers
Here's what you'll pay to rent NVIDIA H100 GPUs across the cloud GPU market:
H100 SXM5 (700W TDP, NVLink, Multi-GPU Training)
| Provider | Price/GPU-Hour | Availability | Instance Type | Monthly Cost per GPU (24/7) |
|---|---|---|---|---|
| io.net | $2.20 | Good | On-demand | $1,584 |
| AWS | $6.98 | Limited | p5.48xlarge (8x H100) | $5,026 |
| Azure | $7.20 | Limited | ND H100 v5 | $5,184 |
| CoreWeave | $4.76 | Good | HGX H100 | $3,427 |
| Lambda Labs | N/A | Sold out | — | — |
| GCP | $6.85 | Preview | a3-highgpu-8g | $4,932 |
io.net saves you 68-70% vs. hyperscalers, 54% vs. CoreWeave
H100 PCIe (350W TDP, Single GPU Inference)
| Provider | Price/GPU-Hour | Availability | Instance Type | Monthly Cost per GPU (24/7) |
|---|---|---|---|---|
| io.net | $1.49 | Good | On-demand | $1,073 |
| AWS | $4.99 | Limited | p5.2xlarge (single GPU) | $3,593 |
| Azure | $5.40 | Limited | ND H100 PCIe | $3,888 |
| CoreWeave | $3.39 | Good | H100 PCIe | $2,441 |
| Lambda Labs | N/A | Sold out | — | — |
io.net saves you 70-72% vs. hyperscalers, 56% vs. CoreWeave
Multi-GPU H100 Cluster Pricing (8x H100 SXM)
| Provider | Price/Hour (8 GPUs) | Monthly Cost (24/7) | Annual Cost | Premium over io.net |
|---|---|---|---|---|
| io.net | $17.60 | $12,672 | $152,064 | Baseline |
| AWS | $55.84 | $40,205 | $482,458 | +217% |
| Azure | $57.60 | $41,472 | $497,664 | +227% |
| CoreWeave | $38.08 | $27,418 | $329,011 | +116% |
For teams training Llama 3 70B or GPT-class models on an 8-GPU cluster around the clock, io.net saves $27,533/month vs. AWS
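The cluster figures above follow from the per-GPU hourly rates. Here is a minimal sketch of that arithmetic, assuming a 720-hour (30-day) month as the tables do; the function names are illustrative only.

```python
HOURS_PER_MONTH = 24 * 30  # the 24/7 month used in the tables: 720 hours

# Per-GPU hourly rates for H100 SXM, copied from the pricing table above.
RATES = {"io.net": 2.20, "AWS": 6.98, "Azure": 7.20, "CoreWeave": 4.76}


def cluster_costs(rate_per_gpu_hour, gpus=8):
    """Return (hourly, monthly, annual) cost for a cluster running 24/7."""
    hourly = rate_per_gpu_hour * gpus
    monthly = hourly * HOURS_PER_MONTH
    return hourly, monthly, monthly * 12


def premium_over(rate, baseline_rate):
    """Fractional premium of `rate` over `baseline_rate` (2.17 means +217%)."""
    return rate / baseline_rate - 1


if __name__ == "__main__":
    for provider, rate in RATES.items():
        hourly, monthly, annual = cluster_costs(rate)
        premium = premium_over(rate, RATES["io.net"])
        print(f"{provider:10s} ${hourly:6.2f}/hr  ${monthly:9,.0f}/mo  "
              f"${annual:10,.0f}/yr  {premium:+.0%}")
```

Swapping in the PCIe rates or a different GPU count reuses the same two functions.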
H100 SXM vs PCIe: Which Should You Choose?
NVIDIA offers two H100 variants with different use cases and pricing:
H100 SXM5 - $2.20/hr on io.net
Best for:
- Large-scale LLM training (70B+ parameter models)
- Multi-GPU distributed training
- Maximum training throughput
- Research requiring fastest iteration
Specifications:
- 700W TDP (higher power = faster performance)
- 80GB HBM3 memory @ 3.35 TB/s bandwidth
- NVLink 4.0: 900 GB/s GPU-to-GPU
- Designed for 8-GPU HGX baseboard configurations
- 67 TFLOPS FP64 Tensor Core, ~1,979 TFLOPS FP8 dense (~3,958 TFLOPS with sparsity)
Performance:
- 3x faster training vs. A100
- Llama 3 70B full fine-tuning: 48 hours (vs. 144 hours on A100)
- Stable Diffusion XL training: 6 hours (vs. 18 hours on A100)
When to use: Multi-GPU training clusters where speed matters more than cost. Training runs that would take weeks on A100.
H100 PCIe - $1.49/hr on io.net
Best for:
- High-throughput LLM inference
- Single-GPU training (<13B params)
- Evaluation and fine-tuning experiments
- Cost-sensitive production inference
Specifications:
- 350W TDP (50% lower power, cooler operation)
- 80GB HBM2e memory @ 2.0 TB/s bandwidth
- PCIe Gen5 x16 interface
- Optimized for single-GPU deployments
- 51 TFLOPS FP64 Tensor Core, ~1,513 TFLOPS FP8 dense (~3,026 TFLOPS with sparsity)
Performance:
- 2.5x faster training vs. A100 PCIe
- Up to 9x faster inference vs. A100 (peak figure with the FP8 Transformer Engine; real workloads vary)
- Llama 3 8B inference: ~150 tokens/sec (vs. 60 tokens/sec on A100), about 2.5x
When to use: Inference APIs, single-GPU experiments, or when budgets are tight. Still faster than A100 at lower cost.
Decision guide:
- Training >40B params or need 8+ GPU clusters? → H100 SXM ($2.20/hr)
- Inference or single-GPU training? → H100 PCIe ($1.49/hr)
- Tight budget but need modern GPU? → A100 80GB ($1.49/hr) or RTX 4090 ($0.18/hr)
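The decision guide can be read as a small set of ordered rules. The sketch below is one possible encoding; the function name, the argument names, and the order in which the rules are checked are assumptions, since the guide does not say which rule wins when several apply.

```python
def recommend_gpu(workload, params_b, num_gpus=1, tight_budget=False):
    """Map the decision guide onto a GPU suggestion.

    workload: "training" or "inference"
    params_b: model size in billions of parameters
    """
    if workload not in ("training", "inference"):
        raise ValueError("workload must be 'training' or 'inference'")
    # Rule 1: training above 40B params, or any 8+ GPU cluster -> SXM.
    if (workload == "training" and params_b > 40) or num_gpus >= 8:
        return "H100 SXM"
    # Rule 3 (checked before rule 2 by assumption): tight budgets
    # fall back to the cheaper cards.
    if tight_budget:
        return "A100 80GB or RTX 4090"
    # Rule 2: inference and single-GPU training -> PCIe.
    return "H100 PCIe"


if __name__ == "__main__":
    print(recommend_gpu("training", params_b=70, num_gpus=8))
    print(recommend_gpu("inference", params_b=8))
    print(recommend_gpu("training", params_b=7, tight_budget=True))
```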
Real-World H100 Cost Scenarios
Scenario 1: Training Llama 3 70B from Scratch
Workload: Full pre-training on 1.5T tokens
- Hardware needed: 8x H100 SXM with NVLink
- Training time: ~720 hours (30 days)
- Optimization: BF16 mixed precision, FSDP
| Provider | Cost |
|---|---|
| io.net | $2.20/hr × 8 GPUs × 720 hrs = $12,672 |
| AWS | $6.98/hr × 8 GPUs × 720 hrs = $40,205 |
| CoreWeave | $4.76/hr × 8 GPUs × 720 hrs = $27,418 |
io.net saves $27,533 vs. AWS or $14,746 vs. CoreWeave
Scenario 2: Fine-Tuning Llama 3 70B (LoRA)
Workload: LoRA fine-tuning on 10K custom examples
- Hardware needed: 4x H100 SXM
- Training time: 12 hours
- Optimization: LoRA rank 64, BF16
| Provider | Cost |
|---|---|
| io.net | $2.20/hr × 4 GPUs × 12 hrs = $105.60 |
| AWS | $6.98/hr × 4 GPUs × 12 hrs = $335.04 |
| CoreWeave | $4.76/hr × 4 GPUs × 12 hrs = $228.48 |
io.net saves $229 vs. AWS or $123 vs. CoreWeave per experiment
Scenario 3: Production LLM Inference API
Workload: Serve 10M requests/day with Llama 3 70B
- Hardware needed: 3x H100 PCIe with vLLM
- Uptime: 24/7
- Optimization: Continuous batching, FP8 quantization
| Provider | Monthly Cost | Cost per 1M tokens |
|---|---|---|
| io.net | $1.49/hr × 3 × 720 hrs = $3,218 | $0.11 |
| AWS | $4.99/hr × 3 × 720 hrs = $10,778 | $0.37 |
| CoreWeave | $3.39/hr × 3 × 720 hrs = $7,322 | $0.25 |
| OpenAI API | — | $0.60 per 1M tokens |
io.net saves $7,560/month vs. AWS and costs 82% less than OpenAI API
Scenario 4: Research Lab - Daily Experimentation
Workload: Run 3-5 training experiments daily
- Hardware needed: 2x H100 PCIe
- Runtime: 6 hours/day average
- Use case: Architecture search, hyperparameter tuning
| Provider | Monthly Cost | Annual Cost |
|---|---|---|
| io.net | $1.49/hr × 2 × 6 hrs × 30 = $536 | $6,432 |
| AWS | $4.99/hr × 2 × 6 hrs × 30 = $1,796 | $21,552 |
| CoreWeave | $3.39/hr × 2 × 6 hrs × 30 = $1,220 | $14,640 |
io.net saves $1,260/month or $15,120/year vs. AWS
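For part-time workloads like this one, the bill depends on hours actually used, not on calendar hours. Here is a minimal sketch of the calculation behind the table, with an illustrative function name and the table's 30-day month assumed.

```python
def monthly_cost(rate_per_gpu_hour, gpus, hours_per_day, days=30):
    """Monthly spend for GPUs that run only part of each day."""
    return rate_per_gpu_hour * gpus * hours_per_day * days


if __name__ == "__main__":
    # H100 PCIe rates from the table above, 2 GPUs, 6 hours/day.
    for provider, rate in {"io.net": 1.49, "AWS": 4.99, "CoreWeave": 3.39}.items():
        print(f"{provider:10s} ${monthly_cost(rate, gpus=2, hours_per_day=6):,.2f}/month")

    # The same two GPUs left running 24/7 would cost four times as much.
    print(f"always-on  ${monthly_cost(1.49, gpus=2, hours_per_day=24):,.2f}/month")
```

The last line shows why shutting instances down between experiments matters: at 6 hours/day the lab pays a quarter of the always-on price.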
Why is io.net's H100 Pricing 70% Lower?
H100s are among the most expensive GPUs to purchase ($25K-$40K each), yet io.net rents them for a fraction of competitor prices. Here's how:
1. Decentralized Supply Eliminates Data Center Costs
Traditional cloud approach:
- Purchase H100s at $30K-$40K each
- Build $500M data center with specialized cooling for 700W GPUs
- Install expensive high-speed networking (InfiniBand, NVLink switches)
- Mark up 300-500% to cover infrastructure TCO
io.net approach:
- Aggregate H100s from enterprises with spare capacity (AI labs, research institutions, crypto miners pivoting to AI)
- No data center construction - providers supply cooling and power
- Marketplace pricing: providers earn more than local rental, users pay less than cloud
- Platform fee: 10-20% vs. 300-500% traditional markup
Result: 70% cost savings passed directly to users
2. High-Utilization Economics
H100s on io.net average 75-85% utilization vs. 40-60% on traditional clouds (enterprises overprovision for peak capacity). Higher utilization means providers can charge less per hour while earning more total revenue.
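Math: A provider earning $1.80/hr at 80% utilization makes $1,037/month ($1.80 × 0.80 × 720 hrs). At 50% utilization, the same provider would need to charge $2.88/hr to earn the same amount. AWS, charging $6.98/hr at 50% utilization, takes in $2,513/month per GPU, with the cost of the idle hours built into the rate users pay.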
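The utilization argument comes down to revenue = price × utilization × hours. The sketch below makes that explicit and adds the break-even price a provider would need at lower utilization; the function names are illustrative and the 720-hour month is the one used throughout this page.

```python
HOURS_PER_MONTH = 720


def monthly_revenue(price_per_hour, utilization):
    """Revenue per GPU per month at a given fraction of hours rented."""
    return price_per_hour * utilization * HOURS_PER_MONTH


def break_even_price(target_revenue, utilization):
    """Hourly price needed to reach target_revenue at a given utilization."""
    return target_revenue / (utilization * HOURS_PER_MONTH)


if __name__ == "__main__":
    provider = monthly_revenue(1.80, 0.80)
    print(f"provider at 80% utilization: ${provider:,.2f}/month")
    print(f"price needed at 50%:         ${break_even_price(provider, 0.50):.2f}/hr")
    print(f"$6.98/hr at 50% utilization: ${monthly_revenue(6.98, 0.50):,.2f}/month")
```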
3. Global Arbitrage
io.net sources H100s globally, optimizing for electricity costs:
- Quebec, Canada: $0.05/kWh (hydro)
- Iceland: $0.06/kWh (geothermal)
- Norway: $0.08/kWh (hydro)
AWS concentrates H100s in us-east-1 ($0.15-$0.20/kWh). A 700W H100 running 24/7 draws about 504 kWh per month (0.7 kW × 720 hrs), so electricity alone costs $75-100/month at AWS's rates vs. $25-40 at the rates listed above.
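The electricity comparison is a single multiplication: power draw × hours × rate. Here is a minimal sketch, assuming the GPU draws its full 700W continuously and ignoring cooling overhead; both are simplifications, and the function name is illustrative.

```python
def monthly_electricity_cost(watts, usd_per_kwh, hours=720):
    """Electricity cost for a device drawing `watts` continuously."""
    kwh = watts / 1000 * hours
    return kwh * usd_per_kwh


if __name__ == "__main__":
    # Rates as quoted in this section, in USD per kWh.
    rates = {
        "Quebec": 0.05,
        "Iceland": 0.06,
        "Norway": 0.08,
        "us-east-1 (low)": 0.15,
        "us-east-1 (high)": 0.20,
    }
    for region, rate in rates.items():
        print(f"{region:17s} ${monthly_electricity_cost(700, rate):6.2f}/month")
```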
4. No Enterprise Overhead
AWS pricing includes:
- Enterprise sales teams and account managers
- Premium support tiers
- Marketing and customer acquisition costs
- Shareholder profit margins
io.net is self-serve with community support, eliminating 40-60% of traditional cloud overhead.
H100 Performance Benchmarks
Here's what you get for your money:
Training Performance (Llama 3 8B, 10K steps)
| GPU | Time | Cost (io.net) | Cost (AWS) | Throughput |
|---|---|---|---|---|
| H100 SXM | 2.3 hrs | $5.06 | $16.05 | 100% |
| H100 PCIe | 2.8 hrs | $4.17 | $13.97 | 82% |
| A100 80GB | 6.5 hrs | $9.75 | $26.65 | 35% |
| RTX 4090 | 8.2 hrs | $1.48 | N/A | 28% |
Insight: Among the H100 and A100 options, H100 PCIe has the lowest cost per run for single-GPU training; the RTX 4090 is cheaper still but takes about 3x as long
Inference Performance (Llama 3 70B, vLLM, batch=8)
| GPU | Tokens/sec | Cost per 1M tokens (io.net) | Cost per 1M tokens (AWS) |
|---|---|---|---|
| H100 SXM | 185 | $3.30 | $10.48 |
| H100 PCIe | 152 | $2.72 | $9.12 |
| A100 80GB | 62 | $5.40 | $14.60 |
| L40S | 98 | $2.13 | $4.30 |
Insight: L40S ($0.75/hr) offers better cost/token for inference than H100
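Cost per token follows from two numbers: the hourly rate and the sustained tokens per second. The sketch below (the function name is illustrative, not from any library) treats the tokens/sec column as total per-GPU throughput, which is an assumption. A deployment that sustains higher aggregate throughput through heavier batching pays proportionally less per token.

```python
def cost_per_million_tokens(price_per_hour, tokens_per_sec):
    """USD per 1M generated tokens at a sustained throughput."""
    tokens_per_hour = tokens_per_sec * 3600
    return price_per_hour / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    # (io.net hourly rate, tokens/sec) pairs taken from this page.
    gpus = {
        "H100 SXM": (2.20, 185),
        "H100 PCIe": (1.49, 152),
        "L40S": (0.75, 98),
    }
    for name, (rate, tps) in gpus.items():
        print(f"{name:10s} ${cost_per_million_tokens(rate, tps):.2f} per 1M tokens")
```

Because throughput sits in the denominator, doubling sustained tokens/sec halves the cost per token at the same hourly rate.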
Multi-GPU Scaling (Llama 3 70B Training)
| Configuration | Training Time | Cost (io.net) | Scaling Efficiency |
|---|---|---|---|
| 1x H100 SXM | 240 hrs | $528 | 100% |
| 2x H100 SXM | 125 hrs | $550 | 96% |
| 4x H100 SXM | 65 hrs | $572 | 92% |
| 8x H100 SXM | 35 hrs | $616 | 86% |
Insight: Near-linear scaling up to 8 GPUs thanks to NVLink
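Scaling efficiency here is the single-GPU time divided by (GPU count × multi-GPU time), and total cost is GPU count × hours × rate. Here is a minimal sketch reproducing the table; the function names are illustrative.

```python
RATE = 2.20  # io.net H100 SXM, USD per GPU-hour

# GPU count -> wall-clock training hours, from the table above.
RUNS = {1: 240, 2: 125, 4: 65, 8: 35}


def scaling_efficiency(single_gpu_hours, gpus, hours):
    """1.0 means perfectly linear scaling."""
    return single_gpu_hours / (gpus * hours)


def run_cost(gpus, hours, rate=RATE):
    """Total cost of a run: every GPU is billed for the full wall-clock time."""
    return gpus * hours * rate


if __name__ == "__main__":
    for gpus, hours in RUNS.items():
        eff = scaling_efficiency(RUNS[1], gpus, hours)
        print(f"{gpus}x H100 SXM  {hours:3d} hrs  ${run_cost(gpus, hours):,.0f}  {eff:.0%}")
```

Going from 1 to 8 GPUs raises the bill by about 17% ($528 to $616) while cutting wall-clock time by roughly 85%, which is the trade-off the efficiency column summarizes.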
How to Maximize Value from H100 Rentals
1. Use H100 PCIe for Inference
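For LLM inference APIs, H100 PCIe ($1.49/hr) delivers 82% of SXM throughput (152 vs. 185 tokens/sec in the benchmark above) at 68% of the cost. Cost per token is the hourly rate divided by tokens served per hour, so it falls in proportion to sustained throughput: combine vLLM continuous batching with FP8 quantization to push aggregate tokens/sec as high as your latency budget allows. Scenario 3 above, for example, works out to about $0.11 per 1M tokens, 82% below the $0.60 OpenAI API figure.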
2. Optimize Training with Mixed Precision
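H100's FP8 Transformer Engine can accelerate training by up to 2x over FP16:
- Enable FP8 through NVIDIA's Transformer Engine library (its fp8_autocast context manager) or Hugging Face Accelerate (mixed_precision="fp8"); passing torch_dtype=torch.float8_e4m3fn when loading a model is not how FP8 training is switched on
- Use BF16 for non-Transformer layers
- Result: a 2x speedup halves GPU-hours, a 50% cost reduction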
3. Batch Aggressively for Inference
H100's 80GB memory enables massive batch sizes:
- Llama 3 8B: batch size 128+ (vs. 32 on A100)
- 4x higher throughput per GPU
- 75% cost reduction per request
4. Use Spot-Like Pricing on io.net
While io.net doesn't offer "spot instances" (all instances are stable), prices during off-peak hours (2am-8am UTC) are sometimes 10-15% lower due to marketplace dynamics. Schedule batch training jobs overnight for additional savings.
5. Right-Size Your GPU Choice
Don't overpay for H100 if you don't need it:
- Fine-tuning <13B params? RTX 4090 ($0.18/hr) is 92% cheaper
- Inference for <33B params? L40S ($0.75/hr) offers better value
- Training 70B+ or need 8+ GPU clusters? H100 SXM is worth it
Related Questions
Can I rent a single H100 or do I need to rent 8?
You can rent as few as 1 H100 on io.net. While H100 SXM GPUs are designed for 8-GPU HGX baseboards, io.net's marketplace includes individual H100s for single-GPU workloads. For multi-GPU training, deploy 2, 4, or 8 H100s with NVLink connectivity. No minimums, pay per second.
How does H100 compare to H200?
The H200 (announced November 2023, shipping since 2024) offers 141GB of HBM3e memory vs. the H100's 80GB, which makes it well suited to training 175B+ parameter models. Training performance is similar (+5-10% from memory bandwidth improvements). H200s are not yet widely available on cloud platforms. When available on io.net, expect $2.50-$3.00/hr pricing (still 60% below hyperscaler rates).
Is the H100 worth it vs A100 for inference?
For inference, H100 PCIe ($1.49/hr) is rated up to 9x faster than A100 ($1.20/hr) with the FP8 Transformer Engine; in the Llama 3 70B benchmark above it is about 2.5x faster (152 vs. 62 tokens/sec). That is enough to make cost per token roughly 50% lower on H100 despite the higher hourly rate. For maximum cost efficiency, L40S ($0.75/hr) offers the best $/token for models up to 70B params.
Can I run H100s on io.net with InfiniBand?
Yes. Multi-GPU H100 SXM clusters on io.net include NVLink 4.0 (900 GB/s GPU-to-GPU) and select providers offer InfiniBand networking for distributed training across nodes. For 8+ GPU clusters, specify InfiniBand requirement when deploying.
How long until H100s become cheaper?
NVIDIA's H200 and GB200 (Blackwell) releases will create downward price pressure on H100s in 2026-2027. Expect 20-30% price reductions as newer GPUs enter the market. However, io.net's decentralized model already prices H100s about 70% below hyperscalers, so its absolute prices may not drop much; the competitive gap is more likely to narrow as AWS and Azure reduce their rates.
Get Started with H100 GPUs at 70% Savings
Stop overpaying for the world's fastest AI training GPUs:
✅ H100 SXM for $2.20/hr (vs. $6.98/hr on AWS) - 68% savings
✅ H100 PCIe for $1.49/hr (vs. $4.99/hr on AWS) - 70% savings
✅ Instant availability - no waitlists or reservations
✅ Multi-GPU clusters - scale from 1 to 100+ GPUs
Pricing updated April 2026 | Benchmarks from internal testing and MLPerf results
