API Usage Cost Analysis - Ultra Data
Step 1: Corrected Usage Table from Screenshot
| MODEL | INPUT | OUTPUT | CACHE WRITE | CACHE READ | TOTAL TOKENS | API COST | COST TO YOU |
|---|---|---|---|---|---|---|---|
| claude-4-sonnet-thinking | 8,105,585 | 6,772,884 | 96,221,876 | 983,461,828 | 1,094,562,173 | $928.64 | $0 |
| claude-4-opus-thinking | 181,494 | 194,581 | 2,878,550 | 37,711,000 | 40,965,625 | $151.46 | $0 |
| gemini-2.5-pro-preview-06-05 | 5,665,633 | 178,125 | 0 | 77,316,674 | 83,160,432 | $39.40 | $0 |
| gemini-2.5-pro-preview-05-06 | 4,765,599 | 285,224 | 0 | 48,766,270 | 53,817,093 | $26.22 | $0 |
| Total | 18,718,311 | 7,430,814 | 99,100,426 | 1,147,255,772 | 1,272,505,323 | $1145.72 | $0 |
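Because the table was transcribed from a screenshot, a quick consistency check is worthwhile. The sketch below is illustrative only: the helper names `row_mismatches` and `column_sums` are hypothetical, and the numbers are copied from the table above. It checks that each row's four token columns sum to its TOTAL TOKENS value and that the columns sum to the Total row.

```python
# Token counts transcribed from the usage table above:
# (input, output, cache write, cache read, reported total tokens)
ROWS = {
    "claude-4-sonnet-thinking": (8_105_585, 6_772_884, 96_221_876, 983_461_828, 1_094_562_173),
    "claude-4-opus-thinking": (181_494, 194_581, 2_878_550, 37_711_000, 40_965_625),
    "gemini-2.5-pro-preview-06-05": (5_665_633, 178_125, 0, 77_316_674, 83_160_432),
    "gemini-2.5-pro-preview-05-06": (4_765_599, 285_224, 0, 48_766_270, 53_817_093),
}
# The "Total" row of the same table.
TOTAL_ROW = (18_718_311, 7_430_814, 99_100_426, 1_147_255_772, 1_272_505_323)


def row_mismatches(rows):
    """Return the models whose four token columns do not sum to the reported total."""
    return [model for model, (*parts, total) in rows.items() if sum(parts) != total]


def column_sums(rows):
    """Sum every column across all models."""
    return tuple(sum(column) for column in zip(*rows.values()))


if __name__ == "__main__":
    print("rows with inconsistent totals:", row_mismatches(ROWS))
    print("columns match the Total row:", column_sums(ROWS) == TOTAL_ROW)
```

An empty mismatch list and a matching Total row indicate that the transcription is at least internally consistent; they do not prove that the screenshot itself was read correctly.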
Step 2: OpenRouter Pricing for Known Models
Based on the corrected pricing information:
- Claude 4 Sonnet Thinking: $3.00 input / $15.00 output / $0.30 cache read / $3.75 cache write per million tokens
- Claude 4 Opus Thinking: $15.00 input / $75.00 output / $1.50 cache read / $18.75 cache write per million tokens
- Gemini 2.5 Pro (all versions): $1.25 input / $10.00 output / $0.30 cache read / unknown cache write per million tokens
Step 3: Calculated Costs Using OpenRouter Pricing
| MODEL | INPUT TOKENS | OUTPUT TOKENS | CACHE WRITE | CACHE READ | INPUT COST | OUTPUT COST | CACHE WRITE COST | CACHE READ COST | TOTAL CALCULATED COST |
|---|---|---|---|---|---|---|---|---|---|
| claude-4-sonnet-thinking | 8,105,585 | 6,772,884 | 96,221,876 | 983,461,828 | $24.317 | $101.593 | $360.832 | $295.039 | $781.781 |
| claude-4-opus-thinking | 181,494 | 194,581 | 2,878,550 | 37,711,000 | $2.722 | $14.594 | $53.973 | $56.567 | $127.856 |
| gemini-2.5-pro-preview-06-05 | 5,665,633 | 178,125 | 0 | 77,316,674 | $7.082 | $1.781 | $0.000 | $23.195 | $32.058 |
| gemini-2.5-pro-preview-05-06 | 4,765,599 | 285,224 | 0 | 48,766,270 | $5.957 | $2.852 | $0.000 | $14.630 | $23.439 |
| | | | | | | | | TOTAL: | $965.134 |
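Note: Gemini cache write costs are assumed to be $0 because no rate was specified. Both Gemini rows report 0 cache write tokens, so this assumption does not change the calculated totals.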
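The Step 3 figures follow from multiplying each token count by its per-million rate. The sketch below reproduces that arithmetic under the Step 2 rates. The function names `model_cost` and `all_costs` are hypothetical, and the Gemini cache write rate of 0 is the same assumption stated in the note, not a published price.

```python
# USD per million tokens, from Step 2: (input, output, cache write, cache read).
# Assumption: Gemini cache write is priced at 0 because its rate is unknown.
RATES = {
    "claude-4-sonnet-thinking": (3.00, 15.00, 3.75, 0.30),
    "claude-4-opus-thinking": (15.00, 75.00, 18.75, 1.50),
    "gemini-2.5-pro-preview-06-05": (1.25, 10.00, 0.00, 0.30),
    "gemini-2.5-pro-preview-05-06": (1.25, 10.00, 0.00, 0.30),
}

# Token counts from Step 1: (input, output, cache write, cache read).
USAGE = {
    "claude-4-sonnet-thinking": (8_105_585, 6_772_884, 96_221_876, 983_461_828),
    "claude-4-opus-thinking": (181_494, 194_581, 2_878_550, 37_711_000),
    "gemini-2.5-pro-preview-06-05": (5_665_633, 178_125, 0, 77_316_674),
    "gemini-2.5-pro-preview-05-06": (4_765_599, 285_224, 0, 48_766_270),
}


def model_cost(tokens, rates):
    """Cost in USD: each token count multiplied by its per-million rate."""
    return sum(count * rate for count, rate in zip(tokens, rates)) / 1_000_000


def all_costs(usage=USAGE, rates=RATES):
    """Map each model to its calculated cost in USD."""
    return {model: model_cost(usage[model], rates[model]) for model in usage}


if __name__ == "__main__":
    costs = all_costs()
    for model, cost in costs.items():
        print(f"{model:32s} ${cost:,.2f}")
    print(f"{'TOTAL':32s} ${sum(costs.values()):,.2f}")
```

Summing unrounded values gives a total that can differ from the table's $965.134 in the third decimal place, because the table adds figures that were already rounded.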
Summary of Findings
- Formatting Issues: The original usage data was already properly formatted in this screenshot.
- Cost Accuracy: The calculated total of $965.13 is $180.59 below the reported API cost of $1145.72, a shortfall of 15.8% of the reported figure.
- Model-by-Model Comparison:
  - Claude 4 Sonnet Thinking: $781.78 calculated vs $928.64 reported (15.8% difference)
  - Claude 4 Opus Thinking: $127.86 calculated vs $151.46 reported (15.6% difference)
  - Gemini 2.5 Pro Preview 06-05: $32.06 calculated vs $39.40 reported (18.6% difference)
  - Gemini 2.5 Pro Preview 05-06: $23.44 calculated vs $26.22 reported (10.6% difference)
- Massive Scale Usage: This Ultra subscription shows extraordinary usage levels:
  - 1.27 billion total tokens processed
  - 1.15 billion cache read tokens (90.2% of all usage)
  - 99.1 million cache write tokens (7.8% of all usage)
  - Cache operations represent 97.9% of all token usage
- Cache Usage Impact: Cache operations dominate costs, contributing approximately $804 (about 83%) of the $965.13 calculated total: $414.80 for cache writes and $389.43 for cache reads. By token volume, cache reads are about 11.6 times cache writes.
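- Cost Efficiency: The heavy reliance on cache reads (for Claude 4 Sonnet Thinking, $0.30/M vs $3.00/M for input tokens) provides large cost savings. If every cache write and cache read token were instead billed at each model's regular input rate, the same usage would cost approximately $4166 at the Step 2 rates, about 4.3 times the calculated $965.13.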
- Possible Explanations for Discrepancies:
  - Enterprise/Volume Pricing: At this scale (1.27B tokens), non-public pricing tiers may apply, although a volume discount would lower the reported cost rather than raise it
  - Unknown Gemini Cache Write Costs: Gemini models may have cache write or storage fees not accounted for, even though the table reports 0 Gemini cache write tokens
  - Infrastructure Costs: Ultra-scale usage may include additional infrastructure/priority access fees
  - Different Cache Pricing: The actual cache pricing may differ from public rates at this volume
- Usage Patterns: The volume and the cache-heavy token mix suggest enterprise-level usage with aggressive prompt caching, possibly production applications serving many users. The usage table alone cannot confirm this.
- Cost Per Token: The effective reported cost is approximately $0.90 per million tokens (about $0.0000009 per token) across all usage, which reflects how much of the volume is billed at cache read rates.
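The summary figures above can be re-derived from the Step 1 token counts, the Step 2 rates, and the reported API costs. The following is a minimal sketch, not the method used to produce the report. All function names are hypothetical, the Gemini cache write rate is assumed to be 0, and the no-cache counterfactual assumes that every cache write and cache read token would otherwise be billed at the model's regular input rate.

```python
M = 1_000_000

# Token counts from Step 1: (input, output, cache write, cache read).
USAGE = {
    "claude-4-sonnet-thinking": (8_105_585, 6_772_884, 96_221_876, 983_461_828),
    "claude-4-opus-thinking": (181_494, 194_581, 2_878_550, 37_711_000),
    "gemini-2.5-pro-preview-06-05": (5_665_633, 178_125, 0, 77_316_674),
    "gemini-2.5-pro-preview-05-06": (4_765_599, 285_224, 0, 48_766_270),
}
# USD per million tokens from Step 2, same column order.
# Assumption: Gemini cache write rate is 0 because it is unknown.
RATES = {
    "claude-4-sonnet-thinking": (3.00, 15.00, 3.75, 0.30),
    "claude-4-opus-thinking": (15.00, 75.00, 18.75, 1.50),
    "gemini-2.5-pro-preview-06-05": (1.25, 10.00, 0.00, 0.30),
    "gemini-2.5-pro-preview-05-06": (1.25, 10.00, 0.00, 0.30),
}
# Reported API cost in USD from Step 1.
REPORTED = {
    "claude-4-sonnet-thinking": 928.64,
    "claude-4-opus-thinking": 151.46,
    "gemini-2.5-pro-preview-06-05": 39.40,
    "gemini-2.5-pro-preview-05-06": 26.22,
}


def calculated_cost(model):
    """Cost in USD at the Step 2 rates."""
    return sum(t * r for t, r in zip(USAGE[model], RATES[model])) / M


def gap_pct(model):
    """Shortfall of the calculated cost, as a percentage of the reported cost."""
    return 100 * (REPORTED[model] - calculated_cost(model)) / REPORTED[model]


def token_totals():
    """Column sums: (input, output, cache write, cache read)."""
    return tuple(sum(column) for column in zip(*USAGE.values()))


def cache_share_pct():
    """Cache write plus cache read tokens as a percentage of all tokens."""
    _, _, cache_write, cache_read = token_totals()
    return 100 * (cache_write + cache_read) / sum(token_totals())


def read_write_ratio():
    """Cache read tokens per cache write token."""
    _, _, cache_write, cache_read = token_totals()
    return cache_read / cache_write


def no_cache_total():
    """Counterfactual cost with all cached tokens billed at the input rate."""
    total = 0.0
    for model, (inp, out, cache_write, cache_read) in USAGE.items():
        in_rate, out_rate, _, _ = RATES[model]
        total += ((inp + cache_write + cache_read) * in_rate + out * out_rate) / M
    return total


def effective_rate_per_million():
    """Reported USD per million tokens across all usage."""
    return sum(REPORTED.values()) / sum(token_totals()) * M


if __name__ == "__main__":
    for model in USAGE:
        print(f"{model:32s} gap {gap_pct(model):.1f}%")
    print(f"cache share of tokens: {cache_share_pct():.1f}%")
    print(f"cache read / write ratio: {read_write_ratio():.1f}x")
    print(f"no-cache counterfactual: ${no_cache_total():,.2f}")
    print(f"effective rate: ${effective_rate_per_million():.2f} per million tokens")
```

Keeping the rates and counts in plain dictionaries makes it straightforward to substitute a different price list and see how much of the gap it explains.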