Why You're Overpaying for Claude Max (The Token Problem Everyone Misses)
Published: January 5, 2026 - 8 min read
Okay, okay, I KNOW. I've written a lot about tokens now, but this? THIS is the post I've been dying to write.
Back in December, I went down a rabbit hole learning about tokens and context windows. I thought I understood them pretty well. Then I started paying closer attention to my Claude usage and had this moment where I literally stopped mid-conversation and thought: "Wait. How many people are paying $100 or $200 a month when they could be paying $20?"
The answer? A lot. Probably most.
And here's the uncomfortable truth: Most people who pay $100-$200/month for Claude Max would be perfectly fine on the $20/month Pro plan - if they understood how tokens actually work.
This is the first post in a 4-part series where I'm going to break down everything I've learned about Claude pricing and tokens. By the end, you'll know exactly which plan you actually need, and more importantly, why.
Quick note: This applies mainly to Claude.ai (the web interface), but the principles absolutely transfer if you're using Claude Code or the API. The token mechanics are the same everywhere.
The Core Problem: Why People Overpay
Here's what I've realized after spending way too much time thinking about this. The problem isn't that Max is overpriced. The problem is that most users:
- Think "1 message = 1 message" (it doesn't)
- Don't realize long conversations compound exponentially
- Upload the same files repeatedly instead of using Projects
- Use the most powerful model for simple tasks
- Don't understand that Claude "forgets" everything after each message
That last one? That's the big one. And I'm going to explain exactly what I mean.
But first, we need to talk about what tokens actually are just in case you haven't read my previous blog posts. And I promise you, even if you think you know, there just might be something here you might've missed.
What Are Tokens? (The Foundation Everyone Misses)
The Simple Explanation
A token is not a word. It's not a message. It's the smallest unit of text that Claude can process.
Key facts:
- 1 token is approximately 3/4 of a word (about 4 characters in English)
- 100 tokens is approximately 75 words
- 1 million tokens is approximately 750,000 words (roughly 4-5 novels)
Why This Matters for Your Usage
When Claude's documentation says you can send "45 messages per 5 hours," that's an estimate based on short messages. Here's the reality:
| Message Type | Approximate Token Cost |
|---|---|
| Short question ("What's 2+2?") | ~10-20 tokens |
| Paragraph of text | ~100-200 tokens |
| Full page of text | ~500-800 tokens |
| 10-page document | ~5,000-8,000 tokens |
| Large PDF (50 pages) | ~25,000-50,000 tokens |
See what's happening here? A "message" with a 50-page PDF attached isn't 1 message. It's potentially 50,000 tokens in a single exchange.
Token Examples (So You Can Visualize This)
"Hello, how are you?"
- Tokenized as:
["Hello", ",", " how", " are", " you", "?"] - = 6 tokens
"Can you help me write a professional email to my manager about requesting time off next Friday for a doctor's appointment?"
- = Approximately 25 tokens
The Non-English Factor
If you've been following my blog post on why French costs more tokens, you already know this one: Non-English languages often use dramatically more tokens than English to express the same idea.
This was one of the first shocking discoveries I made when I started digging into token mechanics back in December. For example, "Bonjour, comment allez-vous?" uses approximately 9 tokens while "Hello, how are you?" only uses 6 tokens—that's 50% more tokens for the French version.
And it gets worse with other languages. One study found that a sentence in Telugu used 7 times more tokens than the same sentence in English, despite having fewer characters.
What this means for your quota:
- If you chat in Spanish, French, or other languages, you're consuming more tokens per message
- Code with lots of special characters can be token-heavy
- Emojis and special symbols add up quickly
For someone like me who literally has Claude respond to me in French to help me practice, this is just one more thing I have to keep in mind. I'm not changing my approach (I need the practice), but knowing this helps me understand why my quota drains faster when I'm practicing French versus English.
The "Memory Illusion": Claude's Biggest Secret
Okay, this is the part that blew my mind when I first really understood it. I wrote about this in my tokens and context windows deep dive, but let me explain it again because it's THAT important.
The Truth About Claude's "Memory"
Here's the most important thing to understand:
Claude has NO memory between messages. Every API request is completely stateless.
When you're having a conversation with Claude and it "remembers" what you said earlier, here's what's actually happening:
- You send message #1
- Claude responds
- You send message #2
- The Claude.ai app secretly sends message #1 + Claude's response + message #2 all together
- Claude processes everything as if it's brand new
- You send message #3
- The app sends message #1 + response #1 + message #2 + response #2 + message #3
Do you see what's happening? Every single message re-sends the ENTIRE conversation history. Claude isn't "remembering." It's re-reading everything, every single time.
The Exponential Cost of Long Conversations
This is why long conversations destroy your usage quota:
| Message # | What Gets Sent to Claude | Token Growth |
|---|---|---|
| 1 | Your first message | 100 tokens |
| 2 | Msg 1 + Response 1 + Msg 2 | ~400 tokens |
| 3 | Everything above + Msg 3 | ~800 tokens |
| 5 | Full history | ~2,000 tokens |
| 10 | Full history | ~5,000 tokens |
| 20 | Full history | ~15,000+ tokens |
Your 20th message costs 20x more than your first message in the same conversation.
Let that sink in. If you're someone who keeps one long conversation going all day, you're burning through tokens exponentially faster than someone who starts fresh conversations.
Real Example: The Document Analysis Trap
Let me paint a picture that I'm sure many of you have lived through:
Scenario: You upload a 30-page PDF and have a 10-message conversation about it.
What most people think:
- "I uploaded 1 file and sent 10 messages"
What actually happens:
- Message 1: PDF (20,000 tokens) + question (50 tokens) = 20,050 tokens
- Message 2: PDF + previous exchange + new question = ~25,000 tokens
- Message 10: PDF re-processed 10 times + all previous exchanges = ~80,000+ tokens
You didn't just read the PDF once. Claude had to re-read it with EVERY SINGLE MESSAGE.
No wonder you're hitting your limits so fast.
What This Costs You
Here's the quick overview of what you're paying for:
| Plan | Price | Messages per 5 Hours* |
|---|---|---|
| Pro | $20/mo | ~45 short messages |
| Max 5x | $100/mo | ~225 short messages |
| Max 20x | $200/mo | ~900 short messages |
*Estimates based on short conversations with basic models
But here's the thing: those "messages per 5 hours" estimates? They're based on short, simple conversations. As you now understand from the memory illusion section above, your 20th message in a conversation costs 20x more than your first message. Upload a PDF and have a 10-message conversation about it? You just re-read that entire PDF 10 times.
Those neat little "message" estimates start to fall apart pretty quickly.
In the final post of this series, I'll break down exactly what you get in each plan, when you actually need Max, and how to decide which tier makes sense for you. For now, just know that most people paying for Max could probably thrive on Pro if they understood what we just covered.
Key Takeaways (So Far)
Let me summarize what we've covered:
-
Tokens are not messages. A "message" with a PDF attached could be 50,000 tokens.
-
Claude has no memory. Every message re-sends the entire conversation history. Your 20th message costs exponentially more than your 1st.
-
Documents get re-read every single message. Upload once, pay for it repeatedly.
-
The "45 messages per 5 hours" estimate is based on tiny messages. Your mileage will vary dramatically based on how you use Claude.
-
Most people paying for Max would be fine on Pro - if they understood these mechanics and adjusted their usage accordingly.
What's Next
But wait - there's more. In my next post, I'm going to show you the 6 hidden token drains that are destroying your quota without you even realizing it. Extended thinking, file uploads, tools, model selection... there's so much eating your tokens that you never see.
Stay tuned!
As always, thanks for reading!