Text to Token Calculator
Paste or type text to see character count, word count, and estimated token count. Choose a model family (GPT-4, Claude, etc.) for a more accurate estimate. Optionally include special tokens for chat or completion framing.
Your text
Paste or type the text you want to estimate tokens for.
Token counts vary by model. Choose the model family closest to your target API.
Including special tokens adds a small overhead for message boundaries and system framing.
About this calculator
This calculator is for developers and teams who need to estimate token counts from raw text for prompt design, context window limits, or API cost estimation. Different model families tokenize differently; selecting GPT-4, Claude, Llama, or Mistral gives an estimate aligned with that family's typical tokenizer behavior.
Use the result to stay within model context limits and to approximate cost (tokens × per-token price). For full documents or RAG pipelines, use the PDF Token Size Estimator to estimate tokens from page and word counts.
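The cost formula above (tokens × per-token price) and the context-limit check can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's own code: the function names, the per-million-token pricing convention, and the idea of reserving room for output are all assumptions. Substitute your provider's published prices and limits.

```python
def estimate_cost(tokens: int, price_per_million_tokens: float) -> float:
    """Approximate cost as tokens x per-token price.

    Assumption: the price is quoted per one million tokens, so it is
    divided by 1,000,000 to get the per-token price.
    """
    return tokens * price_per_million_tokens / 1_000_000


def fits_context(tokens: int, context_limit: int, reserved_for_output: int = 0) -> bool:
    """Return True if the prompt, plus any tokens reserved for the reply,
    stays within the model's context window."""
    return tokens + reserved_for_output <= context_limit


if __name__ == "__main__":
    # Hypothetical numbers for illustration only; not real prices or limits.
    prompt_tokens = 7_000
    print(estimate_cost(prompt_tokens, price_per_million_tokens=5.0))
    print(fits_context(prompt_tokens, context_limit=8_192, reserved_for_output=1_000))
```

Because the token count is itself an estimate, it may be sensible to leave some headroom below the context limit rather than filling it exactly.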
Approximate tokens per word by model family
Tokenizers vary by model. These are typical approximations for English; other languages may differ.
| Model family | Typical tokens per word (approx) | Notes |
|---|---|---|
| GPT-4 | ~1.3 | Roughly 1 token per 4 characters for English. |
| GPT-3.5 | ~1.3 | Similar to GPT-4; subword tokenization. |
| Claude | ~1.2–1.4 | Varies by model; often slightly fewer tokens than GPT-4 for same text. |
| Llama | ~1.4–1.6 | Often more tokens per word; check specific model. |
| Mistral | ~1.4–1.6 | Similar to Llama-style tokenizers. |
| Default | ~1.3 | Generic estimate when no model is selected. |
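As a rough sketch of how the ratios in the table could be applied, the Python below multiplies a whitespace-based word count by a tokens-per-word factor. It is not the calculator's actual implementation: the function name, the dictionary keys, the use of range midpoints (1.3 for Claude, 1.5 for Llama and Mistral), and the `framing_overhead` parameter are assumptions made for illustration. For exact counts, use the target model's own tokenizer.

```python
import math

# Tokens-per-word ratios taken from the table above.
# Where the table gives a range, the midpoint is used (an assumption).
TOKENS_PER_WORD = {
    "gpt-4": 1.3,
    "gpt-3.5": 1.3,
    "claude": 1.3,   # midpoint of ~1.2 to 1.4
    "llama": 1.5,    # midpoint of ~1.4 to 1.6
    "mistral": 1.5,  # midpoint of ~1.4 to 1.6
    "default": 1.3,
}


def estimate_tokens(text: str, family: str = "default", framing_overhead: int = 0) -> dict:
    """Estimate character, word, and token counts for English text.

    framing_overhead is a hypothetical flat number of extra tokens for
    chat or completion framing; the real overhead depends on the API.
    """
    words = len(text.split())  # whitespace-separated words
    ratio = TOKENS_PER_WORD.get(family.lower(), TOKENS_PER_WORD["default"])
    tokens = math.ceil(words * ratio) + framing_overhead
    return {"characters": len(text), "words": words, "estimated_tokens": tokens}


if __name__ == "__main__":
    sample = "The quick brown fox jumps over the lazy dog"
    print(estimate_tokens(sample, "llama"))
```

Rounding up with `math.ceil` is a deliberate choice here: when budgeting against a context limit, a slight overestimate is usually safer than an underestimate.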
Related calculators
- PDF Token Size Estimator — Estimate tokens from PDF page and word counts for RAG and chunking.
- Document Processing Cost Calculator — Monthly cost for parsing, extraction, OCR, and chat.