The basic unit AI models use to process text — roughly corresponding to word parts, common words, or character sequences.
Tokens are the chunks of text that AI models read and produce. They're not quite words: common words like 'the' are one token, but longer words get broken into pieces. Roughly, 1 token ≈ 4 characters or ¾ of a word in English. When you see AI pricing like '$10 per 1M tokens' or context limits like '128K tokens', those figures count these chunks.
Think of tokens like LEGO blocks. Common words like 'the' or 'apple' are single pre-made blocks. Longer or less common words get built from smaller pieces. The word 'tokenization' might be three tokens: 'token' + 'iz' + 'ation'. AI models work with these blocks, not whole words, which gives them flexibility to handle new or rare terms.
Tokens are produced by tokenizers: algorithms that split text into units a model can process. Most modern LLMs use subword tokenization such as Byte-Pair Encoding (BPE) or SentencePiece, which balances vocabulary size against the ability to represent rare words. A tokenizer learns common character sequences from its training data: frequent patterns become single tokens, while uncommon words are broken into multiple pieces. Different models ship different tokenizers, so the same text can produce different token counts across models. Non-English languages often require more tokens per character because they are less represented in the tokenizer's training data.
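The BPE idea described above can be sketched in a few lines. This is a toy illustration, not any model's real tokenizer: it "trains" on a single string and repeatedly merges the most frequent adjacent pair of symbols, which is how frequent patterns become single tokens.

```python
# Toy sketch of Byte-Pair Encoding (BPE): repeatedly merge the most
# frequent adjacent pair of symbols until no pair repeats, or the
# merge budget runs out. Real tokenizers train on huge corpora.
from collections import Counter

def bpe_train(word, num_merges):
    """Learn merge rules from a single 'corpus' string (illustration only)."""
    symbols = list(word)
    merges = []
    for _ in range(num_merges):
        pairs = Counter(zip(symbols, symbols[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:
            break  # nothing repeats; stop merging
        merges.append((a, b))
        # Apply the new merge rule left-to-right across the sequence.
        merged, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and symbols[i] == a and symbols[i + 1] == b:
                merged.append(a + b)
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols, merges

symbols, merges = bpe_train("aaabdaaabac", 3)
print(symbols)  # ['aaab', 'd', 'aaab', 'a', 'c']
print(merges)   # [('a', 'a'), ('aa', 'a'), ('aaa', 'b')]
```

After three merges, the frequent chunk 'aaab' has become a single symbol, exactly the way 'the' or 'apple' end up as single tokens in a real vocabulary.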
'Hello world!' = 3 tokens. A typical paragraph of 100 words ≈ 130 tokens.
'128K tokens' = roughly 300 pages of a book. GPT-4o has a 128K context window; some Claude models support up to 1M.
'$2.50 per 1M input tokens' = you pay $2.50 for about 750,000 words of input. Most conversations cost pennies.
Languages like Chinese, Japanese, and Korean use more tokens per character, making them more expensive to process.
OpenAI provides a free tokenizer tool at platform.openai.com/tokenizer, and Anthropic offers similar tooling for Claude. Programmatically, libraries like tiktoken (Python) count tokens for OpenAI models. For a rough estimate: 1 token ≈ 4 English characters ≈ 0.75 words, so a 2,000-word essay is approximately 2,700 tokens.
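The rough estimation rules above (1 token ≈ 4 characters ≈ 0.75 words) can be wrapped in two small helpers. These are heuristics for quick budgeting only; real counts must come from the model's own tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Rough English-only estimate: 1 token ~= 4 characters.
    For exact counts, use the model's tokenizer (e.g. tiktoken)."""
    return max(1, round(len(text) / 4))

def estimate_tokens_by_words(word_count: int) -> int:
    """Alternative heuristic: 1 word ~= 4/3 tokens (0.75 words per token)."""
    return round(word_count / 0.75)

print(estimate_tokens("Hello world!"))      # 3
print(estimate_tokens_by_words(2000))       # 2667 (~2,700, as in the text)
```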
LLM APIs charge per token, usually with separate rates for input and output. Costs can add up with long prompts or conversations. For example, a conversation using 10K input tokens + 2K output tokens per message at $5/1M input, $15/1M output = $0.08/message. Multiply across users and it becomes significant.
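The per-message arithmetic above is simple enough to capture in a helper. The default rates are the hypothetical $5/1M input and $15/1M output figures from the example, not any provider's actual pricing.

```python
def message_cost(input_tokens: int, output_tokens: int,
                 input_rate_per_m: float = 5.00,
                 output_rate_per_m: float = 15.00) -> float:
    """Dollar cost of one message at per-million-token rates.
    Default rates are the hypothetical figures from the example above."""
    return (input_tokens * input_rate_per_m
            + output_tokens * output_rate_per_m) / 1_000_000

# 10K input + 2K output per message, as in the example:
print(round(message_cost(10_000, 2_000), 2))  # 0.08
```

At $0.08 per message, 1,000 users sending 10 messages a day is $800/day, which is why token budgeting matters at scale.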
Each model has its own tokenizer, trained on different data. A tokenizer that saw more technical text will represent technical terms in fewer tokens. This is why the same prompt can cost different amounts across providers: not only does per-token pricing differ, but the token count itself can vary by 10-20% between models.
The maximum amount of text an AI model can consider at once — including your prompt, conversation history, and the response being generated.
🧩A way of representing text (or other data) as lists of numbers that capture meaning, enabling similarity search and semantic operations.
📚A neural network trained on massive text data to understand and generate human-like language.
✍️The skill of writing instructions to AI models to get the best possible output.
Our free AI course teaches you to use these ideas in real projects.
Start Free AI Course →