Token

A token is the unit of text a large language model processes and bills by: typically about 4 characters of English text, or roughly 0.75 of a word, though the exact mapping depends on the model’s tokenizer.

LLMs don’t see characters or words; they see tokens. The tokenizer breaks input text into tokens before processing, and the model produces output one token at a time. English text averages roughly 4 characters per token; other languages and code often need noticeably more tokens for the same amount of text.

Tokens matter to marketers because LLM APIs charge for both input tokens and output tokens. At about 0.75 words per token, a 500-word system prompt is around 670 tokens and a 200-word generated post is around 270 tokens. At common pricing, a single generation might cost a fraction of a cent, but running a large brand-conditioning prompt against hundreds of generations a day adds up to real money.
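
As a rough illustration of that arithmetic, here is a minimal Python sketch. It uses the 0.75 words-per-token rule of thumb, so its counts are estimates that land close to the figures above rather than exact tokenizer output. The two prices and the 300 generations a day are hypothetical placeholders, not any provider's actual rates.

```python
# Rule of thumb from the text: one token is roughly 0.75 of an English word.
WORDS_PER_TOKEN = 0.75

# Hypothetical prices in USD per million tokens (assumptions for illustration).
INPUT_PRICE_PER_MILLION = 3.00
OUTPUT_PRICE_PER_MILLION = 15.00


def estimate_tokens(word_count: int) -> int:
    """Estimate a token count from an English word count."""
    return round(word_count / WORDS_PER_TOKEN)


def generation_cost(prompt_words: int, output_words: int) -> float:
    """Estimated USD cost of one generation (prompt in, post out)."""
    input_tokens = estimate_tokens(prompt_words)
    output_tokens = estimate_tokens(output_words)
    total = (input_tokens * INPUT_PRICE_PER_MILLION
             + output_tokens * OUTPUT_PRICE_PER_MILLION)
    return total / 1_000_000


# The example from the text: a 500-word system prompt and a 200-word post.
per_generation = generation_cost(prompt_words=500, output_words=200)
per_day = per_generation * 300  # assumed 300 generations a day

print(f"per generation: ${per_generation:.4f}")
print(f"per day:        ${per_day:.2f}")
```

With these placeholder prices, one generation comes out well under a cent, while the daily total is measured in dollars. Swapping in a provider's published rates and your own volume gives a usable first estimate.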

Why it matters

Token cost decides whether the economics of AI marketing work. A tool that uses a 50-token system prompt is cheap and generic; a tool that uses a 4,000-token brand-conditioning prompt is on-brand but sends 80 times as many prompt tokens with every generation. That cost difference is real and shows up in what the tool charges.
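
To put numbers on that comparison, the sketch below totals what a month of re-sending each system prompt might cost. The input price, the 500 generations a day, and the 30-day month are all assumptions chosen for illustration. The sketch also ignores output tokens and any provider discounts.

```python
# Hypothetical input price in USD per million tokens (an assumption).
INPUT_PRICE_PER_MILLION = 3.00


def monthly_prompt_cost(system_prompt_tokens: int,
                        generations_per_day: int,
                        days: int = 30) -> float:
    """USD spent per month on sending the system prompt with every generation."""
    total_tokens = system_prompt_tokens * generations_per_day * days
    return total_tokens * INPUT_PRICE_PER_MILLION / 1_000_000


# Assumed volume: 500 generations a day for both tools.
generic = monthly_prompt_cost(50, generations_per_day=500)
on_brand = monthly_prompt_cost(4_000, generations_per_day=500)

print(f"50-token prompt:    ${generic:.2f} per month")
print(f"4,000-token prompt: ${on_brand:.2f} per month")
print(f"ratio:              {on_brand / generic:.0f}x")
```

The ratio between the two totals stays at 80 whatever price or volume you plug in, because it depends only on the two prompt lengths.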