Just Think AI

Glossary Term

Tokens

The chunks of text models count and bill by — usually 3-4 characters each.

Models don't see characters or words; they see tokens. A token is roughly 3-4 characters of English text, or about ¾ of a word. A short common word like "the" is a single token, while a long rare word like "antidisestablishmentarianism" splits into several.
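The characters-per-token rule of thumb can be sketched as a quick estimator. This is a heuristic only, not a real tokenizer; exact counts come from the provider's own encoding:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4-characters-per-token heuristic."""
    return max(1, round(len(text) / 4))

print(estimate_tokens("tokenization"))                  # 12 chars -> ~3 tokens
print(estimate_tokens("antidisestablishmentarianism"))  # 28 chars -> ~7 tokens
```

Good enough for back-of-envelope budgeting; for anything billing-critical, count with the actual tokenizer.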

Why it matters: providers bill per token, for both input and output. A 500-word email is roughly 650 tokens. Process a million such emails per month with GPT-4o at $2.50 per million input tokens and that's about $1,625 on input alone, plus whatever the model writes back.
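The arithmetic above is easy to check directly. The token estimate and price are the figures from this entry, not live pricing:

```python
TOKENS_PER_EMAIL = 650     # this entry's estimate for a 500-word email
PRICE_PER_M_INPUT = 2.50   # USD per million input tokens (GPT-4o, per the text)

def monthly_input_cost(emails: int) -> float:
    """Input-side cost for a batch of emails, in USD."""
    total_tokens = emails * TOKENS_PER_EMAIL
    return total_tokens / 1_000_000 * PRICE_PER_M_INPUT

print(monthly_input_cost(1_000_000))  # 1625.0
```

Output tokens are billed separately, so the real monthly bill is this number plus the cost of everything the model generates.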

Different providers tokenize differently. OpenAI's GPT-4o uses the o200k_base encoding (GPT-4 and GPT-3.5 used cl100k_base); Anthropic's models have their own tokenizer. The same English prose produces roughly the same count across major providers, but code, JSON, and non-Latin scripts can vary by 30% or more. Use our token counter to estimate before you build.

Bring this to your business

Knowing the term is one thing. Shipping it is another.

We do two-week AI Sprints — one term, one workflow, into production by Day 10.