What is TOON? A Complete Guide to Token-Oriented Object Notation
In the era of Large Language Models, every token counts. If you're building AI chatbots, data analytics tools, or LLM-powered applications, you've probably noticed that token costs add up fast. That's the problem TOON was designed to solve.
What is TOON Format?
TOON (Token-Oriented Object Notation) is a compact, human-readable data format specifically designed for Large Language Models. Unlike JSON, which was created for web APIs in the early 2000s, TOON was built for the age of AI.
The core issue TOON addresses is redundancy in JSON. When you have an array of 1000 user objects, field names like "id", "name", and "role" get repeated 1000 times. That's wasteful, especially when you're paying per token. TOON solves this by declaring field names once and presenting data in a tabular format.
Key Features
30-60% Token Savings
TOON typically reduces token consumption by 30-60% compared to JSON for uniform data structures. This translates to:
- Lower API costs for GPT-4, Claude, and other LLMs
- Faster processing times
- More room in context windows for actual content
Better LLM Accuracy
In benchmark testing of LLM data-retrieval tasks, TOON achieved 73.9% accuracy versus 69.7% for JSON. The explicit length and field headers and the tabular layout appear to help models parse relationships in the data more reliably.
100% JSON Compatible
TOON is fully compatible with the JSON data model. You can convert between formats seamlessly without losing information. All JSON data types are supported: objects, arrays, strings, numbers, booleans, and null values.
The conversion is lossless and bidirectional. Your existing JSON data can be converted to TOON for LLM prompts, and you can convert TOON responses back to JSON for storage or API responses.
Human-Readable Structure
Despite being optimized for tokens, TOON remains intuitive. It combines the best aspects of CSV (for arrays) and YAML (for nested objects), and developers find it easy to read and debug even without prior experience with the format.
How TOON Works: A Simple Example
Let's compare JSON and TOON with a real example:
JSON (126 tokens)
{
  "users": [
    {
      "id": 1,
      "name": "Alice",
      "role": "admin"
    },
    {
      "id": 2,
      "name": "Bob",
      "role": "user"
    }
  ]
}
TOON (49 tokens) - 61% savings
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
What Changed?
- Field declaration: {id,name,role} declares all fields once instead of repeating them per object
- Array length: [2] explicitly states the array size for validation
- Tabular format: data rows contain only values, not property names
- Minimal punctuation: no repetitive brackets, quotes, or commas around keys
The token count drops from 126 to 49. Scale that up to 1000 users, and you're looking at significant cost savings.
When to Use TOON
TOON works best in specific scenarios:
Ideal Use Cases
Uniform arrays of objects: Employee records, user lists, transaction logs. When you have multiple objects with the same fields, TOON's tabular format shines.
Time-series data: Analytics data, sensor readings, stock prices. The CSV-like structure is perfect for temporal data.
API responses: REST API results with consistent structure. If your API returns arrays of similar objects, TOON can cut the token count dramatically.
Database query results: SQL query outputs are already tabular, making them perfect for TOON conversion.
E-commerce catalogs: Product lists with standard fields (id, name, price, category) compress well.
CSV-style datasets: Any data that's naturally tabular benefits from TOON's format.
Less Ideal Use Cases
Deeply nested configurations: Complex config files with varying structures don't benefit as much. JSON might be clearer.
Non-uniform data: Objects with varying field sets lose TOON's main advantage. The tabular format requires consistent fields.
Small datasets: Arrays of fewer than roughly 10 objects won't show much benefit. The overhead of the field declaration isn't worth it.
Pure tabular data: If you're just sending tabular data with no nesting, plain CSV is even more compact.
TOON Syntax Basics
Simple Objects
id: 123
name: Alice
active: true
Just key-value pairs, one per line. Similar to YAML but simpler.
Arrays of Primitives
tags[3]: admin,ops,dev
Declare the length, then comma-separate the values.
Tabular Arrays (Most Efficient)
employees[3]{id,name,department,salary}:
  101,Alice Johnson,Engineering,120000
  102,Bob Smith,Marketing,95000
  103,Carol White,Sales,105000
This is where TOON really shines. The fields are declared once in the header ({id,name,department,salary}), and each row is just the values.
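To make the header-plus-rows idea concrete, here is a minimal sketch of how a uniform array could be encoded by hand. This is illustrative only: encodeTabular is a hypothetical helper, and the official library additionally handles quoting, nesting, and type preservation.

```typescript
// Minimal sketch of tabular TOON encoding for a uniform array of objects.
type Row = Record<string, string | number | boolean | null>;

function encodeTabular(key: string, rows: Row[]): string {
  if (rows.length === 0) return `${key}[0]:`;
  const fields = Object.keys(rows[0]);                 // field names declared once
  const header = `${key}[${rows.length}]{${fields.join(",")}}:`;
  const body = rows.map(r => "  " + fields.map(f => String(r[f])).join(","));
  return [header, ...body].join("\n");
}

const toon = encodeTabular("users", [
  { id: 1, name: "Alice", role: "admin" },
  { id: 2, name: "Bob", role: "user" },
]);
console.log(toon);
// users[2]{id,name,role}:
//   1,Alice,admin
//   2,Bob,user
```

Note that the `[2]` length marker comes for free from the array itself, giving the model (or a validator) a cheap consistency check on the row count.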
Nested Objects
company:
  name: TechCorp
  employees[2]{id,name}:
    1,Alice
    2,Bob
You can nest objects and arrays naturally. Indentation shows the structure.
Using TOON in Production
TOON is production-ready with several implementations:
Official TypeScript/JavaScript SDK
Available on npm as @toon-format/toon. Install it with:
npm install @toon-format/toon
Quick Start Example
import { encode, decode } from '@toon-format/toon';

// JSON to TOON
const data = {
  users: [
    { id: 1, name: "Alice", role: "admin" },
    { id: 2, name: "Bob", role: "user" }
  ]
};

const toonString = encode(data);
// users[2]{id,name,role}:
//   1,Alice,admin
//   2,Bob,user

// TOON to JSON
const jsonData = decode(toonString);
// Back to the original structure
The API is straightforward: encode() converts JSON to TOON, decode() converts TOON back to JSON.
Other Languages
Community libraries for Python, Go, and Rust are in development. The specification is open, so you can implement it yourself if needed.
Real-World Impact
Cost Savings Example
Let's say you process 1 million API requests per month with GPT-4:
- With JSON: ~500M tokens/month at $0.03/1K = $15,000/month
- With TOON: ~200M tokens/month at $0.03/1K = $6,000/month
- Savings: $9,000/month (60% reduction)
For a high-volume application, that's real money.
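The arithmetic above can be checked directly. The token counts and the per-token rate are the example's assumed figures, not measurements:

```typescript
// Back-of-envelope cost model using the assumed figures above.
const pricePerK = 0.03;             // USD per 1K input tokens (assumed rate)
const jsonTokens = 500_000_000;     // monthly tokens with JSON
const toonTokens = 200_000_000;     // monthly tokens with TOON (60% fewer)

const jsonCost = (jsonTokens / 1000) * pricePerK; // 15000
const toonCost = (toonTokens / 1000) * pricePerK; // 6000
console.log(jsonCost - toonCost);                 // 9000
```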
Context Window Optimization
Take an 8K-token context window (the original GPT-4 limit). If your dataset takes 6,000 tokens as JSON:
- JSON: 6,000 tokens for data → 2,000 tokens left for your prompt
- TOON: 2,400 tokens for data → 5,600 tokens left for your prompt
That's 2.8x more space for instructions, examples, and multi-turn conversation history.
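Spelled out with the same example figures (a 60% reduction is assumed; actual savings vary by dataset):

```typescript
// Context-window arithmetic with the example figures above.
const contextWindow = 8000;
const jsonData = 6000;                        // tokens consumed by JSON payload
const toonData = Math.round(jsonData * 0.4);  // assumed 60% fewer tokens → 2400

const leftWithJson = contextWindow - jsonData; // 2000
const leftWithToon = contextWindow - toonData; // 5600
console.log(leftWithToon / leftWithJson);      // 2.8
```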
Getting Started
Here's how to try TOON:
- Use our free online converter: paste your JSON, get TOON instantly with token counts
- Explore the playground: compare TOON with JSON, YAML, and XML side by side
- Read the documentation: complete syntax guide and API reference
- Install the npm package: npm install @toon-format/toon
Common Questions
Is TOON a replacement for JSON?
No. TOON is optimized for LLM input, not general data interchange. Use JSON for APIs and storage, then convert to TOON when sending data to LLMs. Think of it as a transformation layer, not a replacement.
Does TOON work with all LLMs?
Yes. TOON works with GPT-4, Claude, Gemini, LLaMA, and any text-based LLM. The format is intuitive enough that models understand it with minimal prompting. In some cases, you don't even need to explain it—just include it in a code block and the model figures it out.
Can I convert TOON back to JSON?
Absolutely. TOON is 100% lossless and bidirectional. Use decode() to convert TOON back to JSON at any time.
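For intuition, decoding a tabular block can be sketched in a few lines. This is illustrative only: decodeTabular is a hypothetical helper, and the official decode() supports the full format and restores value types, where this sketch leaves every value as a string.

```typescript
// Minimal sketch of decoding a tabular TOON block back to objects.
function decodeTabular(toon: string): Record<string, unknown>[] {
  const [header, ...rows] = toon.trim().split("\n");
  const m = header.match(/^(\w+)\[(\d+)\]\{([^}]+)\}:$/);
  if (!m) throw new Error("not a tabular TOON block");
  const fields = m[3].split(",");
  if (rows.length !== Number(m[2])) throw new Error("length mismatch"); // [N] validates row count
  return rows.map(row => {
    const values = row.trim().split(",");
    return Object.fromEntries(fields.map((f, i) => [f, values[i]]));
  });
}

const users = decodeTabular(`users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user`);
// [{ id: "1", name: "Alice", role: "admin" },
//  { id: "2", name: "Bob", role: "user" }]
```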
What about performance?
TOON encoding and decoding is fast—typically sub-millisecond for datasets under 1MB. The token savings far outweigh any processing overhead. For most applications, the conversion time is negligible compared to the LLM API call latency.
Do I need to explain TOON to the LLM?
Usually, a brief explanation helps: "Here's data in TOON format (fields declared once in the header)." After that, models understand it well. Some models even recognize it without explanation if you use code blocks.
The Bottom Line
TOON was built to solve a specific problem: reducing token waste when sending structured data to LLMs. It does this by:
- Eliminating field name repetition
- Using a tabular format for arrays
- Minimizing punctuation overhead
- Providing explicit validation markers
The result is typically 30-60% fewer tokens, which means lower costs, better accuracy, and more efficient use of context windows.
If you're working with LLMs and sending structured data in your prompts, TOON is worth trying. Start with our free converter to see the impact on your actual data.
Ready to reduce your LLM costs?
Try our free JSON to TOON converter and see instant token savings