TOON Format Documentation

What is TOON?

TOON (Token-Oriented Object Notation) is a compact, human-readable encoding of the JSON data model designed specifically for Large Language Models. It provides:

30-60% fewer tokens compared to JSON
73.9% LLM accuracy vs JSON's 69.7% (as reported in the project's benchmarks)
100% compatible with JSON data model
Explicit validation with array lengths and field headers

Basic Syntax

📦 Objects

Simple objects use key-value pairs with indentation for nesting:

JSON

{
  "id": 123,
  "name": "Alice",
  "active": true
}

TOON

id: 123
name: Alice
active: true
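
As a rough illustration of the mapping above, here is a minimal Python sketch that renders a dict of primitives and nested dicts as TOON-style lines. The names format_primitive and encode_object are made up for this example and are not part of the official SDK. String quoting and arrays are deliberately left out.

```python
def format_primitive(value):
    """Render a JSON primitive the way the examples above show it."""
    if value is None:
        return "null"
    if value is True:
        return "true"
    if value is False:
        return "false"
    # Assumption: strings are simple enough to need no quoting.
    return str(value)


def encode_object(obj, indent=2, level=0):
    """Return TOON-style lines for a dict; nested dicts are indented."""
    lines = []
    pad = " " * (indent * level)
    for key, value in obj.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(encode_object(value, indent, level + 1))
        else:
            lines.append(f"{pad}{key}: {format_primitive(value)}")
    return lines


print("\n".join(encode_object({"id": 123, "name": "Alice", "active": True})))
```

A real encoder also has to quote strings that contain delimiters or colons; this sketch only covers the happy path shown in the example.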

📋 Primitive Arrays

Arrays show their length in brackets and list values inline:

JSON

{
  "tags": ["admin", "ops", "dev"]
}

TOON

tags[3]: admin,ops,dev
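
The declared length is what makes validation possible: a reader can compare the number in brackets with the number of values actually received. The Python sketch below shows that check for a single comma-delimited line. parse_primitive_array is a hypothetical helper, values are kept as strings, and quoted values are not handled.

```python
import re

# key[N]: v1,v2,...  (comma delimiter only in this sketch)
ARRAY_LINE = re.compile(r"(\w+)\[(\d+)\]:\s?(.*)")


def parse_primitive_array(line):
    """Parse one primitive-array line and verify its declared length."""
    match = ARRAY_LINE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not a primitive array line: {line!r}")
    key, declared, body = match.group(1), int(match.group(2)), match.group(3)
    values = body.split(",") if body else []
    if len(values) != declared:
        raise ValueError(f"{key}: declared {declared} values, found {len(values)}")
    return key, values


print(parse_primitive_array("tags[3]: admin,ops,dev"))
```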

📊 Tabular Arrays (Most Efficient!)

Arrays of uniform objects declare fields once and stream rows:

JSON (126 tokens)

{
  "users": [
    {
      "id": 1,
      "name": "Alice",
      "role": "admin"
    },
    {
      "id": 2,
      "name": "Bob",
      "role": "user"
    }
  ]
}

TOON (49 tokens - 61% savings!)

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
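
To make the "declare fields once, then stream rows" idea concrete, here is a small Python sketch that produces the tabular form above from a list of dicts. encode_table is an illustrative name, not the official encoder, and it assumes string values need no quoting.

```python
import json


def encode_table(key, rows, delimiter=",", indent=2):
    """Encode a non-empty list of uniform, flat dicts as a TOON-style table."""
    if not rows:
        raise ValueError("this sketch expects at least one row")
    fields = list(rows[0].keys())
    for row in rows:
        if list(row.keys()) != fields:
            raise ValueError("rows are not uniform, so the tabular form does not apply")

    def cell(value):
        # Strings are written bare (assumption: no quoting needed);
        # numbers, booleans, and null use their JSON spelling.
        return value if isinstance(value, str) else json.dumps(value)

    header = f"{key}[{len(rows)}]{{{delimiter.join(fields)}}}:"
    pad = " " * indent
    body = [pad + delimiter.join(cell(row[f]) for f in fields) for row in rows]
    return "\n".join([header] + body)


users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]
print(encode_table("users", users))
```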

Advanced Features

Alternative Delimiters

Use tab or pipe delimiters for additional token savings:

// Tab delimiter (often more efficient); \t stands for a literal tab character
users[2\t]{id\tname\trole}:
  1\tAlice\tadmin
  2\tBob\tuser

// Pipe delimiter (useful when data contains commas)
addresses[2|]{street|city|country}:
  123 Main St, Suite 100|Boston|USA
  456 Oak Ave, Apt 5B|Seattle|USA
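
One way to choose a delimiter is to pick the first candidate that never appears in the data, so no value needs quoting. The Python sketch below does that and builds the matching header, with the delimiter repeated inside the brackets as in the examples above. pick_delimiter and table_header are hypothetical helpers, and the candidate order is an assumption.

```python
def pick_delimiter(rows, candidates=(",", "\t", "|")):
    """Return the first candidate delimiter absent from every value."""
    for candidate in candidates:
        if not any(candidate in str(v) for row in rows for v in row.values()):
            return candidate
    # Every candidate occurs somewhere: fall back to comma (values would need quoting).
    return ","


def table_header(key, count, fields, delimiter=","):
    """Build a table header; non-comma delimiters are echoed inside the brackets."""
    marker = "" if delimiter == "," else delimiter
    return f"{key}[{count}{marker}]{{{delimiter.join(fields)}}}:"


addresses = [
    {"street": "123 Main St, Suite 100", "city": "Boston", "country": "USA"},
    {"street": "456 Oak Ave, Apt 5B", "city": "Seattle", "country": "USA"},
]
print(repr(pick_delimiter(addresses)))
print(table_header("addresses", 2, ["street", "city", "country"], "|"))
```

With this candidate order the sketch prefers tab over pipe for the address data; swap the order if pipe reads better in your prompts.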

Key Folding

Collapse nested chains into dotted paths to reduce tokens:

Standard

data:
  metadata:
    items[2]: a,b

With Key Folding

data.metadata.items[2]: a,b
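
Key folding can be pictured as a pre-processing step on the data: whenever an object has exactly one key whose value is another object, the two keys merge into one dotted path. The Python sketch below shows that idea. fold_keys is an illustrative name, and the rule of only folding identifier-like keys is an assumption about what a "safe" mode would check.

```python
def fold_keys(obj):
    """Collapse chains of single-key dicts into dotted paths."""
    folded = {}
    for key, value in obj.items():
        path = [key]
        # Keep descending while the value is a dict with exactly one key.
        while (
            isinstance(value, dict)
            and len(value) == 1
            and all(part.isidentifier() for part in path)
        ):
            ((child_key, child_value),) = value.items()
            if not child_key.isidentifier():
                break  # a key such as "a.b" would make the folded path ambiguous
            path.append(child_key)
            value = child_value
        folded[".".join(path)] = fold_keys(value) if isinstance(value, dict) else value
    return folded


print(fold_keys({"data": {"metadata": {"items": ["a", "b"]}}}))
```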

📝 Best Practices

✅ TOON Excels At:

Uniform arrays of objects (100% tabular eligibility)
Time-series data and analytics
Employee records, user lists
API responses with consistent structure

⚠️ Consider JSON or CSV For:

Deeply nested configurations (0% tabular eligibility)
Non-uniform or semi-uniform data structures
Pure tabular data (CSV is even smaller than TOON)
Latency-critical applications
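
A quick way to tell which side of these lists a dataset falls on is to test whether an array is tabular-eligible: every element is an object, all objects share the same keys, and every value is a primitive. The Python check below is a sketch under that reading of "uniform"; is_tabular is a hypothetical helper, not an SDK function.

```python
def is_tabular(value):
    """Return True if value is a non-empty list of flat dicts with identical keys."""
    if not isinstance(value, list) or not value:
        return False
    if not all(isinstance(item, dict) for item in value):
        return False
    fields = set(value[0])
    for item in value:
        if set(item) != fields:
            return False  # non-uniform field sets
        if any(isinstance(v, (dict, list)) for v in item.values()):
            return False  # nested values do not fit in a row
    return True


print(is_tabular([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]))
print(is_tabular([{"id": 1}, {"id": 2, "extra": True}]))
```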

🔌 API Reference

POST /api/convert

Convert between JSON and TOON formats programmatically.

Rate Limit: 60 requests per minute per IP address. Maximum input size: 1MB. Rate limit headers are included in all responses.

Request Body:

{
  "input": string,              // Required: Input text
  "mode": string,               // Required: "json-to-toon" | "toon-to-json"
  "options": {
    "delimiter": string,        // Optional: "," | "\t" | "|"
    "indent": number,           // Optional: 2 | 4
    "keyFolding": string,       // Optional: "off" | "safe"
    "includeTokenCount": boolean  // Optional: Include token stats
  }
}

Example:

curl -X POST https://toon-kit.com/api/convert \
  -H "Content-Type: application/json" \
  -d '{
    "input": "{\"users\":[{\"id\":1,\"name\":\"Alice\"}]}",
    "mode": "json-to-toon",
    "options": {
      "includeTokenCount": true
    }
  }'
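
The same request can be assembled in Python with only the standard library. The sketch below builds the request and checks the documented 1MB input limit before sending. build_convert_request is a made-up helper, reading "1MB" as 1,000,000 bytes is an assumption, and the response format is not described here, so the example stops short of parsing it.

```python
import json
import urllib.request

API_URL = "https://toon-kit.com/api/convert"  # taken from the curl example
MAX_INPUT_BYTES = 1_000_000  # assumption: "1MB" interpreted as 10**6 bytes


def build_convert_request(data, include_token_count=True):
    """Build (but do not send) a json-to-toon conversion request."""
    input_text = json.dumps(data, separators=(",", ":"))
    if len(input_text.encode("utf-8")) > MAX_INPUT_BYTES:
        raise ValueError("input exceeds the documented 1MB limit")
    payload = {
        "input": input_text,  # the API expects the JSON document as a string
        "mode": "json-to-toon",
        "options": {"includeTokenCount": include_token_count},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_convert_request({"users": [{"id": 1, "name": "Alice"}]})
print(request.get_method(), request.full_url)

# To actually send it (network access required):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```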

📚 Resources

📖 Official TOON Repo

Full specification, benchmarks, and TypeScript SDK

📜 TOON Specification

Official spec with ABNF grammar and test fixtures

Frequently Asked Questions

How does TOON reduce token usage?

TOON reduces token usage by declaring field names once per array instead of repeating them in every object. For example, a JSON array of 1000 user objects repeats "id", "name", and "role" 1000 times each. TOON declares these fields once in the header, users[1000]{id,name,role}:, and then lists only the values, which saves 30-60% of the tokens.
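
The repetition is easy to see by counting. The Python sketch below generates 1000 synthetic user records, serializes them both ways, and counts how often one field name appears. Character counts are only a rough proxy for tokens, since the real savings depend on the tokenizer; to_toon_table is an illustrative helper and the data is invented for the example.

```python
import json


def to_toon_table(key, rows):
    """Minimal tabular encoder for flat, uniform rows with simple values."""
    fields = list(rows[0])
    header = f"{key}[{len(rows)}]{{{','.join(fields)}}}:"
    body = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header] + body)


# Synthetic data, invented for this example.
users = [{"id": i, "name": f"user{i}", "role": "member"} for i in range(1, 1001)]

as_json = json.dumps({"users": users})
as_toon = to_toon_table("users", users)

print("'role' key occurrences in JSON:", as_json.count('"role"'))
print("'role' occurrences in TOON:", as_toon.count("role"))
print("characters (JSON vs TOON):", len(as_json), "vs", len(as_toon))
```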

Is TOON compatible with existing JSON data?

Yes, TOON is 100% compatible with the JSON data model. You can convert any valid JSON to TOON and back without losing any information. TOON supports all JSON data types including objects, arrays, strings, numbers, booleans, and null.

Which Large Language Models work with TOON?

TOON is plain text, so any Large Language Model can read it, including GPT-4, GPT-3.5, Claude 3, and Claude 2. The project's benchmarks report 73.9% accuracy for TOON compared to 69.7% for JSON on LLM tasks, while using 30-60% fewer tokens.

When should I use TOON instead of JSON?

Use TOON when working with uniform arrays of objects (employee records, user lists, API responses), time-series data, or any structured dataset sent to Large Language Models. TOON excels with tabular data where field names are consistent across array elements. For deeply nested configurations or non-uniform data, JSON may be more appropriate.

Can I use TOON in production applications?

Yes. TOON has a stable specification and a TypeScript SDK available via npm (@toon-format/toon). You can integrate TOON conversion into your workflow using our REST API or the official TypeScript library. The format is designed for production use with proper error handling and validation.

Does TOON support nested objects and arrays?

Yes, TOON fully supports nested objects and arrays. Nested objects use indentation-based structure similar to YAML, while nested arrays maintain their hierarchical structure. However, TOON achieves the highest token efficiency with flat, uniform data structures like tabular arrays.

How do I use TOON with ChatGPT or Claude?

To use TOON with ChatGPT or Claude, simply convert your JSON data to TOON format using our converter, then include it in your prompt. The LLM will understand TOON's tabular format naturally. For best results, add a brief note like "Data is in TOON format with headers showing field names" in your prompt.
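
In code, that amounts to string assembly. The Python sketch below puts the suggested note, the TOON data, and a question into one prompt. build_prompt and the "Question:" label are assumptions for illustration; adapt the layout to whatever prompt style you already use.

```python
NOTE = "Data is in TOON format with headers showing field names."


def build_prompt(toon_data, question):
    """Combine the format note, the TOON block, and the user's question."""
    return f"{NOTE}\n\n{toon_data}\n\nQuestion: {question}"


toon_data = "users[2]{id,name,role}:\n  1,Alice,admin\n  2,Bob,user"
print(build_prompt(toon_data, "Which users have the admin role?"))
```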

What are the limitations of TOON format?

TOON works best with uniform data structures. For deeply nested configurations with varying field sets across objects, JSON might be more efficient. TOON is optimized for LLM input rather than as a replacement for JSON in APIs or storage systems. Token savings also vary depending on the tokenizer and data structure.

Can I convert TOON back to JSON?

Yes! TOON is 100% lossless and bidirectional. You can convert TOON back to JSON without losing any data. Our converter supports both JSON-to-TOON and TOON-to-JSON conversion. The conversion is deterministic and maintains all data types, structure, and values.
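
To show what the reverse direction involves, here is a minimal Python sketch that decodes one comma-delimited tabular block back into the JSON data model, validating the declared row count and row width along the way. decode_table and parse_scalar are hypothetical names, and quoted strings, other delimiters, and nesting are out of scope.

```python
import re

TABLE_HEADER = re.compile(r"(\w+)\[(\d+)\]\{([^}]*)\}:")


def parse_scalar(token):
    """Map a bare cell to null, a boolean, a number, or a string."""
    if token in ("null", "true", "false"):
        return {"null": None, "true": True, "false": False}[token]
    if re.fullmatch(r"-?\d+", token):
        return int(token)
    if re.fullmatch(r"-?\d+\.\d+", token):
        return float(token)
    return token


def decode_table(text):
    """Decode a single comma-delimited tabular block into {key: [row dicts]}."""
    lines = text.strip().splitlines()
    if not lines:
        raise ValueError("empty input")
    match = TABLE_HEADER.fullmatch(lines[0].strip())
    if match is None:
        raise ValueError(f"not a table header: {lines[0]!r}")
    key, declared = match.group(1), int(match.group(2))
    fields = match.group(3).split(",")
    rows = []
    for line in lines[1:]:
        cells = line.strip().split(",")
        if len(cells) != len(fields):
            raise ValueError(f"expected {len(fields)} cells, found {len(cells)}")
        rows.append({f: parse_scalar(c) for f, c in zip(fields, cells)})
    if len(rows) != declared:
        raise ValueError(f"declared {declared} rows, found {len(rows)}")
    return {key: rows}


print(decode_table("users[2]{id,name,role}:\n  1,Alice,admin\n  2,Bob,user"))
```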

Is my data safe when using the online converter?

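Yes, when you use the browser-based converter. Those conversions happen locally in your browser using client-side JavaScript, so your data never leaves your device and is never sent to our servers. The converter works entirely offline after the initial page load. The REST API is different: a request to POST /api/convert sends your input to our servers for conversion.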

What types of data benefit most from TOON?

TOON excels with uniform tabular data such as database query results, API responses with lists of objects, user records, employee data, time-series analytics, e-commerce product catalogs, and CSV-style datasets. Data with repeated fields across multiple objects sees the highest token savings (40-60%).

How does TOON compare to YAML?

TOON and YAML both use indentation-based structure, but TOON adds a tabular format for arrays, which YAML lacks. For uniform arrays, TOON is more token-efficient than YAML. TOON also includes explicit length markers [N] for validation. Both are more readable than JSON for nested data.

Are there libraries available for TOON in other programming languages?

The official TOON library is available for JavaScript/TypeScript via npm (@toon-format/toon). Community implementations are being developed for Python, Go, Rust, .NET, and other languages. Check the official GitHub repository (github.com/toon-format/toon) for the latest list of supported languages.

What is the maximum file size the converter can handle?

Our online converter can handle files up to 10MB for browser-based conversion. The API has a 1MB limit per request with rate limiting of 60 requests per minute. For larger datasets, we recommend using the @toon-format/toon npm package directly in your application.

Does TOON work with streaming LLM responses?

Yes, TOON works with streaming responses. You use TOON primarily to optimize input tokens in your prompts. The LLM can respond in any format (streaming or not). If you want the LLM to output TOON format, include format instructions and examples in your prompt.
