ZON Logo
Documentation
Docs
Essentials
Getting Started

Getting Started

Zero Overhead Notation - A compact, human-readable way to encode JSON for LLMs.

File Extension: .zonf | Media Type: text/zon | Encoding: UTF-8

ZON is a token-efficient serialization format designed for LLM workflows. It achieves 35-50% token reduction vs JSON through tabular encoding, single-character primitives, and intelligent compression while maintaining 100% data fidelity.

Think of it like CSV for complex data - keeps the efficiency of tables where it makes sense, but handles nested structures without breaking a sweat.

The ZON format is stable, but it's also an evolving concept. There's no finalization yet, so your input is valuable. Contribute to the spec or share your feedback to help shape its future.


Why ZON?

AI is becoming cheaper and more accessible, but larger context windows allow for larger data inputs as well. LLM tokens still cost money - and standard JSON is verbose and token-expensive:

{
  "context": {
    "task": "Our favorite hikes together",
    "location": "Boulder",
    "season": "spring_2025"
  },
  "friends": ["ana", "luis", "sam"],
  "hikes": [
    {
      "id": 1,
      "name": "Blue Lake Trail",
      "distanceKm": 7.5,
      "elevationGain": 320,
      "companion": "ana",
      "wasSunny": true
    },
    {
      "id": 2,
      "name": "Ridge Overlook",
      "distanceKm": 9.2,
      "elevationGain": 540,
      "companion": "luis",
      "wasSunny": false
    },
    {
      "id": 3,
      "name": "Wildflower Loop",
      "distanceKm": 5.1,
      "elevationGain": 180,
      "companion": "sam",
      "wasSunny": true
    }
  ]
}
TOON already conveys the same information with fewer tokens.
context:
  task: Our favorite hikes together
  location: Boulder
  season: spring_2025
friends[3]: ana,luis,sam
hikes[3]{id,name,distanceKm,elevationGain,companion,wasSunny}:
  1,Blue Lake Trail,7.5,320,ana,true
  2,Ridge Overlook,9.2,540,luis,false
  3,Wildflower Loop,5.1,180,sam,true

ZON conveys the same information with even fewer tokens than TOON - using compact table format with explicit headers:

context{task:Our favorite hikes together,location:Boulder,season:spring_2025}
friends[ana,luis,sam]
hikes:@(3):companion,distanceKm,elevationGain,id,name,wasSunny
ana,7.5,320,1,Blue Lake Trail,T
luis,9.2,540,2,Ridge Overlook,F
sam,5.1,180,3,Wildflower Loop,T

Key Features

  • 100% LLM Accuracy: Achieves perfect retrieval (24/24 questions) with self-explanatory structure - no hints needed
  • Most Token-Efficient: 4-15% fewer tokens than TOON across all tokenizers
  • JSON Data Model: Encodes the same objects, arrays, and primitives as JSON with deterministic, lossless round-trips
  • Minimal Syntax: Explicit headers (@(N) for count, column list) eliminate ambiguity for LLMs
  • Tabular Arrays: Uniform arrays collapse into tables that declare fields once and stream row values
  • Canonical Numbers: No scientific notation (1000000, not 1e6), NaN/Infinity -> null
  • Deep Nesting: Handles complex nested structures efficiently (91% compression on 50-level deep objects)
  • Security Limits: Automatic DOS prevention (100MB docs, 1M arrays, 100K keys)
  • Production Ready: 94/94 tests pass, 27/27 datasets verified, zero data loss

Installation & Quick Start

TypeScript Library

npm install zon-format

Example usage:

import { encode, decode } from 'zon-format';

const data = {
  users: [
    { id: 1, name: 'Alice', role: 'admin', active: true },
    { id: 2, name: 'Bob', role: 'user', active: true }
  ]
};

console.log(encode(data));
// users:@(2):active,id,name,role
// T,1,Alice,admin
// T,2,Bob,user

// Decode back to JSON
const decoded = decode(encoded);
console.log(decoded);

//{
// users: [
// { id: 1, name: 'Alice', role: 'admin', active: true },
// { id: 2, name: 'Bob', role: 'user', active: true }
// ]
//};

// Identical to original - lossless!