
Benchmarks

Retrieval Accuracy

Benchmarks test LLM comprehension using 309 data retrieval questions on gpt-5-nano (Azure OpenAI).

Dataset Catalog

| Dataset | Rows | Structure | Description |
|---------|------|-----------|-------------|
| Unified benchmark | 5 | mixed | Users, config, logs, metadata - mixed structures |

Structure: Mixed uniform tables + nested objects
Questions: 309 total (field retrieval, aggregation, filtering, structure awareness)

Efficiency Ranking (Accuracy per 10K Tokens)

Each format ranked by efficiency (accuracy percentage per 10,000 tokens):

ZON            ████████████████████ 1430.6 acc%/10K │  99.0% acc │ 692 tokens 👑
CSV            ███████████████████░ 1386.5 acc%/10K │  99.0% acc │ 714 tokens
JSON compact   ████████████████░░░░ 1142.1 acc%/10K │  91.6% acc │ 802 tokens
TOON           ████████████████░░░░ 1132.7 acc%/10K │  99.0% acc │ 874 tokens
JSON           ██████████░░░░░░░░░░  744.6 acc%/10K │  96.8% acc │ 1,300 tokens

Efficiency score = (Accuracy % / Tokens) x 10,000. Higher is better.

ZON achieves 99.0% accuracy while using 20.8% fewer tokens than TOON and 13.7% fewer than compact JSON.
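The ranking above can be reproduced with a short sketch (accuracy percentages and token counts hard-coded from the table; the scoring formula is the one stated above):

```python
# Efficiency score = (accuracy % / tokens) * 10,000 -- higher is better.
# Accuracy and token counts are taken from the benchmark table above.
results = {
    "ZON":  (99.0, 692),
    "CSV":  (99.0, 714),
    "TOON": (99.0, 874),
    "JSON": (96.8, 1300),
}

def efficiency(accuracy_pct: float, tokens: int) -> float:
    """Accuracy percentage per 10,000 tokens."""
    return accuracy_pct / tokens * 10_000

# Rank formats from most to least efficient.
ranked = sorted(results.items(), key=lambda kv: efficiency(*kv[1]), reverse=True)
for name, (acc, tok) in ranked:
    print(f"{name:<5} {efficiency(acc, tok):7.1f} acc%/10K")
```

Equal accuracy does not mean equal efficiency: ZON and TOON both score 99.0%, but ZON's smaller token footprint gives it the higher efficiency score.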

Per-Model Comparison

Accuracy on the unified dataset with gpt-5-nano:

gpt-5-nano (Azure OpenAI)
→ ZON            ████████████████████  99.0% (306/309) │ 692 tokens
  TOON           ████████████████████  99.0% (306/309) │ 874 tokens
  CSV            ████████████████████  99.0% (306/309) │ 714 tokens
  JSON           ███████████████████░  96.8% (299/309) │ 1,300 tokens
  JSON compact   ██████████████████░░  91.6% (283/309) │ 802 tokens

ZON matches TOON's 99.0% accuracy while using 20.8% fewer tokens.
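Both headline numbers can be checked directly from the raw counts in the chart above (a minimal sketch):

```python
# Verify the gpt-5-nano headline numbers: accuracy from raw question counts,
# and ZON's token savings relative to TOON.
correct, total = 306, 309
zon_tokens, toon_tokens = 692, 874

accuracy_pct = correct / total * 100
savings_pct = (toon_tokens - zon_tokens) / toon_tokens * 100

print(f"accuracy: {accuracy_pct:.1f}%")
print(f"ZON vs TOON: {savings_pct:.1f}% fewer tokens")
```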

⚡️ Token Efficiency (vs Compact JSON)
| Tokenizer | ZON savings vs compact JSON | vs TOON | vs CSV |
|-----------|-----------------------------|---------|--------|
| GPT-4o | -23.8% 👑 | -36.1% | -12.9% |
| Claude 3.5 | -21.3% 👑 | -26.0% | -9.9% |
| Llama 3 | -16.5% 👑 | -26.6% | -9.2% |

Note: ZON is the only human-readable format that matches or beats CSV in token count while maintaining full structural fidelity.


Token Efficiency Benchmark

Tokenizers: GPT-4o (o200k), Claude 3.5 (Anthropic), Llama 3 (Meta)
Datasets: Unified benchmark dataset; Large Complex Nested Dataset

BYTE SIZES:

CSV:              1,384 bytes
ZON:              1,399 bytes
TOON:             1,665 bytes
JSON (compact):   1,854 bytes
YAML:             2,033 bytes
JSON (formatted): 2,842 bytes
XML:              3,235 bytes

Unified Dataset

GPT-4o (o200k):

    ZON          █████████░░░░░░░░░░░ 513 tokens 👑
    CSV          ██████████░░░░░░░░░░ 534 tokens (+4.1%)
    JSON (cmp)   ███████████░░░░░░░░░ 589 tokens (+14.8%)
    TOON         ███████████░░░░░░░░░ 614 tokens (+19.7%)
    YAML         █████████████░░░░░░░ 728 tokens (+41.9%)
    JSON format  ████████████████████ 939 tokens (+83.0%)
    XML          ████████████████████ 1,093 tokens (+113.1%)

Claude 3.5 (Anthropic): 

    CSV          ██████████░░░░░░░░░░ 544 tokens 👑
    ZON          ██████████░░░░░░░░░░ 548 tokens (+0.7%)
    TOON         ██████████░░░░░░░░░░ 570 tokens (+4.0%)
    JSON (cmp)   ███████████░░░░░░░░░ 596 tokens (+8.8%)
    YAML         ████████████░░░░░░░░ 641 tokens (+17.0%)
    JSON format  ████████████████████ 914 tokens (+66.8%)
    XML          ████████████████████ 1,104 tokens (+101.5%)

Llama 3 (Meta):

    ZON          ██████████░░░░░░░░░░ 696 tokens 👑
    CSV          ██████████░░░░░░░░░░ 728 tokens (+4.6%)
    JSON (cmp)   ███████████░░░░░░░░░ 760 tokens (+9.2%)
    TOON         ███████████░░░░░░░░░ 784 tokens (+12.6%)
    YAML         █████████████░░░░░░░ 894 tokens (+28.4%)
    JSON format  ████████████████████ 1,225 tokens (+76.0%)
    XML          ████████████████████ 1,392 tokens (+100.0%)
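The "+%" labels in these charts are each format's token overhead relative to ZON; a minimal sketch recomputing them from the GPT-4o (o200k) counts above:

```python
# Token overhead of each format relative to ZON
# (GPT-4o o200k tokenizer, unified dataset counts from the chart above).
ZON_TOKENS = 513
others = {
    "CSV": 534,
    "JSON (cmp)": 589,
    "TOON": 614,
    "YAML": 728,
    "JSON format": 939,
    "XML": 1093,
}

def overhead_pct(tokens: int, baseline: int = ZON_TOKENS) -> float:
    """Extra tokens as a percentage of the baseline format's count."""
    return (tokens - baseline) / baseline * 100

for name, tok in others.items():
    print(f"{name:<12} {tok:>5} tokens (+{overhead_pct(tok):.1f}%)")
```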

Large Complex Nested Dataset

GPT-4o (o200k):

    ZON          █████████░░░░░░░░░░░ 143,661 tokens 👑
    CSV          ██████████░░░░░░░░░░ 164,919 tokens (+14.8%)
    JSON (cmp)   ███████████░░░░░░░░░ 188,604 tokens (+31.3%)
    TOON         █████████████░░░░░░░ 224,940 tokens (+56.6%)
    YAML         █████████████░░░░░░░ 224,938 tokens (+56.6%)
    JSON format  ████████████████████ 284,132 tokens (+97.8%)
    XML          ████████████████████ 335,239 tokens (+133.4%)

Claude 3.5 (Anthropic):

    ZON          █████████░░░░░░░░░░░ 145,652 tokens 👑
    CSV          ██████████░░░░░░░░░░ 161,701 tokens (+11.0%)
    JSON (cmp)   ███████████░░░░░░░░░ 185,136 tokens (+27.1%)
    TOON         ████████████░░░░░░░░ 196,893 tokens (+35.2%)
    YAML         ████████████░░░░░░░░ 196,892 tokens (+35.2%)
    JSON format  ████████████████████ 274,149 tokens (+88.2%)
    XML          ████████████████████ 327,274 tokens (+124.7%)

Llama 3 (Meta):

    ZON          ██████████░░░░░░░░░░ 230,838 tokens 👑
    CSV          ███████████░░░░░░░░░ 254,181 tokens (+10.1%)
    JSON (cmp)   ████████████░░░░░░░░ 276,405 tokens (+19.7%)
    TOON         █████████████░░░░░░░ 314,824 tokens (+36.4%)
    YAML         █████████████░░░░░░░ 314,820 tokens (+36.4%)
    JSON format  ████████████████████ 407,488 tokens (+76.5%)
    XML          ████████████████████ 480,125 tokens (+108.0%)

Overall Summary:

GPT-4o (o200k):
  ZON Wins: 2/2 datasets
  
  Total tokens across both datasets:
    ZON:         144,174
    CSV:         165,453 (+14.8%)
    JSON (cmp):  189,193 (+31.2%)
    TOON:        225,554 (+56.4%)
    
  ZON vs TOON: 36.1% fewer tokens
  ZON vs JSON: 23.8% fewer tokens

Claude 3.5 (Anthropic):
  ZON Wins: 1/2 datasets
  
  Total tokens across both datasets:
    ZON:         146,200
    CSV:         162,245 (+11.0%)
    JSON (cmp):  185,732 (+27.0%)
    TOON:        197,463 (+35.1%)
    
  ZON vs TOON: 26.0% fewer tokens
  ZON vs JSON: 21.3% fewer tokens

Llama 3 (Meta):
  ZON Wins: 2/2 datasets
  
  Total tokens across both datasets:
    ZON:         231,534
    CSV:         254,909 (+10.1%)
    JSON (cmp):  277,165 (+19.7%)
    TOON:        315,608 (+36.3%)
    
  ZON vs TOON: 26.6% fewer tokens
  ZON vs JSON: 16.5% fewer tokens
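These totals are simply the sums of the per-dataset counts from the charts above; as a sanity check, a sketch for the Claude 3.5 tokenizer (counts hard-coded from the charts):

```python
# Per-format token counts for the Claude 3.5 tokenizer:
# (unified dataset, large complex nested dataset), from the charts above.
counts = {
    "ZON":        (548, 145_652),
    "CSV":        (544, 161_701),
    "JSON (cmp)": (596, 185_136),
    "TOON":       (570, 196_893),
}
totals = {name: sum(parts) for name, parts in counts.items()}

def savings_pct(winner: int, loser: int) -> float:
    """Token savings of `winner` relative to `loser`, as a percentage."""
    return (loser - winner) / loser * 100

for name, total in totals.items():
    print(f"{name:<11} {total:,}")
print(f"ZON vs TOON: {savings_pct(totals['ZON'], totals['TOON']):.1f}% fewer tokens")
```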

Key Insights:

  • ZON wins both datasets on GPT-4o and Llama 3 (best token efficiency on every test).

  • On Claude 3.5, CSV has a slight edge (0.7%) on the simple tabular dataset, but ZON wins on the complex nested dataset and on the combined total.

  • ZON outperforms TOON on every tokenizer and dataset, with savings ranging from 3.9% (Claude 3.5, unified) to 36.1% (GPT-4o, nested).

  • Average savings: 26-36% vs TOON and 16-24% vs compact JSON across all tokenizers.

Key Insight: ZON is the only format that wins or nearly wins across all models and datasets.