LLM Best Practices

Version: 1.1.0

Guide for maximizing ZON's effectiveness in LLM applications.

Encoding Mode Selection for LLMs

Choose the right ZON mode for your LLM workflow:

| Scenario | Mode | Why |
| --- | --- | --- |
| Prompts (GPT-4, Claude) | llm-optimized | Uses true/false for better comprehension |
| High-volume APIs | compact | Maximum token savings |
| RAG context | llm-optimized | Balances clarity and efficiency |
| Function calling | compact | Minimal tokens |
| Human review needed | readable | YAML-like, easy to verify |
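
If the mode is chosen in code rather than by hand, the table can be captured as a lookup. This is a minimal sketch: the scenario labels and the `pickMode` helper are hypothetical names introduced for illustration, not part of the ZON library.

```typescript
// Hypothetical helper: the scenario labels below are illustrative names
// for the rows of the table, not identifiers defined by ZON.
type ZonMode = "compact" | "llm-optimized" | "readable";
type Scenario =
  | "prompt"
  | "high-volume-api"
  | "rag-context"
  | "function-calling"
  | "human-review";

const MODE_BY_SCENARIO: Record<Scenario, ZonMode> = {
  "prompt": "llm-optimized",
  "high-volume-api": "compact",
  "rag-context": "llm-optimized",
  "function-calling": "compact",
  "human-review": "readable",
};

// Returns the mode the table recommends for a given scenario.
function pickMode(scenario: Scenario): ZonMode {
  return MODE_BY_SCENARIO[scenario];
}
```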

Token Impact Examples

Same Data, Different Modes:

const userData = {
  users: [
    { id: 1, name: "Alice", active: true, role: "admin" },
    { id: 2, name: "Bob", active: false, role: "user" }
  ]
};

Compact Mode (38 tokens):

users:@(2):active,id,name,role
T,1,Alice,admin
F,2,Bob,user

LLM-Optimized Mode (42 tokens - 10% more, but clearer):

users:@(2):active,id,name,role
true,1,Alice,admin
false,2,Bob,user

Readable Mode (52 tokens - 37% more, human-friendly):

users:
  - active:true
    id:1
    name:Alice
    role:admin
  - active:false
    id:2
    name:Bob
    role:user
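
To make the difference between the two tabular modes concrete, here is a rough sketch of how a flat array of objects could be turned into the table form shown above. It is not the ZON library's encoder: `encodeTable` and `formatCell` are hypothetical names, and the sketch assumes every row has the same keys and that no value contains a comma or newline.

```typescript
type Cell = string | number | boolean | null;
type Row = Record<string, Cell>;
type TableMode = "compact" | "llm-optimized";

// Booleans are the only values that differ between the two modes.
function formatCell(value: Cell, mode: TableMode): string {
  if (value === null) return "null";
  if (typeof value === "boolean") {
    if (mode === "compact") return value ? "T" : "F";
    return value ? "true" : "false";
  }
  return String(value);
}

// Assumption: rows are uniform and values need no quoting or escaping.
function encodeTable(name: string, rows: Row[], mode: TableMode): string {
  if (rows.length === 0) {
    throw new Error("This sketch does not define an encoding for empty tables");
  }
  // Columns are sorted alphabetically, matching the examples in this guide.
  const columns = Object.keys(rows[0]).sort();
  const header = `${name}:@(${rows.length}):${columns.join(",")}`;
  const lines = rows.map((row) =>
    columns.map((col) => formatCell(row[col], mode)).join(",")
  );
  return [header].concat(lines).join("\n");
}

const userData: { users: Row[] } = {
  users: [
    { id: 1, name: "Alice", active: true, role: "admin" },
    { id: 2, name: "Bob", active: false, role: "user" },
  ],
};

console.log(encodeTable("users", userData.users, "compact"));
// users:@(2):active,id,name,role
// T,1,Alice,admin
// F,2,Bob,user
```

Passing `"llm-optimized"` instead yields the same header with `true`/`false` in the boolean column.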

Use compact for production, llm-optimized for better AI understanding, readable for debugging.
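
The overhead percentages quoted for each mode follow directly from the token counts. A small sketch of the arithmetic, with `overheadPercent` as an illustrative helper name:

```typescript
// Relative overhead of a mode compared with a baseline, in percent.
function overheadPercent(baselineTokens: number, otherTokens: number): number {
  return ((otherTokens - baselineTokens) / baselineTokens) * 100;
}

// Token counts taken from the three examples above.
const compactTokens = 38;
const llmOptimizedTokens = 42;
const readableTokens = 52;

console.log(overheadPercent(compactTokens, llmOptimizedTokens).toFixed(1)); // "10.5"
console.log(overheadPercent(compactTokens, readableTokens).toFixed(1)); // "36.8"
```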


Why ZON for LLMs?

LLM API costs are directly tied to token count. ZON reduces tokens by 23.8% vs JSON while achieving 100% retrieval accuracy.

Key Benefits:

  • Lower costs: Fewer tokens = lower API bills
  • Better accuracy: 100% vs JSON's 91.7%
  • Self-documenting: Explicit headers @(N):columns
  • Human-readable: Easy to debug and verify

Sending ZON as Input

Basic Pattern

Wrap ZON data in a code block with a format label:

Here's the user data in ZON format:

```zon
users:@(3):active,id,name,role
T,1,Alice,admin
T,2,Bob,user
F,3,Carol,guest
```

Question: How many active users are there?

Why this works:

  • Code blocks prevent formatting issues
  • zon label helps model recognize format
  • Explicit headers (@(3):columns) give clear schema
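
When prompts are assembled in code, the basic pattern can be wrapped in a small helper. This is a sketch under assumptions: `buildZonPrompt` is a hypothetical function name, and the ZON text is expected to be already encoded.

```typescript
// Three backticks, written as escapes so this snippet can itself sit inside a code block.
const FENCE = "\u0060\u0060\u0060";

// Builds: intro line, fenced ZON block labelled "zon", then the question.
function buildZonPrompt(intro: string, zon: string, question: string): string {
  return [
    intro,
    "",
    FENCE + "zon",
    zon.trim(),
    FENCE,
    "",
    "Question: " + question,
  ].join("\n");
}

const userTable = [
  "users:@(3):active,id,name,role",
  "T,1,Alice,admin",
  "T,2,Bob,user",
  "F,3,Carol,guest",
].join("\n");

console.log(
  buildZonPrompt(
    "Here's the user data in ZON format:",
    userTable,
    "How many active users are there?"
  )
);
```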

Alternative: No Code Block

For simple queries, code blocks aren't required:

Data:
users:@(3):id,name,active
1,Alice,T
2,Bob,F
3,Carol,T

Question: List all active users.

Prompting Strategies

Strategy 1: Show the Format (No Explanation)

Best approach - Let the model infer the structure:

```zon
products:@(4):category,id,name,price,stock
Electronics,1,Laptop,999,45
Books,2,Python Guide,29.99,120
Electronics,3,Mouse,19.99,200
Books,4,JavaScript Basics,24.95,85
```

Find products with stock below 100.

Why it works: The explicit headers (@(4):category,id,name,price,stock) are self-documenting.

Strategy 2: Minimal Context

For complex queries, add brief context:

Data format: ZON (tabular)
@(N) = row count
Column names listed in header

```zon
logs:@(100):level,message,timestamp,userId
ERROR,Database timeout,2025-01-15T10:30:00Z,1001
WARN,High memory usage,2025-01-15T10:31:15Z,1002
ERROR,API rate limit,2025-01-15T10:32:45Z,1001
...
```

How many ERROR logs are from userId 1001?

Strategy 3: Comparison (When Teaching Format)

If the model hasn't seen ZON before:

Traditional JSON:
```json
{"users": [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]}
```

Same data in compact ZON format:
```zon
users:@(2):id,name
1,Alice
2,Bob
```

Now answer based on this ZON data:
```zon
sales:@(5):amount,date,product,region
1250,2025-01-10,Laptop,West
890,2025-01-11,Mouse,East
...
```

Common Use Cases

1. Data Retrieval Questions

Perfect for ZON - table format excels here:

```zon
employees:@(20):active,department,id,name,salary
T,Engineering,1,Alice Chen,95000
T,Sales,2,Bob Smith,75000
F,Marketing,3,Carol Lee,68000
...
```

Questions:
1. What's the average salary in Engineering?
2. How many inactive employees are there?
3. List all Sales department employees.

2. Aggregation Tasks

```zon
transactions:@(1000):amount,category,date,userId
45.99,groceries,2025-01-10,1001
120.00,electronics,2025-01-10,1002
23.50,groceries,2025-01-11,1001
...
```

Calculate total spending by category for userId 1001.

3. Filtering Tasks

```zon
products:@(500):category,inStock,name,price,rating
Electronics,T,Laptop Pro,1299,4.5
Books,F,Python Guide,29.99,4.8
Electronics,T,USB Mouse,19.99,4.2
...
```

Find all in-stock Electronics with rating above 4.0.

4. Structure Awareness

```zon
metadata{deployed:2025-01-15,env:production,version:1.3.0}
users:@(5):id,name,active
1,Alice,T
2,Bob,F
3,Carol,T
4,Dan,T
5,Eve,F
config{database{host:localhost,port:5432},cache{ttl:3600}}
```

Questions:
- What are the top-level keys?
- How many users are in the dataset?
- What's the database port?

Validation and Error Handling

Ask Model to Validate

```zon
users:@(3):id,name,active
1,Alice,T
2,Bob,F
```

Before answering: verify the data has exactly 3 rows as declared.
Then answer: How many users are active?
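
The same check can be run in code before the prompt is sent, so a mismatch is caught without relying on the model. The sketch below handles only a single flat table with no elided rows; `checkRowCount` is a hypothetical helper, not a ZON library function. Note that the sample above declares 3 rows but lists 2, which is exactly what this check reports.

```typescript
interface TableCheck {
  name: string;
  declared: number; // N from the @(N) header
  actual: number;   // number of data rows found
  ok: boolean;
}

// Assumption: input is one flat table, header first, one row per line.
function checkRowCount(zonTable: string): TableCheck {
  const lines = zonTable
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const match = /^([^:]+):@\((\d+)\):(.*)$/.exec(lines[0] || "");
  if (!match) {
    throw new Error("Not a table header: " + lines[0]);
  }
  const declared = parseInt(match[2], 10);
  const actual = lines.length - 1;
  return { name: match[1], declared, actual, ok: declared === actual };
}

const sampleTable = ["users:@(3):id,name,active", "1,Alice,T", "2,Bob,F"].join("\n");
console.log(checkRowCount(sampleTable));
// { name: 'users', declared: 3, actual: 2, ok: false }
```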

Handle Missing Data

```zon
products:@(4):id,name,price,stock
1,Laptop,999,45
2,Mouse,19.99,null
3,Keyboard,79.99,0
4,Monitor,299,15
```

Note: `null` means missing value.
Question: Which products have unknown stock levels?
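
To verify the model's answer, the expected result can be computed locally. The sketch below uses a deliberately simple parser for the flat table form (no quoting or escaping); `parseFlatTable` and `rowsWithNull` are illustrative names. It also shows why `null` and `0` must be kept apart: the Keyboard row has a known stock of 0, not a missing value.

```typescript
type ParsedRow = Record<string, string>;

// Assumption: one flat table, comma-separated cells, no quoting.
function parseFlatTable(zonTable: string): ParsedRow[] {
  const lines = zonTable
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const match = /^[^:]+:@\(\d+\):(.+)$/.exec(lines[0] || "");
  if (!match) {
    throw new Error("Not a table header: " + lines[0]);
  }
  const columns = match[1].split(",");
  return lines.slice(1).map((line) => {
    const cells = line.split(",");
    const row: ParsedRow = {};
    columns.forEach((col, i) => {
      row[col] = cells[i];
    });
    return row;
  });
}

// Rows whose value in the given column is the literal null.
function rowsWithNull(rows: ParsedRow[], column: string): ParsedRow[] {
  return rows.filter((row) => row[column] === "null");
}

const productsTable = [
  "products:@(4):id,name,price,stock",
  "1,Laptop,999,45",
  "2,Mouse,19.99,null",
  "3,Keyboard,79.99,0",
  "4,Monitor,299,15",
].join("\n");

console.log(rowsWithNull(parseFlatTable(productsTable), "stock").map((r) => r["name"]));
// [ 'Mouse' ]
```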

Optimizing Token Usage

Tip 1: Use Compact Field Names

# Good (shorter column names)
u:@(100):id,n,e,a
1,Alice,alice@ex.com,T
2,Bob,bob@ex.com,F

# Acceptable (verbose names)
users:@(100):userId,fullName,emailAddress,isActive
1,Alice,alice@ex.com,true
2,Bob,bob@ex.com,false

Token savings: ~20% with compact names

Tip 2: Boolean Shorthand

In compact mode, ZON uses T/F instead of true/false:

users:@(100):id,name,active,verified
1,Alice,T,T
2,Bob,F,T
3,Carol,T,F

Token savings: ~40% on boolean fields

Tip 3: Null Handling

ZON uses explicit null:

data:@(50):id,value,note
1,100,null
2,null,Missing value
3,200,null

Token savings: none relative to JSON (null costs the same), but the type stays unambiguous.


Advanced Patterns

Multi-Table Structures

```zon
users:@(3):id,name,role
1,Alice,admin
2,Bob,user
3,Carol,user

posts:@(5):authorId,content,id,likes
1,Hello world,101,42
2,My first post,102,15
1,ZON is great,103,89
3,Learning LLMs,104,23
2,Second post,105,31
```

Question: How many posts did each admin user create?
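
A question like this requires the model to join the two tables on `authorId`. To check its answer, the expected result can be computed directly. The sketch below mirrors the two tables as plain arrays; `postsPerAdmin` is an illustrative helper name.

```typescript
interface User { id: number; name: string; role: string; }
interface Post { authorId: number; content: string; id: number; likes: number; }

const users: User[] = [
  { id: 1, name: "Alice", role: "admin" },
  { id: 2, name: "Bob", role: "user" },
  { id: 3, name: "Carol", role: "user" },
];

const posts: Post[] = [
  { authorId: 1, content: "Hello world", id: 101, likes: 42 },
  { authorId: 2, content: "My first post", id: 102, likes: 15 },
  { authorId: 1, content: "ZON is great", id: 103, likes: 89 },
  { authorId: 3, content: "Learning LLMs", id: 104, likes: 23 },
  { authorId: 2, content: "Second post", id: 105, likes: 31 },
];

// Counts posts for each user whose role is "admin", keyed by user name.
function postsPerAdmin(allUsers: User[], allPosts: Post[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const user of allUsers) {
    if (user.role !== "admin") continue;
    counts[user.name] = allPosts.filter((p) => p.authorId === user.id).length;
  }
  return counts;
}

console.log(postsPerAdmin(users, posts)); // { Alice: 2 }
```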

Nested Config + Tables

```zon
config{env:prod,features{beta:F,darkMode:T},version:1.0}
users:@(1000):id,name,lastLogin
...
stats{activeToday:245,avgSessionTime:420,totalUsers:1000}
```

What percentage of users were active today?

Testing LLM Comprehension

Benchmark Your Model

Test with simple queries first:

```zon
test:@(3):id,value
1,100
2,200
3,300
```

1. How many rows? (Answer: 3)
2. What's the sum of values? (Answer: 600)
3. What's the average? (Answer: 200)

If the model gets these right, it is ready for complex queries.
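
This smoke test is easy to automate. The sketch below computes the expected answers from the test values and compares them with the model's answers. It assumes the numeric answers have already been extracted from the model's reply; `expectedAnswers` and `gradeAnswers` are hypothetical helper names.

```typescript
// Expected answers, in question order: row count, sum, average.
function expectedAnswers(values: number[]): number[] {
  const sum = values.reduce((acc, v) => acc + v, 0);
  return [values.length, sum, sum / values.length];
}

// Returns one pass/fail flag per question.
function gradeAnswers(modelAnswers: number[], values: number[]): boolean[] {
  return expectedAnswers(values).map((exp, i) => modelAnswers[i] === exp);
}

const testValues = [100, 200, 300];
console.log(expectedAnswers(testValues)); // [ 3, 600, 200 ]
console.log(gradeAnswers([3, 600, 200], testValues)); // [ true, true, true ]
```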

Common Failure Modes

  1. Counting mismatch: The model miscounts rows.
    • Fix: State the count in the question: "The data has @(N) rows..."
  2. Type confusion: The model treats T as a string, not a boolean.
    • Fix: Add a reminder: "T=true, F=false"
  3. Missing columns: The model assumes a column exists when it does not.
    • Fix: The headers are explicit, so ask the model to check them first.
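
These reminders can be added only when a model has shown the corresponding failure, keeping the default prompt hint-free. A minimal sketch, with `withHints` and its option names as assumptions made for illustration:

```typescript
interface HintOptions {
  explainRowCount?: boolean; // for counting mismatches
  explainBooleans?: boolean; // for T/F type confusion
}

// Prepends a short "Format notes" line only when at least one hint is requested.
function withHints(basePrompt: string, options: HintOptions): string {
  const hints: string[] = [];
  if (options.explainRowCount) hints.push("@(N) means N rows");
  if (options.explainBooleans) hints.push("T=true, F=false");
  if (hints.length === 0) return basePrompt;
  return "Format notes: " + hints.join("; ") + "\n\n" + basePrompt;
}

console.log(withHints("How many users are active?", { explainBooleans: true }));
// Format notes: T=true, F=false
//
// How many users are active?
```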

Model-Specific Tips

GPT-4/GPT-5

  • Works perfectly out of the box
  • No hints needed
  • 100% accuracy on ZON

Claude

  • Also works great
  • Slightly more verbose responses
  • 100% accuracy

Llama Models

  • Works well
  • May need a reminder: "@(N) means N rows"
  • 90%+ accuracy

Complete Example: E-Commerce Query

Here's today's sales data in ZON format:

```zon
orders:@(245):amount,category,customerId,orderId,status
129.99,electronics,C1001,ORD5001,shipped
45.50,books,C1002,ORD5002,pending
89.99,electronics,C1001,ORD5003,shipped
23.99,books,C1003,ORD5004,delivered
199.99,electronics,C1004,ORD5005,shipped
...
```

Questions:
1. How many orders are from customer C1001?
2. What's the total revenue from electronics?
3. How many orders are still pending?
4. What's the average order value?

Please analyze the data and provide numerical answers.

Why this works:

  • Clear format with @(245) count
  • Explicit column headers
  • Self-documenting structure
  • No ambiguity

Comparison: ZON vs TOON vs JSON

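| Aspect | JSON | TOON | ZON |
| --- | --- | --- | --- |
| Token count | 28,042 | 20,988 | 19,995 (Best) |
| LLM accuracy | 91.7% | 100% | 100% (Verified) |
| Hints needed | Sometimes | No | No |
| Self-documenting | No | Yes | Yes |
| Boolean format | true/false | true/false | T/F (Best) |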

Verdict: ZON offers the best balance of compactness and accuracy.
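
The relative savings implied by the token counts in this table can be worked out as follows. The helper name `reductionPercent` is illustrative. The result for ZON vs JSON on these counts is about 28.7%, which differs from the 23.8% quoted earlier in this guide; that figure presumably comes from a different measurement.

```typescript
// Percentage of tokens saved by a candidate format relative to a baseline.
function reductionPercent(baselineTokens: number, candidateTokens: number): number {
  return ((baselineTokens - candidateTokens) / baselineTokens) * 100;
}

// Token counts from the comparison table.
const JSON_TOKENS = 28042;
const TOON_TOKENS = 20988;
const ZON_TOKENS = 19995;

console.log(reductionPercent(JSON_TOKENS, ZON_TOKENS).toFixed(1)); // "28.7"
console.log(reductionPercent(TOON_TOKENS, ZON_TOKENS).toFixed(1)); // "4.7"
```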


Quick Reference

Do's

  • Use code blocks for formatting
  • Include @(N) row counts
  • List column names explicitly
  • Use T/F for booleans
  • Use null for null values

Don'ts

  • Don't explain ZON syntax (show, don't tell)
  • Don't mix formats (stick to ZON)
  • Don't omit row counts
  • Don't use verbose field names unnecessarily
