LLM Best Practices
Version: 1.1.0
Guide for maximizing ZON's effectiveness in LLM applications.
Encoding Mode Selection for LLMs
Choose the right ZON mode for your LLM workflow:
Recommended Modes
| Scenario | Mode | Why |
|---|---|---|
| Prompts (GPT-4, Claude) | llm-optimized | Uses true/false for better comprehension |
| High-volume APIs | compact | Maximum token savings |
| RAG context | llm-optimized | Balances clarity and efficiency |
| Function calling | compact | Minimal tokens |
| Human review needed | readable | YAML-like, easy to verify |
Token Impact Examples
Same Data, Different Modes:
const userData = {
users: [
{ id: 1, name: "Alice", active: true, role: "admin" },
{ id: 2, name: "Bob", active: false, role: "user" }
]
};
Compact Mode (38 tokens):
users:@(2):active,id,name,role
T,1,Alice,admin
F,2,Bob,user
LLM-Optimized Mode (42 tokens - 10% more, but clearer):
users:@(2):active,id,name,role
true,1,Alice,admin
false,2,Bob,user
Readable Mode (52 tokens - 37% more, human-friendly):
users:
- active:true
id:1
name:Alice
role:admin
- active:false
id:2
name:Bob
role:user
Use compact for production, llm-optimized for better AI understanding, readable for debugging.
Why ZON for LLMs?
LLM API costs are directly tied to token count. ZON reduces tokens by 23.8% vs JSON while achieving 100% retrieval accuracy.
Key Benefits:
- Lower costs: Fewer tokens = lower API bills
- Better accuracy: 100% vs JSON's 91.7%
- Self-documenting: Explicit headers
@(N):columns - Human-readable: Easy to debug and verify
Sending ZON as Input
Basic Pattern
Wrap ZON data in code blocks with format label:
Here's the user data in ZON format:
```zon
users:@(3):active,id,name,role
T,1,Alice,admin
T,2,Bob,user
F,3,Carol,guest
```
Question: How many active users are there?
Why this works:
- Code blocks prevent formatting issues
zonlabel helps model recognize format- Explicit headers (
@(3):columns) give clear schema
Alternative: No Code Block
For simple queries, code blocks aren't required:
Data:
users:@(3):id,name,active
1,Alice,T
2,Bob,F
3,Carol,T
Question: List all active users.
Prompting Strategies
Strategy 1: Show the Format (No Explanation)
Best approach - Let the model infer the structure:
```zon
products:@(4):category,id,name,price,stock
Electronics,1,Laptop,999,45
Books,2,Python Guide,29.99,120
Electronics,3,Mouse,19.99,200
Books,4,JavaScript Basics,24.95,85
```
Find products with stock below 100.
Why it works: The explicit headers (@(4):category,id,name,price,stock) are self-documenting.
Strategy 2: Minimal Context
For complex queries, add brief context:
Data format: ZON (tabular)
@(N) = row count
Column names listed in header
```zon
logs:@(100):level,message,timestamp,userId
ERROR,Database timeout,2025-01-15T10:30:00Z,1001
WARN,High memory usage,2025-01-15T10:31:15Z,1002
ERROR,API rate limit,2025-01-15T10:32:45Z,1001
...
```
How many ERROR logs are from userId 1001?
Strategy 3: Comparison (When Teaching Format)
If the model hasn't seen ZON before:
Traditional JSON:
```json
{"users": [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]}
```
Same data in compact ZON format:
```zon
users:@(2):id,name
1,Alice
2,Bob
```
Now answer based on this ZON data:
```zon
sales:@(5):amount,date,product,region
1250,2025-01-10,Laptop,West
890,2025-01-11,Mouse,East
...
```
Common Use Cases
1. Data Retrieval Questions
Perfect for ZON - table format excels here:
```zon
employees:@(20):active,department,id,name,salary
T,Engineering,1,Alice Chen,95000
T,Sales,2,Bob Smith,75000
F,Marketing,3,Carol Lee,68000
...
```
Questions:
1. What's the average salary in Engineering?
2. How many inactive employees are there?
3. List all Sales department employees.
2. Aggregation Tasks
```zon
transactions:@(1000):amount,category,date,userId
45.99,groceries,2025-01-10,1001
120.00,electronics,2025-01-10,1002
23.50,groceries,2025-01-11,1001
...
```
Calculate total spending by category for userId 1001.
3. Filtering and Search
```zon
products:@(500):category,inStock,name,price,rating
Electronics,T,Laptop Pro,1299,4.5
Books,F,Python Guide,29.99,4.8
Electronics,T,USB Mouse,19.99,4.2
...
```
Find all in-stock Electronics with rating above 4.0.
4. Structure Awareness
```zon
metadata{deployed:2025-01-15,env:production,version:1.3.0}
users:@(5):id,name,active
1,Alice,T
2,Bob,F
3,Carol,T
4,Dan,T
5,Eve,F
config{database{host:localhost,port:5432},cache{ttl:3600}}
```
Questions:
- What are the top-level keys?
- How many users are in the dataset?
- What's the database port?
Validation and Error Handling
Ask Model to Validate
```zon
users:@(3):id,name,active
1,Alice,T
2,Bob,F
```
Before answering: verify the data has exactly 3 rows as declared.
Then answer: How many users are active?
Handle Missing Data
```zon
products:@(4):id,name,price,stock
1,Laptop,999,45
2,Mouse,19.99,null
3,Keyboard,79.99,0
4,Monitor,299,15
```
Note: `null` means missing value.
Question: Which products have unknown stock levels?
Optimizing Token Usage
Tip 1: Use Compact Field Names
# Good (shorter column names)
u:@(100):id,n,e,a
1,Alice,alice@ex.com,T
2,Bob,bob@ex.com,F
# Acceptable (verbose names)
users:@(100):userId,fullName,emailAddress,isActive
1,Alice,alice@ex.com,true
2,Bob,bob@ex.com,false
Token savings: ~20% with compact names
Tip 2: Boolean Shorthand
ZON uses T/F instead of true/false:
users:@(100):id,name,active,verified
1,Alice,T,T
2,Bob,F,T
3,Carol,T,F
Token savings: ~40% on boolean fields
Tip 3: Null Handling
ZON uses explicit null:
data:@(50):id,value,note
1,100,null
2,null,Missing value
3,200,null
Token savings: Consistent with JSON, but unambiguous type.
Advanced Patterns
Multi-Table Structures
```zon
users:@(3):id,name,role
1,Alice,admin
2,Bob,user
3,Carol,user
posts:@(5):authorId,content,id,likes
1,Hello world,101,42
2,My first post,102,15
1,ZON is great,103,89
3,Learning LLMs,104,23
2,Second post,105,31
```
Question: How many posts did each admin user create?
Nested Config + Tables
```zon
config{env:prod,features{beta:F,darkMode:T},version:1.0}
users:@(1000):id,name,lastLogin
...
stats{activeToday:245,avgSessionTime:420,totalUsers:1000}
```
What percentage of users were active today?
Testing LLM Comprehension
Benchmark Your Model
Test with simple queries first:
```zon
test:@(3):id,value
1,100
2,200
3,300
```
1. How many rows? (Answer: 3)
2. What's the sum of values? (Answer: 600)
3. What's the average? (Answer: 200)
If model gets these right → ready for complex queries!
Common Failure Modes
-
Counting mismatch: Model counts incorrectly
- Fix: Add explicit count in question: "The data has @(N) rows..."
-
Type confusion: Model treats
Tas string not boolean- Fix: Remind: "
T=true,F=false"
- Fix: Remind: "
-
Missing columns: Model assumes column exists
- Fix: Headers are explicit - validate first
Model-Specific Tips
GPT-4/GPT-5
- Works perfectly out of box
- No hints needed
- 100% accuracy on ZON
Claude
- Also works great
- Slightly more verbose responses
- 100% accuracy
Llama Models
- Works well
- Warning: May need reminder: "@(N) means N rows"
- 90%+ accuracy
Complete Example: E-Commerce Query
Here's today's sales data in ZON format:
```zon
orders:@(245):amount,category,customerId,orderId,status
129.99,electronics,C1001,ORD5001,shipped
45.50,books,C1002,ORD5002,pending
89.99,electronics,C1001,ORD5003,shipped
23.99,books,C1003,ORD5004,delivered
199.99,electronics,C1004,ORD5005,shipped
...
```
Questions:
1. How many orders are from customer C1001?
2. What's the total revenue from electronics?
3. How many orders are still pending?
4. What's the average order value?
Please analyze the data and provide numerical answers.
Why this works:
- Clear format with
@(245)count - Explicit column headers
- Self-documenting structure
- No ambiguity
Comparison: ZON vs TOON vs JSON
| Aspect | JSON | TOON | ZON |
|---|---|---|---|
| Token count | 28,042 | 20,988 | 19,995 (Best) |
| LLM accuracy | 91.7% | 100% | 100% (Verified) |
| Hints needed | Sometimes | No | No |
| Self-documenting | No | Yes | Yes |
| Boolean format | true/false | true/false | T/F (Best) |
Verdict: ZON offers best balance of compactness and accuracy.
Quick Reference
Do's
- Use code blocks for formatting
- Include
@(N)row counts - List column names explicitly
- Use
T/Ffor booleans - Use
nullfor null values
Don'ts
- Don't explain ZON syntax (show, don't tell)
- Don't mix formats (stick to ZON)
- Don't omit row counts
- Don't use verbose field names unnecessarily
See also:
- Syntax Cheatsheet - Quick reference
- API Reference - encode/decode functions
- Format Specification - Formal grammar
