MindIR Compact (MIC)
MIC is a family of compact serialization formats for Mind IR graphs, designed for minimal token usage, fast parsing, and deterministic output.
Format Versions
mic@1 (Legacy)
Original text format with explicit node IDs and verbose opcodes.
- Explicit IDs: N0, N1, N2
- Verbose ops: add, matmul, relu
- Bracket syntax: [f32;3,4]
mic@2 (Current)
Next-gen text format with implicit IDs and compact opcodes.
- Implicit IDs by order of appearance
- Compact ops: +, m, r
- Space syntax: f32 3 4
MIC-B v2 (Binary)
Compact binary format with ULEB128 varints.
- ~4x smaller than mic@2 text
- Direct memory mapping possible
- Deterministic byte output
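ULEB128 is the standard unsigned LEB128 varint encoding: each byte carries 7 payload bits, least-significant group first, and the high bit marks that more bytes follow. The following is a minimal sketch of that encoding in general, not the MIC-B implementation; the function names `write_uleb128` and `read_uleb128` are illustrative and do not come from the `mind` crate.

```rust
/// Append `value` to `out` as an unsigned LEB128 varint.
fn write_uleb128(mut value: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (value & 0x7f) as u8; // low 7 bits
        value >>= 7;
        if value == 0 {
            out.push(byte); // last byte: continuation bit clear
            return;
        }
        out.push(byte | 0x80); // more bytes follow
    }
}

/// Decode one ULEB128 varint from the front of `input`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input is truncated or the value does not fit in a u64.
fn read_uleb128(input: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    let mut shift: u32 = 0;
    for (i, &byte) in input.iter().enumerate() {
        if shift >= 64 {
            return None; // more than 10 bytes: overflow
        }
        let chunk = (byte & 0x7f) as u64;
        if shift == 63 && chunk > 1 {
            return None; // only one bit is left in a u64 at this point
        }
        value |= chunk << shift;
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
        shift += 7;
    }
    None // ran out of input before the final byte
}
```

Small values such as node indices and shape dimensions below 128 take a single byte, which is where most of the size saving of a varint-based format comes from.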
Format Detection
The format is detected automatically from the input's magic bytes or header.
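As a rough illustration of how such sniffing can work, the sketch below checks a binary magic prefix first and then the text version header. Everything in it is an assumption made for the example: `DetectedFormat` is a stand-in enum rather than the library's `MicFormat`, `sniff_format` and `has_header` are hypothetical names, and `ASSUMED_MICB_MAGIC` is a placeholder value, since the real MIC-B magic bytes are defined in the specification and not stated here. Only the `mic@1` and `mic@2` text headers are taken from the examples on this page.

```rust
/// Hypothetical stand-in for the library's `MicFormat`.
#[derive(Debug, PartialEq, Eq)]
enum DetectedFormat {
    Mic1,
    Mic2,
    MicB,
    Unknown,
}

/// ASSUMPTION: placeholder magic bytes, not the real MIC-B magic.
const ASSUMED_MICB_MAGIC: &[u8] = b"MICB";

/// True if `data` starts with `tag` and the tag ends at whitespace or at the
/// end of the input, so that e.g. `mic@20` is not mistaken for `mic@2`.
fn has_header(data: &[u8], tag: &[u8]) -> bool {
    data.starts_with(tag)
        && data.get(tag.len()).map_or(true, |b| b.is_ascii_whitespace())
}

/// Guess the format from the first bytes of the input.
fn sniff_format(data: &[u8]) -> DetectedFormat {
    if data.starts_with(ASSUMED_MICB_MAGIC) {
        DetectedFormat::MicB
    } else if has_header(data, b"mic@2") {
        DetectedFormat::Mic2
    } else if has_header(data, b"mic@1") {
        DetectedFormat::Mic1
    } else {
        DetectedFormat::Unknown
    }
}
```

With the library itself, detection is a single call: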
use mind::ir::compact::v2::detect_format;

match detect_format(data) {
    MicFormat::Mic2 => parse_mic2(..),
    MicFormat::MicB => parse_micb(..),
    MicFormat::Mic1 => parse_mic(..),
    _ => Err("unknown format"),
}
Format Comparison
| Format | Tokens | Bytes | Reduction vs JSON (tokens) | Parse Time | Use Case |
|---|---|---|---|---|---|
| JSON | 278 | 1,133 | baseline | 5.31 µs | Legacy interchange |
| TOML | 151 | 607 | 1.8x | 137.06 µs | Config files |
| TOON | 67 | 269 | 4.1x | 2.67 µs | Compact text |
| mic@1 | 52 | 209 | 5.3x | 2.26 µs | Mind IR (text) |
| mic@2 | ~35 | ~140 | ~8x | ~1.8 µs | LLM prompts, git diffs |
| MIC-B v2 | - | ~50 | 22x (bytes) | ~0.5 µs | Storage, network |
Benchmark: 6-node neural network layer (param, matmul, add, relu). See BENCHMARK_RESULTS.md for methodology.
Feature Comparison
| Feature | JSON | TOON | mic@1 | mic@2 | MIC-B |
|---|---|---|---|---|---|
| Human readable | Yes | Yes | Yes | Yes | No |
| Git-friendly | No | Partial | Yes | Yes | No |
| Deterministic | No | No | Yes | Yes | Yes |
| LLM-optimized | No | No | Partial | Yes | N/A |
| Binary format | No | No | No | No | Yes |
| Implicit IDs | No | No | No | Yes | Yes |
Side-by-Side Example
mic@1 (120 bytes)
mic@1
T0 [f16;128,128]
T1 [f16;128]
N0 param "X" T0
N1 param "W" T0
N2 param "b" T1
N3 matmul N0 N1 T0
N4 add N3 N2 T0
N5 relu N4 T0
N6 add N5 N0 T0
O N6
mic@2 (85 bytes)
mic@2
T0 f16 128 128
T1 f16 128
a X T0
p W T0
p b T1
m 0 1
+ 3 2
r 4
+ 5 0
O 6
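The implicit numbering in the mic@2 example can be made explicit with a few lines of code. The sketch below is not the library's parser. It infers a simplified grammar from this one example and assumes that the `mic@2` header, type definitions (`T0`, `T1`, ...) and the output line (`O`) do not consume an ID, while every other line defines the next value. The function name `implicit_ids` is hypothetical.

```rust
/// Pair each value-defining line of a mic@2 listing with its implicit ID.
/// ASSUMPTION: one operation per line; header, `T<n>` and `O` lines get no ID.
fn implicit_ids(text: &str) -> Vec<(usize, &str)> {
    let mut out = Vec::new();
    for line in text.lines().map(str::trim).filter(|l| !l.is_empty()) {
        let first = line.split_whitespace().next().unwrap_or("");
        let is_header = first.starts_with("mic@");
        let is_type = first.len() > 1
            && first.starts_with('T')
            && first[1..].chars().all(|c| c.is_ascii_digit());
        let is_output = first == "O";
        if is_header || is_type || is_output {
            continue;
        }
        // IDs are assigned by order of appearance, starting at 0.
        let id = out.len();
        out.push((id, line));
    }
    out
}
```

Applied to the example, `a X T0` becomes value 0, `m 0 1` becomes value 3, and the final `+ 5 0` becomes value 6, which is the value that `O 6` refers to.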
Key Design Principles
- Token efficiency: Minimize LLM tokens for AI agent workflows
- Git-friendly: One operation per line for clean diffs
- Deterministic: Same graph always produces identical bytes
- Lossless roundtrip: mic@2 ↔ MIC-B ↔ Mind IR
- Security limits: Bounded inputs prevent DoS attacks
Rust API
use mind::ir::compact::v2::{
    parse_mic2, emit_mic2,      // Text format
    parse_micb, emit_micb,      // Binary format
    detect_format, MicFormat,   // Auto-detection
};
// Parse mic@2 text to Graph
let graph = parse_mic2(text)?;
// Emit Graph to mic@2 text
let text = emit_mic2(&graph);
// Parse MIC-B binary to Graph
let graph = parse_micb(&mut cursor)?;
// Emit Graph to MIC-B binary
emit_micb(&graph, &mut writer)?;
// Roundtrip is deterministic
assert_eq!(emit_mic2(&parse_mic2(&text)?), text);
Security Limits
All MIC parsers enforce strict limits to prevent denial-of-service attacks:
- Input size: 10 MB maximum
- Value count: 100,000 maximum
- String count: 1,000,000 maximum (binary)
- Shape dimensions: 32 maximum
- String length: 64 KB maximum (binary)
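A parser can enforce limits like these with cheap checks made before any allocation proportional to the input. The sketch below only illustrates the idea and is not the library's code: the constant names, the `LimitError` enum, and the helper functions are hypothetical, and it assumes "MB" and "KB" mean binary units (1024-based).

```rust
// Limits from the list above. ASSUMPTION: binary units for MB and KB.
const MAX_INPUT_BYTES: usize = 10 * 1024 * 1024;
const MAX_VALUES: usize = 100_000;
const MAX_STRINGS: usize = 1_000_000;
const MAX_SHAPE_DIMS: usize = 32;
const MAX_STRING_BYTES: usize = 64 * 1024;

/// Hypothetical error type naming which limit was exceeded.
#[derive(Debug, PartialEq, Eq)]
enum LimitError {
    InputTooLarge,
    TooManyValues,
    TooManyStrings,
    TooManyDims,
    StringTooLong,
}

/// Shared helper: succeed if `actual` does not exceed `max`.
fn within(actual: usize, max: usize, err: LimitError) -> Result<(), LimitError> {
    if actual <= max { Ok(()) } else { Err(err) }
}

/// Reject oversized inputs before parsing starts.
fn check_input(input: &[u8]) -> Result<(), LimitError> {
    within(input.len(), MAX_INPUT_BYTES, LimitError::InputTooLarge)
}

/// Reject declared counts before allocating tables for them.
fn check_counts(values: usize, strings: usize) -> Result<(), LimitError> {
    within(values, MAX_VALUES, LimitError::TooManyValues)?;
    within(strings, MAX_STRINGS, LimitError::TooManyStrings)
}

/// Reject shapes with too many dimensions.
fn check_shape(dims: &[usize]) -> Result<(), LimitError> {
    within(dims.len(), MAX_SHAPE_DIMS, LimitError::TooManyDims)
}

/// Reject over-long strings (length measured in bytes).
fn check_string(s: &str) -> Result<(), LimitError> {
    within(s.len(), MAX_STRING_BYTES, LimitError::StringTooLong)
}
```

Checking a declared count before reserving memory for it matters most in the binary format, where a few varint bytes could otherwise request a very large allocation.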
Learn More
See the full specifications at star-ga/mind-spec.