mic@2 Text Format
mic@2 is a compact, line-oriented text format for Mind IR graphs, designed for minimal token usage when working with AI agents.
Key Improvements over mic@1
- ~33% token reduction (35 vs 52 tokens) through implicit value IDs
- ~33% size reduction (140 vs 209 bytes)
- Compact opcodes: m, +, r instead of matmul, add, relu
- Space-separated dims: f16 128 128 instead of [f16;128,128]
Quick Example
mic@2
T0 f16 128 128
T1 f16 128
a X T0
p W T0
p b T1
m 0 1
+ 3 2
r 4
+ 5 0
O 6
This represents Y = relu(X @ W + b) + X (a residual block).
Value ID Assignment
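Values are assigned implicit sequential IDs starting at 0, in the order their defining lines (a, p, and node lines) appear. Header, type, and output lines do not consume an ID. For the Quick Example: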
| Line | ID | Description |
|---|---|---|
| a X T0 | 0 | Input tensor X |
| p W T0 | 1 | Weight matrix W |
| p b T1 | 2 | Bias vector b |
| m 0 1 | 3 | X @ W |
| + 3 2 | 4 | (X @ W) + b |
| r 4 | 5 | relu((X @ W) + b) |
| + 5 0 | 6 | relu(...) + X (residual) |
| O 6 | - | Output is value 6 |
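The numbering in the table can be reproduced mechanically. The following is a minimal sketch, not the reference parser: `defines_value` and `assign_value_ids` are hypothetical helper names, and the sketch assumes that every line other than the header, S, T<idx>, O, comment, and blank lines defines exactly one value.

```rust
/// Returns true when a mic@2 line defines a value (argument, parameter, or node).
/// Assumption: header, symbol (S), type (T<idx>), output (O), comment, and
/// blank lines never consume a value ID.
fn defines_value(line: &str) -> bool {
    let first = match line.split_whitespace().next() {
        Some(tok) => tok,
        None => return false, // blank line
    };
    // `T` followed only by digits is a type definition; opcodes such as
    // `t` and `th` are lowercase and therefore not confused with it.
    let is_type = first.len() > 1
        && first.starts_with('T')
        && first[1..].chars().all(|c| c.is_ascii_digit());
    !(first.starts_with("mic@")
        || first.starts_with('#')
        || first == "S"
        || first == "O"
        || is_type)
}

/// Pairs each value-defining line with its implicit ID (0, 1, 2, ...).
fn assign_value_ids(text: &str) -> Vec<(usize, String)> {
    text.lines()
        .filter(|line| defines_value(line))
        .enumerate()
        .map(|(id, line)| (id, line.trim().to_string()))
        .collect()
}
```

Applied to the Quick Example, this yields seven entries, from `(0, "a X T0")` through `(6, "+ 5 0")`, matching the table.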
Line Types
| Prefix | Purpose | Syntax |
|---|---|---|
| mic@2 | Version header | mic@2 |
| S | Symbol declaration | S <name> |
| T<idx> | Type definition | T0 f16 128 128 |
| a | Argument (input) | a <name> T<idx> |
| p | Parameter (weight) | p <name> T<idx> |
| <op> | Node operation | m 0 1 |
| O | Output marker | O <value_id> |
| # | Comment | # Layer 1 |
Opcodes
| Token | Name | Arity | Description |
|---|---|---|---|
| m | Matmul | 2 | Matrix multiplication |
| + | Add | 2 | Element-wise addition |
| - | Sub | 2 | Element-wise subtraction |
| * | Mul | 2 | Element-wise multiplication |
| / | Div | 2 | Element-wise division |
| r | Relu | 1 | ReLU activation |
| s | Softmax | 1 | Softmax (optional axis) |
| sig | Sigmoid | 1 | Sigmoid activation |
| th | Tanh | 1 | Tanh activation |
| gelu | GELU | 1 | GELU activation |
| ln | LayerNorm | 1 | Layer normalization |
| t | Transpose | 1 | Transpose (perm params) |
| rshp | Reshape | 1 | Reshape |
| sum | Sum | 1 | Sum reduction (axis params) |
| mean | Mean | 1 | Mean reduction (axis params) |
| max | Max | 1 | Max reduction (axis params) |
| cat | Concat | N | Concatenate (axis param) |
| split | Split | 1 | Split (axis, count params) |
| gth | Gather | 2 | Gather along axis |
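For tooling that consumes mic@2, the opcode table can be encoded as a simple lookup. This is an illustrative sketch: the `Arity` enum and the `opcode_info` function are hypothetical names, not part of the `mind` crate, and the rows are copied from the table above.

```rust
/// Number of value inputs an opcode accepts.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Arity {
    Fixed(usize),
    Variadic, // `cat` takes N inputs
}

/// Maps a compact token to the (name, arity) pair listed in the opcode table.
/// Returns None for tokens that are not in the table.
fn opcode_info(token: &str) -> Option<(&'static str, Arity)> {
    use Arity::{Fixed, Variadic};
    let info = match token {
        "m" => ("Matmul", Fixed(2)),
        "+" => ("Add", Fixed(2)),
        "-" => ("Sub", Fixed(2)),
        "*" => ("Mul", Fixed(2)),
        "/" => ("Div", Fixed(2)),
        "r" => ("Relu", Fixed(1)),
        "s" => ("Softmax", Fixed(1)),
        "sig" => ("Sigmoid", Fixed(1)),
        "th" => ("Tanh", Fixed(1)),
        "gelu" => ("GELU", Fixed(1)),
        "ln" => ("LayerNorm", Fixed(1)),
        "t" => ("Transpose", Fixed(1)),
        "rshp" => ("Reshape", Fixed(1)),
        "sum" => ("Sum", Fixed(1)),
        "mean" => ("Mean", Fixed(1)),
        "max" => ("Max", Fixed(1)),
        "cat" => ("Concat", Variadic),
        "split" => ("Split", Fixed(1)),
        "gth" => ("Gather", Fixed(2)),
        _ => return None,
    };
    Some(info)
}
```

For example, `opcode_info("m")` returns `Some(("Matmul", Arity::Fixed(2)))`, while an unknown token such as `"conv"` returns `None`.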
Type Syntax
# Type definition: T<idx> <dtype> <dim0> <dim1>...
T0 f16 128 128        # 128x128 float16 matrix
T1 f32 B seq 768      # Batch x sequence x 768
T2 i64 ?              # Dynamic-shape int64

# Supported dtypes
f16, f32, f64         # Floating point
bf16                  # BFloat16
i8, i16, i32, i64     # Signed integers
u8, u16, u32, u64     # Unsigned integers
bool                  # Boolean

# Dimension types
128                   # Fixed dimension
B, seq, hidden        # Symbolic (must declare with S)
?                     # Wildcard/dynamic
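To make the type grammar concrete, here is a minimal parsing sketch for a single type-definition line. `Dim`, `TypeDef`, and `parse_type_line` are hypothetical names chosen for illustration; the sketch assumes trailing comments have already been stripped and does not check that symbolic names were declared with S.

```rust
/// One dimension of a type: fixed size, symbolic name, or `?`.
#[derive(Debug, PartialEq)]
enum Dim {
    Fixed(u64),
    Symbol(String),
    Dynamic,
}

#[derive(Debug, PartialEq)]
struct TypeDef {
    idx: usize,
    dtype: String,
    dims: Vec<Dim>,
}

/// The dtypes listed in the Type Syntax section.
const DTYPES: [&str; 13] = [
    "f16", "f32", "f64", "bf16", "i8", "i16", "i32", "i64", "u8", "u16", "u32", "u64", "bool",
];

/// Parses `T<idx> <dtype> <dim0> <dim1>...` into a TypeDef.
fn parse_type_line(line: &str) -> Result<TypeDef, String> {
    let mut tokens = line.split_whitespace();
    let head = tokens.next().ok_or("empty line")?;
    let idx = head
        .strip_prefix('T')
        .filter(|d| !d.is_empty() && d.chars().all(|c| c.is_ascii_digit()))
        .and_then(|d| d.parse::<usize>().ok())
        .ok_or_else(|| format!("expected T<idx>, found `{}`", head))?;
    let dtype = tokens.next().ok_or("missing dtype")?;
    if !DTYPES.iter().any(|known| *known == dtype) {
        return Err(format!("unknown dtype `{}`", dtype));
    }
    // Remaining tokens are dimensions: `?`, an integer, or a symbolic name.
    let dims = tokens
        .map(|tok| match tok {
            "?" => Dim::Dynamic,
            _ => match tok.parse::<u64>() {
                Ok(n) => Dim::Fixed(n),
                Err(_) => Dim::Symbol(tok.to_string()),
            },
        })
        .collect();
    Ok(TypeDef { idx, dtype: dtype.to_string(), dims })
}
```

Parsing `T1 f32 B seq 768` with this sketch gives two `Dim::Symbol` entries followed by `Dim::Fixed(768)`; an unlisted dtype such as `f128` is rejected.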
Symbolic Dimensions
Declare symbolic dimension names before using them in types:
mic@2
S B
S seq
S hidden
T0 f32 B seq hidden
a X T0
O 0
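The declare-before-use requirement can be checked with a single pass over the text. The sketch below is an assumption-laden illustration rather than the library's behavior: `undeclared_symbols` and `is_type_head` are hypothetical helpers, and any dimension token that is neither `?` nor an integer is treated as a symbolic name.

```rust
use std::collections::HashSet;

/// True for tokens of the form `T<digits>`, i.e. the head of a type definition.
fn is_type_head(tok: &str) -> bool {
    tok.strip_prefix('T')
        .map_or(false, |rest| !rest.is_empty() && rest.chars().all(|c| c.is_ascii_digit()))
}

/// Returns the symbolic dimension names that a type line uses before any
/// `S <name>` line has declared them, in order of first use.
fn undeclared_symbols(text: &str) -> Vec<String> {
    let mut declared: HashSet<&str> = HashSet::new();
    let mut missing: Vec<String> = Vec::new();
    for line in text.lines() {
        let mut tokens = line.split_whitespace();
        match tokens.next() {
            Some("S") => {
                if let Some(name) = tokens.next() {
                    declared.insert(name);
                }
            }
            Some(head) if is_type_head(head) => {
                // Skip the dtype token; everything after it is a dimension.
                for dim in tokens.skip(1) {
                    let symbolic = dim != "?" && dim.parse::<u64>().is_err();
                    if symbolic && !declared.contains(dim) && !missing.iter().any(|m| m == dim) {
                        missing.push(dim.to_string());
                    }
                }
            }
            _ => {}
        }
    }
    missing
}
```

The example above passes with an empty result; removing the `S seq` line would make the function report `seq`.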
Canonicalization Rules
For deterministic output, canonical mic@2 follows these rules:
- Unix line endings (\n)
- Exactly one space between tokens
- No trailing whitespace on lines
- No trailing newline after output line
- Section order: header, symbols, types, values, output
- Comments are not preserved in canonical output
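As an illustration of these rules, the sketch below normalizes mic@2 text at the string level. It is not how `emit_mic2` works internally (the source does not say); `canonicalize` and `section_rank` are hypothetical names, and the sketch assumes that only whole-line comments need to be dropped and that out-of-order sections may be fixed with a stable reorder, which keeps value lines in their relative order and therefore keeps their implicit IDs.

```rust
/// Section rank used for ordering: header, symbols, types, values, output.
fn section_rank(first: &str) -> u8 {
    let is_type = first
        .strip_prefix('T')
        .map_or(false, |rest| !rest.is_empty() && rest.chars().all(|c| c.is_ascii_digit()));
    if first.starts_with("mic@") {
        0
    } else if first == "S" {
        1
    } else if is_type {
        2
    } else if first == "O" {
        4
    } else {
        3 // arguments, parameters, and nodes
    }
}

/// Rewrites mic@2 text so that it follows the canonicalization rules.
fn canonicalize(text: &str) -> String {
    let mut lines: Vec<(u8, String)> = text
        .lines() // handles both \n and \r\n line endings
        .filter_map(|line| {
            let tokens: Vec<&str> = line.split_whitespace().collect();
            let first = *tokens.first()?; // skips blank lines
            if first.starts_with('#') {
                return None; // comments are not preserved
            }
            // Joining with one space also removes trailing whitespace.
            Some((section_rank(first), tokens.join(" ")))
        })
        .collect();
    // sort_by_key is stable, so lines within a section keep their order.
    lines.sort_by_key(|entry| entry.0);
    // join adds no newline after the final (output) line.
    lines
        .into_iter()
        .map(|(_, line)| line)
        .collect::<Vec<_>>()
        .join("\n")
}
```

Running the function on its own output returns the same string, which is the property the canonical form is meant to guarantee.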
Validation Rules
- Type indices must be sequential starting at 0
- Type references must refer to defined types
- Node inputs must reference earlier values (no forward refs)
- Output must reference a valid value
- Opcode arity must match input count
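The rules above can be sketched as a single-pass checker. This is a simplified illustration, not the validator shipped with the library: `validate`, `type_index`, and `fixed_arity` are hypothetical names. Because the source does not show how opcode parameters (axis, perm, count) are written, the sketch checks operands and arity only for the parameter-free opcodes and merely counts the value for all others.

```rust
/// Parses `T<idx>` into its index, or returns None for any other token.
fn type_index(tok: &str) -> Option<usize> {
    let digits = tok.strip_prefix('T')?;
    if digits.is_empty() || !digits.chars().all(|c| c.is_ascii_digit()) {
        return None;
    }
    digits.parse().ok()
}

/// Arity of the parameter-free opcodes; other opcodes are not checked here.
fn fixed_arity(op: &str) -> Option<usize> {
    match op {
        "m" | "+" | "-" | "*" | "/" => Some(2),
        "r" | "sig" | "th" | "gelu" | "ln" => Some(1),
        _ => None,
    }
}

/// Returns Ok(()) or the 1-based line number and message of the first violation.
fn validate(text: &str) -> Result<(), (usize, String)> {
    let (mut num_types, mut num_values) = (0usize, 0usize);
    for (i, line) in text.lines().enumerate() {
        let line_no = i + 1;
        let tokens: Vec<&str> = line.split_whitespace().collect();
        if tokens.is_empty() {
            continue;
        }
        let (first, operands) = (tokens[0], &tokens[1..]);
        if first.starts_with("mic@") || first.starts_with('#') || first == "S" {
            continue;
        }
        if let Some(idx) = type_index(first) {
            // Type indices must be sequential starting at 0.
            if idx != num_types {
                return Err((line_no, format!("expected T{}, found T{}", num_types, idx)));
            }
            num_types += 1;
        } else if first == "O" {
            // Output must reference a value that exists.
            let id = operands.first().and_then(|t| t.parse::<usize>().ok());
            if !id.map_or(false, |id| id < num_values) {
                return Err((line_no, "output must reference a defined value".to_string()));
            }
        } else if first == "a" || first == "p" {
            // `a <name> T<idx>`: the type must already be defined.
            let idx = operands.get(1).and_then(|t| type_index(t));
            if !idx.map_or(false, |idx| idx < num_types) {
                return Err((line_no, "undefined or missing type reference".to_string()));
            }
            num_values += 1;
        } else {
            if let Some(expected) = fixed_arity(first) {
                if operands.len() != expected {
                    return Err((line_no, format!("`{}` expects {} inputs", first, expected)));
                }
                // Inputs must be IDs of earlier values (no forward references).
                for tok in operands {
                    if !tok.parse::<usize>().map_or(false, |id| id < num_values) {
                        return Err((line_no, format!("invalid input `{}`", tok)));
                    }
                }
            }
            num_values += 1;
        }
    }
    Ok(())
}
```

The Quick Example passes this check, while a document whose first type is `T1`, or whose node reads `+ 0 1` when only value 0 exists, is rejected with the offending line number.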
Rust API
use mind::ir::compact::v2::{parse_mic2, emit_mic2, Graph};
// Parse mic@2 text
let graph = parse_mic2(text)?;
// Emit canonical mic@2
let canonical = emit_mic2(&graph);
// Roundtrip is deterministic
assert_eq!(emit_mic2(&parse_mic2(&canonical)?), canonical);
Error Handling
use mind::ir::compact::v2::{parse_mic2, Mic2ParseError};
match parse_mic2(input) {
    Ok(graph) => { /* use graph */ }
    Err(Mic2ParseError { line, message }) => {
        // Report the failing line in a compiler-style diagnostic
        eprintln!("mic@2:{}: error: {}", line, message);
    }
}