Migration Guide: PyTorch to MIND

Side-by-side examples showing how PyTorch patterns map to MIND. Every comparison highlights what MIND catches at compile time that PyTorch only finds at runtime.

Tensor Creation

PyTorch

import torch

x = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
w = torch.randn(2, 3)
b = torch.zeros(3)

MIND

// Shape is part of the type — verified at compile time
let x: Tensor<f32, 2, 2> = tensor([[1.0, 2.0], [3.0, 4.0]])
param w: Tensor<f32, 2, 3>
param b: Tensor<f32, 3>

In MIND, tensor shapes are compile-time types. A shape mismatch is a compile error, not a runtime crash.
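To make "shape as part of the type" concrete outside MIND, here is a toy plain-Python sketch that carries a declared shape with the value and validates it at construction time. This is illustrative only; the names are invented and it approximates at runtime what MIND's type checker does at compile time.

```python
# Toy sketch (plain Python, not MIND): attach the declared shape to the
# value and reject mismatched data up front, instead of failing later
# inside some tensor op.
class Tensor:
    def __init__(self, data, shape):
        self.data = data
        self.shape = tuple(shape)
        if self._infer_shape(data) != self.shape:
            raise TypeError(
                f"declared shape {self.shape} does not match data "
                f"{self._infer_shape(data)}"
            )

    @staticmethod
    def _infer_shape(data):
        # Walk the nested lists to recover the actual shape.
        shape = []
        while isinstance(data, list):
            shape.append(len(data))
            data = data[0]
        return tuple(shape)

x = Tensor([[1.0, 2.0], [3.0, 4.0]], shape=(2, 2))  # ok: data is 2x2
```

The difference is *when* the check runs: here it runs when the program executes; in MIND the equivalent check runs when the program compiles.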

Shape Safety

PyTorch

# PyTorch: crashes at RUNTIME
x = torch.randn(32, 784)
w = torch.randn(256, 784)  # Wrong shape!
y = x @ w  # RuntimeError: mat1 and mat2 shapes
           # cannot be multiplied (32x784 and 256x784)

MIND

// MIND: caught at COMPILE TIME
let x: Tensor<f32, ?, 784> = input()
param w: Tensor<f32, 256, 784>       // Wrong shape!
let y = matmul(x, w)
// COMPILE ERROR: matmul inner dimensions
// do not match: 784 != 256
// hint: did you mean Tensor<f32, 784, 256>?

Shape bugs never reach production: the compiler rejects them at build time.
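The rule being enforced is small. A hedged Python sketch of the inner-dimension check (illustrative names, not MIND's actual implementation):

```python
def check_matmul(a_shape, b_shape):
    """Reject a matmul whose inner dimensions disagree, before any data moves."""
    if a_shape[-1] != b_shape[0]:
        raise TypeError(
            f"matmul inner dimensions do not match: {a_shape[-1]} != {b_shape[0]}"
        )
    # Result shape: leading dims of a, trailing dims of b.
    return a_shape[:-1] + b_shape[1:]

check_matmul((32, 784), (784, 256))  # -> (32, 256)
```

Running this check over declared types at compile time, rather than over live tensors at runtime, is the whole trick.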

Autodiff

PyTorch

# Runtime autograd tape
x = torch.randn(32, 784, requires_grad=True)
y = model(x)
loss = criterion(y, target)
loss.backward()  # Builds + walks tape at runtime
optimizer.step()

MIND

// Compile-time gradient generation
@grad
fn train_step(x: Tensor<f32, ?, 784>,
              target: Tensor<f32, ?, 10>,
              lr: f32) -> Tensor<f32, 1> {
    let pred = forward(x)
    let loss = cross_entropy(pred, target)
    for param in parameters {
        param = sub(param, mul_scalar(grad(loss, param), lr))
    }
    loss
}

MIND's @grad compiles gradients at build time. The optimizer sees the full graph and fuses ops.
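For contrast, "builds + walks the tape at runtime" can be sketched in a few lines of plain Python. This is a toy scalar reverse-mode autodiff, not PyTorch's implementation; every name below is illustrative.

```python
# Toy scalar autodiff: each multiply records a backward closure as the
# program runs (the "tape"), then backpropagation walks those closures.
class Var:
    def __init__(self, value):
        self.value = value
        self.grad = 0.0
        self.grad_fn = None  # set when this Var is produced by an op

    def __mul__(self, other):
        out = Var(self.value * other.value)

        def backward(g):
            # Product rule: route the incoming gradient to both inputs.
            self.accumulate(g * other.value)
            other.accumulate(g * self.value)

        out.grad_fn = backward
        return out

    def accumulate(self, g):
        self.grad += g
        if self.grad_fn is not None:
            self.grad_fn(g)

x = Var(3.0)
y = x * x          # the tape entry (y.grad_fn) is created at runtime
y.accumulate(1.0)  # walk it backward: d(x*x)/dx = 2x = 6 at x = 3
```

Because the tape only exists while the program runs, a runtime autograd system cannot see the whole graph ahead of time; compiling gradients at build time is what lets MIND's optimizer fuse across the forward and backward passes.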

Linear Layer

PyTorch

import torch.nn as nn

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, 10)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        return self.fc2(x)

MIND

module mlp

param W1: Tensor<f32, 784, 256>
param b1: Tensor<f32, 256>
param W2: Tensor<f32, 256, 10>
param b2: Tensor<f32, 10>

fn forward(x: Tensor<f32, ?, 784>)
    -> Tensor<f32, ?, 10> {
    let h = relu(add(matmul(x, W1),
                     broadcast(b1, [?, 256])))
    add(matmul(h, W2), broadcast(b2, [?, 10]))
}

Explicit parameter tensors instead of opaque Module objects. Every shape is visible and verified.
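The same forward pass can be written against explicit parameters in plain Python (pure-Python lists for illustration; a real port would use MIND's tensor ops, and none of these helper names come from either framework):

```python
def matmul(x, w):
    # Naive (rows of x) x (columns of w) product over nested lists.
    return [[sum(xi * wi for xi, wi in zip(row, col)) for col in zip(*w)]
            for row in x]

def add_bias(x, b):
    # Broadcast the bias vector across every row, as broadcast(b, [?, n]) does.
    return [[v + bv for v, bv in zip(row, b)] for row in x]

def relu(x):
    return [[max(0.0, v) for v in row] for row in x]

def forward(x, W1, b1, W2, b2):
    # Mirrors the MIND forward above: every parameter is a visible argument.
    h = relu(add_bias(matmul(x, W1), b1))
    return add_bias(matmul(h, W2), b2)
```

With parameters passed explicitly, every shape in the pipeline is visible at the call site, which is exactly what lets a compiler (rather than a runtime) verify them.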

Model Export

PyTorch

# Multiple export paths, each with quirks
torch.onnx.export(model, dummy_input, "model.onnx")
traced = torch.jit.trace(model, dummy_input)
traced.save("model.pt")
# TensorRT, CoreML need separate conversion

MIND

# Single source, multiple targets
mindc build model.mind                 # CPU
mindc build model.mind --target cuda   # GPU
mindc build model.mind --target metal  # Metal
mindc build model.mind --export onnx   # ONNX
# Same semantics. No translation bugs.

No rewrite needed. The compiler handles target-specific code generation from the same source.

What maps, what doesn't

| Feature | Maps? | Notes |
|---|---|---|
| Tensor operations (matmul, conv2d, etc.) | Yes | Full coverage |
| Shape-checked tensors | Yes | Compile-time (vs runtime) |
| Autograd / autodiff | Yes | Compile-time @grad |
| Custom CUDA kernels | Yes | Via CUDA backend |
| Distributed training (DDP) | Yes | NCCL backend |
| Model serving | Yes | Built-in HTTP/gRPC |
| Dynamic computation graphs | No | MIND is static-graph |
| Eager execution / REPL | No | Compiled, not interpreted |
| Python ecosystem (pandas, sklearn) | No | MIND has its own stdlib |
| Pre-trained model zoo (HuggingFace) | No | Requires reimplementation |

Estimated migration effort

| Model complexity | Effort | MIND LOC |
|---|---|---|
| Simple model (linear, MLP) | 1-2 hours | 50-100 lines |
| Medium model (CNN, RNN) | 2-4 hours | 100-300 lines |
| Complex model (Transformer) | 4-8 hours | 300-600 lines |
| Full training pipeline | 1-2 days | 500-1000 lines |

Need help migrating?

Our pilot program includes hands-on migration support. We'll help you port your first model and verify compliance artifacts.

Start a pilot