intelnav · llama.cpp · p2p · inference · rust
IntelNav: running a 33B model across three 8GB GPUs
Notes on building a peer-to-peer LLM inference network — how the chain is shaped, what crosses the wire, and why a 33B model fits on hardware where no single card can hold even a quarter of it.