Posts
All the posts I've published.
Building a KV Cache Block Scheduler in Rust
Published: at 10:00 AM
A from-scratch PagedAttention-style KV cache block manager in Rust - reference counting, prefix caching via radix trie, LRU eviction, and copy-on-write for beam search.
Elephant VM - stack-based VM written in Rust
Published: at 09:11 AM
An implementation of a simple stack-based VM written in Rust
Memory Optimization Deep Dive: Running 8B Models on a Single 4090 Using vLLM
Published: at 03:01 PM
An exploration of quantization techniques and memory optimization strategies for running Llama 8B models efficiently on consumer hardware using vLLM