Posts
All the posts I've published.
Elephant VM - stack-based VM written in Rust
Published: at 09:11 AM
An implementation of a simple stack-based VM written in Rust
Memory Optimization Deep Dive: Running 8B Models on a Single 4090 using vLLM
Published: at 03:01 PM
An exploration of quantization techniques and memory optimization strategies for running Llama 8B models efficiently on consumer hardware using vLLM
Data capture for ML endpoints
Published: at 04:01 PM
One approach to adding data capture to your ML endpoints