Tag: quantization
All the articles with the tag "quantization".
Memory Optimization Deep Dive: Running 8B Models on a Single 4090 using vLLM
Published: at 03:01 PM
An exploration of quantization techniques and memory optimization strategies for running Llama 8B models efficiently on consumer hardware using vLLM.