March 24, 2025. “Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization,” ACM/IEEE International Symposium on Computer Architecture (ISCA), 2025