[ISCA 2025] Seongmin Hong’s paper on Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization is accepted

Congratulations!

We have a paper accepted to the ACM/IEEE International Symposium on Computer Architecture (ISCA) 2025.

“Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization”

  • *Minsu Kim, *Seongmin Hong, RyeoWook Ko, Soongyu Choi, Hunjong Lee, Junsoo Kim, Joo-Young Kim, Jongse Park (*equal contribution)