Congratulations!
We have a paper accepted to the ACM/IEEE International Symposium on Computer Architecture (ISCA) 2025:
“Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization”
- *Minsu Kim, *Seongmin Hong, RyeoWook Ko, Soongyu Choi, Hunjong Lee, Junsoo Kim, Joo-Young Kim, Jongse Park (*equal contribution)