July 10, 2024 — “LPU: A Latency-Optimized Highly Scalable Processor for Large Language Model Inference,” IEEE Micro, 2024