July 10, 2024 — “LPU: A Latency-Optimized Highly Scalable Processor for Large Language Model Inference,” IEEE Micro, 2024