March 24, 2025 “Hybe: GPU-NPU Hybrid System for Efficient LLM Inference with Million-Token Context Window” ACM/IEEE International Symposium on Computer Architecture (ISCA), 2025