We propose HuMoniX, a novel text-to-motion processor that enables real-time human motion generation by integrating two heterogeneous engine clusters. The 12 mm² HuMoniX chip, fabricated in 14 nm technology, operates at 50 to 600 MHz with a supply voltage of 0.63 to 0.94 V. Figure 23.10.6 shows its measurement results and a comparison table. It demonstrates robust…
This paper presents Picasso, an end-to-end diffusion accelerator. Picasso introduces a novel hyper-precision data type and a reconfigurable architecture that can maximize hardware efficiency with extended dynamic range and no compromise in accuracy. It also introduces a unified engine that executes all non-matrix operations in a streamlined processing flow and minimizes the…
This paper presents DPIM, the first 2T1C eDRAM Transformer-in-memory chip. Its high-density eDRAM cell supports large-capacity processing-in-memory (PIM) macros of 1.38 Mb/mm², reducing external memory access. DPIM applies a sparse-aware quantization scheme to all layers of the Transformer, quantizing the model to 8-bit integers (INT8) with a minimal accuracy drop…