2024
62 |
MICRO
TOP-TIER
|
AdapTiV: Sign-Similarity based Image-Adaptive Token Merging for Vision Transformer Acceleration | Architecture |
ACM/IEEE International Symposium on Microarchitecture (MICRO), 2024 | ||
*Seungjae Yoo, *Hangyeol Kim, Joo-Young Kim (*equal contribution) |
61 |
ICCAD
TOP-TIER
|
APINT: A Full-Stack Framework for Acceleration of Privacy-Preserving Inference of Transformers based on Garbled Circuits | Architecture Automation |
ACM/IEEE International Conference on Computer-Aided Design (ICCAD), 2024 | ||
Hyunjun Cho, Jaeho Jeon, Jaehoon Heo, Joo-Young Kim |
60 |
ESSERC
MAJOR
|
A 28nm 4.96 TOPS/W End-to-End Diffusion Accelerator with Reconfigurable Hyper-Precision Unified Non-Matrix Processing Engine | Circuit |
European Solid-State Electronics Research Conference (ESSERC), 2024 | ||
Sungyeob Yoo, Geonwoo Ko, Seri Ham, Seeyeon Kim, Yi Chen, Joo-Young Kim |
59 |
ESSERC
MAJOR
|
DPIM: A 19.36TOPS/W 2T1C eDRAM Transformer-in-Memory Chip with Sparsity-Aware Quantization and Heterogeneous Dense-Sparse Core | Circuit |
European Solid-State Electronics Research Conference (ESSERC), 2024 | ||
Donghyuk Kim, Jae-Young Kim, Hyunjun Cho, Seungjae Yoo, Sukjin Lee, Sungwoong Yune, Hoichang Jeong, Keonhee Park, Ki-soo Lee, Jongchan Lee, Chanheum Han, Gunmo Koo, Yuli Han, Jaejin Kim, Jaemin Kim, Kyuho Lee, Joo-Hyung Chae, Kunhee Cho, Joo-Young Kim |
58 |
HotChips
|
Picasso: An Area/Energy-Efficient End-to-End Diffusion Accelerator with Hyper-Precision Data Type | Circuit |
Hot Chips: A Symposium on High Performance Chips (HOTCHIPS), 2024 | ||
Sungyeob Yoo, Geonwoo Ko, Seri Ham, Seeyeon Kim, Yi Chen, Joo-Young Kim |
57 |
ISCA
TOP-TIER
|
BLESS: Bandwidth Locality Enhanced SMEM Seeding Acceleration for DNA Sequencing | Architecture |
ACM/IEEE International Symposium on Computer Architecture (ISCA), 2024 | ||
*Seunghee Han, *Seungjae Moon, Teokkyu Suh, Jaehoon Heo, Joo-Young Kim (*equal contribution) |
56 |
CICC
MAJOR
|
A 38.5TOPS/W Point Cloud Neural Network Processor with Virtual Pillar Quadtree-based Workload Management for Real-Time Outdoor BEV Detection | Circuit |
IEEE Custom Integrated Circuits Conference (CICC), 2024 | ||
Sukbin Lim, Jaehoon Heo, Jinho Yang, Joo-Young Kim |
55 |
HPCA
TOP-TIER
|
Morphling: A Throughput-Maximized TFHE-based Accelerator using Transform-domain Reuse | Architecture |
IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2024 | ||
Prasetiyo, Adiwena Putra, Joo-Young Kim |
54 |
ASP-DAC
MAJOR
|
ACane: An Efficient FPGA-based Embedded Vision Platform with Accumulation-as-Convolution Packing for Autonomous Mobile Robots | Architecture |
Asia and South Pacific Design Automation Conference (ASP-DAC), 2024 | ||
Jinho Yang, Sungwoong Yune, Sukbin Lim, Donghyuk Kim, Joo-Young Kim |
2023
53 |
MICRO
TOP-TIER
|
Strix: An End-to-End Streaming Architecture with Two-Level Ciphertext Batching for Fully Homomorphic Encryption with Programmable Bootstrapping | Architecture |
IEEE/ACM International Symposium on Microarchitecture (MICRO), 2023 | ||
Adiwena Putra, Prasetiyo, Yi Chen, John Kim, Joo-Young Kim |
52 |
ICCAD
TOP-TIER
|
PRIMO: A Full-Stack Processing-in-DRAM Emulation Framework for Machine Learning Workloads | FPGA Architecture Automation |
IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2023 | ||
Jaehoon Heo, Yongwon Shin, Sangjin Choi, Sungwoong Yune, Jung-Hoon Kim, Hyojin Sung, Youngjin Kwon, Joo-Young Kim |
51 |
ESSCIRC
MAJOR
|
JNPU: A 1.04TFLOPS Joint-DNN Training Processor with Speculative Cyclic Quantization and Triple Heterogeneity on Microarchitecture / Precision / Dataflow | Circuit |
IEEE European Solid-State Circuits Conference (ESSCIRC), 2023 | ||
Je Yang, Sukbin Lim, Sukjin Lee, Jae-Young Kim, Joo-Young Kim |
50 |
HotChips
|
HyperAccel LPU: Accelerating Hyperscale Models for Generative AI | FPGA Architecture HyperAccel |
Hot Chips: A Symposium on High Performance Chips (HOTCHIPS), 2023 | ||
Seungjae Moon, Junsoo Kim, Jung-Hoon Kim, Junseo Cha, Gyubin Choi, Seongmin Hong, Joo-Young Kim |
49 |
VLSI
TOP-TIER
|
SP-PIM: A 22.41TFLOPS/W, 8.81Epochs/Sec Super-Pipelined Processing-In-Memory Accelerator with Local Error Prediction for On-Device Learning | Circuit |
Symposium on VLSI Technology and Circuits (VLSI), 2023 | ||
*Jung-Hoon Kim, *Jaehoon Heo, Wontak Han, Jaeuk Kim, Joo-Young Kim (*equal contribution) |
48 |
CICC
MAJOR
|
A 26.55TOPS/W Explainable AI Processor with Dynamic Workload Allocation and Heat Map Compression/Pruning | Circuit |
IEEE Custom Integrated Circuits Conference (CICC), 2023 | ||
Junsoo Kim, Geonwoo Ko, Ji-Hoon Kim, Changha Lee, Taewoo Kim, Chan-Hyun Yoon, Joo-Young Kim |
47 |
HPCA
TOP-TIER
|
LightTrader: A Standalone AI-enabled High-Frequency Trading System with 16 TFLOPS / 64 TOPS Deep Learning Inference Accelerators | Architecture Rebellions |
IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2023 | ||
Sungyeob Yoo*, Hyunsung Kim*, Jinseok Kim, Sunghyun Park, Joo-Young Kim, Jinwook Oh (*equal contribution) |
2022
46 |
FPT
MAJOR
|
LearningGroup: A Real-Time Sparse Training on FPGA via Learnable Weight Grouping for Multi-Agent Reinforcement Learning | Architecture FPGA |
The International Conference on Field Programmable Technology (FPT), 2022 | ||
Je Yang, Jaeuk Kim, Joo-Young Kim |
45 |
ASSCC
MAJOR
|
A 409.6 GOPS and 204.8 GFLOPS Mixed-Precision Vector Processor System for General-Purpose Machine Learning Acceleration | Circuit |
IEEE Asian Solid-State Circuits Conference (A-SSCC), 2022 | ||
Jung-Hoon Kim, Sukjin Lee, Seungjae Moon, Sungyeob Yoo, Joo-Young Kim |
44 |
MICRO
TOP-TIER
|
DFX: A Low-latency Multi-FPGA Appliance for Accelerating Transformer-based Text Generation | Architecture FPGA Naver |
IEEE/ACM International Symposium on Microarchitecture (MICRO), 2022 | ||
Seongmin Hong, Seungjae Moon, Junsoo Kim, Sungjae Lee, Minsub Kim, Dongsoo Lee, Joo-Young Kim |
43 |
HotChips
|
LightTrader: World's first AI-enabled High-Frequency Trading Solution with 16 TFLOPS / 64 TOPS Deep Learning Inference Accelerators | Architecture Rebellions |
Hot Chips: A Symposium on High Performance Chips (HOTCHIPS), 2022 | ||
Hyunsung Kim*, Sungyeob Yoo*, Jaewan Bae, Kyeongryeol Bong, Yoonho Boo, Karim Charfi, Hyo-Eun Kim, Hyun Suk Kim, Jinseok Kim, Byungjae Lee, Jaehwan Lee, Myeongbo Shim, Sungho Shin, Jeong Seok Woo, Joo-Young Kim, Sunghyun Park, Jinwook Oh (*equal contribution) |
42 |
HotChips
|
DFX: A Low-latency Multi-FPGA Appliance for Accelerating Transformer-based Text Generation | FPGA Architecture Naver |
Hot Chips: A Symposium on High Performance Chips (HOTCHIPS), 2022 | ||
Seongmin Hong, Seungjae Moon, Junsoo Kim, Sungjae Lee, Minsub Kim, Dongsoo Lee, Joo-Young Kim |
41 |
HotChips
|
Trinity: End-to-End In-Database Near-Data Machine Learning Acceleration Platform for Advanced Data Analytics | FPGA Architecture Microsoft Samsung |
Hot Chips: A Symposium on High Performance Chips (HOTCHIPS), 2022 | ||
Ji-Hoon Kim, Seunghee Han, Kwanghyun Park, Soo-Young Ji, Joo-Young Kim |
40 |
FPL
MAJOR
|
FSHMEM: Supporting Partitioned Global Address Space on FPGAs for Large-Scale Hardware Acceleration Infrastructure | FPGA Flapmax |
IEEE International Conference on Field-Programmable Logic and Applications (FPL), 2022 | ||
Yashael Faith Arthanto, David Ojika, Joo-Young Kim |
39 |
FCCM
TOP-TIER
|
A Dual-Mode Similarity Search Accelerator based on Embedding Compression for Online Cross-Modal Image-Text Retrieval | FPGA Amazon |
IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2022 | ||
Yeo-Reum Park, Ji-Hoon Kim, Jaeyoung Do, Joo-Young Kim |
38 |
FCCM
TOP-TIER
|
An Open-Source Shell Generation Framework for High-Performance Design on Multi-Die FPGAs | FPGA |
IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2022 | ||
Gyeongcheol Shin, Junsoo Kim, Joo-Young Kim |
37 |
CICC
MAJOR
|
T-PIM: A 2.21-to-161.08TOPS/W Processing-In-Memory Accelerator for End-to-End On-Device Training | Circuit |
IEEE Custom Integrated Circuits Conference (CICC), 2022 | ||
Jaehoon Heo, Junsoo Kim, Wontak Han, Sukbin Lim, Joo-Young Kim |
2021
36 |
ACSMD
|
A Heterogeneous Vector-Array Architecture with Resource Scheduling for Multi-User/Multi-DNN Workloads | Architecture |
Architecture, Compiler, and System Support for Multi-model DNN Workloads (ACSMD) Workshop, 2021 (MICRO Workshop) | ||
Sungyeob Yoo, Jung-Hoon Kim, Joo-Young Kim |
35 |
FCCM
TOP-TIER
|
Accelerating Large-Scale Nearest Neighbor Search with Computational Storage Device | FPGA Samsung |
IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2021 | ||
Ji-Hoon Kim, Yeo-Reum Park, Jaeyoung Do, Soo-Young Ji, Joo-Young Kim |
34 |
DAC
TOP-TIER
|
FIXAR: A Fixed-Point Deep Reinforcement Learning Platform with Quantization-Aware Training and Adaptive Parallelism | Architecture Automation |
ACM/IEEE Design Automation Conference (DAC), 2021 | ||
Je Yang, Seongmin Hong, Joo-Young Kim |
2020
33 |
VLSI
TOP-TIER
|
Z-PIM: An Energy-Efficient Sparsity-Aware Processing-In-Memory Architecture with Fully-Variable Weight Precision | Circuit |
IEEE Symposium on VLSI Circuits (VLSI), 2020 | ||
Ji-Hoon Kim, Juhyoung Lee, Jinsu Lee, Hoi-Jun Yoo, Joo-Young Kim |
Before 2020
32 |
MICRO
TOP-TIER
|
A Cloud-Scale Acceleration Architecture | Architecture |
International Symposium on Microarchitecture (MICRO), 2016 | ||
Adrian Caulfield, Eric Chung, Andrew Putnam, Hari Angepat, Jeremy Fowers, Michael Haselman, Stephen Heil, Matt Humphrey, Puneet Kaur, Joo-Young Kim, Daniel Lo, Todd Massengill, Kalin Ovtcharov, Michael Papamichael, Lisa Woods, Sitaram Lanka, Derek Chiou, Doug Burger |
31 |
HotChips
|
Toward Accelerating Deep Learning at Scale Using Specialized Logic | Circuit |
Hot Chips: A Symposium on High Performance Chips (HOTCHIPS), 2015 | ||
Kalin Ovtcharov, Olatunji Ruwase, Joo-Young Kim, Jeremy Fowers, Karin Strauss, Eric Chung |
30 |
FCCM
TOP-TIER
|
A Scalable High-Bandwidth Architecture for Lossless Compression on FPGAs | FPGA |
International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2015 | ||
Jeremy Fowers, Joo-Young Kim, Scott Hauck, Doug Burger |
29 |
ISCA
TOP-TIER
|
A Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services | Architecture |
International Symposium on Computer Architecture (ISCA), 2014 | ||
Andrew Putnam, Adrian M. Caulfield, Eric S. Chung, Derek Chiou, Kypros Constantinides, John Demme, Hadi Esmaeilzadeh, Jeremy Fowers, Jan Gray, Michael Haselman, Scott Hauck, Stephen Heil, Amir Hormati, Joo-Young Kim, Sitaram Lanka, James R. Larus, Eric Peterson, Gopi Prashanth, Aaron Smith, Jason Thong, Phillip Yi Xiao, Doug Burger |
28 |
ASAP
|
Energy Efficient Canonical Huffman Encoding | Architecture |
International Conference on Application-specific Systems, Architectures and Processors (ASAP), 2014 | ||
Janarbek Matai, Joo-Young Kim, Ryan Kastner |
27 |
FCCM
TOP-TIER
|
A Scalable Multi-engine Xpress9 Compressor with Asynchronous Data Transfer | FPGA |
International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2014 | ||
Joo-Young Kim, Scott Hauck, Doug Burger |
26 |
CICC
MAJOR
|
Intelligent NoC with Neuro-Fuzzy Bandwidth Regulation for a 51 IP Object Recognition Processor | Circuit |
IEEE Custom Integrated Circuits Conference (CICC), 2010 | ||
Seungjin Lee, Jinwook Oh, Minsu Kim, Junyoung Park, Joonsoo Kwon, Joo-Young Kim, Hoi-Jun Yoo |
25 |
VLSI
TOP-TIER
|
A 1.2mW On-Line Learning Mixed Mode Intelligent Inference Engine for Robust Object Recognition | Circuit |
IEEE Symposium on VLSI Circuits (VLSI), 2010 | ||
Jinwook Oh, Seungjin Lee, Minsu Kim, Joonsoo Kwon, Junyoung Park, Joo-Young Kim, Hoi-Jun Yoo |
24 |
COOLCHIPS
|
A 36 Heterogeneous Core Architecture with Resource-Aware Fine-grained Task Scheduling for Feedback Attention based Object Recognition | Circuit |
IEEE Symposium on Low-Power and High-Speed Chips (COOLCHIPS), 2010 | ||
Seungjin Lee, Jinwook Oh, Minsu Kim, Joonyoung Park, Joonsoo Kwon, Joo-Young Kim, Hoi-Jun Yoo |
Before 2010
23 |
ESSCIRC
MAJOR
|
A 118.4GB/s Multi-Casting Network-on-Chip for Real-Time Object Recognition Processor | Circuit |
IEEE European Solid-State Circuits Conference (ESSCIRC), 2009 | ||
Joo-Young Kim, Kwanho Kim, Minsu Kim, Seungjin Lee, Jinwook Oh, Hoi-Jun Yoo |
22 |
ISLPED
|
A 60fps 496mW Multi-Object Recognition Processor with Workload-Aware Dynamic Power Management | Circuit |
ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED), 2009 | ||
Joo-Young Kim, Seungjin Lee, Jinwook Oh, Minsu Kim, Hoi-Jun Yoo |
21 |
VLSI
TOP-TIER
|
A 22.8GOPS 2.83mW Neuro-fuzzy Object Detection Engine for Fast Multi-Object Recognition | Circuit |
IEEE Symposium on VLSI Circuits (VLSI), 2009 | ||
Minsu Kim, Joo-Young Kim, Seungjin Lee, Jinwook Oh, Hoi-Jun Yoo |
20 |
COOLCHIPS
|
An Energy Efficient Real-Time Object Recognition Processor with Neuro-Fuzzy Controlled Task Pipelining | Circuit |
IEEE Symposium on Low-Power and High-Speed Chips (COOLCHIPS), 2009 | ||
Joo-Young Kim, Minsu Kim, Seungjin Lee, Jinwook Oh, Kwanho Kim, Hoi-Jun Yoo |
19 |
ISSCC
TOP-TIER
|
A 201.4GOPS 496mW Real-Time Multi-Object Recognition Processor with Bio-Inspired Neural Perception Engine | Circuit |
IEEE International Solid-State Circuits Conference (ISSCC), 2009 | ||
Joo-Young Kim, Minsu Kim, Seungjin Lee, Jinwook Oh, Kwanho Kim, Sejong Oh, Jeong-Ho Woo, Donghyun Kim, Hoi-Jun Yoo |
18 |
ASSCC
MAJOR
|
A 66fps 38mW Nearest Neighbor Matching Processor with Hierarchical VQ Algorithm for Real-Time Object Recognition | Circuit |
IEEE Asian Solid-State Circuits Conference (A-SSCC), 2008 | ||
Joo-Young Kim, Kwanho Kim, Seungjin Lee, Minsu Kim, Hoi-Jun Yoo |
17 |
ASSCC
MAJOR
|
A 76.8 GB/s 46 mW Low-latency Network-on-Chip for Real-time Object Recognition Processor | Circuit |
IEEE Asian Solid-State Circuits Conference (A-SSCC), 2008 | ||
Kwanho Kim, Joo-Young Kim, Seungjin Lee, Minsu Kim, Hoi-Jun Yoo |
16 |
ESSCIRC
MAJOR
|
A 211 GOPS/W Dual-Mode Real-Time Object Recognition Processor with Network-on-Chip | Circuit |
IEEE European Solid-State Circuits Conference (ESSCIRC), 2008 | ||
Kwanho Kim, Joo-Young Kim, Seungjin Lee, Minsu Kim, Hoi-Jun Yoo |
15 |
VLSI
TOP-TIER
|
The Brain Mimicking Visual Attention Engine: An 80x60 Digital Cellular Neural Network for Rapid Global Feature Extraction | Circuit |
IEEE Symposium on VLSI Circuits (VLSI), 2008 | ||
Seungjin Lee, Kwanho Kim, Minsu Kim, Joo-Young Kim, Hoi-Jun Yoo |
14 |
DAC
TOP-TIER
|
Vision Platform for Mobile Intelligent Robots Based on 81.6 GOPS Objects Recognition Processor | Automation |
ACM Design Automation Conference (DAC), 2008 | ||
Donghyun Kim, Kwanho Kim, Joo-Young Kim, Seungjin Lee, Hoi-Jun Yoo |
13 |
ISCAS
|
A 0.6pJ/b 3Gb/s/ch Transceiver in 0.18 um CMOS for 10mm On-chip interconnects | Circuit |
IEEE International Symposium on Circuits and Systems (ISCAS), 2008 | ||
Joonsung Bae, Joo-Young Kim, Hoi-Jun Yoo |
12 |
ISSCC
TOP-TIER
|
A 125GOPS 583mW Network-on-Chip Based Parallel Processor with Bio-inspired Visual Attention Engine | Circuit |
IEEE International Solid-State Circuits Conference (ISSCC), 2008 | ||
Kwanho Kim, Seungjin Lee, Joo-Young Kim, Minsu Kim, Donghyun Kim, Jeong-Ho Woo, Hoi-Jun Yoo |
11 |
ASSCC
MAJOR
|
Bitwise Competition Logic for Compact Digital Comparator | Circuit |
IEEE Asian Solid-State Circuits Conference (A-SSCC), 2007 | ||
Joo-Young Kim, Hoi-Jun Yoo |
10 |
ASSCC
MAJOR
|
Implementation of Memory-Centric NoC for 81.6 GOPS Object Recognition Processor | Circuit |
IEEE Asian Solid-State Circuits Conference (A-SSCC), 2007 | ||
Donghyun Kim, Kwanho Kim, Joo-Young Kim, Seungjin Lee, Hoi-Jun Yoo |
9 |
ESSCIRC
MAJOR
|
Visual Image Processing RAM for Fast 2-D Data Location Search | Circuit |
IEEE European Solid-State Circuits Conference (ESSCIRC), 2007 | ||
Joo-Young Kim, Donghyun Kim, Seungjin Lee, Kwanho Kim, Hoi-Jun Yoo |
8 |
CICC
MAJOR
|
An 81.6 GOPS Object Recognition Processor Based on NoC and Visual Image Processing Memory | Circuit |
IEEE Custom Integrated Circuits Conference (CICC), 2007 | ||
Donghyun Kim, Kwanho Kim, Joo-Young Kim, Seungjin Lee, Hoi-Jun Yoo |
7 |
NOCS
|
Solutions for Real Chip Implementation Issues of NoC and Their Application to Memory-Centric NoC | Circuit |
IEEE International Symposium on Networks-on-Chip (NOCS), 2007 | ||
Donghyun Kim, Kwanho Kim, Joo-Young Kim, Seungjin Lee, Hoi-Jun Yoo |
6 |
ISCAS
|
A 372ps 64-bit Adder using Fast Pull-up Logic in 0.18-um CMOS | Circuit |
IEEE International Symposium on Circuits and Systems (ISCAS), 2006 | ||
Joo-Young Kim, Kangmin Lee, Hoi-Jun Yoo |
5 |
ASSCC
MAJOR
|
A TCAM-based Periodic Event Generator for Multi-Node Management in the Body Sensor Network | Circuit |
IEEE Asian Solid-State Circuits Conference (A-SSCC), 2006 | ||
Sungdae Choi, Kyomin Sohn, Jooyoung Kim, Jerald Yoo, Hoi-Jun Yoo |
4 |
ASSCC
MAJOR
|
A 0.6-V, 6.8-uW Embedded SRAM for Ultra-low Power SoC | Circuit |
IEEE Asian Solid-State Circuits Conference (A-SSCC), 2006 | ||
Kyomin Sohn, Sungdae Choi, Jeong-Ho Woo, Jooyoung Kim, Hoi-Jun Yoo |
3 |
ESSCIRC
MAJOR
|
A 24.2-uW Dual-Mode Human Body Communication Controller for Body Sensor Network | Circuit |
IEEE European Solid-State Circuits Conference (ESSCIRC), 2006 | ||
Sungdae Choi, Seong-Jun Song, Kyomin Sohn, Hyejung Kim, Jooyoung Kim, Namjun Cho, Jeong-Ho Woo, Jerald Yoo, Hoi-Jun Yoo |
2 |
CICC
MAJOR
|
A Multi-Nodes Human Body Communication Sensor Network Control Processor | Circuit |
IEEE Custom Integrated Circuits Conference (CICC), 2006 | ||
Sungdae Choi, Seong-Jun Song, Kyomin Sohn, Hyejung Kim, Jooyoung Kim, Namjun Cho, Jeong-Ho Woo, Jerald Yoo, Hoi-Jun Yoo |
1 |
ISWC
|
A Low-power Star-topology Body Area Network Controller for Periodic Data Monitoring Around and Inside the Human Body | Circuit |
IEEE International Symposium on Wearable Computers (ISWC), 2006 | ||
Sungdae Choi, Seong-Jun Song, Kyomin Sohn, Hyejung Kim, Jooyoung Kim, Jerald Yoo, Hoi-Jun Yoo |