• CAST Lab
    Circuits, Architecture, Systems, Technology

    We aim to advance modern computer systems based on specialized hardware in the post-Moore’s law era. We conduct research across hardware design, including computer architecture, VLSI, FPGA, hardware/software co-design, and processing-in-memory, taking a holistic design approach to improve overall system performance. Our current mission is to build a high-performance, scalable computing platform for future AI applications.


AI Accelerators

Machine learning (ML), the study of algorithms that enable artificial intelligence (AI), has become a prominent computing paradigm, revolutionizing how computers handle cognitive tasks based on massive amounts of observed data. With more industries adopting this technology, we face growing demand for hardware support that achieves high-performance...

Multi-FPGA Systems

Cloud computing is rapidly changing how enterprises run their services by offering virtualized computing infrastructure over the internet. The datacenter is the powerhouse behind cloud computing, physically hosting millions of servers, communication cables, and data storage devices. Recently, as the number of services using AI in datacenters is increasing...


Processing-in-Memory

Traditionally, the CPU is the center of a computing system and executes arithmetic and logic operations, while memory is built around it simply to load and store data. Today, owing to technology scaling, the compute unit executes operations faster than the memory unit can load and store the required data. Therefore, the compute unit is no longer the most time-consuming...

Near-Data Processing

Near-data processing (NDP) is another alternative that addresses the expensive data movement problem of the traditional compute-centric model. It refers to augmenting memory or storage with processing power. By placing computing capabilities directly in the memory or storage, data can be processed in place, which significantly reduces data movement...

Selected Publications

Please see the following selected publications to learn more about CAST Lab’s research.

  • A Cloud-Scale Acceleration Architecture, International Symposium on Microarchitecture (MICRO), 2016

  • Toward Accelerating Deep Learning at Scale Using Specialized Logic, Hot Chips: A Symposium on High Performance Chips (HOTCHIPS), 2015

  • A 201.4GOPS 496mW Real-Time Multi-Object Recognition Processor with Bio-Inspired Neural Perception Engine, IEEE Journal of Solid-State Circuits (JSSC), 2010

  • A Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services, International Symposium on Computer Architecture (ISCA), 2014

  • Real-Time Object Recognition with Neuro-Fuzzy Controlled Workload-aware Task Pipelining, IEEE Micro, Vol. 29, No. 6, 2009

Research Partners

Samsung Electronics
SK hynix