Advanced Topics in Custom Computing

Custom computing, which uses hardware and software specialized for particular application domains such as AI, is a key approach to further improving the performance and efficiency of computer systems. This course first covers the computer architecture fundamentals needed for custom computing, then the hardware, software, and system-aware algorithm techniques used to realize advanced applications.


Course Schedule

(1) Introduction to Custom Computing: Fundamentals of computer architecture, energy consumption of computation, alternative computing devices beyond CPUs
(2) Parallel Processing: Multicore, SIMD, GPUs, NPUs, supercomputers, interconnection networks
(3) Memory Systems and Performance Models: Cache memory, DRAM, von Neumann bottleneck, roofline model
(4) Hardware Algorithms: Filters, systolic arrays
(5) FPGA Computing: FPGA architecture, high-level synthesis (HLS) compilers
(6-7) AI Hardware: AI chips, NPU architecture, low-precision computation
(8-9) AI Systems: LLM inference, LLM serving, KV cache, Flash Attention, speculative decoding, chunked prefill
(10-11) Computing-in-Memory/Processing-in-Memory: SRAM-based CIM, ReRAM-based CIM, DRAM-based PIM, processing using DRAM
(12) Distributed AI Systems: Federated learning (FL), centralized FL, decentralized FL, FL systems
(13) Secure Computing: Trusted execution environments (TEE), main memory encryption and integrity verification