Institution: Sungkyunkwan University
Advisors: Prof. Woocheol Choi, Prof. Jeonggyu Huh

Overview

This project is based on the paper "Towards Understanding Asynchronous Advantage Actor-Critic: Convergence and Linear Speedup".

The study covered both the theoretical convergence analysis of Asynchronous Advantage Actor-Critic (A3C) and its multiprocessing implementation for distributed reinforcement learning, including the role of CPU cores, threads, and process management. Through this work, I developed a broader interest in distributed computation, which led me to further study MPI-based parallel computing and scientific computing at the 15th KIAS CAC Summer School on Artificial Intelligence & Parallel Computing.

Contributions

  • Reviewed and summarized the convergence analysis of A3C from the reference paper
  • Analyzed the multiprocessing-based implementation of A3C, including shared memory, a shared optimizer, and synchronization mechanisms (see the sketch after this list)
  • Explained system-level concepts such as CPU cores, threads, and process scheduling in the context of distributed reinforcement learning
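
To make the shared-memory and shared-optimizer ideas concrete, here is a minimal sketch assuming a PyTorch-style A3C implementation: the global network's parameters are placed in shared memory via share_memory(), and a SharedAdam variant moves Adam's state tensors into shared memory so all workers update a single set of optimizer statistics. The names (SharedAdam, worker) and the tiny Linear network are illustrative, not taken from the project's actual code, and the optimizer state keys ("step", "exp_avg", "exp_avg_sq") follow torch.optim.Adam's internals, which can differ across PyTorch versions.

```python
import torch
import torch.nn as nn
import torch.multiprocessing as mp


class SharedAdam(torch.optim.Adam):
    """Adam whose per-parameter state tensors live in shared memory,
    so every worker process updates one set of statistics."""

    def __init__(self, params, lr=1e-3):
        super().__init__(params, lr=lr)
        for group in self.param_groups:
            for p in group["params"]:
                state = self.state[p]
                # Pre-create the state Adam would lazily build, then share it.
                # NOTE: these keys mirror torch.optim.Adam internals and may
                # vary across PyTorch versions.
                state["step"] = torch.zeros(1)             # shared step count
                state["exp_avg"] = torch.zeros_like(p)     # first moment
                state["exp_avg_sq"] = torch.zeros_like(p)  # second moment
                for t in state.values():
                    t.share_memory_()


def worker(global_model, optimizer):
    # Hypothetical worker: compute a gradient on a local copy, then apply
    # it to the shared global parameters (Hogwild-style, as in A3C).
    local_model = nn.Linear(4, 2)
    local_model.load_state_dict(global_model.state_dict())
    loss = local_model(torch.randn(8, 4)).pow(2).mean()  # placeholder loss
    optimizer.zero_grad()
    loss.backward()
    for gp, lp in zip(global_model.parameters(), local_model.parameters()):
        gp.grad = lp.grad  # hand local gradients to the global model
    optimizer.step()       # updates shared parameters and shared Adam state


if __name__ == "__main__":
    global_model = nn.Linear(4, 2)
    global_model.share_memory()  # place parameters in shared memory
    optimizer = SharedAdam(global_model.parameters())
    workers = [mp.Process(target=worker, args=(global_model, optimizer))
               for _ in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```

Because both the parameters and the optimizer state are in shared memory, each worker's optimizer.step() is immediately visible to all other workers, which is the asynchronous update mechanism the convergence analysis studies.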

Slides

  • A3C — Overview of Asynchronous Advantage Actor-Critic (A3C) and its convergence analysis
  • A3C Multiprocessing Code — Walkthrough of the Python multiprocessing code implementing A3C with shared memory and a shared optimizer
  • Multiprocessing A3C — Explanation of multiprocessing concepts in A3C, including process creation, locks, and shared counters (illustrated in the sketch after this list)
  • CPU Core & Thread A3C — Background on CPUs, cores, and threads, and their role in running multiple A3C worker processes
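
The process-creation, lock, and shared-counter concepts from the slides can be illustrated with a short standalone sketch using only Python's standard multiprocessing module. The worker loop and names (worker, global_ep) are hypothetical; in an actual A3C run the body of the loop would be an episode of environment interaction rather than a print statement.

```python
import multiprocessing as mp


def worker(worker_id, global_ep, lock, max_ep):
    # Hypothetical worker loop: each process "runs episodes" until the
    # shared episode counter reaches max_ep.
    while True:
        with lock:  # serialize updates so no increment is lost
            if global_ep.value >= max_ep:
                break
            global_ep.value += 1
            ep = global_ep.value
        print(f"worker {worker_id} finished episode {ep}")


if __name__ == "__main__":
    global_ep = mp.Value("i", 0)  # shared integer episode counter
    lock = mp.Lock()              # one lock guards the counter for all workers
    workers = [mp.Process(target=worker, args=(i, global_ep, lock, 20))
               for i in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```

Without the lock, two processes could read the same counter value and both increment it, losing an update; the lock makes the read-check-increment sequence atomic across processes.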