Institution: Sungkyunkwan University
Advisors: Prof. Woocheol Choi and Prof. Jeonggyu Huh
Overview
This project was based on the paper *Towards Understanding Asynchronous Advantage Actor-Critic: Convergence and Linear Speedup*.
The study covered both the theoretical convergence analysis of Asynchronous Advantage Actor-Critic (A3C) and its multiprocessing implementation for distributed reinforcement learning, including the role of CPU cores, threads, and process management. Through this work, I developed a broader interest in distributed computation, which led me to study MPI-based parallel computing and scientific computing further at the 15th KIAS CAC Summer School on Artificial Intelligence & Parallel Computing.
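To make the mechanism concrete, below is a minimal sketch of the asynchronous update pattern, assuming a PyTorch-style implementation: worker processes hold local copies of a global model whose parameters live in shared memory, and push their gradients into it Hogwild-style, without locking. The tiny linear network and the random batch are placeholders, not the paper's actual setup.

```python
import torch
import torch.multiprocessing as mp
import torch.nn as nn


def worker(rank, shared_model, steps=100):
    # Each worker keeps a local copy and periodically syncs with the
    # shared parameters, as A3C workers do with the global network.
    local_model = nn.Linear(4, 2)
    optimizer = torch.optim.SGD(shared_model.parameters(), lr=1e-2)
    for _ in range(steps):
        local_model.load_state_dict(shared_model.state_dict())
        x = torch.randn(8, 4)                 # stand-in for environment states
        loss = local_model(x).pow(2).mean()   # stand-in for the A3C loss
        optimizer.zero_grad()
        loss.backward()
        # Copy the locally computed gradients onto the shared parameters,
        # then apply them to the shared model (no lock: Hogwild-style).
        for shared_p, local_p in zip(shared_model.parameters(),
                                     local_model.parameters()):
            shared_p.grad = local_p.grad
        optimizer.step()


if __name__ == "__main__":
    shared_model = nn.Linear(4, 2)
    shared_model.share_memory()  # move parameters into shared memory
    workers = [mp.Process(target=worker, args=(rank, shared_model))
               for rank in range(4)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
```

In a full A3C implementation each worker would interact with its own environment instance and compute the advantage actor-critic loss; the asynchrony comes from workers stepping the shared parameters at their own pace, which is exactly the regime whose convergence and linear speedup the paper analyzes.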
Contributions
- Reviewed and summarized the convergence analysis of A3C from the reference paper
- Analyzed the multiprocessing-based implementation of A3C, including shared memory, a shared optimizer, and synchronization mechanisms (see the shared-optimizer sketch after this list)
- Explained system-level concepts such as CPU cores, threads, and process scheduling in the context of distributed reinforcement learning (see the worker-pool sketch after this list)
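On the shared-optimizer point: A3C codebases commonly keep not only the model parameters but also the optimizer's moment buffers in shared memory, so that all workers advance the same Adam state. The sketch below shows one way to do this, assuming PyTorch; the zero-gradient warm-up step used to materialize the state is an illustrative trick, not necessarily the exact code reviewed.

```python
# Hypothetical helper: builds an Adam optimizer whose state tensors live
# in shared memory, so forked A3C workers update a single optimizer state.
import torch
import torch.nn as nn


def make_shared_adam(model: nn.Module, lr: float = 1e-4) -> torch.optim.Adam:
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    # Warm-up step with zero gradients: materializes Adam's exp_avg /
    # exp_avg_sq buffers without changing the parameters (the update is
    # exactly zero when every gradient is zero).
    for p in model.parameters():
        p.grad = torch.zeros_like(p)
    optimizer.step()
    optimizer.zero_grad()
    # Move every tensor held in the optimizer state into shared memory.
    for state in optimizer.state.values():
        for value in state.values():
            if torch.is_tensor(value):
                value.share_memory_()
    return optimizer


if __name__ == "__main__":
    model = nn.Linear(4, 2)
    model.share_memory()
    optimizer = make_shared_adam(model)  # pass both to each worker process
```

Sharing the moment buffers matters because Adam's update depends on its accumulated state: if each worker kept private buffers, the effective update would differ from that of a single shared Adam.

On the system-level side, the worker pool is typically sized to the number of logical CPU cores, and a shared counter guarded by a lock tracks the total number of steps taken across all workers (A3C's global counter T). A minimal sketch using Python's standard multiprocessing module, with a placeholder worker body:

```python
# Sizing the worker pool to the machine's cores and coordinating workers
# through a shared counter. CPython's GIL serializes CPU-bound threads,
# which is why A3C implementations favor processes over threads here.
import multiprocessing as mp
import os


def worker(counter, lock, t_max):
    while True:
        with lock:                      # serialize updates to the counter
            if counter.value >= t_max:  # shared global termination test
                return
            counter.value += 1
        # ... one actor-critic rollout and gradient update would go here ...


if __name__ == "__main__":
    n_workers = os.cpu_count() or 1     # one worker per logical core
    counter = mp.Value("i", 0)          # shared integer counter
    lock = mp.Lock()
    procs = [mp.Process(target=worker, args=(counter, lock, 10_000))
             for _ in range(n_workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print("total environment steps:", counter.value)  # prints 10000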
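```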
Slides
- A3C — Overview of Asynchronous Advantage Actor-Critic (A3C) and its convergence analysis
- A3C Multiprocessing Code — Walkthrough of the Python multiprocessing code implementing A3C with shared memory and a shared optimizer
- Multiprocessing A3C — Explanation of multiprocessing concepts in A3C, including process creation, locks, and shared counters
- CPU Core & Thread A3C — Background on CPUs, cores, and threads, and their role in running multiple A3C worker processes