An Advantage Actor-Critic Deep Reinforcement Learning Method for Power Management in HPC Systems

Khasyah, Fitra Rahmani and Santiyuda, Kadek Gemilang and Kaunang, Gabrie and Makhrus, Faizal and Amrizal, Muhammad Alfian and Takizawa, Hiroyuki (2023) An Advantage Actor-Critic Deep Reinforcement Learning Method for Power Management in HPC Systems. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7-9 December 2022, Sendai.

Full text not available from this repository.

Abstract

A primary concern when deploying a High-Performance Computing (HPC) system is its high energy consumption. A typical HPC system consists of hundreds to thousands of compute nodes that consume a huge amount of electrical power even when idle. One way to increase energy efficiency is to apply the backfilling method to the First Come First Serve (FCFS) job scheduler (FCFS+Backfilling). Backfilling allows jobs that arrive later than the first job in the queue to be executed earlier, provided that the starting time of the first job is not delayed, thereby increasing the throughput and the energy efficiency of the system. Nodes that are idle for a specific amount of time can also be switched off to further improve energy efficiency. However, switching off nodes based only on their idle time can impair energy efficiency and throughput instead of improving them. For example, new jobs may arrive immediately after nodes are switched off, so the chance of directly executing those jobs via backfilling is missed. This paper proposes a Deep Reinforcement Learning (DRL)-based method to predict the most appropriate timing to switch nodes on and off. A DRL agent is trained with the Advantage Actor-Critic algorithm to decide which nodes should be switched on or off at each timestep. Our simulation results on the NASA iPSC/860 historical HPC job dataset show that the proposed method reduces the total energy consumption compared to most conventional timeout policies, which switch off nodes after they have been idle for some period of time.
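
For readers unfamiliar with the method, the following Python sketch illustrates the general shape of an Advantage Actor-Critic (A2C) update for per-node on/off decisions. It is a minimal illustration under assumed details, not the authors' implementation: the state features, network sizes, reward signal, and the per-node Bernoulli action space (N_NODES, STATE_DIM, ActorCritic, select_action, a2c_update) are all hypothetical names introduced here.

# Minimal, illustrative A2C sketch in PyTorch -- NOT the paper's code.
# State features, reward, and the per-node on/off (Bernoulli) action
# space are assumptions made for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_NODES = 8                   # hypothetical cluster size
STATE_DIM = 2 * N_NODES + 2   # e.g., per-node (idle time, power state) + queue stats

class ActorCritic(nn.Module):
    def __init__(self, state_dim, n_nodes, hidden=64):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.actor = nn.Linear(hidden, n_nodes)   # one on/off logit per node
        self.critic = nn.Linear(hidden, 1)        # state-value estimate V(s)

    def forward(self, state):
        h = self.shared(state)
        return torch.sigmoid(self.actor(h)), self.critic(h)

def select_action(model, state):
    # Sample a binary decision for every node (1 = keep on, 0 = switch off).
    probs, _ = model(state)
    return torch.distributions.Bernoulli(probs=probs).sample()

def a2c_update(model, optimizer, state, action, reward, next_state, gamma=0.99):
    # Advantage A(s, a) = r + gamma * V(s') - V(s); the actor is pushed
    # toward actions with positive advantage, the critic toward the TD target.
    probs, value = model(state)
    with torch.no_grad():
        _, next_value = model(next_state)
    target = reward + gamma * next_value
    advantage = (target - value).detach().squeeze()
    log_prob = torch.distributions.Bernoulli(probs=probs).log_prob(action).sum()
    actor_loss = -log_prob * advantage
    critic_loss = F.mse_loss(value, target)
    loss = actor_loss + 0.5 * critic_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

model = ActorCritic(STATE_DIM, N_NODES)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
state = torch.randn(STATE_DIM)    # placeholder for the real scheduler state
action = select_action(model, state)
# The reward could be, e.g., negative energy consumed minus a job-wait
# penalty over the timestep (an assumption; the paper defines its own).
reward = torch.tensor([-1.0])
next_state = torch.randn(STATE_DIM)
a2c_update(model, optimizer, state, action, reward, next_state)

In practice, the reward would be derived from simulated energy consumption and job wait times, and the agent would interact with a scheduling simulator replaying a historical job trace rather than random placeholder states.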

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: Advantage actor-critic; Deep reinforcement learning; Energy consumption; HPC; Power management
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Faculty of Mathematics and Natural Sciences > Computer Science & Electronics Department
Depositing User: Ismu WIDARTO
Date Deposited: 24 Sep 2024 01:29
Last Modified: 24 Sep 2024 01:29
URI: https://ir.lib.ugm.ac.id/id/eprint/7462
