Risk-sensitive Reinforcement Learning via Policy Gradient Search

Upcoming tutorial at AAAI, 2023

Tutorial Description

The objective in traditional reinforcement learning (RL) is usually the expected value of a cumulative cost function, which does not account for risk. In this tutorial, we consider risk-sensitive RL in two settings: one where the goal is to find a policy that optimizes the usual expected-value objective while satisfying a risk constraint, and another where the risk measure itself is the objective. We focus on policy gradient search as the solution approach.

The main purpose of this tutorial is thus to introduce and survey research results on policy gradient methods for reinforcement learning with risk-sensitive criteria, as well as to outline some promising avenues for future research within the risk-sensitive RL framework.
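As a flavor of the second setting (risk as the objective), the sketch below applies a likelihood-ratio policy gradient to minimize the Conditional Value-at-Risk (CVaR) of a stochastic cost in a toy one-step problem. The problem, the Gaussian policy, and all constants are illustrative choices for this sketch, not material from the tutorial itself; CVaR is estimated from the worst alpha-fraction of sampled costs.

```python
import numpy as np

def cvar_policy_gradient(num_iters=2000, batch=200, alpha=0.1,
                         sigma=0.5, lr=0.05, seed=0):
    """Minimize CVaR_alpha of a stochastic cost with a Gaussian policy.

    Toy one-step problem (illustrative): action a ~ N(theta, sigma^2),
    cost C(a) = (a - 2)^2 + noise. A likelihood-ratio gradient estimate
    averages score * (cost - VaR) over the worst alpha-fraction of
    sampled costs, then theta is updated by stochastic gradient descent.
    """
    rng = np.random.default_rng(seed)
    theta = 0.0
    for _ in range(num_iters):
        actions = theta + sigma * rng.standard_normal(batch)
        costs = (actions - 2.0) ** 2 + 0.1 * rng.standard_normal(batch)
        var_level = np.quantile(costs, 1.0 - alpha)   # empirical VaR
        tail = costs >= var_level                     # worst alpha-fraction
        score = (actions - theta) / sigma ** 2        # grad_theta log N(theta, sigma^2)
        grad = np.sum(score[tail] * (costs[tail] - var_level)) / (alpha * batch)
        theta -= lr * grad                            # descend the CVaR estimate
    return theta
```

On this toy cost, the iterates settle near theta = 2, the action that minimizes the tail cost. In the constrained setting, the same score-function machinery is typically combined with a Lagrangian relaxation of the risk constraint.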

Tutorial Outline

  • Tutorial Overview

  • Review of MDPs/RL

  • Risk Measures

  • Background

  • Policy Gradient Templates for Risk-sensitive RL

  • MDPs with Risk as the Constraint

  • MDPs with Risk as the Objective

Slides

Coming soon

Presenters

Prashanth L.A. and Michael C. Fu will present this tutorial.

Target Audience

The target audience includes both researchers and practitioners who study and/or use reinforcement learning (RL) in their work and who wish to incorporate risk measures or behavioral considerations into their decision-making process. The background needed for this tutorial can be found in a first course in RL and optimization.

Speaker Bios


Prashanth L.A. is an Assistant Professor in the Department of Computer Science and Engineering at Indian Institute of Technology Madras. His research interests are in reinforcement learning, stochastic optimization and multi-armed bandits, with applications in transportation systems, wireless networks and recommendation systems.


Michael C. Fu holds the Smith Chair of Management Science at the University of Maryland. His research interests include simulation optimization and applied probability, particularly with applications towards supply chain management and financial engineering.