# Risk-sensitive Reinforcement Learning via Policy Gradient Search

**Upcoming tutorial at AAAI, 2023**

## Tutorial Description

The objective in traditional reinforcement learning (RL) is usually the expected value of a cost function, which does not account for risk. In this tutorial, we consider risk-sensitive RL in two settings: one where the goal is to find a policy that optimizes the usual expected value objective while ensuring that a risk constraint is satisfied, and the other where a risk measure is itself the objective. We focus on policy gradient search as the solution approach.

Thus, the main purpose of this tutorial is to introduce and survey research results on policy gradient methods for reinforcement learning with risk-sensitive criteria, as well as to outline some promising avenues for future research following the risk-sensitive RL framework.
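As a rough illustration, the two settings described above can be written as optimization problems over a policy parameter. The notation here is an assumption made for this sketch and is not taken from the tutorial material: $\theta$ is the policy parameter, $J(\theta)$ is the usual expected cumulative cost, $G(\theta)$ is a risk measure of the cumulative cost (for instance its variance or conditional value-at-risk), $\kappa$ is a risk tolerance, $\lambda$ is a Lagrange multiplier, and $a_k$ is a step size. The Lagrangian relaxation shown for the constrained case is one common way to handle the constraint, not necessarily the one the tutorial adopts.

```latex
% Setting 1: risk as the constraint (assumed notation)
\min_{\theta} \; J(\theta)
\quad \text{subject to} \quad G(\theta) \le \kappa .

% One common relaxation: descend in \theta and ascend in \lambda \ge 0 on
L(\theta, \lambda) = J(\theta) + \lambda \bigl( G(\theta) - \kappa \bigr) .

% Setting 2: risk as the objective
\min_{\theta} \; G(\theta) .

% Generic policy gradient step for Setting 2, where
% \widehat{\nabla} G(\theta_k) is an estimate of the gradient built from sampled trajectories
\theta_{k+1} = \theta_k - a_k \, \widehat{\nabla} G(\theta_k) .
```

In both settings the practical difficulty lies in estimating the gradient of the risk measure from sample trajectories, since $G(\theta)$ is generally not a plain expectation of the cumulative cost.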

## Tutorial Outline

- Tutorial Overview
- Review of MDPs/RL
- Risk Measures
- Background
- Policy Gradient Templates for Risk-sensitive RL
- MDPs with Risk as the Constraint
- MDPs with Risk as the Objective

## Slides

Coming soon

## Presenters

Prashanth L.A. and Michael Fu will be the presenters of this tutorial.

## Target Audience

The target audience includes both researchers and practitioners who study and/or use reinforcement learning (RL) in their work and who wish to incorporate risk measures or behavioral considerations in their decision-making process. The background needed for this tutorial is covered in a first course in RL and optimization.

## Speaker Bios

**Prashanth L.A.** is an Assistant Professor in the Department of Computer Science and Engineering at the Indian Institute of Technology Madras. His research interests are in reinforcement learning, stochastic optimization and multi-armed bandits, with applications in transportation systems, wireless networks and recommendation systems.

**Michael C. Fu** holds the Smith Chair of Management Science at the University of Maryland. His research interests include simulation optimization and applied probability, with applications to supply chain management and financial engineering.