Exploiting Environment Configurability in Reinforcement Learning

eText | 15 December 2022 | Edition Number 1

At a Glance

eText


$194.40

In recent decades, Reinforcement Learning (RL) has emerged as an effective approach to complex control tasks. In a Markov Decision Process (MDP), the framework typically used, the environment is assumed to be a fixed entity that cannot be altered externally. There are, however, several real-world scenarios in which the environment can be modified to a limited extent. This book, Exploiting Environment Configurability in Reinforcement Learning, formalizes and studies diverse aspects of environment configuration.

In a traditional MDP, the agent perceives the state of the environment and performs actions; as a consequence, the environment transitions to a new state and generates a reward signal. The goal of the agent is to learn a policy, i.e., a prescription of actions that maximizes the long-term reward. Although environment configuration arises quite often in real applications, the topic has received little attention in the literature.

The contributions of the book are theoretical, algorithmic, and experimental, and can be broadly subdivided into three parts. The first part introduces the novel formalism of Configurable Markov Decision Processes (Conf-MDPs) to model the configuration opportunities offered by the environment. The second part focuses on the cooperative Conf-MDP setting and investigates the problem of finding an agent policy and an environment configuration that jointly optimize the long-term reward. The third part addresses two specific applications of the Conf-MDP framework: policy space identification and control frequency adaptation. The book will be of interest to all those using RL as part of their work.
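To make the described interaction concrete, below is a minimal Python sketch of an agent acting in a configurable environment. All names here (ConfigurableEnv, slip_prob, evaluate) are illustrative, not taken from the book: the environment exposes one configuration parameter, a slip probability, and the cooperative setting would search over policies and configurations jointly to maximize the long-term reward.

    # A minimal sketch of the agent-environment loop in a configurable MDP.
    # Class and parameter names are hypothetical, not from the book.
    import random

    class ConfigurableEnv:
        """Toy chain environment whose transition noise is configurable."""

        def __init__(self, n_states=5, slip_prob=0.2):
            self.n_states = n_states
            self.slip_prob = slip_prob  # environment configuration parameter
            self.state = 0

        def reset(self):
            self.state = 0
            return self.state

        def step(self, action):
            # With probability slip_prob, the intended move is inverted.
            move = action if random.random() > self.slip_prob else -action
            self.state = min(max(self.state + move, 0), self.n_states - 1)
            reward = 1.0 if self.state == self.n_states - 1 else 0.0
            return self.state, reward

    def evaluate(policy, slip_prob, episodes=200, horizon=20):
        """Estimate the average return of a policy under a given configuration."""
        total = 0.0
        for _ in range(episodes):
            env = ConfigurableEnv(slip_prob=slip_prob)
            s = env.reset()
            for _ in range(horizon):
                s, r = env.step(policy(s))
                total += r
        return total / episodes

    # Cooperative setting in miniature: jointly pick the configuration that
    # maximizes return for a fixed "always move right" policy.
    policy = lambda s: 1
    best_return, best_config = max(
        (evaluate(policy, p), p) for p in [0.0, 0.1, 0.2, 0.4]
    )
    print(f"best return {best_return:.2f} at slip_prob={best_config}")

In this toy example, the optimal configuration is the one with the least slip, which illustrates the book's point: when the environment is partly under our control, tuning it alongside the policy can improve the achievable long-term reward.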