Decision Making and Reinforcement Learning
half-circle
vector

Decision Making and Reinforcement Learning

Highlights

This course is an introduction to sequential decision making and reinforcement learning. We start with a discussion of utility theory to learn how preferences can be represented and modeled for decision making. We first model simple decision problems as multi-armed bandit problems in and discuss several approaches to evaluate feedback. We will then model decision problems as finite Markov decision processes (MDPs), and discuss their solutions via dynamic programming algorithms. We touch on the notion of partial observability in real problems, modeled by POMDPs and then solved by online planning methods. Finally, we introduce the reinforcement learning problem and discuss two paradigms: Monte Carlo methods and temporal difference learning. We conclude the course by noting how the two paradigms lie on a spectrum of n-step temporal difference methods. An emphasis on algorithms and examples will be a key part of this course.

About the Course Provider

Coursera provides access to more than 3000+ courses across a wide variety of subjects in parntership with different universities and organizations.

Course by

  • self
    Self paced
  • dueration
    Duration 47 hours
  • domain
    Domain Data Science & AI
  • subs
    Monthly Subscription
    Course is included in
    1. Starter @ AED 99 + VAT
    2. Professional @ AED 149 + VAT
  • fee
    Buy Now Option not available
  • language
    Language English