bestcourses is supported by learners. When you buy through links on our website, we may earn an affiliate commission.
Curiosity Driven Deep Reinforcement Learning
How Agents Can Learn In Environments With No Rewards
Created by Phil Tabor, offered on Udemy
To make sure that we score courses properly, we pay a lot of attention to the reviews students leave on courses and how many students are taking a course in the first place. This course has a total of 506 students, who left 34 reviews with an average rating of 4.55, which is average.
We analyze course length to see if courses cover all important aspects of a topic, taking into account how long the course is compared to the category average. This course has a length of 3 hours 46 minutes, which is pretty short. This might not be a bad thing, but we've found that longer courses are often more detailed & comprehensive. The average course length for this entire category is 7 hours 54 minutes.
This course currently has a bestcourses score of 5.3/10, which makes it an average course. Overall, there are probably better courses available for this topic on our platform.
If reinforcement learning is to serve as a viable path to artificial general intelligence, it must learn to cope with environments with sparse or totally absent rewards. Most real life systems provide rewards that only occur after many time steps, leaving the agent with little information to build a successful policy on. Curiosity based reinforcement learning addresses this problem by giving the agent an innate sense of curiosity about its world, enabling it to explore and learn successful policies for navigating the world.
In this advanced course on deep reinforcement learning, motivated students will learn how to implement cutting edge artificial intelligence research papers from scratch. This is a fast paced course for those who are experienced in coding up actor critic agents on their own. We'll code up two papers in this course, using the popular PyTorch framework.
The first paper covers asynchronous methods for deep reinforcement learning, also known as the popular asynchronous advantage actor critic algorithm (A3C). Here students will discover a new framework for learning that doesn't require a GPU. We will learn how to implement multithreading in Python and use that to train multiple actor critic agents in parallel. We will go beyond the basic implementation from the paper and implement a recent improvement to reinforcement learning known as generalized advantage estimation. We will test our agents in the Pong environment from the OpenAI Gym's Atari library, and achieve nearly world class performance in just a few hours.
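As a rough illustration of generalized advantage estimation, here is a minimal sketch in plain Python. It is not taken from the course: the function name `compute_gae`, its arguments, and the default values of `gamma` and `lam` are assumptions chosen for this example. It walks backwards over one rollout, accumulating discounted one-step TD errors.

```python
def compute_gae(rewards, values, last_value, dones, gamma=0.99, lam=0.95):
    """Return one advantage estimate per time step of a rollout.

    rewards[t], values[t] and dones[t] describe step t; last_value is the
    critic's estimate for the state that follows the final step.
    All names and defaults here are illustrative assumptions.
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    for t in reversed(range(len(rewards))):
        # A terminal step must not bootstrap from the following state.
        mask = 0.0 if dones[t] else 1.0
        # One-step temporal-difference error.
        delta = rewards[t] + gamma * next_value * mask - values[t]
        # Exponentially weighted sum of the TD errors that follow step t.
        gae = delta + gamma * lam * mask * gae
        advantages[t] = gae
        next_value = values[t]
    return advantages


if __name__ == "__main__":
    print(compute_gae([1.0, 1.0], [0.0, 0.0], 0.0, [False, True],
                      gamma=0.5, lam=0.5))
```

Setting `lam` to 0 reduces each advantage to the one-step TD error, while values closer to 1 put more weight on longer returns.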
From there, we move on to the heart of the course: learning in environments with sparse or totally absent rewards. This new paradigm leverages the agent's curiosity about the environment as an intrinsic reward that motivates the agent to explore and learn generalizable skills. We'll implement the intrinsic curiosity module (ICM), which is a bolt-on module for any deep reinforcement learning algorithm. We will train and test our agent in a maze-like environment that only yields rewards when the agent reaches the objective. A clear performance gain over the vanilla A3C algorithm will be demonstrated, showing the power of curiosity driven deep reinforcement learning.
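To make the idea of an intrinsic reward concrete, here is a minimal sketch, not the course's implementation. In the ICM, a learned forward model predicts the feature encoding of the next state, and the prediction error becomes the curiosity reward. In this sketch the learned networks are replaced by plain lists of numbers, and the names `intrinsic_reward`, `total_reward` and the scaling factor `eta` are assumptions for illustration.

```python
def intrinsic_reward(predicted_next_features, actual_next_features, eta=0.01):
    """Curiosity reward: scaled squared error of the forward model.

    Both arguments stand in for feature vectors that a real ICM would
    get from learned networks; here they are plain lists of floats.
    """
    squared_error = sum(
        (predicted - actual) ** 2
        for predicted, actual in zip(predicted_next_features,
                                     actual_next_features)
    )
    return 0.5 * eta * squared_error


def total_reward(extrinsic, intrinsic):
    """Reward handed to the agent: environment reward plus curiosity."""
    return extrinsic + intrinsic


if __name__ == "__main__":
    # The environment gives no reward, but a poorly predicted transition
    # still produces a learning signal.
    curiosity = intrinsic_reward([1.0, 2.0], [1.0, 0.0], eta=0.5)
    print(total_reward(0.0, curiosity))
```

Transitions the forward model already predicts well yield a reward near zero, so the agent is pushed toward parts of the environment it has not yet learned to predict.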
Please keep in mind this is a fast paced course for motivated and advanced students. There will be only a very brief review of the fundamental concepts of reinforcement learning and actor critic methods, and from there we will jump right into reading and implementing papers.
The beauty of both the ICM and asynchronous methods is that these paradigms can be applied to nearly any other reinforcement learning algorithm. Both are highly adaptable and can be plugged in with little modification to algorithms like proximal policy optimization, soft actor critic, or deep Q learning.
Students will learn how to:
Implement deep reinforcement learning papers
Leverage multi core CPUs with parallel processing in Python
Code the A3C algorithm from scratch
Code the ICM from first principles
Code generalized advantage estimation
Modify the OpenAI Gym Atari Library
Write extensible modular code
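The asynchronous pattern behind the parallel processing item above can be sketched with the standard library alone. This is a toy example under stated assumptions, not A3C and not the course's code: the class `SharedParameters`, the functions `worker` and `train`, and the quadratic objective are all hypothetical. It uses threads for brevity; in standard CPython builds the global interpreter lock keeps threads from running Python code on several cores at once, so real implementations typically use separate processes instead.

```python
import threading


class SharedParameters:
    """Global weights that every worker reads from and writes to."""

    def __init__(self, size):
        self.weights = [0.0] * size
        self.lock = threading.Lock()
        self.updates_applied = 0

    def snapshot(self):
        # Copy the weights so a worker computes on a consistent view.
        with self.lock:
            return list(self.weights)

    def apply_gradient(self, gradient, learning_rate):
        # One gradient descent step on the shared weights.
        with self.lock:
            for i, g in enumerate(gradient):
                self.weights[i] -= learning_rate * g
            self.updates_applied += 1


def worker(shared, target, steps, learning_rate):
    """Repeatedly sync with the global weights, then push an update."""
    for _ in range(steps):
        local = shared.snapshot()
        # Gradient of the toy loss 0.5 * (w - target)^2 for each weight.
        gradient = [w - t for w, t in zip(local, target)]
        shared.apply_gradient(gradient, learning_rate)


def train(num_workers=4, steps=50, learning_rate=0.1):
    target = [1.0, -2.0]
    shared = SharedParameters(len(target))
    threads = [
        threading.Thread(target=worker,
                         args=(shared, target, steps, learning_rate))
        for _ in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return shared


if __name__ == "__main__":
    result = train()
    print(result.updates_applied, result.weights)
```

Each worker may compute its gradient from a slightly stale copy of the weights, which is the trade-off asynchronous methods accept in exchange for not waiting on each other.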
This course is launching with the PyTorch implementation, with a TensorFlow 2 version coming.
I'll see you on the inside.
What you will learn
- How to Code A3C Agents
- How to Do Parallel Processing in Python
- How to Implement Deep Reinforcement Learning Papers
- How to Code the Intrinsic Curiosity Module
- Experience in coding actor critic agents