bestcourses is supported by learners. When you buy through links on our website, we may earn an affiliate commission.

Curiosity Driven Deep Reinforcement Learning

How Agents Can Learn In Environments With No Rewards

4.55 / 5.0
506 students
3 hours 46 minutes

Created by Phil Tabor, offered on Udemy

bestcourses score™

Student feedback


To make sure that we score courses properly, we pay a lot of attention to the reviews students leave on courses and how many students are taking a course in the first place. This course has a total of 506 students, who left 34 reviews with an average rating of 4.55, which is average.

Course length


We analyze course length to see if courses cover all important aspects of a topic, taking into account how long the course is compared to the category average. This course has a length of 3 hours 46 minutes, which is pretty short. This might not be a bad thing, but we've found that longer courses are often more detailed & comprehensive. The average course length for this entire category is 7 hours 54 minutes.

Overall score


This course currently has a bestcourses score of 5.3/10, which makes it an average course. Overall, there are probably better courses available for this topic on our platform.


If reinforcement learning is to serve as a viable path to artificial general intelligence, it must learn to cope with environments with sparse or totally absent rewards. Most real life systems provide rewards only after many time steps, leaving the agent with little information on which to build a successful policy. Curiosity based reinforcement learning solves this problem by giving the agent an innate sense of curiosity about its world, enabling it to explore and learn successful policies for navigating the world.

In this advanced course on deep reinforcement learning, motivated students will learn how to implement cutting edge artificial intelligence research papers from scratch. This is a fast paced course for those who are experienced in coding up actor critic agents on their own. We'll code up two papers in this course, using the popular PyTorch framework.

The first paper covers asynchronous methods for deep reinforcement learning, also known as the popular asynchronous advantage actor critic algorithm (A3C). Here students will discover a new framework for learning that doesn't require a GPU. We will learn how to implement multithreading in Python and use that to train multiple actor critic agents in parallel. We will go beyond the basic implementation from the paper and implement a recent improvement to reinforcement learning known as generalized advantage estimation. We will test our agents in the Pong environment from the OpenAI Gym's Atari library, and achieve nearly world class performance in just a few hours.
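As a rough illustration of generalized advantage estimation, here is a minimal sketch in plain Python rather than PyTorch. It is not the course's code: the function name, argument names, and default values for gamma and lam are assumptions chosen for the example, and it assumes the rollout does not cross an episode boundary.

```python
def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Compute generalized advantage estimates for one rollout.

    rewards[t] and values[t] cover steps 0..T-1; last_value is the
    critic's estimate for the state reached after the final step.
    All names and defaults here are illustrative assumptions.
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    # Walk backwards so each step can reuse the estimate of the step after it.
    for t in reversed(range(len(rewards))):
        # One-step temporal difference error.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


if __name__ == "__main__":
    # A three-step rollout with a single reward at the end.
    print(gae_advantages([0.0, 0.0, 1.0], [0.0, 0.0, 0.0], 0.0, gamma=0.5, lam=1.0))
```

Setting lam to 0 reduces each advantage to the one-step TD error, while lam set to 1 gives the full discounted return minus the value baseline; values in between trade bias against variance.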

From there, we move on to the heart of the course: learning in environments with sparse or totally absent rewards. This new paradigm leverages the agent's curiosity about the environment as an intrinsic reward that motivates the agent to explore and learn generalizable skills. We'll implement the intrinsic curiosity module (ICM), which is a bolt-on module for any deep reinforcement learning algorithm. We will train and test our agent in a maze like environment that only yields rewards when the agent reaches the objective. A clear performance gain over the vanilla A3C algorithm will be demonstrated, conclusively showing the power of curiosity driven deep reinforcement learning.
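To make the idea of an intrinsic reward concrete, the sketch below computes a curiosity bonus as the scaled squared error between the feature vector a forward model predicted for the next state and the features actually observed. This follows the general form used in the ICM paper, but it is only a sketch: the learned encoder and forward model are left out, the vectors are plain Python lists, and the function names and the eta default are assumptions made for the example.

```python
def intrinsic_reward(predicted_next_features, next_features, eta=0.01):
    """Curiosity bonus: (eta / 2) * squared prediction error.

    predicted_next_features: what a (hypothetical) forward model expected.
    next_features: the encoded features of the state actually reached.
    """
    squared_error = sum(
        (p - q) ** 2 for p, q in zip(predicted_next_features, next_features)
    )
    return 0.5 * eta * squared_error


def total_reward(extrinsic, predicted_next_features, next_features, eta=0.01):
    """Reward the agent trains on: environment reward plus curiosity bonus."""
    return extrinsic + intrinsic_reward(predicted_next_features, next_features, eta)


if __name__ == "__main__":
    # No environment reward, but the forward model was surprised.
    print(total_reward(0.0, [1.0, 2.0], [1.0, 0.0], eta=0.5))
```

Transitions the forward model predicts poorly earn a larger bonus, so the agent is drawn toward unfamiliar parts of the environment even when the environment itself pays nothing.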

Please keep in mind this is a fast paced course for motivated and advanced students. There will be only a very brief review of the fundamental concepts of reinforcement learning and actor critic methods, and from there we will jump right into reading and implementing papers.

The beauty of both the ICM and asynchronous methods is that these paradigms can be applied to nearly any other reinforcement learning algorithm. Both are highly adaptable and can be plugged in with little modification to algorithms like proximal policy optimization, soft actor critic, or deep Q learning.

Students will learn how to:

  • Implement deep reinforcement learning papers

  • Leverage multi core CPUs with parallel processing in Python

  • Code the A3C algorithm from scratch

  • Code the ICM from first principles

  • Code generalized advantage estimation

  • Modify the OpenAI Gym Atari library

  • Write extensible modular code

This course is launching with the PyTorch implementation, with a TensorFlow 2 version coming.

I'll see you on the inside.

What you will learn

  • How to Code A3C Agents
  • How to Do Parallel Processing in Python
  • How to Implement Deep Reinforcement Learning Papers
  • How to Code the Intrinsic Curiosity Module


Requirements

  • Experience in coding actor critic agents
Available on Udemy


With almost 200,000 courses and close to 50 million students, Udemy is one of the most visited online learning platforms. Popular topics include software development and the digital economy, as well as more traditional topics like cooking and music.

Frequently asked questions

  • Price: $109.99
  • Platform: Udemy
  • Language: English
  • Length: 3 hours 46 minutes

bestcourses score: 5.3/10

There might be better courses available for this topic.