We may earn an affiliate commission when you visit our partners.
Martha White and Adam White

In this course, you will learn about several algorithms that can learn near-optimal policies through trial-and-error interaction with the environment, that is, by learning from the agent’s own experience. Learning from actual experience is striking because it requires no prior knowledge of the environment’s dynamics, yet can still attain optimal behavior. We will cover intuitively simple but powerful Monte Carlo methods, and temporal difference learning methods including Q-learning. We will wrap up this course by investigating how we can get the best of both worlds: algorithms that can combine model-based planning (similar to dynamic programming) and temporal difference updates to radically accelerate learning.

By the end of this course you will be able to:

- Understand Temporal-Difference learning and Monte Carlo as two strategies for estimating value functions from sampled experience

- Understand the importance of exploration when using sampled experience rather than dynamic programming sweeps within a model

- Understand the connections between Monte Carlo, Dynamic Programming, and TD

- Implement and apply the TD algorithm for estimating value functions

- Implement and apply Expected Sarsa and Q-learning (two TD methods for control)

- Understand the difference between on-policy and off-policy control

- Understand planning with simulated experience (as opposed to classic planning strategies)

- Implement a model-based approach to RL, called Dyna, which uses simulated experience

- Conduct an empirical study to see the improvements in sample efficiency when using Dyna

What's inside

Syllabus

Welcome to the Course!
Welcome to the second course in the Reinforcement Learning Specialization: Sample-Based Learning Methods, brought to you by the University of Alberta, Onlea, and Coursera. In this pre-course module, you'll be introduced to your instructors, and get a flavour of what the course has in store for you. Make sure to introduce yourself to your classmates in the "Meet and Greet" section!
Monte Carlo Methods for Prediction & Control
This week you will learn how to estimate value functions and optimal policies, using only sampled experience from the environment. This module represents our first step toward incremental learning methods that learn from the agent’s own interaction with the world, rather than a model of the world. You will learn about on-policy and off-policy methods for prediction and control, using Monte Carlo methods, that is, methods that use sampled returns. You will also be reintroduced to the exploration problem, this time in RL more generally, beyond bandits.
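To make "methods that use sampled returns" concrete, here is a minimal sketch of first-visit Monte Carlo prediction. It is not taken from the course materials: the 7-state random walk, the function names, and the parameter values are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict

def generate_episode(policy, start_state=3, n_states=7, rng=random):
    """Run one episode of a hypothetical 1-D random walk.

    States 0 and n_states - 1 are terminal; stepping into the right end pays +1.
    Returns a list of (state, reward) pairs, where the reward is the one
    received on leaving that state.
    """
    state, episode = start_state, []
    while 0 < state < n_states - 1:
        next_state = state + policy(state, rng)
        reward = 1.0 if next_state == n_states - 1 else 0.0
        episode.append((state, reward))
        state = next_state
    return episode

def first_visit_mc(policy, num_episodes=5000, gamma=1.0, seed=0):
    """Estimate v_pi(s) as the average return following the first visit to s."""
    rng = random.Random(seed)
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for _ in range(num_episodes):
        episode = generate_episode(policy, rng=rng)
        # Record the time step of the first visit to each state.
        first_visit = {}
        for t, (s, _) in enumerate(episode):
            first_visit.setdefault(s, t)
        # Walk backwards, accumulating the discounted return G_t.
        g = 0.0
        for t in range(len(episode) - 1, -1, -1):
            s, r = episode[t]
            g = gamma * g + r
            if first_visit[s] == t:
                returns_sum[s] += g
                returns_count[s] += 1
    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}

def random_policy(state, rng):
    """Hypothetical fixed policy: step left or right with equal probability."""
    return rng.choice((-1, 1))

values = first_visit_mc(random_policy)
```

Note that each estimate is only updated once an episode has finished, because the full return must be known first. In this toy walk the estimate for the middle state should settle near 0.5, since the walk is symmetric and only the right end pays a reward.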
Temporal Difference Learning Methods for Prediction
This week, you will learn about one of the most fundamental concepts in reinforcement learning: temporal difference (TD) learning. TD learning combines some of the features of both Monte Carlo and Dynamic Programming (DP) methods. TD methods are similar to Monte Carlo methods in that they can learn from the agent’s interaction with the world, and do not require knowledge of the model. TD methods are similar to DP methods in that they bootstrap, and thus can learn online, without waiting until the end of an episode. You will see how TD can learn more efficiently than Monte Carlo, due to bootstrapping. For this module, we first focus on TD for prediction, and discuss TD for control in the next module. This week, you will implement TD to estimate the value function for a fixed policy, in a simulated domain.
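The bootstrapping idea can be sketched with tabular TD(0), which updates a state's value after every step using the current estimate of the next state. The example below is an illustration only, not the course's assignment code: the small random-walk domain, the step size, and the function name are assumptions.

```python
import random

def td0_random_walk(num_episodes=2000, alpha=0.05, gamma=1.0, n_states=7, seed=0):
    """Tabular TD(0) prediction for an equiprobable random-walk policy.

    Hypothetical domain: states 0 and n_states - 1 are terminal, and
    stepping into the right end yields reward +1 (all other rewards are 0).
    """
    rng = random.Random(seed)
    v = [0.0] * n_states  # terminal states are never updated, so they stay 0
    for _ in range(num_episodes):
        s = n_states // 2
        while 0 < s < n_states - 1:
            s_next = s + rng.choice((-1, 1))
            r = 1.0 if s_next == n_states - 1 else 0.0
            # Bootstrapped target: reward plus discounted estimate of the next state.
            td_error = r + gamma * v[s_next] - v[s]
            v[s] += alpha * td_error  # online update, no waiting for episode end
            s = s_next
    return v

v = td0_random_walk()
```

The update happens inside the episode loop, which is what lets TD learn online. With a constant step size the estimates keep fluctuating slightly around the true values rather than converging exactly.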
Temporal Difference Learning Methods for Control
This week, you will learn about using temporal difference learning for control, as a generalized policy iteration strategy. You will see three different algorithms based on bootstrapping and Bellman equations for control: Sarsa, Q-learning, and Expected Sarsa. You will see some of the differences between the methods for on-policy and off-policy control, and that Expected Sarsa is a unified algorithm for both. You will implement Expected Sarsa and Q-learning on Cliff World.
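The three algorithms differ mainly in the target they bootstrap from. The sketch below writes each target as a small function. The helper names and the epsilon-greedy target policy are assumptions made for illustration, not the course's implementation.

```python
def epsilon_greedy_probs(q_values, epsilon):
    """Action probabilities of an epsilon-greedy policy (ties go to the first maximum)."""
    n = len(q_values)
    probs = [epsilon / n] * n
    best = max(range(n), key=lambda a: q_values[a])
    probs[best] += 1.0 - epsilon
    return probs

def sarsa_target(r, q_next, a_next, gamma):
    """On-policy: uses the action actually selected in the next state."""
    return r + gamma * q_next[a_next]

def q_learning_target(r, q_next, gamma):
    """Off-policy: uses the greedy action in the next state."""
    return r + gamma * max(q_next)

def expected_sarsa_target(r, q_next, gamma, epsilon):
    """Uses the expectation of the next action value under the target policy."""
    probs = epsilon_greedy_probs(q_next, epsilon)
    return r + gamma * sum(p * q for p, q in zip(probs, q_next))

# Hypothetical transition: reward -1, two actions available in the next state.
q_next = [1.0, 3.0]
t_sarsa = sarsa_target(-1.0, q_next, a_next=0, gamma=0.9)
t_q = q_learning_target(-1.0, q_next, gamma=0.9)
t_exp = expected_sarsa_target(-1.0, q_next, gamma=0.9, epsilon=0.1)
t_exp_greedy = expected_sarsa_target(-1.0, q_next, gamma=0.9, epsilon=0.0)
```

In each case the update is Q(s, a) += alpha * (target - Q(s, a)). When the target policy is greedy (epsilon = 0 here), the Expected Sarsa target coincides with the Q-learning target, which is one way to read the statement that Expected Sarsa covers both the on-policy and off-policy cases.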
Planning, Learning & Acting
Up until now, you might think that learning with and without a model are two distinct and, in some ways, competing strategies: planning with Dynamic Programming versus sample-based learning via TD methods. This week we unify these two strategies with the Dyna architecture. You will learn how to estimate the model from data and then use this model to generate hypothetical experience (a bit like dreaming) to dramatically improve sample efficiency compared to sample-based methods like Q-learning. In addition, you will learn how to design learning systems that are robust to inaccurate models.
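A minimal sketch of tabular Dyna-Q shows how the pieces fit together: a direct Q-learning update from real experience, a learned model, and extra planning updates from simulated experience. The deterministic chain environment, the `step` function, and all parameter values below are assumptions for illustration, not the course's environment or code.

```python
import random

def step(state, action, n_states=6):
    """Hypothetical deterministic chain: action 1 moves right, action 0 moves left.

    Reaching the last state pays +1 and ends the episode.
    """
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done

def dyna_q(num_episodes=30, planning_steps=10, alpha=0.5, gamma=0.95,
           epsilon=0.1, n_states=6, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    model = {}  # (state, action) -> (reward, next_state, done)
    steps_per_episode = []

    def update(s, a, r, s_next, done):
        # Q-learning update, used for both real and simulated transitions.
        target = r if done else r + gamma * max(q[s_next])
        q[s][a] += alpha * (target - q[s][a])

    for _ in range(num_episodes):
        s, done, steps = 0, False, 0
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                best = max(q[s])
                a = rng.choice([i for i in range(2) if q[s][i] == best])
            s_next, r, done = step(s, a, n_states)
            update(s, a, r, s_next, done)        # direct RL from real experience
            model[(s, a)] = (r, s_next, done)    # model learning
            for _ in range(planning_steps):      # planning with simulated experience
                ps, pa = rng.choice(list(model))
                pr, ps_next, pdone = model[(ps, pa)]
                update(ps, pa, pr, ps_next, pdone)
            s, steps = s_next, steps + 1
        steps_per_episode.append(steps)
    return q, steps_per_episode

q, steps = dyna_q()
```

Setting `planning_steps=0` reduces this sketch to plain Q-learning, so comparing `steps_per_episode` for the two settings is one simple way to examine the sample-efficiency claim. Because the model here simply stores the last observed transition, it is only reliable when the environment is deterministic and does not change.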

Good to know

Know what's good, what to watch for, and possible dealbreakers
Examines model-based and model-free approaches to reinforcement learning
Teaches foundational concepts in RL, including TD learning, Monte Carlo methods, and Dyna
Provides hands-on experience implementing TD and Q-learning algorithms
Taught by experienced RL researchers with a strong track record in the field
Focuses on practical applications, with examples from simulated domains
Assumes some prior knowledge of RL concepts and Python programming

Reviews summary

Sample-based learning methods

According to learners, this well-received course has largely positive reviews and is part of a similarly well-reviewed reinforcement learning specialization. Engaging assignments are paired with a well-structured textbook to help you build a strong foundation in sample-based learning methods. Assignments are challenging but fair, and provide valuable hands-on experience.
Short but clear and concise lectures
"Enlightening explanations, well-structured content and challenging assignments. Very engaging course I thoroughly enjoyed!"
"Concepts and methods introduced in this course are well motivated and presented. The assignments are very thoughtfully designed."
Excellent textbook assigned with the course
"Excellent course that naturally extends the first specialization course. The application examples in programming are very good and I loved how RL gets closer and closer to how a living being thinks."
"This is an excellent course in reinforcement learning. They provide a PDF for a textbook which is very clear and readable, and the lectures do a great job at reinforcing the concepts. The programming assignments are pretty interesting as well."
A lot of engaging and challenging assignments
"Very well prepared and interesting course!"
"Excellent paced course that helped me understand sample based methods. Assignments were thoroughly build to practically utilize these concepts"
Instructors do not interact in discussions and responses are rare
"The instructors basically completely absent."
"Having issues or problems? They don't bother. Not a single reply from either instructor in the forums for months or years."
Difficult exams with difficult questions

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Sample-based Learning Methods with these activities:
Review basic probability and statistics concepts
Refresh your knowledge of probability and statistics to strengthen your foundation for understanding reinforcement learning concepts.
  • Review probability distributions
  • Practice solving basic statistical problems
Review basic linear algebra concepts
Refresh your knowledge of linear algebra to enhance your understanding of the mathematical underpinnings of reinforcement learning.
  • Review vector spaces and matrices
  • Practice solving systems of linear equations
Participate in online discussion forums
Engage in discussions with peers to clarify concepts, share insights, and get feedback on your understanding of the course material.
  • Join online discussion forums
  • Participate in discussions by asking questions
  • Provide thoughtful responses to others' questions
Practice coding with Q-learning and Expected Sarsa
Practice implementing and applying temporal difference learning algorithms to better understand how they work and reinforce your understanding of the concepts.
  • Implement the Q-learning algorithm
  • Implement the Expected Sarsa algorithm
  • Test both algorithms on a simple environment
Compile a list of resources for further learning
Gather and organize a collection of resources such as articles, videos, or websites that can supplement your learning and provide additional insights into the course topics.
  • Search for relevant resources
  • Organize and categorize the resources
  • Share the compilation with others
Explore applications of Monte Carlo and Temporal Difference Learning
Research and learn about practical applications of Monte Carlo and Temporal Difference Learning to gain a broader perspective on their usefulness and potential impact.
  • Review real-world applications of Monte Carlo methods
  • Explore case studies and examples of Temporal Difference Learning in practice
Create a simulation-based RL project
Develop a project that demonstrates your understanding of simulation-based RL techniques and showcases your ability to apply them to solve a real-world problem.
  • Define the problem and environment
  • Design and implement the simulation
  • Apply RL algorithms to the simulated environment
  • Evaluate and analyze the results

Career center

Learners who complete Sample-based Learning Methods will develop knowledge and skills that may be useful to these careers:
Data Scientist
Data Scientists use their knowledge of statistics, machine learning, and data analysis to extract insights from data. They work in a variety of industries, including finance, healthcare, and retail. The University of Alberta's Sample-based Learning Methods course can provide you with the skills you need to succeed in this role by teaching you about the fundamental concepts of machine learning, including temporal difference learning, Monte Carlo methods, and model-based reinforcement learning. This knowledge will help you to develop and deploy machine learning models that can be used to solve complex problems and make better decisions.
Machine Learning Engineer
Machine Learning Engineers are responsible for designing, developing, and deploying machine learning models for a wide range of applications. These models can be used to automate tasks, improve decision-making, and gain insights from data. The University of Alberta's Sample-based Learning Methods course can provide you with the skills you need to succeed in this role by teaching you about the fundamental concepts of machine learning, including temporal difference learning, Monte Carlo methods, and model-based reinforcement learning. This knowledge will help you to develop and deploy machine learning models that are efficient, accurate, and robust.
Business Analyst
Business Analysts use their knowledge of business processes and data analysis to help organizations improve their performance. They work in a variety of industries, including finance, healthcare, and manufacturing. The University of Alberta's Sample-based Learning Methods course can provide you with the skills you need to succeed in this role by teaching you about the fundamental concepts of machine learning, including temporal difference learning, Monte Carlo methods, and model-based reinforcement learning. This knowledge will help you to develop and deploy machine learning models that can be used to improve efficiency, reduce costs, and make better decisions.
Operations Research Analyst
Operations Research Analysts use mathematical and statistical models to solve complex problems in a variety of industries, including logistics, manufacturing, and healthcare. The University of Alberta's Sample-based Learning Methods course can provide you with the skills you need to succeed in this role by teaching you about the fundamental concepts of machine learning, including temporal difference learning, Monte Carlo methods, and model-based reinforcement learning. This knowledge will help you to develop and deploy machine learning models that can be used to improve efficiency, reduce costs, and make better decisions.
Software Engineer
Software Engineers design, develop, and maintain software applications. They work in a variety of industries, including finance, healthcare, and technology. The University of Alberta's Sample-based Learning Methods course can provide you with the skills you need to succeed in this role by teaching you about the fundamental concepts of machine learning, including temporal difference learning, Monte Carlo methods, and model-based reinforcement learning. This knowledge will help you to develop software applications that are efficient, reliable, and user-friendly.
Financial Analyst
Financial Analysts analyze financial data to make investment recommendations. They work in a variety of financial institutions, including banks, hedge funds, and asset management companies. The University of Alberta's Sample-based Learning Methods course can provide you with the skills you need to succeed in this role by teaching you about the fundamental concepts of machine learning, including temporal difference learning, Monte Carlo methods, and model-based reinforcement learning. This knowledge will help you to develop and deploy machine learning models that can be used to identify investment opportunities and make better decisions.
Risk Analyst
Risk Analysts identify, assess, and mitigate risks for organizations. They work in a variety of industries, including finance, insurance, and healthcare. The University of Alberta's Sample-based Learning Methods course can provide you with the skills you need to succeed in this role by teaching you about the fundamental concepts of machine learning, including temporal difference learning, Monte Carlo methods, and model-based reinforcement learning. This knowledge will help you to develop and deploy machine learning models that can be used to identify and mitigate risks.
Product Manager
Product Managers are responsible for the development and launch of new products and features. They work in a variety of industries, including technology, finance, and healthcare. The University of Alberta's Sample-based Learning Methods course can provide you with the skills you need to succeed in this role by teaching you about the fundamental concepts of machine learning, including temporal difference learning, Monte Carlo methods, and model-based reinforcement learning. This knowledge will help you to develop and launch products that meet the needs of users and achieve business goals.
Consultant
Consultants provide advice and guidance to organizations on a variety of topics, including business strategy, operations, and technology. The University of Alberta's Sample-based Learning Methods course can provide you with the skills you need to succeed in this role by teaching you about the fundamental concepts of machine learning, including temporal difference learning, Monte Carlo methods, and model-based reinforcement learning. This knowledge will help you to develop and deploy machine learning models that can be used to solve complex problems and make better decisions.
Data Engineer
Data Engineers design, build, and maintain data pipelines that collect, store, and process data. They work in a variety of industries, including finance, healthcare, and technology. The University of Alberta's Sample-based Learning Methods course can provide you with the skills you need to succeed in this role by teaching you about the fundamental concepts of machine learning, including temporal difference learning, Monte Carlo methods, and model-based reinforcement learning. This knowledge will help you to develop and deploy machine learning models that can be used to improve the efficiency and accuracy of data pipelines.
Quantitative Analyst
Quantitative Analysts use mathematical and statistical models to analyze financial data and make investment decisions. They work in a variety of financial institutions, including banks, hedge funds, and asset management companies. The University of Alberta's Sample-based Learning Methods course can provide you with the skills you need to succeed in this role by teaching you about the fundamental concepts of machine learning, including temporal difference learning, Monte Carlo methods, and model-based reinforcement learning. This knowledge will help you to develop and deploy machine learning models that can be used to identify investment opportunities and make better decisions.
Actuary
Actuaries use mathematical and statistical models to assess and manage risk for insurance companies and other financial institutions. The University of Alberta's Sample-based Learning Methods course can provide you with the skills you need to succeed in this role by teaching you about the fundamental concepts of machine learning, including temporal difference learning, Monte Carlo methods, and model-based reinforcement learning. This knowledge will help you to develop and deploy machine learning models that can be used to assess and manage risk more effectively.
Policy Analyst
Policy Analysts develop and evaluate policies for governments and organizations. The University of Alberta's Sample-based Learning Methods course can provide you with the skills you need to succeed in this role by teaching you about the fundamental concepts of machine learning, including temporal difference learning, Monte Carlo methods, and model-based reinforcement learning. This knowledge will help you to develop and evaluate policies that are effective and evidence-based.
Teacher
Teachers educate students in a variety of subjects, including science, math, and social studies. The University of Alberta's Sample-based Learning Methods course can provide you with the skills you need to succeed in this role by teaching you about the fundamental concepts of machine learning, including temporal difference learning, Monte Carlo methods, and model-based reinforcement learning. This knowledge will help you to develop and deliver lesson plans that are engaging and effective.
Researcher
Researchers conduct original research in a variety of fields, including science, engineering, and medicine. The University of Alberta's Sample-based Learning Methods course can provide you with the skills you need to succeed in this role by teaching you about the fundamental concepts of machine learning, including temporal difference learning, Monte Carlo methods, and model-based reinforcement learning. This knowledge will help you to develop and deploy machine learning models that can be used to solve complex problems and make new discoveries.

Reading list

We've selected nine books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Sample-based Learning Methods.
Provides a comprehensive overview of deep reinforcement learning, covering both the theoretical foundations and practical applications. It is a valuable resource for anyone interested in learning more about this field.
Provides a comprehensive overview of reinforcement learning, covering both the theoretical foundations and practical applications. It is a valuable resource for anyone interested in learning more about this field.
This textbook provides a comprehensive overview of machine learning, covering both the theoretical foundations and practical applications. It is a valuable resource for anyone interested in learning more about this field.
This textbook provides a comprehensive overview of deep learning, covering both the theoretical foundations and practical applications. It is a valuable resource for anyone interested in learning more about this field.
This textbook provides a comprehensive overview of probabilistic graphical models, which are a powerful tool for representing and reasoning about uncertainty. It is a valuable resource for anyone interested in learning more about this field.
This textbook provides a comprehensive overview of information theory, inference, and learning algorithms. It is a valuable resource for anyone interested in learning more about these fields.
This textbook provides a comprehensive overview of Bayesian reasoning and machine learning. It is a valuable resource for anyone interested in learning more about these fields.
Provides a comprehensive overview of Markov decision processes, which are a fundamental mathematical framework for reinforcement learning. It is a valuable resource for anyone interested in understanding the theoretical foundations of reinforcement learning.
Provides a gentle introduction to machine learning, making it a good choice for those who are new to the field. It covers a wide range of topics, including reinforcement learning.

Similar courses

Here are nine courses similar to Sample-based Learning Methods.
Decision Making and Reinforcement Learning
Introduction to Reinforcement Learning in Python
Monte Carlo Simulation Fundamentals
Forecast Answers to Agile Team Questions
Creating Project Contingencies
Advanced Bayesian Statistics Using R
Greeks, American Options and Volatility
Understanding Algorithms for Reinforcement Learning
Bayesian Inference with MCMC

© 2016 - 2024 OpenCourser