We may earn an affiliate commission when you visit our partners.
Kostas Daniilidis and Jianbo Shi

How can robots perceive the world and their own movements so that they can accomplish navigation and manipulation tasks? In this module, we will study how images and videos acquired by cameras mounted on robots are transformed into representations like features and optical flow. Such 2D representations then allow us to extract 3D information about where the camera is and in which direction the robot moves. You will come to understand how grasping objects is facilitated by computing the 3D pose of objects and how navigation can be accomplished by visual odometry and landmark-based localization.


What's inside

Syllabus

Geometry of Image Formation
Welcome to Robotics: Perception! We will begin this course with a tutorial on the standard camera models used in computer vision. These models allow us to understand, in a geometric fashion, how light from a scene enters a camera and projects onto a 2D image. By defining these models mathematically, we will be able to understand exactly how a point in 3D corresponds to a point in the image and how an image will change as we move a camera in a 3D environment. In the later modules, we will be able to use this information to perform complex perception tasks such as reconstructing 3D scenes from video.
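As a rough illustration of how a 3D point corresponds to an image point, here is a minimal sketch of the pinhole camera model. The function name `project_point`, the focal length of 500 pixels, and the principal point (320, 240) are assumptions chosen for the example, not values taken from the course.

```python
def project_point(point_3d, focal_length, principal_point=(0.0, 0.0)):
    """Project a 3D point given in camera coordinates with the pinhole model.

    u = f * X / Z + cx,  v = f * Y / Z + cy
    """
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    cx, cy = principal_point
    u = focal_length * x / z + cx
    v = focal_length * y / z + cy
    return (u, v)


# Hypothetical camera: f = 500 px, principal point at (320, 240).
before = project_point((1.0, 0.5, 4.0), 500.0, (320.0, 240.0))
# Moving the camera 1 unit forward brings the point closer (Z goes from 4 to 3),
# so its image moves away from the principal point.
after = project_point((1.0, 0.5, 3.0), 500.0, (320.0, 240.0))
print(before)  # (445.0, 302.5)
print(after)
```

The division by Z is what makes the mapping a perspective projection: points farther from the camera land closer to the principal point.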
Projective Transformations
Now that we have a good camera model, we will explore the geometry of perspective projections in depth. We will find that this projection is the cause of the main challenge in perception: we lose a dimension that we can no longer directly observe. In this module, we will study several properties of projective transformations, such as vanishing points, which allow us to infer complex information beyond our basic camera model.
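To make the idea of a vanishing point concrete, the sketch below intersects two image lines using homogeneous coordinates: the line through two points is their cross product, and the intersection of two lines is again a cross product. The helper names and the sample segments are made up for illustration.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def line_through(p, q):
    """Homogeneous line through two image points (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def vanishing_point(segment_a, segment_b):
    """Intersect the image lines supporting two segments.

    Returns None when the lines are parallel in the image, i.e. the
    intersection lies at infinity.
    """
    x, y, w = cross(line_through(*segment_a), line_through(*segment_b))
    if abs(w) < 1e-12:
        return None
    return (x / w, y / w)


# Two hypothetical image segments, assumed to be projections of parallel 3D edges.
vp = vanishing_point(((0.0, 0.0), (2.0, 1.0)), ((0.0, 2.0), (4.0, 3.0)))
print(vp)  # (8.0, 4.0)
```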
Pose Estimation
In this module we will learn about feature extraction and pose estimation from two images. We will learn how to find the most salient parts of an image and track them across multiple frames (i.e. in a video sequence). We will then learn how to use features to find the position of the camera with respect to another reference frame on a plane using homographies. We will also learn how to make these techniques more robust, using least squares to handle noisy feature points or RANSAC to remove completely erroneous feature points.
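The division of labor between RANSAC and least squares can be sketched on a deliberately simpler problem than homography estimation: fitting a 2D line. RANSAC picks the largest consensus set from minimal two-point samples, and least squares then refits on those inliers to average out the remaining noise. The data, threshold, and iteration count below are invented for the example.

```python
import random


def fit_line_least_squares(points):
    """Ordinary least-squares fit of y = a * x + b."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def ransac_line(points, iterations=200, threshold=0.5, seed=0):
    """Return ((a, b), inliers) for the line with the largest consensus set."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        (x1, y1), (x2, y2) = rng.sample(points, 2)  # minimal sample: 2 points
        if x1 == x2:
            continue  # a vertical pair cannot be written as y = a * x + b
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) < threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        raise ValueError("no usable consensus set found")
    return fit_line_least_squares(best_inliers), best_inliers


# Synthetic data: y = 2x + 1 with small alternating noise, plus three gross outliers.
points = [(float(x), 2.0 * x + 1.0 + (0.1 if x % 2 == 0 else -0.1)) for x in range(10)]
points += [(2.0, 15.0), (5.0, -7.0), (8.0, 30.0)]

(slope, intercept), inliers = ransac_line(points)
print(round(slope, 2), round(intercept, 2), len(inliers))
```

For a homography the structure is the same, but the minimal sample is four point correspondences and the model is a 3x3 matrix rather than a slope and intercept.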
Multi-View Geometry
Now we will use what we learned from two-view geometry and extend it to sequences of images, such as a video. We will explain the fundamental geometric constraint between point features in images, the epipolar constraint, and learn how to use it to extract the relative poses between multiple frames. We will finish by combining all this information for the application of Structure from Motion, where we will compute the trajectory of a camera and a map over many frames and refine our estimates using bundle adjustment.
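A small numerical check can make the epipolar constraint less abstract. Assuming the second camera sees a point as X2 = R X1 + t, the essential matrix is E = [t]x R, and normalized image points of the same 3D point satisfy x2^T E x1 = 0. The rotation, translation, and 3D point below are arbitrary values chosen for the sketch.

```python
import math


def skew(t):
    """Skew-symmetric matrix [t]x such that [t]x v = t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def epipolar_residual(x1, x2, essential):
    """Value of x2^T E x1; zero for a geometrically consistent match."""
    ex1 = matvec(essential, x1)
    return sum(x2[i] * ex1[i] for i in range(3))


# Hypothetical relative pose: 10 degree rotation about the y axis plus a translation.
theta = math.radians(10.0)
c, s = math.cos(theta), math.sin(theta)
rotation = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
translation = [1.0, 0.0, 0.2]
essential = matmul(skew(translation), rotation)

point_cam1 = [0.5, -0.3, 4.0]
point_cam2 = [r + t for r, t in zip(matvec(rotation, point_cam1), translation)]
x1 = [v / point_cam1[2] for v in point_cam1]  # normalized image coordinates
x2 = [v / point_cam2[2] for v in point_cam2]

residual_true = epipolar_residual(x1, x2, essential)
x2_wrong = [x2[0], x2[1] + 0.1, 1.0]  # a mismatched feature, off the epipolar line
residual_wrong = epipolar_residual(x1, x2_wrong, essential)
print(residual_true, residual_wrong)
```

The true match gives a residual at the level of floating-point round-off, while the perturbed match does not, which is the property that makes the constraint useful for rejecting bad correspondences.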

Good to know

Know what's good, what to watch for, and possible dealbreakers
Derives equations of geometry of images, a core concept for perception-based robotics
Explores projective transformations, a fundamental aspect of robot perception
Develops feature extraction and pose estimation techniques, essential for robot perception
Teaches multi-view geometry, a key concept for robot navigation and mapping
Provides a strong foundation for students interested in robotics perception


Reviews summary

Robotics: perception course

According to students, Robotics: Perception is a difficult but worthwhile course. Although too few reviews are available to gauge overall sentiment, the positive feedback centers on the course's challenging and rewarding coursework.
The course is rewarding.
"This was by far the best course."
"It is worth studying."
The workload is demanding.
"This course is very difficult and complex."
"The workload is not for everyone."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Robotics: Perception with these activities:
Geometry of Image Formation
Reinforce your understanding of the basic building block of computer vision by applying and practicing the new skills you are learning.
  • Practice using the pinhole and thin lens camera models to map 3D points to 2D image points.
  • Apply your knowledge of camera models to derive the equations for perspective projection.
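For the thin lens part of this practice, a minimal sketch of the thin lens equation 1/f = 1/z_o + 1/z_i may help; it gives the distance behind the lens at which an object comes into focus. The function name and the 50 mm / 2 m numbers are illustrative assumptions.

```python
def image_distance(focal_length, object_distance):
    """Solve the thin lens equation 1/f = 1/z_o + 1/z_i for z_i.

    Both distances are positive and use the same unit. An object at or
    inside the focal length forms no real image, so that case is rejected.
    """
    if object_distance <= focal_length:
        raise ValueError("object must be farther away than the focal length")
    return 1.0 / (1.0 / focal_length - 1.0 / object_distance)


# Hypothetical 50 mm lens focused on an object 2 m (2000 mm) away.
z_i = image_distance(50.0, 2000.0)
print(round(z_i, 3))  # 51.282
```

As the object moves farther away, the image distance approaches the focal length, which is the regime where the pinhole model with the image plane at f is a good approximation.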
Complete coding exercises on projective transformations
Sharpen your understanding of how images transform based on camera movements by completing coding exercises on projective transformations.
  • Review concepts of projective transformations
  • Implement algorithms for projecting points between image planes
  • Test your implementation on real images
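One possible starting point for such an exercise is a function that maps a point from one image plane to another through a 3x3 homography. The matrix below is a made-up example, not one estimated from real images.

```python
def apply_homography(h, point):
    """Map an image point (x, y) through a 3x3 homography given as nested lists."""
    x, y = point
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Dividing by w converts homogeneous coordinates back to pixel coordinates.
    return (xh / w, yh / w)


# Hypothetical homography whose last row introduces perspective foreshortening.
h = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.5, 0.0, 1.0]]
print(apply_homography(h, (2.0, 4.0)))  # (1.0, 2.0)
```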
Projective Transformations
Expand your understanding of how images are distorted and transformed by cameras through practicing with projective geometry.
  • Explore the properties of projective transformations, such as vanishing points and cross-ratios.
  • Apply your knowledge of projective transformations to derive the equations for image rectification.
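The cross-ratio property can be checked numerically on the simplest case, four collinear points parameterized by a single coordinate and mapped through a 1D projective transformation x -> (h11 x + h12) / (h21 x + h22). The points and the matrix entries are arbitrary; exact fractions are used so the comparison does not depend on rounding.

```python
from fractions import Fraction


def cross_ratio(a, b, c, d):
    """Cross-ratio (a, b; c, d) of four collinear points given by 1D coordinates."""
    return ((c - a) * (d - b)) / ((c - b) * (d - a))


def projective_map_1d(x, h):
    """Apply a 1D projective transformation given as a 2x2 matrix."""
    (h11, h12), (h21, h22) = h
    return (h11 * x + h12) / (h21 * x + h22)


points = [Fraction(0), Fraction(1), Fraction(2), Fraction(3)]
h = ((2, 1), (1, 3))  # hypothetical invertible map (determinant 5)
mapped = [projective_map_1d(x, h) for x in points]

ratio_before = cross_ratio(*points)
ratio_after = cross_ratio(*mapped)
print(ratio_before, ratio_after)  # 4/3 4/3
```

Distances and ratios of distances between the points change under the map, but the cross-ratio does not.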
Explore online tutorials on feature extraction
Gain a deeper understanding of feature extraction and how it aids in object recognition.
  • Identify different feature extraction techniques
  • Understand how features are used for object recognition
  • Apply feature extraction algorithms to sample images
Attend a workshop on visual odometry and landmark-based localization
Gain practical experience and insights into visual odometry and landmark-based localization by attending a dedicated workshop.
  • Research and identify relevant workshops
  • Register and attend the workshop
  • Actively participate in discussions and hands-on exercises
Pose Estimation
Develop your skills in estimating the position and orientation of a camera from images by applying the techniques you are learning.
  • Practice extracting salient features from images.
  • Apply your knowledge of feature extraction to track features across multiple frames of a video sequence.
  • Use your skills in feature tracking to estimate the pose of a camera from two images.
Create a visual representation of the Epipolar constraint
Solidify your understanding of the Epipolar constraint by creating a visual representation that demonstrates its geometric relationships.
  • Study the mathematical definition of the Epipolar constraint
  • Develop a visual representation using geometric constructions
  • Present your visual representation and explain its significance
Multi-View Geometry
Deepen your understanding of how to reconstruct 3D scenes from multiple images by creating your own projects.
  • Explain the concept of the epipolar constraint and how it can be used to match features across multiple images.
  • Develop an algorithm for computing the fundamental matrix from a set of corresponding features.
  • Use your knowledge of multi-view geometry to create a 3D reconstruction of a scene from a set of images.
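As a first building block for the reconstruction step, the sketch below triangulates one 3D point from two viewing rays with the midpoint method: it finds the closest points on the two rays and averages them. This is a simple stand-in for illustration, not a full Structure from Motion pipeline, and it assumes the camera centers and ray directions are already known in a common frame.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def triangulate_midpoint(o1, d1, o2, d2):
    """Midpoint of the shortest segment between rays o1 + s*d1 and o2 + t*d2."""
    w = [p - q for p, q in zip(o1, o2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; depth is not observable")
    # Closed-form minimizer of |(o1 + s*d1) - (o2 + t*d2)|^2 over s and t.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [o + s * k for o, k in zip(o1, d1)]
    p2 = [o + t * k for o, k in zip(o2, d2)]
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))


# Hypothetical setup: two camera centers one unit apart, both observing (1, 2, 5).
point = triangulate_midpoint(
    (0.0, 0.0, 0.0), (1.0, 2.0, 5.0),
    (1.0, 0.0, 0.0), (0.0, 2.0, 5.0),
)
print(point)
```

With noisy feature matches the two rays no longer intersect exactly, and the midpoint is only an initial estimate that a refinement step such as bundle adjustment would then improve.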
Review Computer Vision: Algorithms and Applications by Richard Szeliski
Enhance your understanding of computer vision concepts by reviewing an authoritative text that covers advanced topics related to the course.
  • Read selected chapters relevant to course topics
  • Summarize key concepts and algorithms
  • Discuss the book's insights with peers or mentors
Contribute to an open-source computer vision library
Enhance your practical skills and contribute to the computer vision community by participating in an open-source project.
  • Identify a suitable open-source project related to computer vision
  • Read the project documentation and understand its goals
  • Propose and implement improvements or new features
Visual SLAM
Expand your knowledge of robotics perception by creating a project that applies visual SLAM techniques to a real-world problem.
  • Research different visual SLAM algorithms and select one to implement.
  • Implement the visual SLAM algorithm and test it on a real-world dataset.
  • Use your visual SLAM system to create a map of a real-world environment.
  • Use your visual SLAM system to navigate a robot through a real-world environment.
Develop a simple visual navigation system for a mobile robot
Apply your understanding of computer vision and robotics by building a functional visual navigation system for a mobile robot.
  • Design the system architecture and algorithms
  • Implement the system using a suitable programming language and framework
  • Test and evaluate the system's performance in various scenarios
Object Recognition
Extend your understanding of computer vision techniques by creating a project that applies object recognition to a real-world problem.
  • Research different object recognition algorithms and select one to implement.
  • Implement the object recognition algorithm and test it on a real-world dataset.
  • Use your object recognition system to create a system that can identify objects in real-time.
  • Use your object recognition system to create a system that can interact with objects in real-time.

Career center

Learners who complete Robotics: Perception will develop knowledge and skills that may be useful to these careers:
Robotics Engineer
Robotics Engineers use their understanding of mechanical engineering, computer science, and electronics to design, build, and maintain robots. To accomplish navigation and manipulation tasks, they need a strong grasp of robot perception. This course teaches the fundamental principles of robot perception from images and videos, which would be invaluable for a Robotics Engineer.
Research Scientist
Research Scientists conduct research in a variety of fields, including computer vision. This course can provide Research Scientists with a strong foundation in computer vision, which can help them develop new and innovative computer vision algorithms and applications.
Computer Vision Engineer
Computer Vision Engineers are responsible for developing and implementing computer vision systems that can interpret visual data, such as images and videos. This course can help you build a solid foundation in computer vision by teaching you the mathematical principles behind image formation, projective transformations, pose estimation, and multi-view geometry.
Machine Learning Engineer
A Machine Learning Engineer utilizes principles from machine learning, statistics, and computer science to build intelligent systems that perform tasks for humans. Being able to interpret camera images and videos would be a key component in creating robust and dynamic learning systems. This course, Robotics: Perception, teaches the mathematical principles of camera and video interpretation. By understanding the foundations of scene and object recognition, a Machine Learning Engineer could create more accurate and effective learning systems.
Data Scientist
Data Scientists collect, analyze, and interpret data to help businesses make informed decisions. This course can provide Data Scientists with a strong foundation in computer vision, which can be valuable for extracting insights from visual data. For example, a Data Scientist could use the techniques learned in this course to develop a system that can automatically identify objects in images or videos.
User Experience Designer
User Experience Designers create user interfaces for websites, apps, and other products. This course can help User Experience Designers understand how users interact with visual information, which can help them design more intuitive and user-friendly interfaces.
Software Engineer
Software Engineers design, develop, and maintain software applications. This course can help Software Engineers build a foundation in computer vision, which is increasingly being used in a variety of software applications, such as self-driving cars, facial recognition systems, and medical image analysis.
Educator
Educators teach students about a variety of subjects, including computer vision. This course can help Educators stay up-to-date on the latest developments in computer vision, which can help them provide their students with the best possible education.
Product Manager
Product Managers are responsible for the development and launch of new products. This course can help Product Managers understand the technical challenges and opportunities of computer vision, which can help them make better decisions about which products to develop and how to market them.
Technical Writer
Technical Writers create documentation for software, hardware, and other products. This course can help Technical Writers understand the technical details of computer vision, which can help them write more accurate and informative documentation.
Consultant
Consultants provide advice and expertise to businesses on a variety of topics, including computer vision. This course can help Consultants understand the business value of computer vision, which can help them provide their clients with more valuable advice.
Entrepreneur
Entrepreneurs start and run their own businesses. This course can help Entrepreneurs understand the potential of computer vision, which can help them develop new products and services that meet the needs of the market.
Artist
Artists use their creativity to express themselves through a variety of media, including computer vision. This course can help Artists learn how to use computer vision to create new and innovative forms of art.
Forensic Scientist
Forensic Scientists analyze evidence to help solve crimes. This course can help Forensic Scientists understand how to use computer vision to analyze images and videos, which can help them uncover important evidence.
Ethical Hacker
Ethical Hackers use their skills to identify and exploit vulnerabilities in computer systems and networks. This course can help Ethical Hackers develop a deeper understanding of computer vision systems, which can help them identify and exploit vulnerabilities in these systems.

Reading list

We've selected 12 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Robotics: Perception.
This foundational reference covers the mathematical basis of multi-view geometry, providing a solid understanding of camera calibration, stereo vision, and structure from motion. It complements the course's exploration of multi-view techniques.
This seminal work in robotics introduces the principles of probabilistic modeling and its applications in robotics. It provides a valuable perspective on incorporating uncertainty and noise into perception systems, complementing the course's focus on deterministic approaches.
This concise and practical text provides a thorough introduction to the algorithms used in computer vision, covering topics like image processing, feature extraction, and motion analysis. It offers a solid foundation for understanding the implementation of perception algorithms.
This comprehensive reference provides a modern and accessible introduction to computer vision, covering image formation, feature extraction, object recognition, and image understanding. It serves as a valuable resource for deepening the understanding of computer vision fundamentals.
This widely adopted textbook introduces the fundamentals of digital image processing, covering image enhancement, restoration, compression, and analysis. It serves as a valuable supplement for understanding the underlying image processing techniques used in robotics perception.
This specialized text explores the principles and applications of active perception, where robots actively interact with their environment to improve perception accuracy. It provides a unique perspective on the role of perception in decision-making and action planning.
This comprehensive textbook provides an introduction to robotics, computer vision, and control theory. It offers a broad overview of the field, including topics like kinematics, dynamics, and perception, complementing the course's focus on perception.
This in-depth textbook provides a comprehensive introduction to robot modeling and control, including kinematics, dynamics, and control techniques. It complements the course's focus on robot perception by covering the fundamental principles of robot movement and behavior.
This engaging textbook introduces the principles of machine learning as applied to computer vision. It provides a practical understanding of supervised and unsupervised learning algorithms, enhancing the course's exploration of perceptual tasks through machine learning.
This classic textbook provides a comprehensive overview of computer graphics, including topics like image synthesis, geometry processing, and animation. It offers a valuable perspective on the rendering and visualization aspects of robotics perception.


© 2016 - 2024 OpenCourser