We may earn an affiliate commission when you visit our partners.
Ahmed Rafik

What is web scraping?


Suppose your boss asks you to extract about 1,000 products from a website, structure the data, and save it to a database. Would you copy and paste every product's name, URL, and price by hand? You could work day and night and still not finish the task. This is where web scraping shines. Web scraping, also called web harvesting or web data extraction, means writing a script that automates data extraction from websites, so the job takes minutes rather than days.

Why learn web scraping?

Whether you're a data analyst, a web developer, or someone who wants to work as a freelancer, you should learn web scraping.

For a data analyst, building a dataset is extremely important, and when the data only exists on web pages, scraping is often the only practical way to collect it. Web scraping is also a plus on your resume.

Web scraping can be used in a variety of fields. Here are some examples of what you can do with it:

  1. Generate leads

  2. Drop shipping, where you continually scrape products from different online stores and showcase them on your own website to make money

  3. Monitor product prices to get the best deals

  4. Automation

  5. Machine learning

  6. Freelance web scraping

There are many other fields where web scraping can be extremely beneficial.

Is this course the right one for you?

I've carefully planned and designed this course to be beginner friendly. From my experience, those who do web scraping are mostly data analysts with no background knowledge of how the web works, how requests are made, or how to locate and parse data from the web. I also keep the material and the tools used in this course up to date. In this course:

  1. I'll introduce you to the most used web scraping tools and frameworks

  2. We will set up the development environment from scratch

  3. You will learn and understand LXML core fundamentals

  4. How to use XPath & CSS selectors to select the data from a web page

  5. How the web works (Request/Response)

  6. How to scrape simple HTML web pages

  7. How to scrape multiple web pages

  8. Extract data from APIs

  9. You will learn Splash (crash course) so you can use it to scrape JavaScript websites

  10. Authentication/Login

  11. Store the extracted data in JSON/CSV files or in MongoDB/SQLite3

  12. Exclusive tips and tricks regarding web scraping
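
As a rough illustration of item 11, here is a minimal sketch of storing scraped records using only Python's standard library. The `products` records and the table layout are made-up assumptions, not the course's project code, and MongoDB is left out because it needs a running server.

```python
import csv
import io
import json
import sqlite3

# Hypothetical scraped records; a real scraper would build this list from parsed pages.
products = [
    {"name": "Laptop", "url": "https://example.com/laptop", "price": 999.0},
    {"name": "Mouse", "url": "https://example.com/mouse", "price": 19.5},
]

# JSON: serialize the whole list at once.
json_text = json.dumps(products, indent=2)

# CSV: one row per record, with a header row matching the dict keys.
csv_buffer = io.StringIO()
writer = csv.DictWriter(csv_buffer, fieldnames=["name", "url", "price"])
writer.writeheader()
writer.writerows(products)
csv_text = csv_buffer.getvalue()

# SQLite3: an in-memory database keeps this sketch free of side effects;
# pass a file path instead of ":memory:" to persist the data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, url TEXT UNIQUE, price REAL)")
conn.executemany(
    "INSERT INTO products (name, url, price) VALUES (:name, :url, :price)",
    products,
)
conn.commit()
row_count = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
```

To write real files, the same `json` and `csv` calls work with a file object opened via `open(...)` in place of the in-memory buffer.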

Finally, this course is project based. Starting from the second section, each section experiments with a different website. Each project has its own degree of difficulty and is completely independent of the other projects.

Are there any assignments or exercises included in this course?

Yes, each section includes an assignment to help you get your hands dirty. By the end of each section, after completing its assignment, you will feel more confident and comfortable with web scraping.

Why LXML and not BeautifulSoup?

LXML is a lightweight HTML parser; even the selectors of the most popular web scraping framework, Scrapy, are built on top of it. BeautifulSoup exposes a larger set of functions. However, in web scraping we mostly use XPath and CSS selectors to navigate the HTML tree and select what to scrape, so there is little need to spend time familiarizing yourself with the BeautifulSoup API and its internal architecture. In addition, LXML is generally faster than BeautifulSoup.
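
To give a feel for this style of scraping, here is a minimal sketch that parses a small HTML snippet with LXML and selects data with XPath. The HTML and the field names are invented for illustration and are not taken from the course projects.

```python
from lxml import html

# Hypothetical page content; a real scraper would download this first.
page = """
<html><body>
  <div class="product"><a href="/p/1">Laptop</a><span class="price">$999</span></div>
  <div class="product"><a href="/p/2">Mouse</a><span class="price">$19</span></div>
</body></html>
"""

tree = html.fromstring(page)

products = []
# Select every product block, then use relative XPath inside each one.
for node in tree.xpath('//div[@class="product"]'):
    products.append({
        "name": node.xpath('./a/text()')[0],
        "url": node.xpath('./a/@href')[0],
        "price": node.xpath('./span[@class="price"]/text()')[0],
    })
```

LXML elements also offer a `cssselect()` method for CSS selectors, which depends on the separate `cssselect` package being installed.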

Who is your instructor?

Hi, I'm Ahmed, nice to meet you. My students like to call me the web scraping Ninja, and I have taught more than 2,000 students around the world how to do web scraping. I do web scraping on a daily basis, whether for fun, for personal projects, or as a freelancer. I also have a master's degree in computer science.

Should I enroll in this course?

Honestly, by enrolling in this course you have nothing to lose: if it doesn't meet your requirements, you can ask for a refund within 30 days of enrolling, guaranteed by Udemy with

SO IF YOU DON'T KNOW ANYTHING ABOUT WEB SCRAPING & YOU DON'T KNOW WHERE TO START, ENROLL NOW. :)


What's inside

Learning objectives

  • LXML core fundamentals
  • XPath & CSS selectors
  • How to send HTTP requests with Python
  • Scraping HTML web pages
  • Scraping multiple pages using recursion
  • Scraping APIs
  • Splash HTTP API
  • Scraping JavaScript websites using Splash
  • Authentication and login to websites using Requests
  • Web scraping best practices
  • Building datasets
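
The idea of scraping multiple pages using recursion can be sketched as follows. This is only an illustration under stated assumptions: `FAKE_SITE` and `fetch_page` are hypothetical stand-ins for downloading and parsing real pages, so the example runs without any network access.

```python
# Hypothetical stand-in for a paginated site: each "page" lists items and
# names the next page, or None on the last page.
FAKE_SITE = {
    "/page/1": {"items": ["Bitcoin", "Ethereum"], "next": "/page/2"},
    "/page/2": {"items": ["Litecoin"], "next": "/page/3"},
    "/page/3": {"items": ["Monero"], "next": None},
}


def fetch_page(url):
    """Hypothetical fetcher; a real one would download and parse the page."""
    return FAKE_SITE[url]


def scrape(url, collected=None):
    """Collect items from `url`, then recurse into the next page if there is one."""
    if collected is None:
        collected = []
    page = fetch_page(url)
    collected.extend(page["items"])
    if page["next"] is not None:
        return scrape(page["next"], collected)
    return collected


all_items = scrape("/page/1")
```

Python limits recursion depth (1000 frames by default), so for sites with a very large number of pages a plain loop over the "next" links is the safer variant of the same idea.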

Syllabus

Getting Started
Course Introduction
Web Scraping tools
Setting up the development environment
Udemy 101 (OPTIONAL)
How to Ask questions (Please don't skip)
LXML core fundamentals
Section Info
ElementTree object
Element object
Introduction to LXML with XPath
Introduction to LXML with CSS Selectors
Source code lecture by lecture
XPath & CSS Selectors
What is XPath & CSS
CSS Selectors fundamentals
CSS selectors in theory
XPath fundamentals
Navigating using XPath (Going UP)
Navigating using XPath (Going DOWN)
XPath in theory
HTTP Requests with Python
How the web works
Python Requests
Request/Response headers
Quiz
Project 1: Simple & Clean
Locating the data
Building the Scraper
Cleaning the data
Writing data to JSON/CSV files
Turning it into a command line app
Project 1 source code
Ebay trending products
Project 2: Recursion
Getting rid of unnecessary JavaScript
Scraping Data
Scraping multiple pages (Recursion)
Storing the data in MongoDB cloud
Prevent storing same records and updating records
CoinMarketCap update
Project 2 source code
IMDB top movies
Project 3: APIs
API/HTML: What's the difference?
Generating code using Postman
Parsing APIs
Recursion challenge
Challenge solution (Scraping APIs recursively)
Inserting data into SQLite3 database
Project 3 source code
Splash crash course
What is Splash?
Setting up Splash
Intro to Splash
Selecting Elements, filling Inputs and clicking on Buttons
Splash Request & Response headers
Very useful resource (SPLASH FAQ)
Flight Aware
Project 4: Scraping JavaScript websites using Splash, Requests and LXML
Splash private mode and cookies
Quick note (Splash private mode)
Using Splash with Requests
Parsing the Response
Project 4 source code
Scraping FlightAware using Splash & Requests
Project 5: Authentication/Login
Browser authentication
Requests authentication
Parse and clean HTML
Project 5 source code
My other web scraping courses
Bonus lecture

Good to know

Know what's good, what to watch for, and possible dealbreakers
Introduces LXML, a lightweight HTML parser
Emphasizes practical application and minimizes jargon
Provides clear and concise instructions
Utilizes a variety of resources and tools for hands-on learning
Develops foundational skills in data extraction
Provides a strong foundation for further exploration in web scraping


Reviews summary

Well received web scraping explorations

According to students, Web Scraping 101 with Python3 using REQUESTS, LXML & SPLASH is a largely well-received online course for beginners interested in web scraping. Learners note that this course is well-structured, easy to understand, and that it provides engaging assignments. With over 10 hours of content featuring lectures, readings, quizzes, and homework assignments, learners say this course is relatively comprehensive for beginners seeking to master web scraping.
Course covers a lot of material.
"over 10 hours of content"
"she talks widely about each card."
Course content is engaging.
"very interesting, easy to follow, easy to understand. Very enjoyable thank you!"
"I like the segmented nature of Sal’s approach; bite size nuggets of information. She speaks clearly and offers great insight to the new learner, like me."
"It is an informative class that gives great feedback."
Course is great for beginners.
"good easy to understand"
"Loved the course. It was easy to learn as Iam beginner"
"it is absolutely great! I am a total beginner and the explanations are great to understand the tarot clearly!"
Instructor is knowledgeable and experienced.
"truly educational course, Sal Jade teaches every aspect of the deck, leaving nothing out"
"The course is good, profound knowledge."
"Sal's background as a teacher really shows in the way she structures her courses and in the way she produces the content. She makes the material engaging and easy to understand. She really knows her stuff and is a great resource for mastering the tarot!"
Concepts are explained clearly.
"easy to understand and lots of examples"
"It was great course for beginners. simple to follow"
"I like the way the course is set up. I've found the topics easy to remember so far"
"Very insightful. Feel like really learning the overarching meaning of the cards well and will be able to use on tarot readings for myself and others. love the pace."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Web Scraping 101 with Python3 using REQUESTS, LXML & SPLASH with these activities:
Follow online tutorials on web scraping
Supplement your learning with additional resources and tutorials on web scraping.
  • Search for online tutorials on web scraping.
  • Follow the tutorials to learn more about specific web scraping techniques.
Find a mentor in web scraping
Connect with experienced web scraping professionals to receive guidance and support.
  • Attend industry events and meetups.
  • Reach out to web scraping experts on LinkedIn.
Go over HTTP Request/Response basics
Refresh your knowledge on the basics of HTTP requests and responses to strengthen your understanding of how scraping data from the web works.
  • Review HTTP request/response concepts
  • Go through examples of HTTP requests and responses
Practice XPath and CSS selectors
Practice using XPath and CSS selectors to navigate and select data from HTML web pages.
  • Find a simple HTML web page.
  • Use XPath and CSS Selectors to locate and extract data from the web page.
  • Repeat the process with different web pages.
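
For this kind of practice, a small sketch like the following can serve as a starting point. It assumes LXML is installed and uses an invented HTML snippet; the expressions show moving down the tree, across to a sibling, and back up to a parent or ancestor.

```python
from lxml import html

# Hypothetical HTML to practice on.
page = """
<html><body>
<ul id="movies">
  <li><span class="title">Movie A</span><span class="year">1994</span></li>
  <li><span class="title">Movie B</span><span class="year">1972</span></li>
</ul>
</body></html>
"""

tree = html.fromstring(page)

# Going down: from the list to every title inside it.
titles = tree.xpath('//ul[@id="movies"]/li/span[@class="title"]/text()')

# Going sideways: from a title to the sibling that follows it.
year_of_a = tree.xpath('//span[text()="Movie A"]/following-sibling::span/text()')

# Going up: from a year back to its parent <li>, then down to the title.
title_of_1972 = tree.xpath(
    '//span[text()="1972"]/parent::li/span[@class="title"]/text()'
)

# Going further up: from a year to the enclosing list's id attribute.
list_id = tree.xpath('//span[text()="1972"]/ancestor::ul/@id')
```

The first query corresponds roughly to the CSS selector `ul#movies > li > span.title`; CSS selectors have no way to move up to a parent, which is one reason to learn XPath alongside them.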
Practice using Python Requests Library
Sharpen your skills in using the Python Requests library for web scraping, ensuring you're comfortable with making HTTP requests and parsing responses.
  • Review the Requests library documentation
  • Work on practice problems using Requests
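
One way to practice without hitting a real server is to build a request and inspect it before sending. The sketch below assumes the Requests library is installed; the URL, query parameter, and User-Agent value are placeholders chosen for illustration.

```python
import requests

# Describe the request: method, URL, query parameters, and headers.
request = requests.Request(
    method="GET",
    url="https://example.com/products",
    params={"page": 2},
    headers={"User-Agent": "my-scraper/0.1"},
)

# prepare() builds the exact request that would go over the wire,
# without contacting the server.
prepared = request.prepare()

method = prepared.method
full_url = prepared.url
user_agent = prepared.headers["User-Agent"]
```

Sending the prepared request with `requests.Session().send(prepared)` would return a response object whose `status_code`, `headers`, and `text` attributes hold the server's reply.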
XPaths Explained Tutorial
Work through a beginner XPath tutorial and syntax guide to get better at parsing HTML content and extracting data from web pages.
  • Start an online tutorial on XPath
  • Follow the steps in the tutorial to practice XPath
CSS Selectors Tutorial
Understand element selection in HTML and XML by learning CSS selectors in this tutorial.
  • Enroll in a guided tutorial on CSS
  • Work on practice questions on CSS selectors
LXML Basics Practice
Practice extracting data from HTML with LXML to build a foundation in programmatic data extraction.
  • Start a practice drill on LXML
  • Attempt to solve exercises based on LXML usage
Create a collection of web scraping resources
Organize and share useful web scraping resources to benefit others.
  • Gather a list of valuable web scraping resources.
  • Organize the resources into a meaningful structure.
  • Share the collection with the community.
Build a web scraping project
Apply the concepts of web scraping to build a project that solves a real-world problem.
  • Identify a problem that can be solved using web scraping.
  • Design and develop a web scraping solution.
  • Test and deploy the solution.
Recursion Challenges
Reinforce your understanding of recursion by working through a series of challenges to master the concept in practice.
  • Research recursion techniques
  • Take on a LeetCode challenge for recursion
  • Attempt to solve practice problems
Web Scraper Project: FlightAware using Splash & Requests
Build a web scraping project using Splash and Requests to extract data from a website to enhance your web scraping abilities.
  • Plan the project
  • Implement the scraper
  • Test and debug the scraper
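
When planning such a project, it helps to see how a page is requested through Splash's HTTP API. The sketch below only builds the URL for Splash's `render.html` endpoint and does not send anything. It assumes a Splash instance running locally on port 8050; the target URL and the `splash_url` helper are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: Splash is running locally on its default port.
SPLASH_ENDPOINT = "http://localhost:8050/render.html"


def splash_url(target_url, wait=2):
    """Build a render.html URL asking Splash to load `target_url`,
    wait `wait` seconds for JavaScript to run, and return the HTML."""
    query = urlencode({"url": target_url, "wait": wait})
    return f"{SPLASH_ENDPOINT}?{query}"


render_url = splash_url("https://example.com/flights")
```

Fetching `render_url` with Requests would return the rendered HTML, which can then be parsed with LXML like any static page.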
Write a blog on How to Scrape Data from APIs
Create a resource to share your knowledge of scraping data from APIs, further solidifying your understanding of the concept.
  • Research APIs
  • Write the blog post
  • Publish the blog post

Career center

Learners who complete Web Scraping 101 with Python3 using REQUESTS, LXML & SPLASH will develop knowledge and skills that may be useful to these careers:
Data Analyst
Data Analysts collect, clean, and interpret data to help businesses make informed decisions. This course can help you develop the skills you need to succeed in this role, including how to use Python, LXML, and Splash to extract data from websites. You will also learn how to store and manage data, and how to use data visualization tools to present your findings.
Web Developer
Web Developers design and develop websites. This course can help you develop the skills you need to succeed in this role, including how to use Python, LXML, and Splash to extract data from websites. You will also learn how to use HTML, CSS, and JavaScript to create and style websites.
Freelance Web Scraper
Freelance Web Scrapers use their skills to extract data from websites for clients. This course can help you develop the skills you need to succeed in this role, including how to use Python, LXML, and Splash to extract data from websites. You will also learn how to find and work with clients, and how to manage your own business.
Product Manager
Product Managers plan and develop products. This course can help you develop the skills you need to succeed in this role, including how to use Python, LXML, and Splash to extract data from websites. You will also learn how to analyze data to identify customer needs, and how to develop and launch new products.
Data Scientist
Data Scientists use their skills in data analysis, machine learning, and statistics to solve business problems. This course can help you develop the skills you need to succeed in this role, including how to use Python, LXML, and Splash to extract data from websites. You will also learn how to use data analysis and machine learning techniques to identify patterns and trends in data.
Software Engineer
Software Engineers design, develop, and maintain software applications. This course can help you develop the skills you need to succeed in this role, including how to use Python, LXML, and Splash to extract data from websites. You will also learn how to use programming languages and software development tools to create and maintain software applications.
Business Analyst
Business Analysts use their skills in data analysis and problem solving to help businesses improve their operations. This course can help you develop the skills you need to succeed in this role, including how to use Python, LXML, and Splash to extract data from websites. You will also learn how to analyze data to identify areas for improvement, and how to develop and implement solutions.
Quantitative Analyst
Quantitative Analysts use their skills in mathematics and statistics to analyze financial data. This course can help you develop the skills you need to succeed in this role, including how to use Python, LXML, and Splash to extract data from websites. You will also learn how to use financial modeling and analysis techniques to identify investment opportunities and risks.
Market Researcher
Market Researchers use their skills in data analysis and research to understand consumer behavior. This course can help you develop the skills you need to succeed in this role, including how to use Python, LXML, and Splash to extract data from websites. You will also learn how to use market research techniques to identify trends and opportunities in the marketplace.
SEO Specialist
SEO Specialists use their skills in web development and marketing to improve the visibility of websites in search engine results. This course can help you develop the skills you need to succeed in this role, including how to use Python, LXML, and Splash to extract data from websites. You will also learn how to use SEO techniques to optimize websites for search engines.
Data Journalist
Data Journalists use their skills in journalism and data analysis to write stories that are based on data. This course can help you develop the skills you need to succeed in this role, including how to use Python, LXML, and Splash to extract data from websites. You will also learn how to use data visualization tools to present your findings in a clear and concise way.
Librarian
Librarians use their skills in information management to help people find and access information. This course can help you develop the skills you need to succeed in this role, including how to use Python, LXML, and Splash to extract data from websites. You will also learn how to organize and manage information, and how to provide reference and research services.
Archivist
Archivists use their skills in history and information management to preserve and make accessible historical records. This course can help you develop the skills you need to succeed in this role, including how to use Python, LXML, and Splash to extract data from websites. You will also learn how to appraise, arrange, and describe historical records, and how to provide reference and research services.
Museum Curator
Museum Curators use their skills in history and art to manage and interpret museum collections. This course can help you develop the skills you need to succeed in this role, including how to use Python, LXML, and Splash to extract data from websites. You will also learn how to acquire, catalog, and preserve museum objects, and how to develop and deliver museum exhibitions.
Teacher
Teachers use their skills in education and content knowledge to teach students. This course may be helpful for teachers who want to learn how to use Python, LXML, and Splash to extract data from websites. This can be useful for creating lesson plans, assignments, and other teaching materials.

Reading list

We've selected seven books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Web Scraping 101 with Python3 using REQUESTS, LXML & SPLASH.
Provides a comprehensive introduction to web scraping with R and rvest. It covers the basics of web scraping, including how to send HTTP requests, parse HTML and XML, and work with APIs. It also covers how to use rvest to scrape websites that use JavaScript or AJAX.
Provides a collection of recipes for solving common web scraping problems using Python. It is a valuable resource for anyone who wants to learn more about web scraping or to find solutions to specific problems they may encounter.
Provides a comprehensive introduction to web scraping, covering the basics of HTTP requests, parsing HTML and XML, and working with APIs. It is a valuable resource for anyone who wants to learn more about web scraping.
Provides a comprehensive introduction to web scraping with R, covering the basics of HTTP requests, parsing HTML and XML, and working with APIs. It is a valuable resource for anyone who wants to learn more about web scraping or to use R for web scraping projects.
The Beautiful Soup documentation is a comprehensive resource for learning how to use the Beautiful Soup library for web scraping.
Provides a comprehensive introduction to web scraping with Java, covering the basics of HTTP requests, parsing HTML and XML, and working with APIs. It is a valuable resource for anyone who wants to learn more about web scraping or to use Java for web scraping projects.
This tutorial provides a clear and concise introduction to CSS selectors, which are used for selecting elements from HTML documents. It is a valuable resource for anyone who wants to learn more about CSS selectors or to use them for web scraping projects.


Similar courses

Here are nine courses similar to Web Scraping 101 with Python3 using REQUESTS, LXML & SPLASH.
Scrapy : Python Web Scraping & Crawling for Beginners
Web Scraping: Python Data Playbook
Scraping Media from the Web with R
Advanced Web Scraping Tactics: Python 3 Playbook
Advanced Web Scraping Tactics: R Playbook
Scrapy: Powerful Web Scraping & Crawling with Python
Supercharged Web Scraping with Asyncio and Python
Extracting Data from HTML with BeautifulSoup
Extracting Structured Data from the Web Using Scrapy

© 2016 - 2024 OpenCourser