Web Scraping in Nodejs & JavaScript

Stefan Hyltoft

In this course you will learn how to scrape websites, with practical examples on real websites, using JavaScript with Node.js, Request, Cheerio, NightmareJs and Puppeteer. You will be using modern JavaScript syntax with async/await.

You will learn how to scrape a Craigslist website for software engineering jobs, using Node.js Request and Cheerio.

You will then learn how to scrape more advanced websites that require JavaScript, such as IMDb and Airbnb, using NightmareJs and Puppeteer.

I'm also going to show you, with a practical real-life website, how you can avoid wasting time on creating a web scraper in the first place by reverse engineering websites and finding their hidden APIs.

Learn how to avoid being blocked from websites while developing your scraper by building it in a test-driven way against mocked HTML, rather than hitting the website every time you debug. You'll also learn what you can do if you are blocked, and which alternatives can get your scraper up and running regardless.
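
As a rough illustration of the test-driven approach described above, the sketch below parses a saved HTML snippet instead of requesting a live page. The markup, the class names, and the `parseListings` helper are hypothetical, and a regular expression stands in for a real HTML parser such as Cheerio only to keep the example free of dependencies.

```javascript
const { deepStrictEqual } = require("assert");

// Saved HTML fixture standing in for a page fetched once and stored on disk.
// The markup and class names are hypothetical, not taken from a real site.
const mockedHtml = `
<ul>
  <li class="result"><a class="title" href="/jobs/1">Node.js Developer</a></li>
  <li class="result"><a class="title" href="/jobs/2">Frontend Engineer</a></li>
</ul>`;

// Pure parsing function: takes HTML text and returns plain objects.
// The regex only keeps this sketch dependency-free; a real project would
// use an HTML parser here.
function parseListings(html) {
  const pattern = /<a class="title" href="([^"]+)">([^<]+)<\/a>/g;
  const listings = [];
  for (const match of html.matchAll(pattern)) {
    listings.push({ url: match[1], title: match[2].trim() });
  }
  return listings;
}

// The check runs against the fixture, so no request ever reaches the site.
deepStrictEqual(parseListings(mockedHtml), [
  { url: "/jobs/1", title: "Node.js Developer" },
  { url: "/jobs/2", title: "Frontend Engineer" },
]);
```

Because the parser is a pure function of the HTML text, it can be developed and debugged as often as needed without sending traffic to the target site.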

You will also learn how to scrape when the server's connection, or your own, is unreliable.
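
One common way to cope with an unreliable connection is to retry a failed request after a growing delay. The sketch below is an assumption about how that might look, not the course's own code; `withRetries`, `sleep`, and the parameter names are hypothetical, and a simulated flaky task stands in for a real HTTP request.

```javascript
// Resolve after the given number of milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run an async task, retrying up to `retries` extra times on failure and
// waiting longer after each failed attempt (exponential backoff).
async function withRetries(task, retries = 3, baseDelayMs = 500) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        // Delay grows as baseDelayMs, then 2x, then 4x, and so on.
        await sleep(baseDelayMs * 2 ** attempt);
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}

// Usage with a simulated request that fails twice before succeeding.
let calls = 0;
const flakyRequest = async () => {
  calls += 1;
  if (calls < 3) throw new Error("simulated network error");
  return "<html>ok</html>";
};

withRetries(flakyRequest, 3, 10).then((html) => {
  console.log(html, "after", calls, "calls");
});
```

Capping the number of retries keeps a scraper from looping forever when a site is genuinely down rather than briefly unreachable.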

You'll even learn how to save your results to a CSV file and MongoDB.
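
Saving results to CSV mostly comes down to quoting values correctly. The following is a minimal sketch, assuming flat result objects with hypothetical `title`, `hood`, and `url` fields; `toCsv` and `escapeCsvValue` are illustrative helpers, not part of any library.

```javascript
// Quote a single value for CSV: wrap it in double quotes when it contains a
// comma, a quote, or a line break, and double any embedded quotes.
function escapeCsvValue(value) {
  const text = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Turn an array of flat objects into CSV text using the given column order.
function toCsv(rows, columns) {
  const header = columns.map(escapeCsvValue).join(",");
  const lines = rows.map((row) =>
    columns.map((column) => escapeCsvValue(row[column])).join(",")
  );
  return [header, ...lines].join("\n");
}

// Hypothetical scraped listings.
const rows = [
  { title: "Node.js Developer", hood: "SOMA", url: "/jobs/1" },
  { title: 'Engineer, "Senior"', hood: "Mission", url: "/jobs/2" },
];

const csv = toCsv(rows, ["title", "hood", "url"]);
console.log(csv);
```

The resulting string could then be written to disk with Node's built-in `fs.writeFileSync`, for example to a file named `results.csv` (a hypothetical name).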

How do you build a scraper that runs every hour (or at another interval) and deploy it to a cloud host like Heroku or Google Cloud? Let me show you, quick and easy.

How do you scrape a site requiring passwords? I'm going to show you that too with a real website (Craigslist).

How do you serve your scraping results in a REST API with Node.js Express? And how can you build a React frontend that shows the results? You'll learn that too, in the quickest and simplest way possible.

Plus, a section covering how to make a basic GraphQL API is included in the course.

As a final cherry on top, I have a section containing a secret backdoor showing you how to scrape Facebook using only Request.

If you have issues regarding a site you're trying to scrape yourself, it's totally okay to reach out to me for some help. I'd be happy to point you in the right direction. Whatever issues my students are facing, I use that to expand on my course.

What's inside

Learning objectives

  • Be able to scrape jobs from a page on Craigslist
  • Learn how to use Request
  • Learn how to use NightmareJs
  • Learn how to use Puppeteer
  • Learn how to scrape elements without any identifiable classes or IDs
  • Learn how to save scraping data to CSV
  • Learn how to save scraping data to MongoDB
  • Learn how to scrape Facebook using only Request!
  • Learn how you can reverse engineer sites and find hidden APIs!
  • Learn different technologies used for scraping, and when it's best to use them
  • Learn how to scrape sites using authentication
  • Learn how to scrape HTML tables using Request/Cheerio

Syllabus

Learn software to use to web scrape in JavaScript
Software to web scrape in JavaScript

Feel free to watch this later in the course if you wonder what the deprecation of the request/request-promise packages means to you

How you can even avoid writing a web scraper in the first place
This could save you A LOT of time and effort!
Learn how to select elements with CSS selectors and know what tools to use for scraping
Intro to section
Using Chrome Developer Tools
Selecting our element
Building our first scraper!
Selecting multiple elements
Selecting using CSS ID
Selecting using CSS classes
Selecting using HTML attributes
You're on your way to become a scraping ninja!
Learn how to scrape HTML tables with Request and Cheerio in Nodejs

Learn the basic HTML structure of HTML tables, so you can better understand how to scrape data from them

See our end goal of the scraped data, the data structure of our scraped data from the HTML table using Request/Cheerio

Learn how to easily copy a selector in Chrome tools, so you can select the data you need from the HTML table with jQuery.

Scraping all table cells in Chrome Tools
Scraping data in Nodejs with Cheerio/Request
Scraping Company Names in Nodejs
Scraping all table columns
BONUS - dynamic table headers when scraping tables
Learn how to scrape a pagination website using Axios in Nodejs
Project Intro
Project Initializing & Package Import
Requesting HTML using Axios library
CSS Selectors + jQuery Injection
Scraping Job Titles in Nodejs
CSS Selector for Job URLs
Extracting into Data Object in Nodejs
Scraping All Pages
Scraping Job Descriptions
Putting Job Descriptions Into Data Objects
Avoid Getting Banned with Sequential Requests
Another Trick to Avoid Getting Banned
Learn how to scrape Craigslist using Puppeteer
Intro to project
Why are we using Puppeteer instead of Nodejs Request?
Initialising project

In this lecture we'll learn how to open any given URL with Puppeteer and the Chromium browser.

What data are we scraping?
Data Structure
Job Title Css Selector
Scraping job title using Cheerio
Scraping description url
Creating array of scraping objects
Scraping job post date
Scraping Neighborhood data
Scraping List of Pages with Puppeteer
Limiting Scraping Requests per Second
Scraping job descriptions from different pages
Scraping compensation from job listings
mLab is now MongoDB Atlas

Setting up a MongoDB database is fast, easy and free with MLab!

Connecting to MongoDB database with Mongoose
Creating Listing mongoose schema
Saving listing data to MongoDB
What can you do if you're blocked from websites?
Help! I'm blocked!
What can you do if you're blocked?
Scraping API's
Using a proxy in Request
Learn how to develop a scraper while avoiding getting blocked
Initializing project and adding packages
Creating tests folder and setting up test script
Writing our first simple test
Making our first simple test pass!
Getting HTML from the website for our tests
Reading HTML file for our tests
Writing out our tests
Getting title test to pass
Making URL test pass!
Making hood test pass!
Making the final test for datePosted pass!
End notes + refactoring
Learn how to simply export your scraping results to CSV
Exporting web scraping results to CSV
Learn how to scrape on bad networks
Handling Network Problems in our Craigslist scraper
Learn how to parse robots.txt and follow the rules a site has
What is robots.txt?
Example of usage robots-parser
Parsing robots.txt from a real site
Learn how to scrape sites with pagination
Simple Pagination Scraper in 10 mins!
Learn how to scrape a site with authentication
Intro to authentication scraping project
Looking at Login request
Recreating login in Postman
Creating our login request in Nodejs
Using Puppeteer instead of Request
Learn how to web scrape a site requiring cookie/session authentication and CSRF tokens
Replicating login request inside Postman - seeing how cookies are required
Building out our request inside Node.js and enabling cookieJar
Getting CSRF token from saved cookies and using it in our POST login request
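
The last step above, reading a CSRF token out of saved cookies and sending it back with the login request, can be sketched without any HTTP library. The cookie names `csrftoken` and `sessionid`, the form field `csrfmiddlewaretoken`, and the credentials are all assumptions for illustration; a real site defines its own names.

```javascript
// Parse Set-Cookie header values into a { name: value } map.
// Only the leading name=value pair of each header is kept; attributes such
// as Path or HttpOnly are ignored.
function parseSetCookie(headers) {
  const cookies = {};
  for (const header of headers) {
    const pair = header.split(";")[0];
    const index = pair.indexOf("=");
    if (index > 0) {
      cookies[pair.slice(0, index).trim()] = pair.slice(index + 1).trim();
    }
  }
  return cookies;
}

// Hypothetical Set-Cookie headers returned by a login page.
const setCookieHeaders = [
  "csrftoken=abc123; Path=/; SameSite=Lax",
  "sessionid=xyz789; Path=/; HttpOnly",
];
const cookies = parseSetCookie(setCookieHeaders);

// The saved cookies go back in the Cookie header of the POST request...
const cookieHeader = Object.entries(cookies)
  .map(([name, value]) => `${name}=${value}`)
  .join("; ");

// ...and the token is repeated in the form body (field name is an assumption).
const body = new URLSearchParams({
  username: "user@example.com",
  password: "secret",
  csrfmiddlewaretoken: cookies.csrftoken,
}).toString();

console.log(cookieHeader);
console.log(body);
```

An HTTP client with a cookie jar enabled handles the `Cookie` header automatically; the token in the request body usually still has to be copied over by hand, which is what the sketch makes explicit.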

Good to know

Know what's good, what to watch for, and possible dealbreakers
Emphasizes avoiding being blocked from websites and developing scrapers in a test-driven way, which are essential skills for web scraping
Teaches how to scrape advanced websites that require JavaScript, such as IMDb and Airbnb, using NightmareJs and Puppeteer
Covers a comprehensive range of topics, including scraping websites with pagination, using authentication, and handling network problems
Involves practical examples of real-world website scraping in various industries, such as software engineering and real estate
Includes a section on how to reverse engineer websites and find their hidden APIs, providing a valuable technique for web scraping
Provides guidance on how to save and export scraping results to CSV and MongoDB, ensuring data storage and accessibility

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Web Scraping in Nodejs & JavaScript with these activities:
Review JavaScript Syntax (ES7)
Strengthen your foundation by revisiting the ES7 syntax used throughout the course.
  • Review online resources or tutorials on ES7 syntax.
  • Practice writing simple code examples using ES7 features.
Tutorial: Scraping a Basic Table Using Cheerio
Learn the basics of web scraping by following a guided tutorial that demonstrates how to extract data from a simple HTML table using Cheerio.
  • Find a website with a table you want to scrape.
  • Follow the tutorial to set up your scraping environment and install Cheerio.
  • Use Cheerio's selectors to locate the table and extract the data.
Practice Selecting Elements with CSS Selectors
Reinforce your understanding of CSS selectors and practice selecting elements from real web pages.
  • Open any website and inspect its HTML using Chrome's DevTools.
  • Select different elements on the web page using CSS selectors.
  • Try to select elements with various CSS selectors, such as class, id, and tag name.
  • Experiment with different combinations of CSS selectors to select specific elements.
Discussion: Troubleshooting Common Web Scraping Challenges
Engage with peers to discuss common obstacles encountered in web scraping and share solutions to overcome them.
  • Join a discussion forum or online group focused on web scraping.
  • Pose questions or share your experiences regarding web scraping challenges.
  • Engage in discussions with peers to explore different perspectives and learn from their experiences.
Develop a Web Scraping Tool for Personal Use
Apply your knowledge by creating a custom web scraping tool tailored to your specific needs or interests.
  • Identify a problem or task that can be solved using web scraping.
  • Design and develop a web scraping script to automate the task.
  • Test and refine your tool to ensure its accuracy and efficiency.
Attend a Web Scraping Workshop or Meetup
Expand your knowledge and network by attending a workshop or meetup focused on web scraping.
  • Identify and register for a relevant workshop or meetup.
  • Attend the event and actively engage with speakers and attendees.
  • Ask questions, share insights, and learn from others' experiences.
Project: Scrape Product Data from an E-commerce Website
Demonstrate your web scraping skills by creating a scraper that extracts product data from an e-commerce website.
  • Identify the target website and analyze its structure.
  • Build a web scraping script using JavaScript and Node.js.
  • Deploy your scraper to a cloud platform to run regularly.
Contribute to an Open-Source Web Scraping Library
Deepen your understanding of web scraping by contributing to an open-source library that supports this functionality.
  • Identify an open-source web scraping library with active development.
  • Explore the project's documentation and codebase.
  • Identify an area where you can contribute code improvements or new features.
  • Submit a pull request with your proposed changes.
  • Review feedback from the project maintainers and iterate on your contributions.

Career center

Learners who complete Web Scraping in Nodejs & JavaScript will develop knowledge and skills that may be useful to these careers:
Web Developer
Web Developers design, build, and maintain websites and web applications. The Web Scraping in Nodejs & JavaScript course can provide Web Developers with the skills and knowledge needed to scrape data from websites, which can be useful for a variety of purposes, such as data analysis, market research, and competitive intelligence.
Data Scientist
A Data Scientist designs and implements algorithms to extract knowledge from data to gain insights, derive patterns, and solve complex business problems. The Web Scraping in Nodejs & JavaScript course can be a valuable asset to a Data Scientist, as it provides the skills and knowledge needed to gather and extract data from a variety of sources. This can be particularly useful for Data Scientists who are working on projects that involve large amounts of unstructured data, such as web pages.
Business Analyst
Business Analysts study business processes and recommend improvements. The Web Scraping in Nodejs & JavaScript course can provide Business Analysts with the skills and knowledge needed to scrape data from websites, which can be useful for a variety of purposes, such as process improvement, cost reduction, and customer satisfaction.
Software Engineer
Software Engineers design, develop, and maintain software applications. The Web Scraping in Nodejs & JavaScript course can provide Software Engineers with the skills and knowledge needed to scrape data from websites, which can be useful for a variety of purposes, such as data analysis, testing, and debugging.
Data Analyst
Data Analysts collect, clean, and analyze data to identify trends and patterns. The Web Scraping in Nodejs & JavaScript course can provide Data Analysts with the skills and knowledge needed to scrape data from websites, which can be useful for a variety of purposes, such as market research, customer segmentation, and fraud detection.
Journalist
Journalists investigate and report on news stories. The Web Scraping in Nodejs & JavaScript course can provide Journalists with the skills and knowledge needed to scrape data from websites, which can be useful for a variety of purposes, such as data-driven journalism, investigative reporting, and fact-checking.
Product Manager
Product Managers plan and develop products. The Web Scraping in Nodejs & JavaScript course can provide Product Managers with the skills and knowledge needed to scrape data from websites, which can be useful for a variety of purposes, such as product development, market research, and competitive analysis.
Information Architect
Information Architects design and organize websites and web applications. The Web Scraping in Nodejs & JavaScript course can provide Information Architects with the skills and knowledge needed to scrape data from websites, which can be useful for a variety of purposes, such as website design, usability testing, and content strategy.
User Experience Designer
User Experience Designers design and develop the user experience of websites and web applications. The Web Scraping in Nodejs & JavaScript course can provide User Experience Designers with the skills and knowledge needed to scrape data from websites, which can be useful for a variety of purposes, such as user research, prototyping, and testing.
Market Researcher
Market Researchers study market trends and consumer behavior. The Web Scraping in Nodejs & JavaScript course can provide Market Researchers with the skills and knowledge needed to scrape data from websites, which can be useful for a variety of purposes, such as competitive analysis, product development, and marketing campaigns.
Project Manager
Project Managers plan and execute projects. The Web Scraping in Nodejs & JavaScript course can provide Project Managers with the skills and knowledge needed to scrape data from websites, which can be useful for a variety of purposes, such as project planning, risk management, and quality control.
Technical Writer
Technical Writers create and maintain technical documentation. The Web Scraping in Nodejs & JavaScript course may be useful for Technical Writers who are responsible for documenting web scraping processes or software.
Database Administrator
Database Administrators design and maintain databases. The Web Scraping in Nodejs & JavaScript course may be useful for Database Administrators who are responsible for managing databases that contain data scraped from websites.
Data Entry Clerk
Data Entry Clerks enter data into computer systems. The Web Scraping in Nodejs & JavaScript course may be useful for Data Entry Clerks who are responsible for entering data that has been scraped from websites.
QA Tester
Quality Assurance Testers test software for bugs and defects. The Web Scraping in Nodejs & JavaScript course may be useful for QA Testers who are responsible for testing web scraping software or websites.

© 2016 - 2024 OpenCourser