We may earn an affiliate commission when you visit our partners.
Packt Publishing

This video course will help you get familiar with Jupyter Notebook and all of its features to perform various data science tasks in Python. Jupyter Notebook is a powerful tool for interactive data exploration and visualization and has become the standard tool among data scientists. In the course, we will start from basic data analysis tasks in Jupyter Notebook and work our way up to learn some common scientific Python tools such as pandas, matplotlib, and plotly. We will work with real datasets, such as crime and traffic accidents in New York City, to explore common issues such as data scraping and cleaning. We will create insightful visualizations, showing time-stamped and spatial data.

By the end of the course, you will feel confident about approaching a new dataset, cleaning it up, exploring it, and analyzing it in Jupyter Notebook to extract useful information in the form of interactive reports and information-dense data visualizations.

This course uses Jupyter 5.4.1. Although this is not the latest version available, the course still provides relevant and informative content for data science enthusiasts.

About the Author

Dražen Lucanin is a developer, data analyst, and the founder of Punk Rock Dev, an indie web development studio. He's been building web applications and doing data analysis in Python, JavaScript, and other technologies professionally since 2009. In the past, Dražen worked as a research assistant and did a PhD in computer science at the Vienna University of Technology. There he studied the energy efficiency of geographically distributed datacenters and worked on optimizing VM scheduling based on real-time electricity prices and weather conditions. He also worked as an external associate at the Ruđer Bošković Institute, researching machine learning methods for forecasting financial crises. During Dražen's scientific work, Python, Jupyter Notebook (back then still IPython Notebook), Matplotlib, and Pandas were his best friends over many nights of interactive manipulation of all sorts of time series and spatial data. Dražen also did a Master's degree in computer science at the University of Zagreb.

What's inside

Learning objectives

  • Learn how to efficiently use Jupyter Notebook for data manipulation and visualisation
  • Perform interactive data analysis and visualisation using Jupyter Notebook on real data
  • Analyse time series data using pandas
  • Create interactive widgets where non-technical users can also get involved in the data exploration using the notebooks you create
  • Scrape websites to build datasets and deal with common challenges like unstructured or missing data
  • Combine different datasets in a single graph to enable people to compare them visually and gain new insights
  • Analyse and visualise geographic datasets to create stunning information-rich maps

Syllabus

Jupyter Notebook Introduction

This video provides an overview of the entire course.

In this video, we will show how to install a Jupyter Notebook environment on your machine.

  • Cover the ways of installing a Jupyter Notebook

  • Show how to install Docker

  • Show how to use the Jupyter Notebook Data Science Docker stack

In this video, we will show you how to work with Jupyter Notebooks.

  • Show how to navigate cells

  • Show how to read documentation and run shell commands from a notebook

  • Show how to work with a sample notebook for analyzing life expectancies

In this video, we explain how to publish finished Jupyter Notebooks.

  • Explain the different notebook formats

  • Show how some of these formats can be obtained

  • Export the example notebook

In this section, we examine the Chicago crime dataset and show how to download and import it using Pandas.

  • Explain and download the Chicago crime dataset

  • Examine the dataset format in Jupyter Notebook

  • Choose the necessary options to successfully read it as a Pandas DataFrame
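
As a rough sketch of this import step, the snippet below reads a few made-up rows shaped like a crime log into a DataFrame. The column names (ID, Date, Primary Type, Arrest), the date format, and the sample rows are assumptions for illustration, not the actual dataset schema or the course's code. With a downloaded file, the StringIO object would be replaced by the file path.

```python
import io
import pandas as pd

# Hypothetical sample rows; the real file's columns and formats may differ.
sample_csv = io.StringIO(
    "ID,Date,Primary Type,Arrest\n"
    "1,01/05/2017 10:30:00 PM,THEFT,false\n"
    "2,01/06/2017 08:15:00 AM,BATTERY,true\n"
    "3,02/11/2017 01:00:00 PM,THEFT,false\n"
)

# Use the record ID as the row index instead of a generated 0..n-1 index.
crimes = pd.read_csv(sample_csv, index_col="ID")

# Parse the timestamp strings explicitly so that date filtering works later.
crimes["Date"] = pd.to_datetime(crimes["Date"], format="%m/%d/%Y %I:%M:%S %p")

# Boolean indexing: keep only the rows whose type is THEFT.
thefts = crimes[crimes["Primary Type"] == "THEFT"]
print(thefts)
```

Giving an explicit format string avoids ambiguous day/month guesses, which is one of the options worth checking when a file does not load as expected.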

We will examine the core data structures available in Pandas.

  • Examine the 1D Series data structure

  • Examine the 2D DataFrame data structure

  • Explore the Pandas API for said data structures in a Jupyter Notebook
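
To make the two structures concrete, here is a small sketch with made-up numbers (the crime types and counts are illustrative assumptions, not figures from the dataset). It builds a Series, combines two Series into a DataFrame, and shows that selecting a single row gives back a Series.

```python
import pandas as pd

# A Series is a 1D labelled array: values plus an index.
counts = pd.Series([120, 85, 40], index=["THEFT", "BATTERY", "ASSAULT"])
arrests = pd.Series([12, 30, 9], index=counts.index)

# A DataFrame is a 2D table whose columns are Series sharing one index.
df = pd.DataFrame({"count": counts, "arrests": arrests})

# Column arithmetic is element-wise and aligned on the index.
df["arrest_rate"] = df["arrests"] / df["count"]

top = counts.idxmax()            # label of the largest value
battery_row = df.loc["BATTERY"]  # one row of a DataFrame is itself a Series
print(df)
```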

In this video, we will learn about Pandas hierarchical indexes and apply them to visually explore the crime dataset.

  • Examine the Pandas MultiIndex for hierarchically indexed data

  • Show a MultiIndex example in Jupyter Notebook

  • Use a MultiIndex to restructure the crime dataset and visualize it
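
One common way a MultiIndex arises is by grouping on two keys. The sketch below uses a tiny made-up table (the year and type columns are assumed names) to show the grouped result and how unstack turns the inner index level into columns, a shape that is convenient for plotting one line per crime type.

```python
import pandas as pd

# Hypothetical records: one row per reported incident.
records = pd.DataFrame({
    "year": [2016, 2016, 2016, 2017, 2017],
    "type": ["THEFT", "THEFT", "BATTERY", "THEFT", "BATTERY"],
})

# Grouping by two keys yields a Series with a two-level MultiIndex.
by_year_type = records.groupby(["year", "type"]).size()

# A tuple selects one entry across both levels.
thefts_2016 = by_year_type.loc[(2016, "THEFT")]

# unstack() pivots the inner level into columns: rows are years, columns are types.
table = by_year_type.unstack("type", fill_value=0)
print(table)
```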

We explain how to add basic interactivity to a Jupyter Notebook.

  • Explain what interactive widgets are

  • Create an example interactive widget using our crimes dataset

  • Show where to find more examples of interactive widgets

In this video, we will learn what scraping is and why it's important.

  • Explain what unstructured data is

  • Explain the different data formats and their differences: CSV, Excel, REST APIs, plain websites, scanned PDFs...

This video will teach you how to scrape data from a REST API.

  • Explain the weather API

  • Show how to set up the API key to download the data

  • Cover using Requests to fetch data from a REST API

This video takes the last example further to import the downloaded REST data into pandas.

  • Show how to convert a Python dict provided to us by Requests into a pandas DataFrame

  • Show how to iterate over multiple API requests to download all the data chunks

  • Combine data chunks into a singular Chicago weather DataFrame
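
The conversion and combination steps might look roughly like the sketch below. The payloads are hypothetical stand-ins for parsed JSON responses (the observations, date, and temp_c keys are assumptions, not the real weather API's schema), so no network call is made here.

```python
import pandas as pd

# Hypothetical payloads, shaped like the dicts returned by response.json()
# for two consecutive API requests.
chunks = [
    {"observations": [
        {"date": "2017-01-01", "temp_c": -3.5},
        {"date": "2017-01-02", "temp_c": -1.0},
    ]},
    {"observations": [
        {"date": "2017-01-03", "temp_c": 2.5},
    ]},
]

# A list of dicts becomes rows; the dict keys become column names.
frames = [pd.DataFrame(chunk["observations"]) for chunk in chunks]

# Stack the chunks into one table, then index it by parsed date.
weather = pd.concat(frames, ignore_index=True)
weather["date"] = pd.to_datetime(weather["date"])
weather = weather.set_index("date").sort_index()
print(weather)
```

Collecting the per-request frames in a list and concatenating once at the end is usually cheaper than appending to a growing DataFrame inside the loop.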

In this video, we will show a more difficult example of scraping data from an unstructured website.

  • Show the website we will be using to fetch the Chicago weather data

  • Show how to use BeautifulSoup to download the website and parse the HTML

  • Show how to convert the parsed HTML object into a pandas DataFrame
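
A minimal sketch of the parse-and-convert steps follows. The HTML string is a made-up stand-in for a downloaded page (the table id and column names are assumptions), and the built-in html.parser backend is used so that no extra parser needs to be installed.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the text of a downloaded page.
html = """
<table id="weather">
  <tr><th>Date</th><th>High</th></tr>
  <tr><td>2017-01-01</td><td>31</td></tr>
  <tr><td>2017-01-02</td><td>28</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
rows = soup.find("table", id="weather").find_all("tr")

# The first row holds the header cells, the remaining rows hold the data.
header = [cell.get_text(strip=True) for cell in rows[0].find_all("th")]
data = [
    [cell.get_text(strip=True) for cell in row.find_all("td")]
    for row in rows[1:]
]

table = pd.DataFrame(data, columns=header)
table["High"] = table["High"].astype(int)  # scraped cells arrive as strings
print(table)
```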

In this video, we will learn what information-dense visualisations are.

  • Explain data visualisation as visual storytelling

  • Talk about Edward Tufte's books and website

  • Explain Charles Joseph Minard's excellent map

This section explains how to visualise scatter plots for examining data correlation.

  • Explain time series components

  • Show how to plot a scatter plot

  • Explain data correlation a bit better

This video introduces linear regression as a way to model relationships in the data.

  • Explain what linear regression is

  • Explain how modeling real-world behavior relates to general scientific research

  • Show how to create a linear model using linear regression in Python
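
As one possible illustration, a straight-line model can be fitted with NumPy's least-squares polynomial fit. The numbers below are made up, and the course may well use a different library for this step, so treat it as a sketch of the idea rather than the course's method.

```python
import numpy as np

# Made-up example data: daily temperature and an incident count.
temperature = np.array([0.0, 5.0, 10.0, 15.0, 20.0])
incidents = np.array([50.0, 61.0, 69.0, 80.0, 90.0])

# Fit incidents ~ slope * temperature + intercept by ordinary least squares.
slope, intercept = np.polyfit(temperature, incidents, deg=1)

# Compare the model's predictions with the observed values.
predicted = slope * temperature + intercept
residuals = incidents - predicted
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

With an intercept term in the model, the residuals of a least-squares fit sum to zero, which is a quick sanity check on the result.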

In this video, we will look at correlation matrices as a way to compare many variables at once.

  • Explain why correlation matrices are useful

  • Show how to create a correlation matrix in Python
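
In pandas, a correlation matrix is one method call on a numeric DataFrame. The sketch below uses deliberately artificial, perfectly linear columns (assumed names and values) so that the expected coefficients of +1 and -1 are easy to verify by eye.

```python
import pandas as pd

# Artificial data: crimes rises linearly with temperature, snow_cm falls.
daily = pd.DataFrame({
    "temperature": [0, 5, 10, 15, 20],
    "crimes": [10, 20, 30, 40, 50],
    "snow_cm": [50, 40, 30, 20, 10],
})

# Pairwise Pearson correlation between every pair of columns.
corr = daily.corr()
print(corr.round(2))
```

The result is symmetric with ones on the diagonal, so only the cells above or below the diagonal carry information.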

See why maps are helpful.

  • Talk about spatial data

  • Talk about John Snow's London cholera outbreak map and how it was helpful

  • Mention I Quant NY's visual storytelling

See how we can build a map from our dataset.

  • Explain how data layers can be overlaid on maps

  • Show how to use Basemap to tile a map

  • Show how to zoom into a specific area of the map and overlay the data there

In this section, we talk about adding interactivity to our map using Plotly.

  • Set up Plotly/Mapbox API keys

  • Show how to draw points on the Plotly map

  • Show how to render roads on a Plotly map

Closing words for the course.

  • Summarize what was learned

  • Suggest some possible next steps for the viewer

  • Instructions for feedback

Traffic lights

Read about what's good, what should give you pause, and possible dealbreakers:
Uses Jupyter Notebook, a standard tool among data scientists, for interactive data exploration and visualization, which is highly relevant to the field
Covers common scientific Python tools such as pandas, matplotlib, and plotly, which are essential for data analysis and visualization
Explores real datasets, such as crime and traffic accidents in New York City, to explore common issues such as data scraping and cleaning
Taught by Dražen Lucanin, who has extensive experience in web applications and data analysis using Python, JavaScript, and other technologies
Uses Jupyter 5.4.1, which, while not the latest version available, still provides relevant and informative content for data science enthusiasts
Teaches data scraping from websites and REST APIs, which may violate some sites' terms of service or be restricted in certain jurisdictions

Reviews summary

Practical data analysis with Jupyter

According to learners, this course provides a solid foundation in using Jupyter Notebook for data science tasks. Students find it covers essential tools like Pandas for analysis and Matplotlib and Plotly for visualization. The focus on practical application using real datasets and skills like data scraping is particularly appreciated. Learners gain insights into specialized areas such as time series and geographic data analysis, enabling them to create interactive reports and visualizations. A minor point to note is that the course uses an older version of Jupyter. Overall, it seems well-suited for those looking to build practical data analysis skills using Python and Jupyter.
Explores time series and geographic data.
"The module on geographic data was insightful and practical."
"Handling time series data with Pandas was covered well."
Focuses on real-world data tasks.
"Using the Chicago crime dataset was a great practical exercise."
"Learning data scraping from actual websites was very useful."
"Applied the concepts directly to real data analysis."
Covers essential data science tools.
"I learned the basics of Jupyter and Pandas effectively."
"Good overview of Matplotlib and Plotly for visualization."
"The section on core Pandas data structures was helpful."
Uses an older Jupyter version.
"It uses Jupyter 5.4.1 which isn't the latest."
"Note the version of Jupyter Notebook is not current."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Jupyter Notebook for Data Science with these activities:
Review Python Fundamentals
Solidify your understanding of Python syntax, data structures, and control flow to ensure a smooth learning experience with data analysis libraries.
  • Review basic Python syntax and data types.
  • Practice writing simple Python functions.
  • Work through basic Python tutorials online.
Review 'Python for Data Analysis' by Wes McKinney
Deepen your understanding of Pandas, a core library used extensively in the course, by studying this comprehensive guide.
  • Read the chapters on Pandas data structures and data cleaning.
  • Work through the examples provided in the book.
  • Apply the techniques learned to a sample dataset.
Practice Pandas Exercises
Reinforce your Pandas skills by working through a series of practical exercises focused on data manipulation and analysis.
  • Find a set of Pandas exercises online.
  • Work through the exercises, focusing on data cleaning and transformation.
  • Compare your solutions with the provided answers.
Analyze a Public Dataset with Jupyter Notebook
Apply the skills learned in the course by analyzing a public dataset using Jupyter Notebook, focusing on data cleaning, exploration, and visualization.
  • Select a public dataset from a source like Kaggle or the UCI Machine Learning Repository.
  • Create a Jupyter Notebook to load, clean, and explore the data.
  • Generate visualizations to highlight key trends and insights.
  • Write a summary of your findings and conclusions.
Create a Data Visualization Portfolio
Showcase your data visualization skills by creating a portfolio of Jupyter Notebook projects that demonstrate your ability to create insightful and engaging visualizations.
  • Choose three datasets that interest you.
  • Create a Jupyter Notebook for each dataset, focusing on data exploration and visualization.
  • Write a brief description of each project and its key findings.
  • Publish your portfolio online using GitHub Pages or a similar platform.
Review 'Storytelling with Data' by Cole Nussbaumer Knaflic
Improve your data visualization skills by learning the principles of effective data storytelling.
  • Read the chapters on choosing the right chart type and eliminating clutter.
  • Analyze examples of good and bad data visualizations.
  • Apply the principles learned to your own data visualizations.
Follow Advanced Plotly Tutorials
Expand your knowledge of interactive mapping and data visualization by following advanced tutorials on Plotly.
  • Search for advanced Plotly tutorials online.
  • Follow a tutorial on creating interactive dashboards with Plotly.
  • Follow a tutorial on creating custom map visualizations with Plotly.

Career center

Learners who complete Jupyter Notebook for Data Science will develop knowledge and skills that may be useful to these careers:
Data Scientist
A data scientist uses advanced analytical techniques and machine learning to extract insights from data and solve complex problems with the help of tools like Jupyter Notebook. This role involves data cleaning, model building, and creating visualizations. This course specifically covers using Jupyter Notebook for data analysis with Pandas, creating interactive visualizations, scraping data from the web, and analyzing both time series and geographic data. Data scientists often perform these tasks, making this course extremely valuable. This course is highly relevant to any aspiring data scientist.
Data Analyst
A data analyst interprets data and identifies trends using tools like Jupyter Notebook. This role involves cleaning, processing, and visualizing data to provide insights for decision-making, often working with diverse datasets to uncover valuable information. This course helps a data analyst by providing a comprehensive overview of data analysis using Jupyter Notebook, including data manipulation with Pandas, data visualization techniques, and web scraping methods, all of which are fundamental skills in this field. The course teaches you how to approach a new dataset, how to analyze it in Jupyter Notebook, and how to extract useful information, which are tasks performed by a data analyst.
Web Scraper
A web scraper extracts data from websites using specialized tools and programming techniques. This role requires experience in parsing web content and handling different data formats (CSV, JSON, etc.). This course is very helpful for a web scraper as it provides instruction on scraping data from REST APIs, parsing HTML content, and converting scraped data into a Pandas DataFrame within Jupyter Notebook. These are all core tasks performed by a web scraper. The course teaches how to handle unstructured data and convert it into a usable structured format.
Data Visualization Specialist
A data visualization specialist creates graphical representations of data to make complex information more accessible and understandable. They use tools like Jupyter Notebook to create interactive dashboards and visual stories. This role requires not only analytical skills but also a strong understanding of design. This course is very helpful for a data visualization specialist because it focuses on how to create information-dense data visualizations, how to use scatter plots to examine data correlation, how to create maps, and how to add interactivity to visualizations. These are some of the core skills they must possess.
Statistical Analyst
A statistical analyst applies statistical techniques, often using tools like Jupyter Notebook, to analyze data and provide insights that support decision making. This role involves building statistical models, interpreting findings from datasets, and creating reports. This course helps build a foundation for a statistical analyst. The course focuses heavily on data analysis using tools like Pandas and Jupyter Notebook. Its lessons on cleaning and exploring data will be useful to a statistical analyst.
Geospatial Data Analyst
A geospatial data analyst works with geographic data to create maps and perform spatial analysis. They use tools, including Jupyter Notebook, to analyze and visualize spatial datasets for urban planning, environmental studies, and other applications. This course may be useful because it specifically introduces how to analyze geographic data with Jupyter Notebook, build maps from datasets, and add interactivity to those maps. The course specifically helps a geospatial data analyst to build foundational skills necessary for working with spatial data.
Research Assistant
A research assistant works alongside researchers in a variety of fields, often utilizing data analysis and visualization skills. This role supports research projects through data collection, analysis, and presentation, sometimes using tools like Jupyter Notebook to manage complex datasets and generate reports. This course may be very helpful for a research assistant as it allows you to build a foundation in data analysis using Jupyter Notebook. The course's emphasis on data cleaning, data exploration, and interactive visualization is directly applicable to the tasks performed by a research assistant, and it will allow you to work with multiple data sets.
Market Research Analyst
A market research analyst studies market trends and consumer behavior to advise companies on product development, pricing, and marketing strategy. They perform data analysis and generate reports based on their research. Using Jupyter Notebook to analyze market data, this role benefits from a familiarity with data visualization techniques and data analysis. This course is a great starting point. It teaches the fundamentals of how to use Jupyter Notebook to clean, explore, and analyze data, and will assist market research analysts by teaching them how to visually represent their findings.
Bioinformatician
A bioinformatician analyzes biological data, often using computational tools. This role combines biology with data science to process and interpret genomic and other biological information. Bioinformaticians use analysis tools including Jupyter Notebook to examine biological datasets, visualize data, and create reports. This course helps bioinformaticians by providing a foundation in data analysis, data visualization and data exploration using tools like Jupyter Notebook, pandas, and matplotlib. Its introduction to exploratory analysis and interactive visualizations is valuable for bioinformaticians. A bioinformatician needs strong data analysis skills, skills that this course helps to build.
Business Intelligence Analyst
A business intelligence analyst transforms data into actionable insights to inform business strategy. They use tools like Jupyter Notebook to analyze market trends, customer behavior, and operational performance. The role involves creating reports and dashboards that provide stakeholders with a clear understanding of key performance indicators. This course may be helpful because it specifically provides instruction in how to explore data using Jupyter Notebook and how to create interactive reports and visualizations. The course also demonstrates how to combine datasets and make comparisons between them, which is a vital skill for a business intelligence analyst.
Research Scientist
A research scientist conducts research in various scientific fields, often utilizing data analysis and visualization skills. They design experiments, collect data, and analyze results using tools such as Jupyter Notebook. This role often requires a doctoral degree. This course may be useful for a research scientist because it provides instruction in how to use Jupyter Notebook for data manipulation and visualization, which are critical skills in this field. The course’s emphasis on extracting useful information from data, exploring common data issues, and creating insightful visualizations is highly relevant to the work of a research scientist.
Quantitative Analyst
A quantitative analyst, often working in finance, develops mathematical and statistical models to analyze financial markets and risk. They use programming languages to process and interpret large datasets. While there is overlap with data science, this role is more focused on model building and analysis. This course may be useful to a quantitative analyst as it teaches the foundations of data analysis with tools such as Jupyter Notebook, Pandas, and Matplotlib. The course's focus on real-world datasets, data cleaning, and interactive data visualizations is a valuable starting point.
Machine Learning Engineer
A machine learning engineer develops and implements machine learning algorithms and models. They use various tools, including data analysis techniques, to build and deploy these models. This role often requires experience with data processing, model evaluation, and deployment. While this course may not delve into the intricacies of machine learning algorithms, it helps build a foundation in data analysis using Jupyter Notebook, which is an important preliminary step for any machine learning engineer. The course's focus on data exploration, visualization and analysis assists machine learning engineers in this process, which makes this course relevant.
Financial Analyst
A financial analyst utilizes financial data to assess investments, track financial performance, and provide projections. This role requires strong analytical skills to assist with financial planning, budgeting, and forecasting. While this course might not directly cover financial analysis techniques, it introduces fundamental data analysis skills using Jupyter Notebook, data cleaning techniques, and data visualization. These skills are valuable to a financial analyst, and this course may be useful. They are skills that help a professional approach a new dataset, analyze it, and extract useful information.
Operations Analyst
An operations analyst seeks to optimize a company’s business processes in order to improve efficiency and reduce costs. Using data analysis, they identify areas for improvement and implement solutions. Using tools like Jupyter Notebook, an operations analyst also monitors processes to ensure that they meet key performance indicators. This course may be useful for an operations analyst because it introduces key concepts in data analysis with tools like Jupyter Notebook and Pandas. The course teaches how to approach new datasets to extract useful information, which are all useful skills for this role.

Reading list

We've selected two books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Jupyter Notebook for Data Science.
A comprehensive guide to using Pandas for data manipulation, analysis, and visualization. It provides in-depth coverage of Pandas data structures, data cleaning techniques, and data aggregation methods. It is a valuable resource for anyone looking to master Pandas and apply it to real-world data science problems. This book is commonly used as a textbook at academic institutions.
Focuses on the principles of effective data visualization and communication. It teaches you how to choose the right chart type, eliminate clutter, and focus your audience's attention on the key insights. It is a valuable resource for anyone looking to improve their data storytelling skills. This book is more valuable as additional reading than as a current reference.

© 2016 - 2025 OpenCourser