We may earn an affiliate commission when you visit our partners.
Genevieve M. Lipp, Nick Eubank, Kyle Bradbury, and Andrew D. Hilton

Put the keystone in your Python data science skills by becoming proficient with Data Visualization and Modeling. This course is suited to intermediate programmers with some experience in NumPy and Pandas who want to expand their skills for any career in data science. Whether you come to data science from the social sciences and statistics or from a programming background, this course will integrate the two perspectives and offer unique insights from each.

You’ll begin by becoming adept with matplotlib, an essential plotting library in Python that will enable you to discover and communicate insights about data effectively. You’ll progress to classification algorithms by creating a K-Nearest Neighbors (KNN) classifier, a foundational algorithm used in data science and machine learning. Finally, you will write Python programs that leverage your newfound data science skills based on inferential statistics, and be able to describe relationships between variables in your data.
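To give a feel for the classifier mentioned above, here is a minimal K-Nearest Neighbors sketch in plain Python. It is not the course's implementation: the function name `knn_predict`, the toy points, and the labels are all made up for illustration, and it assumes Euclidean distance with a simple majority vote.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point, paired with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbors' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical, for illustration only): two loose clusters.
points = [(1.0, 1.2), (0.8, 0.9), (1.3, 1.0), (4.0, 4.2), (4.5, 3.8), (3.9, 4.4)]
labels = ["low", "low", "low", "high", "high", "high"]

prediction_near_low = knn_predict(points, labels, (1.1, 1.1), k=3)
prediction_near_high = knn_predict(points, labels, (4.2, 4.0), k=3)
print(prediction_near_low, prediction_near_high)
```

An odd `k` avoids tied votes when there are two classes; in practice `k` is usually chosen by checking accuracy on held-out data.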

By the end of the course, you’ll be able to quickly visualize a dataset, explore it for insights, determine relationships between data, and communicate it all with effective plots. In the last module of this course, you’ll produce a publication-quality figure based on data that you’ve prepared and cleaned yourself: the first artifact in your data science portfolio.

Throughout this course you’ll get plenty of hands-on experience through interactive programming assignments, live coding demos from data scientists, and analysis of the data behind important real-world problems (like carbon emissions, real estate prices, and infant mortality). Guided activities throughout each module will reinforce your proficiency with data science techniques and your analytical approach as a data scientist.

Solidify your understanding of these critical data science concepts and begin your data science portfolio by mastering visualization and modeling. Start this integrative and transformative learning journey today!

What's inside

Syllabus

Plotting
In this module, you will learn about plotting in Python, an important technique for exploring a dataset and an indispensable tool for communicating insights. We’ll learn to make the most common types of plots used in data science, including basics like line, bar, and scatter plots, as well as more advanced plot types such as histograms and heatmaps. We’ll learn both how to make these plots and how to customize them for your needs using matplotlib, a core plotting library for Python that serves as the backbone for many Python plotting tools. You’ll learn how to create professional, accessible, and information-rich plots, which you will use to quickly identify trends in data that would otherwise be difficult to recognize. We've also included some optional additional readings if you want to further enhance your learning!
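As a rough illustration of the kind of plotting this module describes, the sketch below draws a line plot and a histogram side by side with matplotlib and applies a few common customizations. The data values, labels, and output filename are invented for the example, and the `Agg` backend is chosen only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt

# Made-up values for illustration only.
years = [2018, 2019, 2020, 2021, 2022]
values = [3.1, 3.4, 2.9, 3.6, 3.8]

# One figure with two side-by-side axes.
fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(9, 3.5))

# Line plot with markers, axis labels, a title, and a legend.
ax_line.plot(years, values, marker="o", color="tab:blue", label="series A")
ax_line.set_xlabel("Year")
ax_line.set_ylabel("Value (arbitrary units)")
ax_line.set_title("Line plot")
ax_line.legend()

# Histogram of the same values.
ax_hist.hist(values, bins=3, color="tab:orange", edgecolor="black")
ax_hist.set_xlabel("Value (arbitrary units)")
ax_hist.set_ylabel("Count")
ax_hist.set_title("Histogram")

fig.tight_layout()
fig.savefig("example_plots.png", dpi=150)  # hypothetical output filename
```

Working with explicit `fig` and `ax` objects, rather than the implicit `plt.plot` state, tends to make multi-panel figures easier to customize.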

Traffic lights

Read about what's good, what should give you pause, and possible dealbreakers.
Expands skills for any career in data science, integrating perspectives from social sciences, statistics, and programming backgrounds, offering unique insights from each
Uses matplotlib, an essential plotting library in Python, enabling learners to discover and communicate insights about data effectively, which is crucial for data scientists
Develops a K-Nearest Neighbors (KNN) classifier, a foundational algorithm used in data science and machine learning, providing a strong base for further study
Culminates in producing a publication-quality figure based on data that learners have prepared and cleaned themselves, creating the first artifact in a data science portfolio
Requires learners to explore and prepare 4 datasets and merge them into a composite dataset that they’ll plot, which may require additional software or tools
Taught by Duke University, which is known for its groundbreaking research and innovation in many fields, including computer science and data science

Reviews summary

Practical data visualization and modeling in Python

According to learners, this course provides a solid introduction to data visualization and modeling using Python, particularly highlighting the practical hands-on projects and coding exercises that help build a data science portfolio. Students appreciate the coverage of matplotlib for plotting and the introduction to K-Nearest Neighbors. However, some learners with less background in math or statistics found the difficulty level challenging, especially in later sections like regression, suggesting prerequisites might be underestimated for some. Overall, it's seen as a useful stepping stone for intermediate programmers entering data science, though it may require supplementary learning for deeper theoretical understanding.
Good for entering data science field.
"This course is a great stepping stone for intermediate programmers looking to pivot into data science."
"It provided practical tools and strategies that I could apply immediately to my work."
"Helped solidify my understanding of core data science concepts needed for a career."
"The skills learned are definitely relevant for entry-level data analysis or data science roles."
Clear and effective introduction to plotting.
"The module on matplotlib was excellent and taught me how to create professional-quality plots effectively."
"I found the data visualization section to be particularly well-explained and useful."
"Learning to use matplotlib enabled me to discover and communicate insights about data effectively."
"The plotting module was a great starting point for understanding data representation."
Course projects are practical and helpful.
"The hands-on coding and projects are the strongest part of the course for me; I learned by doing."
"I really enjoyed the practical assignments and the final project which helped build my portfolio."
"The guided activities throughout each module reinforced my proficiency with data science techniques."
"Getting to analyze real-world data like carbon emissions was very engaging and practical."
Regression section could be clearer.
"The regression module was the weakest part; I struggled to fully understand the implementation details."
"I wished the linear regression section had more detailed explanations and examples."
"Understanding the difference between prediction and inference was clear, but implementing linear regressions felt rushed."
"Needed extra resources to understand the regression concepts properly."
Pace increases significantly later on.
"The first module was great, but the difficulty jumped significantly starting with prediction and regression."
"I felt lost in the later modules; the concepts became much harder and the pace felt faster."
"While the plotting section was accessible, the shift to predictive algorithms and regression was abrupt."
"The later parts required a lot more outside reading to fully grasp."
Requires solid math/stats background.
"I found the course challenging without a strong statistics or linear algebra background; some concepts weren't fully explained."
"Be prepared for the math! While programming is covered, understanding the underlying statistical concepts is key."
"The course description says 'intermediate programmers', but I think solid foundational math/stats is also needed."
"Some parts felt difficult because the statistical theory wasn't covered in enough depth for me."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Data Visualization and Modeling in Python with these activities:
Review Pandas DataFrames
Reinforce your understanding of Pandas DataFrames, a fundamental data structure used extensively in the course for data manipulation and analysis.
  • Review the Pandas documentation on DataFrames.
  • Practice creating, manipulating, and querying DataFrames using sample datasets.
  • Work through examples of common DataFrame operations like filtering, sorting, and grouping.
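The DataFrame operations named in the steps above (filtering, sorting, and grouping) can be practiced on a tiny table like the one below. The column names and values are invented for illustration; any small sample dataset would do.

```python
import pandas as pd

# Made-up listings for illustration only.
homes = pd.DataFrame(
    {
        "city": ["Durham", "Durham", "Raleigh", "Raleigh", "Cary"],
        "sqft": [1200, 1850, 1500, 2400, 2000],
        "price": [250_000, 390_000, 320_000, 510_000, 450_000],
    }
)

# Filtering: keep only the rows that satisfy a condition.
large_homes = homes[homes["sqft"] >= 1500]

# Sorting: order rows by a column, most expensive first.
by_price = homes.sort_values("price", ascending=False)

# Grouping: aggregate a column within each group.
mean_price_by_city = homes.groupby("city")["price"].mean()

print(large_homes)
print(by_price.head(3))
print(mean_price_by_city)
```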
Review 'Python Data Science Handbook'
Deepen your understanding of Python data science libraries and techniques by studying a comprehensive handbook.
  • Read the chapters relevant to plotting, prediction, and regression.
  • Work through the examples provided in the book.
  • Experiment with different datasets and techniques.
Matplotlib Plotting Exercises
Improve your Matplotlib skills by completing a series of plotting exercises that cover different plot types and customization options.
  • Find a set of Matplotlib exercises online or create your own.
  • Practice creating various plot types, such as line plots, scatter plots, bar charts, and histograms.
  • Experiment with different customization options, such as colors, labels, and titles.
Review 'Storytelling with Data'
Improve your data storytelling skills by learning how to create compelling and informative visualizations.
  • Read the book and take notes on the key concepts.
  • Analyze examples of good and bad data visualizations.
  • Apply the principles to your own visualizations.
Analyze and Visualize Real Estate Data
Apply your data visualization and modeling skills to analyze a real-world dataset of real estate prices and create insightful visualizations.
  • Find a publicly available dataset of real estate prices.
  • Clean and prepare the data for analysis.
  • Create visualizations to explore the relationships between different variables, such as location, size, and price.
  • Build a regression model to predict real estate prices based on these variables.
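For the last step, a single-variable linear regression is a reasonable starting point. The sketch below fits a line by ordinary least squares in plain Python; the function name `fit_line` and the size and price values are hypothetical, and a real dataset would call for more variables and a proper library.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # slope = (sum of co-deviations of x and y) / (sum of squared deviations of x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data for illustration: price rises by exactly 200 per square foot.
sqft = [1000, 1500, 2000, 2500]
price = [200_000, 300_000, 400_000, 500_000]

intercept, slope = fit_line(sqft, price)
predicted = intercept + slope * 1800  # predicted price for an 1800 sq ft home
print(intercept, slope, predicted)
```

The slope is the quantity of interest for inference (how price changes with size), while the fitted line as a whole is what you would use for prediction.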
Follow Advanced Matplotlib Tutorials
Refine your Matplotlib skills by following advanced tutorials that cover topics such as custom plot styles, interactive plots, and 3D visualizations.
  • Search for advanced Matplotlib tutorials online.
  • Choose a tutorial that covers a topic you are interested in.
  • Follow the tutorial step-by-step and experiment with the code.
Create a Data Visualization Portfolio
Showcase your data visualization skills by creating a portfolio of your best plots and visualizations.
  • Select your most compelling and informative visualizations from the course and your own projects.
  • Write a brief description for each visualization, explaining the data and the insights it reveals.
  • Organize your visualizations into a portfolio website or presentation.

Career center

Learners who complete Data Visualization and Modeling in Python will develop knowledge and skills that may be useful to these careers:
Data Scientist
A Data Scientist uses programming, statistical analysis, and visualization to extract knowledge and insights from data, and this course helps you build a foundation for that role. This course is a great fit because it teaches the fundamentals of plotting with matplotlib, a core skill for data scientists to explore and communicate effectively. It also leads you to build algorithms like KNN for classification and regression, two essential techniques for prediction models. Moreover, this course emphasizes the importance of creating publication-quality visualizations that can communicate key findings, a critical skill for any data scientist.
Quantitative Analyst
A Quantitative Analyst uses mathematical and statistical models to solve problems in finance or other industries, and this course helps build a foundation for that role. This course teaches skills in data visualization and modeling, using Python and matplotlib, and develops your grasp of foundational algorithms such as the K-Nearest Neighbors classifier. You will also learn to use regression models to find relationships between variables in data. The course’s focus on creating publication-quality figures from data you’ve cleaned and prepared yourself can set you up for success in this career path.
Operations Research Analyst
Operations Research Analysts use mathematical and analytical methods to help organizations make better decisions. Skills in data analysis, modeling, and communication are very important to this role. This course teaches data visualization with matplotlib, a core skill to explore and present data. You will also learn to implement statistical models such as regression and KNN. The course’s final project, which involves merging data from different sources to create an impactful visualization, helps you show your ability to use data techniques in a practical context.
Machine Learning Engineer
Machine Learning Engineers develop and implement machine learning models, and this course helps you begin to explore this role. This course introduces classification algorithms, including the K-Nearest Neighbors algorithm, and teaches you how to evaluate a predictive algorithm's accuracy. You will get hands-on experience building your own models. The emphasis on data visualization using matplotlib will help you explore and communicate your findings, which is a very important part of debugging and improving machine learning models. This course provides the foundational elements of data analysis and model building that form the basis for a career as a machine learning engineer.
Market Research Analyst
Market Research Analysts use data to research what products consumers want, and a strong data toolkit is needed for this role. A central component of this role is data visualization. This course will train you to create informative plots using matplotlib. You will learn to explore a dataset, extract key trends, and communicate your findings, which are all important steps in a market research analysis. This course goes a step further by teaching how to create publication-quality figures, improving your presentation abilities. This will give you tools that are very pertinent to a market research analyst.
Bioinformatician
Bioinformaticians develop and apply computational tools to analyze biological data, and this course helps you understand the methods used in this kind of work. This course teaches you to create visualizations, which are useful for exploring complex datasets like those found in biology. Learning to implement regression models using statistical inference is useful if you work with genomics data, and you will also learn to develop models, including the K-Nearest Neighbors algorithm. This course provides the fundamental skills required for success as a bioinformatician.
Business Intelligence Analyst
Business Intelligence Analysts use data to improve business practices. This role is all about analyzing and visualizing data to make better decisions. The matplotlib-based plotting techniques in this course help you create visualizations to communicate insights. The course teaches methods of exploring a dataset, inferring relationships between variables, and communicating your findings. The course’s final project focuses on producing publication-quality figures, which are an essential part of creating reports for a team. You will have an easier time communicating data to business leaders, and will be better positioned for success in this career.
Epidemiologist
Epidemiologists study the patterns and causes of diseases in populations. To be successful in this profession, quantitative research skills are necessary, and this course may be useful for that. This course will help you build a foundation in data visualization with matplotlib. It will teach you to implement statistical models, including regression techniques, that are commonly used in epidemiology. The hands-on experience in analyzing real-world data, such as carbon emissions, real estate prices, and infant mortality, will be exceptionally useful for a future epidemiologist.
Data Analyst
Data Analysts interpret data to identify trends and insights that drive business decisions, and this course may be helpful for entering that type of career. A core component of this role is data visualization, and this course teaches you how to create effective plots using matplotlib. Through this course, you will also learn to use inferential statistics that help in data analysis. The course's final project, in which you visualize the relationship between income and greenhouse emissions, demonstrates to any hiring manager that you can explore a dataset, extract key trends, and communicate your findings through accessible plots, all of which are key responsibilities of a data analyst.
Research Scientist
Research scientists conduct experiments and analyze data to further scientific knowledge, and this course may be helpful to begin that career path. This course will help you to visualize data with matplotlib, which is a useful tool for exploring scientific data and communicating findings. The regression module of this course will teach you to find relationships between variables in a dataset, which may be useful for research. The course’s final project involves merging data from multiple datasets and then visualizing the relationships for publication, which is very relevant to research.
Academic Researcher
Academic Researchers conduct in-depth research into particular ideas of interest. This course may be helpful for those seeking a career in this field. This course will teach you to use Python and matplotlib to make visualizations, which can be useful for exploring data. The lessons on building statistical models, such as the KNN algorithm, can help you analyze data. The final project of creating a publication-quality visualization is crucial for communicating findings, which every researcher needs to practice.
Statistician
Statisticians analyze and interpret numerical data using statistical methods, and this course may be useful for those seeking to enter this profession. This course will teach you to use programming to find relationships between variables, and implement regression models. Learning to visualize these relationships with matplotlib is also very useful for a statistician. The course integrates perspectives from statistics and programming, which helps form a unique perspective. The final project of creating a visualization from a complex dataset allows you to demonstrate your grasp of inferential statistics, which is essential for any statistician.
Actuary
Actuaries analyze the financial consequences of risk, and a very strong quantitative background is needed for this role. This course may be helpful for building basic skills in the area. This course will teach you to use Python and matplotlib to visualize data, which is useful for actuaries examining relationships in financial data. You will implement statistical modeling techniques such as regression, which are frequently used in actuarial models. The focus on creating professional plots using real-world data may assist you in communicating findings to a team.
Financial Analyst
A financial analyst provides guidance to businesses and individuals around investment decisions, and this course may be helpful for new learners in this field. This course will help you learn the basics of data visualization using matplotlib, allowing you to understand relationships in financial data. You will learn to perform regression analysis, a key skill for financial modeling. While the course does not use financial data explicitly, the techniques are transferable to this field. The final project, which involves creating visualizations, can be useful in presenting findings.
Urban Planner
Urban planners develop plans for the development of cities, and this course may be useful for those seeking a career in this field. The course will help you learn to use matplotlib to visualize data, which is essential for analyzing demographic trends and urban development. The course will teach you how to use data to find relationships between variables, which is useful for understanding urban data. Although the course does not use urban data specifically, the techniques learned are directly transferable to this field.

Reading list

We've selected two books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Data Visualization and Modeling in Python.
'Python Data Science Handbook' provides a comprehensive overview of essential Python data science tools, including NumPy, Pandas, Matplotlib, and Scikit-learn. It serves as an excellent reference for understanding the underlying principles and practical applications of these libraries. The book is particularly helpful for solidifying your understanding of data manipulation, visualization, and modeling techniques. It is commonly used as a textbook in data science courses.
'Storytelling with Data' focuses on the art of communicating data insights effectively through visualizations. It provides practical guidance on choosing the right chart types, designing clear and compelling visuals, and crafting a narrative around your data. This book is more valuable as additional reading than it is as a current reference. It is commonly used by business professionals to improve their data communication skills.

© 2016 - 2025 OpenCourser