We may earn an affiliate commission when you visit our partners.
Joseph Santarcangelo and Yan Luo

Please Note: Learners who successfully complete this IBM course can earn a skill badge — a detailed, verifiable and digital credential that profiles the knowledge and skills you’ve acquired in this course. Enroll to learn more, complete the course and claim your badge!

Now that you've taken several courses on data science and machine learning, it’s time to put your learning to work on a data problem drawn from a real-life scenario. Employers care about how well you can apply your knowledge and skills to real-world problems, and the work you do in this capstone project will help you stand out in the job market.

In this capstone project, you’ll explore data sets in New York’s 311 system, which is used by New Yorkers to report complaints for the non-emergency problems they face. Upon being reported, various agencies in New York get assigned to resolve these problems. The data related to these complaints is available in the New York City Open Dataset. On investigation, one can see that in the last few years the 311 complaints coming to the Department of Housing Preservation and Development in New York City have increased significantly.

Your task is to find out the answers to some of the questions that would help the Department of Housing Preservation and Development in New York City effectively tackle the 311 complaints coming to them. You will need to use the techniques you learned in your previous Python, data science, and machine learning courses, including data ingestion, data exploration, data visualization, feature engineering, probabilistic modeling, model validation, and more.

By the end of this course, you will have used real-world data science tools to create a showcase project that demonstrates to employers that you are job-ready and a strong candidate in the field of data science.

What's inside

Learning objectives

  • Apply your knowledge of data science and machine learning to a real-life scenario
  • Analyze and visualize data using Python
  • Perform a feature engineering exercise using Python
  • Build and validate a predictive machine learning model using Python
  • Create and share actionable insights into real-life data problems

Good to know

Know what's good, what to watch for, and possible dealbreakers
  • Builds a strong foundation for beginners in data science and machine learning
  • Develops and strengthens core skills and tools for data scientists
  • Uses real-world data from the New York City Open Dataset
  • Provides hands-on labs and interactive materials for practical application
  • Taught by instructors with experience in the field of data science
  • Introduces industry-relevant data science and machine learning techniques

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Data Science and Machine Learning Capstone Project with these activities:
Participate in Data Analysis Discussions
Encourages students to engage with peers, exchange ideas, and gain diverse perspectives on data analysis techniques.
  • Attend online or in-person discussion forums to share insights and ask questions.
  • Collaborate on data analysis projects with peers to enhance teamwork and communication skills.
Practice Data Ingestion Techniques
Helps students improve their understanding of data ingestion techniques and their proficiency in using Python for data science tasks.
  • Use Python libraries to read and load data from various sources (e.g., CSV, JSON, SQL).
  • Practice cleaning and transforming data to prepare it for analysis.
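As a practice starting point, the two steps above can be sketched with Python's standard library alone. This is a minimal sketch, not the course's own lab code: the sample records, the `complaint_type` and `borough` column names, and the helper functions `load_csv`, `load_json`, `load_sql`, and `clean` are all hypothetical and do not reflect the real 311 schema.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sample records; column names are illustrative only.
CSV_TEXT = "complaint_type,borough\nHEATING,BRONX\nPLUMBING,BROOKLYN\n"
JSON_TEXT = '[{"complaint_type": "PAINT", "borough": "QUEENS"}]'


def load_csv(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def load_json(text):
    """Parse a JSON array of objects into a list of dicts."""
    return json.loads(text)


def load_sql(connection):
    """Read rows from the (hypothetical) complaints table into a list of dicts."""
    connection.row_factory = sqlite3.Row
    cursor = connection.execute("SELECT complaint_type, borough FROM complaints")
    return [dict(row) for row in cursor]


def clean(records):
    """Strip whitespace, normalize case, and drop rows with a missing field."""
    cleaned = []
    for record in records:
        row = {key: (value or "").strip().upper() for key, value in record.items()}
        if all(row.values()):
            cleaned.append(row)
    return cleaned


# In-memory SQLite database standing in for a real SQL source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE complaints (complaint_type TEXT, borough TEXT)")
conn.execute("INSERT INTO complaints VALUES (' heating ', 'Manhattan')")
conn.execute("INSERT INTO complaints VALUES ('ELECTRIC', NULL)")

records = clean(load_csv(CSV_TEXT) + load_json(JSON_TEXT) + load_sql(conn))
conn.close()

for record in records:
    print(record)
```

Loading every source into the same list-of-dicts shape keeps the cleaning step independent of where the data came from. In practice a library such as pandas would likely replace these helpers.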
Explore Advanced Data Visualization Techniques
Helps students expand their knowledge and proficiency in data visualization techniques beyond the scope of the course.
  • Follow online tutorials on advanced data visualization libraries (e.g., Plotly, Bokeh).
  • Practice creating interactive and informative visualizations.
Develop a Machine Learning Model
Provides students with hands-on experience in developing and validating a machine learning model using Python.
  • Select appropriate machine learning algorithms and train models using Python.
  • Evaluate and refine models to improve their performance.
  • Document the model development process and present findings.
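The train-then-evaluate loop described in these steps might look like the following sketch. It assumes a nearest-centroid classifier and synthetic two-cluster data, both chosen here purely for illustration; neither is taken from the course, and the function names (`train_test_split`, `fit_centroids`, `predict`, `accuracy`) are hypothetical helpers written for this example rather than any library's API.

```python
import random


def train_test_split(rows, test_fraction, rng):
    """Shuffle a copy of the rows and split off a held-out test set."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_centroids(train):
    """Compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, rows):
    """Fraction of rows whose predicted label matches the true label."""
    correct = sum(predict(centroids, features) == label for features, label in rows)
    return correct / len(rows)


# Synthetic, well-separated data: class 0 near (0, 0), class 1 near (5, 5).
rng = random.Random(0)
data = []
for label, center in ((0, 0.0), (1, 5.0)):
    for _ in range(20):
        point = [center + rng.uniform(-1, 1), center + rng.uniform(-1, 1)]
        data.append((point, label))

train, test = train_test_split(data, 0.25, rng)
model = fit_centroids(train)
test_accuracy = accuracy(model, test)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

Scoring on rows the model never saw during fitting is the point of the split: accuracy measured on the training rows would overstate how well the model generalizes.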
Develop a Data Exploration and Analysis Report
Provides students with an opportunity to synthesize their learning and demonstrate their ability to analyze and present data effectively.
  • Select a dataset and conduct exploratory data analysis.
  • Develop visualizations and insights based on the analysis.
  • Write a comprehensive report summarizing the findings.
Contribute to Open-Source Data Science Projects
Encourages students to contribute to the data science community by participating in open-source projects.
  • Explore open-source data science projects relevant to course topics.
  • Identify areas where contributions can be made.
  • Submit code contributions or documentation improvements to the project.
Participate in Data Science Competitions
Challenges students to apply their skills in real-world scenarios and gain experience in collaborative problem-solving.
  • Identify appropriate data science competitions aligned with course topics.
  • Participate in teams or individually to develop innovative solutions.
  • Reflect on the competition experience to enhance skills and knowledge.

Career center

Learners who complete Data Science and Machine Learning Capstone Project will develop knowledge and skills that may be useful to these careers:
Data Scientist
Data Scientists build and deploy machine learning models to solve real-world problems across a wide range of applications, using methods covered in this course such as data visualization and model validation. The course gives learners experience working with real data sets and walks through the full data science process, from data ingestion to presenting actionable insights. It is particularly valuable for aspiring Data Scientists who want experience using Python to solve real-world data problems.
Machine Learning Engineer
Machine Learning Engineers build and deploy machine learning models to solve real-world problems across a wide range of applications, using methods covered in this course such as data visualization and model validation. The course gives learners experience working with real data sets and walks through the full data science process, from data ingestion to presenting actionable insights. It is particularly valuable for aspiring Machine Learning Engineers who want experience using Python to solve real-world data problems.
Business Analyst
Business Analysts use data to analyze business performance and develop recommendations to improve efficiency and profitability. The course Data Science and Machine Learning Capstone Project provides Business Analysts with the opportunity to learn how to use data science and machine learning techniques to solve real-world business problems. This course is particularly valuable for aspiring Business Analysts who want to gain experience with using Python to solve real-world data problems.
Statistician
Statisticians collect, analyze, interpret, and present data to inform decision-making. The course Data Science and Machine Learning Capstone Project provides Statisticians with the opportunity to learn how to use data science and machine learning techniques to solve real-world data problems. This course is particularly valuable for aspiring Statisticians who want to gain experience with using Python to solve real-world data problems.
Software Engineer
Software Engineers design, develop, and maintain software applications. The course Data Science and Machine Learning Capstone Project provides Software Engineers with the opportunity to learn how to use data science and machine learning techniques to solve real-world software problems. This course is particularly valuable for aspiring Software Engineers who want to gain experience with using Python to solve real-world data problems.
Data Analyst
Data Analysts collect, analyze, and interpret data to make informed decisions in a business setting. The course Data Science and Machine Learning Capstone Project provides Data Analysts with the opportunity to build a portfolio of real-world projects that demonstrate their skills in data analysis and machine learning. This course is particularly valuable for aspiring Data Analysts who want to gain experience with using Python to solve real-world data problems.
Data Engineer
Data Engineers design and build the infrastructure and pipelines needed to manage and analyze data. The course Data Science and Machine Learning Capstone Project provides Data Engineers with the opportunity to learn how to use data science and machine learning techniques to solve real-world data problems. This course is particularly valuable for aspiring Data Engineers who want to gain experience with using Python to solve real-world data problems.
Data Visualization Specialist
Data Visualization Specialists design and create visualizations to communicate data insights. The course Data Science and Machine Learning Capstone Project provides Data Visualization Specialists with the opportunity to learn how to use data science and machine learning techniques to solve real-world data visualization problems. This course is particularly valuable for aspiring Data Visualization Specialists who want to gain experience with using Python to solve real-world data problems.
Quantitative Analyst
Quantitative Analysts use mathematical and statistical models to analyze financial data. The course Data Science and Machine Learning Capstone Project provides Quantitative Analysts with the opportunity to learn how to use data science and machine learning techniques to solve real-world financial problems. This course is particularly valuable for aspiring Quantitative Analysts who want to gain experience with using Python to solve real-world data problems.
Actuary
Actuaries use mathematical and statistical models to assess risk. The course Data Science and Machine Learning Capstone Project provides Actuaries with the opportunity to learn how to use data science and machine learning techniques to solve real-world risk assessment problems. This course is particularly valuable for aspiring Actuaries who want to gain experience with using Python to solve real-world data problems.
Epidemiologist
Epidemiologists investigate the distribution and patterns of health events and diseases in a population. The course Data Science and Machine Learning Capstone Project provides Epidemiologists with the opportunity to learn how to use data science and machine learning techniques to solve real-world epidemiology problems. This course is particularly valuable for aspiring Epidemiologists who want to gain experience with using Python to solve real-world data problems.
Biostatistician
Biostatisticians use statistical methods to design and analyze studies in the field of biology. The course Data Science and Machine Learning Capstone Project provides Biostatisticians with the opportunity to learn how to use data science and machine learning techniques to solve real-world biostatistics problems. This course is particularly valuable for aspiring Biostatisticians who want to gain experience with using Python to solve real-world data problems.
Market Researcher
Market Researchers conduct research to understand consumer needs and behaviors. The course Data Science and Machine Learning Capstone Project provides Market Researchers with the opportunity to learn how to use data science and machine learning techniques to solve real-world market research problems. This course is particularly valuable for aspiring Market Researchers who want to gain experience with using Python to solve real-world data problems.
Econometrician
Econometricians use statistical methods to analyze economic data. The course Data Science and Machine Learning Capstone Project provides Econometricians with the opportunity to learn how to use data science and machine learning techniques to solve real-world econometrics problems. This course is particularly valuable for aspiring Econometricians who want to gain experience with using Python to solve real-world data problems.
Data Scientist (Healthcare)
Data Scientists (Healthcare) use data science and machine learning techniques to solve real-world problems in the healthcare industry. The course Data Science and Machine Learning Capstone Project provides Data Scientists (Healthcare) with the opportunity to learn how to use data science and machine learning techniques to solve real-world healthcare problems. This course is particularly valuable for aspiring Data Scientists (Healthcare) who want to gain experience with using Python to solve real-world data problems.

Reading list

We've selected 13 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Data Science and Machine Learning Capstone Project.
Provides a comprehensive overview of data science and machine learning using Python. Covers data ingestion, exploration, visualization, feature engineering, model building, and evaluation. Suitable as a primary textbook or supplementary reading.
A practical guide to building and deploying machine learning models using popular Python libraries. Covers data preprocessing, model selection, hyperparameter tuning, and real-world applications. Useful as a reference or for additional depth.
A classic textbook that covers a wide range of statistical learning methods. Provides a comprehensive overview of supervised and unsupervised learning, including regression models, tree-based methods, and support vector machines. Serves as a valuable reference for foundational concepts.
Focuses on feature engineering, a crucial aspect of machine learning projects. Provides techniques for creating, transforming, and selecting features to improve model performance. Serves as a valuable reference for this topic.
A comprehensive textbook that covers data mining and machine learning techniques. Provides a practical overview of data preprocessing, model selection, and evaluation. Suitable as a primary textbook or as additional reading.
A comprehensive guide to deep learning, focusing on the underlying mathematical concepts and practical implementation. Covers neural networks, convolutional neural networks, and recurrent neural networks. Provides a solid foundation for understanding and building deep learning models.
A hands-on guide to machine learning, using Python and R. Covers a wide range of topics, including supervised and unsupervised learning, natural language processing, and computer vision. Provides practical examples and code snippets.
Provides a broad overview of machine learning, focusing on applications in web development. Covers data collection, feature engineering, model training, and evaluation. Explores topics such as natural language processing, recommender systems, and fraud detection.
Serves as a gentler introduction to machine learning, particularly suitable for beginners. Covers essential concepts and algorithms in a clear and concise manner. Can be used as background reading or as a preparatory resource.
Introduces machine learning concepts and techniques using Go. Covers data preprocessing, model building, and evaluation. Provides hands-on examples and exercises. Suitable for those interested in using Go for machine learning.
Provides a theoretical foundation for machine learning. Explores the underlying mathematical principles and algorithms, including linear regression, support vector machines, and decision trees. Suitable for those looking for a deeper understanding of the theoretical underpinnings of machine learning.
A more advanced book that explores the probabilistic foundations of machine learning. Provides a deeper understanding of statistical models, Bayesian inference, and probabilistic graphical models. Suitable for those seeking a rigorous and theoretical treatment.

Similar courses

Here are nine courses similar to Data Science and Machine Learning Capstone Project.
Predict Taxi Fare with a BigQuery ML Forecasting Model
New Science of Cities | 新城市科学
Applied Data Science Capstone
Generative AI: Elevate Your Data Science Career
GenAI For Business Analysis: Fine-Tuning LLMs
Fundamentals of Market Structure
Data Analysis with Python: Inform a Business Decision
Monitoring and Alerting with Prometheus
Ace The Data Science Interview: Real-Life Examples and...

© 2016 - 2024 OpenCourser