We may earn an affiliate commission when you visit our partners.
Rav Ahuja and Ramesh Sannareddy

In this Capstone you’ll demonstrate your ability to perform like a Data Engineer. Your mission is to design, implement, and manage a complete data and analytics platform consisting of relational and non-relational databases, data warehouses, data pipelines, big data processing engines, and Business Intelligence (BI) tools.

This Capstone project will require that you apply and sharpen the skills and knowledge you developed in the various courses in the IBM Data Engineering Professional Certificate and utilize multiple tools and technologies to design databases, collect data from multiple sources, extract, transform and load data into a data warehouse, and utilize a cloud-based BI tool to create analytic reports and visualizations. You will also implement predictive analytics and machine learning models using big data tools and techniques.

This Capstone requires a significant amount of hands-on lab effort throughout the course. You’ll exhibit your knowledge and proficiency working with Python, Bash scripts, SQL, NoSQL, RDBMSes, ETL, MySQL, PostgreSQL, Db2, MongoDB, Apache Airflow, Apache Spark, and Cognos Analytics.

Upon successfully completing this Capstone, you should have the confidence and portfolio to take on real-world data engineering projects and showcase your abilities to perform as an entry-level data engineer.

What's inside

Learning objectives

  • Build a complete data and analytics platform.
  • Set up, manage, and query relational and NoSQL databases.
  • Create data pipelines and ETL processes using Apache Airflow.
  • Design and populate a star/snowflake schema data warehouse and query it using SQL.
  • Analyze warehouse data using the Business Intelligence (BI) tool Cognos Analytics to create reports and dashboards.
  • Deploy a big data machine learning model using Apache Spark.
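
As a rough illustration of the star-schema objective above, here is a minimal sketch using Python's built-in sqlite3 module. The table names (dim_date, dim_product, fact_sales), the columns, and the sample rows are all hypothetical and are not taken from the course labs, which use other database systems.

```python
import sqlite3

# Hypothetical star schema: one fact table keyed to two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_sales (
    sale_id INTEGER PRIMARY KEY,
    date_id INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    amount REAL
);
""")

# Populate the dimensions first, then the facts that reference them.
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)", [(1, 2023, 1), (2, 2023, 2)])
conn.executemany("INSERT INTO dim_product VALUES (?, ?)", [(10, "Books"), (20, "Toys")])
conn.executemany(
    "INSERT INTO fact_sales (date_id, product_id, amount) VALUES (?, ?, ?)",
    [(1, 10, 12.5), (1, 20, 30.0), (2, 10, 7.5), (2, 20, 20.0)],
)


def sales_by_category(connection):
    """Total sales per product category, joined through the product dimension."""
    return connection.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()


print(sales_by_category(conn))  # [('Books', 20.0), ('Toys', 50.0)]
```

The fact table holds only keys and measures, so each analytic question becomes a join from fact_sales to whichever dimension supplies the grouping column.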

Good to know

Know what's good, what to watch for, and possible dealbreakers:
  • Core audience is individuals with foundational knowledge in data engineering concepts and technologies who are seeking to advance their skills and knowledge in designing, implementing, and managing data and analytics platforms
  • Ideal for aspiring data engineers looking to enhance their capabilities in building and maintaining comprehensive data and analytics solutions
  • Suitable for individuals aiming to transition into entry-level data engineering roles with a demand for extensive hands-on experience in a variety of tools and technologies
  • Provides a foundation for learners to strengthen their understanding of data pipelines, ETL processes, data warehousing, data analysis, and machine learning
  • Practical hands-on lab assignments emphasize skill development in data engineering technologies
  • Taught by industry experts, Rav Ahuja and Ramesh Sannareddy, who have extensive experience and recognition in data engineering

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Data Engineering Capstone Project with these activities:
Review: Relational Database Concepts
Reinforce your understanding of relational database concepts, ensuring a solid foundation for working with structured data in the course.
  • Revisit your previous notes or online resources on relational databases
  • Practice creating and querying sample databases
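
One way to practice the two steps above without installing a database server is Python's built-in sqlite3 module. The sketch below is only an illustration: the employees table, its columns, and its rows are made up for the example.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database with one small, made-up table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, dept TEXT, salary INTEGER)"
    )
    conn.executemany(
        "INSERT INTO employees (name, dept, salary) VALUES (?, ?, ?)",
        [("Ada", "Engineering", 120), ("Grace", "Engineering", 130), ("Linus", "Support", 90)],
    )
    conn.commit()
    return conn


def names_in_dept(conn, dept):
    """Return employee names in one department, sorted alphabetically."""
    # The ? placeholder lets sqlite3 bind the value safely instead of string formatting.
    rows = conn.execute(
        "SELECT name FROM employees WHERE dept = ? ORDER BY name", (dept,)
    )
    return [name for (name,) in rows]


conn = build_sample_db()
print(names_in_dept(conn, "Engineering"))  # ['Ada', 'Grace']
```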
SQL Practice: Queries and Data Manipulation
Sharpen your SQL skills by practicing queries and data manipulation, improving your proficiency in working with relational databases.
  • Find a set of practice SQL queries online
  • Run the queries on a sample database
  • Analyze the results and ensure correct data manipulation
Python Drills: Data Analysis with Pandas
Enhance your data analysis skills by practicing with Python and Pandas, improving your ability to manipulate and analyze data.
  • Find a set of coding drills on data analysis with Pandas
  • Implement the drills using Python and Pandas
  • Review the results and ensure accurate data analysis
Tutorial: Create a Data Warehouse with Apache Airflow
Review the basics of Apache Airflow and how to use it to create a data warehouse, providing a solid foundation for the course.
  • Find an online tutorial on Apache Airflow
  • Follow the tutorial and create a simple data warehouse
  • Query the data warehouse and perform basic analysis
Study Group: Data Warehousing and Business Intelligence
Collaborate with peers to reinforce concepts related to data warehousing and business intelligence, enhancing your understanding and ability to apply these techniques.
  • Form a study group with fellow learners
  • Choose a topic related to data warehousing or business intelligence
  • Prepare presentations and lead discussions on the topic
Attend a Data Engineering Meetup
Expand your network and learn from professionals in the field by attending a data engineering meetup, providing insights into real-world applications of course concepts.
  • Find a data engineering meetup in your area
  • Attend the meetup and engage with other attendees
  • Share your knowledge and learn from others
Project: Design and Implement a Data Pipeline
Apply your knowledge and skills to design and implement a complete data pipeline, demonstrating your ability to manage data flow and transformation.
  • Define the data sources and destination for the pipeline
  • Design the data transformation and cleaning process
  • Implement the pipeline using Apache Airflow
  • Test and validate the pipeline
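
The steps above can be sketched in plain Python before any orchestration is added. This is a hedged, minimal example: the CSV content, the orders table, and the function names extract, transform, load, and run_pipeline are all hypothetical. In Apache Airflow, each of these functions would typically become its own task in a DAG; that wiring is omitted here.

```python
import csv
import io
import sqlite3

# Hypothetical source data: messy whitespace, mixed case, and one missing amount.
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,
3,carol,7.25
"""


def extract(text):
    """Read CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Drop rows with no amount and normalize the remaining values."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["customer"].strip().title(), float(amount))
        )
    return cleaned


def load(rows, conn):
    """Write the cleaned rows into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(text, conn):
    """Run extract, transform, load, then return (row count, total amount)."""
    load(transform(extract(text)), conn)
    return conn.execute("SELECT COUNT(*), SUM(amount) FROM orders").fetchone()


print(run_pipeline(RAW_CSV, sqlite3.connect(":memory:")))  # (2, 17.75)
```

Returning a row count and a total from run_pipeline gives a simple validation check: the incomplete record is excluded, and the loaded total matches the sum of the valid source rows.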
Contribute to an Open-Source Data Engineering Project
Deepen your understanding and contribute to the data engineering community by participating in an open-source project, showcasing your skills and fostering collaboration.
  • Find an open-source data engineering project on GitHub
  • Identify an area where you can contribute
  • Submit a pull request with your contribution
  • Review feedback and iterate on your contribution

Career center

Learners who complete Data Engineering Capstone Project will develop knowledge and skills that may be useful to these careers:
Data Engineer
Data Engineers design and build data pipelines that collect, process, and store data. They work with large datasets to ensure that data is accessible and usable for analysis. This course provides a solid foundation in data engineering techniques and tools, which are essential for success in this role.
Data Analyst
Data Analysts are responsible for analyzing data to extract insights and trends. They use their skills in statistics, programming, and data visualization to communicate their findings to stakeholders. This course provides a strong foundation in data analysis techniques and tools, which are essential for success in this role.
Data Scientist
Data Scientists use their skills in mathematics, statistics, and programming to build models that can predict future outcomes. They work with large datasets to identify patterns and trends that can help businesses make better decisions. This course provides a solid foundation in data science techniques and tools, which are essential for success in this role.
Machine Learning Engineer
Machine Learning Engineers build and deploy machine learning models that can solve complex problems. They work with large datasets to train models that can make predictions or recommendations. This course provides a strong foundation in machine learning techniques and tools, which are essential for success in this role.
Database Administrator
Database Administrators manage and maintain databases. They work with databases to ensure that data is secure, accessible, and performant. This course provides a strong foundation in database administration techniques and tools, which are essential for success in this role.
Software Engineer
Software Engineers design, develop, and maintain software applications. They work with code to create software that meets the needs of users. This course provides a foundation in software engineering principles and practices, which can be helpful for success in this role.
Business Analyst
Business Analysts use their skills in data analysis and problem-solving to help businesses make better decisions. They work with stakeholders to understand their needs and develop solutions that meet those needs. This course provides a strong foundation in business analysis techniques and tools, which can be helpful for success in this role.
Project Manager
Project Managers plan, execute, and close projects. They work with teams to ensure that projects are completed on time, within budget, and to the required quality standards. This course provides a foundation in project management principles and practices, which can be helpful for success in this role.
Data Architect
Data Architects design and implement data architectures. They work with stakeholders to understand their data needs and develop solutions that meet those needs. This course provides a foundation in data architecture principles and practices, which can be helpful for success in this role.

Reading list

We've selected ten books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Data Engineering Capstone Project.
Classic work on data warehouse design and implementation. It provides a comprehensive overview of the data warehousing process, from data modeling to performance tuning. It is a valuable resource for anyone involved in data warehousing.
Provides a comprehensive overview of designing data-intensive applications. It covers topics such as data modeling, data storage, and data processing. It is a valuable resource for anyone involved in designing data-intensive applications.
Comprehensive textbook on database systems. It covers topics such as data models, query optimization, and transaction processing. It is a valuable resource for anyone involved in database systems.
Provides a practical guide to using Apache Spark for machine learning. It covers topics such as data preparation, model training, and model evaluation. It is a valuable resource for anyone involved in machine learning with Spark.
Collection of SQL recipes for common data management tasks. It covers topics such as data manipulation, data aggregation, and data analysis. It is a valuable resource for anyone involved in SQL programming.
Provides a comprehensive overview of data-driven marketing. It covers topics such as data collection, data analysis, and data visualization. It is a valuable resource for anyone involved in data-driven marketing.
Provides a comprehensive overview of Python for data analysis. It covers topics such as data manipulation, data visualization, and machine learning. It is a valuable resource for anyone involved in data analysis with Python.
Provides a comprehensive overview of big data analytics. It covers topics such as data storage, data processing, and data analysis. It is a valuable resource for anyone involved in big data analytics.
Provides a comprehensive overview of data science for business. It covers topics such as data mining, data analysis, and data visualization. It is a valuable resource for anyone involved in data science for business.
Provides a comprehensive overview of data engineering with Python, covering topics such as data collection, cleaning, transformation, and analysis. It also includes case studies and examples to help readers apply their learning to real-world scenarios.

Similar courses

Here are nine courses similar to Data Engineering Capstone Project.
  • Data Engineering Capstone Project (most relevant)
  • Introduction to Data Engineering (most relevant)
  • Linux and Bash for Data Engineering (most relevant)
  • Building ETL and Data Pipelines with Bash, Airflow and... (most relevant)
  • ETL Testing: From Beginner to Expert
  • The Path to Insights: Data Models and Pipelines
  • Big Data - Capstone Project
  • Generative AI: Elevate your Data Engineering Career
  • Apache Spark for Data Engineering and Machine Learning

© 2016 - 2024 OpenCourser