We may earn an affiliate commission when you visit our partners.

Cloud Data Engineering

Noah Gift
  • Discover the principles of data engineering and its role in building scalable, cloud-based systems.
  • Explore the challenges of the end of Moore's Law and learn to develop distributed systems.
  • Gain hands-on experience with big data technologies and best practices for implementing solutions.
  • Learn to build serverless data engineering pipelines and apply effective data governance strategies.
  • Develop expertise in key data engineering tasks, including ETL, cloud databases, and cloud storage.

What's inside

Learning objectives

  • Evaluate best practices for dealing with the end of Moore's Law
  • Develop distributed systems applying software engineering best practices
  • Evaluate best practices for implementing solutions with big data
  • Analyze best practices in data engineering
  • Build serverless data engineering systems
  • Evaluate effective data governance strategies
  • Develop cloud ETL (extract, transform, load) pipelines
  • Evaluate best practices for cloud databases and cloud storage

Syllabus

1. Module 1: Methodologies in Data Engineering (12 hours)
- Videos:
- Introduction and Course Overview (4 minutes)
- The End of Moore's Law and Concurrency in Python (7 minutes)
- Using CUDA, Numba, and ASICs (13 minutes)
- Exploring Colab Pro and Colab AI (9 minutes)
- Distributed Systems Concepts (9 minutes)
- Debugging Python Code (25 minutes)
- Exploring Google BigQuery (12 minutes)
- Introduction to Big Data and Data Lakes (4 minutes)
- Big Data Processing (3 minutes)
- AWS Data Engineering Design Principles (20 minutes)
- Processing Big Data with AWS (25 minutes)
- Transform Data with Databricks Spark SQL (5 minutes)
- Readings (22 readings, 220 minutes)
- Quizzes (5 quizzes, 150 minutes)
- Discussion Prompts (4 discussion prompts, 40 minutes)
- Ungraded Labs (3 ungraded labs, 180 minutes)
2. Module 2: Principles of Data Engineering (11 hours)
- Readings (7 readings, 70 minutes)
- Introduction to Data Engineering (1 minute)
- Data Driven Organizations (19 minutes)
- Batch vs. Streaming vs. Events (1 minute)
- Ingesting by Batch or Stream (20 minutes)
- Building CLI Tools with Click (33 minutes)
- Building Containerized Command-line Tools (12 minutes)
- Rust and Python (5 minutes)
- Python Calculator CLI and Caesar Cipher CLI (7 minutes)
- Advanced Testing with Amazon CodeGuru and AWS CodeBuild (44 minutes)
- Mapping Functions to CLI (58 minutes)
- AWS CodeWhisperer CLI and SDK (7 minutes)
- Readings (10 readings, 100 minutes)
- Quizzes (4 quizzes, 120 minutes)
- Discussion Prompts (3 discussion prompts, 30 minutes)
- Ungraded Labs (4 ungraded labs, 240 minutes)
3. Module 3: Building Data Engineering Pipelines (6 hours)
- Introduction to Serverless Data Engineering (0 minutes)
- Automating Pipelines (21 minutes)
- Serverless Concepts (17 minutes)
- AWS Lambda (42 minutes)
- Build a Serverless Data Pipeline (37 minutes)
- Serverless Cookbook with AWS and GCP (49 minutes)
- Introduction to Data Governance (0 minutes)
- The Principle of Least Privilege (1 minute)
- Cloud Security with IAM on AWS (30 minutes)
- Encrypt at Rest and Transit (3 minutes)
- Quizzes (3 quizzes, 90 minutes)
- Discussion Prompts (2 discussion prompts, 20 minutes)
4. Module 4: Applying Key Data Engineering Tasks (10 hours)
- Introduction to Extract, Transform, Load (ETL) (0 minutes)
- Ingesting and Preparing Data on AWS (19 minutes)
- Using Amazon Athena with AWS Glue (22 minutes)
- Real-World Problems in ETL (13 minutes)
- Introduction to Cloud Databases (6 minutes)
- MySQL Overview and Usage (28 minutes)
- BigQuery with Prompt Engineering and Colab Pipeline (14 minutes)
- Introduction to Cloud Storage (0 minutes)
- Cloud Storage Deep Dive (13 minutes)
- Using Amazon S3 (4 minutes)

Good to know

Know what's good, what to watch for, and possible dealbreakers
Teaches how to build cloud-based systems, which are standard in industry
Taught by Noah Gift, who is recognized for his work in data engineering
Offers hands-on labs and interactive materials, which help learners apply skills

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Cloud Data Engineering with these activities:
Create a Comprehensive Study Guide
Organize, expand, and review your notes, assignments, quizzes, and exams to create a comprehensive study guide that consolidates your learning from the course.
  • Gather all your course materials, including notes, assignments, and exams.
  • Organize and categorize the materials by topic and subtopic.
  • Rewrite and expand your notes, ensuring clarity and completeness.
  • Include additional examples, explanations, and resources to enhance understanding.
Review Cloud Computing Basics
Review the fundamentals of cloud computing, including key concepts such as virtualization, cloud storage, and service models.
  • Read articles and tutorials on cloud computing concepts.
  • Complete online courses or workshops on cloud computing basics.
  • Practice hands-on exercises using cloud platforms such as AWS or GCP.
Follow Tutorials on Serverless Data Engineering
Enhance your knowledge of serverless data engineering by following guided tutorials and demonstrations provided by industry experts and platforms like AWS or GCP.
  • Identify and enroll in online tutorials on serverless data engineering.
  • Follow the tutorials step-by-step, taking notes and practicing the concepts.
  • Build and deploy serverless data pipelines using cloud services.
ETL Code Challenges
Practice your ETL skills by solving code challenges and exercises that require you to design and implement ETL pipelines using cloud-based tools.
  • Find ETL code challenges and exercises online or on platforms like LeetCode.
  • Attempt to solve the challenges on your own.
  • Review solutions and best practices.
  • Implement ETL pipelines using cloud-based tools such as AWS Glue or GCP Dataflow.
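As a starting point for this kind of exercise, here is a minimal sketch of an extract, transform, load pipeline that runs locally with only the Python standard library. It is not taken from the course: the sample data, the `orders` table, and the function names are hypothetical, and an in-memory SQLite database stands in for a cloud database.

```python
import csv
import io
import sqlite3

# Hypothetical sample input; a real pipeline would read from cloud storage.
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,
3,carol,7.25
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with no amount, normalize names, cast types."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["customer"].strip().lower(), float(amount))
        )
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline(text, conn):
    """Run all three stages and return (row count, total amount)."""
    load(transform(extract(text)), conn)
    return conn.execute("SELECT COUNT(*), SUM(amount) FROM orders").fetchone()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(RAW_CSV, connection))
```

Keeping the three stages as separate functions makes each one easy to test in isolation before swapping the local pieces for managed cloud services.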
Read 'Designing Data-Intensive Applications'
Gain insights into the principles and patterns for designing and building data-intensive applications, which are essential for effective data engineering.
  • Read the book thoroughly, taking notes and highlighting key concepts.
  • Summarize the main ideas and principles discussed in the book.
  • Apply the concepts to your own data engineering projects and designs.
Design and Implement a Data Governance Framework
Develop a data governance framework for your organization, ensuring that data is managed and protected effectively and in compliance with regulatory requirements.
  • Research data governance best practices and standards.
  • Define the scope, principles, and policies of the data governance framework.
  • Design and implement data governance processes and procedures.
  • Monitor and enforce compliance with the framework.
Contribute to an Open-Source Data Engineering Project
Contribute to the development and maintenance of open-source data engineering tools and projects to enhance your understanding and practical skills.
  • Identify a suitable open-source data engineering project on platforms like GitHub.
  • Review the project's documentation and contribute to documentation improvements.
  • Fix bugs or implement new features based on project requirements.
  • Collaborate with other contributors and maintainers.
Mentor Junior Data Engineers
Share your knowledge and expertise by mentoring junior data engineers, providing guidance and support to help them develop their skills and grow their careers.
  • Connect with junior data engineers through platforms or networking events.
  • Provide mentorship and guidance on technical skills, career development, and industry trends.
  • Review code, provide feedback, and suggest improvements.
  • Encourage and support their professional growth.

Similar courses

Here are nine courses similar to Cloud Data Engineering.
Cloud Computing Foundations
Most relevant
Cloud Data Engineering
Most relevant
Serverless Computing: The Big Picture
Cloud Computing Applications, Part 2: Big Data and...
Cloud Computing Applications, Part 1: Cloud Systems and...
Mastering AWS Glue, QuickSight, Athena & Redshift Spectrum
Building Batch Data Pipelines on Google Cloud
Serverless Analytics on AWS
Fundamentals of Software Architecture for Big Data
© 2016 - 2024 OpenCourser