
Building ETL and Data Pipelines with Bash, Airflow and Kafka

Data Engineering

Well-designed and automated data pipelines and ETL processes are the foundation of a successful Business Intelligence platform. Defining your data workflows, pipelines and processes early in the platform design ensures the right raw data is collected, transformed and loaded into desired storage layers and available for processing and analysis as and when required.

This course is designed to give you the critical knowledge and skills that Data Engineers and Data Warehousing specialists need to create and manage ETL, ELT, and data pipeline processes.

Upon completing this course, you’ll have a solid understanding of Extract, Transform, Load (ETL) and Extract, Load, Transform (ELT) processes. You’ll practice extracting data, transforming it, and loading the transformed data into a staging area. You’ll also create an ETL data pipeline using Bash shell scripting, build a batch ETL workflow using Apache Airflow, and build a streaming data pipeline using Apache Kafka.
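To make the extract, transform, and load steps concrete, here is a minimal sketch in Python. It is not taken from the course materials: the sample data, the column names (`city`, `temp_f`), the table name `staging_temps`, and the function names are all assumptions chosen for illustration, and an in-memory SQLite database stands in for the staging area.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; the column names are assumptions for illustration.
RAW_CSV = """city,temp_f
Austin,98.6
Boston,
Chicago,50.0
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing reading and convert Fahrenheit to Celsius."""
    cleaned = []
    for row in rows:
        if not row["temp_f"]:
            continue  # skip rows with no temperature value
        temp_c = round((float(row["temp_f"]) - 32) * 5 / 9, 1)
        cleaned.append((row["city"], temp_c))
    return cleaned


def load(records, conn):
    """Load: write the transformed records into a staging table."""
    conn.execute("CREATE TABLE IF NOT EXISTS staging_temps (city TEXT, temp_c REAL)")
    conn.executemany("INSERT INTO staging_temps VALUES (?, ?)", records)
    conn.commit()


def run_etl(text, conn):
    """Run the three steps in ETL order and return the staged rows."""
    load(transform(extract(text)), conn)
    return conn.execute(
        "SELECT city, temp_c FROM staging_temps ORDER BY city"
    ).fetchall()


if __name__ == "__main__":
    print(run_etl(RAW_CSV, sqlite3.connect(":memory:")))
```

In an ELT variant of the same sketch, the raw rows would be loaded into the database first, and the cleaning and unit conversion would then run inside the database, for example as a SQL statement over the raw table.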

You’ll gain hands-on experience through practice labs throughout the course and work on a project inspired by real-world scenarios, building data pipelines with several technologies. You can add the project to your portfolio to demonstrate your ability to work as a Data Engineer.

This course requires prior experience working with datasets, SQL, relational databases, and Bash shell scripts.

What you'll learn

  • Describe and differentiate between Extract, Transform, Load (ETL) and Extract, Load, Transform (ELT) processes.
  • Define data pipeline components, processes, tools, and technologies.
  • Create batch ETL processes using Apache Airflow and streaming data pipelines using Apache Kafka.
  • Demonstrate an understanding of how shell scripting is used to implement an ETL pipeline.


OpenCourser is an affiliate partner of edX and may earn a commission when you buy through our links.

Rating Not enough ratings
Length 5 weeks
Effort 5 weeks, 2–4 hours per week
Starts On Demand (Start anytime)
Cost $99
From IBM via edX
Instructors Rav Ahuja, Yan Luo, Jeff Grossman
Download Videos On all desktop and mobile devices
Language English
Subjects Programming
Tags Computer Science


Careers

An overview of related careers and their average salaries in the US. Bars indicate income percentile.

Pipeline Production Coordinator $63k

Pipeline 2 $74k

Pipeline Scheduler $81k

Pipeline Developer $83k

Pipeline Tech $86k

Pipeline Accountant $86k

Pipeline Design $92k

Pipeline Maintenance $99k

Pipeline Construction $106k

Pipeline Foreman $114k

Pipeline Tech Lead $115k

Data Pipeline Engineer $142k


