We may earn an affiliate commission when you visit our partners.
Timotius Pamungkas

"Data is the new oil".

You might have heard the quote before. Data in the digital era is as valuable as oil was in the industrial era. However, just like oil, raw data is not usable on its own. Its value is created when it is gathered completely and accurately, connected to other relevant data, and delivered in a timely manner.

Data engineers design and build pipelines that transform and transport data into a usable format. Other roles, such as data scientists or machine learning engineers, can then turn that data into valuable business insight, just as crude oil is refined into petrol through a complex process.

Becoming a data engineer requires a lot of data literacy and practice. This course is a first step for anyone who wants to learn about data engineering. It combines theory with hands-on exercises to introduce you to the field. Because the data field is very wide, the course covers basic, entry-level knowledge of the data engineering process and tools.

This course is well suited to building a foundation for entering the data field. In this course, we will learn about:

  • Introduction to data engineering

  • Relational & non-relational databases

  • Relational & non-relational data models

  • Table normalization

  • Fact & dimension tables

  • Table denormalization for data warehouse

  • ETL (Extract, Transform, Load) & data staging using Python pandas

  • Elasticsearch basics

  • Data warehouse

  • Numbers every engineer should know & how they relate to big data

  • Hadoop

  • Spark cluster on Google Cloud Dataproc

  • Data lake
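
As a rough illustration of the ETL and staging item in the list above, here is a minimal sketch in Python with pandas. It is not taken from the course: the movie data, the table names `stg_movies` and `movies`, and the function `run_etl` are hypothetical, and an in-memory SQLite database stands in for PostgreSQL so that the example is self-contained.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical source data, inlined so the sketch needs no external file.
SOURCE_CSV = """movie_id,title,release_year
1, Alien ,1979
2,Heat,1995
2,Heat,1995
"""


def run_etl(conn: sqlite3.Connection, csv_text: str) -> int:
    """Extract CSV text, stage it unchanged, then load a cleaned copy."""
    # Extract: read the raw source into a DataFrame.
    raw = pd.read_csv(io.StringIO(csv_text))

    # Stage: land the data as-is, so the raw input can be inspected later.
    raw.to_sql("stg_movies", conn, if_exists="replace", index=False)

    # Transform: trim stray whitespace from titles, then drop duplicate rows.
    staged = pd.read_sql_query("SELECT * FROM stg_movies", conn)
    cleaned = staged.assign(title=staged["title"].str.strip()).drop_duplicates()

    # Load: write the cleaned rows to the target table.
    cleaned.to_sql("movies", conn, if_exists="replace", index=False)
    return len(cleaned)


# SQLite is an assumption made for portability; the course itself uses PostgreSQL.
conn = sqlite3.connect(":memory:")
loaded_rows = run_etl(conn, SOURCE_CSV)
print(loaded_rows, conn.execute("SELECT title FROM movies ORDER BY movie_id").fetchall())
```

Keeping the untouched rows in a staging table means a failed or incorrect transformation can be rerun without going back to the source system.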

Important Notes

The data field is HUGE. This course will be continuously updated, but for the time being it contains an introduction to the concepts and sample hands-on exercises for data engineering.

For now, this course is intended for beginners in data engineering.

If you have some programming experience and are curious about data engineering, this course is for you.

If you have experience in the data engineering field, this course might be too basic for you (although I'm very happy if you still purchase the course).

If you have never written Python or SQL before, this course is not for you. To follow the course, you must have basic knowledge of SQL and Python.

What's inside

Learning objectives

  • Data engineering basics: what data engineering is, why it is needed, and how to do it from zero
  • Relational database model: database modelling for normalization design, with hands-on practice using PostgreSQL and Python / pandas
  • NoSQL database model: denormalization design, with hands-on practice using Elasticsearch and Python / pandas
  • Introduction to Spark and Spark clusters using Google Cloud Platform
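
To make the normalization and denormalization objectives above more concrete, the following is a minimal pandas-only sketch. It is an assumption-laden illustration rather than course material: the `flat`, `customers`, and `orders` tables and their columns are hypothetical, and no PostgreSQL or Elasticsearch instance is involved.

```python
import pandas as pd

# Hypothetical denormalized table: customer details repeat on every order row.
flat = pd.DataFrame(
    {
        "order_id": [1, 2, 3],
        "customer_id": [10, 10, 20],
        "customer_name": ["Alice", "Alice", "Bob"],
        "customer_city": ["Jakarta", "Jakarta", "Bandung"],
        "amount": [50.0, 75.0, 20.0],
    }
)

# Normalize: store each customer once, keyed by customer_id.
customers = (
    flat[["customer_id", "customer_name", "customer_city"]]
    .drop_duplicates()
    .reset_index(drop=True)
)

# Orders keep only the foreign key that points at the customers table.
orders = flat[["order_id", "customer_id", "amount"]]

# Denormalize: join the two tables back into one wide table for reading.
rejoined = orders.merge(customers, on="customer_id", how="left")

print(customers)
print(rejoined)
```

The normalized form avoids repeating customer details, which suits transactional writes, while the rejoined wide table trades that redundancy for simpler reads.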

Syllabus

Introduction
Welcome to This Course
Course Structure & Coverage
How To Get Maximum Value From This Course
Introduction to Data Engineering
What is Data Engineering?
Data Engineering Example
What is Data Modelling?
Database
What is a Database?
Relational Database
When Not To Use Relational Database?
NoSQL Database
Demo : Postgresql
Demo : Python for Postgresql
Demo : Elasticsearch
Demo : Python for Elasticsearch
Relational Database Model
The Importance of Relational Data Model
OLTP vs OLAP
Database Normalization
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Normalization Python Demo
Normalization Tips
Database Denormalization
Denormalization Python Demo
Fact & Dimension Tables
Star Schema
Star Schema Python Demo
Snowflake Schema
Galaxy Schema
Extract Transform Load (ETL) & Staging Tables
ETL & Staging Tables - Demo Overview
ETL & Staging Tables - Python Demo 1
ETL & Staging Tables - Python Demo 2
To Insert or To Update?
ETL & Staging Tables - Python Demo 3
ETL & Staging Tables - Python Demo 4
ETL & Staging Tables - Tips
NoSQL Database Model
Basic NoSQL Concept
CAP Theorem
Denormalization on Elasticsearch
Elasticsearch Basic Usage
Elasticsearch Index & Document
Elasticsearch ETL - Overview
Elasticsearch Query DSL
Elasticsearch ETL - Python Demo
Data Warehouse
Business Perspective
Technical Perspective
More Fact & Dimension Table
OLAP Cube
On-Premise or Cloud?
Various Techniques
Demo Overview
Demo 1 - PostgreSQL Data Warehouse
Demo 2 - BigQuery Data Warehouse
Demo 3 - Data Warehouse Operations
Numbers Every Engineer Should Know
Numbers Every Engineer Should Know
Small Numbers
Big Numbers
Hadoop & Spark
Hadoop Ecosystem
Introducing Spark
Spark Programming
Data Formats
Hello Spark
Spark Demo - Dataframe
Spark Demo - Spark SQL
Spark & BigQuery - Setting Environment
Spark & BigQuery - ETL Movies
Spark & BigQuery - Lesson Learned
Spark Cluster on Google Cloud (Dataproc)
Spark Cluster - Overview
Demo : Big Data
Google Dataproc
Data Lake
Data Lake Overview
Schema On Read
Lake, not Swamp
Google Data Catalog
Resources & References
Download Source Code & Datasets
Bonus & Discount Codes

Good to know

Know what's good, what to watch for, and possible dealbreakers
Suitable for beginners, offers basic knowledge of data engineering process and tools
Provides hands-on practice for data engineering concepts
Covers a wide range of topics, including relational and non-relational databases, data modeling, ETL, and data warehousing
Introduces industry-standard tools such as PostgreSQL, Elasticsearch, Spark, and Hadoop
Assumes basic knowledge of SQL and Python
Intended for beginners with some programming experience

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Data Engineering for Beginner using Google Cloud & Python with these activities:
Practice Data Manipulation with Pandas
Reinforce your understanding of data manipulation principles by practicing with Pandas.
  • Load a CSV file into a DataFrame.
  • Clean the DataFrame by removing duplicates and handling missing values.
  • Explore the DataFrame using statistical methods.
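
The three steps above could look roughly like this in pandas. This is only a sketch under stated assumptions: the CSV content, the column names, the `load_and_clean` helper, and the choice to fill missing amounts with the median are all hypothetical.

```python
import io

import pandas as pd

# Hypothetical CSV content, inlined so the sketch runs without an external file.
RAW_CSV = """order_id,customer,amount
1,alice,10.0
2,bob,
2,bob,
3,carol,30.0
"""


def load_and_clean(csv_text: str) -> pd.DataFrame:
    # Step 1: load the CSV into a DataFrame.
    df = pd.read_csv(io.StringIO(csv_text))
    # Step 2a: remove exact duplicate rows.
    df = df.drop_duplicates()
    # Step 2b: fill missing amounts with the column median (one possible policy).
    median_amount = df["amount"].median()
    df = df.assign(amount=df["amount"].fillna(median_amount))
    return df.reset_index(drop=True)


clean = load_and_clean(RAW_CSV)

# Step 3: explore the cleaned data with summary statistics.
summary = clean["amount"].describe()
print(clean)
print(summary)
```

Other policies for missing values, such as dropping the rows with `dropna`, may be more appropriate depending on the data.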
Follow Tutorials on Hadoop Ecosystem
Enhance your knowledge of the Hadoop ecosystem by following guided tutorials.
  • Find reputable tutorials on Hadoop components.
  • Follow the tutorials and try out the hands-on exercises.
  • Experiment with different configurations and settings.
Design a Data Pipeline for a Business Scenario
Develop your problem-solving and analytical skills by creating a data pipeline for a real-world business scenario.
  • Identify the data sources and requirements.
  • Design the data transformation and cleansing process.
  • Write a detailed description of your pipeline.

Career center

Learners who complete Data Engineering for Beginner using Google Cloud & Python will develop knowledge and skills that may be useful to these careers:
Data Engineer
A Data Engineer may design and build data pipelines to transform raw data into a usable format. Data pipelines can be used to cleanse and transform data, move data between systems, and perform other data-related tasks. These skills may help build a foundation for your career as a Data Engineer. This course can help you develop hands-on skills in data engineering, including working with data using Python and SQL.
Data Architect
A Data Architect may design and manage an organization's data infrastructure. They may work with stakeholders to understand data needs and develop data models and data management strategies. This course provides an introduction to data engineering concepts and tools, which may help you build a foundation for a career as a Data Architect.
Database Developer
A Database Developer may design and develop databases. They may work with stakeholders to understand data needs and develop database schemas and data models. This course provides an introduction to relational and NoSQL databases, which may help you build a foundation for a career as a Database Developer.
Database Administrator
A Database Administrator may manage and maintain an organization's databases. They may install, configure, and monitor databases, as well as perform data backups and recovery. This course provides an introduction to relational and NoSQL databases, which may help you build a foundation for a career as a Database Administrator.
Data Analyst
A Data Analyst may collect, clean, and analyze data to identify trends and patterns. They may use data to develop insights and make recommendations to improve business operations. This course provides an introduction to data engineering concepts and tools, which may help you build a foundation for a career as a Data Analyst.
Data Scientist
A Data Scientist may use data to develop machine learning models and other predictive analytics. They may work with stakeholders to understand data needs and develop data models and data management strategies. This course provides an introduction to data engineering concepts and tools, which may help you build a foundation for a career as a Data Scientist.
Business Intelligence Analyst
A Business Intelligence Analyst may use data to identify trends and patterns to improve business operations. They may work with stakeholders to understand data needs and develop data models and data management strategies. This course provides an introduction to data engineering concepts and tools, which may help you build a foundation for a career as a Business Intelligence Analyst.
Software Engineer
A Software Engineer may design, develop, and maintain software applications. They may work with stakeholders to understand software needs and develop software solutions. This course provides an introduction to data engineering concepts and tools, which may help you build a foundation for a career as a Software Engineer.
IT Manager
An IT Manager may plan, implement, and manage an organization's IT infrastructure. They may work with stakeholders to understand IT needs and develop IT strategies. This course provides an introduction to data engineering concepts and tools, which may help you build a foundation for a career as an IT Manager.
Systems Analyst
A Systems Analyst may analyze and design business systems. They may work with stakeholders to understand business needs and develop systems solutions. This course provides an introduction to data engineering concepts and tools, which may help you build a foundation for a career as a Systems Analyst.
Data Privacy Officer
A Data Privacy Officer may develop and implement data privacy policies and procedures. They may work with stakeholders to understand data needs and develop data management strategies. This course provides an introduction to data engineering concepts and tools, which may help you build a foundation for a career as a Data Privacy Officer.
Information Security Analyst
An Information Security Analyst may develop and implement information security policies and procedures. They may work with stakeholders to understand security needs and develop security solutions. This course provides an introduction to data engineering concepts and tools, which may help you build a foundation for a career as an Information Security Analyst.
Data Governance Specialist
A Data Governance Specialist may develop and implement data governance policies and procedures. They may work with stakeholders to understand data needs and develop data management strategies. This course provides an introduction to data engineering concepts and tools, which may help you build a foundation for a career as a Data Governance Specialist.
Project Manager
A Project Manager may plan, execute, and close projects. They may work with stakeholders to understand project goals and develop project plans. This course provides an introduction to data engineering concepts and tools, which may help you build a foundation for a career as a Project Manager.
Cloud Engineer
A Cloud Engineer may design, develop, and maintain cloud-based applications and infrastructure. They may work with stakeholders to understand cloud needs and develop cloud solutions. This course provides an introduction to data engineering concepts and tools, which may help you build a foundation for a career as a Cloud Engineer.
