Data Engineering using AWS Data Analytics from Udemy

Data Engineering is all about building Data Pipelines to get data from multiple sources into Data Lakes or Data Warehouses and then from Data Lakes or Data Warehouses to downstream systems. As part of this course, I will walk you through how to build Data Engineering Pipelines using the AWS Data Analytics Stack. It includes services such as Glue, Elastic Map Reduce (EMR), Lambda Functions, Athena, Kinesis, and many more.

Here are the high-level steps which you will follow as part of the course.

Setup Development Environment
Getting Started with AWS
Storage - All about AWS s3 (Simple Storage Service)
User Level Security - Managing Users, Roles, and Policies using IAM
Infrastructure - AWS EC2 (Elastic Compute Cloud)
Data Ingestion using AWS Lambda Functions
Overview of AWS Glue Components
Setup Spark History Server for AWS Glue Jobs
Deep Dive into AWS Glue Catalog
Exploring AWS Glue Job APIs
AWS Glue Job Bookmarks
Development Life Cycle of Pyspark
Getting Started with AWS EMR
Deploying Spark Applications using AWS EMR
Streaming Pipeline using AWS Kinesis
Consuming Data from AWS s3 using boto3 ingested using AWS Kinesis
Populating GitHub Data to AWS Dynamodb
Overview of Amazon AWS Athena
Amazon AWS Athena using AWS CLI
Amazon AWS Athena using Python boto3
Getting Started with Amazon AWS Redshift
Copy Data from AWS s3 into AWS Redshift Tables
Develop Applications using AWS Redshift Cluster
AWS Redshift Tables with Distkeys and Sortkeys
AWS Redshift Federated Queries and Spectrum

Here are the details about what you will be learning as part of this course. We will cover, with hands-on practice, most of the commonly used services available under AWS Data Analytics.

Getting Started with AWS

As part of this section, you will be going through the details related to getting started with AWS.

Introduction - AWS Getting Started
Create s3 Bucket
Create

All IT professionals who would like to work on AWS should be familiar with it. We will get into quite a few common features related to AWS s3 in this section.

Getting Started with AWS S3
Setup Data Set locally to upload to AWS s3
Adding AWS S3 Buckets and Managing Objects (files and folders) in AWS s3 buckets
Version Control for AWS S3 Buckets
Cross-Region Replication for AWS S3 Buckets
Overview of AWS S3 Storage Classes
Overview of AWS S3 Glacier
Managing AWS S3 using

As part of this section, you will understand the details related to AWS IAM users, groups, roles as well as policies.

Creating

As part of this section, we will go through some of the basics related to

Getting Started with AWS EC2
Create
Getting Started with AWS EC2
Understanding

In this section, we will understand how we can develop and deploy Lambda functions using Python as a programming language. We will also see how to maintain a bookmark or checkpoint using s3.
        Hello World using AWS Lambda
        Setup Project for local development of AWS Lambda Functions
        Deploy Project to AWS Lambda console
        Develop download functionality using requests for AWS Lambda Functions
        Using 3rd party libraries in AWS Lambda Functions
        Validating AWS s3 access for local development of AWS Lambda Functions
        Develop upload functionality to s3 using AWS Lambda Functions
        Validating AWS Lambda Functions using AWS Lambda Console
        Run AWS Lambda Functions using AWS Lambda Console
        Validating files incrementally downloaded using AWS Lambda Functions
        Reading and Writing Bookmark to s3 using AWS Lambda Functions
        Maintaining Bookmark on s3 using AWS Lambda Functions
        Review the incremental upload logic developed using AWS Lambda Functions
        Deploying AWS Lambda Functions
        Schedule AWS Lambda Functions using AWS Event Bridge
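
The bookmark idea covered in the Lambda section can be sketched roughly as follows. This is a minimal illustration, not the course's code: the hourly file naming scheme, the `last_processed` field, and both function names are assumptions made for the example. In a real Lambda function the bookmark content would be read from and written to s3, for instance with boto3's `get_object` and `put_object`.

```python
import json
from datetime import datetime, timedelta

# Hypothetical naming scheme: one source file per hour.
NAME_FORMAT = "%Y-%m-%d-%H.json.gz"


def next_file_name(raw_bookmark, baseline_name):
    """Return the name of the file to download on this run.

    raw_bookmark is the content of the bookmark object stored in s3
    (None when no bookmark exists yet); baseline_name is used on the first run.
    """
    if raw_bookmark is None:
        return baseline_name
    last_name = json.loads(raw_bookmark)["last_processed"]
    last_time = datetime.strptime(last_name, NAME_FORMAT)
    # Move forward by one hour to get the next file in the sequence.
    return (last_time + timedelta(hours=1)).strftime(NAME_FORMAT)


def serialize_bookmark(file_name):
    """Build the bookmark content to write back to s3 after a successful upload."""
    return json.dumps({"last_processed": file_name})
```

Keeping the bookmark logic free of AWS calls makes it easy to validate locally before deploying the function.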
        Overview of AWS Glue Components
        In this section, we will get a broad overview of all important Glue Components such as Glue Crawler, Glue Databases, Glue Tables, etc. We will also understand how to validate Glue tables using AWS Athena. AWS Glue (especially Glue Catalog) is one of the key components in the realm of AWS Data Analytics Services.
        Introduction - Overview of AWS Glue Components
        Create AWS Glue Crawler and AWS Glue Catalog Database as well as Table
        Analyze Data using AWS Athena
        Creating AWS S3 Bucket and Role to create AWS Glue Catalog Tables using Crawler on the s3 location
        Create and Run the AWS Glue Job to process data in AWS Glue Catalog Tables
        Validate using AWS Glue Catalog Table and by running queries using AWS Athena
        Create and Run AWS Glue Trigger
        Create AWS Glue Workflow
        Run AWS Glue Workflow and Validate
        Setup Spark History Server for AWS Glue Jobs
        AWS Glue uses Apache Spark under the hood to process the data. It is important that we set up the Spark History Server for AWS Glue Jobs to troubleshoot any issues.
        Introduction - Spark History Server for AWS Glue
        Setup Spark History Server on AWS
        Clone AWS Glue Samples repository
        Build AWS Glue Spark UI Container
        Update AWS IAM Policy Permissions
        Start AWS Glue Spark UI Container
        Deep Dive into AWS Glue Catalog
        AWS Glue has several components, but the most important ones are AWS Glue Crawlers, Databases, and Catalog Tables. In this section, we will go through some of the most important and commonly used features of the AWS Glue Catalog.
        Prerequisites for AWS Glue Catalog Tables
        Steps for Creating AWS Glue Catalog Tables
        Download Data Set to use to create AWS Glue Catalog Tables
        Upload data to s3 to crawl using AWS Glue Crawler to create required AWS Glue Catalog Tables
        Create AWS Glue Catalog Database - itvghlandingdb
        Create AWS Glue Catalog Table - ghactivity
        Running Queries using AWS Athena - ghactivity
        Crawling Multiple Folders using AWS Glue Crawlers
        Managing AWS Glue Catalog using AWS CLI
        Managing AWS Glue Catalog using Python Boto3
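
As a rough illustration of managing the Glue Catalog from Python, the sketch below collects the table names in one catalog database. The function name is hypothetical; the client is passed in so the snippet does not need AWS access to be read or tested. It assumes the caller supplies something like `boto3.client("glue")` and relies on the `get_tables` paginator, whose pages carry a `TableList`.

```python
def list_table_names(glue_client, database_name):
    """Collect the names of all tables in one Glue Catalog database."""
    paginator = glue_client.get_paginator("get_tables")
    names = []
    # Each page holds a batch of table definitions under "TableList".
    for page in paginator.paginate(DatabaseName=database_name):
        names.extend(table["Name"] for table in page["TableList"])
    return names
```

With credentials configured, a call such as `list_table_names(boto3.client("glue"), "itvghlandingdb")` would be expected to include `ghactivity` once the crawler has run.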
        Exploring AWS Glue Job APIs
        Once we deploy AWS Glue jobs, we can manage them using AWS Glue Job APIs. In this section, we will get an overview of AWS Glue Job APIs to run and manage the jobs.
        Update
        In this section, we will go through the details related to AWS Glue Job Bookmarks.
        Introduction to AWS Glue Job Bookmarks
        Cleaning up the data to run AWS Glue Jobs
        Overview of AWS Glue CLI and Commands
        Run AWS Glue Job using AWS Glue Bookmark
        Validate AWS Glue Bookmark using AWS CLI
        Add new data to the landing zone to run AWS Glue Jobs using Bookmarks
        Rerun AWS Glue Job using Bookmark
        Validate AWS Glue Job Bookmark and Files for Incremental run
        Recrawl the AWS Glue Catalog Table using
        We will use this application later while exploring EMR in detail.
        Setup Virtual Environment and Install Pyspark
        Getting Started with Pycharm
        Passing Run Time Arguments
        Accessing OS Environment Variables
        Getting Started with Spark
        Create Function for Spark Session
        Setup Sample Data
        Read data from files
        Process data using Spark APIs
        Write data to files
        Validating Writing Data to Files
        Productionizing the Code
        Getting Started with AWS EMR (Elastic Map Reduce)
        As part of this section, we will understand how to get started with AWS EMR Cluster. We will primarily focus on AWS EMR Web Console. Elastic Map Reduce is one of the key services in AWS Data Analytics Services; it provides the capability to run applications that process large-scale data by leveraging distributed computing frameworks such as Spark.
        Planning for
        We will be using the Spark Application we deployed earlier.
        Deploying Applications using AWS EMR - Introduction
        Setup
        We will use AWS Kinesis Firehose Agent and AWS Kinesis Delivery Stream to read the data from log files and ingest it into AWS s3.
        Building Streaming Pipeline using AWS Kinesis Firehose Agent and Delivery Stream
        Rotating Logs so that the files are created frequently which will be eventually ingested using AWS Kinesis Firehose Agent and AWS Kinesis Firehose Delivery Stream
        Set up AWS Kinesis Firehose Agent to get data from logs into AWS Kinesis Delivery Stream.
        Create AWS Kinesis Firehose Delivery Stream
        Planning the Pipeline to ingest data into s3 using AWS Kinesis Delivery Stream
        Create AWS IAM Group and User for Streaming Pipelines using AWS Kinesis Components
        Granting Permissions to
        Start and Validate AWS Kinesis Firehose Agent
        Conclusion - Building Simple Streaming Pipeline using AWS Kinesis Firehose
        Consuming Data from AWS s3 using Python boto3 ingested using AWS Kinesis
        As data is ingested into AWS S3, we will understand how the data ingested into AWS s3 can be processed using boto3.
        Customizing AWS s3 folder using AWS Kinesis Delivery Stream
        Create AWS IAM Policy to read from AWS s3 Bucket
        Validate AWS s3 access using AWS CLI
        Setup Python Virtual Environment to explore boto3
        Validating access to AWS s3 using Python boto3
        Read Content from AWS s3 object
        Read multiple AWS s3 Objects
        Get the number of AWS s3 Objects using Marker
        Get the size of AWS s3 Objects using Marker
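
Counting objects and summing their sizes with a marker might look roughly like the sketch below. The function name is hypothetical and the client is injected (for example `boto3.client("s3")`). It assumes the `list_objects` call, which returns a bounded page of keys at a time. When no delimiter is used, the response does not include `NextMarker`, so the last key of each page is used as the next `Marker`.

```python
def summarize_objects(s3_client, bucket, prefix=""):
    """Return (object_count, total_bytes) for a bucket, paging with Marker."""
    count, total_bytes, marker = 0, 0, None
    while True:
        kwargs = {"Bucket": bucket, "Prefix": prefix}
        if marker is not None:
            # Continue listing after the last key seen on the previous page.
            kwargs["Marker"] = marker
        response = s3_client.list_objects(**kwargs)
        contents = response.get("Contents", [])
        count += len(contents)
        total_bytes += sum(obj["Size"] for obj in contents)
        if not response.get("IsTruncated") or not contents:
            return count, total_bytes
        marker = contents[-1]["Key"]
```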
        Populating GitHub Data to AWS Dynamodb
        As part of this section, we will understand how we can populate data to AWS Dynamodb tables using Python as a programming language.
        Install required libraries to get GitHub Data to AWS Dynamodb tables.
        Understanding GitHub APIs
        Setting up GitHub API Token
        Understanding GitHub Rate Limit
        Create New Repository for since
        Extracting Required Information using Python
        Processing Data using Python
        Grant Permissions to create AWS dynamodb tables using boto3
        Create AWS Dynamodb Tables
        AWS Dynamodb CRUD Operations
        Populate AWS Dynamodb Table
        AWS Dynamodb Batch Operations
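
For batch operations, one detail worth illustrating is that a single `BatchWriteItem` request accepts at most 25 put or delete requests, so larger data sets have to be split. The sketch below only builds the request payloads; the function name and the `ghrepos` table name used in the usage note are assumptions for the example, and the items are expected to already be in the format required by whichever boto3 interface is used.

```python
MAX_BATCH_SIZE = 25  # upper limit of put/delete requests per BatchWriteItem call


def build_put_batches(table_name, items):
    """Split items into RequestItems payloads of at most 25 put requests each."""
    batches = []
    for start in range(0, len(items), MAX_BATCH_SIZE):
        chunk = items[start:start + MAX_BATCH_SIZE]
        batches.append(
            {table_name: [{"PutRequest": {"Item": item}} for item in chunk]}
        )
    return batches
```

Each element of `build_put_batches("ghrepos", items)` could then be passed as `RequestItems` to `batch_write_item`; any `UnprocessedItems` returned in the response would need to be retried.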
        Overview of Amazon AWS Athena
        As part of this section, we will understand how to get started with AWS Athena using AWS Web console. We will also focus on basic DDL and DML or CRUD Operations using AWS Athena Query Editor.
        Getting Started with Amazon AWS Athena
        Quick Recap of AWS Glue Catalog Databases and Tables
        Access AWS Glue Catalog Databases and Tables using AWS Athena Query Editor
        Create a Database and Table using AWS Athena
        Populate Data into Table using AWS Athena
        Using CTAS to create tables using AWS Athena
        Overview of Amazon AWS Athena Architecture
        Amazon AWS Athena Resources and relationship with Hive
        Create a Partitioned Table using AWS Athena
        Develop Query for Partitioned Column
        Insert into Partitioned Tables using AWS Athena
        Validate Data Partitioning using AWS Athena
        Drop AWS Athena Tables and Delete Data Files
        Drop Partitioned Table using AWS Athena
        Data Partitioning in AWS Athena using CTAS
        Amazon AWS Athena using AWS CLI
        As part of this section, we will understand how to interact with AWS Athena using AWS CLI Commands.
        Amazon AWS Athena using AWS CLI - Introduction
        Get help and list AWS Athena databases using AWS CLI
        Managing AWS Athena Workgroups using AWS CLI
        Run AWS Athena Queries using AWS CLI
        Get AWS Athena Table Metadata using AWS CLI
        Run AWS Athena Queries with a custom location using AWS CLI
        Drop AWS Athena table using AWS CLI
        Run CTAS under AWS Athena using AWS CLI
        Amazon AWS Athena using Python boto3
        As part of this section, we will understand how to interact with AWS Athena using Python boto3.
        Amazon AWS Athena using Python boto3 - Introduction
        Getting Started with Managing AWS Athena using Python boto3
        List Amazon AWS Athena Databases using Python boto3
        List Amazon AWS Athena Tables using Python boto3
        Run Amazon AWS Athena Queries with boto3
        Review AWS Athena Query Results using boto3
        Persist Amazon AWS Athena Query Results in Custom Location using boto3
        Processing AWS Athena Query Results using Pandas
        Run CTAS against Amazon AWS Athena using Python boto3
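
Running an Athena query from Python typically means starting the query and then polling for its final state. The following is a minimal sketch under that assumption; `run_query` is a hypothetical helper, and the client is passed in (for example `boto3.client("athena")`). It uses the `start_query_execution` and `get_query_execution` calls.

```python
import time


def run_query(athena_client, query, database, output_location, poll_seconds=1.0):
    """Start an Athena query and wait until it reaches a final state."""
    response = athena_client.start_query_execution(
        QueryString=query,
        QueryExecutionContext={"Database": database},
        # Athena writes the result files under this s3 location.
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = response["QueryExecutionId"]
    while True:
        execution = athena_client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return query_id, state
        time.sleep(poll_seconds)
```

Once the state is `SUCCEEDED`, the rows can be fetched with `get_query_results` or read from the output location, for instance into a Pandas DataFrame.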
        Getting Started with Amazon AWS Redshift
        As part of this section, we will understand how to get started with AWS Redshift using AWS Web console. We will also focus on basic DDL and DML or CRUD Operations using AWS Redshift Query Editor.
        Getting Started with Amazon AWS Redshift - Introduction
        Create AWS Redshift Cluster using Free Trial
        Connecting to Database using AWS Redshift Query Editor
        Get a list of tables querying information schema
        Run Queries against AWS Redshift Tables using Query Editor
        Create AWS Redshift Table using Primary Key
        Insert Data into AWS Redshift Tables
        Update Data in AWS Redshift Tables
        Delete data from AWS Redshift tables
        Redshift Saved Queries using Query Editor
        Deleting AWS Redshift Cluster
        Restore AWS Redshift Cluster from Snapshot
        Copy Data from s3 into AWS Redshift Tables
        As part of this section, we will go through the details about copying data from s3 into AWS Redshift tables using the AWS Redshift Copy command.
        Copy Data from s3 to AWS Redshift - Introduction
        Setup Data in s3 for AWS Redshift Copy
        Copy Database and Table for AWS Redshift Copy Command
        Create IAM User with full access on s3 for AWS Redshift Copy
        Run Copy Command to copy data from s3 to AWS Redshift Table
        Troubleshoot Errors related to AWS Redshift Copy Command
        Run Copy Command to copy from s3 to AWS Redshift table
        Validate using queries against AWS Redshift Table
        Overview of AWS Redshift Copy Command
        Create IAM Role for AWS Redshift to access s3
        Copy Data from s3 to AWS Redshift table using IAM Role
        Setup JSON Dataset in s3 for AWS Redshift Copy Command
        Copy JSON Data from s3 to AWS Redshift table using IAM Role
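
To make the shape of the Copy command concrete, here is a small sketch that assembles the statement for the IAM Role variant. The helper name, bucket, and role ARN are placeholders, not values from the course. The statement is built by plain string formatting, so it should only ever be fed trusted values.

```python
def build_copy_command(table, s3_path, iam_role_arn, data_format="CSV"):
    """Build a Redshift COPY statement that loads data from s3 using an IAM role."""
    if data_format == "CSV":
        format_clause = "FORMAT AS CSV"
    elif data_format == "JSON":
        # 'auto' maps JSON keys to column names automatically.
        format_clause = "FORMAT AS JSON 'auto'"
    else:
        raise ValueError(f"Unsupported format: {data_format}")
    return (
        f"COPY {table} FROM '{s3_path}' "
        f"IAM_ROLE '{iam_role_arn}' {format_clause};"
    )
```

The resulting string could be run from the Query Editor, psql, or a Python database connection.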
        Develop Applications using AWS Redshift Cluster
        As part of this section, we will understand how to develop applications against databases and tables created as part of AWS Redshift Cluster.
        Develop application using AWS Redshift Cluster - Introduction
        Allocate Elastic Ip for AWS Redshift Cluster
        Enable Public Accessibility for AWS Redshift Cluster
        Update Inbound Rules in Security Group to access AWS Redshift Cluster
        Create Database and User in AWS Redshift Cluster
        Connect to the database in AWS Redshift using psql
        Change Owner on AWS Redshift Tables
        Download AWS Redshift JDBC Jar file
        Connect to AWS Redshift Databases using IDEs such as SQL Workbench
        Setup Python Virtual Environment for AWS Redshift
        Run Simple Query against AWS Redshift Database Table using Python
        Truncate AWS Redshift Table using Python
        Create IAM User to copy from s3 to AWS Redshift Tables
        Validate Access of IAM User using Boto3
        Run AWS Redshift Copy Command using Python
        AWS Redshift Tables with Distkeys and Sortkeys
        As part of this section, we will go through AWS Redshift-specific features such as distribution keys and sort keys to create AWS Redshift tables.
        AWS Redshift Tables with Distkeys and Sortkeys - Introduction
        Quick Review of AWS Redshift Architecture
        Create multi-node AWS Redshift Cluster
        Connect to AWS Redshift Cluster using Query Editor
        Create AWS Redshift Database
        Create AWS Redshift Database User
        Create AWS Redshift Database Schema
        Default Distribution Style of AWS Redshift Table
        Grant Select Permissions on Catalog to AWS Redshift Database User
        Update Search Path to query AWS Redshift system tables
        Validate AWS Redshift table with
        AWS Redshift Federated Queries and Spectrum - Introduction
        Overview of integrating AWS RDS and AWS Redshift for Federated Queries
        Create IAM Role for AWS Redshift Cluster
        Setup Postgres Database Server for AWS Redshift Federated Queries
        Create tables in Postgres Database for AWS Redshift Federated Queries
        Creating Secret using Secrets Manager for Postgres Database
        Accessing Secret Details using Python Boto3
        Reading Json Data to Dataframe using Pandas
        Write JSON Data to AWS Redshift Database Tables using Pandas
        Create AWS IAM Policy for Secret and associate with Redshift Role
        Create AWS Redshift Cluster using AWS IAM Role with permissions on secret
        Create AWS Redshift External Schema to Postgres Database
        Update AWS Redshift Cluster Network Settings for Federated Queries
        Performing ETL using AWS Redshift Federated Queries
        Clean up resources added for AWS Redshift Federated Queries
        Grant Access on AWS Glue Data Catalog to AWS Redshift Cluster for Spectrum
        Setup AWS Redshift Clusters to run queries using Spectrum
        Quick Recap of AWS Glue Catalog Database and Tables for AWS Redshift Spectrum
        Create External Schema using AWS Redshift Spectrum
        Run Queries using AWS Redshift Spectrum
        Cleanup the AWS Redshift Cluster

What's inside

Learning objectives

Data Engineering leveraging services under AWS Data Analytics
AWS essentials such as S3, IAM, EC2, etc.
Understanding AWS S3 for cloud-based storage
Understanding details related to virtual machines on AWS known as EC2
Managing AWS IAM users, groups, roles, and policies for RBAC (role-based access control)
Managing tables using AWS Glue Catalog
Engineering batch data pipelines using AWS Glue Jobs
Orchestrating batch data pipelines using AWS Glue Workflows
Running queries using AWS Athena, a serverless query engine service
Using AWS Elastic Map Reduce (EMR) clusters for building data pipelines
Using AWS Elastic Map Reduce (EMR) clusters for reports and dashboards
Data ingestion using AWS Lambda Functions
Scheduling using AWS EventBridge
Engineering streaming pipelines using AWS Kinesis
Streaming web server logs using AWS Kinesis Firehose
Overview of data processing using AWS Athena
Running AWS Athena queries or commands using CLI
Running AWS Athena queries using Python boto3
Creating AWS Redshift cluster, creating tables, and performing CRUD operations
Copying data from S3 to AWS Redshift tables
Understanding distribution styles and creating tables using distkeys
Running queries on external RDBMS tables using AWS Redshift Federated Queries
Running queries on Glue or Athena catalog tables using AWS Redshift Spectrum

Syllabus

Introduction to the course

Introduction to Data Engineering using AWS Analytics Services

Video Lectures and Reference Material

Taking the Udemy Course for new Udemy Users

Additional Costs for AWS Infrastructure for Hands-on Practice

Signup for AWS Account

Logging in to AWS Account

Overview of AWS Billing Dashboard - Cost Explorer and Budgets

Setup Local Development Environment for AWS on Windows 10 or Windows 11

Setup Local Environment on Windows for AWS

Overview of Powershell on Windows 10 or Windows 11

Setup Ubuntu VM on Windows 10 or 11 using wsl

Setup Ubuntu VM on Windows 10 or 11 using wsl - Contd...

Setup Python venv and pip on Ubuntu

Setup AWS CLI on Windows and Ubuntu using Pip

Create AWS IAM User and Download Credentials

Configure AWS CLI on Windows

Create Python Virtual Environment for AWS Projects

Setup Boto3 as part of Python Virtual Environment

Setup Jupyter Lab and Validate boto3

Setup Local Development Environment for AWS on Mac

Setup Local Environment for AWS on Mac

Setup AWS CLI on Mac

Setup AWS IAM User to configure AWS CLI

Configure AWS CLI using IAM User Credentials

Setup Python Virtual Environment on Mac using Python 3

Setup Environment for Practice using Cloud9

Introduction to Cloud9

Setup Cloud9

Overview of Cloud9 IDE

Docker and AWS CLI on Cloud9

Cloud9 and EC2

Accessing Web Applications

Allocate and Assign Static IP

Changing Permissions using IAM Policies

Increasing Size of EBS Volume

Opening ports for Cloud9 Instance

Setup Jupyter lab on Cloud9 Instance

Open SSH Port for Cloud9 EC2 Instance

Connect to Cloud9 EC2 Instance using SSH

Understand how to get started by creating required AWS s3 bucket and granting permissions on bucket to AWS IAM Roles via IAM Policies.

Introduction - AWS Getting Started

[Instructions] Introduction - AWS Getting Started

Create AWS s3 Bucket using AWS Web Console

[Instructions] Create s3 Bucket

Create AWS IAM Group and User using AWS Web Console

[Instructions] Create IAM Group and User

Overview of AWS IAM Roles to grant permissions between AWS Services

[Instructions] Overview of Roles

Create and Attach AWS IAM Custom Policy using AWS Web Console

[Instructions and Code] Create and Attach Custom Policy

Configure and Validate AWS Command Line Interface to run AWS Commands

[Instructions and Code] Configure and Validate AWS CLI

Learn all the basic concepts of AWS s3 such as copying data as objects into s3 bucket, version control, overview of s3 tiers as well as managing objects in AWS s3 using AWS CLI.

Getting Started with AWS Simple Storage aka S3

[Instructions] Getting Started with AWS S3

Setup Data Set locally to upload into AWS s3

[Instructions] Setup Data Set locally to upload into AWS s3

Adding AWS S3 Buckets and Objects using AWS Web Console

[Instructions] Adding AWS s3 Buckets and Objects

Version Control of AWS S3 Objects or Files

[Instructions] Version Control in AWS S3

AWS S3 Cross-Region Replication for fault tolerance

[Instructions] AWS S3 Cross-Region Replication for fault tolerance

Overview of AWS S3 Storage Classes or Storage Tiers

[Instructions] Overview of AWS S3 Storage Classes or Storage Tiers

Overview of Glacier in AWS s3

[Instructions] Overview of Glacier in AWS s3

Managing AWS S3 buckets and objects using AWS CLI

[Instructions and Commands] Managing AWS S3 buckets and objects using AWS CLI

Managing Objects in AWS S3 using AWS CLI - Lab

[Instructions] Managing Objects in AWS S3 using AWS CLI - Lab

Ability to create Users, Groups and Roles using AWS IAM and also attach permissions via AWS IAM Policies. One will also learn how to create custom AWS IAM Policies.

Creating AWS IAM Users with Programmatic and Web Console Access

[Instructions] Creating IAM Users

Logging into AWS Management Console using AWS IAM User

[Instructions] Logging into AWS Management Console using IAM User

Validate Programmatic Access to AWS IAM User via AWS CLI

[Instructions and Commands] Validate Programmatic Access to IAM User

Getting Started with AWS IAM Identity-based Policies

[Instructions and Commands] IAM Identity-based Policies

Managing AWS IAM User Groups

[Instructions and Commands] Managing IAM Groups

Managing AWS IAM Roles for Service Level Access

[Instructions and Commands] Managing IAM Roles

Overview of AWS Custom Policies to grant permissions to Users, Groups, and Roles

[Instructions and Commands] Overview of Custom Policies

Managing AWS IAM Groups, Users, and Roles using AWS CLI

[Instructions and Commands] Managing IAM using AWS CLI

Understand some of the AWS EC2 Key Concepts with hands-on practice on how to create a Key Pair, set up an EC2 Instance using the key pair, update security groups, etc.

Getting Started with AWS Elastic Compute Cloud aka EC2

[Instructions] Getting Started with EC2

Create AWS EC2 Key Pair for SSH Access

[Instructions] Create EC2 Key Pair

Launch AWS EC2 Instance or Virtual Machine

[Instructions] Launch EC2 Instance

Connecting to AWS EC2 Instance or Virtual Machine using SSH

[Instructions and Commands] Connecting to EC2 Instance

Good to know

Know what's good, what to watch for, and possible dealbreakers

Explores AWS Services, which is standard in data engineering

Taught by instructors who are recognized in the field of data engineering

Examines batch and streaming pipeline concepts, which are highly relevant to data engineering

Explores PySpark, Apache Spark, and Athena, which are all popular technologies used by data engineers

Develops foundational skills in AWS Analytics Services, such as s3, EC2, and IAM, which are essential technologies for data engineers

Requires prerequisite knowledge of basic computing concepts, which may be a caveat for some learners

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Data Engineering using AWS Data Analytics with these activities:

Review AWS Fundamentals

Solidify your understanding of core AWS concepts to strengthen your proficiency in the course.

Review AWS documentation on core services such as S3, EC2, and Lambda.
Complete hands-on labs or tutorials to practice using these services.

Review Python programming basics and data structures

Ensuring that your Python programming basics and data structures are up to date will help you succeed in this course.

Review Python data types and operators
Practice working with lists, dictionaries, and sets
Understand the concept of object-oriented programming
Solve simple coding problems using Python

Create a diagram of the AWS Data Analytics stack

Visualizing the relationships and components of the AWS Data Analytics stack will enhance your understanding of how the services work together.

Choose a visual tool like Draw.io or Lucidchart
Identify the key services of the AWS Data Analytics stack like S3, Redshift, and Athena
Draw and label connections between the services
Describe the flow of data and processes within the stack

Join a study group or online forum dedicated to AWS Data Analytics

Engaging with other learners and experts in a study group or online forum will expand your knowledge, expose you to different perspectives, and provide opportunities for collaboration.

Practice setting up AWS IAM roles and policies

Setting up AWS IAM roles and policies correctly is crucial for security and access control in your AWS environment. Practicing these tasks will help reinforce your understanding and ensure that you can perform them confidently.

Create a new AWS IAM user with limited permissions
Create an AWS IAM role that grants specific permissions
Allow the user to assume the role
Test the permissions of the user
Clean up the resources you created
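
In IAM, a user gains a role's permissions by assuming the role, which the role's trust policy has to allow. The sketch below builds such a trust policy along with a read-only permissions policy for one bucket. The function names, account ID, user name, and bucket name are placeholders for illustration; the resulting documents could be supplied as JSON when creating the role and policy in the console, the CLI, or boto3.

```python
import json


def build_trust_policy(account_id, user_name):
    """Trust policy that lets one IAM user assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:user/{user_name}"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_read_only_bucket_policy(bucket):
    """Permissions policy that allows listing and reading a single bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Print the documents as JSON, ready to paste into the console or pass to the CLI.
    print(json.dumps(build_trust_policy("123456789012", "example-user"), indent=2))
    print(json.dumps(build_read_only_bucket_policy("example-bucket"), indent=2))
```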

Follow a tutorial on deploying a data pipeline using AWS Glue

Hands-on experience with deploying a data pipeline is invaluable. Following a tutorial will provide guidance and ensure that you complete all the necessary steps.

Find an online tutorial on deploying a data pipeline using AWS Glue
Gather the required resources and set up your AWS environment
Follow the tutorial steps to create a data pipeline
Test the data pipeline to ensure it's working correctly
Clean up the resources if you don't need them anymore

AWS Lambda Function Development

Deepen your understanding of AWS Lambda functions by building and deploying your own.

Create a simple AWS Lambda function using Python or Java.
Deploy your function to AWS and test its functionality.
Experiment with different event triggers and input data.
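
A simple function for this activity could be as small as the sketch below. It follows the standard Python handler signature of `(event, context)`; the `name` field in the event and the shape of the response are assumptions chosen for the example.

```python
import json


def lambda_handler(event, context):
    """Return a greeting; reads an optional "name" field from the event."""
    name = event.get("name", "World")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

Because the handler is a plain function, it can be called locally with a sample event, for example `lambda_handler({"name": "AWS"}, None)`, before it is deployed.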

Develop a budgeting and cost optimization plan for an AWS Data Analytics project

Creating a budgeting and cost optimization plan will help you manage your AWS resources efficiently and avoid unexpected expenses.

Estimate the costs of your AWS Data Analytics project
Identify potential areas for cost optimization
Develop a cost optimization plan
Implement your cost optimization plan
Monitor and adjust your cost optimization plan as needed

Contribute to an open-source project related to AWS Data Analytics

Contributing to an open-source project not only helps the community, but also allows you to learn from experienced developers and gain valuable practical experience.

Identify an open-source project related to AWS Data Analytics
Review the project's documentation and codebase
Identify an area where you can contribute
Create a pull request with your changes

Organize and review your notes, assignments, and quizzes from the course

By organizing and reviewing your notes, assignments, and quizzes, you can reinforce your learning and identify areas where you may need additional support.

Gather your notes, assignments, and quizzes from the course
Create a system to organize your materials
Review your materials regularly
Identify key concepts and areas where you need more practice
Use your organized materials to prepare for upcoming assessments

Develop a Data Analytics Project using AWS Services

Put your learning into practice by building a comprehensive data analytics project leveraging AWS services.

Define the scope and objectives of your project.
Choose appropriate AWS services for data ingestion, processing, and visualization.
Implement your data pipeline using AWS services such as Glue, EMR, Athena, and Redshift.
Analyze and interpret your results to derive insights.

Career center

Learners who complete Data Engineering using AWS Data Analytics will develop knowledge and skills that may be useful to these careers:

Data Engineer

As a Data Engineer, you will be responsible for designing, building, and maintaining data pipelines. This course will help you build a strong foundation in AWS Data Analytics Services, which are essential for data engineering work. You will learn how to use AWS Glue, EMR, Lambda Functions, Athena, and other services to create and manage data pipelines. This course will also teach you how to use Python and other programming languages to develop data engineering applications.

Data Analyst

As a Data Analyst, you will be responsible for collecting, analyzing, and interpreting data to help businesses make informed decisions. This course will help you build a strong foundation in AWS Data Analytics Services, which are essential for data analysis work. You will learn how to use AWS Glue, EMR, Athena, and other services to collect, process, and analyze data. This course will also teach you how to use Python and other programming languages to develop data analysis applications.

Data Scientist

As a Data Scientist, you will be responsible for developing and applying statistical and machine learning models to data to help businesses solve problems and make better decisions. This course will help you build a strong foundation in AWS Data Analytics Services, which are essential for data science work. You will learn how to use AWS Glue, EMR, Athena, and other services to collect, process, and analyze data. This course will also teach you how to use Python and other programming languages to develop data science applications.

Cloud Architect

As a Cloud Architect, you will be responsible for designing and managing cloud computing solutions. This course will help you build a strong foundation in AWS Data Analytics Services, which are essential for cloud architecture work. You will learn how to use AWS Glue, EMR, Athena, and other services to build and manage data pipelines in the cloud. This course will also teach you how to use Python and other programming languages to develop cloud computing applications.

DevOps Engineer

As a DevOps Engineer, you will be responsible for bridging the gap between development and operations teams. This course will help you build a strong foundation in AWS Data Analytics Services, which are essential for DevOps work. You will learn how to use AWS Glue, EMR, Athena, and other services to build and manage data pipelines that can be deployed and managed by both development and operations teams.

Data Architect

As a Data Architect, you will be responsible for designing and managing the architecture of data systems. This course will help you build a strong foundation in AWS Data Analytics Services, which are essential for data architecture work. You will learn how to use AWS Glue, EMR, Athena, and other services to design and manage data pipelines that meet the needs of your business.

Business Intelligence Analyst

As a Business Intelligence Analyst, you will be responsible for helping businesses understand their data and make better decisions. This course will help you build a strong foundation in AWS Data Analytics Services, which are essential for business intelligence work. You will learn how to use AWS Glue, EMR, Athena, and other services to collect, process, and analyze data. This course will also teach you how to use Python and other programming languages to develop business intelligence applications.

Database Administrator

As a Database Administrator, you will be responsible for managing and maintaining databases. This course may be useful to you if you are interested in working with AWS data analytics services. You will learn how to use AWS Glue, EMR, Athena, and other services to manage and maintain data pipelines.

Software Engineer

As a Software Engineer, you will be responsible for developing and maintaining software applications. This course may be useful to you if you are interested in working with AWS data analytics services. You will learn how to use AWS Glue, EMR, Athena, and other services to develop and maintain data pipelines.

Systems Administrator

As a Systems Administrator, you will be responsible for managing and maintaining computer systems. This course may be useful to you if you are interested in working with AWS data analytics services. You will learn how to use AWS Glue, EMR, Athena, and other services to manage and maintain data pipelines.

Network Engineer

As a Network Engineer, you will be responsible for designing and managing computer networks. This course may be useful to you if you are interested in working with AWS data analytics services. You will learn how to use AWS Glue, EMR, Athena, and other services to manage and maintain data pipelines.

Security Analyst

As a Security Analyst, you will be responsible for protecting computer systems from security threats. This course may be useful to you if you are interested in working with AWS data analytics services. You will learn how to use AWS Glue, EMR, Athena, and other services to protect data pipelines from security threats.

Project Manager

As a Project Manager, you will be responsible for planning and managing projects. This course may be useful to you if you are interested in working with AWS data analytics services. You will learn how to use AWS Glue, EMR, Athena, and other services to plan and manage data pipelines.

Product Manager

As a Product Manager, you will be responsible for planning and managing products. This course may be useful to you if you are interested in working with AWS data analytics services. You will learn how to use AWS Glue, EMR, Athena, and other services to plan and manage data pipelines.

Data Warehouse Architect

As a Data Warehouse Architect, you will be responsible for designing and managing data warehouses. This course may be useful to you if you are interested in working with AWS data analytics services. You will learn how to use AWS Glue, EMR, Athena, and other services to design and manage data warehouses.
