Mastering AWS Glue, QuickSight, Athena & Redshift Spectrum from Udemy

What's inside

Learning objectives

Confidently work with AWS serverless services to develop a data catalog, ETL, analytics, and reporting on a data lake
Develop deep knowledge of Glue, Athena, Redshift Spectrum, and QuickSight

Build a serverless data lake on AWS using structured and unstructured data
Architect serverless analytics solutions on the AWS cloud platform

Syllabus

Introduction

Instructor and Course Introduction

Pre-requisites - What you'll need for this course

Course Objectives

Course Content, Convention and Resources

Section Agenda

Learn about the basics of Serverless Computing and which AWS Services fit into it

Learn basics of AWS Serverless Data Lake Architecture

Set up sample data in S3 buckets that will be used throughout this course

Configure S3 Storage Analytics

Introduction to Amazon Redshift

Develop Amazon Redshift Cluster

Install and setup SQL Client to work with Amazon Redshift

Load sample data in Redshift cluster

Learn AWS Glue Architecture with diagrams

Learn frequently used AWS Glue Terms and their meanings

Learn about different applications and features of AWS Glue

Learn internal architecture of AWS Glue

Learn about the cost economics of AWS Glue

Setup IAM Role and policies to use with AWS Glue

Learn about the networking concepts and settings required for AWS Glue

Configure network settings for AWS Glue

Learn about the concept of Data Catalog in AWS Glue

Learn to develop databases in AWS Glue

Learn to develop tables in AWS Glue

Develop tables manually in AWS Glue

Learn about the concept of Crawler in AWS Glue

Learn about the concept of classifiers in AWS Glue

Develop crawlers in AWS Glue - Lab 1

Develop crawlers in AWS Glue - Lab 2

Develop crawlers in AWS Glue - Lab 3

Develop crawlers in AWS Glue - Lab 4

Develop crawlers in AWS Glue - Lab 5

Develop crawlers in AWS Glue - Lab 6

Develop crawlers in AWS Glue - Lab 7

Learn to develop serverless ETL jobs with AWS Glue

Learn about different ETL job properties in AWS Glue

Learn to develop serverless ETL jobs with AWS Glue with Redshift as data source

Learn to develop Python scripts and properties for serverless ETL jobs using AWS Glue

Learn about built-in ETL Transformations in AWS Glue

Learn about Triggers in AWS Glue

Learn about AWS Glue Development Endpoints

Learn to install and setup Apache Zeppelin

Learn to install Git and setup Port Forwarding

Learn to integrate AWS Glue Development Endpoint with Apache Zeppelin Notebook

Learn monitoring options available for AWS Glue

AWS Glue supports timeout values for ETL Jobs

AWS Glue supports reading from Amazon DynamoDB Tables

AWS Glue provides additional ETL Job metrics

AWS Glue supports data encryption at rest

AWS Glue supports connecting Sagemaker notebooks to dev endpoints

AWS Glue supports resource based policies and permissions

AWS Glue introduces Python Shell Jobs which can be used for custom transformations and other generic tasks in ETL jobs

Download Source code AWS Glue Data Catalog Client - Hive Metastore

AWS Glue enables running Apache Spark SQL Queries

AWS Glue supports additional options for memory-intensive jobs

AWS Glue crawlers support existing Data Catalog tables as sources

AWS Glue enables continuous logging for Spark ETL Jobs

AWS Glue supports scripts compatible with Python 3.6 in Shell Jobs

AWS Glue provides workflows to orchestrate ETL workloads

AWS Glue supports running ETL Jobs on Spark 2.4.3 with Python 3

AWS Glue supports additional options for memory intensive jobs

AWS Glue supports bookmarking Parquet and ORC Files using ETL Jobs

Launch AWS Glue, EMR and Aurora Serverless Clusters in Shared VPCs

AWS Glue provides FindMatches ML Transform

AWS Glue releases binaries of Glue ETL libraries for Glue Jobs

AWS Glue provides Apache Spark UI to monitor Glue ETL Jobs

AWS Glue provides ability to rewind Spark ETL Job bookmarks

AWS Glue supports FindMatches ML Transform on Spark 2.4.3 & Glue 1.0

AWS Glue supports bringing your own JDBC driver for Spark ETL Jobs

Glue adds new transforms - Purge, Transition and Merge

Glue supports reading & writing to DocumentDB & MongoDB Collection

AWS Glue supports new tables, update schema & partitions from Jobs

AWS Glue supports serverless streaming ETL

Traffic lights

Read about what's good, what should give you pause, and possible dealbreakers.

Delves into concepts essential to careers in data analytics: data catalog, ETL, analytics, and reporting using data lakes

Meets industry demand for serverless data analytics professionals through practical training

Focuses on intermediate to advanced topics, catering to experienced AWS Developers, Architects, and Administrators

Provides hands-on labs to reinforce understanding of AWS Glue, Athena, Redshift Spectrum, and QuickSight

Offers updated content on new features released by Amazon since 2018, ensuring currency with industry advancements

Requires prior experience with AWS Console and services, potentially posing a challenge for beginners

Reviews summary

Mastering AWS serverless data analytics

According to learners, this course offers a deep dive into AWS's serverless data analytics services, specifically Glue, QuickSight, Athena, and Redshift Spectrum. Many students appreciate the practical, hands-on approach and feel it provides a solid foundation for building serverless data lakes. The course is generally considered highly valuable for experienced AWS professionals looking to specialize. However, multiple reviews highlight that the course is not suitable for beginners and a strong understanding of core AWS services is a crucial prerequisite. Some learners also note that the labs require an AWS account and can incur costs. Overall, the sentiment is largely positive among those who meet the intended intermediate/expert audience.

Focuses effectively on Glue, Athena, Redshift Spectrum, QuickSight.

"The sections on Glue and Athena were particularly strong."

"Gave me a great overview and practical use cases for QuickSight."

"Explained Redshift Spectrum integration clearly."

"I feel confident using these specific services now."

Hands-on exercises reinforce learning.

"The labs are very helpful for hands-on practice."

"Plenty of demos that make the concepts clear."

"Walking through the labs step-by-step was beneficial."

"I appreciated the practical setup examples."

Provides thorough coverage of complex topics.

"Instructor explains all the minute details about each component and service."

"Provides a deep understanding of the inner workings and concepts."

"This course is very comprehensive covering most aspects of the four services."

"I gained practical skills that I could apply to my work immediately."

Using AWS services incurs cloud usage fees.

"Be aware that practicing the labs will cost money on your AWS account."

"The expense of running the labs might be a barrier for some."

"Need to manage AWS resources carefully to control costs."

"Access to an AWS account with a budget is necessary."

Not suitable for beginners; assumes existing knowledge.

"Definitely for intermediate to advanced AWS users."

"I struggled without a basic understanding of AWS concepts."

"This course assumes you have prior working experience on AWS."

"Learners should meet the stated prerequisite knowledge."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Mastering AWS Glue, QuickSight, Athena & Redshift Spectrum with these activities:

Compile and organize the materials from the course

Organize the materials from the course to make them easier to review later

Create a folder for the course materials
Download or save all of the course materials
Organize the materials into subfolders (e.g., by topic)

Follow tutorial to build a simple data pipeline

Build a simple data pipeline to understand the basic concept of AWS Glue

Find a tutorial on building a data pipeline using AWS Glue
Follow the tutorial step-by-step
Test the data pipeline

Practice writing ETL jobs using AWS Glue

Practice writing ETL jobs to solidify your understanding of AWS Glue

Create a new AWS Glue job
Write the ETL code using Python or Scala
Test the ETL job

Attend an AWS Meetup or conference

Attend an AWS Meetup or conference to connect with other AWS professionals and learn about the latest AWS technologies

Find an AWS Meetup or conference in your area
Register for the event
Attend the event

Career center

Learners who complete Mastering AWS Glue, QuickSight, Athena & Redshift Spectrum will develop knowledge and skills that may be useful to these careers:

Data Analyst

Data Analysts are responsible for collecting, cleaning, and analyzing data to provide insights and recommendations to organizations. This course will teach you how to use AWS Glue, Athena, Redshift Spectrum, and QuickSight to build a data lake, analyze data, and create visualizations. These skills are essential for Data Analysts who work with large datasets and need to be able to quickly and efficiently extract meaningful insights from data.

Big Data Architect

Big Data Architects are responsible for designing and developing big data solutions. This course can be helpful for Big Data Architects who are looking to learn about the latest cloud-based data technologies. You will learn how to build a serverless data lake, analyze data with SQL and Python, and create visualizations in QuickSight. This knowledge can help you develop and manage big data solutions that are scalable, reliable, and secure.

Data Warehouse Architect

Data Warehouse Architects are responsible for designing, developing, and maintaining data warehouses. This course can be helpful for Data Warehouse Architects who are looking to learn about the latest cloud-based data technologies. You will learn how to build a serverless data lake, analyze data with SQL and Python, and create visualizations in QuickSight. This knowledge can help you to design and manage data warehouses that are scalable, reliable, and secure.

Data Infrastructure Engineer

Data Infrastructure Engineers are responsible for designing and developing data infrastructure solutions. This course can be helpful for Data Infrastructure Engineers who are looking to learn about the latest cloud-based data technologies. You will learn how to build a serverless data lake, analyze data with SQL and Python, and create visualizations in QuickSight. This knowledge can help you design and develop data infrastructure solutions that are scalable, reliable, and secure.

Data Architect

Data Architects not only bridge the gap between business and data, but they also ensure that the collection, storage, and usage of data adhere to both IT best practices and the organization's regulations. This course will help you gain a comprehensive understanding of data lake architectures and the components involved in a serverless data lake solution. This knowledge can be used to understand and analyze different data requirements, design and maintain data pipelines, and develop data management solutions to support business needs. Additionally, you will learn how to manage data governance and security in a cloud-based environment.

Data Integration Engineer

Data Integration Engineers are responsible for designing and developing data integration solutions. This course can be helpful for Data Integration Engineers who are looking to learn about the latest cloud-based data technologies. You will learn how to build a serverless data lake, analyze data with SQL and Python, and create visualizations in QuickSight. This knowledge can help you to design and develop data integration solutions that are scalable, reliable, and secure.

Data Governance Analyst

Data Governance Analysts are responsible for developing and implementing data governance policies and procedures. This course can be helpful for Data Governance Analysts who are looking to learn about the latest cloud-based data technologies. You will learn how to build a serverless data lake, analyze data with SQL and Python, and create visualizations in QuickSight. This knowledge can help you develop and implement data governance policies and procedures that are effective and efficient.

Data Quality Analyst

Data Quality Analysts are responsible for ensuring the quality of data. This course can be helpful for Data Quality Analysts who are looking to learn about the latest cloud-based data technologies. You will learn how to build a serverless data lake, analyze data with SQL and Python, and create visualizations in QuickSight. This knowledge can help you develop and implement data quality processes that are effective and efficient.

Database Developer

Database Developers are responsible for designing, developing, and maintaining databases. This course can be helpful for Database Developers who are looking to learn about the latest cloud-based data technologies. You will learn how to build a serverless data lake, analyze data with SQL and Python, and create visualizations in QuickSight. This knowledge can help you to develop and manage databases that are scalable, reliable, and secure.

Data Engineer

A Data Engineer is responsible for the design, construction, and maintenance of big data infrastructures. The field of Data Engineering is closely related to Data Science and Machine Learning. This course introduces students to the tools and techniques employed in this field, which can also help you perform some of the most common tasks required in this role. For example, this course will teach you to extract, transform, and load data from a data source and bring it into a data warehouse. You will also learn how to write SQL queries to analyze your data. This course covers a variety of topics that go far beyond the scope of what a Data Engineer typically does, but it can help you gain a solid foundation with essential skills.

Cloud Engineer

Cloud Engineers are responsible for designing, deploying, and managing cloud-based applications and infrastructure. This course will teach you how to use AWS Glue, Athena, Redshift Spectrum, and QuickSight to build a serverless data lake, analyze data, and create visualizations. These skills can be valuable for Cloud Engineers who are working on projects that involve data analysis and visualization.

Software Engineer

Software Engineers are responsible for designing, developing, and maintaining software applications. This course will introduce you to the fundamentals of cloud-based data technologies and how they can be used to build serverless applications. You will learn how to use AWS Glue, Athena, Redshift Spectrum, and QuickSight to build a data lake, analyze data, and create visualizations. These skills can be valuable for Software Engineers who are working on projects that involve data analysis and visualization.

Business Analyst

A Business Analyst analyzes an organization or business domain to understand its needs and goals. This course will help you gain a better understanding of how data can be used to improve business outcomes. You will learn how to collect, analyze, and interpret data to identify trends and make recommendations to improve decision-making. This course can help you develop the skills and knowledge necessary to be successful as a Business Analyst, particularly in organizations that leverage data to make informed decisions.

Data Scientist

Data Scientists use scientific methods, processes, algorithms, and systems to extract knowledge and insights from data in various forms, both structured and unstructured. This course will introduce you to the fundamentals of data science by teaching you how to build a data lake, analyze data with SQL and Python, and create visualizations to communicate your findings. Completing this course can help you get started in this field and prepare for more advanced topics in data science, such as machine learning and artificial intelligence.

Machine Learning Engineer

Machine Learning Engineers are responsible for designing and developing machine learning models. This course may be useful for Machine Learning Engineers who are looking to gain experience with cloud-based data technologies. You will learn how to build a serverless data lake, analyze data with SQL and Python, and create visualizations in QuickSight. This knowledge can help you develop and deploy machine learning models that are scalable, reliable, and accurate.
