PS:
Please do NOT join the course if you do NOT have basic working knowledge of the AWS Console and AWS services, as AWS beginners may struggle to understand some of the topics.
The course explains all the labs. If you want to practice the labs yourself, you will need an AWS account, and running them may cost $$.
Basic working knowledge of Redshift is recommended, but not a must.
This course has been designed for intermediate and expert AWS Developers / Architects / Administrators.
The course covers every feature that AWS has released since 2018 for AWS Glue, Amazon QuickSight, Amazon Athena, and Amazon Redshift Spectrum, and it is regularly updated with every new feature released for these services.
Serverless is the future of cloud computing, and AWS is continuously launching new services built on the serverless paradigm. AWS launched Athena and QuickSight in Nov 2016, Redshift Spectrum in Apr 2017, and Glue in Aug 2017. Data and analytics on the AWS platform are evolving and gradually moving to a serverless model.
Businesses have always wanted to manage less infrastructure and deliver more solutions. Big data workloads continuously push the boundaries of infrastructure. Having Serverless Storage, Serverless ETL, Serverless Analytics, and Serverless Reporting all on one cloud platform sounded too good to be true for a very long time. But now it's a reality on the AWS platform. AWS is the only cloud provider that has all the native serverless components for a true Serverless Data Lake Analytics solution.
It's not a secret that when a technology is new in the industry, professionals with expertise in new technologies command great salaries. Serverless is the future, Serverless is the industry demand, and Serverless is new. It's the perfect time and opportunity to jump into Serverless Analytics on AWS Platform.
In this course, we will learn the following:
1) We will start with the basics of Serverless Computing and the basics of Data Lake Architecture on AWS.
2) We will learn Schema Discovery, ETL, Scheduling, and Tools integration using the Serverless AWS Glue Engine, which is built on a Spark environment.
3) We will learn to develop a centralized Data Catalog, also using the Serverless AWS Glue Engine.
4) We will learn to query the data lake using the Serverless Athena Engine, which is built on top of Presto and Hive.
5) We will learn to bridge the data warehouse and the data lake using the Serverless Amazon Redshift Spectrum Engine, which is built on top of the Amazon Redshift platform.
6) We will learn to develop reports and dashboards, with a PowerPoint-like slideshow feature and mobile support, without building any report server, by using the Serverless Amazon QuickSight Reporting Engine.
7) Finally, we will learn how to source data from the data warehouse and the data lake, join data, apply row-level security, and use drill-down, drill-through, and other data functions with the Serverless Amazon QuickSight Reporting Engine.
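To give a feel for what "querying the data lake with Athena" in step 4 looks like in practice, here is a minimal sketch in Python. It is not taken from the course material. It assumes the boto3 SDK, and the database name `sales_lake`, the table `orders`, and the results bucket are hypothetical placeholders. The helper only assembles the request; actually submitting it needs AWS credentials and may incur charges.

```python
from typing import Any, Dict


def build_athena_request(sql: str, database: str, output_s3: str) -> Dict[str, Any]:
    """Assemble keyword arguments for Athena's StartQueryExecution API."""
    if not output_s3.startswith("s3://"):
        raise ValueError("output_s3 must be an s3:// URI")
    return {
        "QueryString": sql,
        # The database is one registered in the Glue Data Catalog.
        "QueryExecutionContext": {"Database": database},
        # Athena writes query results to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_s3},
    }


def run_query(request: Dict[str, Any]) -> str:
    """Submit the query and return its execution id (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    athena = boto3.client("athena")
    response = athena.start_query_execution(**request)
    return response["QueryExecutionId"]


# Hypothetical names: 'sales_lake', 'orders', and the bucket are placeholders.
request = build_athena_request(
    sql="SELECT order_status, COUNT(*) AS n FROM orders GROUP BY order_status",
    database="sales_lake",
    output_s3="s3://example-athena-results/queries/",
)
```

Calling `run_query(request)` would start the query against a real account. Keeping the request builder separate from the network call makes the sketch easy to inspect without touching AWS.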
This course understands your time is important, and so the course is designed to be laser-sharp on lecture timings, where all the trivial details are kept at a minimum and focus is kept on core content for experienced AWS Developers / Architects / Administrators. By the end of this course, you can feel assured and confident that you are future-proof for the next change and disruption sweeping the cloud industry.
I am very passionate about AWS Serverless computing on Data and Analytics platform, and am covering A-to-Z of all the topics discussed in this course.
So if you are excited and ready to get trained on the AWS Serverless Analytics platform, I am ready to welcome you to my class.
Instructor and Course Introduction
Pre-requisites - What you'll need for this course
Course Objectives
Course Content, Convention and Resources
Section Agenda
Learn about the basics of Serverless Computing and which AWS Services fit into it
Learn the basics of AWS Serverless Data Lake Architecture
Set up sample data on S3 buckets that will be used throughout this course
Configure S3 Storage Analytics
Introduction to Amazon Redshift
Develop Amazon Redshift Cluster
Install and set up a SQL Client to work with Amazon Redshift
Load sample data into the Redshift cluster
Learn AWS Glue Architecture with diagrams
Learn frequently used AWS Glue Terms and their meanings
Learn about different applications and features of AWS Glue
Learn internal architecture of AWS Glue
Learn about the cost economics of AWS Glue
Set up IAM Role and policies to use with AWS Glue
Learn about the networking concepts and settings required for AWS Glue
Configure network settings for AWS Glue
Learn about the concept of Data Catalog in AWS Glue
Learn to develop databases in AWS Glue
Learn to develop tables in AWS Glue
Develop tables manually in AWS Glue
Learn about the concept of Crawler in AWS Glue
Learn about the concept of classifiers in AWS Glue
Develop crawlers in AWS Glue - Lab 1
Develop crawlers in AWS Glue - Lab 2
Develop crawlers in AWS Glue - Lab 3
Develop crawlers in AWS Glue - Lab 4
Develop crawlers in AWS Glue - Lab 5
Develop crawlers in AWS Glue - Lab 6
Develop crawlers in AWS Glue - Lab 7
Learn to develop serverless ETL jobs with AWS Glue
Learn about different ETL job properties in AWS Glue
Learn to develop serverless ETL jobs with AWS Glue with Redshift as data source
Learn to develop Python scripts and properties for serverless ETL jobs using AWS Glue
Learn about built-in ETL Transformations in AWS Glue
Learn about Triggers in AWS Glue
Learn about AWS Glue Development Endpoints
Learn to install and set up Apache Zeppelin
Learn to install Git and set up Port Forwarding
Learn to integrate AWS Glue Development Endpoint with Apache Zeppelin Notebook
Learn monitoring options available for AWS Glue
AWS Glue supports timeout values for ETL Jobs
AWS Glue supports reading from Amazon DynamoDB Tables
AWS Glue provides additional ETL Job metrics
AWS Glue supports data encryption at rest
AWS Glue supports connecting SageMaker notebooks to dev endpoints
AWS Glue supports resource based policies and permissions
AWS Glue introduces Python Shell Jobs which can be used for custom transformations and other generic tasks in ETL jobs
Download Source code AWS Glue Data Catalog Client - Hive Metastore
AWS Glue enables running Apache Spark SQL Queries
AWS Glue supports additional options for memory-intensive jobs
AWS Glue crawlers support existing Data Catalog tables as sources
AWS Glue enables continuous logging for Spark ETL Jobs
AWS Glue supports scripts compatible with Python 3.6 in Shell Jobs
AWS Glue provides workflows to orchestrate ETL workloads
AWS Glue supports running ETL Jobs on Spark 2.4.3 with Python 3
AWS Glue supports additional options for memory intensive jobs
AWS Glue supports bookmarking Parquet and ORC Files using ETL Jobs
Launch AWS Glue, EMR and Aurora Serverless Clusters in Shared VPCs
AWS Glue provides FindMatches ML Transform
AWS Glue releases binaries of Glue ETL libraries for Glue Jobs
AWS Glue provides Apache Spark UI to monitor Glue ETL Jobs
AWS Glue provides ability to rewind Spark ETL Job bookmarks
AWS Glue supports FindMatches ML Transform on Spark 2.4.3 & Glue 1.0
AWS Glue supports bringing your own JDBC driver for Spark ETL Jobs
Glue adds new transforms - Purge, Transition and Merge
Glue supports reading & writing to DocumentDB & MongoDB Collection
AWS Glue supports new tables, update schema & partitions from Jobs
AWS Glue supports serverless streaming ETL
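As a rough illustration of the "cost economics of AWS Glue" lecture listed above, the arithmetic behind a Glue ETL job bill can be sketched in a few lines of Python. This is not taken from the course. The default rate of 0.44 USD per DPU-hour and the 10-minute minimum billing duration are assumptions used for illustration only; both vary by region and Glue version, so check the current AWS pricing page before relying on them.

```python
def glue_job_cost(
    dpus: int,
    runtime_seconds: float,
    usd_per_dpu_hour: float = 0.44,   # assumed rate, for illustration only
    minimum_seconds: float = 600.0,   # assumed minimum billed duration
) -> float:
    """Estimate the cost of one Glue ETL job run in USD."""
    if dpus <= 0 or runtime_seconds < 0:
        raise ValueError("dpus must be positive and runtime non-negative")
    # Short runs are billed as if they lasted the minimum duration.
    billed_seconds = max(runtime_seconds, minimum_seconds)
    return dpus * (billed_seconds / 3600.0) * usd_per_dpu_hour


# A 15-minute run on 10 DPUs, and a 5-minute run that hits the minimum.
fifteen_minute_run = glue_job_cost(dpus=10, runtime_seconds=15 * 60)
five_minute_run = glue_job_cost(dpus=10, runtime_seconds=5 * 60)
```

Under these assumed defaults, the 5-minute run costs the same as a 10-minute run, which is why very short jobs can be proportionally more expensive than their runtime suggests.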