We may earn an affiliate commission when you visit our partners.
Morgan Willis and Rafael Lopes

In this class, Introduction to Designing Data Lakes on AWS, we will help you understand how to create and operate a data lake in a secure and scalable way, without previous knowledge of data science! Starting with the "WHY" you may want a data lake, we will look at the Data-Lake value proposition, characteristics and components.

Designing a data lake is challenging because of the scale and growth of data. Developers need to understand best practices to avoid common mistakes that could be hard to rectify. In this course we will cover the foundations of what a Data Lake is, how to ingest and organize data into the Data Lake, and dive into the data processing that can be done to optimize performance and costs when consuming the data at scale. This course is for professionals (Architects, System Administrators and DevOps) who need to design and build an architecture for secure and scalable Data Lake components. Students will learn about the use cases for a Data Lake and contrast them with a traditional infrastructure of servers and storage.

What's inside

Syllabus

Week 1
Welcome to the course! In Week 1, you'll discover why you may want a Data Lake, its characteristics and components, and how it compares to other data scenarios, such as databases and data warehouses.
Week 2
In Week 2, you'll build on your knowledge of what data lakes are and why they may be a solution for your needs. You'll explore AWS services that can be used in data lake architectures, like Amazon S3, AWS Glue, Amazon Athena, Amazon Elasticsearch Service, AWS Lake Formation, Amazon Rekognition, API Gateway and other services used for data movement, processing and visualization.
Week 3
In Week 3, you'll explore specifics of data cataloging and ingestion, and learn about services like AWS Transfer Family, Amazon Kinesis Data Streams, Kinesis Firehose, Kinesis Analytics, AWS Snow Family, AWS Glue Crawlers, and others. You'll also discover when to process data: before, after, or while it is being ingested. Given scenarios, you'll be able to identify when to process data and match the most appropriate AWS services to each scenario.
Week 4
In Week 4, you'll dive deeper into data optimization and data processing. Demos of best practices will show you how to optimize your dataset for performance and cost, just by using the right tool for the job! You will also discover data security, data visualization tools, and AWS datasets you can use to experiment and get started.
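As a rough illustration of what organizing a dataset for performance and cost can look like, the sketch below builds Hive-style partitioned object keys (`year=/month=/day=`), a common layout that query engines such as Amazon Athena can use to skip data outside the requested partitions. The course description does not specify this layout; the function names, dataset name, and file names here are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timezone
from typing import List


def partitioned_key(dataset: str, event_time: datetime, filename: str) -> str:
    """Build a Hive-style partitioned object key (assumed layout, not from the course)."""
    t = event_time.astimezone(timezone.utc)  # normalize to UTC before partitioning
    return (
        f"{dataset}/"
        f"year={t.year:04d}/month={t.month:02d}/day={t.day:02d}/"
        f"{filename}"
    )


def prune(keys: List[str], year: int, month: int) -> List[str]:
    """Keep only keys under one month's partition, mimicking partition pruning."""
    needle = f"/year={year:04d}/month={month:02d}/"
    return [k for k in keys if needle in k]


if __name__ == "__main__":
    jan = partitioned_key(
        "clickstream", datetime(2024, 1, 5, tzinfo=timezone.utc), "part-0000.parquet"
    )
    feb = partitioned_key(
        "clickstream", datetime(2024, 2, 1, tzinfo=timezone.utc), "part-0001.parquet"
    )
    # Only the January object survives the filter; February is never touched.
    print(prune([jan, feb], 2024, 1))
```

A query that filters on the partition columns only needs to read the matching prefixes, which is one way a layout choice alone can reduce the amount of data scanned.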

Good to know

Know what's good, what to watch for, and possible dealbreakers
Suitable for beginners looking to build a solid foundation in designing data lakes on AWS
Taught by experienced instructors Morgan Willis and Rafael Lopes
Covers the full lifecycle of data lake design, from ingestion to optimization and visualization
Emphasizes best practices and industry standards for data lake architecture
Provides ample hands-on practice through demos and exercises

Reviews summary

Highly rated introductory course

Learners say this course is an excellent introduction to data lakes on AWS. They praise the great instructors, practical labs, and well-organized content. Reviews are largely positive, though some mention technical issues and a lack of advanced content.
High praise for instructors.
"Excellent instructors..."
"The best aws course is taught by two excellent instructors..."
"I really enjoyed this course. The instructors are great."
Overwhelmingly positive feedback.
"Excellent course!"
"One of the best Cloud Courses out there."
"The best aws course..."
"This whole specialization has been one of the best I've taken in Coursera."
"Good course providing lots of content and concepts..."
Valuable hands-on experience.
"The labs were good for getting the message across. "
"Great course, good tutors, labs..."
"All topics are explained very well and the assignments and exercises were also awesome."
Some desire more advanced content.
"It is a good one. However, I had some problems accessing the exercises..."
"Very good class. Except the problem in Lab2 of week 3. Author should fix it!"
"If you are working with data...this is the course for you."
Occasional technical difficulties.
"Sadly the lab 2 is outdated..."
"The system didn't allow me to work on the labs for weeks 3 and 4."
"Missing Lab's during the course..."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Introduction to Designing Data Lakes on AWS with these activities:
Review AWS Data Lake Concepts
Review the foundational concepts of data lakes and AWS-specific data lake components.
  • Read Chapters 1-3 of the 'Amazon S3 Tutorial'
  • View the 'AWS Data Lakes on Cloud' YouTube Video
  • Complete the 'AWS Storage Services Quiz'
AWS Data Lake Resources Compilation
Curate and organize a list of helpful articles, tutorials, and code examples for AWS data lake architecture, design, and implementation.
  • Search for and identify relevant blog posts
  • Find and bookmark useful tutorials and documentation
  • Locate and collect code examples from GitHub or other platforms
  • Organize the compilation using a tool like Evernote or Notion
AWS Data Lake Workshop
Attend an AWS Data Lake workshop to gain expert insights, hands-on experience, and best practices for data lake design and implementation.
  • Research and identify relevant workshops
  • Register for and attend the workshop
  • Actively participate in the workshop activities and discussions
  • Apply the knowledge and skills gained to your own data lake projects
AWS Data Lake Design Scenarios
Analyze real-world design scenarios involving AWS data lakes and practice applying design principles to solve data management challenges.
  • Review the 'AWS Data Lake Architecture Blog' for use cases
  • Develop a list of potential data lake design scenarios
  • Create a solution for each scenario using AWS data lake services
  • Share your solutions with peers or online communities for feedback
AWS Data Lake Mini-Project
Build a small-scale, end-to-end AWS data lake solution to gain hands-on experience with data ingestion, processing, and analysis.
  • Define the project scope and objectives
  • Gather and prepare the data
  • Design and implement the data lake architecture
  • Process and analyze the data
  • Visualize and present the results
AWS Data Lake Architecture Proposal
Develop a comprehensive architectural proposal for an AWS data lake solution, including design, implementation, and data management strategies.
  • Identify the business requirements and data sources
  • Design the data lake architecture using AWS services
  • Develop a data ingestion and processing pipeline
  • Outline a data security and governance plan
  • Present the proposal to stakeholders for feedback and approval
AWS Data Lake Best Practices Blog Post
Write a blog post or article summarizing the best practices and lessons learned from designing and implementing AWS data lake solutions.
  • Research and gather information on AWS data lake best practices
  • Outline the key points and structure of the blog post
  • Write and edit the blog post
  • Publish the blog post on a relevant platform
  • Promote and share the blog post to reach a wider audience

Career center

Learners who complete Introduction to Designing Data Lakes on AWS will develop knowledge and skills that may be useful to these careers:
Data Lake Engineer
Data lake engineers are responsible for designing, building, and maintaining data lakes. They work closely with data architects and data scientists to ensure that data lakes meet the needs of the business. This course may be particularly helpful for aspiring data lake engineers who want to learn how to design and build data lakes on AWS, as well as how to use AWS services for data ingestion, processing, and optimization.
Data Engineer
Data engineers are responsible for building and maintaining data pipelines that move data between different systems. They work closely with data architects to design and implement data solutions that meet the needs of the business. This course may be particularly helpful for aspiring data engineers who want to learn how to design and build data lakes on AWS, as well as how to use AWS services for data ingestion, processing, and optimization.
Big Data Architect
Big data architects are responsible for designing and managing the architecture of big data systems. They work closely with data scientists and data engineers to ensure that big data systems meet the needs of the business. This course may be helpful for aspiring big data architects who want to learn how to design and build data lakes on AWS.
Data Architect
Data architects are accountable for planning, designing, creating, and managing the architecture of big data systems. They combine their deep understanding of data storage, data security, and data governance principles with knowledge of business needs. This course may assist aspiring data architects by helping them comprehend data lake architecture and its characteristics, as well as best practices for data ingestion, processing, and optimization, all of which are crucial components of big data systems.
Data Warehouse Engineer
Data warehouse engineers are responsible for designing, building, and maintaining data warehouses. They work closely with data architects and data analysts to ensure that data warehouses meet the needs of the business. This course may be helpful for aspiring data warehouse engineers who want to learn how to design and build data lakes on AWS.
Data Security Analyst
Data security analysts are responsible for protecting data from unauthorized access, use, disclosure, disruption, modification, or destruction. They work with data architects, data engineers, and data scientists to ensure that data is secure and compliant with all applicable laws and regulations. This course may be helpful for aspiring data security analysts who want to learn how to secure data lakes.
Data Governance Analyst
Data governance analysts are responsible for developing and implementing data governance policies and procedures. They work with data architects, data engineers, and data scientists to ensure that data is used in a consistent and ethical manner. This course may be helpful for aspiring data governance analysts who want to learn how to design and implement data governance policies for data lakes.
Data Privacy Analyst
Data privacy analysts are responsible for protecting the privacy of personal data. They work with data architects, data engineers, and data scientists to ensure that data is used in a compliant and ethical manner. This course may be helpful for aspiring data privacy analysts who want to learn how to protect the privacy of data in data lakes.
Cloud Architect
Cloud architects are responsible for designing and managing cloud computing systems. They work with data architects and data engineers to ensure that cloud computing systems are available, reliable, and secure. This course may be helpful for aspiring cloud architects who want to learn how to design and manage data lakes in the cloud.
Software Engineer
Software engineers are responsible for designing, developing, and maintaining software applications. They work with data architects and data engineers to ensure that software applications are available, reliable, and secure. This course may be helpful for aspiring software engineers who want to learn how to develop software applications that use data lakes.
Data Analyst
Data analysts use their knowledge of statistics and data analysis techniques to extract insights from data. They work on a variety of projects, such as identifying trends, developing reports, and creating presentations. This course may be helpful for aspiring data analysts who want to learn how to use data lakes to store and process large datasets.
Data Scientist
Data scientists use their knowledge of mathematics, statistics, and computer science to extract insights from data. They work on a variety of projects, such as developing predictive models, identifying trends, and creating visualizations. This course may be helpful for aspiring data scientists who want to learn how to use data lakes to store and process large datasets.
Systems Administrator
Systems administrators are responsible for managing and maintaining computer systems. They work with data architects and data engineers to ensure that computer systems are available, reliable, and secure. This course may be helpful for aspiring systems administrators who want to learn how to manage and maintain data lakes.
Network Administrator
Network administrators are responsible for managing and maintaining computer networks. They work with data architects and data engineers to ensure that computer networks are available, reliable, and secure. This course may be helpful for aspiring network administrators who want to learn how to manage and maintain data lakes.
Database Administrator
Database administrators are responsible for managing and maintaining databases. They work with data architects and data engineers to ensure that databases are available, reliable, and secure. This course may be helpful for aspiring database administrators who want to learn how to manage and maintain data lakes.

Reading list

We've selected six books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Introduction to Designing Data Lakes on AWS.
Provides a comprehensive guide to designing data-intensive applications, including data lakes. It covers the key challenges of data management at scale, as well as best practices for data modeling, processing, and storage.
Provides a practical guide to using Java for big data analytics, including data lakes. It covers the key APIs and libraries for data ingestion, processing, and storage, as well as best practices for developing and deploying data analytics applications.
Provides a non-technical introduction to data lakes, including their benefits and challenges. It covers the key components of a data lake, as well as best practices for data governance and security.
Data Analytics for Beginners is a beginner-friendly textbook that provides a great starting point for those with little to no background in data analytics. It may be helpful for students new to the field who wish to gain foundational knowledge before delving into the course content.
Provides an introduction to data science for business professionals. It covers the key concepts and techniques of data science, such as data mining, machine learning, and data visualization.
Machine Learning for Dummies is a beginner-friendly introduction to machine learning. It covers the key concepts and techniques of machine learning, such as supervised learning, unsupervised learning, and reinforcement learning.

Similar courses

Here are nine courses similar to Introduction to Designing Data Lakes on AWS.
Introduction to Designing Data Lakes on AWS
Implement Data Auditing with Azure Data Lake
Implement Security on Azure Data Lakes
Scale and Deploy LLMs in Production Environments
Firebase Authentication 7 and Cloud Storage
Configuring Microsoft Azure Data Infrastructure Security
Delta Lake with Azure Databricks: Deep Dive
Mastering AWS Glue, QuickSight, Athena & Redshift Spectrum
Node.js Microservices: Advanced Topics and Best Practices

© 2016 - 2024 OpenCourser