We may earn an affiliate commission when you visit our partners.
Rafael Lopes and Morgan Willis

Designing a data lake is challenging because of the scale and growth of data. Developers need to understand best practices to avoid common mistakes that can be hard to rectify later. This course covers the foundations of what a data lake is, how to ingest and organize data in it, and the data processing that can optimize performance and cost when consuming data at scale. It is intended for professionals (architects, system administrators, and DevOps engineers) who need to design and build an architecture for secure and scalable data lake components. Students will learn about the use cases for a data lake and contrast them with a traditional infrastructure of servers and storage.

What's inside

Learning objectives

  • Where to start with a data lake?
  • How to build a secure and scalable data lake?
  • What are the common components of a data lake?
  • Why do you need a data lake, and what is its value?

Syllabus

Week 1: Hello World, I mean, Hello Data Lakes!
Video: Meet the Instructors
Video: Introduction to Week 1
Video: Why Data Lakes?

Traffic lights

Read about what's good, what should give you pause, and possible dealbreakers:
Examines data lake foundations, ingestion, and data processing necessary to optimize performance and costs
Taught by Morgan Willis and Rafael Lopes, instructors with experience in data lake design and architecture
Introduces best practices for avoiding common mistakes in data lake design
Provides hands-on experience through labs, such as ingesting web logs and creating an end-to-end data lake with AWS services
Covers various AWS data services, including S3, Glue Data Catalog, Kinesis, EMR, Glue Jobs, and Lake Formation
Suitable for professionals, including architects, system administrators, and DevOps engineers, who need to design and build data lake components

Reviews summary

Practical AWS data lake design

According to students, this course provides a solid foundation for understanding data lakes on AWS, particularly for professionals. Learners commend the comprehensive coverage of key AWS services, including S3, Glue, Athena, and Lake Formation. They highlight the hands-on labs and demonstrations as instrumental in solidifying theoretical concepts and showing their practical application. While the course offers a clear and well-structured introduction to building secure and scalable data lake architectures, some learners note that its introductory nature means it prioritizes breadth over deep dives into individual services or complex optimization techniques. Prior familiarity with AWS helps learners get the most out of it.
Provides a clear and solid understanding of data lake fundamentals.
"This course provided me with an incredibly solid foundation for understanding what a data lake is and its components."
"The comparison to data warehouses was a great starting point for clarifying data storage concepts."
"The instructors explain complex topics very well, and the sequence of topics builds knowledge effectively."
Covers a wide array of relevant AWS services for Data Lakes.
"The coverage of various AWS services like S3, Glue, Athena, and LakeFormation was comprehensive for a beginner."
"It demystified Data Lakes and showed how to build a secure and scalable architecture using AWS tools."
"I gained a great overview of the main AWS services required for a data lake implementation."
Offers excellent hands-on experience through practical labs and demos.
"The hands-on labs were especially valuable, bringing the concepts to life and helping me apply what I learned."
"The demos were very helpful in seeing how various AWS services connect and work together for data processing."
"I found that the practical exercises helped me solidify my understanding of building and managing a data lake."
Some prior AWS experience is beneficial for optimal learning.
"A strong starting point. I'd recommend having some prior AWS experience, as it doesn't spend much time on foundational AWS concepts."
"While introductory, I found that having a basic understanding of core AWS services beforehand improved my learning experience significantly."
"I would recommend some basic AWS knowledge coming into this course to fully grasp the nuances of the services presented."
Offers breadth over depth, suitable as an introduction.
"Good overall introduction. The breadth of AWS services covered is impressive, but it felt a bit rushed at times, not diving deep enough."
"It's an 'introduction' as promised, so it doesn't go very deep; I felt I needed to do extra reading for practical implementation."
"The course felt more like a service catalog walkthrough than a truly in-depth architectural design guide."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Introduction to Designing Data Lakes on AWS with these activities:
Review SQL for Data Analytics
Refresh your knowledge of SQL to enhance your ability to query and analyze data stored in the Data Lake.
  • Review online tutorials on SQL for data analytics
  • Practice writing SQL queries using an online SQL editor
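As one way to practice the kind of query this activity has in mind, here is a minimal sketch that aggregates web-log rows by HTTP status code. It assumes an in-memory SQLite database as a stand-in for whatever SQL engine you practice with; the `web_logs` table, its columns, and the sample rows are hypothetical and not taken from the course.

```python
import sqlite3

# Hypothetical web-log rows: (path, status, bytes_sent). Illustrative data only.
ROWS = [
    ("/index.html", 200, 512),
    ("/index.html", 200, 512),
    ("/missing", 404, 128),
    ("/api/items", 500, 64),
    ("/api/items", 200, 2048),
]


def status_summary(rows):
    """Return (status, requests, total_bytes) per status code, most frequent first."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumed schema for the exercise; a real log table would have more columns.
        conn.execute(
            "CREATE TABLE web_logs (path TEXT, status INTEGER, bytes_sent INTEGER)"
        )
        conn.executemany("INSERT INTO web_logs VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            """
            SELECT status, COUNT(*) AS requests, SUM(bytes_sent) AS total_bytes
            FROM web_logs
            GROUP BY status
            ORDER BY requests DESC, status ASC
            """
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in status_summary(ROWS):
        print(row)
```

The same GROUP BY pattern should carry over to other SQL engines, although data types and function names can differ between dialects.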
Revisit concepts of data modeling
Refresh your understanding of data modeling to improve your ability to design and implement an effective Data Lake schema.
  • Review your notes from a previous data modeling course or tutorial
  • Read articles or blog posts about data modeling best practices
Organize and review course materials
Organize and review course materials to enhance your understanding and retention of key concepts.
  • Create a dedicated folder or notebook for course materials
  • Organize materials by topic or module
  • Regularly review and summarize key concepts
Attend a virtual study group
Join a virtual study group to connect with classmates, discuss course concepts, and enhance your learning experience.
  • Find a virtual study group or create your own
  • Meet regularly with your study group to review course material, work on assignments, and prepare for exams
Practice data ingestion techniques
Practice with different data ingestion tools and techniques to improve your understanding of how data is brought into a Data Lake.
  • Review the different data ingestion methods supported by AWS
  • Experiment with ingesting data using Amazon Kinesis services
  • Explore the use of AWS Transfer Family for batch data ingestion
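
For the Kinesis step above, a minimal sketch of preparing events for streaming ingestion might look like the following. It only builds and batches record entries and makes no AWS calls. The event fields (`client_ip`, `path`) and the helper names are hypothetical; the entry shape (`Data` plus `PartitionKey`) and the 500-record batch size follow the Kinesis Data Streams PutRecords request format.

```python
import json

# PutRecords accepts at most 500 records per request.
MAX_RECORDS_PER_REQUEST = 500


def to_entry(event, key_field="client_ip"):
    """Convert one event dict into a PutRecords-style entry.

    key_field is a hypothetical choice of partition key; pick a field
    that spreads records evenly across shards.
    """
    payload = json.dumps(event, sort_keys=True) + "\n"  # newline-delimited JSON
    return {
        "Data": payload.encode("utf-8"),
        "PartitionKey": str(event[key_field]),
    }


def batch_entries(events, batch_size=MAX_RECORDS_PER_REQUEST):
    """Split events into lists of entries, each small enough for one request."""
    entries = [to_entry(event) for event in events]
    return [entries[i:i + batch_size] for i in range(0, len(entries), batch_size)]


if __name__ == "__main__":
    sample = [{"client_ip": f"10.0.0.{n % 250}", "path": "/"} for n in range(1200)]
    print([len(batch) for batch in batch_entries(sample)])
```

Each batch could then be passed as the `Records` argument of the boto3 Kinesis client's `put_records` call, together with a `StreamName`. That step is left out here because it needs AWS credentials and an existing stream.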

Career center

Learners who complete Introduction to Designing Data Lakes on AWS will develop knowledge and skills that may be useful to these careers:
Data Engineer
Data Engineers design, build, and maintain data pipelines and systems. They may also work on data quality and data governance. A background in data lakes is essential for Data Engineers, as they are often responsible for managing and processing data in data lakes. This course can help aspiring Data Engineers build a solid foundation in designing and managing data lakes on AWS.
Data Warehouse Engineer
Data Warehouse Engineers design, build, and maintain data warehouses. They may also work on data lakes. They ensure that data is stored and processed efficiently, and that data is accessible to users. This course may be useful for aspiring Data Warehouse Engineers as it covers the basics of designing and managing data lakes, as well as data processing and analytics.
Data Analyst
Data Analysts collect, clean, and analyze data to identify trends and patterns. They may use data lakes to store and process large amounts of data. This course may be useful for aspiring Data Analysts as it covers the basics of designing and managing data lakes, as well as data processing and analytics.
Data Governance Analyst
Data Governance Analysts develop and implement data governance policies and procedures. They may also work on data lakes to ensure that data is managed in a consistent and compliant manner. This course may be useful for aspiring Data Governance Analysts as it covers the basics of designing and managing data lakes, as well as data security and compliance.
Data Scientist
Data Scientists use data to solve business problems. They may use data lakes to store and process large amounts of data. This course may be useful for aspiring Data Scientists as it covers the basics of designing and managing data lakes, as well as data processing and analytics.
Security Analyst
Security Analysts identify and mitigate security risks. They may work on data lakes to ensure that data is secure and compliant with regulations. This course may be useful for aspiring Security Analysts as it covers the basics of designing and managing data lakes, as well as data security and compliance.
Business Analyst
Business Analysts use data to solve business problems. They may use data lakes to store and process large amounts of data. This course may be useful for aspiring Business Analysts as it covers the basics of designing and managing data lakes, as well as data processing and analytics.
Product Manager
Product Managers oversee the development and launch of new products. They may work on data lakes to ensure that data is flowing smoothly between different systems and that products are meeting customer needs. This course may be useful for aspiring Product Managers as it covers the basics of designing and managing data lakes, as well as data processing and analytics.
Project Manager
Project Managers oversee the development and execution of projects. They may work on data lakes to ensure that data is flowing smoothly between different systems and that projects are completed on time and within budget. This course may be useful for aspiring Project Managers as it covers the basics of designing and managing data lakes, as well as data processing and analytics.
Cloud Architect
Cloud Architects design and manage cloud computing systems. This may include designing and managing data lakes. They ensure that cloud systems are running smoothly and that data is secure and accessible. This course may be useful for aspiring Cloud Architects as it covers the fundamentals of designing and managing data lakes on AWS.
DevOps Engineer
DevOps Engineers bridge the gap between development and operations teams. They may work on data lakes to ensure that data is flowing smoothly between different systems. This course may be useful for aspiring DevOps Engineers as it covers the basics of designing and managing data lakes, as well as data processing and analytics.
Systems Engineer
Systems Engineers design, build, and maintain computer systems. This may include designing and managing data lakes. They ensure that systems are running smoothly and that data is secure and accessible. This course may be useful for aspiring Systems Engineers as it covers the fundamentals of designing and managing data lakes, as well as data security and compliance.
Software Engineer
Software Engineers design, build, and maintain software applications. This may include designing and managing data lakes. They ensure that applications are running smoothly and that data is secure and accessible. This course may be useful for aspiring Software Engineers as it covers the fundamentals of designing and managing data lakes, as well as data security and compliance.
Database Administrator
Database Administrators manage and maintain databases. This may include managing data lakes. They ensure that databases are running smoothly and that data is secure and accessible. This course may be useful for aspiring Database Administrators as it covers the fundamentals of designing and managing data lakes, as well as data security and compliance.
Data Architect
Data Architects create and manage the architecture of data systems. This may include designing and managing data lakes. They plan and design data management systems, ensuring that data is accessible, secure, and compliant with regulations. This course may be useful for aspiring Data Architects as it covers the fundamentals of designing and building data lakes, including data ingestion, organization, and processing.

Reading list

We've selected six books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Introduction to Designing Data Lakes on AWS.
Provides a comprehensive guide to designing and building data-intensive applications, including discussions on data lakes. Useful for understanding the broader context of data lake design and implementation.
Provides a comprehensive guide to designing and implementing a data mesh architecture, which is an alternative approach to data lake architectures. Useful for understanding the pros and cons of different data lake architectures.
Provides a comprehensive guide to data governance, a critical aspect of data lake management. Useful for data governance professionals and stakeholders involved in ensuring the quality and integrity of data lakes.
Provides a comprehensive overview of data lakes, including their benefits, challenges, and best practices. Useful for beginners who need a foundational understanding of data lakes.
Provides a practical guide to big data analytics, including discussions on data lakes. Useful for beginners who need a general understanding of big data analytics and its applications.
Provides a practical guide to using MapReduce for data-intensive text processing, a common task in data lake environments. Useful for data engineers and analysts involved in processing and analyzing large amounts of text data.

© 2016 - 2025 OpenCourser