We may earn an affiliate commission when you visit our partners.
Ahmad Alkilani

This course introduces how to build robust, scalable, real-time big data systems using a variety of Apache Spark's APIs, including the Streaming, DataFrame, SQL, and DataSources APIs, integrated with Apache Kafka, HDFS and Apache Cassandra.


This course aims to get beyond the hype in the big data world and focus on what really works for building robust, highly scalable batch and real-time systems. In this course, Applying the Lambda Architecture with Spark, Kafka, and Cassandra, you'll string together technologies that fit well together. They were designed by organizations ranging from companies with some of the most demanding data requirements (such as Facebook, Twitter, and LinkedIn) to those leading the way in the design of data processing frameworks like Apache Spark, which plays an integral role throughout this course. You'll look at each component individually and examine the architectural details that make it a good fit for a system based on the Lambda Architecture. You'll then build a full application from scratch, starting with a small application that simulates the production of data in a stream and moving on to global state, non-associative calculations, application upgrades and restarts, and finally presenting real-time and batch views in Cassandra. When you're finished with this course, you'll be ready to hit the ground running with these technologies and build better data systems.
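The idea of serving both a batch view and a real-time view can be illustrated with a minimal sketch in plain Python. It is not taken from the course and uses no Spark, Kafka, or Cassandra APIs. The function names, the sample events, and the use of in-memory counters in place of Cassandra tables are all assumptions made for illustration.

```python
from collections import Counter


def build_batch_view(master_events):
    """Recompute the batch view from the full master dataset."""
    return Counter(event["page"] for event in master_events)


def update_speed_view(speed_view, event):
    """Incrementally fold one recent event into the real-time view."""
    speed_view[event["page"]] += 1


def query(batch_view, speed_view, page):
    """Merge at query time: batch result plus events seen since the batch run."""
    return batch_view.get(page, 0) + speed_view.get(page, 0)


# Hypothetical events already covered by the last batch run.
master_events = [{"page": "home"}, {"page": "cart"}, {"page": "home"}]
batch_view = build_batch_view(master_events)

# Hypothetical events that arrived after the batch run started.
speed_view = Counter()
for event in [{"page": "home"}, {"page": "checkout"}]:
    update_speed_view(speed_view, event)

home_views = query(batch_view, speed_view, "home")
checkout_views = query(batch_view, speed_view, "checkout")
print(home_views, checkout_views)
```

Page-view counts merge by simple addition, which keeps this sketch short. A calculation such as a distinct-visitor count cannot be merged that way, which is the kind of case the non-associative calculations mentioned above are concerned with.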


What's inside

Syllabus

Course Overview
A Modern Big Data Architecture
Batch Layer with Apache Spark
Speed Layer with Spark Streaming
Advanced Streaming Operations
Streaming Ingest with Kafka and Spark Streaming
Persisting with Cassandra

Good to know

Know what's good, what to watch for, and possible dealbreakers
Teaches concepts and techniques that appear throughout industry, such as Lambda Architecture, Apache Spark, and Kafka
Taught by Ahmad Alkilani, who is recognized for their expertise in big data architectures
Develops real-world production-grade skills in building big data systems
Instructs on the use of modern data technologies, such as Spark, Kafka, and Cassandra
Requires previous knowledge and experience in big data technologies, such as Spark, Kafka, and Cassandra
May require access to specialized software and tools, such as Apache Spark and Kafka


Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Applying the Lambda Architecture with Spark, Kafka, and Cassandra with these activities:
Review Spark and Kafka Concepts
Refresh your understanding of Apache Spark and Apache Kafka to ensure a smooth transition into the course materials.
  • Review Apache Spark's core concepts such as RDDs, transformations, and actions.
  • Go over the basics of Apache Kafka, including topics, partitions, and brokers.
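As a warm-up for the review steps above, here is a small sketch in plain Python that mimics two of the ideas without using Spark or Kafka themselves. The first part uses Python's lazy `map` and `filter` as a stand-in for Spark transformations, which do no work until an action asks for a result. The second part assigns keyed records to partitions by hashing the key. The partition count, the record data, and the CRC32 hash are assumptions for illustration; Kafka's default partitioner uses a murmur2 hash instead.

```python
import zlib
from collections import defaultdict

# Part 1: lazy "transformations" followed by an "action".
evaluated = []  # records which inputs were actually processed


def square(x):
    evaluated.append(x)
    return x * x


squares = map(square, range(1, 6))              # lazy, nothing runs yet
evens = filter(lambda n: n % 2 == 0, squares)   # still lazy
evaluated_before_action = list(evaluated)       # snapshot: still empty
total = sum(evens)                              # the "action" triggers the work

# Part 2: key-based partitioning of records in one topic.
NUM_PARTITIONS = 3  # hypothetical partition count


def choose_partition(key, num_partitions=NUM_PARTITIONS):
    # Stand-in hash; Kafka's default partitioner uses murmur2, not CRC32.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


topic = defaultdict(list)  # partition number -> append-only list of records
records = [("user-1", "click"), ("user-2", "view"), ("user-1", "purchase")]
for key, value in records:
    topic[choose_partition(key)].append((key, value))

user1_partition = choose_partition("user-1")
user1_log = [value for key, value in topic[user1_partition] if key == "user-1"]
print(total, user1_partition, user1_log)
```

Because records with the same key always hash to the same partition, their relative order is preserved within that partition, which is why `user1_log` keeps the click before the purchase.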
Study 'High-Performance Spark'
Gain in-depth knowledge of Spark's architecture and optimization techniques by reading this comprehensive book.
  • Read and understand the core concepts and principles of Spark's architecture.
  • Explore advanced optimization techniques to enhance the performance of Spark applications.
Design and Implement a Spark Streaming Application
Apply your skills to a practical project, designing and implementing a Spark Streaming application to process and analyze real-time data.
  • Identify a suitable real-time data source and define data processing requirements.
  • Design and implement a Spark Streaming application to ingest, process, and analyze the data.
  • Visualize and interpret the results of your analysis.
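Before starting on a real Spark Streaming project, the ingest, process, and analyze steps above can be prototyped in plain Python. The sketch below groups timestamped events into fixed intervals, loosely mirroring the micro-batch model that Spark Streaming uses, and counts words per interval. The batch interval, the event data, and the function name are hypothetical, and no Spark API is used.

```python
from collections import Counter, defaultdict

BATCH_SECONDS = 10  # hypothetical batch interval


def micro_batches(events, batch_seconds=BATCH_SECONDS):
    """Group (timestamp, word) events by the start time of their interval."""
    batches = defaultdict(list)
    for timestamp, word in events:
        batch_start = timestamp // batch_seconds * batch_seconds
        batches[batch_start].append(word)
    return dict(sorted(batches.items()))


# Hypothetical stream of (seconds since start, word) events.
events = [(1, "spark"), (4, "kafka"), (12, "spark"), (15, "spark"), (27, "cassandra")]

# Process each micro-batch independently: count the words it contains.
counts_per_batch = {
    start: Counter(words) for start, words in micro_batches(events).items()
}

for start, counts in counts_per_batch.items():
    print(start, dict(counts))
```

Each interval is processed on its own here. Carrying totals across intervals would need state that survives between batches, which is a separate design decision.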

Career center

Learners who complete Applying the Lambda Architecture with Spark, Kafka, and Cassandra will develop knowledge and skills that may be useful to these careers:
Big Data Architect
Big Data Architects are responsible for designing and developing big data systems. This course may be useful for Big Data Architects who want to learn how to use Apache Spark, Kafka, and Cassandra to build robust, scalable, real-time big data systems. The course covers a variety of topics, including the Lambda Architecture, batch processing with Apache Spark, real-time processing with Spark Streaming, and data persistence with Cassandra. This knowledge can help Big Data Architects design better big data systems and applications.
Data Architect
Data Architects are responsible for designing and developing data systems. This course may be useful for Data Architects who want to learn how to use Apache Spark, Kafka, and Cassandra to build robust, scalable, real-time data systems. The course covers a variety of topics, including the Lambda Architecture, batch processing with Apache Spark, real-time processing with Spark Streaming, and data persistence with Cassandra. This knowledge can help Data Architects design better data systems and applications.
Data Engineer
Data Engineers are responsible for designing, building, and maintaining data pipelines. This course may be useful for Data Engineers who want to learn how to build robust, scalable, real-time big data systems using Apache Spark, Kafka, and Cassandra. The course covers a variety of topics, including the Lambda Architecture, batch processing with Apache Spark, real-time processing with Spark Streaming, and data persistence with Cassandra. This knowledge can help Data Engineers build better data pipelines and applications.
Big Data Engineer
Big Data Engineers are responsible for building and maintaining big data systems. This course may be useful for Big Data Engineers who want to learn how to use Apache Spark, Kafka, and Cassandra to build robust, scalable, real-time big data systems. The course covers a variety of topics, including the Lambda Architecture, batch processing with Apache Spark, real-time processing with Spark Streaming, and data persistence with Cassandra. This knowledge can help Big Data Engineers build better big data systems and applications.
Systems Architect
Systems Architects are responsible for designing and developing systems. This course may be useful for Systems Architects who want to learn how to use Apache Spark, Kafka, and Cassandra to build robust, scalable, real-time systems. The course covers a variety of topics, including the Lambda Architecture, batch processing with Apache Spark, real-time processing with Spark Streaming, and data persistence with Cassandra. This knowledge can help Systems Architects design better systems and applications.
Software Engineer
Software Engineers are responsible for designing, developing, and maintaining software systems. This course may be useful for Software Engineers who want to learn how to build robust, scalable, real-time big data systems using Apache Spark, Kafka, and Cassandra. The course covers a variety of topics, including the Lambda Architecture, batch processing with Apache Spark, real-time processing with Spark Streaming, and data persistence with Cassandra. This knowledge can help Software Engineers build better data systems and applications.
Database Administrator
Database Administrators are responsible for managing and maintaining databases. This course may be useful for Database Administrators who want to learn how to use Apache Spark, Kafka, and Cassandra to build robust, scalable, real-time database systems. The course covers a variety of topics, including the Lambda Architecture, batch processing with Apache Spark, real-time processing with Spark Streaming, and data persistence with Cassandra. This knowledge can help Database Administrators build better database systems and applications.
Enterprise Architect
Enterprise Architects are responsible for designing and developing enterprise-wide systems. This course may be useful for Enterprise Architects who want to learn how to use Apache Spark, Kafka, and Cassandra to build robust, scalable, real-time enterprise-wide systems. The course covers a variety of topics, including the Lambda Architecture, batch processing with Apache Spark, real-time processing with Spark Streaming, and data persistence with Cassandra. This knowledge can help Enterprise Architects design better enterprise-wide systems and applications.
Technical Architect
Technical Architects are responsible for designing and developing technical systems. This course may be useful for Technical Architects who want to learn how to use Apache Spark, Kafka, and Cassandra to build robust, scalable, real-time technical systems. The course covers a variety of topics, including the Lambda Architecture, batch processing with Apache Spark, real-time processing with Spark Streaming, and data persistence with Cassandra. This knowledge can help Technical Architects design better technical systems and applications.
Solution Architect
Solution Architects are responsible for designing and developing solutions to business problems. This course may be useful for Solution Architects who want to learn how to use Apache Spark, Kafka, and Cassandra to build robust, scalable, real-time solutions to business problems. The course covers a variety of topics, including the Lambda Architecture, batch processing with Apache Spark, real-time processing with Spark Streaming, and data persistence with Cassandra. This knowledge can help Solution Architects design better solutions to business problems and applications.
Cloud Architect
Cloud Architects are responsible for designing and developing cloud-based systems. This course may be useful for Cloud Architects who want to learn how to use Apache Spark, Kafka, and Cassandra to build robust, scalable, real-time cloud-based systems. The course covers a variety of topics, including the Lambda Architecture, batch processing with Apache Spark, real-time processing with Spark Streaming, and data persistence with Cassandra. This knowledge can help Cloud Architects design better cloud-based systems and applications.
Software Architect
Software Architects are responsible for designing and developing software systems. This course may be useful for Software Architects who want to learn how to use Apache Spark, Kafka, and Cassandra to build robust, scalable, real-time software systems. The course covers a variety of topics, including the Lambda Architecture, batch processing with Apache Spark, real-time processing with Spark Streaming, and data persistence with Cassandra. This knowledge can help Software Architects design better software systems and applications.
Data Analyst
Data Analysts are responsible for analyzing data to identify trends and patterns. This course may be useful for Data Analysts who want to learn how to use Apache Spark, Kafka, and Cassandra to build real-time data analytics applications. The course covers a variety of topics, including the Lambda Architecture, batch processing with Apache Spark, real-time processing with Spark Streaming, and data persistence with Cassandra. This knowledge can help Data Analysts build better data analytics applications and make better decisions based on data.
Machine Learning Engineer
Machine Learning Engineers are responsible for designing, developing, and deploying machine learning models. This course may be useful for Machine Learning Engineers who want to learn how to use Apache Spark, Kafka, and Cassandra to build real-time machine learning applications. The course covers a variety of topics, including the Lambda Architecture, batch processing with Apache Spark, real-time processing with Spark Streaming, and data persistence with Cassandra. This knowledge can help Machine Learning Engineers build better machine learning applications and make better decisions based on data.
Data Scientist
Data Scientists are responsible for using data to solve business problems. This course may be useful for Data Scientists who want to learn how to use Apache Spark, Kafka, and Cassandra to build real-time data science applications. The course covers a variety of topics, including the Lambda Architecture, batch processing with Apache Spark, real-time processing with Spark Streaming, and data persistence with Cassandra. This knowledge can help Data Scientists build better data science applications and make better decisions based on data.

Reading list

We've selected 11 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Applying the Lambda Architecture with Spark, Kafka, and Cassandra.
Provides a comprehensive overview of Apache Kafka, covering both the basics and advanced topics. It is a valuable resource for anyone who wants to learn more about Kafka or use it in their own projects.
Provides a comprehensive overview of Apache Spark, covering both the basics and advanced topics. It is a valuable resource for anyone who wants to learn more about Spark or use it in their own projects.
Provides a comprehensive overview of real-time data analytics with Java. It covers the basics of real-time data analytics, as well as how to implement it using Apache Spark, Apache Kafka, and Apache Cassandra.
Provides a comprehensive overview of big data analytics with Java. It covers the basics of big data analytics, as well as how to implement it using Apache Spark, Apache Kafka, and Apache Cassandra.
Provides a comprehensive overview of deep learning with Spark. It covers the basics of deep learning, as well as how to implement it using Apache Spark.
Provides a comprehensive overview of natural language processing with Spark. It covers the basics of natural language processing, as well as how to implement it using Apache Spark.
Provides a comprehensive overview of computer vision with Spark. It covers the basics of computer vision, as well as how to implement it using Apache Spark.
Provides a comprehensive overview of distributed systems. It covers the basics of distributed systems, as well as how to implement them using various tools and technologies.
Provides a deep dive into the internals of Spark and how to optimize it for performance. It is a valuable resource for anyone who wants to get the most out of Spark.


Similar courses

Here are nine courses similar to Applying the Lambda Architecture with Spark, Kafka, and Cassandra.
Processing Streaming Data Using Apache Spark Structured...
Storing and Managing Data with Redis and Apache Kafka on...
Apache Kafka - An Introduction
Kafka Connect Fundamentals
Kafka Fundamentals
Streaming API Development and Documentation
Kafka Integration with Storm, Spark, Flume, and Security
Migrating an application and data from Apache Cassandra™...
Apache Kafka Series - Learn Apache Kafka for Beginners v3


© 2016 - 2024 OpenCourser