Note (2024): All code has been updated to the latest Flink version.
Apache Flink is widely seen as the successor to Hadoop and Spark: a next-generation big data engine built for stream processing. If Hadoop is 2G and Spark is 3G, then Apache Flink is the 4G of big data stream processing frameworks. Spark was not originally a true stream processing framework; it approximated streaming by processing data in small batches. Apache Flink, by contrast, is a true streaming engine that can also perform batch, graph, and table processing and run machine learning algorithms.
Apache Flink is one of the newest big data technologies and is rapidly gaining momentum in the market. Many expect that, just as Apache Spark displaced Hadoop, Flink may displace Spark in the near future.
Demand for Flink is already growing. Many large companies across multiple industries already use Apache Flink to process their real-time big data, and thousands of others are starting to adopt it.
What's included in the course?
Complete Apache Flink concepts explained from scratch through to real-time implementation.
Every Apache Flink concept is explained with hands-on Flink code.
Covers even those concepts that are not explained clearly in the official Flink documentation.
To help non-Java developers, all Flink Java code is explained line by line, in such a way that even a non-technical person can follow it.
The Flink code and datasets used in the lectures are attached to the course for your convenience.
All code has been updated to the latest Flink version.
Implement 3 real-time case studies using Flink.
This is the pilot lecture to get you familiar with Flink. The video explains what Apache Flink is and what functionality it provides.
This lecture explains the difference between stream processing and batch processing.
A lecture on the difference between Hadoop and the streaming technologies, i.e. Spark and Flink. It also explains the similarities between Spark and Flink.
What is the difference between Spark and Flink? How is Flink better than Spark?
This video explains the architecture of Apache Flink and the different APIs Flink provides for batch, stream, graph, and table processing. It covers the full Apache Flink ecosystem.
Learn Apache Flink's programming model. You will see how a Flink program fits into its architecture.
Install Flink on your local system.
This lecture gives a line-by-line explanation of a program that counts names starting with N, while explaining the map, flatMap, and filter operations, various data source functions, groupBy(), sum(), etc.
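As a rough preview of the filter, group, and count steps that lecture walks through, here is a minimal sketch in plain Java streams. It is not the course's Flink program and does not use the Flink API; the class name, method name, and sample names are made up for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class NameCountSketch {
    // Counts how often each name starting with "N" occurs (hypothetical helper).
    public static Map<String, Long> countNamesStartingWithN(List<String> names) {
        return names.stream()
                .filter(name -> name.startsWith("N"))      // keep only names starting with N
                .collect(Collectors.groupingBy(            // group identical names together
                        name -> name,
                        TreeMap::new,                      // sorted map, for stable printing
                        Collectors.counting()));           // count the members of each group
    }

    public static void main(String[] args) {
        List<String> names = List.of("Noman", "Ana", "Nipun", "Noman", "Raj");
        System.out.println(countNamesStartingWithN(names)); // prints {Nipun=1, Noman=2}
    }
}
```

In a Flink program the same three steps would be expressed with Flink's own operators on a distributed data set rather than on an in-memory list.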
This video shows how to perform an inner join using Flink's join operation.
In this video you will see how to perform a left outer join, a right outer join, and a full outer join using Flink.
Join hints are a distinctive feature of Flink. By passing enumeration constants, we can tell Flink which join strategy to use. Flink provides 6 join hints.
The DataStream API offers various types of data sources and data sinks. In this lecture we look at the source and sink methods and learn what type of data they read and in what manner.
This is the pilot lecture for Apache Flink's DataStream API programs. The first program is a basic one, i.e. a word count of names starting with N. The code shows the similarities and differences between DataSet and DataStream API Flink programs.
The reduce method is applied to keyed streams. It aggregates all the elements of a key.
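To make the idea of a per-key reduce concrete, the following plain-Java sketch keeps one running value per key and records the updated value after every element. It only imitates the behavior on an in-memory list; it is not Flink code, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

public class KeyedReduceSketch {
    // For each input element, combines its value with the running value stored
    // for the same key and records the result (a "rolling" reduce per key).
    public static List<Integer> rollingReduce(List<Map.Entry<String, Integer>> input,
                                              BinaryOperator<Integer> reducer) {
        Map<String, Integer> statePerKey = new HashMap<>();
        List<Integer> emitted = new ArrayList<>();
        for (Map.Entry<String, Integer> element : input) {
            // merge() stores the value for a new key, or applies the reducer otherwise
            Integer updated = statePerKey.merge(element.getKey(), element.getValue(), reducer);
            emitted.add(updated);
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> input = List.of(
                Map.entry("a", 1), Map.entry("b", 5), Map.entry("a", 2), Map.entry("a", 3));
        System.out.println(rollingReduce(input, Integer::sum)); // prints [1, 5, 3, 6]
    }
}
```

Note that input and output share one type here, which is the restriction that distinguishes reduce from fold.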
The fold operation of Apache Flink is similar to the reduce operation; the difference is that, unlike the ReduceFunction interface, the fold interface can take different input and output type parameters.
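The difference in type parameters can be seen in a small plain-Java sketch: the accumulator has type R while the inputs have type T. This is a generic illustration of folding, not Flink's own interface, and the names are assumptions made for the example.

```java
import java.util.List;
import java.util.function.BiFunction;

public class FoldSketch {
    // Folds inputs of type T into an accumulator of a possibly different type R,
    // starting from an initial value.
    public static <T, R> R fold(R initial, List<T> inputs, BiFunction<R, T, R> folder) {
        R accumulator = initial;
        for (T input : inputs) {
            accumulator = folder.apply(accumulator, input);
        }
        return accumulator;
    }

    public static void main(String[] args) {
        // Integer inputs are folded into a String result.
        String joined = fold("start", List.of(1, 2, 3), (acc, n) -> acc + "-" + n);
        System.out.println(joined); // prints start-1-2-3
    }
}
```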
Apache Flink provides general aggregation operations such as min(), minBy(), max(), maxBy(), and sum().
The split operator of Apache Flink's DataStream API is used to split an incoming stream of data into 2 streams. It uses a select method to select data from the SplitStream.
The iterate operator iterates over the data stream repeatedly until it reaches the desired output.
This is the introductory lecture to the section on windows. Windowing is a crucial concept in Apache Flink. Throughout the section you will learn the various types of built-in windows provided by Flink and how to code them in a program.
There are 2 methods for defining windows on a stream in Apache Flink.
window()
windowAll()
Use windowAll() for non-keyed streams and window() for keyed streams.
There are various notions of time for windows in Apache Flink: processing time, event time, and ingestion time.
A tumbling window is a fixed-size, non-overlapping, time-based window. It can be created using the processing time and event time notions. This video shows how to implement tumbling windows in a Flink program.
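One way to picture tumbling windows is as fixed, back-to-back time buckets: every timestamp falls into exactly one bucket. The plain-Java sketch below computes that bucket for a timestamp, assuming windows aligned to the epoch with no offset. It is not the Flink API, and the names are hypothetical.

```java
public class TumblingWindowSketch {
    // Start (inclusive) of the tumbling window containing the timestamp,
    // assuming windows are aligned to time 0 with no offset.
    public static long windowStart(long timestampMillis, long windowSizeMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, windowSizeMillis);
    }

    // End (exclusive) of the same window.
    public static long windowEnd(long timestampMillis, long windowSizeMillis) {
        return windowStart(timestampMillis, windowSizeMillis) + windowSizeMillis;
    }

    public static void main(String[] args) {
        long size = 5_000; // 5-second windows
        System.out.println(windowStart(12_500, size)); // prints 10000
        System.out.println(windowEnd(12_500, size));   // prints 15000
    }
}
```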
A sliding window is a time-based window of fixed size that advances by a slide interval, so consecutive windows can overlap. It can be created using the processing time and event time notions. This video shows how to implement sliding windows in a Flink program.
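Because sliding windows overlap, a single element can belong to several windows at once. The plain-Java sketch below lists the start times of every window that contains a given timestamp, assuming windows aligned to time 0 with no offset. It illustrates the idea only and is not Flink code; the names are made up.

```java
import java.util.ArrayList;
import java.util.List;

public class SlidingWindowSketch {
    // Start times of all sliding windows [start, start + size) that contain
    // the timestamp, newest window first.
    public static List<Long> windowStarts(long timestampMillis, long sizeMillis, long slideMillis) {
        List<Long> starts = new ArrayList<>();
        long lastStart = timestampMillis - Math.floorMod(timestampMillis, slideMillis);
        for (long start = lastStart; start > timestampMillis - sizeMillis; start -= slideMillis) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        // 10-second windows sliding every 5 seconds: each element is in 2 windows.
        System.out.println(windowStarts(12_000, 10_000, 5_000)); // prints [10000, 5000]
    }
}
```

When the slide equals the size, each timestamp maps to exactly one window, which is the tumbling case.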
This video explains how to implement session windows in an Apache Flink program.
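Session windows have no fixed size; they are closed by a period of inactivity. The plain-Java sketch below groups sorted timestamps into sessions, starting a new session whenever the pause since the previous element exceeds a gap. This is a simplified illustration, not the Flink API: the names are hypothetical, and the exact treatment of a pause equal to the gap is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class SessionWindowSketch {
    // Groups ascending timestamps into sessions. A new session starts when the
    // pause since the previous element is longer than gapMillis (assumption:
    // a pause exactly equal to the gap stays in the same session).
    public static List<List<Long>> sessions(List<Long> sortedTimestamps, long gapMillis) {
        List<List<Long>> result = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        for (long ts : sortedTimestamps) {
            if (!current.isEmpty() && ts - current.get(current.size() - 1) > gapMillis) {
                result.add(current);          // close the session that just ended
                current = new ArrayList<>();  // and open a new one
            }
            current.add(ts);
        }
        if (!current.isEmpty()) {
            result.add(current);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Long> events = List.of(1_000L, 2_000L, 9_000L, 9_500L);
        System.out.println(sessions(events, 3_000)); // prints [[1000, 2000], [9000, 9500]]
    }
}
```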
This video explains how to implement global windows in an Apache Flink program.
Every window has a trigger attached, which tells the window when to start processing. Apache Flink provides a few built-in triggers, but we can also create our own by overriding a few methods of the Trigger class.
Evictors are components that allow us to keep only selected elements in a window.
What is a watermark in Apache Flink?
This lecture explains how to create watermarks for a window in Flink, using the assignTimestampsAndWatermarks method.
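A common way to generate watermarks is to assume events arrive at most a fixed amount of time out of order, and to let the watermark trail the largest timestamp seen so far by that amount. The plain-Java sketch below shows that bookkeeping. It is not Flink's API: the class, its methods, and the sample values are assumptions made for illustration.

```java
public class WatermarkSketch {
    private final long maxOutOfOrdernessMillis;
    private long maxTimestampSeen;

    public WatermarkSketch(long maxOutOfOrdernessMillis) {
        this.maxOutOfOrdernessMillis = maxOutOfOrdernessMillis;
        // Start low enough that the first watermark does not overflow.
        this.maxTimestampSeen = Long.MIN_VALUE + maxOutOfOrdernessMillis + 1;
    }

    // Called for every event: remember the largest event timestamp so far.
    public void onEvent(long eventTimestampMillis) {
        maxTimestampSeen = Math.max(maxTimestampSeen, eventTimestampMillis);
    }

    // The watermark asserts: no more events with a timestamp <= this value are expected.
    public long currentWatermark() {
        return maxTimestampSeen - maxOutOfOrdernessMillis - 1;
    }

    // An event at or below the current watermark arrived too late.
    public boolean isLate(long eventTimestampMillis) {
        return eventTimestampMillis <= currentWatermark();
    }

    public static void main(String[] args) {
        WatermarkSketch watermarks = new WatermarkSketch(2_000); // tolerate 2 s of disorder
        watermarks.onEvent(1_000);
        watermarks.onEvent(4_000);
        System.out.println(watermarks.currentWatermark()); // prints 1999
        System.out.println(watermarks.isLate(2_500));      // prints false
        System.out.println(watermarks.isLate(1_500));      // prints true
    }
}
```

A larger tolerance admits more out-of-order events but delays window results, since an event-time window can only fire once the watermark passes its end.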
Flink provides fault tolerance for its applications. This means that after a node failure, the application can be restored to a consistent state and continue from where it left off.
Flink provides fault tolerance using state and checkpointing. This is the first lecture on the topic, and it explains what state is in Flink.
Flink does not checkpoint by simply stopping the job after a fixed time or a fixed amount of data; its checkpointing is based on the Asynchronous Barrier Snapshotting algorithm.
Incremental checkpointing is a feature of Apache Flink that was introduced in Flink 1.3. It can give better performance than conventional full checkpointing.
State can be categorized into 2 types:
Operator state - managed operator state and raw operator state
Keyed state - managed keyed state and raw keyed state
What is Value State in Flink and how to implement it in a Flink program.
What is List State in Flink and how to implement it in a Flink program.
What is Reducing State in Flink and how to implement it in a Flink program.
Managed operator state in Flink and how to code it in a Flink program.
This lecture teaches you how to perform checkpointing in a Flink program. It also covers the various restart strategies provided by Flink.
This lecture shows how to implement broadcast state in a Flink program.
Queryable state is still a beta feature of Apache Flink and is evolving. If we set our managed keyed state as queryable, non-Flink programs can access that state.
Live Twitter data can be used to generate insights in real time. Twitter provides data through APIs, which we can access using security tokens. This lecture deals with how to ingest Twitter data into Apache Flink.
This lecture shows how to integrate Apache Kafka with Apache Flink.
A real-time use case of Twitter analysis in the healthcare domain: using Apache Flink, a healthcare company wants to find out how many users are posting tweets about pollution, and from which devices.
Stock Real-Time Data Processing using Flink
Flink has 2 relational APIs for table processing: the Table API and the SQL API.
This lecture explains how to create and register a table in Flink using its relational APIs.
An example showing how to write queries in Flink using the Table and SQL APIs.
A graph consists of a set of vertices and a set of edges that connect them.
In this video you will learn how to do graph processing using the Gelly API of Apache Flink. In the use case explained in the lecture, we find the friends of friends of a person.