Mastering Apache Flink | Integration with Hadoop | Yarn | Kafka

Duration: Hours

Training Mode: Online


Apache Flink is a distributed processing engine and a scalable data analytics framework. We can use Flink to process data streams at large scale and to deliver real-time analytical insights from streaming applications.

1). Demystify Scala :

a). Introduction to Scala

b). Setup, Installation, and configuration of Scala

c). Develop and execute Scala Programs

d). Scala operators and features

e). Different Functions, procedures, and Anonymous functions

f). Deep dive into Scala APIs

g). Collections (Arrays, Maps, Lists, Tuples) and loops

h). Advanced operations – Pattern matching

i). Eclipse IDE with Scala

2). Object Oriented and Functional Programming :

a). Object oriented programming

b). OOP concepts

c). Constructor, getter, setter, singleton, overloading and overriding

d). Type Inference, Implicit Parameters, Closures

e). Lists, Maps and Map Operations

f). Nested Classes, Visibility Rules

g). Functional Structures

h). Functional programming constructs

3). Introduction :

a). Learn what Apache Flink is and why to use it

b). Understand Features of Apache Flink

c). Architecture and Flink design principles

d). Role of the master process – JobManager

e). Role of the worker process – TaskManager

f). Workers, Slots and Resources

g). Overview of Apache Flink APIs

h). Understand the differences between Apache Flink and Apache Spark

4). Master Flink Stack :

a). Distributed Streaming Dataflow at Runtime with Flink

b). APIs

c). Libraries

d). Data Flow in Apache Flink

e). Fault tolerance

5). Setup and Installation of single-node Flink :

a). Setup environment and pre-requisites

b). Installation and configuration of Flink on a single node

c). Troubleshooting problems encountered during setup

6). Setup and Installation of multi-node Flink cluster and Cloud :

a). Setup environment on Cloud

b). Install pre-requisites on all nodes

c). Deploy on cluster and Cloud

d). Play with Flink in cluster mode

7). Master Data Stream API for Unbounded Streams :

a). Introduction to Flink DataStream API

b). Different DataStream Transformations in Flink

c). Various Data Sources – File based, Socket based, Collection based, Custom

d). Responsibility of Data Sink

e). Iterations in DataStream APIs

f). DataStream Execution Parameters – Fault tolerance, Controlling Latency

8). Learn Flink DataSet APIs for Static Data :

a). Overview of DataSet APIs in Flink

b). Various DataSet Transformations in Flink

c). Different Data Sources – File based, Collection based, Generic

d). Responsibility of Data Sink in Flink DataSet APIs

e). Iteration Operators in DataSet APIs

f). Operating on Data Objects in Functions – Object Reuse Disabled/Enabled

9). Play with Flink Table APIs and SQL Beta :

a). Registering Tables in Flink

b). Table Access and various Table API operators in Flink

c). SQL on batch tables and Streaming Tables

d). Writing Flink Tables to external sinks

10). Apache Flink Libraries :

a). Overview of Flink Libraries

b). Flink CEP – Complex Event Processing library

c). Apache Flink Machine Learning library

d). Apache Flink Gelly – Graph processing API and library

11). Flink Integration with other Big Data tools :

a). Integrate Flink with Hadoop

b). Process existing HDFS data with Flink

c). YARN and Flink integration

d). Flink Data Streaming with Kafka

e). Consume data in real time from Kafka

12). Programming in Flink :

a). Parallel Data Flow in Flink

b). Develop complex Streaming applications in Flink

c). Handle Batch processing in Flink using DataSet APIs

d). Troubleshooting and Debugging Flink Programs

e). Best Practices of development in Flink

f). Real-time Apache Flink project

For more information on Apache Flink, contact the L&D Specialist at Locus IT.

Locus Academy has more than a decade of experience delivering Apache Flink training to corporates across the globe. Participants in the Apache Flink training are extremely satisfied and are able to apply what they learn in their ongoing projects.


