Table of contents
1: INTRODUCTION
1.1 Apache Pig - Overview
What is Apache Pig?
Why Do We Need Apache Pig?
Features of Pig
Apache Pig Vs MapReduce
Apache Pig Vs SQL
Apache Pig Vs Hive
Applications of Apache Pig
Apache Pig - History
1.2 Apache Pig - Architecture
Apache Pig - Components
Pig Latin - Data Model
2: ENVIRONMENT
2.1 Apache Pig - Installation
Prerequisites
Download Apache Pig
Install Apache Pig
Configure Apache Pig
2.2 Apache Pig - Execution
Apache Pig - Execution Modes
Apache Pig - Execution Mechanisms
Invoking the Grunt Shell
Executing Apache Pig in Batch Mode
2.3 Grunt Shell
Shell Commands
Utility Commands
3: PIG LATIN
3.1 Pig Latin - Basics
Pig Latin - Data Model
Pig Latin - Statements
Pig Latin - Data Types
Null Values
Pig Latin - Arithmetic Operators
Pig Latin - Comparison Operators
Pig Latin - Type Construction Operators
Pig Latin - Relational Operations
4: LOAD AND STORE OPERATORS
4.1 Apache Pig - Reading Data
Preparing HDFS
The Load Operator
Storing Data
5: DIAGNOSTIC OPERATORS
5.1 Diagnostic Operators
Dump Operator
Describe Operator
Explain Operator
Illustrate Command
6: GROUPING AND JOINING
6.1 Group Operator
Grouping by Multiple Columns
Group All
Cogroup Operator
Grouping Two Relations using Cogroup
Join Operator
Inner Join
Self-Join
Outer Join
Using Multiple Keys
Cross Operator
7: COMBINING AND SPLITTING
7.1 Union Operator
Split Operator
8: FILTERING
8.1 Filter Operator
Distinct Operator
Foreach Operator
9: SORTING
9.1 Order By
Limit Operator
10: PIG LATIN BUILT-IN FUNCTIONS
10.1 Eval Functions
AVG
Max
Min
Count
COUNT_STAR
Sum
DIFF
SUBTRACT
IsEmpty
Pluck Tuple
Size
BagToString
Concat
Tokenize
Load and Store Functions
PigStorage
TextLoader
BinStorage
Handling Compression
Bag and Tuple Functions
TOBAG
TOP
TOTUPLE
TOMAP
10.2 String Functions
STARTSWITH
ENDSWITH
SUBSTRING
EqualsIgnoreCase
INDEXOF
LAST_INDEX_OF
LCFIRST
UCFIRST
UPPER
LOWER
REPLACE
STRSPLIT
STRSPLITTOBAG
Trim
LTRIM
RTRIM
10.3 Date-Time Functions
To Date
Get Day
Get Hour
Get Minute
Get Second
Get Millisecond
Get Year
Get Month
Get Week
Get Week Year
Current Time
To String
Days Between
Hours Between
Minutes Between
Seconds Between
Milliseconds Between
Years Between
Months Between
Weeks Between
Add Duration
Subtract Duration
10.4 Math Functions
ABS
ACOS
ASIN
ATAN
CBRT
CEIL
COS
COSH
EXP
FLOOR
LOG
LOG10
RANDOM
ROUND
SIN
SINH
SQRT
TAN
TANH
11: OTHER MODES OF EXECUTION
11.1 User-Defined Functions
Types of UDFs in Java
Writing UDFs using Java
Using the UDF
Running Scripts
Comments in Pig Script
Executing Pig Script in Batch Mode
Executing a Pig Script from HDFS
To conclude, this course provides a comprehensive understanding of Apache Pig and equips you with essential skills for processing and analyzing large data sets. By mastering Pig Latin and its operators, you will be able to work with big data more efficiently and effectively.