Description
PySpark is the Python API for Apache Spark. It enables you to perform real-time, large-scale data processing in a distributed environment using Python. PySpark lets you work with Resilient Distributed Datasets (RDDs) in Apache Spark from the Python programming language.
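To give a feel for how working with an RDD-style dataset looks, here is a minimal sketch in plain Python. It is an analogy only, not the PySpark API: the `LazyDataset` class and its methods are hypothetical names invented for illustration, and everything runs on a single machine. It shows the general idea that transformations (`map`, `filter`) are only recorded, while an action (`collect`, `reduce`) triggers the actual computation.

```python
import functools


class LazyDataset:
    """Toy, single-machine stand-in for an RDD (hypothetical, not a Spark class).

    Transformations are recorded as steps; nothing is computed
    until an action such as collect() or reduce() is called.
    """

    def __init__(self, source, steps=()):
        self._source = source        # any re-iterable, e.g. a list or range
        self._steps = tuple(steps)   # recorded (kind, function) pairs

    def map(self, fn):
        # Transformation: returns a new dataset, computes nothing yet.
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, predicate):
        # Transformation: returns a new dataset, computes nothing yet.
        return LazyDataset(self._source, self._steps + (("filter", predicate),))

    def _iterate(self):
        # Chain the recorded steps into one lazy iterator pipeline.
        items = iter(self._source)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return items

    def collect(self):
        # Action: runs the pipeline and returns all results as a list.
        return list(self._iterate())

    def reduce(self, fn):
        # Action: runs the pipeline and folds the results into one value.
        return functools.reduce(fn, self._iterate())


numbers = LazyDataset(range(1, 11))
even_squares = numbers.map(lambda x: x * x).filter(lambda x: x % 2 == 0)

result = even_squares.collect()                  # squares of 1..10 that are even
total = even_squares.reduce(lambda a, b: a + b)  # their sum
print(result, total)
```

In real PySpark the same pattern applies across a cluster, with the dataset split into partitions; the course sections on Transformations & Actions and RDD Operations cover the actual API.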
Course Content
1-Operating System
- Introduction to Operating System
- Important Unix Commands
2-Python
- Main constructs of any programming language: Sequence, Condition, Loop
- Working with Python packages: types of packages
- Importing and installing Packages
- Searching for Python packages
- IDE Familiarity – Spyder/PyCharm/Jupyter Notebook
- Python Operators including bitwise operators
- Variables & Types
- Conditional statements – If else
- Loops
- Working with strings and arrays
- Functions
- Data Libraries (NumPy, Pandas)
3-RDBMS
- Database Architecture
- Data modelling with the PySpark Python API
- Relational Database concepts
- Database design and schema
- DDL – Create, Alter, Drop Databases
- DML – Load and Query Data
4-Data warehousing
- Overview of Data Warehousing
- Concepts and architecture of Data Warehouses
5-Big Data Concepts
- Introduction to Big Data
- Distributed computing and Hadoop Architecture
6-Storage
- Storing data on Hadoop – HDFS
7-PySpark
- Spark Architecture
- Spark Session
- Spark Language APIs
- DataFrames and Partitions
- Transformations & Actions
- Structured APIs (PySpark Python API)
- Schemas
- Spark Types
- Structured API Execution
- Operations on DataFrames
- Working with Different Data Types
- Aggregations in Spark
- Joins in Spark
- RDD and RDD Operations, DAG
8-PySpark Streaming
- PySpark Streaming introduction
- Structured Streaming
Locus Academy has more than a decade of experience delivering training and staffing on the PySpark Python API for corporate clients across the globe. Participants are extremely satisfied and are able to apply what they learn in their ongoing projects.