Pandas Bootcamp 2022 : Data Science with Python Training

Description

Pandas is a game-changer for data science and analytics. It provides fast, flexible, and expressive data structures that make working with relational data easy. Locus IT has a decade of industry experience in Pandas consulting, staffing, and training services.

Objectives of Pandas Training:

1. Bring your Data Handling and Data Analysis skills to an outstanding level.

2. Learn and practice all relevant Pandas methods and workflows with Real-World Datasets

3. Learn Pandas based on NEW Version 1.x (the days of versions 0.x are over)

4. Import, clean, and merge messy Data and prepare Data for Machine Learning

5. Master a complete Machine Learning Project A-Z with Pandas, Scikit-Learn, and Seaborn

6. Analyze, visualize, and understand your Data with Pandas, Matplotlib, and Seaborn

7. Practice and master your Pandas skills with Quizzes, 150+ Exercises, and Comprehensive Projects

8. Import Financial/Stock Data from Web Sources and analyze them with Pandas

9. Learn and master the most important Pandas workflows for Finance

10. Learn how to best transition from Versions 0.x to the new Version 1.x

11. Learn the Basics of Pandas and Numpy Coding (Appendix)

12. Learn and master important Statistical Concepts with Scipy

Overview

a). Installation of Anaconda

b). Opening a Jupyter Notebook

c). How to use Jupyter Notebooks

d). How to tackle Pandas Version 1.0

Part 1: Pandas from Zero to Hero (Building Blocks)

Introduction to Tabular Data / Pandas.

a). Pandas Basics (DataFrame Basics 1)

Create your very first Pandas DataFrame (from CSV)
Pandas Display Options and the methods head() & tail()
First Data Inspection
Built-in Functions, Attributes and Methods with Pandas
Make it easy: TAB Completion and Tooltip
Selecting Columns
Selecting one Column with the “dot notation”
Zero-based Indexing and Negative Indexing
Selecting Rows with iloc (position-based indexing)
Slicing Rows and Columns with iloc (position-based indexing)
Selecting Rows with loc (label-based indexing)
Slicing Rows and Columns with loc (label-based indexing)
Indexing and Slicing with reindex()
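
As a rough illustration of the iloc and loc topics listed above, the sketch below uses a small invented DataFrame. The column names and row labels are hypothetical and are not taken from the course datasets.

```python
import pandas as pd

# Hypothetical data, invented for this illustration only.
df = pd.DataFrame(
    {"age": [22, 38, 26, 35], "fare": [7.25, 71.28, 7.92, 53.10]},
    index=["anna", "ben", "cara", "dan"],
)

# iloc is position-based: the end of a slice is excluded.
first_two = df.iloc[0:2]   # rows at positions 0 and 1
last_row = df.iloc[-1]     # negative indexing selects the last row

# loc is label-based: the end of a slice is included.
anna_to_cara = df.loc["anna":"cara"]   # three rows: anna, ben, cara
ben_fare = df.loc["ben", "fare"]       # a single value
```

The main point to take away is the slicing difference: iloc excludes the end position, while loc includes the end label.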

b). Pandas Series and Index Objects

Intro
First Steps with Pandas Series
Analyzing Numerical Series with unique(), nunique() and value_counts()
Analyzing non-numerical Series with unique(), nunique() and value_counts()
Creating Pandas Series
Indexing and Slicing Pandas Series
Sorting of Series and Introduction to the inplace parameter
nlargest() and nsmallest()
idxmin() and idxmax()
Manipulating Pandas Series
First Steps with Pandas Index Objects
Creating Index Objects from Scratch
Changing Row Index with set_index() and reset_index()
Changing Column Labels
Renaming Index & Column Labels with rename()
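
A minimal sketch of some of the Series methods named in this list, assuming two small invented Series (the values and labels are made up for illustration):

```python
import pandas as pd

# Invented example data.
colors = pd.Series(["red", "blue", "red", "green", "red", "blue"])

distinct = colors.unique()      # distinct values, in order of appearance
n_distinct = colors.nunique()   # number of distinct values
counts = colors.value_counts()  # frequency of each value, most frequent first

nums = pd.Series([10, 40, 20, 30], index=["a", "b", "c", "d"])

top2 = nums.nlargest(2)         # the two largest values with their labels
label_of_max = nums.idxmax()    # index label of the largest value
label_of_min = nums.idxmin()    # index label of the smallest value
```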

c). DataFrame Basics

Intro
Filtering DataFrames by one Condition
Filtering DataFrames by many Conditions (AND)
Filtering DataFrames by many Conditions (OR)
Advanced Filtering with between(), isin() and ~
any() and all()
Removing Columns
Removing Rows
Adding new Columns to a DataFrame
Creating Columns based on other Columns
Adding Columns with insert()
Adding new Rows (hands-on approach)
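
The filtering topics above can be sketched roughly as follows. The DataFrame and its column names are hypothetical, chosen only to keep the example short.

```python
import pandas as pd

# Invented example data.
df = pd.DataFrame({
    "name": ["anna", "ben", "cara", "dan"],
    "age": [22, 38, 26, 35],
    "city": ["Rome", "Oslo", "Rome", "Lima"],
})

# AND: both conditions must hold; each condition needs its own parentheses.
older_in_rome = df[(df["age"] > 25) & (df["city"] == "Rome")]

# OR: at least one condition must hold.
young_or_lima = df[(df["age"] < 25) | (df["city"] == "Lima")]

# between() is inclusive on both ends by default.
mid_age = df[df["age"].between(26, 35)]

# isin() tests membership; ~ negates a boolean mask.
not_rome_or_oslo = df[~df["city"].isin(["Rome", "Oslo"])]
```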

d). Manipulating Elements in a DataFrame / Slice

Intro
View vs. Copy
Simple Rules about what to do and when.
Manipulating DataFrames / Slices

e). Visualization with Matplotlib

Intro
The plot() method
Customization of Plots
Histograms
Barcharts and Piecharts
Scatterplots

Part 2: Full Data workflow A – Z

a). Importing Data

Importing CSV files with pd.read_csv()
Importing messy CSV files with pd.read_csv()
Importing messy Data from Excel with pd.read_excel()
Importing Data from the Web with pd.read_html()
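
As an illustration of importing a messy CSV file, the sketch below reads invented file content from an in-memory buffer. The content, the separator, and the "missing" placeholder are assumptions made for this example; a real file would normally be read by passing its path.

```python
import io

import pandas as pd

# Invented file content: one junk line on top, semicolons as separators,
# and the word "missing" used as a placeholder for absent values.
messy_csv = io.StringIO(
    "exported by some tool\n"
    "Name;Age;City\n"
    "anna;22;Rome\n"
    "ben;missing;Oslo\n"
)

df = pd.read_csv(
    messy_csv,
    sep=";",                 # column separator used in the file
    skiprows=1,              # skip the junk line before the header
    na_values=["missing"],   # treat this placeholder as a missing value
)
```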

b). Cleaning Data

String Operations
Changing the Datatype of Columns with astype()
Intro NA values / missing values
Detection of missing Values
Removing missing values
Replacing missing values
Intro Duplicates
Detection of Duplicates
Handling / Removing Duplicates
The ignore_index parameter (NEW in Pandas 1.0)
Detection of Outliers
Handling / Removing Outliers
Categorical Data
Pandas Version 1.0: New dtypes and pd.NA
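
A minimal sketch of a few of the cleaning steps listed above, assuming a small invented table. Filling with the column mean is only one of several possible replacement strategies.

```python
import numpy as np
import pandas as pd

# Invented messy data: stray whitespace, numbers stored as text,
# one duplicated row and one missing value.
raw = pd.DataFrame({
    "name": [" anna", "ben ", "ben ", "cara"],
    "age": ["22", "38", "38", "26"],
    "score": [1.5, 2.0, 2.0, np.nan],
})

clean = raw.copy()
clean["name"] = clean["name"].str.strip().str.title()  # string operations
clean["age"] = clean["age"].astype(int)                 # text to integers

n_missing = clean["score"].isna().sum()                 # detect missing values

# Drop exact duplicate rows; ignore_index=True renumbers the rows 0..n-1.
deduped = clean.drop_duplicates(ignore_index=True)

# Replace the remaining missing score with the column mean.
filled = deduped["score"].fillna(deduped["score"].mean())
```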

c). Merging, Joining and Concatenating Data

Intro
Adding Rows with append() and pd.concat
Arithmetic with Pandas Objects / Data Alignment
Inner Joins with merge()
Outer Joins (without Intersection) with merge()
Left Joins (without Intersection) with merge()
Right Joins (without Intersection) with merge()
Left Joins with merge()
Right Joins with merge()
Joining on different Column Names / Indexes
Joining on more than one Column
pd.merge() and join()
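
The join types above can be sketched with two small invented tables. Using the indicator parameter to build the "without intersection" variants is one possible approach and is an assumption here, not necessarily how the course does it.

```python
import pandas as pd

# Invented example tables sharing the key column "id".
left = pd.DataFrame({"id": [1, 2, 3], "name": ["anna", "ben", "cara"]})
right = pd.DataFrame({"id": [2, 3, 4], "city": ["Oslo", "Rome", "Lima"]})

# Inner join: only keys present in both tables.
inner = left.merge(right, on="id", how="inner")

# Outer join: all keys; indicator=True adds a "_merge" column that says
# whether a row came from the left table, the right table, or both.
outer = left.merge(right, on="id", how="outer", indicator=True)

# Outer join without intersection: rows found in only one table.
outer_excl = outer[outer["_merge"] != "both"]

# Left join without intersection: rows found only in the left table.
left_only = outer[outer["_merge"] == "left_only"]

# Adding rows: stack two tables on top of each other.
stacked = pd.concat([left, left], ignore_index=True)
```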

d). GroupBy Operations

Understanding the GroupBy Object
Splitting with many Keys
split-apply-combine explained
split-apply-combine applied
Advanced aggregation with agg()
GroupBy Aggregation with Relabeling (NEW – Pandas Version 0.25)
Transformation with transform()
Replacing NA Values with group-specific Values
Generalizing split-apply-combine with apply()
Hierarchical Indexing with Groupby
stack() and unstack()
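
A rough sketch of split-apply-combine, aggregation with relabeling, and group-specific NA replacement. The data and the output column names are invented for illustration.

```python
import numpy as np
import pandas as pd

# Invented example data with missing values.
df = pd.DataFrame({
    "team": ["a", "a", "b", "b", "b"],
    "points": [10.0, np.nan, 4.0, 8.0, np.nan],
})

# Aggregation with relabeling: new_column=(source_column, function).
summary = df.groupby("team").agg(
    mean_points=("points", "mean"),
    n_rows=("points", "size"),
)

# transform() returns a result aligned with the original rows,
# so each row gets the mean of its own group.
group_mean = df.groupby("team")["points"].transform("mean")

# Replace NA values with the group-specific mean.
df["points_filled"] = df["points"].fillna(group_mean)
```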

e). Reshaping and pivoting DataFrames

Transposing Rows and Columns
Pivoting DataFrames with pivot()
Limits of pivot()
pivot_table()
pd.crosstab()
melting DataFrames with melt()
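
The reshaping methods above can be illustrated with a small invented table in long format (the years, country codes, and medal counts are made up):

```python
import pandas as pd

# Invented data in long format: one row per (year, country) pair.
long_df = pd.DataFrame({
    "year": [2020, 2020, 2021, 2021],
    "country": ["NO", "IT", "NO", "IT"],
    "medals": [5, 7, 6, 9],
})

# pivot(): long to wide. It requires unique (index, columns) pairs.
wide = long_df.pivot(index="year", columns="country", values="medals")

# pivot_table(): like pivot(), but aggregates when pairs are not unique.
totals = long_df.pivot_table(index="country", values="medals", aggfunc="sum")

# melt(): wide back to long.
back_to_long = wide.reset_index().melt(
    id_vars="year", var_name="country", value_name="medals"
)
```

The uniqueness requirement of pivot() is its main limit: with repeated (index, columns) pairs it raises an error, which is where pivot_table() comes in.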

Part 3: Comprehensive Project Challenges

a). Explanatory Data Analysis Challenges

Merging and Concatenating
Data Cleaning 1
Impact of GDP, Population and Politics
Statistical Analysis and Hypothesis Testing
Aggregating and Ranking
Summer Games vs. Winter Games – Does location matter?

Part 4: Pandas for Finance, Investing & Time series

a). Time Series Basics

Converting strings to datetime objects with pd.to_datetime()
Initial Analysis / Visualization of Time Series
Indexing and Slicing Time Series
Creating a customized DatetimeIndex with pd.date_range()
More on pd.date_range()
Downsampling Time Series with resample() (Part 1)
Downsampling Time Series with resample() (Part 2)
The PeriodIndex object
Advanced Indexing with reindex()
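
A minimal sketch of the time series basics listed above, using an invented daily series. The dates and the weekly frequency are chosen only for illustration.

```python
import pandas as pd

# Convert date strings to datetime objects.
dates = pd.to_datetime(["2022-01-03", "2022-01-04", "2022-01-05"])

# Build a daily DatetimeIndex: 10 days starting on Monday 2022-01-03.
idx = pd.date_range(start="2022-01-03", periods=10, freq="D")
ts = pd.Series(range(10), index=idx)

# Slicing by date labels includes both end points.
first_three_days = ts.loc["2022-01-03":"2022-01-05"]

# Downsampling: daily values to weekly means (weeks end on Sunday).
weekly = ts.resample("W").mean()
```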

b). Pandas for Finance and Investing

Getting Ready (Installing required package)
Importing Stock Price Data from Yahoo Finance (it still works!)
Initial Inspection and Visualization
Normalizing Time Series to a Base Value (100)
The shift() method
The methods diff() and pct_change()
Financial Time Series – Return and Risk
Financial Time Series – Covariance and Correlation
Helpful DatetimeIndex Attributes and Methods
Filling NA Values with bfill, ffill and interpolation
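
As a hedged illustration of the finance workflows above, the sketch below uses invented closing prices instead of real market data, so no download package is needed:

```python
import pandas as pd

# Invented closing prices, not real market data.
close = pd.Series(
    [50.0, 55.0, 44.0, 66.0],
    index=pd.date_range("2022-01-03", periods=4, freq="D"),
)

# Normalize to a base value of 100 at the first observation.
normalized = close / close.iloc[0] * 100

previous = close.shift(1)      # previous day's price (first value is NaN)
change = close.diff()          # absolute change from the previous day
returns = close.pct_change()   # relative change from the previous day
```

Here pct_change() gives the same result as close / close.shift(1) - 1, which is why shift() is usually introduced first.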

Part 5: Machine Learning with Pandas and Scikit-Learn

a). Introduction to Regression and Classification

Machine Learning – an Overview
Linear Regression with scikit-learn – a simple Introduction
Making Predictions with Linear Regression
Overfitting
Underfitting

b). What’s new in Pandas Version 1.0?

Intro and Overview
How to update Pandas to Version 1.0
Downloads for this Section
Important Recap: Pandas Display Options (Changed in Version 0.25)
info() method – new and extended output
NEW Extension dtypes (“nullable” dtypes): Why do we need them?
Creating the NEW extension dtypes with convert_dtypes()
NEW pd.NA value for missing values
The NEW “nullable” Int64Dtype
The NEW StringDtype
The NEW “nullable” BooleanDtype
Addition of the ignore_index parameter
Removal of prior Version Deprecations
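
A short sketch of why the "nullable" extension dtypes are useful, assuming an invented two-column table. It needs Pandas 1.0 or later.

```python
import pandas as pd

# Invented data with missing entries.
df = pd.DataFrame({
    "n": [1, None, 3],
    "flag": [True, None, False],
})

# Without extension dtypes, the missing entry forces the integer column
# to float64 and the boolean column to the generic object dtype.
old_n_dtype = df["n"].dtype

# convert_dtypes() picks the nullable extension dtypes instead.
converted = df.convert_dtypes()
new_n_dtype = converted["n"].dtype        # nullable integer
new_flag_dtype = converted["flag"].dtype  # nullable boolean

# Missing entries are now represented by pd.NA.
missing_entry = converted.loc[1, "n"]
```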

c). The NumPy Package

Introduction to Numpy Arrays
Numpy Arrays: Vectorization
Numpy Arrays: Indexing and Slicing
Numpy Arrays: Shape and Dimensions
Numpy Arrays: Indexing and Slicing of multi-dimensional Arrays
Numpy Arrays: Boolean Indexing
Generating Random Numbers
Performance Issues
Case Study: Numpy vs. Python Standard Library
Summary Statistics
Visualization and (Linear) Regression
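
A minimal sketch of several of the NumPy array topics above. The array values and the random seed are arbitrary choices made for this example.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])

# Vectorization: the operation applies to every element without a loop.
doubled = a * 2

# Boolean indexing: keep only the elements where the mask is True.
evens = a[a % 2 == 0]

# Shape and dimensions: view the same six values as 2 rows by 3 columns.
m = a.reshape(2, 3)

# Multi-dimensional slicing: all rows, column at position 1.
second_col = m[:, 1]

# Random numbers from a seeded generator, so runs are reproducible.
rng = np.random.default_rng(seed=0)
sample = rng.normal(size=1000)

# Summary statistics.
sample_mean = sample.mean()
```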

Requirements:

a). A desktop computer (Windows, Mac, or Linux) capable of storing and running Anaconda. The course will walk you through installing the necessary free software.

b). An internet connection capable of streaming videos.

c). Ideally some Spreadsheet Basics/Programming Basics (not mandatory, the course guides you through the basics)

For more information on the Complete Pandas Bootcamp 2022: Data Science with Python training/staffing, contact the L&D Specialist at Locus IT.

Locus Academy has more than a decade of experience in delivering Pandas training and staffing to corporate clients across the globe. Participants can apply what they learn directly in their ongoing projects. Pandas training/staffing in Bangalore is offered by Locus IT as 100% hands-on practical classes led by industry experts, with real-time projects.

Training Mode: Online