25 Tricks for Pandas
Check out this video (and Jupyter notebook) which outlines a number of Pandas tricks for working with and manipulating data, covering topics such as string manipulations, splitting and filtering DataFrames, combining and aggregating data, and more.
Last week, Kevin Markham (@justmarkham) of DataSchool.io posted a handy video and a companion Jupyter notebook titled "My top 25 pandas tricks." I found the collection useful enough to warrant sharing with our readers.
True to its name, the video outlines a number of Pandas tricks for working with and manipulating data, covering topics such as string manipulations, splitting and filtering DataFrames, combining and aggregating data, and more. Aside from the promised 25 tricks, a bonus 26th covering Pandas DataFrame profiling is included.
The tricks are well explained in the video, practical, and ready to use right away. Their implementations, along with sample datasets, can be studied further in the accompanying notebook.
If you aren't aware, Kevin is a data science educator and the founder of Data School, specializing in Python and machine learning. Data School is a website featuring blog posts, videos, courses, Jupyter notebooks, and webcast recordings, with a mix of free and paid content.
The Pandas DataFrame tricks from the video are:
- Show installed versions
- Create an example DataFrame
- Rename columns
- Reverse row order
- Reverse column order
- Select columns by data type
- Convert strings to numbers
- Reduce DataFrame size
- Build a DataFrame from multiple files (row-wise)
- Build a DataFrame from multiple files (column-wise)
- Create a DataFrame from the clipboard
- Split a DataFrame into two random subsets
- Filter a DataFrame by multiple categories
- Filter a DataFrame by largest categories
- Handle missing values
- Split a string into multiple columns
- Expand a Series of lists into a DataFrame
- Aggregate by multiple functions
- Combine the output of an aggregation with a DataFrame
- Select a slice of rows and columns
- Reshape a MultiIndexed Series
- Create a pivot table
- Convert continuous data into categorical data
- Change display options
- Style a DataFrame
- Bonus trick: Profile a DataFrame
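To give a flavor of what a few of these tricks look like in practice, here is a minimal sketch. It is not Kevin's code from the video or notebook: the sample DataFrame, its column names, and the variable names are made up for illustration, and the notebook may well approach each trick differently.

```python
import pandas as pd

# Hypothetical sample data (an assumption, not taken from the video or notebook)
df = pd.DataFrame({
    "name": ["Ada Lovelace", "Alan Turing", "Grace Hopper", "Linus Torvalds"],
    "price": ["1.5", "2.0", "-", "4.25"],  # numbers stored as strings, one invalid entry
    "qty": [3, 1, 2, 5],
    "group": ["a", "b", "a", "b"],
})

# Reverse row order, then rebuild the index so it starts at 0 again
reversed_rows = df.loc[::-1].reset_index(drop=True)

# Reverse column order
reversed_cols = df.loc[:, ::-1]

# Select columns by data type (only "qty" is numeric at this point)
numeric_cols = df.select_dtypes(include="number")

# Convert strings to numbers; entries that cannot be parsed become NaN
df["price"] = pd.to_numeric(df["price"], errors="coerce")

# Split a string column into multiple columns
df[["first", "last"]] = df["name"].str.split(" ", expand=True)

# Split a DataFrame into two random, non-overlapping subsets
train = df.sample(frac=0.75, random_state=0)
test = df.drop(train.index)

# Aggregate by multiple functions
summary = df.groupby("group")["qty"].agg(["sum", "count"])
```

Passing `random_state` to `sample` is optional; it is used here only so that the split is reproducible.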
Check out the Jupyter notebook for a more in-depth look at the Pandas tricks that Kevin lays out in the video. Also be sure to check out Data School for plenty of other useful data science learning content.
Related:
- How to select rows and columns in Pandas using [ ], .loc, .iloc, .at and .iat
- 7 Steps to Mastering Data Preparation for Machine Learning with Python — 2019 Edition
- 10 Simple Hacks to Speed up Your Data Analysis in Python