Top 10 Pandas Functions You Should Know

Level up your skills in processing data using Pandas

Yong Cui
11 min read · Mar 6, 2025

As an addiction scientist, I collect behavioral data from community participants enrolled in smoking cessation trials. In these clinical trials, we collect multiple measurements at different points throughout the study timeline. To identify the underlying factors that contribute to participants' smoking behavior and intervention outcomes, I typically use Pandas to process the datasets. In this article, I'd like to review the top 10 Pandas functions, along with their related sister functions, that I use most often. I hope this quick summary helps you get comfortable with Pandas if you're about to use it for your own data processing needs.

1. read_csv, read_excel, read_sql

The first step in processing data is to read the dataset from an external source. The most common formats are CSV or tab-delimited files and Excel spreadsheets. Typically, when you use the read_csv and read_excel functions, you simply specify the file path. When needed, you can also specify the delimiter (sep), the number of rows to skip (skiprows), and, for read_excel, the sheet name (sheet_name).
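To make those options concrete, here is a minimal sketch of reading a tab-delimited file while skipping two leading comment lines. The data, the column names, and the Excel file name in the comment are made up for illustration, and an in-memory string stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical tab-delimited export with two comment lines above the header.
raw = (
    "# exported from a hypothetical study database\n"
    "# values are illustrative only\n"
    "participant_id\tvisit\tcigarettes_per_day\n"
    "P001\t1\t20\n"
    "P001\t2\t12\n"
    "P002\t1\t15\n"
)

# read_csv accepts a file path or a file-like object; StringIO stands in for a file.
# sep sets the delimiter and skiprows drops the two comment lines.
df = pd.read_csv(io.StringIO(raw), sep="\t", skiprows=2)

print(df.head())
print(df.dtypes)

# The spreadsheet counterpart would look similar (hypothetical file and sheet names;
# read_excel needs an Excel engine such as openpyxl to be installed):
# df_xlsx = pd.read_excel("visits.xlsx", sheet_name="baseline", skiprows=2)
```

With a real file, you would pass the path as the first argument instead of the StringIO object.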

Pandas also allows you to read data directly from a database, such as SQLite or Microsoft SQL Server. Typically, you first create a connection to the database using ODBC or SQLAlchemy (the preferred way for…
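As a rough illustration of reading from a database, the sketch below runs a query against an in-memory SQLite database and loads the result into a DataFrame. To stay self-contained it uses Python's built-in sqlite3 connection rather than ODBC or SQLAlchemy, and the table name, columns, and rows are assumptions made up for the example.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for a real study database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits ("
    "participant_id TEXT, visit INTEGER, cigarettes_per_day INTEGER)"
)
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [("P001", 1, 20), ("P001", 2, 12), ("P002", 1, 15)],
)
conn.commit()

# read_sql runs the query and returns the result set as a DataFrame.
# params fills the "?" placeholder, which avoids building SQL by string formatting.
query = "SELECT * FROM visits WHERE visit = ? ORDER BY participant_id"
visits = pd.read_sql(query, conn, params=(1,))
conn.close()

print(visits)
```

For other databases, you would swap the sqlite3 connection for one created through your driver of choice; the read_sql call itself stays much the same.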

Written by Yong Cui

Work at the nexus of biomedicine, data science & mobile dev. Author of Python How-to by Manning (https://www.manning.com/books/python-how-to).