Descripstats package adds more descriptive statistics to the default describe of Pandas
For numeric data, the describe() function of the Python Pandas library provides a very convenient way to generate a summary table of descriptive statistics. However, the result's index only includes count, mean, std, min, and max, as well as the lower, 50th, and upper percentiles. By default, the lower percentile is 25, the upper percentile is 75, and the 50th percentile is the same as the median.
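To make the default behaviour concrete, here is a minimal sketch on a small made-up DataFrame (the data and the column names x and y are assumptions for illustration only). It prints the rows that describe() returns by default and shows that the percentiles can be changed through the percentiles parameter.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration.
df = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [2.0, 4.0, 4.0, 6.0, 10.0]})

# Default summary for numeric columns.
summary = df.describe()
print(list(summary.index))
# ['count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max']

# The 50% row is the median of each column.
print(summary.loc["50%", "x"] == df["x"].median())

# Request the 10th and 90th percentiles instead of the default 25th and 75th.
custom = df.describe(percentiles=[0.1, 0.9])
print(list(custom.index))
```

The statistics themselves (count, mean, std, min, max) are fixed. Only the percentile rows can be adjusted this way.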
In most cases, such as when writing a scientific or data analysis report or a journal paper, we need more statistics than these default ones, such as mean absolute deviation (MAD), variance, standard error of the mean (SEM), kurtosis, etc. Pandas also provides methods to calculate them, but we have to write a code snippet to add them to the summary table of the describe() function.
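As a rough idea of what such a snippet might look like, the sketch below appends a few extra rows to the describe() table using plain Pandas. The helper name describe_extended, the row labels, and the toy data are all hypothetical. This is not the package's own implementation, just one way to do it by hand.

```python
import pandas as pd


def describe_extended(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: describe() plus a few extra statistics as rows."""
    num = df.select_dtypes(include="number")
    base = num.describe()
    extra = pd.DataFrame(
        {
            # Mean absolute deviation around the mean, computed by hand
            # because DataFrame.mad() was removed in pandas 2.0.
            "mad": (num - num.mean()).abs().mean(),
            "var": num.var(),    # sample variance (ddof=1)
            "sem": num.sem(),    # standard error of the mean
            "kurt": num.kurt(),  # excess kurtosis
        }
    ).T  # transpose so each statistic becomes a row, like in describe()
    return pd.concat([base, extra])


# Hypothetical toy data, used only for illustration.
df = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [2.0, 4.0, 4.0, 6.0, 10.0]})
table = describe_extended(df)
print(table)
```

The extra rows are built as a separate DataFrame and concatenated, so the original describe() output stays unchanged at the top of the table.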
To address this, Dr. Shouke Wei from Deepsim Intelligence Inc. (Deepsim) created a Python package that easily generates the summary statistics table and expands the indices of Pandas describe(). For convenience, I made it into a PyPI package named descripstats, so you can easily install and use it.
If you are interested in how to use this package with a concrete real-world dataset, please visit the post.