In this post, we'll look at four statistics functions in Python. These functions are part of the Python Standard Library, in the statistics module, and all four are common in statistics:

  • mean - average value
  • median - middle value
  • mode - most frequent value
  • standard deviation - spread of values

To access Python's statistics functions, we need to import the functions from the statistics module using the statement:

from statistics import mean, median, mode, stdev

After the import statement, the functions mean(), median(), mode() and stdev() (standard deviation) can be used. Since the statistics module is part of the Python Standard Library, no external packages need to be installed.

Let's imagine we have a data set of 5 test scores. The test scores are 60, 83, 83, 91 and 100. These test scores can be stored in a Python list. Python lists are defined with square brackets [ ]. Elements in Python lists are separated with commas.

In [1]:
from statistics import mean, median, mode, stdev

test_scores = [60, 83, 83, 91, 100]

Calculate the mean

To calculate the mean, or average, of our test scores, use the statistics module's mean() function.

In [2]:
mean(test_scores)
Out[2]:
83.4
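
As a rough check on what mean() is doing, the sketch below computes the average by hand (sum of the values divided by the number of values) and compares it with the library result. The variable name manual_mean is only illustrative and does not come from the statistics module.

```python
from statistics import mean

test_scores = [60, 83, 83, 91, 100]

# The mean is the sum of the values divided by how many values there are.
manual_mean = sum(test_scores) / len(test_scores)

print(manual_mean)        # hand-computed average
print(mean(test_scores))  # average from the statistics module
```

Both lines should print the same value shown in Out[2] above.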

Calculate the median

To calculate the median, or middle value, of our test scores, use the statistics module's median() function.

If there are an odd number of values, median() returns the middle value. If there are an even number of values median() returns an average of the two middle values.

In [3]:
median(test_scores)
Out[3]:
83
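
To make the odd and even cases described above concrete, here is a small sketch. The list even_scores is a made-up example (our test scores with one 83 removed) and is not part of the original data set.

```python
from statistics import median

odd_scores = [60, 83, 83, 91, 100]   # 5 values: a single middle value
even_scores = [100, 60, 91, 83]      # 4 values, deliberately unsorted

# With an odd number of values, median() returns the middle value.
median_odd = median(odd_scores)

# With an even number of values, median() averages the two middle values
# of the sorted data: (83 + 91) / 2.
median_even = median(even_scores)

print(median_odd)   # 83
print(median_even)  # 87.0
```

Note that the input does not need to be sorted first; median() works on the values in sorted order internally.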

Calculate the mode

To calculate the mode, or most frequent value, of our test scores, use the statistics module's mode() function.

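If more than one value is tied for most frequent, the behavior of mode() depends on the Python version. In Python 3.8 and later, mode() returns the first of the tied values it encounters. In versions before Python 3.8, mode() raises a StatisticsError, as in this example: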

>>> mode([1, 1, 2, 2, 3])

StatisticsError: no unique mode; found 2 equally common values

If there is no value that occurs most often (all the values are unique or occur the same number of times), versions before Python 3.8 also raise a StatisticsError. Python 3.8 and later return the first value in the data instead.

>>> mode([1,2,3])

StatisticsError: no unique mode; found 3 equally common values

In [4]:
mode(test_scores)
Out[4]:
83
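
Because the handling of ties differs between Python versions, one option is to collect every tied value yourself. The sketch below does this with collections.Counter from the standard library. The helper name all_modes is hypothetical and is not part of the statistics module. (Python 3.8 and later also provide statistics.multimode(), which serves a similar purpose.)

```python
from collections import Counter


def all_modes(values):
    """Return every value tied for the highest count, in first-seen order.

    Hypothetical helper for illustration; assumes values is not empty.
    """
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


print(all_modes([60, 83, 83, 91, 100]))  # [83]
print(all_modes([1, 1, 2, 2, 3]))        # [1, 2]
print(all_modes([1, 2, 3]))              # [1, 2, 3]
```

A result with more than one element signals that the data has no unique mode, which is the situation where older versions of mode() raise a StatisticsError.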

Calculate the standard deviation

To calculate the standard deviation, or spread, of the test scores, use the statistics module's stdev() function. stdev() computes the sample standard deviation, which divides by n - 1 rather than n. A large standard deviation indicates the data is spread out; a small standard deviation indicates the data is clustered close together.

In [5]:
stdev(test_scores)
Out[5]:
14.842506526863986
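
To show where this number comes from, the sketch below recomputes the sample standard deviation step by step and compares it with stdev(). The intermediate variable names are illustrative only. It also calls pstdev(), the population version in the same module, which divides by n instead of n - 1.

```python
from math import sqrt
from statistics import mean, pstdev, stdev

test_scores = [60, 83, 83, 91, 100]

# Step 1: squared distance of each score from the mean.
average = mean(test_scores)
squared_deviations = [(score - average) ** 2 for score in test_scores]

# Step 2: divide by n - 1 (sample variance), then take the square root.
sample_variance = sum(squared_deviations) / (len(test_scores) - 1)
manual_stdev = sqrt(sample_variance)

print(manual_stdev)         # hand-computed sample standard deviation
print(stdev(test_scores))   # sample standard deviation from the module
print(pstdev(test_scores))  # population standard deviation (divides by n)
```

The first two values should agree to within floating-point rounding, and the population value is slightly smaller because it divides by a larger number.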

Alternatively, we can import the whole statistics module at once (all the functions in the statistics module) using the line:

import statistics

Then, to use the functions from the module, we call them with the module name as a prefix: statistics.mean(), statistics.median(), statistics.mode(), and statistics.stdev(). See below:

In [6]:
import statistics

test_scores = [60, 83, 83, 91, 100]
In [7]:
statistics.mean(test_scores)
Out[7]:
83.4
In [8]:
statistics.median(test_scores)
Out[8]:
83
In [9]:
statistics.mode(test_scores)
Out[9]:
83
In [10]:
statistics.stdev(test_scores)
Out[10]:
14.842506526863986

Summary

The statistics module is part of the Python Standard Library. To use statistics module functions, you first have to import them with the line from statistics import <function_name>, where <function_name> is the name of the function you want to use. Then you can call <function_name>() and pass in a list of values.

The following functions are part of Python's statistics module:

function    name                 description           example             result
mean()      mean                 mean or average       mean([1,4,5,5])     3.75
median()    median               middle value          median([1,4,5,5])   4.5
mode()      mode                 most frequent value   mode([1,4,5,5])     5
stdev()     standard deviation   spread of data        stdev([1,4,5,5])    1.893