In-code documentation

Questions

  • What can I do to make my code more easily understandable?

  • What information should go into comments?

  • What are docstrings and what information should go into docstrings?

Objectives

  • Write docstrings according to best practices

  • Know where and when to put comments

In this episode we will learn how to write good documentation inside your code.

Exercise - Writing good comments

In-code-1: Comments

Let’s take a look at two example comments (comments in python start with #):

Comment A

# Now we check if temperature is larger then -50:
if temperature > -50:
    print('do something')

Comment B

# We regard temperatures below -50 degrees as measurement errors
if temperature > -50:
    print('do something')

Which of these comments is best? Can you explain why?

Warning

Do not use comments for:

Keeping zombie code

# Do not run this code!:
# if temperature > 0:
#     print('It is warm')

Instead: just remove the code, you can always find it back in a previous version of your code in git.

Replacing git

# removed on August 5
# if() ...
# Now it connects to the API with o-auth2, updated 05/05/2016

Instead: use git to keep track of different versions of your code.

Writing docstrings in python

Let’s look at the following function:

def mean_temperature(data):
    temperatures = data['Air temperature (degC)']
    return sum(temperatures)/len(temperatures)

It computes the mean temperature for a given dataset. How can we make it clearer what this function does and how to use it?

We can add a docstring (the string in between the two """):

def mean_temperature(data):
    """
    Get the mean temperature

    Args:
        data (pandas.DataFrame): A pandas dataframe with air temperature measurements.

    Returns:
        The mean air temperature (float)
    """
    temperatures = data['Air temperature (degC)']
    return float(sum(temperatures)/len(temperatures))

A docstring is a structured comment associated to a segment of code (i.e. function or class)

Good docstrings describe:

  • What the function does

  • What goes in (including the type of the input variables)

  • What goes out (including the return type)

In python there are several styles that describe how docstrings should be formatted. Here we use Google style docstrings.

Python parses docstrings, for example calling the help function will display it:

help(mean_temperature)

Python will print this help text:

Help on function mean_temperature in module __main__:

mean_temperature(data)
    Get the mean temperature

    Args:
        data (pandas.DataFrame): A pandas dataframe with air temperature measurements.

    Returns:
        The mean air temperature (float)

It is common to write docstrings for functions, classes, and modules.

Script docstrings

You can also add a structured docstring at the top of a script to document what the script does and how to run it.

"""Prints information about the mean air temperature.

Usage:
 ./temperature.py

Author:
 Sven van der Burg - 2021-03-2021
"""

Small effort, large gain.

Writing docstrings makes you generate your documentation as you are generating the code!

Exercise: Adding in-code documentation

In-code-2: add in-code documentation

Update this code snippet so it is well-documented:

import pandas as pd

def x(a, print_columns=False):
   b = pd.read_excel(a)
   column_headers = list(b.columns.values)
   if print_columns:
       print("\n".join(column_headers))
   return column_headers

Naming is documentation.

Giving explicit, descriptive names to your code segments (functions, classes, variables) already provides very useful and important documentation. In practice you will find that for simple functions it is unnecessary to add a docstring when the function name and variable names already give enough information.

Keypoints

  • Comments should describe the why for your code not the what.

  • Writing docstrings is an easy way to write documentation while you type code.