If you try to read your own code written three months ago, chances are that it will feel as if it was written by a total stranger: weird algorithms, strange variable names, apparently random hard-coded values, "what's going on here?"... This "code alienation" problem is even more acute when the code you are reading is not yours!
What is the solution to this problem? Is there a way to make "code alienation" less worrisome or painful? To write code that can be read in the future, or by others, you need to document it.
The documentation of your code should give information about two things: how to use the code, and why the code is written the way it is.
The first guideline mainly helps other developers use your code. In Python, this mostly consists of the so-called docstrings: text that holds information about the functions or objects to be used.
The second guideline helps people who read your code, or who try to update it, improve it, or fix bugs in it, by providing them with a rationale. This consists of comments that usually point out what a chunk of code does, or explain the approach used.
However, these guidelines should be complemented with simplicity: writing too many comments might not affect the performance of your program, but the unnecessary ones clutter the space needed for reading it. Too many comments can affect the person reading your code (remember, it might be your future self!).
Keeping the documentation as simple as possible (but not simpler!) should be the aim of any programmer. The same is true for the code: simple, clean code can reduce the number of comments needed, especially when meaningful names and structure arise from that simplification. Sometimes people use comments as a crutch for their poor understanding of the program or problem they are tackling.
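As a minimal sketch of this idea, compare two versions of the same hypothetical function (the free-fall example and all the names in it are assumptions made for illustration). The first version needs a comment on almost every line; the second mostly explains itself through its names, and the one remaining comment records an assumption rather than restating the code.

```python
# Version 1: cryptic names, so every line needs a comment.
def f(a, b):
    # a is the initial height in metres, b is the time in seconds
    c = 9.81  # acceleration of gravity
    return a - 0.5 * c * b ** 2  # height after b seconds


# Version 2: the names carry the information.
# Gravity in m/s^2, assumed constant near the Earth's surface.
GRAVITY = 9.81


def height_after_fall(initial_height, elapsed_time):
    """Return the height (m) of a body dropped from rest after elapsed_time (s)."""
    return initial_height - 0.5 * GRAVITY * elapsed_time ** 2


# Both versions compute the same value; only the readability differs.
print(f(100.0, 2.0) == height_after_fall(100.0, 2.0))
```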
Quoting from PEP 257:
A docstring is a string literal that occurs as the first statement in a module, function, class, or method definition. Such a docstring becomes the __doc__ special attribute of that object.
This string should hold information about what the module, function, class or method DOES: what its purpose is, which variables (if any) it requires, any options, what the outputs are...
For example, the following are the first lines of a function that constructs a complex number from its real and imaginary parts:
def complex(real=0.0, imag=0.0):
    """Form a complex number.

    Keyword arguments:
    real -- the real part (default 0.0)
    imag -- the imaginary part (default 0.0)
    """
    if imag == 0.0 and real == 0.0:
        return complex_zero
    ...
Docstrings consist of a piece of text (one or multiple lines) delimited by triple double-quotation marks.
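The following sketch shows how a docstring can be read back at run time. The function `scale` is a hypothetical example, not part of PEP 257; `__doc__` and `inspect.getdoc` are standard Python.

```python
import inspect


def scale(values, factor=2.0):
    """Return a new list with every element of values multiplied by factor.

    Keyword arguments:
    factor -- the multiplier applied to each element (default 2.0)
    """
    return [value * factor for value in values]


# The docstring is stored in the __doc__ attribute of the function.
first_line = scale.__doc__.splitlines()[0]
print(first_line)

# inspect.getdoc() returns the same text with the indentation cleaned up;
# help(scale) would display it in an interactive session.
print(inspect.getdoc(scale))
```

This is why the first line of a docstring matters so much: it is what tools and other developers see first when they ask for help on your function.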
Please, go and read PEP 257 now!
Comments are complete sentences (in English, please!) preceded by the hash symbol (#). They are used to add information that cannot be easily obtained by reading the code.
Quoting Jef Raskin:
[code] can’t explain why the program is being written, and the rationale for choosing this or that method. [It] cannot discuss the reasons certain alternative approaches were taken. For example:
:Comment: A binary search turned out to be slower than the Boyer-Moore algorithm for the data sets of interest, thus we have used the more complex, but faster method even though this problem does not at first seem amenable to a string search technique. :End Comment:
This comment not only names the technique used, but also explains why a simpler approach was not taken.
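A small Python sketch of a comment in that spirit could look like this. The function `find_reading`, its data, and the stated usage pattern are hypothetical; `bisect.bisect_left` is part of the standard library. The comment does not repeat what the code does; it records why a simpler approach was set aside.

```python
import bisect


def find_reading(sorted_timestamps, target):
    """Return the index of target in sorted_timestamps, or -1 if absent."""
    # A plain linear scan would be simpler, but this (hypothetical) list is
    # assumed to be sorted, large, and queried many times, so a binary
    # search, which needs O(log n) steps per lookup, is used instead.
    index = bisect.bisect_left(sorted_timestamps, target)
    if index < len(sorted_timestamps) and sorted_timestamps[index] == target:
        return index
    return -1


print(find_reading([10, 20, 30, 40], 30))
print(find_reading([10, 20, 30, 40], 35))
```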
Also, make sure you treat comments as an integral part of your code. The comments should talk about what the code is doing and the rationale behind it. If the code changes so much that the comments become obsolete, change your comments accordingly.
Quoting from PEP 8:
Comments that contradict the code are worse than no comments. Always make a priority of keeping the comments up-to-date when the code changes!
In the community there is some debate regarding comments in the code: in particular, how much documentation is necessary, what the comments should say, and so on.
For example, let's take a look at a BADLY DOCUMENTED piece of code:
import numpy as np  #This imports numpy
x = np.linspace(0, 2*np.pi, 1000)  # This creates an array of 1000 elements from 0 to 10, equally spaced, and stores it in x
for d in np.linspace(0, np.pi, 10):  # For each small angle from 0 to pi in steps of 0.01, do...
    MxD = np.sin(x+d)  # This creates an array MxD that stores the values of sin(x)
    #Prints the value of the integral of y
    print("The sum is {0:.3f}".format(sum(MxD*(x[1]-x[0]))))
You can see that this code has a comment on every single line, yet it can be considered badly documented. Why? Because it fails to deliver crucial information, such as what the code is supposed to do and why it does it that way. Some of the comments even contradict the code.
A better-documented version of the previous code would be:
""" Test of the integral of a sine in one period.
This script aims to check that the integral of a sine
function over a period of oscillation is 0, regardless
of the initial dephasing.
The integral is done for functions of the type sin(x+a),
over x from 0 to 2pi (a period of oscillation), with
variable initial phase a.
Since it is a simple script, we use the Rectangle rule with
1000 elements between 0 and 2pi.
We expect each output to be close to 0.
"""
import numpy as np

# The domain of integration is x=(0,2pi). We use 1000 points.
# The test is performed for 10 dephasing angles, angles=(0,pi).
x = np.linspace(0, 2*np.pi, 1000)
dx = x[1]-x[0]
angles = np.linspace(0, np.pi, 10)

for a in angles:
    y = np.sin(x+a)
    # Left rectangle rule: every point except the last contributes y*dx.
    integral = np.sum(y[:-1]*dx)
    print("The integral is {0:.3f}".format(integral))
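One possible way to make the expectation stated in the docstring explicit is to wrap the calculation in a documented function and check each result against a tolerance. This is only a sketch: the function name `integrate_sine`, the tolerance, and the use of the standard `math` module (so that it runs without extra packages) are assumptions, and the grid is a close variant of the one in the script above, not an exact copy.

```python
import math


def integrate_sine(phase, n_points=1000):
    """Integrate sin(x + phase) over one period with the left rectangle rule.

    Keyword arguments:
    n_points -- number of equally spaced rectangles in [0, 2*pi) (default 1000)
    """
    dx = 2 * math.pi / n_points
    return sum(math.sin(i * dx + phase) * dx for i in range(n_points))


# Ten phases equally spaced between 0 and pi (both included).
phases = [k * math.pi / 9 for k in range(10)]

for phase in phases:
    integral = integrate_sine(phase)
    # The tolerance is an arbitrary choice, far above round-off error.
    assert abs(integral) < 1e-9
    print("The integral is {0:.3f}".format(integral))
```

Here the docstring tells a user how to call the function, while the comments explain choices (the set of phases, the tolerance) that the code alone cannot justify.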
The following links are articles that delve a little more into the issues of documentation in code:
http://visualstudiomagazine.com/articles/2011/01/06/to-comment-or-not-to-comment.aspx
http://visualstudiomagazine.com/articles/2013/06/01/roc-rocks.aspx
http://sd.jtimothyking.com/2006/12/15/does-bad-writing-reflect-poor-programming-skills/
http://blog.codinghorror.com/code-tells-you-how-comments-tell-you-why/
https://en.wikipedia.org/wiki/The_Elements_of_Programming_Style