Bugs and debugging#

Error and traceback#

In most of our examples so far, the code has run without errors. However, if you are new to Python, you will likely run into error messages fairly often. For instance, you may forget that the colon (:) is a required part of the syntax of for:

for x in range(4)
    print(x)
  Cell In[1], line 1
    for x in range(4)
                     ^
SyntaxError: expected ':'

When you try to execute the code above, Python gives you an error message (as illustrated) along with some information about where the error is located, so that you can start debugging it.
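For comparison, a minimal corrected version of the loop only needs the missing colon at the end of the for line:

```python
# Corrected loop: the colon at the end of the `for` line is required
for x in range(4):
    print(x)
```

With the colon in place, the cell runs and prints the numbers 0 through 3, one per line.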

More often, especially when you are using third-party modules (which we’ll get to in week 4), your error messages will be more complicated. For example (don’t worry about the import line; we’ll talk more about that in week 4):

import numpy as np

x_list = [1, 7, 5, None, 8]
np.max(x_list)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[2], line 4
      1 import numpy as np
      3 x_list = [1, 7, 5, None, 8]
----> 4 np.max(x_list)

File ~\miniforge3\envs\learn\Lib\site-packages\numpy\core\fromnumeric.py:2810, in max(a, axis, out, keepdims, initial, where)
   2692 @array_function_dispatch(_max_dispatcher)
   2693 @set_module('numpy')
   2694 def max(a, axis=None, out=None, keepdims=np._NoValue, initial=np._NoValue,
   2695          where=np._NoValue):
   2696     """
   2697     Return the maximum of an array or maximum along an axis.
   2698 
   (...)   2808     5
   2809     """
-> 2810     return _wrapreduction(a, np.maximum, 'max', axis, None, out,
   2811                           keepdims=keepdims, initial=initial, where=where)

File ~\miniforge3\envs\learn\Lib\site-packages\numpy\core\fromnumeric.py:88, in _wrapreduction(obj, ufunc, method, axis, dtype, out, **kwargs)
     85         else:
     86             return reduction(axis=axis, out=out, **passkwargs)
---> 88 return ufunc.reduce(obj, axis, dtype, out, **passkwargs)

TypeError: '>=' not supported between instances of 'int' and 'NoneType'

This is a lot of information, but we can work through it step by step.

First, we focus our attention on the bottom of the error message, where the error type is indicated (TypeError). More importantly, the message gives you a clue about what went wrong: somewhere, an int is being compared with None (whose type is NoneType) using >=, which is not allowed.
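To convince yourself that this reading of the message is right, you can reproduce the same kind of error without numpy. The sketch below is only an illustration: it relies on the try/except construct, which we have not covered, and the variable name message is our own choice. It attempts the comparison directly and prints the resulting error message.

```python
# Reproduce the comparison that the traceback complains about
try:
    1 >= None  # comparing an int with None is not supported
    message = "no error"
except TypeError as err:
    # Store the text of the error so that we can inspect it
    message = str(err)

print(message)
```

The printed text should match the last line of the traceback above, which suggests that the error really comes from comparing a number with None.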

Now that you know what went wrong, the next step is to figure out where it went wrong. Going from the bottom towards the top, you encounter unfamiliar code from unfamiliar files in the middle. You can ignore these parts, because this code was written by someone else for the numpy package (again, more on this in week 4). It is not your code. Your code is actually close to the top, where something triggers the error downstream.

At the top you see that the np.max(x_list) line is marked with an arrow. Executing the np.max() function triggers the error downstream. But there is nothing wrong with the function itself, so we look into the arguments you supplied to it.

And there we finally understand what went wrong. The data fed into the function is problematic: x_list contains None, which cannot be compared with the integers around it. So the line that you’ll likely have to fix is actually the definition of x_list.
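One possible fix, sketched below, is to drop the None entry before looking for the maximum. Whether a missing value should be dropped, replaced, or traced back to its source depends on your data, so treat this as an assumption rather than a rule. The names x_clean and largest are ours, and Python’s built-in max() is used so that the sketch runs without numpy; the cleaned list could be passed to np.max() in the same way.

```python
x_list = [1, 7, 5, None, 8]

# Keep only the entries that are not None
# (assumption: dropping missing values is acceptable here)
x_clean = []
for v in x_list:
    if v is not None:
        x_clean.append(v)

largest = max(x_clean)
print(largest)  # prints 8
```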

Debugging strategies#

What if you have a complicated block of code and you can’t easily tell where the error comes from? Say, for example, you have the following block:

x = list(range(15))
y = list(range(1, 16))

for i, val in enumerate(x):
    y[i] = y[i] - val

out = []
for i, val in enumerate(y):
    out.append(val / x[i])
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
Cell In[3], line 9
      7 out = []
      8 for i, val in enumerate(y):
----> 9     out.append(val / x[i])

ZeroDivisionError: division by zero

Here are some suggestions:

  • If your code is in a large / long cell, see if you can break the cell into smaller pieces and isolate the piece that triggers the error.

  • You may want to comment out part of the code and see how it affects the error. Again, this can provide useful information about where the error occurred.

  • Finally, if you know roughly where the error occurred, insert print() statement(s) shortly before that location to give you some information about the state of the variables just before Python tripped.

In our example, we can break the code above into 3 smaller pieces, only the last of which triggers the error:

x = list(range(15))
y = list(range(1, 16))

for i, val in enumerate(x):
    y[i] = y[i] - val

out = []
for i, val in enumerate(y):
    out.append(val / x[i])
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
Cell In[6], line 3
      1 out = []
      2 for i, val in enumerate(y):
----> 3     out.append(val / x[i])

ZeroDivisionError: division by zero

Next, we print out the values of i, val, and x[i] right before the error occurs:

out = []
for i, val in enumerate(y):
    print(i, val, x[i])
    out.append(val / x[i])
0 1 0
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
Cell In[7], line 4
      2 for i, val in enumerate(y):
      3     print(i, val, x[i])
----> 4     out.append(val / x[i])

ZeroDivisionError: division by zero

We see that the error happens at index i = 0, where val = 1 and x[i] = 0. It is the last of these, x[0] = 0, that causes our problem: the code tries to divide by zero.
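Once the cause is known, one way to keep the loop from failing is to check the divisor before dividing. The sketch below assumes that storing None for the undefined entry is acceptable in your application; that choice is ours for illustration, not something dictated by Python.

```python
x = list(range(15))
y = list(range(1, 16))

for i, val in enumerate(x):
    y[i] = y[i] - val

out = []
for i, val in enumerate(y):
    if x[i] == 0:
        # Division is undefined here; record a placeholder instead
        # (assumption: None is an acceptable marker for "no value")
        out.append(None)
    else:
        out.append(val / x[i])

print(out[:3])  # prints [None, 1.0, 0.5]
```

Other choices, such as skipping the entry entirely or starting x at 1, may fit better depending on what the computation is meant to represent.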

Logic error and sanity check#

So far we have discussed errors that Python can catch (which we will refer to as runtime errors, using the term loosely). However, there are also mistakes that Python cannot catch. In these cases, your code will run and, say, generate some output, but you may not notice that the code is not doing what you had in mind. These are called logic errors. Take the if-elif-else example from the last section. Suppose we have:

# define a temperature
temp = 22
if temp > 35:
    print("Dire warning: Temperature " + str(temp) + " is above the threshold of 35 deg C")
else:
    print("Temperature " + str(temp) + " deg C is normal")
    
if temp < 10:
    print("Warning: Temperature " + str(temp) + " is close to the threshold of 35 deg C")
Temperature 22 deg C is normal

The code seems to be working fine, until you supply a value of temp that is below 10:

# define a temperature
temp = -5
if temp > 35:
    print("Dire warning: Temperature " + str(temp) + " is above the threshold of 35 deg C")
else:
    print("Temperature " + str(temp) + " deg C is normal")
    
if temp < 10:
    print("Warning: Temperature " + str(temp) + " is close to the threshold of 35 deg C")
Temperature -5 deg C is normal
Warning: Temperature -5 is close to the threshold of 35 deg C

Now the code prints two messages: it reports -5 deg C as normal and also issues a warning. This is an example of a logic error, which, again, arises because you intended to carry out one operation but instead instructed Python to do something else.

One good guard against logic errors is a sanity check. Basically, you feed your code some data for which you know what answer(s) should come out, and check whether the output is indeed what you expected. If not, you will have to use the debugging strategies mentioned above to locate where the error(s) occur and what went wrong, this time without the help of Python’s error messages.
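As a rough sketch of what such a sanity check could look like for the temperature example, the code below wraps the same branching in a helper function that returns the messages instead of printing them. The function name temperature_messages, the dictionary counts, and the choice of test values are all ours. The expectation being tested is that every temperature should produce exactly one message.

```python
def temperature_messages(temp):
    """Same branching as the example above, but collect the messages in a list."""
    messages = []
    if temp > 35:
        messages.append("Dire warning: Temperature " + str(temp) + " is above the threshold of 35 deg C")
    else:
        messages.append("Temperature " + str(temp) + " deg C is normal")

    if temp < 10:
        messages.append("Warning: Temperature " + str(temp) + " is close to the threshold of 35 deg C")
    return messages

# Sanity check: we expect exactly one message for each test temperature
counts = {}
for temp in [40, 22, -5]:
    counts[temp] = len(temperature_messages(temp))
    print(temp, counts[temp])
```

For 40 and 22 the count is 1, as expected, but for -5 it is 2. No error message is raised, yet the mismatch between expected and actual counts flags the logic error and tells you which input to investigate.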