Writing your own functions in Python

Note: in this section we will stay within core Python and avoid third-party modules such as NumPy.

Up to this point we have been using functions written by someone else. What do we need to do to create our own functions? And when may we want to do that?

Why write your own functions?

The main reason to write a function is the same as the main reason to use a for loop: you avoid repeating yourself. As an example, suppose you have a list of floats called float_values and you want to print each number after rounding it to 2 decimal places. We can do so using the for loop below:

# define a list of floats
float_values = [1.3546, 2.8452, 7.3137, 4.5606, 4.5517]

# print each number rounded to 2 decimal places, using " " as the separator
for x in float_values:
    x = round(x, 2)
    print(x, end=" ")

1.35 2.85 7.31 4.56 4.55

Now suppose that later in the code you run into the same basic problem, but this time the list is called floats. You then have to write another code block:

# another list of floats
floats = [1.1764, 5.1505 , 2.9198, 4.0561, 5.1616, 4.1885, 6.6322, 9.1152]

for x in floats:
    x = round(x, 2)
    print(x, end=" ")

1.18 5.15 2.92 4.06 5.16 4.19 6.63 9.12

By doing so, what amounts to the same piece of code shows up in more than one place. In addition to repeating yourself, this creates a problem when you decide to round to 1 decimal place instead of 2: you then have to hunt down every instance where this piece of code is used and change every one of them.

The upshot is this: if you are using the same piece of code (except for minor modifications) multiple times, consider writing a function for the general task, then calling it each time a specific case of the task needs to be performed.

Writing Python functions

In Python, the definition of a function starts with the keyword def, followed by the name of the function, then a comma-separated sequence of parameters placed inside parentheses, and finally a colon : that ends the line. The body of the function is then written in an indented block following this first line.

As an example, we may write a function called print2dp that prints floats rounded to 2 decimal places (note: the rules for allowed function names are exactly the same as the rules for allowed variable names):

def print2dp(float_list):
    for x in float_list:
        x = round(x, 2)
        print(x, end=" ")

As defined, the function print2dp takes one input argument, float_list. Note that this input argument is essentially a placeholder. It does not correspond to any concrete value when the function is defined. Instead, each time you call the function, the variable float_list is assigned the input you supplied before the function body starts executing.

In addition, note that the for...: line is indented, because it is part of the function body. Similarly, because our function contains a for loop whose own body requires indentation, the x = round(x, 2) and print(x, end=" ") lines are now doubly indented.

Once our function is defined, it can be called like any other function you have encountered. For example, we can now print our two lists to 2 decimal places via:

print2dp(float_values)

1.35 2.85 7.31 4.56 4.55

print2dp(floats)

1.18 5.15 2.92 4.06 5.16 4.19 6.63 9.12

Local versus global variables

There is one more important difference between rerunning the same piece of code and writing a function to complete the same task. When the block of code is not placed inside a function body, the variables it defines overwrite any existing variables of the same name and remain accessible after the block has executed:

x = 0

for x in floats:
    x = round(x, 2)
    print(x, end=" ")

print()
print(x)

1.18 5.15 2.92 4.06 5.16 4.19 6.63 9.12 
9.12

We call the variable x in this case a global variable since it can be accessed globally (= anywhere in the code after it is defined).

In contrast, when you place a code block inside a function, the variables the block defines are local to the function and cannot be accessed outside. For our example:

x = 0 # global variable x

def print2dp(float_list):

    for x in float_list: # x is local variable here
        x = round(x, 2)
        print(x, end=" ")

    # the local `x` is destroyed upon function exit

print2dp(floats)
print()
print(x) # print out the global x

1.18 5.15 2.92 4.06 5.16 4.19 6.63 9.12 
0

Similarly, the input arguments of your functions are local. They can take the same name as a global variable but they are always assigned the values of the inputs when you call the function, and the global variable will remain unchanged.

x = 0 # global variable x

def square(x): # local place-holder x
    x = x**2 # reassign the value of local x
    print(x) # print the value of local x

square(2) # substitute x = 2 as the function starts executing
print(x) # print out global value of x

4
0

The existence of local scope is a feature, not a bug. It allows you to hide implementation details of your function and control what object(s) are exposed to the outside. It also makes the global scope less cluttered when the function body generates a lot of intermediate variables.
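To make the point about intermediate variables concrete, here is a minimal sketch. The function name print_mean_of_squares and its variables total and square are hypothetical names chosen for illustration; the try/except block is only there to show the resulting error without stopping the script.

```python
# hypothetical helper: prints the mean of the squared inputs
def print_mean_of_squares(values):
    total = 0.0                 # local intermediate variable
    for v in values:
        square = v ** 2         # another local intermediate variable
        total = total + square
    print(total / len(values))

print_mean_of_squares([1.0, 2.0, 3.0])

# none of the intermediate names exist in the global scope
try:
    print(total)
except NameError as err:
    print("NameError:", err)
```

After the call, the global scope contains only the function itself; total, square, and v were never created there.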

What if a variable is defined in the global scope but not in the local scope, and we use it inside the function body? Answer: when Python can't find the name in the local scope, it looks the name up in the global scope instead. Here is an example:

x = 2
y = 1

def print_x_add_y(x):
    print(x + y) # local x, global y

print_x_add_y(5)
print(x) # print global x
print(y) # print global y

6
2
1

In the above function, when the print(x + y) line is reached, the value of x comes from the placeholder, i.e., from the value you input into the function when calling it, while the value of y comes from the global scope.
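One consequence worth seeing in code is that the global value is looked up each time the function runs, not when the function is defined. The sketch below uses a hypothetical function add_y and a return statement (covered in the section on returns below) so that the two results can be compared side by side.

```python
y = 1

def add_y(x):
    return x + y    # local x, global y

first = add_y(5)    # y is 1 at the time of this call

y = 10              # reassign the global y outside the function
second = add_y(5)   # the same call now sees y = 10

print(first, second)   # 6 15
```

This is one reason to be careful with functions that depend on global variables: the same call can give different results depending on what happened elsewhere in the code.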

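Note that you can read a global variable inside a function, but a plain assignment to that name does not reassign the global. Any name that is assigned anywhere in a function body is treated as local throughout that body. In the code below, the right-hand side 2 * y therefore tries to read the local y before it has been given a value, and the call results in an error: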

x = 2
y = 1

def print_x_add_y(x):
    y = 2 * y # planted an error: cannot reassign global y
    print(x + y) 

print_x_add_y(2) # when the error is actually triggered

---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
Cell In[12], line 8
      5     y = 2 * y # planted an error: cannot reassign global y
      6     print(x + y) 
----> 8 print_x_add_y(2) # when the error is actually triggered

Cell In[12], line 5, in print_x_add_y(x)
      4 def print_x_add_y(x):
----> 5     y = 2 * y # planted an error: cannot reassign global y
      6     print(x + y)

UnboundLocalError: cannot access local variable 'y' where it is not associated with a value

Returns versus side effects

It is worth noting that print2dp works by side effect: numbers are printed over the course of its execution, but it does not return anything useful (a function without a return statement returns None), so there is no meaningful result to assign to a variable. In contrast, we can also define a function that returns a value.

For example, suppose we want our function to round every number in a list to 2 decimal places, but we want the list of rounded numbers to be returned rather than printed out. We may modify our function to:

def round2dp(float_list):

    out = []
    
    for x in float_list:
        x = round(x, 2)
        out.append(x)

    return out

Note that this time we have to define a new list called out at the beginning of the function body and return it at the end. By returning the list, we make the result available to be assigned and used in the global scope, even though the name out remains local:

out = [] # global out

result = round2dp(floats) # global `result` assigned return value of round2dp

print(result)
print(out) # global `out` is unchanged

[1.18, 5.15, 2.92, 4.06, 5.16, 4.19, 6.63, 9.12]
[]
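The practical difference between the two styles can be sketched as follows. Both functions are redefined here in slightly condensed form so that the example is self-contained, and a shorter list is used for brevity; the variable names printed and rounded are chosen for illustration.

```python
def print2dp(float_list):
    for x in float_list:
        print(round(x, 2), end=" ")   # side effect only, nothing is returned

def round2dp(float_list):
    out = []
    for x in float_list:
        out.append(round(x, 2))
    return out

floats = [1.1764, 5.1505, 2.9198]

printed = print2dp(floats)   # the numbers appear on screen...
print()
print(printed)               # ...but the call itself evaluates to None

rounded = round2dp(floats)   # nothing is printed, but we get a list back
print(max(rounded))          # the returned list can be used in further computation
```

Because round2dp returns its result, it can be combined with other functions such as max, whereas the output of print2dp is only visible on screen.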

Functions with optional arguments

It is easy to extend what we have learned and write functions that take multiple arguments. For example, suppose we generalize print2dp() to a function print_n_dp() that lets you specify how many decimal places to keep:

def print_n_dp(float_list, n):

    for x in float_list:
        x = round(x, n)
        print(x, end=" ")

print_n_dp(floats, 1)

1.2 5.2 2.9 4.1 5.2 4.2 6.6 9.1

This may be useful, but if your use cases center mostly on printing to 2 decimal places, you may want to make 2 the default value of n. To do so, we follow the argument name with an equal sign and the default value:

def print_n_dp(float_list, n = 2):

    for x in float_list:
        x = round(x, n)
        print(x, end=" ")

Then we can use the function without supplying n, in which case the default value will be used:

print_n_dp(floats)

1.18 5.15 2.92 4.06 5.16 4.19 6.63 9.12
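The default can still be overridden when needed, either by position or by name. The sketch below redefines print_n_dp with one assumed change, a trailing print() so that each call ends its own line, and uses a shorter list called values for brevity.

```python
def print_n_dp(float_list, n=2):
    for x in float_list:
        x = round(x, n)
        print(x, end=" ")
    print()   # assumption for this sketch: end the line after each call

values = [1.1764, 5.1505, 2.9198]

print_n_dp(values)         # n is omitted, so the default n = 2 is used
print_n_dp(values, 1)      # n supplied by position
print_n_dp(values, n=3)    # n supplied by name (a keyword argument)
```

Supplying n by name makes the call easier to read, since the meaning of the number is visible at the call site.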

Importantly, you must write down arguments without defaults before you write down those that have defaults. For example, the following function will raise an error at definition time:

def print_dp_n(n = 2, float_list):

    for x in float_list:
        x = round(x, n)
        print(x, end=" ")

  Cell In[19], line 1
    def print_dp_n(n = 2, float_list):
                          ^
SyntaxError: parameter without a default follows parameter with a default
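A minimal sketch of the fix is to move the parameter with a default to the end of the parameter list. If you would still like to write n first when calling the function, you can pass both arguments by name, since keyword arguments may appear in any order. The trailing print() and the shorter list are assumptions made for this example.

```python
# corrected definition: the parameter without a default comes first
def print_dp_n(float_list, n=2):
    for x in float_list:
        x = round(x, n)
        print(x, end=" ")
    print()   # assumption for this sketch: end the line after each call

floats = [1.1764, 5.1505, 2.9198]

print_dp_n(floats)                    # uses the default n = 2
print_dp_n(n=1, float_list=floats)    # keyword arguments can be given in any order
```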