Multi-dimensional arrays in numpy

# initialization
import numpy as np

Some examples

Often we need to model data that has more than one dimension. For example:

  • Suppose we measure the ocean surface temperature over a range of latitudes and longitudes. The resulting data is naturally two-dimensional. If we include depth, the data will become three-dimensional.

  • Suppose we measure the variations of temperature and salinity at a particular location over time. The timestamps, temperature readings, and salinity readings can each be represented by a 1D array of size n, but we may opt to put these together as an n-by-3 array instead.
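As a rough sketch of the second example, three 1D arrays of size n can be combined into an n-by-3 array with np.column_stack. The readings and variable names below are illustrative assumptions, not real data.

```python
import numpy as np

# Hypothetical readings, chosen only for illustration
time = np.array([3.0, 9.0, 15.0, 21.0])         # fractional hours
temperature = np.array([8.9, 9.3, 9.6, 9.0])    # degrees Celsius
salinity = np.array([31.0, 30.7, 30.5, 30.9])

# Place each 1D array of size n in its own column, giving an n-by-3 array
data = np.column_stack([time, temperature, salinity])
print(data.shape)    # (4, 3)
```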

In numpy, a multi-dimensional array can be created from a regularly nested list, e.g.,

x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])
print(x)
[[1 3 5 7]
 [2 4 6 8]]

Note that this array has 8 elements arranged in a 2-row by 4-column structure. If we look at its attributes, we get:

x.size
8
x.shape
(2, 4)
x.ndim
2

Thus, the first element of x.shape is the number of rows, while the second element is the number of columns.
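The three attributes are related to one another, which the following short sketch makes explicit (the names n_rows and n_cols are chosen here for illustration):

```python
import numpy as np

x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])

# Unpack the shape tuple into the number of rows and columns
n_rows, n_cols = x.shape
print(n_rows, n_cols)              # 2 4

# The total number of elements is the product of the shape entries
print(x.size == n_rows * n_cols)   # True

# The number of dimensions is the length of the shape tuple
print(x.ndim == len(x.shape))      # True
```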

Next, here is an example of a 3-dimensional array (note the use of optional indentation to make the structure of the array clearer):

y = np.array([
    [
        [1, 2, 3, 4],
        [5, 6, 7, 8]
    ],[
        [-2, -1, 1, 2],
        [-4, -3, 3, 4]
    ],[
        [0, 1, 0, 1],
        [1, 1, 0, 0]
    ]
    
])
print(y)
[[[ 1  2  3  4]
  [ 5  6  7  8]]

 [[-2 -1  1  2]
  [-4 -3  3  4]]

 [[ 0  1  0  1]
  [ 1  1  0  0]]]

We can think of y as consisting of 3 “sheets”, each sheet having 2 rows and 4 columns. The corresponding size, shape, and number of dimensions of y are:

y.size
24
y.shape
(3, 2, 4)
y.ndim
3
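To make the “sheet” picture concrete, here is a small sketch that pulls one sheet out of y and checks its shape. The names n_sheets and first_sheet are illustrative only.

```python
import numpy as np

y = np.array([
    [[1, 2, 3, 4], [5, 6, 7, 8]],
    [[-2, -1, 1, 2], [-4, -3, 3, 4]],
    [[0, 1, 0, 1], [1, 1, 0, 0]],
])

# The shape reads (sheets, rows, columns)
n_sheets, n_rows, n_cols = y.shape
print(n_sheets, n_rows, n_cols)    # 3 2 4

# A single integer index picks out one sheet, which is itself a 2-by-4 array
first_sheet = y[0]
print(first_sheet.shape)           # (2, 4)

# len() reports the size of the first dimension only
print(len(y))                      # 3
```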

Indexing a multi-dimensional array

An n-dimensional array can be indexed by n comma-separated indices, one per dimension, each of which is either an integer or a slice. For example, to obtain the element at the 2nd row (index = 1) and 3rd column (index = 2) of x, we can do:

x[1, 2]
6

Similarly, to extract the 2nd and 3rd columns of every row, we can do:

x[:, 1:3]
array([[3, 5],
       [4, 6]])

Note that the first : says “extract all rows”, while 1:3 says “extract columns starting from index 1, up to but not including index 3”.

The same idea extends to n-dimensional arrays. For example, to obtain the 2nd and 3rd columns of every row of the first two sheets of y, we may do:

y[:2, :, 1:3]
array([[[ 2,  3],
        [ 6,  7]],

       [[-1,  1],
        [-3,  3]]])
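One detail worth checking when mixing integers and slices is the shape of the result: an integer index removes its dimension, while a slice keeps it. A minimal sketch using the 2-by-4 array x from earlier:

```python
import numpy as np

x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])

# An integer index removes that dimension from the result
print(x[1, :].shape)      # (4,)   the second row as a 1D array

# A slice keeps the dimension, even when it selects only one row
print(x[1:2, :].shape)    # (1, 4)

# Two slices give a 2D result
print(x[:, 1:3].shape)    # (2, 2)

# Trailing indices that are left out behave like ":"
print(np.array_equal(x[1], x[1, :]))   # True
```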

The above syntax for indexing and slicing can also be used to assign values to the elements of an array. For example, we may overwrite a single element of x as follows:

x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])
x[1, 2] = 0

print(x)
[[1 3 5 7]
 [2 4 0 8]]

We can even assign multiple values at once, e.g.,

x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])
x[:, 1] = [-5, -2]

print(x)
[[ 1 -5  5  7]
 [ 2 -2  6  8]]
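Two further variations on slice assignment, sketched under the same setup: a scalar on the right-hand side is copied into every selected element, and a 2D selection can be overwritten with a nested list of matching shape. The replacement values are arbitrary.

```python
import numpy as np

x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])

# A scalar is copied into every selected element
x[0, :] = 0
print(x)
# [[0 0 0 0]
#  [2 4 6 8]]

# A 2-by-2 block can be overwritten with a matching nested list
x[:, 2:] = [[10, 20], [30, 40]]
print(x)
# [[ 0  0 10 20]
#  [ 2  4 30 40]]
```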

Arithmetic with multi-dimensional arrays

The arithmetic operators +, -, *, /, //, %, and ** operate element by element between two arrays of the same shape. For example:

x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])
z = np.array([[1, 2, 0, 1], [0, 2, 1, 2]])
x ** z
array([[ 1,  9,  1,  7],
       [ 1, 16,  6, 64]])
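The other operators pair up elements in the same way. The sketch below shows + and *, and also what happens when the shapes cannot be matched; the array w is a hypothetical example introduced only to trigger the error.

```python
import numpy as np

x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])
z = np.array([[1, 2, 0, 1], [0, 2, 1, 2]])

# Each operator combines the elements found at the same position
print(x + z)
# [[ 2  5  5  8]
#  [ 2  6  7 10]]
print(x * z)
# [[ 1  6  0  7]
#  [ 0  8  6 16]]

# Shapes that cannot be matched raise a ValueError
w = np.array([1, 2, 3])    # shape (3,), incompatible with (2, 4)
try:
    x + w
except ValueError as err:
    print("ValueError:", err)
```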

Moreover, the arithmetic operators can also operate between a scalar and an n-dimensional array:

x / 2
array([[0.5, 1.5, 2.5, 3.5],
       [1. , 2. , 3. , 4. ]])

More generally, if you have an n-dimensional array (say y) and an m-dimensional array (say x), where m < n, you can also operate between the two as long as the last m dimensions of y agree with the dimensions of x. In this case x behaves as if it were automatically expanded to the same shape as y, with its data repeated identically as needed, and the operation is carried out between y and this expanded version of x. NumPy calls this mechanism broadcasting.
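A minimal sketch of this rule, reusing the arrays x (shape (2, 4)) and y (shape (3, 2, 4)) from earlier. The 1D array row is a hypothetical addition for illustration.

```python
import numpy as np

x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])    # shape (2, 4)
y = np.array([
    [[1, 2, 3, 4], [5, 6, 7, 8]],
    [[-2, -1, 1, 2], [-4, -3, 3, 4]],
    [[0, 1, 0, 1], [1, 1, 0, 0]],
])                                            # shape (3, 2, 4)

# The last two dimensions of y match x.shape, so x is added to every sheet
total = y + x
print(total.shape)    # (3, 2, 4)
print(total[2])
# [[1 4 5 8]
#  [3 5 6 8]]

# A 1D array of size 4 matches the last dimension of x, so it is added to every row
row = np.array([10, 20, 30, 40])
print(x + row)
# [[11 23 35 47]
#  [12 24 36 48]]
```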

As an example, suppose we have 4 measurements of time (in fractional hours), temperature (in °C), and salinity, packed as a 4-by-3 array:

measurement = np.array([
    [3.0, 8.9, 31.0],
    [9.0, 9.3, 30.7],
    [15.0, 9.6, 30.5],
    [21.0, 9.0, 30.9]
])

To convert the temperature to Fahrenheit, we may define:

multiplier = [1, 9.0 / 5, 1]
offset = [0, 32.0, 0]

These allow us to do:

multiplier * measurement + offset
array([[ 3.  , 48.02, 31.  ],
       [ 9.  , 48.74, 30.7 ],
       [15.  , 49.28, 30.5 ],
       [21.  , 48.2 , 30.9 ]])

The idea is that both multiplier and offset are expanded to become 4-by-3 arrays, with each row repeating the same content. The end effect is that each temperature element is transformed as \(T_F = (9.0 / 5.0) \cdot T_C + 32.0\), while the time and salinity columns are left unchanged (they are multiplied by 1 and offset by 0).
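To see the expansion spelled out, the sketch below builds the 4-by-3 versions by hand with np.tile and checks that they give the same result. This is only a way to picture the behavior: the names multiplier_full and offset_full are illustrative, and NumPy does not need to materialize these repeated copies internally.

```python
import numpy as np

measurement = np.array([
    [3.0, 8.9, 31.0],
    [9.0, 9.3, 30.7],
    [15.0, 9.6, 30.5],
    [21.0, 9.0, 30.9],
])
multiplier = [1, 9.0 / 5, 1]
offset = [0, 32.0, 0]

# Repeat each 3-element row 4 times to form explicit 4-by-3 arrays
multiplier_full = np.tile(multiplier, (4, 1))
offset_full = np.tile(offset, (4, 1))
print(multiplier_full.shape)    # (4, 3)

# The explicit and the automatic versions agree
explicit = multiplier_full * measurement + offset_full
implicit = multiplier * measurement + offset
print(np.allclose(explicit, implicit))    # True
```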

Incidentally, another way to perform the above calculation is:

measurement[:, 1] = (9.0 / 5) * measurement[:, 1] + 32
print(measurement)
[[ 3.   48.02 31.  ]
 [ 9.   48.74 30.7 ]
 [15.   49.28 30.5 ]
 [21.   48.2  30.9 ]]

The difference is that this calculation modifies measurement in place, whereas the previous expression returns a new array and leaves measurement unchanged.
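If the original Celsius readings are still needed, one option is to run the in-place calculation on a copy. A minimal sketch, where the name converted is chosen for illustration:

```python
import numpy as np

measurement = np.array([
    [3.0, 8.9, 31.0],
    [9.0, 9.3, 30.7],
    [15.0, 9.6, 30.5],
    [21.0, 9.0, 30.9],
])

# Work on a copy so that the Celsius readings remain available
converted = measurement.copy()
converted[:, 1] = (9.0 / 5) * converted[:, 1] + 32

print(measurement[0, 1])             # 8.9 (still in Celsius)
print(f"{converted[0, 1]:.2f}")      # 48.02 (Fahrenheit)
```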