Multi-dimensional arrays in numpy

# initialization
import numpy as np

Some examples

Often we need to model data that has more than one dimension. For example:

Suppose we measure the ocean surface temperature over a range of latitudes and longitudes. The resulting data is naturally two-dimensional. If we include depth, the data becomes three-dimensional.
Suppose we measure how temperature and salinity vary with time at a particular location. The timestamps, temperature readings, and salinity readings can each be represented by a 1D array of size n, but we may opt to put them together as an n-by-3 array instead.
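As a minimal sketch of the second example, three 1D arrays of equal length can be combined into a single n-by-3 array with np.column_stack. The readings and variable names below are invented purely for illustration.

```python
import numpy as np

# Hypothetical readings, made up for illustration only
time = np.array([0.0, 6.0, 12.0, 18.0])        # hours
temperature = np.array([9.1, 9.4, 9.8, 9.2])   # degrees Celsius
salinity = np.array([31.0, 30.8, 30.6, 30.9])

# Use each 1D array as one column of a 2D array
combined = np.column_stack([time, temperature, salinity])

print(combined.shape)  # (4, 3): n = 4 rows, 3 columns
```

Each row of combined then holds one complete measurement, and each column holds one quantity.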

In numpy, a multi-dimensional array can be created from a regularly nested list, e.g.,

x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])
print(x)

[[1 3 5 7]
 [2 4 6 8]]

Note that this array has 8 elements arranged in a 2-row by 4-column structure. If we look at its attributes, we get:

x.size

8

x.shape

(2, 4)

x.ndim

2

Thus, the first element of x.shape is the number of rows, while the second element is the number of columns.
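The three attributes are related to one another. The short check below is only a sketch; the names n_rows and n_cols are chosen here for illustration.

```python
import numpy as np

x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])

# The shape is a tuple, so it can be unpacked
n_rows, n_cols = x.shape
print(n_rows, n_cols)             # 2 4

# ndim is the length of the shape tuple
print(x.ndim == len(x.shape))     # True

# size is the product of the entries of the shape
print(x.size == n_rows * n_cols)  # True
```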

Next, here is an example of a 3-dimensional array (note the use of optional indentation to make the structure of the array clearer):

y = np.array([
    [
        [1, 2, 3, 4],
        [5, 6, 7, 8]
    ],[
        [-2, -1, 1, 2],
        [-4, -3, 3, 4]
    ],[
        [0, 1, 0, 1],
        [1, 1, 0, 0]
    ]
    
])

print(y)

[[[ 1  2  3  4]
  [ 5  6  7  8]]

 [[-2 -1  1  2]
  [-4 -3  3  4]]

 [[ 0  1  0  1]
  [ 1  1  0  0]]]

We can think of y as consisting of 3 “sheets”, each sheet having 2 rows and 4 columns. The corresponding size, shape, and number of dimensions of y are:

y.size

24

y.shape

(3, 2, 4)

y.ndim

3

Indexing a multi-dimensional array

An n-dimensional array can be indexed with n comma-separated entries, each of which is either an integer or a slice. For example, to obtain the element in the 2nd row (index 1) and 3rd column (index 2) of x, which is 6, we can write:

x[1, 2]

Similarly, if we want to extract the 2nd and 3rd columns of every row, we can do:

x[:, 1:3]

array([[3, 5],
       [4, 6]])

Note that the first : says “extract all rows”, while the 1:3 says “extract columns starting from index 1, up to but not including index 3”.

The same idea extends to n-dimensional arrays. Say we want to obtain the 2nd and 3rd columns of every row of the first two sheets. We may do:

y[:2, :, 1:3]

array([[[ 2,  3],
        [ 6,  7]],

       [[-1,  1],
        [-3,  3]]])
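A short sketch of how integers and slices affect the shape of the result: each integer index removes one dimension, while each slice keeps its dimension (even when the slice has length 1). If fewer than n entries are given, the remaining trailing dimensions are taken in full.

```python
import numpy as np

y = np.array([
    [[1, 2, 3, 4], [5, 6, 7, 8]],
    [[-2, -1, 1, 2], [-4, -3, 3, 4]],
    [[0, 1, 0, 1], [1, 1, 0, 0]],
])

print(y[1, 0, 2])          # three integers give a single element: 1
print(y[1].shape)          # one integer gives a whole sheet: (2, 4)
print(y[:, 0, :].shape)    # first row of every sheet: (3, 4)
print(y[:, :, 1:2].shape)  # a length-1 slice keeps its axis: (3, 2, 1)
```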

The above syntax for indexing and slicing can also be used to assign values to the elements of an array. For example, we may overwrite x as follows:

x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])
x[1, 2] = 0

print(x)

[[1 3 5 7]
 [2 4 0 8]]

We can even assign multiple values at once, e.g.,

x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])
x[:, 1] = [-5, -2]

print(x)

[[ 1 -5  5  7]
 [ 2 -2  6  8]]
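As a further sketch, the right-hand side of such an assignment may be a single scalar, which is then written to every selected element, or a sequence whose shape matches the selection.

```python
import numpy as np

x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])

# A scalar is assigned to every selected element
x[:, 1:3] = 0
print(x)
# [[1 0 0 7]
#  [2 0 0 8]]

# A list matching the selection is assigned element by element
x[0, :] = [10, 20, 30, 40]
print(x)
# [[10 20 30 40]
#  [ 2  0  0  8]]
```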

Arithmetic with multi-dimensional arrays

The arithmetic operators +, -, *, /, //, %, and ** can operate between two arrays of the same shape, in which case the operation is carried out element by element. For example:

x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])
z = np.array([[1, 2, 0, 1], [0, 2, 1, 2]])

x ** z

array([[ 1,  9,  1,  7],
       [ 1, 16,  6, 64]])

Moreover, the arithmetic operators can also operate between a scalar and an n-dimensional array:

x / 2

array([[0.5, 1.5, 2.5, 3.5],
       [1. , 2. , 3. , 4. ]])
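A few more operators, sketched on the same arrays for comparison. Note that true division (/) yields floating-point results even for integer input, while floor division (//) keeps the integer type.

```python
import numpy as np

x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])
z = np.array([[1, 2, 0, 1], [0, 2, 1, 2]])

print(x + z)    # element-wise sum of two arrays of the same shape
print(x % 3)    # remainder of each element after division by 3
print(x // 2)   # floor division of each element by 2

print((x // 2).dtype.kind)  # 'i': still an integer array
print((x / 2).dtype.kind)   # 'f': true division gives floats
```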

More generally, suppose you have an n-dimensional array (say y) and an m-dimensional array (say x), where m < n. As long as the last m dimensions of y agree with the dimensions of x, you can also operate between the two. The array x behaves as if it were expanded to the same shape as y, with its data repeated identically as needed, and the operation is carried out between y and this expanded version of x. This behavior is known as broadcasting.

As an example, suppose we have 4 measurements of time (in fractional hours), temperature (in °C), and salinity, packed as a 4-by-3 array:
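To illustrate with the arrays defined earlier: y has shape (3, 2, 4) and x has shape (2, 4), so the last two dimensions of y agree with the shape of x. The sketch below also includes a hypothetical array w whose shape does not agree, in which case NumPy raises a ValueError.

```python
import numpy as np

y = np.array([
    [[1, 2, 3, 4], [5, 6, 7, 8]],
    [[-2, -1, 1, 2], [-4, -3, 3, 4]],
    [[0, 1, 0, 1], [1, 1, 0, 0]],
])
x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])

# x is applied to every sheet of y
total = y + x
print(total.shape)                         # (3, 2, 4)
print(np.array_equal(total[0], y[0] + x))  # True

# w has 3 elements, but the last dimension of y has length 4
w = np.array([1, 2, 3])
try:
    y + w
except ValueError as err:
    print("shapes do not agree:", err)
```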

measurement = np.array([
    [3.0, 8.9, 31.0],
    [9.0, 9.3, 30.7],
    [15.0, 9.6, 30.5],
    [21.0, 9.0, 30.9]
])

To convert the temperature to Fahrenheit, we may define:

multiplier = [1, 9.0 / 5, 1]
offset = [0, 32.0, 0]

This allows us to write:

multiplier * measurement + offset

array([[ 3.  , 48.02, 31.  ],
       [ 9.  , 48.74, 30.7 ],
       [15.  , 49.28, 30.5 ],
       [21.  , 48.2 , 30.9 ]])

The idea is that both multiplier and offset are expanded to become 4-by-3 arrays, with each row repeating the same content. The end effect is that each temperature element is converted as \(T_F = (9.0 / 5.0) \cdot T_C + 32.0\), while the time and salinity columns are left unchanged (multiplied by 1 and offset by 0).
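One way to see the expansion explicitly is np.broadcast_to, which produces the expanded form of an array for a given target shape. This is only a sketch to visualize what happens; the names multiplier_full and offset_full are introduced here for illustration, and the explicit expansion is not needed in practice.

```python
import numpy as np

measurement = np.array([
    [3.0, 8.9, 31.0],
    [9.0, 9.3, 30.7],
    [15.0, 9.6, 30.5],
    [21.0, 9.0, 30.9],
])
multiplier = [1, 9.0 / 5, 1]
offset = [0, 32.0, 0]

# Expanded 4-by-3 versions: every row repeats the same three values
multiplier_full = np.broadcast_to(multiplier, measurement.shape)
offset_full = np.broadcast_to(offset, measurement.shape)
print(multiplier_full)

# Explicit and automatic expansion give the same result
explicit = multiplier_full * measurement + offset_full
implicit = multiplier * measurement + offset
print(np.allclose(explicit, implicit))  # True
```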

Incidentally, another way to perform the above calculation is:

measurement[:, 1] = (9.0 / 5) * measurement[:, 1] + 32
print(measurement)

[[ 3.   48.02 31.  ]
 [ 9.   48.74 30.7 ]
 [15.   49.28 30.5 ]
 [21.   48.2  30.9 ]]

The difference is that this calculation modifies the array in place.
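The distinction can be checked directly. In the sketch below, original and converted are names introduced for illustration: the broadcast expression returns a new array and leaves measurement untouched, whereas the slice assignment overwrites the temperature column.

```python
import numpy as np

measurement = np.array([
    [3.0, 8.9, 31.0],
    [9.0, 9.3, 30.7],
    [15.0, 9.6, 30.5],
    [21.0, 9.0, 30.9],
])
original = measurement.copy()  # independent copy for comparison

# The broadcast expression builds a new array
converted = [1, 9.0 / 5, 1] * measurement + [0, 32.0, 0]
print(np.array_equal(measurement, original))  # True: unchanged

# The slice assignment overwrites column 1 of measurement itself
measurement[:, 1] = (9.0 / 5) * measurement[:, 1] + 32
print(np.array_equal(measurement, original))  # False: modified
print(np.allclose(measurement, converted))    # True: same values
```

If the original readings are still needed afterwards, the first form (or an explicit copy) is the safer choice.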
