Multi-dimensional arrays in numpy#
# initialization
import numpy as np
Some examples#
Often we need to model data that has more than one dimension. For example:
Suppose we measure the ocean surface temperature over a range of latitudes and longitudes. The resulting data is naturally two-dimensional. If we include depth, the data will become three-dimensional.
Suppose we measure the variations of temperature and salinity at a particular location over time. The timestamps, temperature readings, and salinity readings can each be represented by a 1D array of size n, but we may opt to put these together as an n-by-3 array instead.
In numpy, a multi-dimensional array can be created from a regularly nested list, e.g.,
x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])
print(x)
[[1 3 5 7]
[2 4 6 8]]
Note that this array has 8 elements arranged in a 2-row by 4-column structure. If we look at its attributes, we get:
x.size
8
x.shape
(2, 4)
x.ndim
2
Thus, the first element of x.shape is the number of rows, while the second element is the number of columns.
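As a quick check of how these three attributes relate, here is a minimal sketch. The names `a`, `n_rows`, `n_cols`, and `total` are illustrative and not part of the example above.

```python
import numpy as np

a = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])

# shape is a tuple with one entry per dimension
n_rows, n_cols = a.shape

# size is the product of the entries of shape
total = n_rows * n_cols
print(total == a.size)

# ndim is the length of the shape tuple
print(len(a.shape) == a.ndim)
```

Both comparisons should print True for any 2-dimensional array, not just this one.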
Next, here is an example of a 3-dimensional array (note the use of optional indentation to make the structure of the array clearer):
y = np.array([
    [
        [1, 2, 3, 4],
        [5, 6, 7, 8]
    ],[
        [-2, -1, 1, 2],
        [-4, -3, 3, 4]
    ],[
        [0, 1, 0, 1],
        [1, 1, 0, 0]
    ]
])
print(y)
[[[ 1  2  3  4]
  [ 5  6  7  8]]

 [[-2 -1  1  2]
  [-4 -3  3  4]]

 [[ 0  1  0  1]
  [ 1  1  0  0]]]
We can think of y as consisting of 3 “sheets”, each sheet having 2 rows and 4 columns. The corresponding size, shape, and number of dimensions of y are:
y.size
24
y.shape
(3, 2, 4)
y.ndim
3
Indexing a multi-dimensional array#
An n-dimensional array can be indexed by n indices separated by commas, where each index is either an integer or a slice. For example, if we want to obtain the element at the 2nd row (index = 1) and 3rd column (index = 2) of x, we can do:
x[1, 2]
6
Similarly, if we want to extract the 2nd and 3rd columns of every row, we can do:
x[:, 1:3]
array([[3, 5],
[4, 6]])
Note that the first : says “extract all rows”, while the 1:3 says “extract columns starting from index 1 and up to but not including index 3”.
The same idea extends to an n-dimensional array. For example, to obtain the 2nd and 3rd columns of every row of the first two sheets of y, we may do:
y[:2, :, 1:3]
array([[[ 2, 3],
[ 6, 7]],
[[-1, 1],
[-3, 3]]])
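The difference between an integer index and a slice shows up in the shape of the result. The following sketch, using illustrative names `sheet`, `kept`, and `col`, suggests that an integer index removes its dimension while a slice keeps it, even when the slice selects a single position.

```python
import numpy as np

y = np.array([
    [[1, 2, 3, 4], [5, 6, 7, 8]],
    [[-2, -1, 1, 2], [-4, -3, 3, 4]],
    [[0, 1, 0, 1], [1, 1, 0, 0]],
])

# An integer index selects one position and removes that dimension
sheet = y[0, :, :]
print(sheet.shape)

# A slice of length one selects the same data but keeps the dimension
kept = y[0:1, :, :]
print(kept.shape)

# Mixing both: last sheet (index 2), all rows, last column (index 3)
col = y[2, :, 3]
print(col)
```

Here `sheet` is a 2-by-4 array, `kept` is a 1-by-2-by-4 array, and `col` is a 1D array with one entry per row of the last sheet.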
The above syntax for indexing and slicing can also be used to assign values to the elements of an array. For example, we may overwrite a single element of x as follows:
x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])
x[1, 2] = 0
print(x)
[[1 3 5 7]
[2 4 0 8]]
We can even assign multiple values at once, e.g.,
x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])
x[:, 1] = [-5, -2]
print(x)
[[ 1 -5 5 7]
[ 2 -2 6 8]]
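Two further variations on slice assignment are sketched below: filling a slice with a single scalar, and assigning a rectangular block from a nested list of matching shape. The specific values are made up for illustration.

```python
import numpy as np

x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])

# A scalar on the right-hand side fills the whole selected slice
x[0, :] = 0

# A nested list whose shape matches the selected 2-by-2 block
# is assigned element by element
x[:, 2:4] = [[10, 20], [30, 40]]

print(x)
```

After these two assignments the first row is `[0, 0, 10, 20]` and the second row is `[2, 4, 30, 40]`.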
Arithmetic with multi-dimensional arrays#
The arithmetic operators +, -, *, /, //, %, and ** can operate element by element between two arrays of the same shape. For example:
x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])
z = np.array([[1, 2, 0, 1], [0, 2, 1, 2]])
x ** z
array([[ 1, 9, 1, 7],
[ 1, 16, 6, 64]])
The arithmetic operators can also operate between a scalar and an n-dimensional array:
x / 2
array([[0.5, 1.5, 2.5, 3.5],
[1. , 2. , 3. , 4. ]])
As it turns out, two arrays with different numbers of dimensions can also be combined. If y is an n-dimensional array and x is an m-dimensional array with m < n, the operation works as long as the last m dimensions of y agree with the dimensions of x. The array x is automatically expanded to the shape of y, with its data repeated identically along the extra leading dimensions, and the operation is carried out between y and this expanded version of x. NumPy calls this behavior broadcasting.
As an example, suppose we have 4 measurements of time (in fractional hours), temperature (in °C), and salinity, packed as a 4-by-3 array:
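To make the expansion concrete, here is a small sketch using the 2-by-4 array x and the 3-by-2-by-4 array y from earlier. It uses `np.broadcast_to` to build the expanded version of x explicitly; the names `total` and `x_expanded` are illustrative.

```python
import numpy as np

x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])    # shape (2, 4)
y = np.array([
    [[1, 2, 3, 4], [5, 6, 7, 8]],
    [[-2, -1, 1, 2], [-4, -3, 3, 4]],
    [[0, 1, 0, 1], [1, 1, 0, 0]],
])                                            # shape (3, 2, 4)

# The last two dimensions of y, (2, 4), match x.shape, so this works
total = y + x

# Explicitly expanded (read-only) view of x:
# the same 2-by-4 block repeated on each of the 3 sheets
x_expanded = np.broadcast_to(x, y.shape)

print(total.shape)
print(np.array_equal(total, y + x_expanded))
```

The result has the shape of y, and adding the explicitly expanded array gives the same values as adding x directly.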
measurement = np.array([
    [3.0, 8.9, 31.0],
    [9.0, 9.3, 30.7],
    [15.0, 9.6, 30.5],
    [21.0, 9.0, 30.9]
])
To convert the temperature column to Fahrenheit, we may define:
multiplier = [1, 9.0 / 5, 1]
offset = [0, 32.0, 0]
These allow us to compute:
multiplier * measurement + offset
array([[ 3. , 48.02, 31. ],
[ 9. , 48.74, 30.7 ],
[15. , 49.28, 30.5 ],
[21. , 48.2 , 30.9 ]])
The idea is that both multiplier and offset are expanded to become 4-by-3 arrays, with each row repeating the same content. The end effect is that the time and salinity columns are left unchanged (multiplied by 1 and offset by 0), while each temperature element is converted as \(T_F = (9.0 / 5.0) \cdot T_C + 32.0\).
Incidentally, another way to perform the above calculation is:
measurement[:, 1] = (9.0 / 5) * measurement[:, 1] + 32
print(measurement)
[[ 3. 48.02 31. ]
[ 9. 48.74 30.7 ]
[15. 49.28 30.5 ]
[21. 48.2 30.9 ]]
The difference is that this calculation modifies the array in place, whereas the previous expression builds a new array and leaves measurement untouched.
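The contrast between the two approaches can be checked directly. In the sketch below, `original` and `converted` are illustrative names, and `copy()` is used to keep an independent snapshot of the data for comparison.

```python
import numpy as np

measurement = np.array([
    [3.0, 8.9, 31.0],
    [9.0, 9.3, 30.7],
    [15.0, 9.6, 30.5],
    [21.0, 9.0, 30.9],
])
original = measurement.copy()   # independent copy kept for comparison

# Broadcasting expression: builds a new array, measurement is untouched
multiplier = np.array([1, 9.0 / 5, 1])
offset = np.array([0, 32.0, 0])
converted = multiplier * measurement + offset
print(np.array_equal(measurement, original))

# Slice assignment: overwrites the temperature column of measurement itself
measurement[:, 1] = (9.0 / 5) * measurement[:, 1] + 32
print(np.array_equal(measurement, original))
print(np.allclose(measurement, converted))
```

The first comparison should report that measurement is unchanged, the second that it now differs from the snapshot, and the third that both approaches arrive at the same values.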