Arrays

Usage

The Python list and dict objects are not suitable to manage multi-dimensional arrays.

Instead, the Numpy (Numerical Python) library should be used. It allows to:

  • Create multi-dimensional arrays

  • Access array attributes (shape, number of dimensions, data types)

  • Manipulate arrays (changing shape, tiling)

  • Manipulate missing values

  • Perform optimized numerical operations (pointwise or matrix) via broadcasting (no loops)

Array manipulation

Array initialisation

There are several methods to initialize 1D arrays. arange takes an argument a start, end end stride argument.

import numpy as np 

x = np.arange(0, 10, 1) # equivalent to Matlab 0:1:9
x
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

linspace takes a start, end and a number of elements:

x = np.linspace(10, 20, 41) # 10 to 20 with 41 values
x
array([10.  , 10.25, 10.5 , 10.75, 11.  , 11.25, 11.5 , 11.75, 12.  ,
       12.25, 12.5 , 12.75, 13.  , 13.25, 13.5 , 13.75, 14.  , 14.25,
       14.5 , 14.75, 15.  , 15.25, 15.5 , 15.75, 16.  , 16.25, 16.5 ,
       16.75, 17.  , 17.25, 17.5 , 17.75, 18.  , 18.25, 18.5 , 18.75,
       19.  , 19.25, 19.5 , 19.75, 20.  ])

array allows to init from list for instance:

x = np.array([1, 2, 3, 4, 5]) # initialisation from list 
x
array([1, 2, 3, 4, 5])

Initialization of ND arrays can be achieved using different methods, which take as arguments the shape of the array and eventually the type.

x = np.ones((2, 3, 8), dtype=int) # init with 1 integers
x
array([[[1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1]],

       [[1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1]]])
x = np.zeros((2, 3, 8), dtype=float) # init with 0 floats 
x
array([[[0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.]]])
x = np.full((2, 3, 8), -99, dtype=float)  # fill array with 10.1
print(x)
[[[-99. -99. -99. -99. -99. -99. -99. -99.]
  [-99. -99. -99. -99. -99. -99. -99. -99.]
  [-99. -99. -99. -99. -99. -99. -99. -99.]]

 [[-99. -99. -99. -99. -99. -99. -99. -99.]
  [-99. -99. -99. -99. -99. -99. -99. -99.]
  [-99. -99. -99. -99. -99. -99. -99. -99.]]]

Changing data type

Changing data type is done by using the astype function:

x = x.astype(int)  # conversion from floats to int
x
array([[[-99, -99, -99, -99, -99, -99, -99, -99],
        [-99, -99, -99, -99, -99, -99, -99, -99],
        [-99, -99, -99, -99, -99, -99, -99, -99]],

       [[-99, -99, -99, -99, -99, -99, -99, -99],
        [-99, -99, -99, -99, -99, -99, -99, -99],
        [-99, -99, -99, -99, -99, -99, -99, -99]]])

Getting attributes

Numpy arrays have several attributes. To get the number of dimensions:

x.ndim
3

to get the shape:

x.shape
(2, 3, 8)

to get the data type:

x.dtype
dtype('int64')

Indexing

Indexing is done as for lists, except that it can be done along several dimensions:

x = np.ones((10, 20, 15, 30), dtype=float)
x[:, :, 0, :].shape
(10, 20, 30)
x[:, -1, ::2, ::2].shape
(10, 8, 15)
x[:, :, 0:1, ::3].shape
(10, 20, 1, 10)

Using ..., you can also access an array without fully knowing it’s shape:

x[..., 0].shape 
(10, 20, 15)
x[..., 2:5, 0].shape 
(10, 20, 3)
x[-1, ...].shape
(20, 15, 30)
x[::2, ::3,  ...].shape
(5, 7, 15, 30)

numpy.array can be used to index numpy.array, but prefer using slice instead. Indeed, using the former may be error prone.

For instance, let’s create a 2D array.

z = np.arange(20 * 10)
z
array([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,
        13,  14,  15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,
        26,  27,  28,  29,  30,  31,  32,  33,  34,  35,  36,  37,  38,
        39,  40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  50,  51,
        52,  53,  54,  55,  56,  57,  58,  59,  60,  61,  62,  63,  64,
        65,  66,  67,  68,  69,  70,  71,  72,  73,  74,  75,  76,  77,
        78,  79,  80,  81,  82,  83,  84,  85,  86,  87,  88,  89,  90,
        91,  92,  93,  94,  95,  96,  97,  98,  99, 100, 101, 102, 103,
       104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116,
       117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129,
       130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142,
       143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155,
       156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168,
       169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181,
       182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194,
       195, 196, 197, 198, 199])
x = np.reshape(z, (20, 10))
x.shape
(20, 10)

For extracting a subspan of an array, you might want to use:

i = [0, 1, 2, 3]
j = [5, 6, 7, 8]
x[i, j]
array([ 5, 16, 27, 38])

In this case, the output is 1D. It extracts the value for each (i, j) pair and is equivalent to:

x[i[0], j[0]], x[i[1], j[1]], x[i[2], j[2]], x[i[3], j[3]]
(5, 16, 27, 38)

Beside, this will fail if i and j have different sizes. The proper syntaxt would be:

i = slice(0, 4)
j = slice(5, 9)
x[i,j]
array([[ 5,  6,  7,  8],
       [15, 16, 17, 18],
       [25, 26, 27, 28],
       [35, 36, 37, 38]])

The values obtained in the first try are the elements in the diagonal of the subspan.

Copy

As for list, assigment of numpy arrays should not be used to make copies, since the reference is copied, not the values. In the following, the modification of x modifies y, and conversely:

x = np.arange(10)
y = x
y[0] = -1
x[-1] = 1000
x, y
(array([  -1,    1,    2,    3,    4,    5,    6,    7,    8, 1000]),
 array([  -1,    1,    2,    3,    4,    5,    6,    7,    8, 1000]))

To make copies, use the copy method:

x = np.arange(10)
y = x.copy()  # deep copy of x into y 
y[0] = -1
x[-1] = 1000
x, y
(array([   0,    1,    2,    3,    4,    5,    6,    7,    8, 1000]),
 array([-1,  1,  2,  3,  4,  5,  6,  7,  8,  9]))

Reshaping

If we create a 2D array:

x = np.zeros((6, 4))
for i in range(6):
    x[i, :] = np.arange(4) + i*4
x.shape
(6, 4)

The conversion into 1D can be achieved using ravel:

x1dc = np.ravel(x)   # converts ND to 1D following C order
x1dc
array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.,
       13., 14., 15., 16., 17., 18., 19., 20., 21., 22., 23.])

By default, the order of the reshaping follows the C order. You can specify an Fortran-like order as follows:

x1df = np.ravel(x, order="F")
x1df
array([ 0.,  4.,  8., 12., 16., 20.,  1.,  5.,  9., 13., 17., 21.,  2.,
        6., 10., 14., 18., 22.,  3.,  7., 11., 15., 19., 23.])

To convert an array from a given set of dimensions to another one, use the reshape method. The number of elements in the source and destination shapes must be the same.

xresc = np.reshape(x, (2, 3, 4)) 
xresc
array([[[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]],

       [[12., 13., 14., 15.],
        [16., 17., 18., 19.],
        [20., 21., 22., 23.]]])
xresf = np.reshape(x, (2, 3, 4), order="F")
xresf
array([[[ 0.,  1.,  2.,  3.],
        [ 8.,  9., 10., 11.],
        [16., 17., 18., 19.]],

       [[ 4.,  5.,  6.,  7.],
        [12., 13., 14., 15.],
        [20., 21., 22., 23.]]])

If we recreate a 3D array as follows:

x = np.reshape(np.arange(2 * 3 * 4), (2, 3, 4))
x.shape
(2, 3, 4)

We can reshape it from ND to 2D, by keeping one dimension unchanged, as follows:

xtest = np.reshape(x, (2, -1))
xtest.shape
(2, 12)

Here, we keep the first dimension but transform (3, 4) in 12.

xtest2 = np.reshape(x, (-1, 4))
xtest2.shape
(6, 4)

Here, we keep the last dimension but transform (2, 3) in 6.

Repeating along a given dimensions

To repeat an array along a dimension, use the tile function. If we create a 1D array:

x = np.zeros((10, 20, 30))
x.shape
(10, 20, 30)

We can repeat it 40 times as follows:

xtile = np.tile(x, (40, 1, 1, 1))
xtile.shape
(40, 10, 20, 30)

Note that the dimension argument of tile must always finish by 1. I.e. the array can only be extended along the first dimensions. If we try to extend the above array along the last dimension as follows:

xtile2 = np.tile(x, (1, 1, 1, 40))
xtile2.shape
(1, 10, 20, 1200)

Here the result has the wrong dimension. To make it work, the first syntax must be used with transpose:

xtile2 = np.tile(x, (40, 1, 1, 1))
xtile2.shape
(40, 10, 20, 30)
xtile2 = np.transpose(xtile2, (1, 2, 3, 0))
xtile2.shape
(10, 20, 30, 40)

Changing dimension order

x = np.ones((5, 10, 15))
x.shape
(5, 10, 15)

Changing dimension orders is achieved using the transpose method. By default, it reverses the dimensions.

xt = x.T
xt.shape
(15, 10, 5)
xt = np.transpose(x) # identical to xt = xt.T
xt.shape
(15, 10, 5)

But you can specify the new order as an argument:

xt = np.transpose(x, (2, 0, 1))
xt.shape
(15, 5, 10)

Testing conditions

xresc = np.reshape(np.arange(24), (2, 3, 4))

Testing conditions can be done by using operators on the array. It returns an array of boolean:

condition = (xresc <= 12) & (xresc >= 3)  # array of boolean (True if condition is met)
condition
array([[[False, False, False,  True],
        [ True,  True,  True,  True],
        [ True,  True,  True,  True]],

       [[ True, False, False, False],
        [False, False, False, False],
        [False, False, False, False]]])

To display the values when the condition is met:

xresc[condition]
array([ 3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

To display the values when condition is not met:

xresc[~condition]
array([ 0,  1,  2, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23])

To get the indexes where the condition is met, use the nonzero method. It returns a tuple with the indexes where the condition is met for each axes.

ind = np.nonzero(condition)
ind
(array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1]),
 array([0, 1, 1, 1, 1, 2, 2, 2, 2, 0]),
 array([3, 0, 1, 2, 3, 0, 1, 2, 3, 0]))

Therefore, the condition is met for xresc[0, 0, 3], xresc[0, 1, 0], ..., xresc[0, 2, 3], xresc[1, 0, 0]

You can also get the values where condition is met using the indexes:

xresc[ind]
array([ 3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

To extract the unique occurrences in an array, use the unique value:

x = np.array([0, 0, 0, 1, 2, 2, 3, 3, 3, 3])
np.unique(x)
array([0, 1, 2, 3])

Concatenation

Concatenation of arrays is achieved by usinging the concatenate function, which takes as arguments the arrays to concatenate, and the dimension along which to concatenate:

x = np.zeros((2, 3, 5))
y = np.ones((2, 7, 5))
x.shape, y.shape
((2, 3, 5), (2, 7, 5))

To concatenate the arrays along their 2nd dimension:

z = np.concatenate((x, y), axis=1)  
z.shape
(2, 10, 5)

Operations using arrays

Arhtimetical operations

In Python, standard operations are performed pointwise (contrary to Matlab for instance). To do matrix operations, you need to call specific methods (numpy.matmul for instance).

x = np.array([1, 2, 3, 4, 5]).astype(float)
x
array([1., 2., 3., 4., 5.])
y = np.array([6, 7, 8, 9, 10]).astype(float)
y
array([ 6.,  7.,  8.,  9., 10.])
x * y
array([ 6., 14., 24., 36., 50.])
y / x
array([6.        , 3.5       , 2.66666667, 2.25      , 2.        ])
y % x
array([0., 1., 2., 1., 0.])
y // x
array([6., 3., 2., 2., 2.])
x**y
array([1.000000e+00, 1.280000e+02, 6.561000e+03, 2.621440e+05,
       9.765625e+06])

In the above, the / operator returns the right value. However, if x has 0 values:

x[2] = 0.
x[3] = -1.
x
array([ 1.,  2.,  0., -1.,  5.])
z = y / x
z
/home/barrier/Softwares/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:1: RuntimeWarning: divide by zero encountered in true_divide
  """Entry point for launching an IPython kernel.
array([ 6. ,  3.5,  inf, -9. ,  2. ])

Here, we have a warning message, and the output array contains an inf value. To avoid this, we can use the divide method with the where and out argument.

The first one specifies the condition when the operation should be applied.

The second one specifies an output array which will be used to replace missing values. In this case, the default output will be an array of NaNs of the same size as x.

z = np.divide(y, x, where=(x!=0), out=np.full(x.shape, np.nan, dtype=x.dtype))  # no more warning message
z
array([ 6. ,  3.5,  nan, -9. ,  2. ])

Here, we have no more warnings and inf has been replaced by nan. The same thing can be applied to other functions such as log10:

w = np.log10(x)
w
/home/barrier/Softwares/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:1: RuntimeWarning: divide by zero encountered in log10
  """Entry point for launching an IPython kernel.
/home/barrier/Softwares/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:1: RuntimeWarning: invalid value encountered in log10
  """Entry point for launching an IPython kernel.
array([0.     , 0.30103,    -inf,     nan, 0.69897])
w = np.log10(x, where=(x > 0), out=np.full(x.shape, np.nan, dtype=x.dtype))
w
array([0.     , 0.30103,     nan,     nan, 0.69897])

Mathematical operations (mean, etc.)

Here we will work on the given array:

x = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
             [7, 1, 3, 5, 6, 7, 8, 10, 4, 6],
             [5, 1, 0, 3, 5, 8, 12, 5, 8, 1]])
x.shape
(3, 10)

To compute the mean over the entire array:

out = np.mean(x)
out
5.333333333333333

To compute the standard deviation over second dimension:

out = np.std(x, axis=1) 
out
array([2.87228132, 2.45153013, 3.57211422])

To compute the sum along the first dimension:

out = np.sum(x, axis=0)
out
array([13,  4,  6, 12, 16, 21, 27, 23, 21, 17])

Same thing for cumulated sum:

out = np.cumsum(x, axis=0) # compute sum over first dimension
out
array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
       [ 8,  3,  6,  9, 11, 13, 15, 18, 13, 16],
       [13,  4,  6, 12, 16, 21, 27, 23, 21, 17]])

Same thing for product:

out = np.prod(x, axis=0) 
out
array([ 35,   2,   0,  60, 150, 336, 672, 400, 288,  60])

This also works for multi-dimensional arrays:

x = np.reshape(np.arange(24), (2, 3, 4))  # lat, lon, time
x.shape
(2, 3, 4)

To compute the mean along the first and 2nd dimensions:

xmean = np.mean(x, axis=(0, 1))
xmean.shape
(4,)

To compute the mean along the 2nd and 3rd dimensions:

xmean = np.mean(x, axis=(1, 2))
xmean.shape
(2,)

Broadcasting

To make computation, it is highly advised to avoid loops. Numpy provides different broadcasting rules which allow to prevent from using loops (cf. docs.scipy)

Imagine that we have a ND array, with the dimensions being (time, lat, lon):

x = np.reshape(np.arange(24), (2, 3, 4))  # time, lat, lon
x.shape
(2, 3, 4)

If we want to compute the anomalies, we first compute the temporal mean:

xmean = np.mean(x, axis=0) 
xmean.shape
(3, 4)

Now we can compute the anomalies

anom = x - xmean
anom.shape
(2, 3, 4)

In this case, the code above works even though the x and xmean have different shapes. This is because the last two dimensions (i.e. trailing dimensions) are the same. Now, imagine that the array is in fact ordered as (lat, lon, time).

We first compute the mean over the time-dimensions:

xmean = np.mean(x, axis=-1)    # temporal mean
xmean.shape
(2, 3)

However, the computation of the anomalies will fail:

#anom = x - xmean  

It fails because the first (i.e leading) dimensions are the same, which does not allow broadcasting. This can be fixed by using the keepdims argument. It will keep a virtual dimension on the mean output.

xmean = np.mean(x, axis=-1, keepdims=True)
xmean.shape
(2, 3, 1)
anom = x - xmean

If you are lazy to remember the broadcasting rules, you can use the numpy.newaxis method to add virtual dimensions. It allows to add degenerated dimensions on an array. If we look at the shape of our x array:

x.shape
(2, 3, 4)

If we create a 6D array of dimensions:

y = np.zeros((5, 2, 7, 3, 10, 4))
y.shape
(5, 2, 7, 3, 10, 4)

To multiply x and y without loops, we can add degenerated dimenions to x in order to align them with the dimensions of y:

xnd = x[np.newaxis, :, np.newaxis, :, np.newaxis, :]
xnd.shape
(1, 2, 1, 3, 1, 4)

We can see that now the shapes of the xnd array and of y are aligned:

y.shape
(5, 2, 7, 3, 10, 4)

Now we can multiply both arrays:

z = y * xnd
z.shape
(5, 2, 7, 3, 10, 4)

Managing filled values

Filled values are generally defined as numpy.nan values.

x = np.arange(1, 10).astype(float)
x[0:3] = np.nan
x
array([nan, nan, nan,  4.,  5.,  6.,  7.,  8.,  9.])

Checking whether values are NaN is achieved by using the np.isnan method

cond = np.isnan(x)
cond
array([ True,  True,  True, False, False, False, False, False, False])

To find the index of the NaN values:

np.nonzero(np.isnan(x))
(array([0, 1, 2]),)

An equivalent to the np.isnan method is to use the mathematical definition of NaNs.

np.nonzero(x != x) # Mathematical definition of NaN
(array([0, 1, 2]),)

Warning: NaNs can only be used with Float arrays (not integer arrays)

Operations on filled values

To perform operations on data containing NaNs requires the use of special functions:

np.nanmean(x)
6.5
np.nansum(x)
39.0
np.nanstd(x)
1.707825127659933

Since this can be a bit annoying, numpy.array objects can be converted into masked_array objects, whose operations automatically discards NaN.

Masked array

x = np.arange(0, 10).astype(int)
x
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

To convert an array into a masked array, the following methods can be used:

x = np.ma.masked_where(x==0, x)
x
masked_array(data=[--, 1, 2, 3, 4, 5, 6, 7, 8, 9],
             mask=[ True, False, False, False, False, False, False, False,
                   False, False],
       fill_value=999999)
x = np.ma.masked_equal(x, 5)
x
masked_array(data=[--, 1, 2, 3, 4, --, 6, 7, 8, 9],
             mask=[ True, False, False, False, False,  True, False, False,
                   False, False],
       fill_value=5)
x = np.ma.masked_greater(x, 7)
x
masked_array(data=[--, 1, 2, 3, 4, --, 6, 7, --, --],
             mask=[ True, False, False, False, False,  True, False, False,
                    True,  True],
       fill_value=5)
x = np.ma.masked_less(x, 3)
x
masked_array(data=[--, --, --, 3, 4, --, 6, 7, --, --],
             mask=[ True,  True,  True, False, False,  True, False, False,
                    True,  True],
       fill_value=5)
x = np.ma.masked_where(np.isnan(x), x)
x
masked_array(data=[--, --, --, 3, 4, --, 6, 7, --, --],
             mask=[ True,  True,  True, False, False,  True, False, False,
                    True,  True],
       fill_value=5)
x[2] = np.ma.masked
x
masked_array(data=[--, --, --, 3, 4, --, 6, 7, --, --],
             mask=[ True,  True,  True, False, False,  True, False, False,
                    True,  True],
       fill_value=5)

This new object has additional attributes and method. Especially for assessing where the filled values are located.

x.mask
array([ True,  True,  True, False, False,  True, False, False,  True,
        True])
np.ma.getmaskarray(x)
array([ True,  True,  True, False, False,  True, False, False,  True,
        True])
np.mean(x)
5.0
np.sum(x)
20
np.std(x)
1.5811388300841898

It is strongly advised to use the np.ma.getmaskarray method rather than using the mask attribute. Indeed, the latter will return a bool instead of an array of bool if no data is missing.

# warning for masked_arrays
x = np.arange(10, 15).astype(float)
x = np.ma.masked_where(np.isnan(x), x)
x
masked_array(data=[10., 11., 12., 13., 14.],
             mask=False,
       fill_value=1e+20)

Therefore, extracting all non missing values from an array is not safe using mask:

iok = np.nonzero(x.mask == False) 
x[iok]
masked_array(data=[10.],
             mask=False,
       fill_value=1e+20)

In this case, only the first value is returned, instead of all values.

However, it works perfectly using the getmaskarray method:

iok = np.nonzero(np.ma.getmaskarray(x) == False)
x[iok]
masked_array(data=[10., 11., 12., 13., 14.],
             mask=False,
       fill_value=1e+20)

Scientific Python

Although the Numpy library allows to do some operations, it is rather limited. Other mathematical functions are provided by the Scipy library.

In an ideal world, NumPy would contain nothing but the array data type and the most basic operations: indexing, sorting, reshaping, basic elementwise functions, etc. All numerical code would reside in SciPy. [...] If you are doing scientific computing with Python, you should probably install both NumPy and SciPy. Most new features belong in SciPy rather than NumPy. Source: Scipy FAQ

Submodules

Scipy comes with numerous submodules, which are listed below (source: scipy.org)

Description

Module

Special functions (mathematical physics)

scipy.special

Integration

scipy.integrate

Optimization

scipy.optimize

Interpolation

scipy.interpolate

Fourier Transforms

scipy.fft

Signal Processing

scipy.signal

Linear Algebra

scipy.linalg

Spatial data structures and algorithms

scipy.spatial

Statistics

scipy.stats

Multidimensional image processing

scipy.ndimage

File IO: Matlab, NetCDF, IDL

scipy.io

Since you all are fluent in using Numpy, I leave it to you the exploration of Scipy...

Loops

As highlighted in the above, it is highly recommended to avoid the use of loops. However, if needed, the looping method in arrays is described below:

Multi-arrays in computer memory

Computer memory is inherently linear, i.e. multi-dimensional arrays are stored in memory as one-dimensional arrays. This can be done in two ways:

  • Row-major order: C/C++, Python

  • Column-major order: Fortran, Matlab, R, Julia

Row-major order:

Dictionaries

Column-major order:

Dictionaries

Loops

This has implications when writting loops. Indeed, imbricated loops should be consistent with the memory ordering:

Row-order (Python)

import numpy as np
x = np.empty((5, 10))
for i in range(5):  # inner loop: 1st dim
    for j in range(10):  # outer loop: last dim
        print(x[i, j])

Column-order (Julia)

x = zeros(Int8, 5, 10)
for j = 1:10   # inner loop: last dim
    for i = 1:5  # outer loop: 1st dim
        println(x[i, j])
    end
end

Sources: Wikipedia, thegreenplace

Let’s see with a quick example, using the following array

import numpy as np

shape = (30, 40, 50, 800)
x = np.random.rand(shape[0], shape[1], shape[2], shape[3]).astype(float)
x.shape
(30, 40, 50, 800)
%%time
total = 0
N = 0
for i in range(shape[0]):
    for j in range(shape[1]):
        for k in range(shape[2]):
            for l in range(shape[3]):
                total += x[i, j, k, l]
                N += 1
total /= N
total
CPU times: user 18.4 s, sys: 2 ms, total: 18.4 s
Wall time: 18.4 s
0.49996726640667866
%%time
total = 0
N = 0
for l in range(shape[3]):
    for k in range(shape[2]):
        for j in range(shape[1]):
            for i in range(shape[0]):
                total += x[i, j, k, l]
                N += 1
total /= N
total
CPU times: user 24.2 s, sys: 36 ms, total: 24.3 s
Wall time: 24.3 s
0.4999672664065223

The last loop is slower than the first one because the loop order is not consistent with the C-order used in Python.

Note: The np.ndenumerate method alows to loop in an array without risk.

cpt = 0
for k, v in np.ndenumerate(x):
    if(cpt == 15):
        break
    print(k, x[k])
    cpt += 1
(0, 0, 0, 0) 0.42457975076903165
(0, 0, 0, 1) 0.5377794661947678
(0, 0, 0, 2) 0.9744969920065903
(0, 0, 0, 3) 0.740293458634836
(0, 0, 0, 4) 0.13934592348230423
(0, 0, 0, 5) 0.09461858849650584
(0, 0, 0, 6) 0.4775997141298671
(0, 0, 0, 7) 0.8553440450168308
(0, 0, 0, 8) 0.1048005048148104
(0, 0, 0, 9) 0.885842543934454
(0, 0, 0, 10) 0.4107712279021991
(0, 0, 0, 11) 0.6931539244334584
(0, 0, 0, 12) 0.47570418127062775
(0, 0, 0, 13) 0.012965744408408142
(0, 0, 0, 14) 0.9019989978361336

The np.nditer allows to so similar things:

cpt = 0
for v in np.nditer(x):
    if(cpt == 15):
        break
    print(v)
    cpt += 1
0.42457975076903165
0.5377794661947678
0.9744969920065903
0.740293458634836
0.13934592348230423
0.09461858849650584
0.4775997141298671
0.8553440450168308
0.1048005048148104
0.885842543934454
0.4107712279021991
0.6931539244334584
0.47570418127062775
0.012965744408408142
0.9019989978361336

Note that the use of nditer is read-only. To modify the input array:

cpt = 0
with np.nditer(x, op_flags=['readwrite']) as it:
    for v in it:
        if(cpt == 15):
            break
        v[...] = 2 * v
        cpt += 1
cpt = 0
for v in np.nditer(x):
    if(cpt == 15):
        break
    print(v)
    cpt += 1
0.8491595015380633
1.0755589323895356
1.9489939840131807
1.480586917269672
0.27869184696460847
0.18923717699301168
0.9551994282597343
1.7106880900336616
0.2096010096296208
1.771685087868908
0.8215424558043982
1.3863078488669167
0.9514083625412555
0.025931488816816284
1.8039979956722672

For more details about iteration, visit arrays.nditer.

Since the loops are not efficient in Python, it is better not to use them! Therefore, use vectorized functions.

%%time
tot = x.mean()
tot
CPU times: user 26.6 ms, sys: 15 µs, total: 26.7 ms
Wall time: 26.6 ms
0.49996742743360034