Finding extremal values in a nd-array

Laurent Perrinet

2016-11-17 09:56

Sometimes, you need to pick up the $N$-th extremal values in a mutli-dimensional matrix.

Let's suppose it is represented as a nd-array (here, I further suppose you are using the numpy library from the python language). Finding extremal values is easy with argmax, argmin or argsort but this function operated on 1d vectors... Juggling around indices is sometimes not such an easy task, but luckily, we have the unravel_index function.

For those in a hurry, one quick application is that given an np.ndarray, it's eeasy to get the index of the maximal value in that array :

In [1]:

import numpy as np
x = np.arange(2*3*4).reshape((4, 3, 2))
x = np.random.permutation(np.arange(2*3*4)).reshape((4, 3, 2))
print ('input ndarray', x)
idx = np.unravel_index(np.argmax(x.ravel()), x.shape)
print ('index of maximal value = ', idx, ' and we verify that ', x[idx], '=', x.max())

input ndarray [[[ 4  3]
  [19  7]
  [ 1 13]]

 [[ 6 12]
  [14  0]
  [16 11]]

 [[15 17]
  [20 22]
  [ 9  5]]

 [[ 8 18]
  [23 21]
  [10  2]]]
index of maximal value =  (3, 1, 0)  and we verify that  23 = 23

Let's unwrap how we found such an easy solution...

Let's first initialize the notebook with all we need, numpy:

In [2]:

import numpy as np

To work out on "real-like" data, let's first define a dummy 3-D array:

In [3]:

x = np.arange(2*3*4).reshape((4, 3, 2))
x = np.random.permutation(np.arange(2*3*4)).reshape((4, 3, 2))
print (x)

[[[19 17]
  [ 8 16]
  [11  7]]

 [[ 3 10]
  [23  1]
  [ 0 15]]

 [[12 13]
  [ 9 20]
  [ 4 18]]

 [[ 2 21]
  [22  6]
  [ 5 14]]]

We wish to find the indices of the sorted values of this matrix. For this, we will use the np.argsort function which operates on 1-D vectors.

Indeed, we can represent as a 1-d array (a vector):

In [4]:

print (x.ravel())

[19 17  8 16 11  7  3 10 23  1  0 15 12 13  9 20  4 18  2 21 22  6  5 14]

We may now find the list of indices to sort it:

In [5]:

print (np.argsort(x.ravel()))

[10  9 18  6 16 22 21  5  2 14  7  4 12 13 23 11  3  1 17  0 15 19 20  8]

And we verify that the entries are indeed sorted:

In [6]:

print (x.ravel()[np.argsort(x.ravel())])

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]

To go back to the coordinates of the initial np.ndarray, we use the unraval_index function:

In [7]:

help(np.unravel_index)

Help on built-in function unravel_index in module numpy:

unravel_index(...)
    unravel_index(indices, shape, order='C')
    
    Converts a flat index or array of flat indices into a tuple
    of coordinate arrays.
    
    Parameters
    ----------
    indices : array_like
        An integer array whose elements are indices into the flattened
        version of an array of dimensions ``shape``. Before version 1.6.0,
        this function accepted just one index value.
    shape : tuple of ints
        The shape of the array to use for unraveling ``indices``.
    
        .. versionchanged:: 1.16.0
            Renamed from ``dims`` to ``shape``.
    
    order : {'C', 'F'}, optional
        Determines whether the indices should be viewed as indexing in
        row-major (C-style) or column-major (Fortran-style) order.
    
        .. versionadded:: 1.6.0
    
    Returns
    -------
    unraveled_coords : tuple of ndarray
        Each array in the tuple has the same shape as the ``indices``
        array.
    
    See Also
    --------
    ravel_multi_index
    
    Examples
    --------
    >>> np.unravel_index([22, 41, 37], (7,6))
    (array([3, 6, 6]), array([4, 5, 1]))
    >>> np.unravel_index([31, 41, 13], (7,6), order='F')
    (array([3, 6, 6]), array([4, 5, 1]))
    
    >>> np.unravel_index(1621, (6,7,8,9))
    (3, 1, 4, 1)

In [8]:

print (np.unravel_index(np.argsort(x.ravel()), x.shape))

(array([1, 1, 3, 1, 2, 3, 3, 0, 0, 2, 1, 0, 2, 2, 3, 1, 0, 0, 2, 0, 2, 3,
       3, 1]), array([2, 1, 0, 0, 2, 2, 1, 2, 1, 1, 0, 2, 0, 0, 2, 2, 1, 0, 2, 0, 1, 0,
       1, 1]), array([0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1,
       0, 0]))

Such that we can now sort the whole array from the lowest to highest index. We verify that:

In [9]:

print (x[np.unravel_index(np.argsort(x.ravel()), x.shape)])

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]

Some applications: We can now pick just the datapoints extremal values of interest :

In [10]:

datapoints = 10
print (np.unravel_index(np.argsort(x.ravel())[:datapoints], x.shape))

(array([1, 1, 3, 1, 2, 3, 3, 0, 0, 2]), array([2, 1, 0, 0, 2, 2, 1, 2, 1, 1]), array([0, 1, 0, 0, 0, 0, 1, 1, 0, 0]))

Let's now try with a more generic example :

In [11]:

x = np.random.rand(4, 3, 2)
print (x)

[[[0.06834601 0.60352804]
  [0.60561573 0.77685711]
  [0.19118084 0.2985651 ]]

 [[0.91722699 0.99373095]
  [0.50057943 0.50154522]
  [0.39299099 0.43339377]]

 [[0.97056531 0.00931049]
  [0.95539077 0.44781701]
  [0.87493581 0.93165374]]

 [[0.0136503  0.41497518]
  [0.46685109 0.86126145]
  [0.60488443 0.1447974 ]]]

The indices for the datapoints extremal values of interest are :

In [12]:

print (np.unravel_index(np.argsort(x.ravel())[:datapoints], x.shape))

(array([2, 3, 0, 3, 0, 0, 1, 3, 1, 2]), array([0, 0, 0, 2, 2, 2, 2, 0, 2, 1]), array([1, 0, 0, 1, 0, 1, 0, 1, 1, 1]))

... which correspond to the minimal values of interest :

In [13]:

print (x[np.unravel_index(np.argsort(x.ravel())[:datapoints], x.shape)])

[0.00931049 0.0136503  0.06834601 0.1447974  0.19118084 0.2985651
 0.39299099 0.41497518 0.43339377 0.44781701]

Note that it is also easy to pick the maximal values :

In [14]:

print (np.unravel_index(np.argsort(-x.ravel())[:datapoints], x.shape))

(array([1, 2, 2, 2, 1, 2, 3, 0, 0, 3]), array([0, 0, 1, 2, 0, 2, 1, 1, 1, 2]), array([1, 0, 0, 1, 0, 0, 1, 1, 0, 0]))

Another application: Get the index of the maximal value of an array :

In [15]:

idx = np.unravel_index(np.argmax(x.ravel()), x.shape)
print ('index of maximal value = ', idx, ' and we verify that ', x[idx], '=', x.max())

index of maximal value =  (1, 0, 1)  and we verify that  0.9937309474909091 = 0.9937309474909091