Indexing different-sized ranges in a 2D NumPy array using Pythonic, vectorized code


We can use broadcasting to build a boolean mask with one column per entry of b, and then index with that mask -

In[150]: a
Out[150]:
   array([
      [0, 1, 2],
      [3, 4, 5],
      [6, 7, 8],
      [9, 10, 11],
      [12, 13, 14]
   ])

In[151]: b
Out[151]: [4, 3, 1]

In[152]: mask = np.arange(len(a))[:, None] < b

In[153]: a.T[mask.T]
Out[153]: array([0, 3, 6, 9, 1, 4, 7, 2])

Another way to mask would be -

In[156]: a.T[np.greater.outer(b, np.arange(len(a)))]
Out[156]: array([0, 3, 6, 9, 1, 4, 7, 2])
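
To make the broadcasting step easier to follow, here is a minimal standalone sketch of the same session. It assumes a is the 5 x 3 array shown above, built with np.arange, and b is converted to an array; the names mask_outer and result are only illustrative.

```python
import numpy as np

a = np.arange(15).reshape(5, 3)   # same 5 x 3 array as in the session above
b = np.array([4, 3, 1])           # number of leading rows to keep in each column

# Row indices as a column vector, shape (5, 1), compared against b, shape (3,).
# Broadcasting gives a (5, 3) boolean mask: mask[i, j] is True when i < b[j].
mask = np.arange(a.shape[0])[:, None] < b

# Boolean indexing walks the array in row-major order, so both operands are
# transposed to collect the selected values column by column.
result = a.T[mask.T]

# The same mask, built directly in (3, 5) layout with ufunc.outer.
mask_outer = np.greater.outer(b, np.arange(a.shape[0]))

print(mask.astype(int))
print(result)   # [0 3 6 9 1 4 7 2]
```

Printing the mask as integers shows the staircase pattern: four ones in the first column, three in the second, one in the third.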

If we need to slice per row based on chunk sizes instead, a few things have to change -

In[51]: a
Out[51]:
   array([
      [0, 1, 2, 3, 4],
      [5, 6, 7, 8, 9],
      [10, 11, 12, 13, 14]
   ])

# slice lengths per row
In[52]: b
Out[52]: [4, 3, 1]

# Usual loop based solution:
In[53]: np.concatenate([a[i, :b_i] for i, b_i in enumerate(b)])
Out[53]: array([0, 1, 2, 3, 5, 6, 7, 10])

# Vectorized mask based solution:
In[54]: a[np.greater.outer(b, np.arange(a.shape[1]))]
Out[54]: array([0, 1, 2, 3, 5, 6, 7, 10])
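
The per-row variant can be wrapped in a small function and checked against the loop. This is only a sketch: take_row_prefixes is a hypothetical helper name, not a NumPy function, and it assumes the lengths are non-negative (a negative length would behave differently in the slice-based loop).

```python
import numpy as np

def take_row_prefixes(a, lengths):
    """Concatenate the first lengths[i] items of each row i (hypothetical helper)."""
    a = np.asarray(a)
    lengths = np.asarray(lengths)
    if lengths.shape != (a.shape[0],):
        raise ValueError("need exactly one length per row")
    # mask[i, j] is True when column j falls inside the prefix of row i
    mask = lengths[:, None] > np.arange(a.shape[1])
    return a[mask]

a = np.arange(15).reshape(3, 5)   # same 3 x 5 array as in the session above
b = [4, 3, 1]                     # slice lengths per row

vectorized = take_row_prefixes(a, b)
looped = np.concatenate([a[i, :n] for i, n in enumerate(b)])

print(vectorized)   # [ 0  1  2  3  5  6  7 10]
```

Because a boolean mask is read in row-major order, no transpose is needed here, unlike the per-column case.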

Suggestion : 2

arange is an array-valued version of the built-in Python range function.

While not common, a ufunc can return multiple arrays. modf is one example, a vectorized relative of the built-in Python divmod (and closer still to math.modf): it returns the fractional and integral parts of a floating-point array.

This chapter will introduce you to the basics of using NumPy arrays, and should be sufficient for following along with the rest of the book. While it's not necessary to have a deep understanding of NumPy for many data analytical applications, becoming proficient in array-oriented programming and thinking is a key step along the way to becoming a scientific Python guru.

The numpy.random module supplements the built-in Python random with functions for efficiently generating whole arrays of sample values from many kinds of probability distributions. For example, you can get a 4 by 4 array of samples from the standard normal distribution using normal.
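
The three functions mentioned above can be tried together in a few lines. This is a minimal sketch with made-up input values; the variable names are illustrative, and the random samples will differ from run to run because no seed is set.

```python
import numpy as np

# arange mirrors range() but returns an ndarray instead of a range object
r = np.arange(5)

# modf is a ufunc with two outputs: fractional parts first, integral parts second
values = np.array([1.5, -2.25, 3.0])
frac, whole = np.modf(values)

# a 4 x 4 array of samples from the standard normal distribution
samples = np.random.normal(size=(4, 4))

print(r)       # [0 1 2 3 4]
print(frac)    # fractional parts: 0.5, -0.25, 0.0
print(whole)   # integral parts:   1.0, -2.0,  3.0
print(samples.shape)
```

Note that both outputs of modf carry the sign of the input, so -2.25 splits into -0.25 and -2.0.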

In[13]: data1 = [6, 7.5, 8, 0, 1]

In[14]: arr1 = np.array(data1)

In[15]: arr1
Out[15]: array([6., 7.5, 8., 0., 1.])
In[27]: arr1 = np.array([1, 2, 3], dtype = np.float64)

In[28]: arr2 = np.array([1, 2, 3], dtype = np.int32)

In[29]: arr1.dtype
Out[29]: dtype('float64')

In[30]: arr2.dtype
Out[30]: dtype('int32')

In[45]: arr = np.array([
   [1., 2., 3.],
   [4., 5., 6.]
])

In[46]: arr
Out[46]:
   array([
      [1., 2., 3.],
      [4., 5., 6.]
   ])

In[47]: arr * arr
Out[47]:
   array([
      [1., 4., 9.],
      [16., 25., 36.]
   ])

In[48]: arr - arr
Out[48]:
   array([
      [0., 0., 0.],
      [0., 0., 0.]
   ])

In[51]: arr = np.arange(10)

In[52]: arr
Out[52]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In[53]: arr[5]
Out[53]: 5

In[54]: arr[5: 8]
Out[54]: array([5, 6, 7])

In[55]: arr[5: 8] = 12

In[56]: arr
Out[56]: array([0, 1, 2, 3, 4, 12, 12, 12, 8, 9])

# Later in the same session; the 64 implies that arr[5:8] was set to 64 in steps not shown here
In[75]: arr[1:6]
Out[75]: array([1, 2, 3, 4, 64])
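
The value 64 at index 5 is not produced by any step shown above. One plausible way to arrive at it, assuming the omitted steps modified arr through a slice, is sketched below; the names arr_slice and independent are illustrative. The sketch also shows why this works: a basic slice is a view on the original data, not a copy.

```python
import numpy as np

arr = np.arange(10)
arr[5:8] = 12                  # the scalar is broadcast across the slice

arr_slice = arr[5:8]           # a view: no data is copied
arr_slice[:] = 64              # writes through to arr

independent = arr[5:8].copy()  # an explicit copy is detached from arr
independent[:] = 0             # leaves arr untouched

print(arr)        # [ 0  1  2  3  4 64 64 64  8  9]
print(arr[1:6])   # [ 1  2  3  4 64]
```
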
In[83]: names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])

In[84]: data = np.random.randn(7, 4)

In[85]: names
Out[85]:
   array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'],
      dtype = '|S4')

In[86]: data
Out[86]:
   array([
      [-0.048, 0.5433, -0.2349, 1.2792],
      [-0.268, 0.5465, 0.0939, -2.0445],
      [-0.047, -2.026, 0.7719, 0.3103],
      [2.1452, 0.8799, -0.0523, 0.0672],
      [-1.0023, -0.1698, 1.1503, 1.7289],
      [0.1913, 0.4544, 0.4519, 0.5535],
      [0.5994, 0.8174, -0.9297, -1.2564]
   ])

Suggestion : 3

I have a 2D NumPy array, and I would like to select different-sized ranges of this array depending on the column index. Here is the input array, a = np.reshape(np.array(range(15)), (5, 3)), printed below. The list b = [4, 3, 1] then determines the range size for each column slice, so that we would get the arrays shown further down.

The simplest case of indexing with N integers returns an array scalar representing the corresponding item. As in Python, all indices are zero-based: for the i-th index \(n_i\), the valid range is \(0 \le n_i < d_i\), where \(d_i\) is the i-th element of the shape of the array.

A related example: I initialize a 2D NumPy array as a = np.zeros((10, 10)). I then try to index a portion of it using range objects as the indices: a[range(0, 5), range(0, 5)]. I get an array of shape (5,). What I want is the first 5 rows and columns of the 2D array a.


[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]
 [12 13 14]]

# per-column slices
[0 3 6 9] [1 4 7] [2]

# concatenated result
[0 3 6 9 1 4 7 2]

# loop-based solution
slices = []
for i in range(a.shape[1]):
    slices.append(a[:b[i], i])
c = np.concatenate(slices)
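
The shape-(5,) result described in the question text comes from how NumPy pairs index sequences. The sketch below contrasts that with two ways of getting the 5 x 5 block. It assumes an array filled with np.arange instead of zeros, so the selected values are distinguishable; the variable names are illustrative.

```python
import numpy as np

a = np.arange(100).reshape(10, 10)

# Two index sequences are paired element-wise: (0, 0), (1, 1), ..., (4, 4),
# so the result is the start of the diagonal, with shape (5,).
diagonal = a[range(0, 5), range(0, 5)]

# Basic slicing selects the leading 5 x 5 block (as a view).
block = a[:5, :5]

# np.ix_ builds an open mesh from the index arrays, so arbitrary row and
# column selections also yield a block (as a copy).
rows = np.arange(5)
cols = np.arange(5)
block_ix = a[np.ix_(rows, cols)]

print(diagonal)        # [ 0 11 22 33 44]
print(block.shape)     # (5, 5)
```

Slicing is the simplest choice when the rows and columns are contiguous; np.ix_ is useful when they are not.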

Suggestion : 4

The simplest case of indexing with N integers returns an array scalar representing the corresponding item. As in Python, all indices are zero-based: for the i-th index \(n_i\), the valid range is \(0 \le n_i < d_i\) where \(d_i\) is the i-th element of the shape of the array. Negative indices are interpreted as counting from the end of the array (i.e., if \(n_i < 0\), it means \(n_i + d_i\)).

Integer array indexing allows selection of arbitrary items in the array based on their N-dimensional index. Each integer array represents a number of indices into that dimension.

Indexing with multidimensional index arrays tends to be a more unusual use, but it is permitted, and it is useful for some problems. We'll start with the simplest multidimensional case.

If a zero-dimensional array is present in the index and it is a full integer index, the result will be a scalar and not a zero-dimensional array. (Advanced indexing is not triggered.)

>>> x = np.arange(10)
>>> x[2]
2
>>> x[-2]
8
>>> x.shape = (2, 5)  # now x is 2-dimensional
>>> x[1, 3]
8
>>> x[1, -1]
9
>>> x[0]
array([0, 1, 2, 3, 4])
>>> x[0][2]
2
>>> x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> x[1:7:2]
array([1, 3, 5])
>>> x[-2:10]
array([8, 9])
>>> x[-3:3:-1]
array([7, 6, 5, 4])
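
Integer array indexing and the negative-index rule described in the text above can be illustrated with a short sketch. The arrays and names here (rows, cols, picked, idx) are made up for the example.

```python
import numpy as np

x = np.arange(10).reshape(2, 5)

# One index array per dimension; entries are paired: (0, 1), (1, 3), (1, -1)
rows = np.array([0, 1, 1])
cols = np.array([1, 3, -1])
picked = x[rows, cols]

# A negative index n_i is interpreted as n_i + d_i
d1 = x.shape[1]
same_item = x[1, -1] == x[1, -1 + d1]

# A multidimensional index array: the result takes the shape of the index
y = np.arange(10) ** 2
idx = np.array([[1, 3],
                [5, 7]])
grid = y[idx]

print(picked)   # [1 8 9]
print(grid)     # [[ 1  9]
                #  [25 49]]
```
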

Suggestion : 5

Goal: introduce the indexing and slicing scheme for accessing a multi-dimensional array's contents.

Topics: two-dimensional arrays, integer indexing, slice indexing, negative indices, and supplying fewer indices than dimensions.

A 0-D array is not equivalent to a length-1 1-D array such as np.array([15.2]). According to our definition of dimensionality, zero numbers are required to index into a 0-D array, because no identifier is needed for a standalone number. Thus you cannot index into a 0-D array.

Supplying fewer indices than dimensions works because NumPy automatically inserts trailing slices when you don't provide as many indices as there are dimensions in your array: grades[0] is treated as grades[0, :].

# A 0-D array
np.array(8)

# A 1-D array, shape-(3,)
np.array([2.3, 0.1, -9.1])

# A 2-D array, shape-(3, 2)
np.array([
   [93, 95],
   [84, 100],
   [99, 87]
])

# A 3-D array, shape-(2, 2, 2)
np.array([
   [
      [0, 1],
      [2, 3]
   ],

   [
      [4, 5],
      [6, 7]
   ]
])
>>> import numpy as np

# A 3-D array
>>> x = np.array([[[0, 1],
...                [2, 3]],
...
...               [[4, 5],
...                [6, 7]]])

# get: sheet-0, both rows, flip order of columns
>>> x[0, :, ::-1]
array([[1, 0],
       [3, 2]])

>>> simple_array = np.array([2.3, 0.1, -9.1])

 +------+------+------+
 | 2.3  | 0.1  | -9.1 |
 +------+------+------+
    0      1      2
   -3     -2     -1

>>> simple_array[0]
2.3

>>> simple_array[-2]
0.1

>>> simple_array[1:3]
array([0.1, -9.1])

>>> simple_array[3]
IndexError: index 3 is out of bounds for axis 0 with size 3

# using a 1-dimensional array to store the grades
>>> grades = np.array([93, 95, 84, 100, 99, 87])
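
The trailing-slice rule and the 0-D case from the text above can be checked directly. This sketch assumes the grades are arranged as the shape-(3, 2) array shown earlier, not as the 1-D version; the names first_row, explicit, scalar, and one_elem are illustrative.

```python
import numpy as np

grades = np.array([[93, 95],
                   [84, 100],
                   [99, 87]])

first_row = grades[0]        # fewer indices than dimensions
explicit = grades[0, :]      # the same selection with the trailing slice written out

scalar = np.array(8)         # a 0-D array: shape (), ndim 0
one_elem = np.array([15.2])  # a 1-D array of length 1: shape (1,)

# An integer index into a 0-D array is rejected
try:
    scalar[0]
    zero_d_indexable = True
except IndexError:
    zero_d_indexable = False

print(first_row)                   # [93 95]
print(scalar.ndim, one_elem.ndim)  # 0 1
print(zero_d_indexable)            # False
```
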