Using numpy.argpartition while ignoring NaNs

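Count the NaNs and use that count as an offset when slicing the partitioned indices, then extract the values. NumPy orders NaN after every other value, so the last N + c positions hold the N largest real values together with the c NaNs; the slice drops the final c positions, where the NaNs are expected to land. Here x is the 3x3 array from the question under Suggestion 2. Note that the slice assumes at least one NaN: with c = 0 it becomes [-N:0], which is empty.

In [200]: N = 3

In [201]: c = np.isnan(x).sum()

In [204]: idx = np.argpartition(x.ravel(), -N - c)[-N - c:-c]

In [207]: val = x.flat[idx]

In [208]: idx, val
Out[208]: (array([1, 3, 7]), array([2., 2., 6.]))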
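
The slice above relies on there being at least one NaN (with c = 0, -c is 0 and the slice is empty), and the order inside a partition is documented as undefined. A more defensive variant can be sketched as follows. This is only an illustration: the helper name top_n_ignoring_nan is hypothetical, and it pins both ends of the wanted window by passing two kth values to np.argpartition.

```python
import numpy as np

def top_n_ignoring_nan(x, n):
    """Return (flat indices, values) of the n largest non-NaN entries of x.

    Hypothetical helper for illustration; the returned entries are not
    guaranteed to be in sorted order.
    """
    flat = np.asarray(x, dtype=float).ravel()
    n_valid = flat.size - int(np.isnan(flat).sum())
    if not 0 < n <= n_valid:
        raise ValueError("n must be between 1 and the number of non-NaN values")
    # NumPy orders NaN after every other value, so in sorted order the
    # non-NaN values occupy positions [0, n_valid). Pinning both ends of
    # the wanted window keeps NaNs out of it, even when there are none.
    lo, hi = n_valid - n, n_valid - 1
    idx = np.argpartition(flat, [lo, hi])[lo:hi + 1]
    return idx, flat[idx]

x = np.array([np.nan, 2, -1, 2, -4, -8, -9, 6, -3]).reshape(3, 3)
idx, val = top_n_ignoring_nan(x, 3)
print(sorted(idx.tolist()), sorted(val.tolist()))
```

For the array from the question this should select flat indices 1, 3 and 7, the same entries as the session above, and it also works on arrays that contain no NaN at all.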

Suggestion : 2

The code below works well for an array without NaNs, but when NaNs are present they come out as the largest items, because NumPy's sort order places NaN after every other value.

import numpy as np

x = np.array([np.nan, 2, -1, 2, -4, -8, -9, 6, -3]).reshape(3, 3)
y = np.argpartition(x.ravel(), -3)[-3:]  # flat indices of the 3 "largest" entries
z = x.ravel()[y]
# result I am getting: [2, 6, nan]
# result I need:       [2, 2, 6]
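
One possible way to get the wanted result is to replace each NaN with negative infinity before partitioning, so that NaNs can never rank among the largest values. This is a sketch under the assumption that the data holds at least N real values and no genuine -inf entries that should be selected; the name filled is just an illustrative variable.

```python
import numpy as np

x = np.array([np.nan, 2, -1, 2, -4, -8, -9, 6, -3]).reshape(3, 3)

# Assumption: -inf is an acceptable stand-in for NaN in this data.
filled = np.where(np.isnan(x), -np.inf, x).ravel()

N = 3
idx = np.argpartition(filled, -N)[-N:]   # flat indices of the N largest
val = x.ravel()[idx]                     # read values from the original array
print(np.sort(val))
```

The values come back in no particular order, which is why the example sorts them before printing.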

Suggestion : 3

Perform an indirect partition along the given axis using the algorithm specified by the kind keyword. It returns an array of indices of the same shape as a that index data along the given axis in partitioned order.

kth: Element index to partition by. The k-th element will be in its final sorted position, all smaller elements will be moved before it and all larger elements behind it. The order of all elements in the partitions is undefined. If provided with a sequence of k-th, it will partition all of them into their sorted position at once.

order: When a is an array with fields defined, this argument specifies which fields to compare first, second, etc. A single field can be specified as a string, and not all fields need be specified, but unspecified fields will still be used, in the order in which they come up in the dtype, to break ties.

Returns index_array: Array of indices that partition a along the specified axis. If a is one-dimensional, a[index_array] yields a partitioned a. More generally, np.take_along_axis(a, index_array, axis) always yields the partitioned a, irrespective of dimensionality.

>>> x = np.array([3, 4, 2, 1])
>>> x[np.argpartition(x, 3)]
array([2, 1, 3, 4])
>>> x[np.argpartition(x, (1, 3))]
array([1, 2, 3, 4])

>>> x = [3, 4, 2, 1]
>>> np.array(x)[np.argpartition(x, 3)]
array([2, 1, 3, 4])

>>> x = np.array([[3, 4, 2], [1, 3, 1]])
>>> index_array = np.argpartition(x, kth=1, axis=-1)
>>> np.take_along_axis(x, index_array, axis=-1)  # same as np.partition(x, kth=1)
array([[2, 3, 4],
       [1, 1, 3]])
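
The combination of np.argpartition and np.take_along_axis shown above can also be used to pick the k largest values of every row. The following is a minimal sketch; the choice k = 2 and the variable names are illustrative only.

```python
import numpy as np

x = np.array([[3, 4, 2],
              [1, 3, 1]])
k = 2  # number of largest values wanted per row (illustrative choice)

# Partition each row so that its k largest entries sit in the last k slots.
index_array = np.argpartition(x, kth=-k, axis=-1)[:, -k:]
top_k = np.take_along_axis(x, index_array, axis=-1)

# The order inside each row is undefined, so sort before displaying.
print(np.sort(top_k, axis=-1))
```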

Suggestion : 4

In Python, the numpy.argpartition() function returns the indices that would partition an array along a given axis, based on the specified kth element(s). If you are more interested in sorting an array as a whole, see the numpy.argsort() function instead.

Here is the syntax of np.argpartition():

# Syntax
numpy.argpartition(a, kth, axis=-1, kind='introselect', order=None)

Here is a one-dimensional array code example:

# Basic Example
import numpy as np
one_dim = np.array([2, 3, 1, 5, 4])
# kth=0: the smallest value (1) is moved to position 0 of the partitioned order;
# the order of the remaining elements is undefined.
partitioned = np.argpartition(one_dim, 0)
print(f'Unpartitioned array: {one_dim}')
print(f'Partitioned array index: {partitioned}')
print(f'Partitioned array: {one_dim[partitioned]}')
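
What kth guarantees can be checked directly. The sketch below uses the same array with an illustrative kth = 2 and asserts the three properties of a partition: the kth slot holds the value a full sort would put there, nothing before it is larger, and nothing after it is smaller.

```python
import numpy as np

one_dim = np.array([2, 3, 1, 5, 4])
kth = 2  # illustrative choice

order = np.argpartition(one_dim, kth)
partitioned = one_dim[order]

# The kth slot holds the value a full sort would put there ...
assert partitioned[kth] == np.sort(one_dim)[kth]
# ... everything before it is no larger, everything after it no smaller.
assert (partitioned[:kth] <= partitioned[kth]).all()
assert (partitioned[kth:] >= partitioned[kth]).all()
print(partitioned[kth])
```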

Here is the 2d array code example:

import numpy as np

two_dim = np.array([
   [1, 4, 3],
   [3, 2, 1]
])

# axis=0: partial sort along the first axis, i.e. down each column
print('axis = 0')
partitioned = np.argpartition(two_dim, kth=0, axis=0)
print(f'Unpartitioned array: {two_dim}')
print(f'Partitioned array index: {partitioned}')
print(f'Partitioned array: {np.take_along_axis(two_dim, partitioned, axis=0)}')

# axis=1: partial sort along the second axis, i.e. within each row
print('-' * 85)
print('axis = 1')
partitioned = np.argpartition(two_dim, kth=0, axis=1)
print(f'Unpartitioned array: {two_dim}')
print(f'Partitioned array index: {partitioned}')
print(f'Partitioned array: {np.take_along_axis(two_dim, partitioned, axis=1)}')

Suggestion : 5

Last Updated : 15 Nov, 2018

numpy.sort() : This function returns a sorted copy of an array.

# importing libraries
import numpy as np

# sort along the first axis
a = np.array([
   [12, 15],
   [10, 1]
])
arr1 = np.sort(a, axis = 0)
print("Along first axis : \n", arr1)

# sort along the last axis
a = np.array([
   [10, 15],
   [12, 1]
])
arr2 = np.sort(a, axis = -1)
print("\nAlong last axis : \n", arr2)

# axis = None flattens the array before sorting
a = np.array([
   [12, 15],
   [10, 1]
])
arr3 = np.sort(a, axis = None)
print("\nAlong none axis : \n", arr3)

Output :

Along first axis :
 [[10  1]
 [12 15]]

Along last axis :
 [[10 15]
 [ 1 12]]

Along none axis :
 [ 1 10 12 15]
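
Since this page is about NaN handling, it may help to see how np.sort treats NaN: it places NaN values at the end of the sorted result. The short sketch below shows this, along with one way to sort only the real values by filtering with np.isnan first; the variable names are illustrative.

```python
import numpy as np

a = np.array([3.0, np.nan, 1.0, 2.0])

# NaN is ordered after every other value.
sorted_a = np.sort(a)
print(sorted_a)

# Dropping NaNs first gives a sort over the real values only.
clean = np.sort(a[~np.isnan(a)])
print(clean)
```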

 
numpy.argsort() : This function returns the indices that would sort an array.

# Python code to demonstrate
# working of numpy.argsort
import numpy as np

# Numpy array created
a = np.array([9, 3, 1, 7, 4, 3, 6])

# unsorted array print
print('Original array:\n', a)

# Sort array indices
b = np.argsort(a)
print('Sorted indices of original array->', b)

# To get the sorted array, index the original array
# with the sorted indices
c = a[b]
print('Sorted array->', c)
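
When only the N largest values are needed in sorted order, np.argpartition and np.argsort can be combined: select the N candidates first, then sort just those. This is a minimal sketch on the same array, with N = 3 chosen for illustration.

```python
import numpy as np

a = np.array([9, 3, 1, 7, 4, 3, 6])
N = 3  # illustrative choice

# Step 1: unordered indices of the N largest values.
cand = np.argpartition(a, -N)[-N:]
# Step 2: order just those N candidates, largest first.
top = cand[np.argsort(a[cand])[::-1]]

print(top, a[top])
```

Only N elements are fully sorted here, which can be cheaper than calling np.argsort on the whole array when N is small.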

 
numpy.lexsort() : This function returns an indirect stable sort using a sequence of keys.

# Python code to demonstrate working of numpy.lexsort()
import numpy as np

# Numpy array created
# First column
a = np.array([9, 3, 1, 3, 4, 3, 6])

# Second column
b = np.array([4, 6, 9, 2, 1, 8, 7])
print('column a, column b')
for (i, j) in zip(a, b):
   print(i, ' ', j)

# Sort by a then by b
ind = np.lexsort((b, a))
print('Sorted indices->', ind)
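
To see what the indices returned by np.lexsort mean, they can be applied to both columns. In this sketch the name pairs is illustrative. Note that the last key passed to lexsort is the primary key.

```python
import numpy as np

a = np.array([9, 3, 1, 3, 4, 3, 6])
b = np.array([4, 6, 9, 2, 1, 8, 7])

# The last key passed to lexsort is the primary key, so this sorts by a
# and breaks ties in a using b.
ind = np.lexsort((b, a))
pairs = list(zip(a[ind].tolist(), b[ind].tolist()))
print(pairs)
```

The three rows with a = 3 appear in increasing order of b, which is the tie-breaking behaviour described above.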

numpy.argmax() : This function returns indices of the max element of the array in a particular axis.

# Python program illustrating
# the working of argmax()

import numpy as np

# Working on a 2D array
array = np.arange(12).reshape(3, 4)
print("INPUT ARRAY : \n", array)

# No axis given, so argmax works on the flattened array
print("\nIndex of max element (flattened) : ", np.argmax(array))

# Indices of the max element along each axis
print("\nIndices of max element along axis 0 : ", np.argmax(array, axis = 0))
print("\nIndices of max element along axis 1 : ", np.argmax(array, axis = 1))
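
Plain argmax returns the position of a NaN when one is present, which is the same problem the question has with argpartition. NumPy provides np.nanargmax to skip NaNs. The following sketch uses a small made-up array, and np.unravel_index to convert the flat position back to a (row, column) pair.

```python
import numpy as np

arr = np.array([[1.0, np.nan, 3.0],
                [7.0, 2.0, np.nan]])

# Position of the largest non-NaN value in the flattened array.
flat_pos = np.nanargmax(arr)
row, col = np.unravel_index(flat_pos, arr.shape)
print(flat_pos, (int(row), int(col)))

# Per-column positions of the maximum, skipping NaNs in each column.
col_pos = np.nanargmax(arr, axis=0)
print(col_pos)
```

Note that np.nanargmax raises a ValueError for a slice that contains only NaNs, so all-NaN rows or columns need separate handling.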

Suggestion : 6

Taking the mean of an array that contains NaN returns nan, which is not the desired result: the NaNs need to be excluded before calculating the mean. NumPy has nanmean, which computes the mean over only the non-NaN values. Another option is to use NumPy masks, which hide the values that should not take part in a computation.

Note that nanmean is a function, not an array method. Calling a.nanmean() fails with AttributeError: 'numpy.ndarray' object has no attribute 'nanmean'. The correct way is to pass the NumPy array to the nanmean function.

>>> import numpy as np
>>> a = np.array([1, np.nan, np.nan, np.nan, 3, 4, 5, 6, 7, 8, 9])
>>> a
array([1., nan, nan, nan, 3., 4., 5., 6., 7., 8., 9.])
>>> type(a)
numpy.ndarray
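
The difference between the plain mean, np.nanmean and a masked array can be sketched on the same array. The variable names are illustrative, and np.ma.masked_invalid is used here as one way to build the mask.

```python
import numpy as np

a = np.array([1, np.nan, np.nan, np.nan, 3, 4, 5, 6, 7, 8, 9])

plain = a.mean()            # nan, because NaN propagates through the sum
skipped = np.nanmean(a)     # averages the 8 non-NaN values

# Same result with a masked array that hides the NaN entries.
masked = np.ma.masked_invalid(a).mean()

print(plain, skipped, masked)
```

The eight real values sum to 43, so both NaN-aware results are 43 / 8 = 5.375.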

Suggestion : 7

bn.nanmean: Mean of array elements along given axis ignoring NaNs.
bn.nanmedian: Median of array elements along given axis ignoring NaNs.
bn.partition: Partition array elements along given axis.
bn.nansum: Sum of array elements along given axis treating NaNs as zero.

>>> import numpy as np
>>> import bottleneck as bn

>>> bn.nansum(1)
1
>>> bn.nansum([1])
1
>>> bn.nansum([1, np.nan])
1.0
>>> a = np.array([[1, 1], [1, np.nan]])
>>> bn.nansum(a)
3.0
>>> bn.nansum(a, axis=0)
array([2., 1.])

>>> bn.nansum([1, np.nan, np.inf])
inf
>>> bn.nansum([1, np.nan, -np.inf])
-inf
>>> bn.nansum([1, np.nan, np.inf, -np.inf])
nan

>>> bn.nanmean(1)
1.0
>>> bn.nanmean([1])
1.0
>>> bn.nanmean([1, np.nan])
1.0
>>> a = np.array([[1, 4], [1, np.nan]])
>>> bn.nanmean(a)
2.0
>>> bn.nanmean(a, axis=0)
array([1., 4.])

>>> bn.nanmean([1, np.nan, np.inf])
inf
>>> bn.nanmean([1, np.nan, -np.inf])
-inf
>>> bn.nanmean([1, np.nan, np.inf, -np.inf])
nan

>>> np.sqrt((a * a).mean() - a.mean() ** 2)
>>> np.sqrt(((a - a.mean()) ** 2).mean())
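
The last two expressions are two algebraically equivalent ways of writing the population standard deviation. The sketch below checks this with NumPy only, on the non-NaN values of the array used in the nanmean example, and compares both against np.nanstd. The names valid, std_a and std_b are illustrative.

```python
import numpy as np

a = np.array([[1.0, 4.0],
              [1.0, np.nan]])
valid = a[~np.isnan(a)]     # the three non-NaN values: 1, 4, 1

# Mean of squares minus square of the mean ...
std_a = np.sqrt((valid * valid).mean() - valid.mean() ** 2)
# ... versus the mean squared deviation from the mean.
std_b = np.sqrt(((valid - valid.mean()) ** 2).mean())

print(std_a, std_b, np.nanstd(a))
```

All three values should agree up to floating-point rounding (the square root of 2 for this data). The second form is generally the numerically safer one, since the first subtracts two potentially large, nearly equal numbers.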