import numpy as np

def consecutive(data, stepsize=1):
    return np.split(data, np.where(np.diff(data) != stepsize)[0] + 1)

a = np.array([0, 47, 48, 49, 50, 97, 98, 99])
consecutive(a)
yields
[array([0]), array([47, 48, 49, 50]), array([97, 98, 99])]
Here's a lil func that might help:
def group_consecutives(vals, step=1):
    """Return list of consecutive lists of numbers from vals (number list)."""
    run = []
    result = [run]
    expect = None
    for v in vals:
        if (v == expect) or (expect is None):
            run.append(v)
        else:
            run = [v]
            result.append(run)
        expect = v + step
    return result

>>> group_consecutives(a)
[[0], [47, 48, 49, 50], [97, 98, 99]]
>>> group_consecutives(a, step=47)
[[0, 47], [48], [49], [50, 97], [98], [99]]
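For comparison, here is a rough sketch of the same grouping done with itertools.groupby from the standard library. It is a different technique from the loop above: inside a run with a fixed step, the value minus index * step stays constant, so that difference can serve as the grouping key. The name group_consecutives_groupby is made up for this example.

```python
from itertools import groupby


def group_consecutives_groupby(vals, step=1):
    """Group vals into runs where each element is the previous one plus step."""
    runs = []
    # Within a run, v - i * step is constant; it changes whenever the run breaks.
    for _, grp in groupby(enumerate(vals), key=lambda iv: iv[1] - iv[0] * step):
        runs.append([v for _, v in grp])
    return runs


a = [0, 47, 48, 49, 50, 97, 98, 99]
print(group_consecutives_groupby(a))           # [[0], [47, 48, 49, 50], [97, 98, 99]]
print(group_consecutives_groupby(a, step=47))  # [[0, 47], [48], [49], [50, 97], [98], [99]]
```

Note that for an empty input this sketch returns an empty list rather than a list holding one empty run.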
This is what I came up with so far; I'm not sure it is 100% correct:
import numpy as np
a = np.array([0, 47, 48, 49, 50, 97, 98, 99])
# Split after every position where the gap to the next element exceeds 1
print(np.split(a, np.where(a[1:] - a[:-1] > 1)[0] + 1))
returns:
[array([0]), array([47, 48, 49, 50]), array([97, 98, 99])]
Find where the difference between neighbouring elements isn't one:
diffs = numpy.diff(array) != 1
Get the indexes of those positions, take the first (and only) dimension, and add one to each, because diff compares every element with the previous one:
indexes = numpy.nonzero(diffs)[0] + 1
Split at the given indexes:
groups = numpy.split(array, indexes)
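To make the three steps concrete, here is a small sketch that runs them on the example array from the earlier answers and prints the intermediate values. The variable names follow the snippets above; the sample data is only an illustration.

```python
import numpy as np

array = np.array([0, 47, 48, 49, 50, 97, 98, 99])

# True wherever the next element is not exactly one larger than the current one
diffs = np.diff(array) != 1
# Positions at which a new group starts
indexes = np.nonzero(diffs)[0] + 1
groups = np.split(array, indexes)

print(np.diff(array))  # raw differences between neighbours
print(indexes)         # [1 5]
for group in groups:
    print(group)
```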
It turns out that a list comprehension performs better than np.split. The function below (almost identical to @unutbu's consecutive function, except that it uses a list comprehension to split the array) is much faster:
def consecutive_w_list_comprehension(arr, stepsize=1):
    idx = np.r_[0, np.where(np.diff(arr) != stepsize)[0] + 1, len(arr)]
    return [arr[i:j] for i, j in zip(idx, idx[1:])]
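The idx array in that function holds the boundaries of every run: 0, each position where the step is broken, and len(arr). As a small illustration, the sketch below returns those boundaries as (start, stop) index pairs instead of slices. The helper name consecutive_bounds is hypothetical and not part of the answer above.

```python
import numpy as np


def consecutive_bounds(arr, stepsize=1):
    """Return (start, stop) index pairs, one per run; arr[start:stop] is the run."""
    arr = np.asarray(arr)
    # Run boundaries: 0, every position where the step is broken, and len(arr).
    idx = np.r_[0, np.where(np.diff(arr) != stepsize)[0] + 1, len(arr)]
    return [(int(i), int(j)) for i, j in zip(idx, idx[1:])]


a = np.array([0, 47, 48, 49, 50, 97, 98, 99])
print(consecutive_bounds(a))  # [(0, 1), (1, 5), (5, 8)]
```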
For example, for an array of length 100_000, consecutive_w_list_comprehension is over 4x faster:
arr = np.sort(np.random.choice(range(150000), size=100000, replace=False))

%timeit -n 100 consecutive(arr)
96.1 ms ± 1.22 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit -n 100 consecutive_w_list_comprehension(arr)
23.2 ms ± 858 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
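The %timeit lines only work inside IPython or Jupyter. Outside of those, a rough equivalent can be put together with the standard timeit module, as sketched below. The seed and the number of repetitions are arbitrary choices for this example, and the timings will differ from machine to machine.

```python
import timeit

import numpy as np


def consecutive(data, stepsize=1):
    return np.split(data, np.where(np.diff(data) != stepsize)[0] + 1)


def consecutive_w_list_comprehension(arr, stepsize=1):
    idx = np.r_[0, np.where(np.diff(arr) != stepsize)[0] + 1, len(arr)]
    return [arr[i:j] for i, j in zip(idx, idx[1:])]


rng = np.random.default_rng(0)  # fixed seed, an arbitrary choice for repeatability
arr = np.sort(rng.choice(150_000, size=100_000, replace=False))

# Both functions should produce the same groups before their speed is compared.
expected = consecutive(arr)
got = consecutive_w_list_comprehension(arr)
same = len(expected) == len(got) and all(
    np.array_equal(x, y) for x, y in zip(expected, got)
)
print("same groups:", same)

for func in (consecutive, consecutive_w_list_comprehension):
    seconds = timeit.timeit(lambda: func(arr), number=5) / 5
    print(f"{func.__name__}: {seconds * 1e3:.1f} ms per call")
```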
Code used to produce the plot above:
import perfplot
import numpy as np

def consecutive(data, stepsize=1):
    return np.split(data, np.where(np.diff(data) != stepsize)[0] + 1)

def consecutive_w_list_comprehension(arr, stepsize=1):
    idx = np.r_[0, np.where(np.diff(arr) != stepsize)[0] + 1, len(arr)]
    return [arr[i:j] for i, j in zip(idx, idx[1:])]

def group_consecutives(vals, step=1):
    run = []
    result = [run]
    expect = None
    for v in vals:
        if (v == expect) or (expect is None):
            run.append(v)
        else:
            run = [v]
            result.append(run)
        expect = v + step
    return result

def JozeWs(array):
    diffs = np.diff(array) != 1
    indexes = np.nonzero(diffs)[0] + 1
    groups = np.split(array, indexes)
    return groups

perfplot.show(
    setup=lambda n: np.sort(np.random.choice(range(2 * n), size=n, replace=False)),
    kernels=[consecutive, consecutive_w_list_comprehension, group_consecutives, JozeWs],
    labels=['consecutive', 'consecutive_w_list_comprehension', 'group_consecutives', 'JozeWs'],
    n_range=[2 ** k for k in range(5, 22)],
    equality_check=lambda *lst: all((x == y).all() for x, y in zip(*lst)),
    xlabel='~len(arr)'
)
You can iterate over a list using
for i in range(len(a)):
    print(a[i])
You could test whether the next element in the list meets some criterion as follows
if a[i + 1] == a[i] + 1:
    print("it must be a consecutive run")
And you can store the results separately in
results = []
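Putting those pieces together, a minimal sketch might look like the following. It assumes the intended test is a[i + 1] == a[i] + 1 (the next element is one larger than the current one), and the function name consecutive_runs is invented for the example.

```python
def consecutive_runs(a):
    """Collect runs where each element is the previous element plus one."""
    results = []
    if len(a) == 0:
        return results
    run = [a[0]]
    for i in range(len(a) - 1):
        if a[i + 1] == a[i] + 1:
            run.append(a[i + 1])   # still inside the current run
        else:
            results.append(run)    # run ended; store it and start a new one
            run = [a[i + 1]]
    results.append(run)            # store the final run
    return results


a = [0, 47, 48, 49, 50, 97, 98, 99]
print(consecutive_runs(a))  # [[0], [47, 48, 49, 50], [97, 98, 99]]
```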
I have a NumPy array as follows:
import numpy as np
a = np.array([1, 4, 2, 6, 4, 4, 6, 2, 7, 6, 2, 8, 9, 3, 6, 3, 4, 4, 5, 8])
Based on a previous question, I can count the number c, which is defined as the number of times the elements in a are less than b two or more times consecutively:
from itertools import groupby
b = 6
sum(len(list(g)) >= 2 for i, g in groupby(a < b) if i)
Now I would like to output an array each time the condition is met instead of counting the number of times the condition is met. So with this example the right output would be:
array1 = [1, 4, 2]
array2 = [4, 4]
array3 = [3, 4, 4, 5]
So far I have tried different options:
np.isin((len(list(g)) >= 2 for i, g in groupby(a < b) if i), a)
and
np.extract((len(list(g)) >= 2 for i, g in groupby(a < b) if i), a)
But none of them achieved what I am searching for. Can someone point me to the right Python tools in order to output the different arrays satisfying my condition?
Based on this answer I came up with the following solution using np.split, which is more efficient than both previously added answers here:
array = np.append(a, -np.inf)  # padding so we don't lose last element
mask = array >= 6  # values to be removed
split_indices = np.where(mask)[0]
for subarray in np.split(array, split_indices + 1):
    if len(subarray) > 2:
        print(subarray[:-1])
gives:
[1. 4. 2.]
[4. 4.]
[3. 4. 4. 5.]
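The printed values are floats because padding with -np.inf converts the integer array to floating point. As an alternative sketch (a different technique from the np.split answer above), the start and end of every run can be read off the boolean mask directly, which leaves the dtype untouched. The function name runs_below and its min_len parameter are assumptions made for this example.

```python
import numpy as np


def runs_below(a, b, min_len=2):
    """Return the sub-arrays of a where at least min_len consecutive values are < b."""
    a = np.asarray(a)
    mask = a < b
    # Pad with False on both sides so every run has a detectable start and end.
    padded = np.concatenate(([False], mask, [False]))
    changes = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(changes == 1)    # index of the first element of each run
    stops = np.flatnonzero(changes == -1)    # index one past the last element
    return [a[s:e] for s, e in zip(starts, stops) if e - s >= min_len]


a = np.array([1, 4, 2, 6, 4, 4, 6, 2, 7, 6, 2, 8, 9, 3, 6, 3, 4, 4, 5, 8])
for run in runs_below(a, 6):
    print(run)
```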
Use groupby and grab the groups:
from itertools import groupby

lst = []
b = 6
for i, g in groupby(a, key=lambda x: x < b):
    grp = list(g)
    if i and len(grp) >= 2:
        lst.append(grp)
print(lst)
# [[1, 4, 2], [4, 4], [3, 4, 4, 5]]
This task is very similar to image labeling, but, in your case, it is one-dimensional. The SciPy library provides some useful functionality for image processing that we could employ here:
import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion, label

a = np.array([1, 4, 2, 6, 4, 4, 6, 2, 7, 6, 2, 8, 9, 3, 6, 3, 4, 4, 5, 8])
b = 6  # your threshold
min_consequent_count = 2

mask = a < b
structure = [False] + [True] * min_consequent_count  # used for erosion and dilation
eroded = binary_erosion(mask, structure)
dilated = binary_dilation(eroded, structure)
labeled_array, labels_count = label(dilated)  # labels_count == c

for label_number in range(1, labels_count + 1):  # labeling starts from 1
    subarray = a[labeled_array == label_number]
    print(subarray)
gives:
[1 4 2]
[4 4]
[3 4 4 5]
mask = a < b returns a boolean array with True values where elements are less than the threshold b:
array([True, True, True, False, True, True, False, True, False, False, True, False, False, True, False, True, True, True, True, False])
As you can see, the result contains some True elements that don't have any other True neighbors around them. To eliminate them we could use binary erosion. I use scipy.ndimage.binary_erosion for that purpose. Its default structure parameter is not suitable for our needs, as it would also delete two consecutive True values, so I construct my own:
>>> structure = [False] + [True] * min_consequent_count
>>> structure
[False, True, True]
>>> eroded = binary_erosion(mask, structure)
>>> eroded
array([True, True, False, False, True, False, False, False, False, False, False, False, False, False, False, True, True, True, False, False])
We managed to remove the single True values, but we need to get back the initial configuration for the other groups. In order to do so, we use binary dilation with the same structure:
>>> dilated = binary_dilation(eroded, structure)
>>> dilated
array([True, True, True, False, True, True, False, False, False, False, False, False, False, False, False, True, True, True, True, False])
And as a final step, we label each group with scipy.ndimage.label:
>>> labeled_array, labels_count = label(dilated)
>>> labeled_array
array([1, 1, 1, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 3, 3, 3, 0])
>>> labels_count
3
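To see what the erosion and dilation steps do to the mask, the sketch below reproduces them with plain NumPy shifts for this specific structure, [False, True, True], i.e. min_consequent_count = 2. It is only an illustration of the idea under that assumption, not a general replacement for the scipy.ndimage functions.

```python
import numpy as np

a = np.array([1, 4, 2, 6, 4, 4, 6, 2, 7, 6, 2, 8, 9, 3, 6, 3, 4, 4, 5, 8])
b = 6
mask = a < b

# Erosion with [False, True, True]: keep a True only if its right neighbour is True too.
right = np.append(mask[1:], False)       # mask shifted left, padded with False
eroded = mask & right

# Dilation with the same structure: restore the element to the right of each survivor.
left = np.insert(eroded[:-1], 0, False)  # eroded shifted right, padded with False
dilated = eroded | left

print(eroded.astype(int))
print(dilated.astype(int))
```

Isolated True values vanish in the first step because they have no True neighbour on the right, and the second step gives every surviving run its last element back.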
I have to cluster the consecutive elements from a NumPy array. Considering the following example:
a = [0, 47, 48, 49, 50, 97, 98, 99]
The output should be a list of tuples as …
The easiest way to create an array is to use the array function. This accepts any sequence-like object (including other arrays) and produces a new NumPy array containing the passed data. For example, a list is a good candidate for conversion.
Whenever you see “array”, “NumPy array”, or “ndarray” in the text, with few exceptions they all refer to the same thing: the ndarray object.
NumPy array indexing is a rich topic, as there are many ways you may want to select a subset of your data or individual elements. One-dimensional arrays are simple; on the surface they act similarly to Python lists.
As a simple example, suppose we wished to evaluate the function sqrt(x^2 + y^2) across a regular grid of values. The np.meshgrid function takes two 1D arrays and produces two 2D matrices corresponding to all pairs of (x, y) in the two arrays.
In[83]: names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
In[84]: data = np.random.randn(7, 4)
In[85]: names
Out[85]:
array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'],
dtype = '|S4')
In[86]: data
Out[86]:
array([
[-0.048, 0.5433, -0.2349, 1.2792],
[-0.268, 0.5465, 0.0939, -2.0445],
[-0.047, -2.026, 0.7719, 0.3103],
[2.1452, 0.8799, -0.0523, 0.0672],
[-1.0023, -0.1698, 1.1503, 1.7289],
[0.1913, 0.4544, 0.4519, 0.5535],
[0.5994, 0.8174, -0.9297, -1.2564]
])
IIUC, you can first make your np array 2D and build a data frame, which makes everything easier. Take a look:
row, cols = m.shape[0], m.shape[1] * m.shape[2]
df = pd.DataFrame(m.reshape(row, cols))
     0    1    2    3
0  1.0  0.0  0.0  1.0
1  0.0  1.0  0.0  1.0
2  1.0  1.0  1.0  1.0
3  1.0  1.0  1.0  0.0
4  1.0  0.0  0.0  1.0
5  1.0  1.0  1.0  1.0
Now you can use a reverse rolling window of 3 on axis=0 and check if all elements are 1:
ndf = df[::-1].rolling(3, axis=0).apply(all, raw=True)[::-1]
     0    1    2    3
0  NaN  NaN  NaN  1.0
1  NaN  1.0  NaN  NaN
2  1.0  NaN  NaN  NaN
3  1.0  NaN  NaN  NaN
4  NaN  NaN  NaN  NaN
5  NaN  NaN  NaN  NaN
And use idxmax() to get the index of the first occurrence of 1:
ndf[ndf >= 1].idxmax()
0 2.0
1 1.0
2 NaN
3 0.0
dtype: float
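The same per-column result can also be sketched without pandas, using NumPy's sliding_window_view (available in NumPy 1.20 and later). This is a different technique from the rolling window above. The array m2 below is an assumption: it simply copies the values of the df shown earlier, since the original 3-D array m is not given.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Same values as the df shown above, already reshaped to 2-D.
m2 = np.array([
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
], dtype=float)

window = 3
# Shape (rows - window + 1, cols, window): every run of 3 rows, per column.
windows = sliding_window_view(m2, window, axis=0)
all_ones = (windows == 1).all(axis=-1)

# argmax returns the first True per column; columns with no hit get NaN.
first = np.where(all_ones.any(axis=0), all_ones.argmax(axis=0), np.nan)
print(first)
```

Each entry of first is the row index at which the first run of three consecutive ones starts in that column, or NaN when the column has no such run.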