Using as_strided in this way appears to be somewhat faster than Divakar's approach (20 ms vs 35 ms here), although memory usage might be an issue.
data_wins = as_strided(data, shape=(data.size - 2 * winsize + 1, 2 * winsize), strides=(8, 8))
inds = np.random.randint(low=0, high=data.size - 2 * winsize, size=inds_size)
sliced = data_wins[inds]
sliced = sliced.transpose((2, 0, 1))  # to use the same index order as before
Strides are the steps in bytes for the index in each dimension. For example, with an array of shape (x, y, z) and a data type of size d (8 for float64), the strides will ordinarily be (y*z*d, z*d, d), so that the second index steps over whole rows of z items. With both strides set to 8, data_wins[i, j] and data_wins[j, i] refer to the same memory location, because both address element i + j of data.
>>> import numpy as np
>>> from numpy.lib.stride_tricks import as_strided
>>> a = np.arange(10, dtype=np.int8)
>>> as_strided(a, shape=(3, 10 - 2), strides=(1, 1))
array([[0, 1, 2, 3, 4, 5, 6, 7],
       [1, 2, 3, 4, 5, 6, 7, 8],
       [2, 3, 4, 5, 6, 7, 8, 9]], dtype=int8)
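The stride rule described above can be checked directly. The following is a minimal sketch, not part of the original answer: the window length `win` and the small test array are hypothetical, and the step is read from `data.strides` instead of being hard-coded as 8, so the same code would work for other dtypes.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

# Default (C-contiguous) strides for shape (x, y, z) and item size d
x, y, z = 2, 3, 4
arr = np.zeros((x, y, z), dtype=np.float64)
d = arr.itemsize  # 8 bytes for float64
assert arr.strides == (y * z * d, z * d, d)

# Overlapping windows over 1-D data (hypothetical small example)
data = np.arange(10, dtype=np.float64)
win = 4                  # hypothetical window length
step = data.strides[0]   # bytes per element, read from the array itself
windows = as_strided(
    data,
    shape=(data.size - win + 1, win),
    strides=(step, step),
    writeable=False,     # guard against accidental writes through the view
)

# windows[i, j] and windows[j, i] both read data[i + j]
assert windows[1, 2] == windows[2, 1] == data[3]

# The windows are a view onto data, not a copy
assert np.shares_memory(windows, data)
```

Passing `writeable=False` is a defensive choice here: many entries of the view alias the same element, so a write through it would change several "rows" at once.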
Here's a vectorized approach using broadcasting -
# Get 3D offsetting array and add to inds for all indices
allinds = inds + np.arange(-60, 60)[:, None, None]

# Index into data with all indices for desired output
sliced_dataout = data[allinds]
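To see why this works, it may help to look at the shapes involved. The sketch below uses deliberately small, hypothetical sizes and a seeded generator (both are assumptions, not part of the original answer) and compares the broadcast result against the plain double loop.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the example is reproducible
data = rng.standard_normal(50)
winsize = 3
inds_size = (4, 5)
inds = rng.integers(low=winsize, high=len(data) - winsize, size=inds_size)

# Offsets have shape (2*winsize, 1, 1); inds has shape (4, 5).
# Broadcasting the sum gives an index array of shape (2*winsize, 4, 5).
offsets = np.arange(-winsize, winsize)[:, None, None]
allinds = inds + offsets
sliced_dataout = data[allinds]

# Reference result from the explicit double loop
expected = np.zeros((2 * winsize,) + inds_size)
for k in range(inds_size[0]):
    for l in range(inds_size[1]):
        expected[:, k, l] = data[inds[k, l] - winsize: inds[k, l] + winsize]

assert allinds.shape == (2 * winsize,) + inds_size
assert np.array_equal(sliced_dataout, expected)
```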
Runtime test -
In [20]: # generate some 1D data
    ...: data = np.random.randn(500)
    ...:
    ...: # window size (slices are 2*winsize long)
    ...: winsize = 60
    ...:
    ...: # number of slices to take from the data
    ...: inds_size = (100, 200)
    ...:
    ...: # get random integers that function as indices into the data
    ...: inds = np.random.randint(low=winsize, high=len(data)-winsize, size=inds_size)
    ...:

In [21]: %%timeit
    ...: sliced_data = np.zeros((winsize*2,) + inds_size)
    ...: for k in range(inds_size[0]):
    ...:     for l in range(inds_size[1]):
    ...:         sliced_data[:, k, l] = data[inds[k, l]-winsize:inds[k, l]+winsize]
    ...:
10 loops, best of 3: 66.9 ms per loop

In [22]: %%timeit
    ...: allinds = inds + np.arange(-60, 60)[:, None, None]
    ...: sliced_dataout = data[allinds]
    ...:
10 loops, best of 3: 24.1 ms per loop
If memory consumption is an issue, here's a compromise solution with one loop -
sliced_dataout = np.zeros((winsize * 2,) + inds_size)
for k in range(sliced_dataout.shape[0]):
    sliced_dataout[k] = data[inds - winsize + k]
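A rough way to see the memory trade-off is to compare the size of the full index array with the per-iteration index array. This is only a sketch under the sizes used in the runtime test above; the seeded generator and the variable names `full` and `looped` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the example is reproducible
data = rng.standard_normal(500)
winsize = 60
inds_size = (100, 200)
inds = rng.integers(low=winsize, high=len(data) - winsize, size=inds_size)

# Fully vectorized: materializes one integer index per output element
allinds = inds + np.arange(-winsize, winsize)[:, None, None]
full = data[allinds]

# One loop: each pass only needs a temporary index array shaped like inds
looped = np.zeros((2 * winsize,) + inds_size)
for k in range(looped.shape[0]):
    looped[k] = data[inds - winsize + k]

# Both variants produce the same output
assert np.array_equal(full, looped)

# Bytes held by the full index array vs. one per-iteration index array
print(allinds.nbytes, inds.nbytes)
```

The output array itself is the same size either way; the saving in the one-loop version comes from never holding the whole index array at once.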
Updated: January 28, 2021
avg_monthly_precip = numpy.array([0.70, 0.75, 1.85])
precip_2002_2013 = numpy.array([ [1.07, 0.44, 1.5], [0.27, 1.13, 1.72] ])
# Import necessary packages
import os
import numpy as np
import earthpy as et

# Download .txt with avg monthly precip (inches)
monthly_precip_url = 'https://ndownloader.figshare.com/files/12565616'
et.data.get_data(url=monthly_precip_url)

# Download .csv of precip data for 2002 and 2013 (inches)
precip_2002_2013_url = 'https://ndownloader.figshare.com/files/12707792'
et.data.get_data(url=precip_2002_2013_url)
'/root/earth-analytics/data/earthpy-downloads/monthly-precip-2002-2013.csv'
# Set working directory to earth-analytics
os.chdir(os.path.join(et.io.HOME, 'earth-analytics'))
Slicing in Python means taking elements from one given index to another given index.
We pass a slice instead of an index like this: [start:end].
From the second element, slice elements from index 1 to index 4 (not included):
From both elements, slice index 1 to index 4 (not included); this will return a 2-D array:
arr = np.array([10, 15, 20, 25, 30, 35, 40])
print(arr)
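The slicing cases described above can be sketched as follows. The 1-D array is the one from the text; the 2-D array `arr2d` is a hypothetical example introduced only for illustration.

```python
import numpy as np

arr = np.array([10, 15, 20, 25, 30, 35, 40])

# [start:end] takes indices start .. end-1, so this picks indices 1, 2, 3
print(arr[1:4])

# Hypothetical 2-D array with two rows ("elements")
arr2d = np.array([[1, 2, 3, 4, 5],
                  [6, 7, 8, 9, 10]])

# From the second row, slice index 1 to index 4 (not included): 1-D result
print(arr2d[1, 1:4])

# From both rows, slice index 1 to index 4 (not included): 2-D result
print(arr2d[0:2, 1:4])
```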
A slicing operation creates a view on the original array, which is just a way of accessing array data. Thus the original array is not copied in memory. You can use np.may_share_memory() to check if two arrays share the same memory block. Note however, that this uses heuristics and may give you false positives.

NumPy arrays can be indexed with slices, but also with boolean or integer arrays (masks). This method is called fancy indexing. It creates copies, not views.

Use fancy indexing on the left and array creation on the right to assign values into an array, for instance by setting parts of the array in the diagram above to zero.

When a new array is created by indexing with an array of integers, the new array has the same shape as the array of integers:
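The difference between views and copies can be checked with a short experiment. This is a minimal sketch; the arrays and index values are hypothetical and chosen only to make the behaviour visible.

```python
import numpy as np

# Slicing returns a view: writing through it changes the original
a = np.arange(10)
view = a[::2]
assert np.may_share_memory(a, view)
view[0] = 99
assert a[0] == 99

# Fancy indexing (integer list) returns a copy: the original is untouched
b = np.arange(10)
picked = b[[1, 3, 5]]
picked[0] = -1
assert b[1] == 1

# Fancy indexing on the left-hand side assigns into the original array
mask = b % 3 == 0
b[mask] = 0

# Indexing with an integer array gives a result shaped like the index array
idx = np.array([[2, 4], [6, 8]])
assert b[idx].shape == idx.shape
```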
>>> import numpy as np
>>> a = np.array([0, 1, 2, 3])
>>> a
array([0, 1, 2, 3])
In [1]: L = range(1000)
In [2]: %timeit [i**2 for i in L]
1000 loops, best of 3: 403 us per loop
In [3]: a = np.arange(1000)
In [4]: %timeit a**2
100000 loops, best of 3: 12.7 us per loop
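Outside IPython, a similar comparison can be sketched with the standard-library timeit module. The repeat count of 100 is an arbitrary assumption, and the measured times will differ from the figures above depending on machine and NumPy version.

```python
import timeit

import numpy as np

L = range(1000)
a = np.arange(1000)

# Both expressions compute the same squares
squares_list = [i ** 2 for i in L]
squares_arr = a ** 2
assert squares_list == squares_arr.tolist()

# Total time for 100 runs of each expression (values vary by machine)
t_list = timeit.timeit(lambda: [i ** 2 for i in L], number=100)
t_arr = timeit.timeit(lambda: a ** 2, number=100)
print(f"list comprehension: {t_list:.6f} s, NumPy: {t_arr:.6f} s")
```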
In [5]: np.array?
String Form:<built-in function array>
Docstring:
array(object, dtype=None, copy=True, order=None, subok=False, ndmin=0, ...
>>> np.lookfor('create array')
Search results for 'create array'
---------------------------------
numpy.array
    Create an array.
numpy.memmap
    Create a memory-map to an array stored in a *binary* file on disk.
In [6]: np.con*?
np.concatenate
np.conj
np.conjugate
np.convolve
>>> import numpy as np