Boolean indexing is one efficient way to do this: create a mask over the columns, then index the columns with it to get the output -
# Generate mask for cols
mask = np.zeros(arr.shape[1], dtype=bool)
for (i, j) in selector:
    mask[i:j] = True

# Boolean index into cols for final output
out = arr[:, mask]
If there are many entries in selector, there's a broadcasting-based vectorized way to create the mask for cols, like so -
# selector must be a 2D NumPy array of (start, stop) pairs here
r = np.arange(arr.shape[1])
mask = ((selector[:, 0, None] <= r) & (selector[:, 1, None] > r)).any(0)
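To check that the two mask constructions agree, here is a minimal self-contained sketch. The contents of `arr` and `selector` are assumptions made up for illustration; the names `mask_loop` and `mask_bcast` are not from the original answer.

```python
import numpy as np

# Hypothetical example data: 2 rows, 12 columns
arr = np.arange(24).reshape(2, 12)
# Half-open (start, stop) column ranges to keep
selector = np.array([[0, 2], [6, 9]])

# Loop-based mask: mark every column that falls inside any range
mask_loop = np.zeros(arr.shape[1], dtype=bool)
for i, j in selector:
    mask_loop[i:j] = True

# Broadcasting-based mask: compare each range against every column index
r = np.arange(arr.shape[1])
mask_bcast = ((selector[:, 0, None] <= r) & (selector[:, 1, None] > r)).any(0)

# Boolean index into the columns
out = arr[:, mask_bcast]
print(out)
```

The broadcast version builds an intermediate array of shape (number of ranges, number of columns), so it trades memory for the removal of the Python loop.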
You can just create an indexing array from individual aranges:
slices = [[0, 2], [6, 9]]
np.concatenate([np.arange(*i) for i in slices])
# array([0, 1, 6, 7, 8])
and use it to extract the data:
arr[:, np.concatenate([np.arange(*i) for i in slices])]
# array([[ 0,  1,  6,  7,  8],
#        [12, 13, 18, 19, 20]])
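When the ranges are known up front, `np.r_` can build the same index array directly from slice notation. The sketch below assumes `arr` is `np.arange(24).reshape(2, 12)`, which is consistent with the output shown above but is not stated in the original answer.

```python
import numpy as np

# Assumed input, chosen to match the output shown in the answer
arr = np.arange(24).reshape(2, 12)
slices = [[0, 2], [6, 9]]

# Index array built from individual aranges
idx_concat = np.concatenate([np.arange(*i) for i in slices])

# Same index array written with np.r_ and literal slices
idx_r = np.r_[0:2, 6:9]

out = arr[:, idx_r]
print(idx_r)
print(out)
```

One difference from the mask approach: if ranges overlap, an index array repeats the shared columns, while a boolean mask selects each column once.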
The top-level function np.sort returns a sorted copy of an array instead of modifying the array in place. A quick-and-dirty way to compute the quantiles of an array is to sort it and select the value at a particular rank.
Transposing is a special form of reshaping which similarly returns a view on the underlying data without copying anything. Arrays have the transpose method and also the special T attribute.
Calling astype always creates a new array (a copy of the data), even if the new dtype is the same as the old dtype.
Selecting data from an array by boolean indexing always creates a copy of the data, even if the returned array is unchanged.
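The copy-versus-view statements above can be checked with `np.shares_memory`. This is a small sketch on made-up data; the variable names are illustrative only.

```python
import numpy as np

arr = np.array([3.0, 1.0, 2.0, 5.0, 4.0])

# np.sort returns a sorted copy; arr itself is left untouched
sorted_copy = np.sort(arr)

# Quick-and-dirty 50% quantile: value at the middle rank of the sorted copy
middle = sorted_copy[int(0.5 * len(sorted_copy))]

# Transposing returns a view on the same data
mat = np.arange(6).reshape(2, 3)
transpose_is_view = np.shares_memory(mat, mat.T)

# astype copies even when the dtype does not change
same_dtype = arr.astype(arr.dtype)
astype_is_copy = not np.shares_memory(arr, same_dtype)

# Boolean indexing copies even when every element is selected
all_true = np.ones(len(arr), dtype=bool)
selected = arr[all_true]
bool_index_is_copy = not np.shares_memory(arr, selected)

print(middle, transpose_is_view, astype_is_copy, bool_index_is_copy)
```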
In[83]: names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
In[84]: data = np.random.randn(7, 4)
In[85]: names
Out[85]:
array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'],
      dtype='|S4')
In[86]: data
Out[86]:
array([
[-0.048, 0.5433, -0.2349, 1.2792],
[-0.268, 0.5465, 0.0939, -2.0445],
[-0.047, -2.026, 0.7719, 0.3103],
[2.1452, 0.8799, -0.0523, 0.0672],
[-1.0023, -0.1698, 1.1503, 1.7289],
[0.1913, 0.4544, 0.4519, 0.5535],
[0.5994, 0.8174, -0.9297, -1.2564]
])
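With `names` and `data` set up this way, a comparison on `names` yields a boolean array that can select the matching rows of `data`. The sketch below uses a seeded generator instead of `np.random.randn`, so its values differ from the session above; `is_bob` and `bob_rows` are illustrative names.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])

# Seeded random data so the example is reproducible (values are arbitrary)
rng = np.random.default_rng(0)
data = rng.standard_normal((7, 4))

# Elementwise comparison gives one boolean per name
is_bob = names == 'Bob'

# Rows of data where the corresponding name is 'Bob' (rows 0 and 3)
bob_rows = data[is_bob]

print(is_bob)
print(bob_rows.shape)
```

The boolean array must have the same length as the axis it indexes, here the 7 rows of `data`.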
Output :
TypeError: can't multiply sequence by non-int of type 'list'
Output :
Array is: [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19]
a[-8:17:1] = [12 13 14 15 16]
a[10:] = [10 11 12 13 14 15 16 17 18 19]
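The code that produced these outputs is not shown here, so the following is a reconstruction under assumptions: that the array is `np.arange(20)`, and that the `TypeError` came from multiplying two Python lists, which is one way to get exactly that message.

```python
import numpy as np

a = np.arange(20)

# A negative start counts from the end: -8 is index 12; stop 17 is exclusive
s1 = a[-8:17:1]
# Omitting the stop runs to the end of the array
s2 = a[10:]

# Multiplying two plain lists is not elementwise and raises TypeError
try:
    [1, 2, 3] * [4, 5, 6]
except TypeError as exc:
    message = str(exc)

# NumPy arrays multiply elementwise
product = np.array([1, 2, 3]) * np.array([4, 5, 6])

print(s1, s2, message, product)
```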
Datasets are very similar to NumPy arrays. They are homogeneous collections of data elements, with an immutable datatype and (hyper)rectangular shape. Unlike NumPy arrays, they support a variety of transparent storage features such as compression, error-detection, and chunked I/O.
HDF5 datasets re-use the NumPy slicing syntax to read and write to the file. Slice specifications are translated directly to HDF5 "hyperslab" selections, and are a fast and efficient way to access data in the file. A number of slicing arguments are recognized.
read_direct: Read from an HDF5 dataset directly into a NumPy array, which can avoid making an intermediate copy as happens with slicing. The destination array must be C-contiguous and writable, and must have a datatype to which the source data may be cast. Data type conversion will be carried out on the fly by HDF5.
astype: Return a wrapper allowing you to read data as a particular type. Conversion is handled by HDF5 directly, on the fly:
>>> dset = f.create_dataset("default", (100,))
>>> dset = f.create_dataset("ints", (100,), dtype='i8')
>>> arr = np.arange(100)
>>> dset = f.create_dataset("init", data=arr)
>>> dset = f.create_dataset("MyDataset", (10, 10, 10), 'f')
>>> dset[0, 0, 0]
>>> dset[0, 2:10, 1:9:3]
>>> dset[:, ::2, 5]
>>> dset[0]
>>> dset[1, 5]
>>> dset[0, ...]
>>> dset[..., 6]
>>> dset[()]
>>> dset.fields("FieldA")[:10]  # Read a single field
>>> dset[:10]["FieldA"]  # Read all fields, select in NumPy
>>> dset[0, :, :] = np.arange(10)  # Broadcasts to (10, 10)
>>> f = h5py.File('my_hdf5_file.h5', 'w')
>>> dset = f.create_dataset("test", (2, 2))
>>> dset[0][1] = 3.0  # No effect!
>>> print(dset[0][1])
0.0
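The last example has no effect because `dset[0]` reads row 0 into a temporary NumPy array, and the assignment then modifies only that temporary. A sketch of the contrast with a single combined selection follows; it assumes h5py is installed, uses an in-memory file so nothing is written to disk, and the file name `example.h5` is a placeholder.

```python
import h5py
import numpy as np

# The "core" driver with backing_store=False keeps the file in memory only
with h5py.File("example.h5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset("test", (2, 2), dtype="f8")

    # Chained indexing: writes into a temporary copy of row 0
    dset[0][1] = 3.0
    after_chained = float(dset[0, 1])

    # Single selection: written to the dataset itself
    dset[0, 1] = 3.0
    after_direct = float(dset[0, 1])

    # read_direct fills a preallocated, C-contiguous, writable array
    dest = np.empty((2, 2), dtype="f8")
    dset.read_direct(dest)

print(after_chained, after_direct)
print(dest)
```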
A single label, e.g. 5 or 'a' (note that 5 is interpreted as a label of the index. This use is not an integer position along the index)
A list or array of labels ['a', 'b', 'c']
A slice object with labels 'a':'f' (note that contrary to usual python slices, both the start and the stop are included!)
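The three kinds of label input listed above can be illustrated with `.loc` on two small Series. This is a sketch using the current `import pandas as pd` convention and made-up data, not the session from the original documentation.

```python
import pandas as pd

# Integer labels that do not match positions
s = pd.Series([10, 20, 30, 40], index=[5, 6, 7, 8])
by_label = s.loc[5]       # 5 is a label here, not a position
by_position = s.iloc[3]   # positional access uses .iloc instead

# String labels
t = pd.Series(range(6), index=list("abcdef"))
picked = t.loc[["a", "b", "c"]]   # list of labels
sliced = t.loc["a":"d"]           # label slice includes both "a" and "d"

print(by_label, by_position)
print(picked.tolist(), sliced.tolist())
```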
In [1]: dates = date_range('1/1/2000', periods=8)
In [2]: df = DataFrame(randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
In [3]: df
Out[3]:
                   A         B         C         D
2000-01-01  0.469112 -0.282863 -1.509059 -1.135632
2000-01-02  1.212112 -0.173215  0.119209 -1.044236
2000-01-03 -0.861849 -2.104569 -0.494929  1.071804
2000-01-04  0.721555 -0.706771 -1.039575  0.271860
2000-01-05 -0.424972  0.567020  0.276232 -1.087401
2000-01-06 -0.673690  0.113648 -1.478427  0.524988
2000-01-07  0.404705  0.577046 -1.715002 -1.039268
2000-01-08 -0.370647 -1.157892 -1.344312  0.844885
In [4]: panel = Panel({'one' : df, 'two' : df - df.mean()})
In [5]: panel
Out[5]:
<class 'pandas.core.panel.Panel'>
Dimensions: 2 (items) x 8 (major_axis) x 4 (minor_axis)
Items axis: one to two
Major_axis axis: 2000-01-01 00:00:00 to 2000-01-08 00:00:00
Minor_axis axis: A to D
In[6]: s = df['A']
In[7]: s[dates[5]]
Out[7]: -0.67368970808837025
In[8]: panel['two']
Out[8]:
                   A         B         C         D
2000-01-01  0.409571  0.113086 -0.610826 -0.936507
2000-01-02  1.152571  0.222735  1.017442 -0.845111
2000-01-03 -0.921390 -1.708620  0.403304  1.270929
2000-01-04  0.662014 -0.310822 -0.141342  0.470985
2000-01-05 -0.484513  0.962970  1.174465 -0.888276
2000-01-06 -0.733231  0.509598 -0.580194  0.724113
2000-01-07  0.345164  0.972995 -0.816769 -0.840143
2000-01-08 -0.430188 -0.761943 -0.446079  1.044010
In[9]: df
Out[9]:
                   A         B         C         D
2000-01-01  0.469112 -0.282863 -1.509059 -1.135632
2000-01-02  1.212112 -0.173215  0.119209 -1.044236
2000-01-03 -0.861849 -2.104569 -0.494929  1.071804
2000-01-04  0.721555 -0.706771 -1.039575  0.271860
2000-01-05 -0.424972  0.567020  0.276232 -1.087401
2000-01-06 -0.673690  0.113648 -1.478427  0.524988
2000-01-07  0.404705  0.577046 -1.715002 -1.039268
2000-01-08 -0.370647 -1.157892 -1.344312  0.844885
In[10]: df[['B', 'A']] = df[['A', 'B']]
In[11]: df
Out[11]:
                   A         B         C         D
2000-01-01 -0.282863  0.469112 -1.509059 -1.135632
2000-01-02 -0.173215  1.212112  0.119209 -1.044236
2000-01-03 -2.104569 -0.861849 -0.494929  1.071804
2000-01-04 -0.706771  0.721555 -1.039575  0.271860
2000-01-05  0.567020 -0.424972  0.276232 -1.087401
2000-01-06  0.113648 -0.673690 -1.478427  0.524988
2000-01-07  0.577046  0.404705 -1.715002 -1.039268
2000-01-08 -1.157892 -0.370647 -1.344312  0.844885
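The column swap in `In[10]` can be reproduced on a small frame. The sketch below uses made-up values and the `pd` import convention. It also shows a `.loc` variant, where the right-hand side is converted to a plain array first because `.loc` aligns a DataFrame on column labels before assigning.

```python
import pandas as pd

df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [10.0, 20.0, 30.0]})

# Right-hand side is evaluated first, then assigned column by column in order
df[["B", "A"]] = df[["A", "B"]]

# With .loc, pass raw values so that no label alignment takes place
df2 = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [10.0, 20.0, 30.0]})
df2.loc[:, ["B", "A"]] = df2[["A", "B"]].to_numpy()

print(df)
print(df2)
```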
In[12]: sa = Series([1, 2, 3], index = list('abc'))
In[13]: dfa = df.copy()
In[14]: sa.b
Out[14]: 2
In[15]: dfa.A
Out[15]:
2000-01-01   -0.282863
2000-01-02   -0.173215
2000-01-03   -2.104569
2000-01-04   -0.706771
2000-01-05    0.567020
2000-01-06    0.113648
2000-01-07    0.577046
2000-01-08   -1.157892
Freq: D, Name: A, dtype: float64
In[16]: panel.one
Out[16]:
                   A         B         C         D
2000-01-01  0.469112 -0.282863 -1.509059 -1.135632
2000-01-02  1.212112 -0.173215  0.119209 -1.044236
2000-01-03 -0.861849 -2.104569 -0.494929  1.071804
2000-01-04  0.721555 -0.706771 -1.039575  0.271860
2000-01-05 -0.424972  0.567020  0.276232 -1.087401
2000-01-06 -0.673690  0.113648 -1.478427  0.524988
2000-01-07  0.404705  0.577046 -1.715002 -1.039268
2000-01-08 -0.370647 -1.157892 -1.344312  0.844885
In[17]: sa.a = 5
In[18]: sa
Out[18]:
a 5
b 2
c 3
dtype: int64
In[19]: dfa.A = list(range(len(dfa.index)))  # ok if A already exists
In[20]: dfa
Out[20]:
            A         B         C         D
2000-01-01  0  0.469112 -1.509059 -1.135632
2000-01-02  1  1.212112  0.119209 -1.044236
2000-01-03  2 -0.861849 -0.494929  1.071804
2000-01-04  3  0.721555 -1.039575  0.271860
2000-01-05  4 -0.424972  0.276232 -1.087401
2000-01-06  5 -0.673690 -1.478427  0.524988
2000-01-07  6  0.404705 -1.715002 -1.039268
2000-01-08  7 -0.370647 -1.344312  0.844885
In[21]: dfa['A'] = list(range(len(dfa.index))) # use this form to create a new column
In[22]: dfa
Out[22]:
            A         B         C         D
2000-01-01  0  0.469112 -1.509059 -1.135632
2000-01-02  1  1.212112  0.119209 -1.044236
2000-01-03  2 -0.861849 -0.494929  1.071804
2000-01-04  3  0.721555 -1.039575  0.271860
2000-01-05  4 -0.424972  0.276232 -1.087401
2000-01-06  5 -0.673690 -1.478427  0.524988
2000-01-07  6  0.404705 -1.715002 -1.039268
2000-01-08  7 -0.370647 -1.344312  0.844885
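The difference between the two assignment forms in `In[19]` and `In[21]` only shows up when the column does not exist yet. The sketch below uses a made-up frame and a hypothetical new column name `E`; pandas emits a warning for the attribute form, which is silenced here to keep the output clean.

```python
import warnings

import pandas as pd

dfa = pd.DataFrame({"A": [0.5, 1.5, 2.5], "B": [1.0, 2.0, 3.0]})

# Attribute assignment works because column A already exists
dfa.A = list(range(len(dfa.index)))

# For a name that is not a column, this only sets a Python attribute
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    dfa.E = [7, 8, 9]
has_e_after_attribute = "E" in dfa.columns

# Bracket assignment is the form that creates a new column
dfa["E"] = [7, 8, 9]
has_e_after_brackets = "E" in dfa.columns

print(has_e_after_attribute, has_e_after_brackets)
print(dfa.columns.tolist())
```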