pandas has no problem with this when the index has a single level, i.e. when it is not a MultiIndex:
In[178]:
frame = frame.set_index(['a'])
frame.loc[1.2]
Out[178]:
      b     v
a
1.2  30   123
1.2  60  1234
If you do have a MultiIndex, you can take the values of index level 0 (the first level), compare them against the key, and use the resulting boolean mask to select the rows:
In[180]:
mask = frame.index.get_level_values(0)
frame.loc[mask == 1.2]
Out[180]:
           v
a   b
1.2 30   123
    60  1234
The mask itself contains all the level 0 values for each row:
In[181]:
mask
Out[181]:
Float64Index([1.2, 1.2, 3.0, 3.0], dtype='float64')
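As a self-contained sketch of the same idea, the snippet below rebuilds the example frame from this thread and selects rows by comparing the level-0 values. The `np.isclose` variant is my own addition, not part of the original answer; it is only an assumption that a tolerance-based comparison may be preferable when the float keys come out of a computation rather than being typed literally.

```python
import numpy as np
import pandas as pd

# Same example data as used elsewhere in this thread
frame = pd.DataFrame(
    {
        "a": [1.2, 1.2, 3.0, 3.0],
        "b": [30, 60, 30, 60],
        "v": [123, 1234, 12345, 123456],
    }
).set_index(["a", "b"])

# One level-0 value per row of the frame
level0 = frame.index.get_level_values("a")

# Exact comparison, as in the answer above
selected = frame[level0 == 1.2]

# Tolerance-based comparison (an addition, not from the original answer)
selected_close = frame[np.isclose(level0.to_numpy(), 1.2)]

print(selected)
```

Both selections keep the full two-level index, so the result can still be indexed by `(a, b)` pairs afterwards.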
I came across this while trying something similar, and it worked without issue. Either pandas has improved since, or you are not keeping the result of set_index (assign it to a variable, or pass inplace=True).
import pandas as pd

example_data = [
    {'a': 1.2, 'b': 30, 'v': 123},
    {'a': 1.2, 'b': 60, 'v': 1234},
    {'a': 3, 'b': 30, 'v': 12345},
    {'a': 3, 'b': 60, 'v': 123456},
]
frame = pd.DataFrame(example_data)
f2 = frame.set_index(['a', 'b'])  # <<<<< the result is assigned
print(f2)
             v
a   b
1.2 30     123
    60    1234
3.0 30   12345
    60  123456
Now f2.loc[1.2] works.
print(f2.loc[1.2])
       v
b
30   123
60  1234
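For comparison, here is a small sketch showing that `DataFrame.xs` can express the same lookup while naming the level explicitly. This is an alternative I am adding for illustration, not something the answer above relies on; the data is the same `example_data` from this thread.

```python
import pandas as pd

example_data = [
    {"a": 1.2, "b": 30, "v": 123},
    {"a": 1.2, "b": 60, "v": 1234},
    {"a": 3, "b": 30, "v": 12345},
    {"a": 3, "b": 60, "v": 123456},
]
f2 = pd.DataFrame(example_data).set_index(["a", "b"])

# Partial indexing on the first level, as in the answer above
by_loc = f2.loc[1.2]

# Cross-section on level "a"; the selected level is dropped from the result
by_xs = f2.xs(1.2, level="a")

# A full (a, b) key addresses a single row
single_value = f2.loc[(1.2, 30), "v"]

print(by_xs)
print(single_value)
```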
Define, manipulate, and interconvert integers and floats in Python.
How information is stored in a DataFrame or a Python object affects what we can do with it and the outputs of calculations as well. There are two main types of data that we will explore in this lesson: numeric and text data types.
Getting back to our data, we can modify the format of values within our data if we want. For instance, we could convert the record_id field to floating point values.
# Make sure pandas is loaded
import pandas as pd

# Note that pd.read_csv is used because we imported pandas as pd
surveys_df = pd.read_csv("data/surveys.csv")
type(surveys_df)
pandas.core.frame.DataFrame
surveys_df['sex'].dtype
dtype('O')
surveys_df['record_id'].dtype
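The conversion of record_id to floating point mentioned above could look roughly like the sketch below. Because data/surveys.csv is not available here, the three-row frame is a hypothetical stand-in, and its values are made up purely for illustration.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of the surveys data (values invented)
surveys_df = pd.DataFrame({"record_id": [1, 2, 3], "sex": ["M", "F", "M"]})

# dtype before the conversion (an integer type)
dtype_before = surveys_df["record_id"].dtype

# astype returns a new Series, so assign it back to the column
surveys_df["record_id"] = surveys_df["record_id"].astype("float64")
dtype_after = surveys_df["record_id"].dtype

print(dtype_before, "->", dtype_after)
```

Note that `astype` does not modify the column in place; without the assignment the DataFrame would keep its integer column.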
Last Updated : 13 Apr, 2022
To select two rows and three columns, pass a list with the two row labels and a list with the three column labels:
Dataframe.loc[["row1", "row2"], ["column1", "column2", "column3"]]
To select all of the rows and only some columns, use a single colon (:) for the rows and a list of the column labels you want:
Dataframe.loc[:, ["column1", "column2", "column3"]]
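A runnable sketch of both selections follows. The frame, its row labels and its column labels are hypothetical, chosen only to mirror the placeholder names used in the patterns above.

```python
import pandas as pd

# Hypothetical frame whose labels match the placeholder names above
df = pd.DataFrame(
    {
        "column1": [1, 2, 3],
        "column2": [4, 5, 6],
        "column3": [7, 8, 9],
        "column4": [10, 11, 12],
    },
    index=["row1", "row2", "row3"],
)

# Two rows and three columns
two_rows = df.loc[["row1", "row2"], ["column1", "column2", "column3"]]

# All rows and three columns
all_rows = df.loc[:, ["column1", "column2", "column3"]]

print(two_rows)
print(all_rows)
```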
You can create a pandas Index through its constructor. You can use any class from the above table to create an Index.
# Syntax of Index() constructor.
class pandas.Index(data=None, dtype=None, copy=False, name=None, tupleize_cols=True, **kwargs)
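As a minimal sketch of the constructor in use, the snippet below builds an Index directly and hands it to a Series. The labels and the name "label" are hypothetical; only the `data` and `name` parameters from the signature above are exercised.

```python
import pandas as pd

# Build an Index explicitly; "label" is an illustrative name
idx = pd.Index(["idx1", "idx2", "idx3"], name="label")

# Use it as the row labels of a Series
s = pd.Series(["A", "B", "C"], index=idx)

# Label-based lookup through the custom index
value = s.loc["idx2"]

print(s)
print(value)
```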
s = pd.Series(['A', 'B', 'C', 'D', 'E'])
print(s)

# Outputs
# 0    A
# 1    B
# 2    C
# 3    D
# 4    E
This creates a Series with a default numerical index starting from zero. You can set the Index with the custom values while creating a Series object.
idx = ['idx1', 'idx2', 'idx3', 'idx4', 'idx5']
s = pd.Series(['A', 'B', 'C', 'D', 'E'], index=idx)
print(s)

# Outputs
# idx1    A
# idx2    B
# idx3    C
# idx4    D
# idx5    E
# dtype: object
# Create pandas DataFrame from List
import pandas as pd

technologies = [
    ["Spark", 20000, "30days"],
    ["pandas", 20000, "40days"],
]
df = pd.DataFrame(technologies)
print(df)
Since we are not giving labels to the columns and rows (index), DataFrame by default assigns incrementing integers, starting from zero, as labels for both the rows and the columns. The row labels are called the Index.
        0      1       2
0   Spark  20000  30days
1  pandas  20000  40days
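To replace the default labels, you can pass `columns` and `index` when creating the DataFrame, and then read the index back as a plain list. The sketch below assumes illustrative column names and row labels; none of them come from the original example.

```python
import pandas as pd

technologies = [
    ["Spark", 20000, "30days"],
    ["pandas", 20000, "40days"],
]

# Column names and row labels here are hypothetical, chosen for illustration
df = pd.DataFrame(
    technologies,
    columns=["Courses", "Fee", "Duration"],
    index=["r1", "r2"],
)

# Get the DataFrame index as a Python list
index_list = df.index.tolist()

print(df)
print(index_list)
```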