how to work with data indexed by floats in pandas


pandas has no problem with a float lookup when the index has a single level (that is, when it is not a MultiIndex):

In[178]:

frame = frame.set_index(['a'])
frame.loc[1.2]

Out[178]:
      b     v
a
1.2  30   123
1.2  60  1234

If you do have a MultiIndex, you can generate a mask from the values of index level 0 (the first level) and use it to select the rows:

In[180]:

mask = frame.index.get_level_values(0)
frame.loc[mask == 1.2]

Out[180]:
           v
a   b
1.2 30   123
    60  1234

The mask itself contains all the level 0 values for each row:

In[181]:

mask

Out[181]:
Float64Index([1.2, 1.2, 3.0, 3.0], dtype='float64')
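The mask approach above can be tried end to end with the small sketch below. It rebuilds a frame with the same values as in this answer, then compares an exact `==` mask with a tolerant one built with `np.isclose`. The tolerant variant is an addition for illustration, not part of the original answer; it can help when the lookup value is the result of arithmetic rather than a literal.

```python
import numpy as np
import pandas as pd

# Same values as the example frame in this answer, indexed by (a, b).
frame = pd.DataFrame(
    {"a": [1.2, 1.2, 3.0, 3.0], "b": [30, 60, 30, 60], "v": [123, 1234, 12345, 123456]}
).set_index(["a", "b"])

# Level 0 ("a") value for every row.
level0 = frame.index.get_level_values(0)

# Exact comparison, as in the answer.
exact = frame.loc[level0 == 1.2]

# A computed float may differ from the literal in its last bits.
computed = 1.1 + 0.1  # 1.2000000000000002 in binary floating point
missed = frame.loc[level0 == computed]

# Tolerant comparison (assumption: default np.isclose tolerances are acceptable).
tolerant = frame.loc[np.isclose(level0.to_numpy(), computed)]

print(exact)
print(len(missed), len(tolerant))
```

Whether a tolerance is appropriate depends on how the index values were produced; for literals read straight from the data, the exact comparison is usually enough.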

Came across this while trying something similar and it worked without issue. Either the pandas library has improved, or you forgot that set_index returns a new DataFrame: you have to pass inplace=True or assign the result.

import pandas as pd

example_data = [
    {'a': 1.2, 'b': 30, 'v': 123},
    {'a': 1.2, 'b': 60, 'v': 1234},
    {'a': 3, 'b': 30, 'v': 12345},
    {'a': 3, 'b': 60, 'v': 123456},
]
frame = pd.DataFrame(example_data)
f2 = frame.set_index(['a', 'b'])  # <<<<< assign the result; frame itself is unchanged
print(f2)

             v
a   b
1.2 30     123
    60    1234
3.0 30   12345
    60  123456

Now f2.loc[1.2] works.

print(f2.loc[1.2])

       v
b
30   123
60  1234
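As a complement to selecting on the first level with `.loc`, the sketch below selects on the second level with `DataFrame.xs`. This is an illustrative addition, not something the answer itself proposes; the data mirrors `example_data` above.

```python
import pandas as pd

# Same data as example_data above, indexed by (a, b).
f2 = pd.DataFrame(
    {"a": [1.2, 1.2, 3.0, 3.0], "b": [30, 60, 30, 60], "v": [123, 1234, 12345, 123456]}
).set_index(["a", "b"])

# Cross-section: every row whose level "b" equals 30.
# By default xs drops the selected level, so the result is indexed by "a" only.
b30 = f2.xs(30, level="b")
print(b30)
```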

Suggestion : 2

Define, manipulate, and interconvert integers and floats in Python. How information is stored in a DataFrame or a Python object affects what we can do with it and the outputs of calculations as well. There are two main types of data that we will explore in this lesson: numeric and text data types. Getting back to our data, we can modify the format of values within our data, if we want. For instance, we could convert the record_id field to floating point values.

# Make sure pandas is loaded
import pandas as pd

# Note that pd.read_csv is used because we imported pandas as pd
surveys_df = pd.read_csv("data/surveys.csv")
type(surveys_df)               # pandas.core.frame.DataFrame
surveys_df['sex'].dtype        # dtype('O')
surveys_df['record_id'].dtype
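The conversion of record_id to floating point mentioned above could look like the following sketch. Because data/surveys.csv is not available here, a tiny hand-made DataFrame with made-up values stands in for it (an assumption for illustration only); the conversion uses `Series.astype`.

```python
import pandas as pd

# Hypothetical stand-in for surveys_df; the real data comes from data/surveys.csv.
surveys_df = pd.DataFrame({"record_id": [1, 2, 3], "sex": ["M", "F", "M"]})

# Convert the integer column to floating point and store it back.
surveys_df["record_id"] = surveys_df["record_id"].astype("float64")

print(surveys_df["record_id"].dtype)
```

astype returns a new Series, so the result has to be assigned back to the column for the change to stick.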

Suggestion : 3

Last Updated : 13 Apr, 2022

To select two rows and three columns, pass the row labels and the column labels to .loc as two separate lists, like this:

Dataframe.loc[["row1", "row2"], ["column1", "column2", "column3"]]

To select all of the rows and only some columns, use a single colon (:) in the row position and a list of the wanted column labels, like this:

Dataframe.loc[:, ["column1", "column2", "column3"]]
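Both selection patterns can be tried on a small frame. The sketch below uses made-up values, and the labels row1..row3 and column1..column4 are placeholders taken from the snippets above.

```python
import pandas as pd

# Small illustrative frame; values are arbitrary.
df = pd.DataFrame(
    {
        "column1": [1, 2, 3],
        "column2": [4, 5, 6],
        "column3": [7, 8, 9],
        "column4": [10, 11, 12],
    },
    index=["row1", "row2", "row3"],
)

# Two rows and three columns, each given as a list of labels.
two_rows = df.loc[["row1", "row2"], ["column1", "column2", "column3"]]

# All rows (single colon) and three columns.
all_rows = df.loc[:, ["column1", "column2", "column3"]]

print(two_rows)
print(all_rows)
```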

Suggestion : 4


You can create a pandas Index through its constructor. You can use any class from the above table to create an Index.

# Syntax of Index() constructor.
class pandas.Index(data=None, dtype=None, copy=False, name=None, tupleize_cols=True, **kwargs)
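As a minimal sketch of the constructor in use, the block below builds an Index from float values and uses it to label a Series. The values and the name "a" are chosen only for illustration.

```python
import pandas as pd

# Build an Index from floats; pandas infers a float64 dtype.
idx = pd.Index([1.2, 1.2, 3.0, 3.0], name="a")

# Use it as the row labels of a Series.
s = pd.Series([123, 1234, 12345, 123456], index=idx)

# A label lookup returns every row that carries that label.
rows = s.loc[1.2]

print(idx.dtype)
print(rows)
```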
import pandas as pd

s = pd.Series(['A', 'B', 'C', 'D', 'E'])
print(s)

# Outputs
# 0    A
# 1    B
# 2    C
# 3    D
# 4    E

This creates a Series with a default numerical index starting from zero. You can set the Index with the custom values while creating a Series object.

idx = ['idx1', 'idx2', 'idx3', 'idx4', 'idx5']
s = pd.Series(['A', 'B', 'C', 'D', 'E'], index=idx)
print(s)

# Outputs
# idx1    A
# idx2    B
# idx3    C
# idx4    D
# idx5    E
# dtype: object
# Create pandas DataFrame from List
import pandas as pd
technologies = [
   ["Spark", 20000, "30days"],
   ["pandas", 20000, "40days"],
]
df = pd.DataFrame(technologies)
print(df)

Since we are not giving labels to the columns or the rows (the index), DataFrame assigns incremental sequence numbers, starting at 0, as the default labels for both.

        0      1       2
0   Spark  20000  30days
1  pandas  20000  40days
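To replace the default numeric labels, column and row labels can be passed when the DataFrame is created. In the sketch below, the column names Courses, Fee, and Duration and the row labels r1 and r2 are hypothetical choices, not taken from the text above.

```python
import pandas as pd

technologies = [
    ["Spark", 20000, "30days"],
    ["pandas", 20000, "40days"],
]

# Hypothetical labels, passed via the columns and index parameters.
df = pd.DataFrame(
    technologies,
    columns=["Courses", "Fee", "Duration"],
    index=["r1", "r2"],
)

# The row labels can be read back as a plain Python list.
row_labels = df.index.tolist()

print(df)
print(row_labels)
```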