How to set the value of a cell in rows filtered by a condition in a Python DataFrame?

To change the values of a column for the rows that satisfy a condition, use DataFrame.loc with a boolean condition and the column name:

df.loc[df['B'] % 2 == 0, 'C'] = 5
print(df)
    A   B   C
0   1   2   5
1   4   5   6
2   7   8   5
3  10  11  12
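
The snippet above never shows how df was built. As a minimal sketch, the frame below is an assumption reconstructed from the printed output (column C is assumed to start as 3, 6, 9, 12), and the name even_b is illustrative only:

```python
import pandas as pd

# Assumed starting frame, reconstructed so the result matches the printed output.
df = pd.DataFrame({
    "A": [1, 4, 7, 10],
    "B": [2, 5, 8, 11],
    "C": [3, 6, 9, 12],
})

# Boolean mask: True for the rows where B is even.
even_b = df["B"] % 2 == 0

# Select rows by mask and the column by label in a single .loc call, then assign.
df.loc[even_b, "C"] = 5

print(df)
```

Because the row selection and the column label go through one .loc call, pandas writes to df itself rather than to a temporary object.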

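You could also change the order and index the column first. This is chained assignment, so pandas may raise a SettingWithCopyWarning, and with Copy-on-Write enabled (the default from pandas 3.0) df is left unchanged; prefer the df.loc form above. The chained form looks like this: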

df['C'][df['B'] % 2 == 0] = 5

Using numpy.where (requires import numpy as np):

df['C'] = np.where(df['B'] % 2 == 0, 5, df['C'])

Output

        A   B   C
    0   1   2   5
    1   4   5   6
    2   7   8   5
    3  10  11  12
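
To see how np.where picks values, here is a small sketch on the same assumed starting frame (C starting as 3, 6, 9, 12). It also shows Series.mask, a pandas method not used in the answer above, as an alternative that keeps the index:

```python
import numpy as np
import pandas as pd

# Assumed starting frame, reconstructed from the printed output.
df = pd.DataFrame({"A": [1, 4, 7, 10], "B": [2, 5, 8, 11], "C": [3, 6, 9, 12]})
cond = df["B"] % 2 == 0

# np.where(cond, x, y): take x where cond is True, otherwise take y.
via_where = np.where(cond, 5, df["C"])  # plain NumPy array

# Series.mask replaces values where cond is True and keeps the rest.
via_mask = df["C"].mask(cond, 5)        # pandas Series, index preserved

# Assigning the array back overwrites the whole column.
df["C"] = via_where
print(df)
```

Unlike df.loc, np.where rebuilds the entire column, so it is most convenient when every row gets a value from one of the two branches.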

Suggestion : 2

pandas supports several ways to filter rows by column value. DataFrame.query() is the most used: it evaluates an expression against the columns and returns a new DataFrame containing the matching rows. To update the existing DataFrame instead, pass inplace=True.

Alternatively, you can use boolean indexing with DataFrame[] or loc[], or DataFrame.apply(), which applies a function along an axis (each column by default) and can be combined with a boolean mask to keep the matching rows.

To filter out rows that have None or NaN in their column values, use DataFrame.dropna().

# Filter Rows using DataFrame.query()
df2 = df.query("Courses == 'Spark'")

#Using variable
value = 'Spark'
df2 = df.query("Courses == @value")

# In place (modifies df itself)
df.query("Courses == 'Spark'", inplace = True)

# Not equals, in & multiple conditions
df.query("Courses != 'Spark'")
df.query("Courses in ('Spark','PySpark')")
# Column names containing spaces must be wrapped in backticks, e.g. `Courses Fee`;
# the sample data below names this column Fee
df.query("Fee >= 23000")
df.query("Fee >= 23000 and Fee <= 24000")

# Other ways to Filter Rows
values = ['Spark', 'PySpark']  # list of course names to match
df.loc[df['Courses'] == value]
df.loc[df['Courses'] != 'Spark']
df.loc[df['Courses'].isin(values)]
df.loc[~df['Courses'].isin(values)]
df.loc[(df['Discount'] >= 1000) & (df['Discount'] <= 2000)]
df.loc[(df['Discount'] >= 1200) & (df['Fee'] >= 23000)]

df[df["Courses"] == 'Spark']
df[df['Courses'].str.contains("Spark")]
df[df['Courses'].str.lower().str.contains("spark")]
df[df['Courses'].str.startswith("P")]

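# apply() passes each column to the lambda; the boolean mask keeps the matching rows
df.apply(lambda col: col[df['Courses'].isin(['Spark', 'PySpark'])])
# Drop rows that contain None or NaN
df.dropna()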
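
The boolean-mask filters listed above can be checked on a small frame. This is a sketch only: it rebuilds the sample data this answer creates from a dictionary, and the contents of values are an assumption, since the listing does not show them:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    "Fee": [22000, 25000, 23000, 24000, 26000],
    "Duration": ["30days", "50days", "30days", None, np.nan],
    "Discount": [1000, 2300, 1000, 1200, 2500],
})

# Assumed list of course names to match.
values = ["Spark", "PySpark"]

# isin() builds a boolean mask; ~ negates it.
in_list = df.loc[df["Courses"].isin(values)]
not_in_list = df.loc[~df["Courses"].isin(values)]

# Combine conditions with & and wrap each comparison in parentheses.
mid_discount = df.loc[(df["Discount"] >= 1000) & (df["Discount"] <= 2000)]

# dropna() removes rows that contain None or NaN in any column.
complete = df.dropna()

print(in_list, not_in_list, mid_discount, complete, sep="\n\n")
```

Each of these expressions returns a new DataFrame and leaves df untouched.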

If you are a learner, let's create some sample data, run through these examples, and explore the output to understand them better. First, create a pandas DataFrame from a dictionary.

import pandas as pd
import numpy as np
technologies = {
   'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
   'Fee': [22000, 25000, 23000, 24000, 26000],
   'Duration': ['30days', '50days', '30days', None, np.nan],
   'Discount': [1000, 2300, 1000, 1200, 2500]
}
df = pd.DataFrame(technologies)
print(df)

# Filter all rows where Courses equals 'Spark'
df2 = df.query("Courses == 'Spark'")
print(df2)

To use a Python variable in the expression, prefix its name with the @ character.

# Filter Rows by using Python variable
value = 'Spark'
df2 = df.query("Courses == @value")
print(df2)

# Replace the existing DataFrame in place
df.query("Courses == 'Spark'", inplace = True)
print(df)
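
The difference between the two query() forms, and a way to tie query() back to the original question of setting a value on the filtered rows, can be sketched as follows. The three-row frame is a shortened copy of the sample data, and the new fee of 21000 is a made-up value for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": [22000, 25000, 23000],
})

# query() without inplace returns a new DataFrame; df keeps all its rows.
value = "Spark"
df2 = df.query("Courses == @value")
rows_before = len(df)

# To set a value on the matched rows, reuse their index labels with .loc.
matched = df2.index
df.loc[matched, "Fee"] = 21000  # hypothetical new fee

# With inplace=True, query() filters df itself and returns None.
df.query("Fee >= 23000", inplace=True)

print(df2)
print(df)
```

Note that df2 was created before the assignment, so it still holds the original fee for the Spark row.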

Suggestion : 3

 num_df.loc[num_df['a'] == 2]

Suggestion : 4

February 22, 2018 by cmdline

Let us first load gapminder data as a dataframe into pandas.

# load pandas
import pandas as pd
data_url = 'http://bit.ly/2cLzoxH'
# read data from url as pandas dataframe
gapminder = pd.read_csv(data_url)

This data frame has 1,704 rows and 6 columns. One of the columns is year. Let us look at the first three rows of the data frame.

print(gapminder.head(3))
       country  year         pop continent  lifeExp   gdpPercap
0  Afghanistan  1952   8425333.0      Asia   28.801  779.445314
1  Afghanistan  1957   9240934.0      Asia   30.332  820.853030
2  Afghanistan  1962  10267083.0      Asia   31.997  853.100710

For example, let us filter (subset) the dataframe to the rows where year is 2002. The comparison below produces a boolean Series that is True where the value of year equals 2002 and False otherwise.

# does year equal 2002?
# is_2002 is a boolean Series with True or False in it
is_2002 = gapminder['year'] == 2002

We can then use this boolean Series to filter the dataframe. After subsetting, we can see that the new dataframe is much smaller.

# filter rows for year 2002 using the boolean variable
gapminder_2002 = gapminder[is_2002]

Checking the shape or dimension of the filtered dataframe:

print(gapminder_2002.shape)
(142, 6)
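
The same two steps can be tried without downloading anything. This sketch types in the three rows printed by gapminder.head(3) above and filters on 1957 instead of 2002, since 2002 does not occur in those rows; the names gapminder_head, is_1957 and gapminder_1957 are illustrative only:

```python
import pandas as pd

# The three rows shown by gapminder.head(3), entered by hand.
gapminder_head = pd.DataFrame({
    "country": ["Afghanistan"] * 3,
    "year": [1952, 1957, 1962],
    "pop": [8425333.0, 9240934.0, 10267083.0],
    "continent": ["Asia"] * 3,
    "lifeExp": [28.801, 30.332, 31.997],
    "gdpPercap": [779.445314, 820.853030, 853.100710],
})

# Step 1: the comparison yields one True/False value per row.
is_1957 = gapminder_head["year"] == 1957

# Step 2: indexing with the mask keeps only the rows marked True.
gapminder_1957 = gapminder_head[is_1957]

print(is_1957.tolist())
print(gapminder_1957.shape)
```

The mask has the same index as the dataframe, which is how pandas knows which rows to keep.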