Just use something like df[df != 0] to get at the nonzero parts of your dataframe:
import pandas as pd
import numpy as np
np.random.seed(123)
df = pd.DataFrame(np.random.randint(0, 10, (5, 5)), columns = list('abcde'))
df
Out[11]:
a b c d e
0 2 2 6 1 3
1 9 6 1 0 1
2 9 0 0 9 3
3 4 0 0 4 1
4 7 3 2 4 7
df[df != 0] = 1
df
Out[13]:
a b c d e
0 1 1 1 1 1
1 1 1 1 0 1
2 1 0 0 1 1
3 1 0 0 1 1
4 1 1 1 1 1
As an unorthodox alternative, consider dividing the frame by itself:
%timeit (df / df == 1).astype(int)
1000 loops, best of 3: 449 µs per loop
%timeit df[df != 0] = 1
1000 loops, best of 3: 801 µs per loop
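The division trick works because 0 / 0 evaluates to NaN in pandas and NaN == 1 is False, so zeros stay 0 while every other cell becomes 1. A minimal sketch comparing it with a plain comparison against zero (the names flags and flags_div are illustrative and not from the original answer):

```python
import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame(np.random.randint(0, 10, (5, 5)), columns=list('abcde'))

# Comparing against zero gives a boolean frame; casting to int yields 0/1 flags.
flags = (df != 0).astype(int)

# The division trick: x / x is 1.0 for nonzero x, while 0 / 0 is NaN,
# and NaN == 1 is False, so zero cells end up as 0.
flags_div = (df / df == 1).astype(int)

print(flags.equals(flags_div))
```

Unlike df[df != 0] = 1, neither expression modifies df in place, which is worth keeping in mind when reading the timings.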
Create column 280 from column 279 for the class labels {1: Normal, 0: Arrhythmia}:
df[280] = df[279]
df.loc[df[280] != 1, 280] = 0
Replace values given in to_replace with value. Values of the DataFrame are replaced with other values dynamically. This differs from updating with .loc or .iloc, which require you to specify a location to update with some value. Dicts can be used to specify different replacement values for different existing values. For example, {'a': 'b', 'y': 'z'} replaces the value 'a' with 'b' and 'y' with 'z'. To use a dict in this way the value parameter should be None.
>>> s = pd.Series([1, 2, 3, 4, 5])
>>> s.replace(1, 5)
0 5
1 2
2 3
3 4
4 5
dtype: int64
>>> df = pd.DataFrame({'A': [0, 1, 2, 3, 4],
...                    'B': [5, 6, 7, 8, 9],
...                    'C': ['a', 'b', 'c', 'd', 'e']})
>>> df.replace(0, 5)
A B C
0 5 5 a
1 1 6 b
2 2 7 c
3 3 8 d
4 4 9 e
>>> df.replace([0, 1, 2, 3], 4)
A B C
0 4 5 a
1 4 6 b
2 4 7 c
3 4 8 d
4 4 9 e
>>> df.replace([0, 1, 2, 3], [4, 3, 2, 1])
A B C
0 4 5 a
1 3 6 b
2 2 7 c
3 1 8 d
4 4 9 e
>>> s.replace([1, 2], method = 'bfill')
0 3
1 3
2 3
3 4
4 5
dtype: int64
>>> df.replace({0: 10, 1: 100})
A B C
0 10 5 a
1 100 6 b
2 2 7 c
3 3 8 d
4 4 9 e
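A dict can also restrict the replacement to particular columns. The sketch below reuses the A/B/C frame from the examples above; the variable names per_column and nested are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [0, 1, 2, 3, 4],
    'B': [5, 6, 7, 8, 9],
    'C': ['a', 'b', 'c', 'd', 'e'],
})

# Keys are column names: replace 0 only in column A and 5 only in column B.
per_column = df.replace({'A': 0, 'B': 5}, 100)

# Nested dict: the outer key is the column, the inner dict maps old -> new.
nested = df.replace({'A': {0: 100, 4: 400}})

print(per_column)
print(nested)
```

In both calls replace() returns a new DataFrame and leaves df unchanged.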
Want to replace values in your DataFrame with something else? No problem. That is where pandas replace comes in. Pandas DataFrame.replace() is a small but powerful function that will replace (or swap) values in your DataFrame with another value. What starts as a simple function can quickly be expanded to cover most scenarios. Here we will find all instances of a single value in our DataFrame and replace them with something else.
YourDataFrame.replace(to_replace = 'what you want to replace',
    value = 'what you want to replace with')
import pandas as pd
df = pd.DataFrame({
'X': [1, 2, 3, 4, 5],
'Y': [5, 6, 7, 8, 9],
'Z': ['z', 'y', 'x', 'w', 'v']
})
df
df.replace(to_replace = 2, value = 20)
df.replace(to_replace = [1, 3, 5], value = 20)
df.replace(to_replace = [1, 3, 5], value = [10, 30, 50])
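The three calls above do not show their results, so here is a small sketch that stores each result under an illustrative name (single, many_to_one, pairwise) and prints it. Note that the replacement applies to every column where the value occurs, so the 5 in column Y is replaced as well.

```python
import pandas as pd

df = pd.DataFrame({
    'X': [1, 2, 3, 4, 5],
    'Y': [5, 6, 7, 8, 9],
    'Z': ['z', 'y', 'x', 'w', 'v'],
})

# One value replaced by one value.
single = df.replace(to_replace=2, value=20)

# Several values replaced by the same value.
many_to_one = df.replace(to_replace=[1, 3, 5], value=20)

# Values replaced pairwise: 1 -> 10, 3 -> 30, 5 -> 50.
pairwise = df.replace(to_replace=[1, 3, 5], value=[10, 30, 50])

print(single, many_to_one, pairwise, sep='\n\n')
```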
Updated: July 17, 2019
import pandas as pd
import numpy as np
df = pd.DataFrame({
'Date': ['11/8/2011', '11/9/2011', '11/10/2011',
'11/11/2011', '11/12/2011'
],
'Event': ['Dance', 'Painting', 'Dance', 'Dance', 'Painting']
})
df
df.loc[(df.Event == 'Dance'), 'Event'] = 'Hip-Hop'
df
df['Event'] = np.where((df.Event == 'Painting'), 'Art', df.Event)
df
df['Event'] = df['Event'].mask(df['Event'] == 'Hip-Hop', 'Jazz')
m = df.Event == 'Art'
df.where(~m, other = 'Theater')
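One thing to watch with DataFrame.where and a row-wise boolean Series: the replacement is written into every column of the non-matching rows, not only Event. A minimal sketch on a shortened, illustrative copy of the events data, contrasting that with a mask applied to the single column:

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['11/8/2011', '11/9/2011', '11/10/2011'],
    'Event': ['Dance', 'Art', 'Dance'],
})
m = df['Event'] == 'Art'

# where() keeps cells where the condition is True and writes `other` elsewhere.
# With a row-wise Series condition, the whole row is overwritten.
whole_rows = df.where(~m, other='Theater')

# mask() is the inverse of where(); applied to one column it leaves Date intact.
only_event = df.assign(Event=df['Event'].mask(m, 'Theater'))

print(whole_rows)
print(only_event)
```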
df = pd.DataFrame([
[1.4, 8],
[1.2, 5],
[0.3, 10]
],
index = ['China', 'India', 'USA'],
columns = ['Population(B)', 'Economy'])
Replace zero values with the column mean. You might want to replace those missing values (recorded as zeros) with the average value of your DataFrame column. In our case, we’ll modify the salary column. Here is a simple snippet that you can use: campaigns['salary'] = campaigns['salary'].replace(to_replace=0, value=campaigns['salary'].mean())

Replace column values with conditions in a pandas DataFrame. We can use boolean conditions to specify the targeted elements. df.loc[df.grades > 50, 'result'] = 'success' sets the result column to 'success' in every row where the grades value is greater than 50.

For a DataFrame, a dict can specify that different values should be replaced in different columns. For example, {'a': 1, 'b': 'z'} looks for the value 1 in column 'a' and the value 'z' in column 'b' and replaces these values with whatever is specified in value. The value parameter should not be None in this case.

Replace a single value with a new value for an entire DataFrame: df = df.replace(['old value'], 'new value'). In the next section, you’ll see how to apply the above templates in practice.

Steps to Replace Values in Pandas DataFrame. Step 1: Gather your data. To begin, gather your data with the values that you’d like to replace.
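When zeros stand in for missing salaries, the plain column mean is pulled down by those zeros. The sketch below uses made-up numbers in a hypothetical campaigns frame to show the difference and fills the zeros with the mean of the nonzero entries instead; whether that is the right choice depends on your data.

```python
import pandas as pd

# Hypothetical data: two salaries are recorded as 0.
campaigns = pd.DataFrame({'salary': [0, 3000, 5000, 0, 4000]})

# Mean over all rows, zeros included.
mean_all = campaigns['salary'].mean()

# Mean over the nonzero rows only.
mean_nonzero = campaigns.loc[campaigns['salary'] != 0, 'salary'].mean()

# Assign the result to a new column rather than relying on inplace=True.
campaigns['salary_filled'] = campaigns['salary'].replace(0, mean_nonzero)

print(mean_all, mean_nonzero)
print(campaigns)
```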
3, 0, 1, 0, 0
11, 0, 0, 0, 0
1, 0, 0, 0, 0
0, 0, 0, 0, 4
13, 1, 1, 5, 0
After replacing every nonzero value with 1:
1, 0, 1, 0, 0
1, 0, 0, 0, 0
1, 0, 0, 0, 0
0, 0, 0, 0, 1
1, 1, 1, 1, 0
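One way to produce that result without modifying the original frame is DataFrame.where, which keeps the cells where a condition holds and writes a replacement everywhere else. A minimal sketch using the sample rows shown here (the name binary is illustrative):

```python
import pandas as pd

df = pd.DataFrame([
    [3, 0, 1, 0, 0],
    [11, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 4],
    [13, 1, 1, 5, 0],
])

# Keep the zeros (condition True) and write 1 into every other cell.
binary = df.where(df == 0, 1)

print(binary)
```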
August 25, 2021 | March 8, 2022
To start things off, let’s begin by loading a Pandas dataframe. We’ll keep things simple so it’s easier to follow exactly what we’re replacing.
import pandas as pd
df = pd.DataFrame.from_dict({
'Name': ['Jane', 'Melissa', 'John', 'Matt'],
'Age': [23, 45, 35, 64],
'Birth City': ['London', 'Paris', 'Toronto', 'Atlanta'],
'Gender': ['F', 'F', 'M', 'M']
})
print(df)
This returns the following dataframe:
Name Age Birth City Gender
0 Jane 23 London F
1 Melissa 45 Paris F
2 John 35 Toronto M
3 Matt 64 Atlanta M
The Pandas .replace() method takes a number of different parameters. Let’s take a look at them:
DataFrame.replace(
to_replace = None,
value = None,
inplace = False,
limit = None,
regex = False,
method = 'pad')
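Of these parameters, regex is the one that changes how to_replace is interpreted. A small sketch on the names from the sample frame, assuming you want to collapse every name starting with "J" to an initial (the pattern and the name initials are illustrative). Recent pandas releases appear to have deprecated the method and limit arguments, so check the documentation for your installed version before relying on them.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jane', 'Melissa', 'John', 'Matt'],
    'Birth City': ['London', 'Paris', 'Toronto', 'Atlanta'],
})

# With regex=True, to_replace is treated as a regular expression and the
# matched text is substituted with value.
initials = df['Name'].replace(to_replace=r'^J.*', value='J.', regex=True)

# inplace defaults to False, so df itself is unchanged.
print(initials)
print(df['Name'])
```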
Of course, you could simply run the method twice, but there’s a much more efficient way to accomplish this. Here, we’ll look to replace London and Paris with Europe:
df['Birth City'] = df['Birth City'].replace(
to_replace = ['London', 'Paris'],
value = 'Europe')
print(df)
In the example below, we’ll replace London with England and Paris with France:
df['Birth City'] = df['Birth City'].replace(
to_replace = ['London', 'Paris'],
value = ['England', 'France'])
print(df)
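The same pairwise replacement can be written with a dict, which keeps each old value next to its new value. A minimal sketch starting from the original city names (city_to_country is an illustrative name):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jane', 'Melissa', 'John', 'Matt'],
    'Birth City': ['London', 'Paris', 'Toronto', 'Atlanta'],
})

# Each key is a value to find; each dict value is its replacement.
city_to_country = {'London': 'England', 'Paris': 'France'}
df['Birth City'] = df['Birth City'].replace(city_to_country)

print(df)
```

Cities that are not keys in the dict, such as Toronto and Atlanta, are left as they are.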
You can replace all values or selected values in a column of a pandas DataFrame based on a condition by using DataFrame.loc[], np.where(), and DataFrame.mask(). The loc[] property accesses a group of rows and columns by label(s) or a boolean array, and it can both read and modify the values of a pandas DataFrame. If you are in a hurry, below are some quick examples of replacing column values based on a condition.
# Below are some quick examples.

# Replace values of a column by using the DataFrame.loc[] property.
df.loc[df['Fee'] > 22000, 'Fee'] = 1

# Replace values of a given column by using the np.where() function.
df['Fee'] = np.where(df['Fee'] > 22000, 1, df['Fee'])

# Check multiple conditions.
df['Fee'] = np.where((df['Fee'] >= 22000) & (df['Courses'] == 'PySpark'), 14000, df['Fee'])

# Use the mask() method; assign the result back instead of relying on inplace=True.
df['Fee'] = df['Fee'].mask(df['Fee'] >= 22000, 0)
Now, let’s create a pandas DataFrame with a few rows and columns and run some examples that update all or selected values in a column. Our DataFrame contains the columns Courses, Fee, Duration, and Discount.
# Create a pandas DataFrame.
import pandas as pd
import numpy as np
technologies = {
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
    'Discount': [1000, 2300, 1200, 2000]
}
index_labels = ['r1', 'r2', 'r3', 'r4']
df = pd.DataFrame(technologies, index = index_labels)
print(df)
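This yields the following output.
Courses Fee Duration Discount
r1 Spark 20000 30days 1000
r2 PySpark 25000 40days 2300
r3 Python 22000 35days 1200
r4 pandas 30000 50days 2000
The next output is consistent with replacing every Fee value greater than 22000 with 15000, for example with df.loc[df['Fee'] > 22000, 'Fee'] = 15000:
Courses Fee Duration Discount
r1 Spark 20000 30days 1000
r2 PySpark 15000 40days 2300
r3 Python 22000 35days 1200
r4 pandas 15000 50days 2000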
Another way to replace column values based on a condition is the numpy.where() function. Called with three arguments, np.where(condition, x, y) returns an array whose elements are taken from x where the condition is True and from y otherwise; only the one-argument form returns the indices at which the condition holds. NumPy is a very popular library for calculations with n-dimensional arrays.
# Replace values of a given column by using the np.where() function.
df = pd.DataFrame(technologies, index = index_labels)
df['Fee'] = np.where(df['Fee'] >= 22000, 15000, df['Fee'])
print(df)
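For comparison, here is a short sketch of the same condition written with np.where and with Series.mask, on a reduced copy of the course data (via_numpy and via_mask are illustrative names). The values agree; the difference is that np.where returns a plain NumPy array, while mask returns a Series that keeps the r1..r4 index.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'Courses': ['Spark', 'PySpark', 'Python', 'pandas'],
     'Fee': [20000, 25000, 22000, 30000]},
    index=['r1', 'r2', 'r3', 'r4'],
)

# np.where(condition, x, y): take x where the condition is True, else y.
via_numpy = np.where(df['Fee'] >= 22000, 15000, df['Fee'])

# Series.mask replaces values where the condition is True and keeps the index.
via_mask = df['Fee'].mask(df['Fee'] >= 22000, 15000)

print(via_numpy)
print(via_mask)
```

Note that with >= the 22000 fee in row r3 is replaced too, unlike a strict > comparison.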