You can overcome it with:
In[5]: x == y.reindex(x.index)
Out[5]:
A True
B True
C True
dtype: bool
or
In[6]: x.sort_index() == y.sort_index()
Out[6]:
A True
B True
C True
dtype: bool
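Both variants above only give a meaningful answer when x and y carry the same set of labels. As a rough sketch, the check can be wrapped in a small helper. The name `equal_ignoring_order` is hypothetical (it is not a pandas function), and the sketch assumes unique index labels and no NaN values:

```python
import pandas as pd


def equal_ignoring_order(a: pd.Series, b: pd.Series) -> bool:
    """Hypothetical helper: True if both Series hold the same label/value pairs."""
    # Different label sets can never match; checking first avoids NaN from reindex.
    if len(a) != len(b) or not a.index.sort_values().equals(b.index.sort_values()):
        return False
    # Reorder b to follow a's labels, then compare element by element.
    return bool((a == b.reindex(a.index)).all())


x = pd.Series([1, 2, 3], index=["A", "B", "C"])
y = pd.Series([3, 2, 1], index=["C", "B", "A"])
print(equal_ignoring_order(x, y))
```

The label check comes first because reindexing against a label that is missing from the other Series would silently introduce NaN.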
This function allows two Series or DataFrames to be compared against each other to see if they have the same shape and elements. NaNs in the same location are considered equal.
Compare two DataFrame objects of the same shape and return a DataFrame where each element is True if the respective element in each DataFrame is equal, False otherwise.
Compare two Series objects of the same length and return a Series where each element is True if the element in each Series is equal, False otherwise.
DataFrames df and different_data_type have different types for the same values for their elements, and will return False even though their column labels are the same values and types.
>>> df = pd.DataFrame({1: [10], 2: [20]})
>>> df
    1   2
0  10  20

>>> exactly_equal = pd.DataFrame({1: [10], 2: [20]})
>>> exactly_equal
    1   2
0  10  20
>>> df.equals(exactly_equal)
True

>>> different_column_type = pd.DataFrame({1.0: [10], 2.0: [20]})
>>> different_column_type
   1.0  2.0
0   10   20
>>> df.equals(different_column_type)
True

>>> different_data_type = pd.DataFrame({1: [10.0], 2: [20.0]})
>>> different_data_type
      1     2
0  10.0  20.0
>>> df.equals(different_data_type)
False
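To illustrate the point about NaNs in the same location, here is a minimal sketch contrasting the element-wise `==` operator with `equals()`. The variable names are chosen for illustration only:

```python
import numpy as np
import pandas as pd

a = pd.Series([1.0, np.nan, 3.0])
b = pd.Series([1.0, np.nan, 3.0])

# Element-wise comparison follows IEEE rules: NaN is never equal to NaN.
elementwise = a == b

# equals() treats NaNs in the same location as equal and returns one bool.
whole = bool(a.equals(b))

print(elementwise.tolist())
print(whole)
```

This is why `(a == b).all()` can report a difference for two objects that `equals()` considers identical.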
Last Updated : 25 Oct, 2020
Syntax:
Series.equals(other)
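A short sketch of how `Series.equals` responds to differences in dtype and index order, using made-up data:

```python
import pandas as pd

s = pd.Series([1, 2, 3])

# Same values, same dtype, same index.
same = bool(s.equals(pd.Series([1, 2, 3])))

# Same values stored as floats instead of integers.
other_dtype = bool(s.equals(pd.Series([1.0, 2.0, 3.0])))

# Same label/value pairs, but listed in a different order.
other_order = bool(s.equals(pd.Series([3, 2, 1], index=[2, 1, 0])))

print(same, other_dtype, other_order)
```

Only the first comparison is expected to return True, since `equals` does not align on the index and requires matching element dtypes.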
@jreback I was answering this SO question: http://stackoverflow.com/questions/22983523/comparing-pandas-series-for-equality-when-they-are-in-different-orders/22983621#22983621. And I was wondering:
Reported here again: http://stackoverflow.com/questions/30284415/why-do-pandas-comparison-operators-not-align-on-index/30285686#30285686
Out of interest, is there a way to figure out whether s1 and s2 are a view on the same underlying series and in this case have s1 == s2 do the current comparison? When s1 and s2 don't have anything to do with each other you would do the equivalent of aligning + comparison.
There don't seem to be any tests for pandas.Series.eq for two series with a different index in pandas/pandas/tests/test_series.py. I have a patch lying around to add such a test and I could commit it if that's useful.
import operator
import pandas
s1 = pandas.Series([1, 2], ['a', 'b'])
s2 = pandas.Series([2, 3], ['b', 'c'])
s1 == s2
s2 == s1
In[5]: s1 == s2
Out[5]:
a False
b False
In[6]: s2 == s1
Out[6]:
b False
c False
In[7]: s1.combine(s2, operator.eq)
Out[7]:
a 0
b 1
c 0
In[8]: s2.combine(s1, operator.eq)
Out[8]:
a 0
b 1
c 0
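The "aligning + comparison" idea from the thread can be sketched with `Series.align`, which gives both operands the union of the two indexes before the comparison. This is an illustration run against a recent pandas release, not the patch discussed in the issue:

```python
import pandas as pd

s1 = pd.Series([1, 2], index=["a", "b"])
s2 = pd.Series([2, 3], index=["b", "c"])

# align() reindexes both Series to the union of their labels (NaN where missing).
left, right = s1.align(s2)
aligned_eq = left == right

# The flexible method eq() performs the same alignment internally.
flex_eq = s1.eq(s2)

print(aligned_eq)
```

Labels present in only one Series compare as False because the missing side is NaN, which matches the pattern of the `combine` result above.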
In[154]: x = pd.Series(index = ["A", "B", "C"], data = [1, 2, 3])
In[155]: y = pd.Series(index = ["C", "B", "A"], data = [3, 2, 1])
In[156]: x == y
Out[156]:
A False
B True
C False
dtype: bool
In[157]: x.eq(y)
Out[157]:
A False
B True
C False
dtype: bool
In[158]: x.to_frame() == y.to_frame()
Traceback (most recent call last):
...
ValueError: Can only compare identically-labeled DataFrame objects
In[159]: x.to_frame().eq(y.to_frame())
Out[159]:
0
A True
B True
C True
Starting in v0.8, pandas introduced binary comparison methods eq, ne, lt, gt, le, and ge to Series and DataFrame whose behavior is analogous to the binary arithmetic operations described above:
pd.Series([1, 2, 3], index=list('ABC')) + pd.Series([2, 2, 2], index=list('ABD'))
# A    3.0
# B    4.0
# C    NaN
# D    NaN
# dtype: float64

pd.Series([1, 2, 3], index=list('ABC')) + pd.Series([2, 2, 2, 2], index=list('ABCD'))
# A    3.0
# B    4.0
# C    5.0
# D    NaN
# dtype: float64
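As a sketch of the same alignment in method form, the flexible methods `add` and `eq` can be applied to the first pair of Series above. The `fill_value=0` argument is an illustrative choice for treating missing labels as zero:

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=list("ABC"))
b = pd.Series([2, 2, 2], index=list("ABD"))

# add() aligns on the index; fill_value replaces a value missing on one side.
total = a.add(b, fill_value=0)

# eq() aligns the same way; labels missing on one side compare as False.
matches = a.eq(b)

print(total)
print(matches)
```

Both results are indexed by the union of the labels, A through D.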
Here, df1 and df2 are the two dataframes you want to compare. Note that NaNs in the same location are considered equal.
In the above example, you can see that NaNs and None are considered equal if they occur at the same location.
In the above example, we see that the elements of the dataframes df1 and df2 are the same, but since the column names are different, the two dataframes cannot be said to be equal.
In the above example, two dataframes df1 and df2 are compared for equality using the equals() method. Since the dataframes are exactly similar (1. values and datatypes of elements are the same and 2. values and datatypes of row and column labels are the same), True is returned.
The pandas dataframe function equals() is used to compare two dataframes for equality. It returns True if the two dataframes have the same shape and elements. For two dataframes to be equal, the elements should have the same dtype. The column headers, however, do not need to have the same dtype. The following is the syntax:
df1.equals(df2)
1. Compare two exactly similar dataframes
import pandas as pd

# two identical dataframes
df1 = pd.DataFrame({'A': [1, 2], 'B': ['x', 'y']})
df2 = pd.DataFrame({'A': [1, 2], 'B': ['x', 'y']})

# print the two dataframes
print("DataFrame df1:")
print(df1)
print("\nDataFrame df2:")
print(df2)

# check if both are equal
print(df1.equals(df2))

Output:
DataFrame df1:
   A  B
0  1  x
1  2  y

DataFrame df2:
   A  B
0  1  x
1  2  y
True
Output:
DataFrame df1:
     A     B
0  1.0     x
1  NaN  None

DataFrame df2:
     A     B
0  1.0     x
1  NaN  None

Are both equal?
True
3. Compare two dataframes with equal values but different dtypes
import pandas as pd

# two dataframes with equal values but different dtypes in column 'A'
df1 = pd.DataFrame({'A': [1, 2], 'B': ['x', 'y']})
df2 = pd.DataFrame({'A': [1.0, 2.0], 'B': ['x', 'y']})

# print the two dataframes
print("DataFrame df1:")
print(df1)
print("\nDataFrame df2:")
print(df2)

# check if both are equal
print("\nAre both equal?")
print(df1.equals(df2))
Output:
DataFrame df1:
   A  B
0  1  x
1  2  y

DataFrame df2:
     A  B
0  1.0  x
1  2.0  y

Are both equal?
False
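If only the values matter and the dtype difference should be ignored, one option is to cast one frame to the other's dtypes before calling equals(). The sketch below assumes the cast is lossless for the data at hand. It also uses `pandas.testing.assert_frame_equal` with `check_dtype=False`, which raises an AssertionError instead of returning a bool:

```python
import pandas as pd
from pandas.testing import assert_frame_equal

df1 = pd.DataFrame({'A': [1, 2], 'B': ['x', 'y']})
df2 = pd.DataFrame({'A': [1.0, 2.0], 'B': ['x', 'y']})

# Strict comparison: column 'A' is integer in df1 and float in df2.
strict = bool(df1.equals(df2))

# Cast df2 to df1's column dtypes, then compare again.
after_cast = bool(df1.equals(df2.astype(df1.dtypes.to_dict())))

# Test-style check that ignores dtypes; raises AssertionError on a value mismatch.
assert_frame_equal(df1, df2, check_dtype=False)

print(strict, after_cast)
```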
Next, we’ll also need to construct some example data.
import pandas as pd # Load pandas
data = pd.DataFrame({'x1': [1, 3, 2, 4, 7, 5],    # Create pandas DataFrame
                     'x2': ['a', 'b', 'c', 'd', 'e', 'f'],
                     'x3': range(1, 7)})
print(data)                                        # Print pandas DataFrame
print(data['x1'].equals(data['x3']))               # Apply equals function
# False
print(data['x1'] == data['x3'])                    # Apply logical operator
# 0     True
# 1    False
# 2    False
# 3     True
# 4    False
# 5    False
# dtype: bool
print(data['x1'].isin(data['x3']))                 # Apply isin function
# 0 True
# 1 True
# 2 True
# 3 True
# 4 False
# 5 True
# Name: x1, dtype: bool
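Once equals() reports a difference, the element-wise comparison can be reused as a boolean mask to pull out the rows that disagree. A minimal sketch on the same example data, where the variable name `mismatch` is chosen for illustration:

```python
import pandas as pd

data = pd.DataFrame({'x1': [1, 3, 2, 4, 7, 5],
                     'x2': ['a', 'b', 'c', 'd', 'e', 'f'],
                     'x3': range(1, 7)})

# Keep only the rows where x1 and x3 hold different values.
mismatch = data[data['x1'] != data['x3']]
print(mismatch)
```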