Comparing pandas.Series for equality when they are in different orders

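When two Series x and y carry the same labels in a different order, align them before comparing. You can reindex one Series to the other:

In [5]: x == y.reindex(x.index)
Out[5]:
A    True
B    True
C    True
dtype: bool

or sort both by index:

In [6]: x.sort_index() == y.sort_index()
Out[6]:
A    True
B    True
C    True
dtype: bool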
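
To collapse the element-wise result into a single yes/no answer, a small helper can be useful. The sketch below is an illustration rather than part of the original answer: the name same_labelled_values is hypothetical, and it assumes two Series count as equal when they map the same labels to the same values, regardless of row order.

```python
import pandas as pd


def same_labelled_values(x: pd.Series, y: pd.Series) -> bool:
    """Return True if x and y map the same labels to the same values.

    Hypothetical helper: sorts both Series by index so that row order
    no longer matters, then delegates to Series.equals, which also
    requires matching dtypes and treats NaNs in the same spot as equal.
    """
    return bool(x.sort_index().equals(y.sort_index()))


x = pd.Series(index=["A", "B", "C"], data=[1, 2, 3])
y = pd.Series(index=["C", "B", "A"], data=[3, 2, 1])

print(same_labelled_values(x, y))       # same labels and values, different order
print(same_labelled_values(x, y + 1))   # values differ for every label
```

Sorting both sides keeps the check symmetric; reindexing y to x.index would work as well, provided both Series carry exactly the same labels.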

Suggestion : 2

The equals() function allows two Series or DataFrames to be compared against each other to see if they have the same shape and elements. NaNs in the same location are considered equal.

Compare two DataFrame objects of the same shape and return a DataFrame where each element is True if the respective element in each DataFrame is equal, False otherwise.

Compare two Series objects of the same length and return a Series where each element is True if the element in each Series is equal, False otherwise.

DataFrames df and different_data_type have different types for the same values for their elements, and will return False even though their column labels are the same values and types.

>>> df = pd.DataFrame({1: [10], 2: [20]})
>>> df
    1   2
0  10  20

>>> exactly_equal = pd.DataFrame({1: [10], 2: [20]})
>>> exactly_equal
    1   2
0  10  20
>>> df.equals(exactly_equal)
True

>>> different_column_type = pd.DataFrame({1.0: [10], 2.0: [20]})
>>> different_column_type
   1.0  2.0
0   10   20
>>> df.equals(different_column_type)
True

>>> different_data_type = pd.DataFrame({1: [10.0], 2: [20.0]})
>>> different_data_type
      1     2
0  10.0  20.0
>>> df.equals(different_data_type)
False

Suggestion : 3

Last Updated : 25 Oct, 2020

Syntax: 

Series.equals(other)
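
A short illustration of what Series.equals(other) checks may help here. The Series below are made up for this sketch; it assumes only the documented behaviour that equals() compares index, values and dtype, and treats NaNs in the same location as equal.

```python
import numpy as np
import pandas as pd

a = pd.Series([1.0, np.nan, 3.0], index=["A", "B", "C"])
b = pd.Series([1.0, np.nan, 3.0], index=["A", "B", "C"])
c = pd.Series([3.0, np.nan, 1.0], index=["C", "B", "A"])  # same data, reversed order

same = a.equals(b)                     # identical labels, order and values
reordered = a.equals(c)                # index order differs, so this is a mismatch
after_sort = a.equals(c.sort_index())  # sorting restores the same order
int_vs_float = pd.Series([1, 2]).equals(pd.Series([1.0, 2.0]))  # dtypes differ

print(same, reordered, after_sort, int_vs_float)
```

In other words, equals() is order-sensitive, so Series that are "in different orders" need to be sorted or reindexed first.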

Suggestion : 4

@jreback I was answering this SO question: http://stackoverflow.com/questions/22983523/comparing-pandas-series-for-equality-when-they-are-in-different-orders/22983621#22983621. And I was wondering:

reported here again: http://stackoverflow.com/questions/30284415/why-do-pandas-comparison-operators-not-align-on-index/30285686#30285686

Out of interest, is there a way to figure out whether s1 and s2 are a view on the same underlying series and in this case have s1 == s2 do the current comparison? When s1 and s2 don't have anything to do with each other you would do the equivalent of aligning + comparison.

There don't seem to be any tests for pandas.Series.eq for two series with a different index in pandas/pandas/tests/test_series.py. I have a patch lying around to add such a test and I could commit it if that's useful.

import operator
import pandas

s1 = pandas.Series([1, 2], ['a', 'b'])
s2 = pandas.Series([2, 3], ['b', 'c'])
s1 == s2
s2 == s1

In [5]: s1 == s2
Out[5]:
a    False
b    False

In [6]: s2 == s1
Out[6]:
b    False
c    False

In [7]: s1.combine(s2, operator.eq)
Out[7]:
a    0
b    1
c    0

In [8]: s2.combine(s1, operator.eq)
Out[8]:
a    0
b    1
c    0

In [154]: x = pd.Series(index=["A", "B", "C"], data=[1, 2, 3])
In [155]: y = pd.Series(index=["C", "B", "A"], data=[3, 2, 1])
In [156]: x == y
Out[156]:
A    False
B     True
C    False
dtype: bool

In [157]: x.eq(y)
Out[157]:
A    False
B     True
C    False
dtype: bool

In [158]: x.to_frame() == y.to_frame()
Traceback (most recent call last):
   ...
ValueError: Can only compare identically-labeled DataFrame objects

In [159]: x.to_frame().eq(y.to_frame())
Out[159]:
      0
A  True
B  True
C  True

Starting in v0.8, pandas introduced binary comparison methods eq, ne, lt, gt, le, and ge to Series and DataFrame whose behavior is analogous to the binary arithmetic operations, which align on index labels:

pd.Series([1, 2, 3], index=list('ABC')) + pd.Series([2, 2, 2], index=list('ABD'))
# A    3.0
# B    4.0
# C    NaN
# D    NaN
# dtype: float64

pd.Series([1, 2, 3], index=list('ABC')) + pd.Series([2, 2, 2, 2], index=list('ABCD'))
# A    3.0
# B    4.0
# C    5.0
# D    NaN
# dtype: float64
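
Depending on the pandas version, s1 == s2 on differently labelled Series may compare by position (as in the transcripts above) or raise a ValueError. One way to get the "aligning + comparison" behaviour discussed in the issue, regardless of version, is to align explicitly first. This is a minimal sketch and not the fix adopted in pandas itself; it reuses s1 and s2 from the issue.

```python
import pandas as pd

s1 = pd.Series([1, 2], index=["a", "b"])
s2 = pd.Series([2, 3], index=["b", "c"])

# Outer-join both Series onto the union of their labels; labels missing
# from one side are filled with NaN.
left, right = s1.align(s2, join="outer")

# left and right now share one index, so the comparison is label-wise.
# NaN never compares equal, so labels present on only one side give False.
matches = left.eq(right)
print(matches)
```

Only label b exists in both Series with the same value, which agrees with the s1.combine(s2, operator.eq) result shown earlier.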

Suggestion : 5

In df1.equals(df2), df1 and df2 are the two dataframes you want to compare. Note that NaNs in the same location are considered equal.

NaNs and None are considered equal if they occur at the same location.

If the elements of df1 and df2 are the same but the column names are different, the two dataframes are not considered equal.

When two dataframes are exactly alike (1. the values and datatypes of the elements are the same, and 2. the values and datatypes of the row and column labels are the same), equals() returns True.

The pandas dataframe function equals() is used to compare two dataframes for equality. It returns True if the two dataframes have the same shape and elements. For two dataframes to be equal, the elements should have the same dtype. The column headers, however, do not need to have the same dtype. The following is the syntax:

df1.equals(df2)

1. Compare two exactly similar dataframes

import pandas as pd

# two identical dataframes
df1 = pd.DataFrame({
   'A': [1, 2],
   'B': ['x', 'y']
})
df2 = pd.DataFrame({
   'A': [1, 2],
   'B': ['x', 'y']
})

# print the two dataframes
print("DataFrame df1:")
print(df1)
print("\nDataFrame df2:")
print(df2)

# check if both are equal
print(df1.equals(df2))

Output:

DataFrame df1:
   A  B
0  1  x
1  2  y

DataFrame df2:
   A  B
0  1  x
1  2  y
True
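
The second example (dataframes containing NaN and None) survives here only as output; its code is missing. A plausible reconstruction, assuming both dataframes hold a NaN and a None in the same positions as the output below suggests, might look like this:

```python
import numpy as np
import pandas as pd

# Assumed reconstruction: both frames hold NaN and None in the same cells
df1 = pd.DataFrame({'A': [1, np.nan], 'B': ['x', None]})
df2 = pd.DataFrame({'A': [1, np.nan], 'B': ['x', None]})

# print the two dataframes
print("DataFrame df1:")
print(df1)
print("\nDataFrame df2:")
print(df2)

# check if both are equal
print("\nAre both equal?")
are_equal = df1.equals(df2)
print(are_equal)
```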

Output:

DataFrame df1:
     A     B
0  1.0     x
1  NaN  None

DataFrame df2:
     A     B
0  1.0     x
1  NaN  None

Are both equal?
True

3. Compare two dataframes with equal values but different dtypes

import pandas as pd
import numpy as np

# two dataframes with equal values but different dtypes in column A
df1 = pd.DataFrame({
   'A': [1, 2],
   'B': ['x', 'y']
})
df2 = pd.DataFrame({
   'A': [1.0, 2.0],
   'B': ['x', 'y']
})

# print the two dataframes
print("DataFrame df1:")
print(df1)
print("\nDataFrame df2:")
print(df2)

# check if both are equal
print("\nAre both equal?")
print(df1.equals(df2))

Output:

DataFrame df1:
   A  B
0  1  x
1  2  y

DataFrame df2:
     A  B
0  1.0  x
1  2.0  y

Are both equal?
False
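
When two dataframes differ only in row or column order, equals() returns False. One possible workaround, sketched below and not part of the original article, is pandas.testing.assert_frame_equal with check_like=True, which ignores the order of index and columns. The wrapper name frames_match is hypothetical.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def frames_match(left: pd.DataFrame, right: pd.DataFrame) -> bool:
    """Hypothetical wrapper: True if both frames hold the same labelled
    data, ignoring the order of rows and columns."""
    try:
        assert_frame_equal(left, right, check_like=True)
    except AssertionError:
        return False
    return True


df1 = pd.DataFrame({'A': [1, 2], 'B': ['x', 'y']})
df2 = df1.iloc[::-1]             # same rows, reversed order
df3 = df1.astype({'A': float})   # same values, different dtype in column A

print(df1.equals(df2))           # order-sensitive check
print(frames_match(df1, df2))    # order-insensitive check
print(frames_match(df1, df3))    # dtype difference is still detected
```

Like equals(), this check still distinguishes dtypes, because assert_frame_equal compares them by default.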

Suggestion : 6

Next, we’ll also need to construct some example data.

import pandas as pd  # Load pandas

data = pd.DataFrame({  # Create pandas DataFrame
   'x1': [1, 3, 2, 4, 7, 5],
   'x2': ['a', 'b', 'c', 'd', 'e', 'f'],
   'x3': range(1, 7)
})
print(data)  # Print pandas DataFrame

print(data['x1'].equals(data['x3']))  # Apply equals function
# False

print(data['x1'] == data['x3'])  # Apply logical operator
# 0     True
# 1    False
# 2    False
# 3     True
# 4    False
# 5    False
# dtype: bool

print(data['x1'].isin(data['x3']))  # Apply isin function
# 0     True
# 1     True
# 2     True
# 3     True
# 4    False
# 5     True
# Name: x1, dtype: bool