pandas: return count of matching values between two dataframe variables

Use Series.value_counts to count the number of occurrences for each city in US['city'], and then use Series.map to apply those counts to corresponding values in UK['city']:

In [40]: US['city'].value_counts()
Out[40]:
Edinburgh    3
Bury         2
Hamilton     2
Name: city, dtype: int64

In [41]: UK['count'] = UK['city'].map(US['city'].value_counts())

In [42]: UK
Out[42]:
        city  count
0   Hamilton      2
1  Edinburgh      3
2       Bury      2
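
The session above can be reproduced as a standalone script. This is a minimal sketch, not the original poster's data: the contents of US and UK are made up to yield the same counts, and the extra "London" row is a hypothetical addition that shows what happens when a UK city never appears in US (map returns NaN, which is filled with 0 here).

```python
import pandas as pd

# Hypothetical sample data, chosen so the counts match the output shown above.
US = pd.DataFrame({"city": ["Edinburgh", "Bury", "Hamilton", "Edinburgh",
                            "Hamilton", "Bury", "Edinburgh"]})
# "London" is an assumed extra row with no match in US.
UK = pd.DataFrame({"city": ["Hamilton", "Edinburgh", "Bury", "London"]})

# Count how often each city occurs in US; the result is indexed by city name.
counts = US["city"].value_counts()

# Look up every UK city in those counts. Cities absent from US map to NaN,
# so replace NaN with 0 and convert back to an integer dtype.
UK["count"] = UK["city"].map(counts).fillna(0).astype(int)

print(UK)
```

Without the fillna step the column would be float, because NaN cannot be stored in a plain integer column.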

Suggestion : 3

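DataFrame.compare compares two identically labeled DataFrames and returns a DataFrame that shows the differences. By default the differences are stacked side by side. Its parameters control the layout of the result:

  • align_axis: with 1 or 'columns' (the default), the resulting differences are aligned horizontally, with columns drawn alternately from self and other. With 0 or 'index', the resulting differences are stacked vertically, with rows drawn alternately from self and other.
  • keep_shape: if true, all rows and columns are kept. Otherwise, only the ones with different values are kept.
  • keep_equal: if true, the result keeps values that are equal. Otherwise, equal values are shown as NaNs.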

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(
...     {
...         "col1": ["a", "a", "b", "b", "a"],
...         "col2": [1.0, 2.0, 3.0, np.nan, 5.0],
...         "col3": [1.0, 2.0, 3.0, 4.0, 5.0]
...     },
...     columns=["col1", "col2", "col3"],
... )
>>> df
  col1  col2  col3
0    a   1.0   1.0
1    a   2.0   2.0
2    b   3.0   3.0
3    b   NaN   4.0
4    a   5.0   5.0
>>> df2 = df.copy()
>>> df2.loc[0, 'col1'] = 'c'
>>> df2.loc[2, 'col3'] = 4.0
>>> df2
  col1  col2  col3
0    c   1.0   1.0
1    a   2.0   2.0
2    b   3.0   4.0
3    b   NaN   4.0
4    a   5.0   5.0
>>> df.compare(df2)
  col1       col3
  self other self other
0    a     c  NaN   NaN
2  NaN   NaN  3.0   4.0
>>> df.compare(df2, align_axis=0)
        col1  col3
0 self     a   NaN
  other    c   NaN
2 self   NaN   3.0
  other  NaN   4.0
>>> df.compare(df2, keep_equal=True)
  col1       col3
  self other self other
0    a     c  1.0   1.0
2    b     b  3.0   4.0
>>> df.compare(df2, keep_shape=True)
  col1       col2       col3
  self other self other self other
0    a     c  NaN   NaN  NaN   NaN
1  NaN   NaN  NaN   NaN  NaN   NaN
2  NaN   NaN  NaN   NaN  3.0   4.0
3  NaN   NaN  NaN   NaN  NaN   NaN
4  NaN   NaN  NaN   NaN  NaN   NaN
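
The result of compare can also be inspected programmatically, for example to list which rows and columns differ. The following is a rough sketch under a few assumptions: it needs pandas 1.1 or later (where DataFrame.compare is available), it reuses the df and df2 frames from the session above, and the names changed_rows and changed_cols are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "col1": ["a", "a", "b", "b", "a"],
        "col2": [1.0, 2.0, 3.0, np.nan, 5.0],
        "col3": [1.0, 2.0, 3.0, 4.0, 5.0],
    }
)
df2 = df.copy()
df2.loc[0, "col1"] = "c"
df2.loc[2, "col3"] = 4.0

# With the default arguments, only rows and columns that differ are kept.
diff = df.compare(df2)

# Row labels that contain at least one differing cell.
changed_rows = diff.index.tolist()

# The result has two column levels: the original column name, then
# "self"/"other". Level 0 therefore names the columns that differ.
changed_cols = diff.columns.get_level_values(0).unique().tolist()

print(changed_rows)
print(changed_cols)
```

Note that the NaN in col2 at row 3 is not reported as a difference, because compare treats NaN values in the same location as equal.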