python: how to find the intersection between two lists based on object's id?

There's no need to intersect two sets. In this case you can just check if the id() exists in another set.

set2 = {id(n) for n in list2}
result = [n for n in list1 if id(n) in set2]

The complexity of this code is O(n1 + n2), where n1 and n2 are the lengths of list1 and list2. The following equivalent but more readable code shows where each term comes from:

set2 = {id(n) for n in list2}  # O(n2)
result = []
for n in list1:  # O(n1)
    if id(n) in set2:  # O(1)
        result.append(n)  # O(1)
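To make the snippet above runnable end to end, here is a minimal sketch. The Node class and the helper name intersect_by_id are assumptions made for illustration; the question does not show the real class.

```python
class Node:
    """Hypothetical stand-in for the Node class from the question."""

    def __init__(self, value):
        self.value = value


def intersect_by_id(list1, list2):
    """Return the items of list1 whose identity also occurs in list2.

    The order (and any duplicates) of list1 are preserved.
    """
    ids2 = {id(n) for n in list2}  # O(n2)
    return [n for n in list1 if id(n) in ids2]  # O(n1)


a, b, c = Node(1), Node(2), Node(3)
list1 = [a, b, c]
list2 = [b, Node(1), a]  # this Node(1) is a different object from `a`

result = intersect_by_id(list1, list2)
print([n.value for n in result])  # [1, 2]
```

Note that id() values are only guaranteed unique among objects that are alive at the same time, so both lists should stay referenced while the ids are being compared.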

There is also an alternative solution if you can change the Node class: define the __hash__ and __eq__ methods.

class Node:
    ...

    def __hash__(self):
        return id(self)

    def __eq__(self, another):
        return id(self) == id(another)

list1 = [...]
list2 = [...]

result = set(list1) & set(list2)

The solution you suggested, list(set(list1) & set(list2)), will work.

class Node:
    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return "Node {}".format(self.value)

nodes1 = [Node(1), Node(2), Node(3)]
nodes2 = nodes1[:2] + [Node(4)]

common_nodes = set(nodes1) & set(nodes2)

print(common_nodes)  # {Node 2, Node 1} (set order may vary)

You can check that the default hash is derived from id() with the following experiment. The exact relationship shown is a CPython implementation detail.

>>> obj = object()
>>> hash(obj)
155115580943
>>> id(obj)
2481849295088
>>> id(obj) // 16 == hash(obj)
True

Suggestion : 2

Last Updated : 01 Sep, 2021

Input:
lst1 = [15, 9, 10, 56, 23, 78, 5, 4, 9]
lst2 = [9, 4, 5, 36, 47, 26, 10, 45, 87]
Output: [9, 10, 4, 5]

Input:
lst1 = [4, 9, 1, 17, 11, 26, 28, 54, 69]
lst2 = [9, 9, 74, 21, 45, 11, 63, 28, 26]
Output: [9, 11, 26, 28]

Output:

[9, 11, 26, 28]
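The code that produced this output is not included here. As a minimal sketch, assuming a list comprehension with a set used for membership tests (the function name intersection is an assumption), the second input pair above gives the same result:

```python
def intersection(lst1, lst2):
    """Return the items of lst1 that also occur in lst2, in lst1's order."""
    lookup = set(lst2)  # set membership tests are O(1) on average
    return [value for value in lst1 if value in lookup]


lst1 = [4, 9, 1, 17, 11, 26, 28, 54, 69]
lst2 = [9, 9, 74, 21, 45, 11, 63, 28, 26]

result = intersection(lst1, lst2)
print(result)  # [9, 11, 26, 28]
```

This version keeps duplicates from lst1, so for the first input pair it returns [9, 10, 5, 4, 9] rather than the set-style output shown there.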

Working: The filter part takes each sublist’s item and checks to see if it is in the source list. The list comprehension is executed for each sublist in list2. 
Output:

[[13, 32], [7, 13, 28], [1, 6]]
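The filter-plus-comprehension approach described under "Working" can be sketched as follows. The function name and the input lists are assumptions, chosen so that the sketch reproduces the output shown; the original inputs are not part of this page.

```python
def intersection(lst1, lst2):
    """For each sublist of lst2, keep only the items that also occur in lst1."""
    return [list(filter(lambda x: x in lst1, sublist)) for sublist in lst2]


# Assumed inputs: a flat source list and a list of sublists.
lst1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
lst2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]

result = intersection(lst1, lst2)
print(result)  # [[13, 32], [7, 13, 28], [1, 6]]
```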

Suggestion : 3

I know that if I had two lists of, say, integers, I could simply do list(set(list1) & set(list2)) to get the intersection. However, in my two lists, I have mutable objects, namely Nodes. Node is a class that can be initialized with a value.

Without having to do a double for-loop, is there any way to get the intersection of two lists based on their ids? I'm looking for something similar to list(set(list1) & set(list2)).

There's no need to intersect two sets. In this case you can just check if the id() exists in another set.

The set-based approach also works because, despite being mutable, an instance of a class for which you did not define __hash__ or __eq__ will be hashed and compared by its id by default, since it inherits those methods from object.
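The default identity-based hashing and comparison described above can be checked directly, along with what changes once a class defines its own __eq__. The three classes below are hypothetical and exist only for this sketch.

```python
class Plain:
    """Defines neither __eq__ nor __hash__: compared and hashed by identity."""

    def __init__(self, value):
        self.value = value


class ByValue:
    """Defines value-based __eq__ together with a matching __hash__."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, ByValue) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


class EqOnly:
    """Defines __eq__ only; Python 3 then sets __hash__ to None."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, EqOnly) and self.value == other.value


p1, p2 = Plain(1), Plain(1)
plain_common = {p1} & {p2}  # distinct objects, so nothing in common

v1, v2 = ByValue(1), ByValue(1)
value_common = {v1} & {v2}  # equal values, so the set intersection has one item
identity_common = [v for v in [v1] if id(v) in {id(v2)}]  # id() check is still by identity

try:
    {EqOnly(1)}
    eq_only_hashable = True
except TypeError:
    eq_only_hashable = False  # instances cannot be put in a set at all

print(len(plain_common), len(value_common), len(identity_common), eq_only_hashable)
# 0 1 0 False
```

So the plain set intersection matches "intersection by id" only for classes that keep the default __eq__ and __hash__; the explicit id() check gives identity semantics regardless of how the class defines equality.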

Suggestion : 4

Last modified: December 31, 2020

Let's create two Lists of Strings with some intersection — both having some duplicated elements:

List<String> list = Arrays.asList("red", "blue", "blue", "green", "red");
List<String> otherList = Arrays.asList("red", "green", "green", "yellow");

And now we'll determine the intersection of the lists with the help of stream methods:

Set<String> result = list.stream()
    .distinct()
    .filter(otherList::contains)
    .collect(Collectors.toSet());

Set<String> commonElements = new HashSet<>(Arrays.asList("red", "green"));

Assert.assertEquals(commonElements, result);

Suggestion : 5

The intersection of two sets A and B is defined as the set that contains all the elements of A that also appear in B, but no other elements.

Intersect returns a sequence that contains the elements that form the set intersection of two sequences. When the object returned by this method is enumerated, Intersect yields distinct elements occurring in both sequences in the order in which they appear in first.

The second parameter is an IEnumerable<T> whose distinct elements that also appear in the first sequence will be returned.

public:
generic <typename TSource>
[System::Runtime::CompilerServices::Extension]
static System::Collections::Generic::IEnumerable<TSource> ^ Intersect(System::Collections::Generic::IEnumerable<TSource> ^ first, System::Collections::Generic::IEnumerable<TSource> ^ second);

public static System.Collections.Generic.IEnumerable<TSource> Intersect<TSource> (this System.Collections.Generic.IEnumerable<TSource> first, System.Collections.Generic.IEnumerable<TSource> second);

static member Intersect : seq<'Source> * seq<'Source> -> seq<'Source>

<Extension()>
Public Function Intersect(Of TSource) (first As IEnumerable(Of TSource), second As IEnumerable(Of TSource)) As IEnumerable(Of TSource)

' Create two integer arrays.
Dim id1() As Integer = {44, 26, 92, 30, 71, 38}
Dim id2() As Integer = {39, 59, 83, 47, 26, 4, 30}

' Find the set intersection of the two arrays.
Dim intersection As IEnumerable(Of Integer) = id1.Intersect(id2)

Dim output As New System.Text.StringBuilder
For Each id As Integer In intersection
    output.AppendLine(id)
Next

' Display the output.
Console.WriteLine(output.ToString)

' This code produces the following output:
'
' 26
' 30
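For comparison with the Python answers earlier on this page, the same behaviour (distinct common elements, in the order they appear in the first sequence) can be sketched in Python. This is an analogue written for illustration, not the .NET implementation, and the name intersect_ordered is an assumption.

```python
def intersect_ordered(first, second):
    """Yield distinct items of `first` that also occur in `second`,
    in the order they first appear in `first`. Items must be hashable."""
    remaining = set(second)
    for item in first:
        if item in remaining:
            remaining.remove(item)  # each common item is yielded only once
            yield item


id1 = [44, 26, 92, 30, 71, 38]
id2 = [39, 59, 83, 47, 26, 4, 30]

result = list(intersect_ordered(id1, id2))
print(result)  # [26, 30]
```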

If you want to compare sequences of objects of some custom data type, you have to implement the IEquatable<T> generic interface in a helper class. The following code example shows how to implement this interface in a custom data type and override GetHashCode and Equals methods.

public class ProductA : IEquatable<ProductA>
{
    public string Name { get; set; }
    public int Code { get; set; }

    public bool Equals(ProductA other)
    {
        if (other is null)
            return false;

        return this.Name == other.Name && this.Code == other.Code;
    }

    public override bool Equals(object obj) => Equals(obj as ProductA);
    public override int GetHashCode() => (Name, Code).GetHashCode();
}

public:
generic <typename TSource>
[System::Runtime::CompilerServices::Extension]
static System::Collections::Generic::IEnumerable<TSource> ^ Intersect(System::Collections::Generic::IEnumerable<TSource> ^ first, System::Collections::Generic::IEnumerable<TSource> ^ second, System::Collections::Generic::IEqualityComparer<TSource> ^ comparer);

public static System.Collections.Generic.IEnumerable<TSource> Intersect<TSource> (this System.Collections.Generic.IEnumerable<TSource> first, System.Collections.Generic.IEnumerable<TSource> second, System.Collections.Generic.IEqualityComparer<TSource> comparer);

public static System.Collections.Generic.IEnumerable<TSource> Intersect<TSource> (this System.Collections.Generic.IEnumerable<TSource> first, System.Collections.Generic.IEnumerable<TSource> second, System.Collections.Generic.IEqualityComparer<TSource>? comparer);

static member Intersect : seq<'Source> * seq<'Source> * System.Collections.Generic.IEqualityComparer<'Source> -> seq<'Source>

<Extension()>
Public Function Intersect(Of TSource) (first As IEnumerable(Of TSource), second As IEnumerable(Of TSource), comparer As IEqualityComparer(Of TSource)) As IEnumerable(Of TSource)

The following example shows how to implement an equality comparer that can be used in the Intersect method.

public class Product
{
    public string Name { get; set; }
    public int Code { get; set; }
}

// Custom comparer for the Product class
class ProductComparer : IEqualityComparer<Product>
{
    // Products are equal if their names and product numbers are equal.
    public bool Equals(Product x, Product y)
    {

        //Check whether the compared objects reference the same data.
        if (Object.ReferenceEquals(x, y)) return true;

        //Check whether any of the compared objects is null.
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;

        //Check whether the products' properties are equal.
        return x.Code == y.Code && x.Name == y.Name;
    }

    // If Equals() returns true for a pair of objects
    // then GetHashCode() must return the same value for these objects.

    public int GetHashCode(Product product)
    {
        //Check whether the object is null
        if (Object.ReferenceEquals(product, null)) return 0;

        //Get hash code for the Name field if it is not null.
        int hashProductName = product.Name == null ? 0 : product.Name.GetHashCode();

        //Get hash code for the Code field.
        int hashProductCode = product.Code.GetHashCode();

        //Calculate the hash code for the product.
        return hashProductName ^ hashProductCode;
    }
}
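Python has no direct counterpart to IEqualityComparer, but a key function can play a similar role. The sketch below is an illustrative analogue under that assumption: intersect_by, the Product class, and the sample data are all hypothetical. Passing key=id gives the identity-based intersection asked about at the top of this page.

```python
def intersect_by(first, second, key):
    """Return distinct items of `first` whose key also occurs among the keys
    of `second`, in the order they appear in `first`."""
    keys = {key(item) for item in second}
    result = []
    for item in first:
        k = key(item)
        if k in keys:
            keys.remove(k)  # keep only the first item for each matching key
            result.append(item)
    return result


class Product:
    """Hypothetical data class with no custom __eq__ or __hash__."""

    def __init__(self, name, code):
        self.name = name
        self.code = code


store1 = [Product("apple", 9), Product("orange", 4)]
store2 = [Product("apple", 9), Product("lemon", 12)]

# Compare by field values, similar in spirit to ProductComparer above.
by_value = intersect_by(store1, store2, key=lambda p: (p.name, p.code))
# Compare by identity: no object is shared between the two lists.
by_identity = intersect_by(store1, store2, key=id)

print([p.name for p in by_value], by_identity)  # ['apple'] []
```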

Suggestion : 6

Some operations are supported by several object types; in particular, practically all objects can be compared for equality, tested for truth value, and converted to a string (with the repr() function or the slightly different str() function). The latter function is implicitly used when an object is written by the print() function.

Two more operations with the same syntactic priority, in and not in, are supported by types that are iterable or implement the __contains__() method.

The implementation adds a few special read-only attributes to several object types, where they are relevant. Some of these are not reported by the dir() built-in function.

complex(re, im): a complex number with real part re, imaginary part im. im defaults to zero.

>>> n = -37
>>> bin(n)
'-0b100101'
>>> n.bit_length()
6

Equivalent to:

def bit_length(self):
    s = bin(self)        # binary representation: bin(-37) --> '-0b100101'
    s = s.lstrip('-0b')  # remove leading zeros and minus sign
    return len(s)        # len('100101') --> 6

>>> n = 19
>>> bin(n)
'0b10011'
>>> n.bit_count()
3
>>> (-n).bit_count()
3

Equivalent to:

def bit_count(self):
    return bin(self).count("1")

>>> (1024).to_bytes(2, byteorder='big')
b'\x04\x00'
>>> (1024).to_bytes(10, byteorder='big')
b'\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00'
>>> (-1024).to_bytes(10, byteorder='big', signed=True)
b'\xff\xff\xff\xff\xff\xff\xff\xff\xfc\x00'
>>> x = 1000
>>> x.to_bytes((x.bit_length() + 7) // 8, byteorder='little')
b'\xe8\x03'

>>> int.from_bytes(b'\x00\x10', byteorder='big')
16
>>> int.from_bytes(b'\x00\x10', byteorder='little')
4096
>>> int.from_bytes(b'\xfc\x00', byteorder='big', signed=True)
-1024
>>> int.from_bytes(b'\xfc\x00', byteorder='big', signed=False)
64512
>>> int.from_bytes([255, 0, 0], byteorder='big')
16711680

Suggestion : 7

np.intersect1d: Return the sorted, unique values that are in both of the input arrays.

assume_unique: If True, the input arrays are both assumed to be unique, which can speed up the calculation. If True but ar1 or ar2 are not unique, incorrect results and out-of-bounds indices could result. Default is False.

return_indices: If True, the indices which correspond to the intersection of the two arrays are returned. The first instance of a value is used if there are multiple. Default is False.

The last example below returns the indices of the values common to the input arrays along with the intersected values.

>>> np.intersect1d([1, 3, 4, 3], [3, 1, 2, 1])
array([1, 3])

>>> from functools import reduce
>>> reduce(np.intersect1d, ([1, 3, 4, 3], [3, 1, 2, 1], [6, 3, 4, 2]))
array([3])

>>> x = np.array([1, 1, 2, 3, 4])
>>> y = np.array([2, 1, 4, 6])
>>> xy, x_ind, y_ind = np.intersect1d(x, y, return_indices=True)
>>> x_ind, y_ind
(array([0, 2, 4]), array([1, 0, 2]))
>>> xy, x[x_ind], y[y_ind]
(array([1, 2, 4]), array([1, 2, 4]), array([1, 2, 4]))