Fast algorithm to compute Adamic-Adar


I believe you are using a rather slow approach. It would be better to invert it:
- initialize the AA (Adamic-Adar) matrix with zeros
- for every node k, get its degree k_deg
- compute d = log(1.0/k_deg) (the question uses log10; is the base important or not?)
- add d to all AA[i, j], where i, j are all pairs of 1s in the kth row of the adjacency matrix
Edit:
- for sparse graphs it is useful to extract the positions of all 1s in the kth row into a list, which gives O(V*(V+E)) complexity instead of O(V^3)

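import numpy as np

# A is the N x N adjacency matrix with 0/1 entries
N = A.shape[0]
AA = np.zeros((N, N))
for k in range(N):
    # positions of all 1s in row k, i.e. the neighbors of node k
    adj_list = [j for j in range(N) if A[k, j] == 1]
    k_deg = len(adj_list)
    if k_deg < 2:
        continue  # fewer than two neighbors: no pair to update
    d = np.log(1.0 / k_deg)
    for j in range(len(adj_list) - 1):
        for i in range(j + 1, len(adj_list)):
            AA[adj_list[i], adj_list[j]] += d
# only half of the matrix is filled; it is symmetric for an undirected graph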
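
Note that the weight log(1/k_deg) above follows the question's code. The index as usually defined (and as the NetworkX example further down computes it) adds 1/log(k_deg) for each shared neighbor. A minimal sketch of the same "invert the loop" idea with that weight, assuming the graph is given as a dict of neighbor sets; the name adamic_adar_scores is hypothetical, not from the answer:

```python
import math
from itertools import combinations

def adamic_adar_scores(neighbors):
    """neighbors: dict mapping each node to the set of its neighbors (undirected graph)."""
    scores = {}
    for k, nbrs in neighbors.items():
        if len(nbrs) < 2:
            continue  # a node with fewer than two neighbors is shared by no pair
        weight = 1.0 / math.log(len(nbrs))  # natural log; assumed, as in NetworkX
        # node k is a common neighbor of every pair of its own neighbors
        for i, j in combinations(sorted(nbrs), 2):
            scores[(i, j)] = scores.get((i, j), 0.0) + weight
    return scores

# Complete graph on 5 nodes: every pair shares 3 neighbors of degree 4
k5 = {u: {v for v in range(5) if v != u} for u in range(5)}
scores = adamic_adar_scores(k5)
print(round(scores[(0, 1)], 8))  # 3 / ln(4) = 2.16404256
```

Only pairs with at least one common neighbor get an entry, which keeps the result sparse for sparse graphs.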

Since you're using numpy, you can really cut down on the need to iterate for every operation in the algorithm. My numpy- and vectorization-fu aren't the greatest, but the code below runs in around 2.5 s on a graph with ~13,000 nodes:

import numpy as np

def adar_adamic(adj_mat):
    """Computes Adar-Adamic similarity matrix for an adjacency matrix"""

    Adar_Adamic = np.zeros(adj_mat.shape)
    for row in adj_mat:
        AdjList = row.nonzero()[0]  # column indices with nonzero values
        k_deg = len(AdjList)
        if k_deg == 0:
            continue  # isolated node: avoids dividing by zero
        d = np.log(1.0 / k_deg)  # this row's AA score

        # add this row's score to the entry of every pair of its neighbors
        for i in range(len(AdjList)):
            for j in range(len(AdjList)):
                if AdjList[i] != AdjList[j]:
                    cell = (AdjList[i], AdjList[j])
                    Adar_Adamic[cell] = Adar_Adamic[cell] + d

    return Adar_Adamic

I don't see a way of reducing the time complexity, but it can be vectorized:

degrees = A.sum(axis=0)
weights = np.log10(1.0 / degrees)
adamic_adar = (A * weights).dot(A.T)

Here A is a regular NumPy array. It seems you're using graph_tool.spectral.adjacency, so A would be a sparse matrix. In that case the code would be:

from scipy.sparse import csr_matrix

degrees = A.sum(axis=0)
weights = csr_matrix(np.log10(1.0 / degrees))
adamic_adar = A.multiply(weights) * A.T
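
The vectorized snippets above keep the question's log10(1/degree) weight. As a hedged variant, the sketch below uses the same matrix product with the usual 1/log(degree) weight instead. It assumes a dense, symmetric 0/1 NumPy array without self-loops and the natural logarithm; the function name adamic_adar_dense is made up for illustration:

```python
import numpy as np

def adamic_adar_dense(A):
    """A: symmetric 0/1 adjacency matrix (dense NumPy array, no self-loops)."""
    A = np.asarray(A, dtype=float)
    degrees = A.sum(axis=0)
    weights = np.zeros_like(degrees)
    # Nodes of degree 0 or 1 cannot be a common neighbor of two distinct nodes,
    # so their weight stays 0 (this also avoids dividing by log(1) = 0).
    mask = degrees > 1
    weights[mask] = 1.0 / np.log(degrees[mask])
    # scores[i, j] = sum over k of A[i, k] * weights[k] * A[j, k]
    scores = (A * weights) @ A.T
    np.fill_diagonal(scores, 0.0)  # a node paired with itself is not meaningful
    return scores

# Complete graph on 5 nodes
A = np.ones((5, 5)) - np.eye(5)
S = adamic_adar_dense(A)
print(np.round(S[0, 1], 8))  # 3 / ln(4) = 2.16404256
```

The time complexity is that of the dense matrix product, so for large sparse graphs the scipy.sparse form above is likely the better starting point.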

Suggestion : 2

Adamic Adar is a measure used to compute the closeness of nodes based on their shared neighbors. This algorithm is in the alpha tier. For more information on algorithm tiers, see Graph algorithms.

The Adamic Adar algorithm was introduced in 2003 by Lada Adamic and Eytan Adar to predict links in a social network. It is computed using the following formula:

$A(x, y) = \sum_{u \in N(x) \cap N(y)} \frac{1}{\log |N(u)|}$

where N(u) is the set of nodes adjacent to u.

The direction parameter is the relationship direction used to compute similarity between node1 and node2. Possible values are OUTGOING, INCOMING and BOTH.

// Syntax
RETURN gds.alpha.linkprediction.adamicAdar(node1: Node, node2: Node, {
   relationshipQuery: String,
   direction: String
})

// Example graph
CREATE
   (zhen:Person {name: 'Zhen'}),
   (praveena:Person {name: 'Praveena'}),
   (michael:Person {name: 'Michael'}),
   (arya:Person {name: 'Arya'}),
   (karin:Person {name: 'Karin'}),

   (zhen)-[:FRIENDS]->(arya),
   (zhen)-[:FRIENDS]->(praveena),
   (praveena)-[:WORKS_WITH]->(karin),
   (praveena)-[:FRIENDS]->(michael),
   (michael)-[:WORKS_WITH]->(karin),
   (arya)-[:FRIENDS]->(karin)

// Score for Michael and Karin using all relationships
MATCH (p1:Person {name: 'Michael'})
MATCH (p2:Person {name: 'Karin'})
RETURN gds.alpha.linkprediction.adamicAdar(p1, p2) AS score

// Score for Michael and Karin using only FRIENDS relationships
MATCH (p1:Person {name: 'Michael'})
MATCH (p2:Person {name: 'Karin'})
RETURN gds.alpha.linkprediction.adamicAdar(p1, p2, {relationshipQuery: 'FRIENDS'}) AS score
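
To see what these two queries measure, here is a rough Python sketch that recomputes the scores by hand on the same example graph. It is not the Neo4j implementation: it assumes relationships are treated as undirected and uses the natural logarithm, and the helper names neighbor_sets and adamic_adar are hypothetical:

```python
import math
from collections import defaultdict

# The relationships from the CREATE statement, as (start, end, type)
edges = [
    ("Zhen", "Arya", "FRIENDS"),
    ("Zhen", "Praveena", "FRIENDS"),
    ("Praveena", "Karin", "WORKS_WITH"),
    ("Praveena", "Michael", "FRIENDS"),
    ("Michael", "Karin", "WORKS_WITH"),
    ("Arya", "Karin", "FRIENDS"),
]

def neighbor_sets(edges, rel_type=None):
    """Build undirected neighbor sets, optionally keeping one relationship type."""
    nbrs = defaultdict(set)
    for a, b, t in edges:
        if rel_type is None or t == rel_type:
            nbrs[a].add(b)
            nbrs[b].add(a)
    return nbrs

def adamic_adar(nbrs, x, y):
    """Sum 1/log(degree) over the neighbors shared by x and y."""
    common = nbrs.get(x, set()) & nbrs.get(y, set())
    return sum((1.0 / math.log(len(nbrs[u])) for u in common), 0.0)

# All relationships: Praveena (degree 3) is the only shared neighbor
print(round(adamic_adar(neighbor_sets(edges), "Michael", "Karin"), 4))  # 0.9102
# FRIENDS only: Michael and Karin share no neighbor
print(adamic_adar(neighbor_sets(edges, "FRIENDS"), "Michael", "Karin"))  # 0.0
```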

Suggestion : 3

Compute the Adamic-Adar index of all node pairs in ebunch.

The Adamic-Adar index will be computed for each pair of nodes given in the iterable ebunch. The pairs must be given as 2-tuples (u, v), where u and v are nodes in the graph. If ebunch is None, then all non-existent edges in the graph will be used. Default value: None.

Returns an iterator of 3-tuples in the form (u, v, p), where (u, v) is a pair of nodes and p is their Adamic-Adar index.

>>> import networkx as nx
>>> G = nx.complete_graph(5)
>>> preds = nx.adamic_adar_index(G, [(0, 1), (2, 3)])
>>> for u, v, p in preds:
...     '(%d, %d) -> %.8f' % (u, v, p)
...
'(0, 1) -> 2.16404256'
'(2, 3) -> 2.16404256'
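
The value 2.16404256 can be checked by hand: in the complete graph on 5 nodes, any two nodes share 3 neighbors, each of degree 4, so the index is 3 / ln(4). The sketch below verifies this and also shows the ebunch=None behavior on a small path graph chosen here purely for illustration:

```python
import math
import networkx as nx

# Complete graph: 3 common neighbors, each of degree 4
G = nx.complete_graph(5)
_, _, p = next(iter(nx.adamic_adar_index(G, [(0, 1)])))
expected = 3 / math.log(4)
print(f"{p:.8f} {expected:.8f}")  # 2.16404256 2.16404256

# Path graph 0-1-2-3: with ebunch omitted, all non-adjacent pairs are scored
P = nx.path_graph(4)
path_scores = {tuple(sorted((u, v))): s for u, v, s in nx.adamic_adar_index(P)}
for pair in sorted(path_scores):
    print(pair, round(path_scores[pair], 4))
# (0, 2) and (1, 3) each share one neighbor of degree 2, giving 1 / ln(2);
# (0, 3) shares no neighbor, giving 0
```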

Suggestion : 4

The Adamic-Adar index is meant for undirected graphs, since it is computed using the degrees of the neighbors shared by two vertices in the graph. This implementation computes the index for every pair of vertices connected by an edge and associates it with that edge.

/*
 * Copyright (C) 2013 - 2022 Oracle and/or its affiliates. All rights reserved.
 */
package oracle.pgx.algorithms;

import oracle.pgx.algorithm.EdgeProperty;
import oracle.pgx.algorithm.annotations.GraphAlgorithm;
import oracle.pgx.algorithm.PgxGraph;
import oracle.pgx.algorithm.PgxVertex;
import oracle.pgx.algorithm.annotations.Out;

import static java.lang.Math.log;

@GraphAlgorithm
public class AdamicAdar {
  public void adamicAdar(PgxGraph g, @Out EdgeProperty<Double> aa) {
    g.getEdges().forEach(e -> {
      PgxVertex src = e.sourceVertex();
      PgxVertex dst = e.destinationVertex();

      double value = src.getNeighbors()
          .filter(n -> n.hasEdgeFrom(dst))
          .sum(n -> 1 / log(n.getDegree()));

      aa.set(e, value);
    });
  }
}

procedure adamic_adar(graph G; edgeProp<double> aa) {

  foreach (e: G.edges) {
    node src = e.fromNode();
    node dst = e.toNode();

    // In C++ backend, the compiler optimizes below
    e.aa = sum (n: src.nbrs) (n.hasEdgeFrom(dst)) {1 / log(n.numNbrs())};

    // into
    // e.aa = sum(n: src.commonNbrs(dst)) {1 / log(n.numNbrs())};
  }
}
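
For comparison, the same per-edge idea can be sketched in Python with NetworkX. This is only an illustration of the approach, not the PGX API: it scores each existing edge with nx.adamic_adar_index and stores the result in an edge attribute named "aa" (the attribute name and the toy graph are assumptions):

```python
import networkx as nx

# Toy graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2
G = nx.Graph([(0, 1), (1, 2), (0, 2), (2, 3)])

# Score every existing edge by passing the edges themselves as the pairs
edge_scores = {(u, v): p for u, v, p in nx.adamic_adar_index(G, G.edges())}

# Attach each score to its edge, like the @Out edge property above
nx.set_edge_attributes(G, edge_scores, "aa")

for u, v, aa in G.edges(data="aa"):
    print(u, v, round(aa, 4))
# Edge (0, 1) shares vertex 2 (degree 3): 1 / ln(3)
# Edges (0, 2) and (1, 2) each share a vertex of degree 2: 1 / ln(2)
# Edge (2, 3) has no common neighbor: 0
```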

Suggestion : 5

Updated: March 27, 2019

# Install the dependency first if needed (shell command, not Python):
#   pip install networkx
import random

import matplotlib.pyplot as plt
import networkx as nx
import numpy as np
from IPython.display import Image

# Zachary's karate club graph has 34 nodes and 78 edges
n = 34
m = 78
G_karate = nx.karate_club_graph()

# Draw the graph with a spring layout
pos = nx.spring_layout(G_karate)
nx.draw(G_karate, cmap=plt.get_cmap('rainbow'), with_labels=True, pos=pos)

n = G_karate.number_of_nodes()
m = G_karate.number_of_edges()
print("Number of nodes : %d" % n)
print("Number of edges : %d" % m)
print("Number of connected components : %d" % nx.number_connected_components(G_karate))

plt.figure(figsize=(12, 8))
nx.draw(G_karate)
plt.gca().collections[0].set_edgecolor("#000000")

# Remove 20 % of the edges
# (random.sample needs a sequence, so convert the edge view to a list)
proportion_edges = 0.2
edge_subset = random.sample(list(G_karate.edges()), int(proportion_edges * G_karate.number_of_edges()))

# Create a copy of the graph and remove the edges
G_karate_train = G_karate.copy()
G_karate_train.remove_edges_from(edge_subset)
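
One plausible next step, sketched here under assumptions rather than taken from the original post, is to score every non-adjacent pair of the training graph with the Adamic-Adar index and check how many of the removed edges rank near the top. The fixed seed and the "top k" evaluation are choices made for this illustration:

```python
import random
import networkx as nx

random.seed(0)  # assumption: fixed seed so the split is repeatable

G = nx.karate_club_graph()
removed = random.sample(list(G.edges()), int(0.2 * G.number_of_edges()))

G_train = G.copy()
G_train.remove_edges_from(removed)

# Score all pairs that are not connected in the training graph
candidates = list(nx.non_edges(G_train))
scored = sorted(nx.adamic_adar_index(G_train, candidates),
                key=lambda t: t[2], reverse=True)

# Count how many of the top-k predictions are edges that were removed
removed_set = {frozenset(e) for e in removed}
top = scored[:len(removed)]
hits = sum(frozenset((u, v)) in removed_set for u, v, _ in top)
print("Removed edges recovered in top %d: %d" % (len(removed), hits))
```

The number of hits depends on the random split, so it is best read as a rough sanity check rather than a benchmark.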