
Rishujamaiyar/Similarity-Detection-using-Graph-SAGE-Python

Graph embedding generation and link prediction, explained

Dataset :

This custom dataset covers the cricket players who participated in the 2019 ICC Cricket World Cup.

151 players from 10 countries participated in the tournament. The dataset was compiled from the player information published on the News18 website.

Knowledge Graph Structure :

The graph has a central node named WC, for World Cup. This node is linked to 10 nodes, each representing one country, and each country is linked to four nodes indicating the different types of players (Batsman, Bowler, All-rounder, Wicket-Keeper). Each of these four nodes is linked to the players of that type in that country's squad.
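The hierarchy described above can be sketched as a plain adjacency map. This is a minimal illustration and not the notebook's own code: the function name `build_world_cup_graph`, the `country:role` naming of role nodes, the input format, and every player name other than Virat Kohli are assumptions made for the example.

```python
from collections import defaultdict

ROLES = ["Batsman", "Bowler", "All-rounder", "Wicket-Keeper"]


def build_world_cup_graph(squads):
    """Return an undirected adjacency map for WC -> country -> role -> player.

    `squads` maps country -> role -> list of player names
    (a hypothetical input format chosen for this sketch).
    """
    adjacency = defaultdict(set)

    def add_edge(a, b):
        adjacency[a].add(b)
        adjacency[b].add(a)

    for country, players_by_role in squads.items():
        add_edge("WC", country)
        for role in ROLES:
            # Each country gets its own four role nodes.
            role_node = f"{country}:{role}"
            add_edge(country, role_node)
            for player in players_by_role.get(role, []):
                add_edge(role_node, player)
    return adjacency


# Placeholder roster, not taken from the real dataset.
squads = {
    "India": {"Batsman": ["Virat Kohli", "IND batsman 2"], "Bowler": ["IND bowler 1"]},
    "Australia": {"Batsman": ["AUS batsman 1"], "All-rounder": ["AUS all-rounder 1"]},
}
graph = build_world_cup_graph(squads)
print(sorted(graph["India:Batsman"]))
```

With the full dataset the same structure would hold 1 central node, 10 country nodes, 40 role nodes, and one node per player.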

[Figure: graph illustration]

We predict the link between any two players using graph embeddings.

Expected Results :

Consider Player 1 = Virat Kohli.

His link to any other batsman from India should be the strongest.

His links to the other players in the Indian team (all-rounders and bowlers) should be the next strongest.

Players from other countries should be the most weakly linked candidates.
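One way to see why this ordering is expected is to count hops in the graph: teammates with the same role share a role node, teammates with different roles meet at the country node, and players from different countries meet only at WC. The sketch below uses a hand-written slice of the hierarchy with placeholder names (only Virat Kohli comes from the text). Hop distance is just an intuition here, not the score the embeddings produce.

```python
from collections import deque

# Tiny slice of the hierarchy; role-node names and most players are placeholders.
edges = [
    ("WC", "India"), ("WC", "Australia"),
    ("India", "India:Batsman"), ("India", "India:Bowler"),
    ("Australia", "Australia:Batsman"),
    ("India:Batsman", "Virat Kohli"), ("India:Batsman", "IND batsman 2"),
    ("India:Bowler", "IND bowler 1"),
    ("Australia:Batsman", "AUS batsman 1"),
]

adjacency = {}
for a, b in edges:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)


def hop_distance(source, target):
    """Breadth-first search for the number of edges on the shortest path."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None  # the two nodes are not connected


for other in ["IND batsman 2", "IND bowler 1", "AUS batsman 1"]:
    print(other, hop_distance("Virat Kohli", other))
```

The three distances are 2, 4, and 6 hops, which matches the expected ranking from strongest to weakest link.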

Methodology :

For generating node embeddings : GraphSAGE

For building the graph : StellarGraph

For similarity comparison : cosine similarity
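To make the embedding and comparison steps concrete, here is a dependency-free sketch of one GraphSAGE-style layer with a mean aggregator, followed by a cosine similarity between two of the resulting vectors. It does not use the StellarGraph API and is not the notebook's code: the ReLU nonlinearity, the random input features, the random untrained weights, and the placeholder node names are all assumptions, so the printed score only demonstrates the mechanics and carries no meaning about the players.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def sage_mean_layer(features, adjacency, weights):
    """One layer: h_v = normalize(ReLU(W . concat(h_v, mean of neighbour features)))."""
    out = {}
    for node, h_self in features.items():
        neighbours = sorted(adjacency[node])
        h_mean = [
            sum(features[n][i] for n in neighbours) / len(neighbours)
            for i in range(len(h_self))
        ]
        combined = h_self + h_mean  # concatenation, length 2 * input dimension
        activated = [max(0.0, sum(w * x for w, x in zip(row, combined))) for row in weights]
        norm = math.sqrt(sum(a * a for a in activated))
        # L2-normalise the output; leave an all-zero vector untouched.
        out[node] = [a / norm for a in activated] if norm > 0.0 else activated
    return out


# Placeholder graph slice: one role node and its two players.
adjacency = {
    "India:Batsman": {"Virat Kohli", "IND batsman 2"},
    "Virat Kohli": {"India:Batsman"},
    "IND batsman 2": {"India:Batsman"},
}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
dim_in, dim_out = 4, 3
features = {node: [rng.uniform(-1, 1) for _ in range(dim_in)] for node in adjacency}
weights = [[rng.uniform(-1, 1) for _ in range(2 * dim_in)] for _ in range(dim_out)]

embeddings = sage_mean_layer(features, adjacency, weights)
score = cosine_similarity(embeddings["Virat Kohli"], embeddings["IND batsman 2"])
print(round(score, 3))
```

In a trained model the weights are learned, and the player most similar to a query player is found by ranking all other players by this cosine score.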

References :

Inductive Representation Learning on Large Graphs, Hamilton et al., NeurIPS 2017.


Created July 6, 2020
Updated March 11, 2026