Tuesday, June 26, 2012

A Thought on Link Prediction

I'm reading a paper by Richard et. al. from the ICML 2012 paper list called Estimation of Simultaneously Sparse and Low Rank Matrices. I'm not sure why until now I was conflating these two ideas but in retrospect they are clearly different and one might want to optimize for both. Since the Latent Feature Log Linear (LFLL) model of Mennon and Elkan is in spirit a low-rank matrix factorization algorithm I was wondering how to simultaneously enforce sparsity in it; I think using an $L_1$ regularizer on the latent features might be worth trying.

However the paper also got me thinking about link prediction. Here's a quote from the paper:
Link prediction - the matrix $A$ is the adjacency matrix of a partially observed graph; entries are 0 for both not-existing and undiscovered links. The search space is unrestricted as before and the matrix $S$ contains the scores for link prediction; the ideal loss function is the empirical average of the zero-one loss for each coefficient, \[
l_{E} (S, A) = \frac{1}{|E|} \sum_{(i,j) \in E} 1_{(A_{ij} - 1/2) \cdot S_{ij} \leq 0}.
So I read that as, ``this is a P-U problem that we are reducing to pointwise classification.'' However my preferred method for P-U problems is to reduce to ranking (AUC loss). What would that look like for link prediction?
  1. Instances are edges (i.e., pairs of vertices plus dyadic specific information).
  2. Reduction of AUC is to pairwise classification, so pairs of edges, or pairs of pairs of vertices.
  3. Each positive (observed) edge in the adjacency graph would be paired with an unlabeled (unobserved or unexplored) edge, the latter perhaps drawn uniformly from all possible edges; or possibly from all possible edges given one vertex (``per-vertex AUC'').
  4. The final classification model could be purely latent (e.g., pure LFLL), purely explicitly feature driven (e.g., bilinear implemented with VW), or a combination (e.g., LFLL with side information).
    1. In my experience LLFL with side information is very tricky to train, unlike pure LLFL.

Next time I run into a link prediction problem I'm going to give this a whirl.


  1. This (link prediction with latent factor models optimized for AUC) is basically what we did in "our" (St. and Chr. did most of the work) UAI 2009 paper: Bayesian Personalized Ranking from Implicit Feedback. Works pretty well. It was also used by some of the top teams in KDD Cup 2011 (Track 2).