Machined Learnings: ICML 2011 Notables

Friday, July 8, 2011

Here are some papers I've flagged for follow-up, in no particular order:

Minimum probability flow (MPF). This has the potential to train a wide variety of probabilistic models much more quickly by avoiding the computation of the partition function. Since I'm obsessed with speed, this caught my attention: there are lots of techniques I've basically ignored because I considered them too slow. Maybe this is a game changer? I'll have to try it on something to know for sure.
Sparse Additive Generative Models of Text (SAGE). I'm guessing the authors were initially interested in sparse LDA, but found that the multinomial token emission specification was not conducive to this manipulation. In any event, my summary is: replace probabilities with log probabilities in the token emission model for LDA, and center the emissions with respect to the background (log token frequency). There are two main benefits: 1) the resulting per-topic specifications can be made extremely sparse, since they only model the difference from the background; 2) additional latent parameters can be handled via addition (of logs) rather than multiplication (of probabilities). Unfortunately, there is a partition term buried in the update which is $O(|V|)$, where $V$ is the vocabulary. Perhaps the SAGE authors should talk to the MPF authors :)
Learning Scoring Functions with Order-Preserving Losses and Standardized Supervision. This paper is about clarifying when ranking objective functions have consistent reductions to regression or pairwise classification with a scoring function. Equation 6 has the right structure for implementation in Vowpal Wabbit, and there is a consistent recipe for reducing DCG and NDCG buried in here, if I can figure it out :)
Adaptively Learning the Crowd Kernel. A generalization of multidimensional scaling (MDS) using triplet-based relative similarity instead of absolute similarity. This is awesome because eliciting absolute similarity judgements from people is very difficult, whereas triplet-based relative similarity ("is object $a$ more similar to $b$ or to $c$?") is very natural.
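To make my SAGE summary above concrete, here is a minimal sketch of the additive log-space emission model as I described it. This is my own toy illustration, not the authors' code: the function name `sage_token_distribution`, the four-token vocabulary, and the deviation values are all made up for the example. Sparse deviations are stored as dicts of token id to log-space offset, and the normalizer is the $O(|V|)$ partition term mentioned above.

```python
import math


def sage_token_distribution(background_log_freq, *sparse_deviations):
    """Return p(token) proportional to exp(background + sum of deviations).

    background_log_freq: list of log token frequencies, one per vocabulary entry.
    sparse_deviations: dicts mapping token id -> log-space offset; tokens that
    are absent from a dict have an implicit offset of zero (the sparsity).
    """
    scores = list(background_log_freq)
    # Extra latent factors combine by addition in log space.
    for deviation in sparse_deviations:
        for token_id, delta in deviation.items():
            scores[token_id] += delta
    # Subtract the max before exponentiating for numerical stability.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    # The partition term: a sum over the whole vocabulary, i.e. O(|V|).
    partition = sum(exps)
    return [e / partition for e in exps]


# Hypothetical corpus counts for a four-token vocabulary.
counts = [50, 30, 15, 5]
total = sum(counts)
background = [math.log(c / total) for c in counts]

# A topic that only boosts token 3, and a second (hypothetical) latent
# factor that only suppresses token 0.
topic_deviation = {3: 2.0}
extra_deviation = {0: -1.0}

p_background = sage_token_distribution(background)
p_topic = sage_token_distribution(background, topic_deviation)
p_both = sage_token_distribution(background, topic_deviation, extra_deviation)
```

With no deviations the model just reproduces the background frequencies, and tokens a topic does not touch keep their background ratios to each other, which is why the per-topic parameters can be so sparse. The catch is that every evaluation still sums over the full vocabulary.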

In addition, the invited speakers were all fabulous, and the Thursday afternoon invited cross-conference session was especially entertaining.
