Friday, June 29, 2012

My ICML 2012 Notables

I've already devoted entire blog posts to some of the ICML 2012 papers, but there are some other papers that caught my attention for which I only have a quick comment.

  • Online Structured Prediction via Coactive Learning: read the full blog post.
  • Predicting Accurate Probabilities with a Ranking Loss: read the full blog post.
  • Training Restricted Boltzmann Machines on Word Observations. I haven't used RBMs in over a decade: for practical text classification problems a bag-of-bigrams representation is often sufficient (see the sketch after this list), and LDA is my go-to technique for unsupervised feature extraction for text. So why do I like this paper? First, the computational efficiency improvement appears substantial, which is always of interest: I like deep learning in theory, but in practice I'm very impatient. Second, the idea of discovering higher order structure in text (5-grams!) is intriguing. Third, like LDA, the technique is clearly more generally applicable, and I wonder what it would do on a social graph. That all suggests there is some chance that I might actually try this on a real problem.
  • Fast Prediction of New Feature Utility: I'm constantly in the situation of trying to choose which features to try next, and scoring a candidate feature by its correlation with the negative gradient of the loss function makes intuitive sense (a toy version of that scoring appears after this list).
  • Plug-in Martingales for Testing Exchangeability On-Line: how awesome would it be if VW in online learning mode could output a warning that says "the input data does not appear to be generated by an exchangeable distribution; try randomly shuffling your data to improve generalization." A toy exchangeability martingale is sketched after this list.
  • Dimensionality Reduction by Local Discriminative Gaussians: This seems eminently practical. The major limitation is that it is a supervised dimensionality reduction technique, so it applies to cases where there is one problem with a deficit of labeled data and a related problem, using the same features, with an abundance of labeled data (a special case of transfer learning; a sketch of that setup follows the list). I usually find myself in the "few labeled data and lots of unlabeled data" case demanding an unsupervised technique, but that could be because I don't ask myself the following question often enough: "is there a related problem which has lots of training data associated with it?"
  • Finding Botnets Using Minimal Graph Clusterings: Very entertaining. I was asked in a job interview once how I would go about identifying and filtering out automated traffic from search logs. There's no "right answer", and black-letter machine learning techniques don't obviously apply, so creativity is at a premium.
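
On the bag-of-bigrams point above, here's a minimal sketch using scikit-learn. The tiny corpus and the choice of logistic regression as the classifier are illustrative assumptions, not anything from the paper.

```python
# Bag-of-bigrams text classification: unigrams + bigrams as count features.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

docs = ["great movie, loved it", "terrible movie, hated it",
        "loved the acting", "hated the plot"]
labels = [1, 0, 1, 0]

# ngram_range=(1, 2) yields unigrams plus bigrams.
vectorizer = CountVectorizer(ngram_range=(1, 2))
X = vectorizer.fit_transform(docs)

clf = LogisticRegression().fit(X, labels)
print(clf.predict(vectorizer.transform(["loved the movie"])))
```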
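On feature utility: for logistic loss, the negative gradient with respect to the score is the residual y - p, so a rough proxy for a candidate feature's usefulness is its correlation with that residual. This is a sketch of the intuition only, not the paper's exact estimator; the synthetic data is an assumption.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 2))            # features the model already has
useful = rng.normal(size=n)            # candidate that carries signal
noise = rng.normal(size=n)             # candidate that does not
logit = X[:, 0] - X[:, 1] + 2.0 * useful
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

model = LogisticRegression().fit(X, y)
p = model.predict_proba(X)[:, 1]
residual = y - p   # negative gradient of logistic loss w.r.t. the score

def utility(feature):
    """Score a candidate feature by |correlation| with the residual."""
    return abs(np.corrcoef(feature, residual)[0, 1])

print("useful candidate:", utility(useful))   # large
print("noise candidate: ", utility(noise))    # near zero
```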
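On exchangeability testing: the paper's plug-in martingales estimate a betting function from past conformal p-values; the sketch below substitutes the simpler power martingale, which is enough to show the idea. Using the raw value as the nonconformity score, and the drifting stream as the non-exchangeable example, are both assumptions of mine.

```python
import numpy as np

rng = np.random.default_rng(0)

def exchangeability_martingale(stream, eps=0.92):
    """Power martingale on smoothed conformal p-values; its log grows
    large only if the stream looks non-exchangeable. Nonconformity
    score is the raw value, which suits a mean-shift alternative."""
    log_m, scores, path = 0.0, [], []
    for x in stream:
        scores.append(x)
        a = np.asarray(scores)
        # smoothed conformal p-value: randomized rank of x so far
        p = ((a > x).sum() + rng.random() * (a == x).sum()) / len(a)
        p = max(p, 1e-12)                       # guard against log(0)
        log_m += np.log(eps) + (eps - 1.0) * np.log(p)
        path.append(log_m)
    return path

iid = rng.normal(size=2000)                               # exchangeable
drift = rng.normal(size=2000) + np.linspace(0, 3, 2000)   # mean drift
print("iid   final log-martingale:", exchangeability_martingale(iid)[-1])
print("drift final log-martingale:", exchangeability_martingale(drift)[-1])
```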
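And on the supervised dimensionality reduction point, a sketch of the transfer setup: learn a supervised projection on the label-rich related problem, then reuse it on the label-poor one. Plain LinearDiscriminantAnalysis stands in here for the paper's local discriminative Gaussians, and all the data is synthetic.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
d = 50
# Related problem: plenty of labels, same feature space.
X_src = rng.normal(size=(5000, d))
y_src = (X_src[:, 0] + X_src[:, 1] > 0).astype(int)

lda = LinearDiscriminantAnalysis(n_components=1).fit(X_src, y_src)

# Target problem: few labels; project with the borrowed transform.
X_tgt = rng.normal(size=(20, d))
Z_tgt = lda.transform(X_tgt)   # 50 dims -> 1 discriminative dim
print(Z_tgt.shape)
```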
