- Online Structured Prediction via Coactive Learning: read the full blog post.
- Predicting Accurate Probabilities with a Ranking Loss: read the full blog post.
- Training Restricted Boltzmann Machines on Word Observations: I haven't used RBMs in over a decade; for practical text classification problems a bag-of-bigrams representation is often sufficient, and LDA is my go-to technique for unsupervised feature extraction for text. So why do I like this paper? First, the computational efficiency improvement appears substantial, which is always of interest: I like deep learning in theory, but in practice I'm very impatient. Second, the idea of discovering higher-order structure in text (5-grams!) is intriguing. Third, like LDA, the technique is clearly more generally applicable, and I wonder what it would do on a social graph. That all suggests there is some chance that I might actually try this on a real problem.
- Fast Prediction of New Feature Utility: I'm constantly in the situation of trying to choose which features to try next, and correlating a candidate feature with the negative gradient of the loss function makes intuitive sense.
- Plug-in Martingales for Testing Exchangeability On-Line: how awesome would it be if VW in online learning mode could output a warning that says "the input data does not appear to be generated by an exchangeable distribution; try randomly shuffling your data to improve generalization."
- Dimensionality Reduction by Local Discriminative Gaussians: This seems eminently practical. The major limitation is that it is a supervised dimensionality reduction technique, so it would apply to cases where there is one problem with a deficit of labeled data and a related problem using the same features with an abundance of labeled data (which is a special case of transfer learning). I usually find myself in the "little labeled data and lots of unlabeled data" case, which demands an unsupervised technique, but that could be because I don't ask myself the following question often enough: "is there a related problem which has lots of training data associated with it?"
- Finding Botnets Using Minimal Graph Clusterings: Very entertaining. I was asked in a job interview once how I would go about identifying and filtering out automated traffic from search logs. There's no "right answer", and black-letter machine learning techniques don't obviously apply, so creativity is at a premium.
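
To make the intuition in the Fast Prediction of New Feature Utility item concrete, here is a minimal sketch of scoring a candidate feature by how strongly it correlates with the negative gradient of the loss under the current model. It is not the paper's exact procedure: the logistic loss, the helper names (`negative_gradient_logistic`, `feature_utility`), the synthetic data, and the stand-in "current model" are all assumptions made for illustration.

```python
import numpy as np


def negative_gradient_logistic(y, scores):
    """Per-example negative gradient of logistic loss w.r.t. the model score.

    With y in {0, 1} and p = sigmoid(score), d(loss)/d(score) = p - y,
    so the negative gradient is y - p (the residual).
    """
    p = 1.0 / (1.0 + np.exp(-scores))
    return y - p


def feature_utility(candidate, y, scores):
    """Absolute Pearson correlation between a candidate feature and the
    negative loss gradient of the current model (hypothetical helper)."""
    g = negative_gradient_logistic(y, scores)
    c = candidate - candidate.mean()
    g = g - g.mean()
    denom = np.linalg.norm(c) * np.linalg.norm(g)
    return 0.0 if denom == 0 else abs(float(c @ g) / denom)


# Synthetic data (assumption): labels depend on x_old and x_useful, not x_noise.
rng = np.random.default_rng(0)
n = 2000
x_old = rng.normal(size=n)
x_useful = rng.normal(size=n)
x_noise = rng.normal(size=n)
logits = 1.5 * x_old + 2.0 * x_useful
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(float)

# Stand-in for an already-trained model that only sees x_old.
scores = 1.5 * x_old

u_useful = feature_utility(x_useful, y, scores)
u_noise = feature_utility(x_noise, y, scores)
print(f"utility(x_useful) = {u_useful:.3f}")
print(f"utility(x_noise)  = {u_noise:.3f}")
```

The appeal is that each candidate can be scored without retraining the model: a feature that lines up with the residual is one the current model could use to reduce loss, while an irrelevant feature should score near zero.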
Friday, June 29, 2012
I've already devoted entire blog posts to some of the ICML 2012 papers, but there are some other papers that caught my attention for which I only have a quick comment.