Here's a widely applicable encoding scheme that I've used for several different problems to improve performance. It is related to segmented regression (for statistical prediction folks) and also special order sets of type 2 (for operations research folks). We actually call it ``SOS2 encoding'' around the office.
In one dimension it works like this: one chooses a set of points for the variable, then for any input value two features are activated corresponding to the points bracketing the value, with weights corresponding to a linear interpolation between the two features. For example, for online dating physical distance between two people helps predict many quantities of interest. To encode the distance between two people for input to a model, I first take the $\log_2$ of the distance (ad-hoc intuited transform), then SOS2 encode using integral points. So, if two people have a distance difference of 72 miles, the two features output would be
LOG2_DISTANCE_6:0.83 LOG2_DISTANCE_7:0.19Since this encoding is sparse it ends up interacting well with other variables (i.e., quadratic vowpal terms involving SOS2 encoded features can produce useful lift). In a two-dimensional problem I SOS2 encoded each dimension and then interacted them quadratically with some success. For higher dimensional problems I've gotten better mileage from other techniques (e.g., VQ).