So my group maintains and improves certain models that are used as oracles in various decision making processes around the company. One of the big things that we do is identify new sources of information (or transformations of existing sources) that improve performance on proxy measures such as AUC or log likelihood. Once we identify pieces of information, we can make a new model which is relatively easy, but it is up to the engineering teams to make the new information available to the oracle processes in production. This can range from easy (e.g., "already explicitly represented in a database somewhere") to difficult (e.g., "an incremental computation needs to run every time a certain event occurs and the results memoized").
So today I was in a meeting where there were two different models for which I was proposing two changes both in the difficult zone. So, paraphrasing, here's how it went:
V.P. Engineering: "Ok, which one should I do first?"
Me: "This one."
V.P. Engineering: "Why?"
Me: "My gut tells me it will have more impact."
V.P. Engineering: "You don't know the relationship between your proxy measures and actual business metric impacts?"
V.P. Engineering: "Can't you do simulations and make predictions or something?"
Me: "I could, but those simulations would involve models of user behaviour and how people adapt to the different decisions we make. There are so many unknowns there that I could get any answer I wanted."
V.P. Engineering: "This is a lame state of affairs. You are asking me to commit resources on the basis of your hunches."
It really is lame, because the decision isn't just between machine learning projects: there are completely different kinds of activity (product feature enhancements, UI changes, site performance enhancements) that could positively impact business metrics and machine learning projects compete for engineering resources with these other proposals.
So how do other organizations handle this stuff?