tag:blogger.com,1999:blog-4446292666398344382.comments2024-05-14T20:48:32.117-07:00Machined LearningsPaul Mineirohttp://www.blogger.com/profile/05439062526157173163noreply@blogger.comBlogger149125tag:blogger.com,1999:blog-4446292666398344382.post-75244406828830291292024-05-14T13:59:58.531-07:002024-05-14T13:59:58.531-07:00I was looking for a fast and simple approx for con...I was looking for a fast and simple approx for converting a float to log in C# and your magic numbers work really well! The only routine in DotNet that converts to a Log is in double and it's more precision than I need. For a million converts I see a speed reduction of 141 mSec down to 89 mSec using your constants. Thank you!<br /><br />public static float FastLog(float val)<br />{<br /> return BitConverter.SingleToInt32Bits(val) * 8.2629582881927490e-8f - 87.989971088f;<br />}reo2https://www.blogger.com/profile/16844554150778191349noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-45885717234894201112022-11-20T11:58:39.340-08:002022-11-20T11:58:39.340-08:00I have to admit #4 and #4 made me laugh. As a math...I have to admit #4 and #4 made me laugh. As a mathematician, I only publish in journals, with the one exception of FPSAC, so your last sentence is pretty intriguing to me.Allen Knutsonhttps://www.blogger.com/profile/15616422252030334511noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-55845985536490533032018-12-16T15:14:21.158-08:002018-12-16T15:14:21.158-08:00The magic numbers show up when you do a rational f...The magic numbers show up when you do a rational function approximation to the residual ( log_2(1+z)-z ). 
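A minimal sketch of where that residual comes from, in Python rather than C, assuming IEEE 754 single precision. The helper names `float_bits`, `crude_log2` and `residual` are illustrative and do not come from the original code. Reading a positive float's bits as an integer and rescaling gives exponent + z, where z is the mantissa fraction, so the error against the true log_2 is exactly log_2(1+z) - z, which stays between 0 and about 0.086.

```python
import math
import struct


def float_bits(x):
    """Return the IEEE 754 single-precision bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def crude_log2(x):
    """Approximate log2(x) for positive x by rescaling the raw float bits.

    The bits encode (exponent + 127) * 2**23 + mantissa, so dividing by
    2**23 and subtracting the bias 127 yields exponent + z, with z in [0, 1).
    """
    return float_bits(x) / (1 << 23) - 127.0


def residual(x):
    """Error of the crude estimate; equals log2(1 + z) - z for mantissa fraction z."""
    return math.log2(x) - crude_log2(x)


if __name__ == "__main__":
    for x in (1.0, 1.5, 6.0, 0.75):
        print(x, crude_log2(x), residual(x))
```

Any further constants then only need to model this small, smooth residual on [0, 1).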
<br /><br />Yes, the shift is to extract the mantissa: the exponent part of the representation is "easy to log".Paul Mineirohttps://www.blogger.com/profile/05439062526157173163noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-42716766377004217082018-12-10T10:30:00.678-08:002018-12-10T10:30:00.678-08:00Could you please explain what means those magic nu...Could you please explain what those magic numbers ((1 << 23), 121.2740838f, 27.7280233f, 4.84252568f, 1.49012907f) mean in the fast exp approximation?<br /><br />I guess (1 << 23) is the shift by the float mantissa length (23 bits). Am I right?<br /><br />Thank you!Oleghttps://www.blogger.com/profile/16730718168796265004noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-51546485151077209212018-03-24T08:13:01.071-07:002018-03-24T08:13:01.071-07:00Wild! Meta-learning is certainly a blossoming fie...Wild! Meta-learning is certainly a blossoming field.Paul Mineirohttps://www.blogger.com/profile/05439062526157173163noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-599464683939967962018-03-23T13:22:18.268-07:002018-03-23T13:22:18.268-07:00See Table 6 of
"Neural Optimizer Search with...See Table 6 of <br />"Neural Optimizer Search with Reinforcement Learning"<br />https://arxiv.org/abs/1709.07417Tim Vieirahttps://www.blogger.com/profile/12725412738623492825noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-50370149264084232332018-03-08T10:20:23.575-08:002018-03-08T10:20:23.575-08:00Thanks for the heads up.Thanks for the heads up.Paul Mineirohttps://www.blogger.com/profile/05439062526157173163noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-33292366630152321282018-03-08T07:37:24.840-08:002018-03-08T07:37:24.840-08:00FYI: "WARNING: cdn.mathjax.org has been retir...FYI: "WARNING: cdn.mathjax.org has been retired. Check https://www.mathjax.org/cdn-shutting-down/ for migration tips."<br /><br />You might want to update your mathjax cdn, or even switch to KaTex for faster performance! ;)Jobhttps://www.blogger.com/profile/09855523088550323759noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-63187665703197064322017-09-14T08:37:41.438-07:002017-09-14T08:37:41.438-07:00Well, I'd never heard of SIMPOL before!Well, I'd never heard of SIMPOL before!Paul Mineirohttps://www.blogger.com/profile/05439062526157173163noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-21140421015514716582017-09-14T07:56:03.676-07:002017-09-14T07:56:03.676-07:00Thanks for topic. My C sucks (but I am 55 year old...Thanks for topic. My C sucks (but I am 55 year old lawyer). I used this idea of divide by 2 to make a stab at writing a reasonably accurate and not too slow log function for SIMPOL which doesn't have a built in log function. 
For values in the range 0.5 to 1, I worked out some polynomial approximations.<br />Written in SIMPOL, in a first-stab way so that I could find my typing and logical errors.<br /><br /><br />constant log10_2 "0.30102999566398119521373889472449"<br /><br />function main()<br /> number n<br /> number log<br /> log = 0<br /> integer e<br /> e = 0<br /> string message,title<br /> anyvalue prompt<br /> message = "Enter a number."<br /> title = "log test"<br /> prompt = ""<br /> n = .toval(getuserinput(message,title,prompt, error =e),"",10)<br /> log = jdk_log(n)<br />end function .tostr(log,10)<br /><br />function jdk_log(number n)<br /> number log,log10,log3<br /> number eval, shift<br /> integer p<br /> p = 0<br /> eval = n <br /> shift = 0<br /> if n < 1/2<br /> eval = eval * 100<br /> shift = 2<br /> end if<br /><br /> while eval > 1 <br /> eval = eval/2<br /> p = p +1<br /> end while<br /> number x, x2, x3,x4, x5, x6, x7, x8, x9, x10<br /> number test10, test3 <br /> x = eval<br /> x2 = x * x <br /> x3 = x2 * x<br /> x4 = x3 * x<br /> x5 = x4 * x<br /> x6 = x5 * x<br /> x7 = x6 * x<br /> x8 = x7 * x<br /> x9 = x8 * x<br /> x10 = x9 * x<br /><br /> test10 = -(1436/1000) <br /> test10 = test10 + (5541/1000 * x)<br /> test10 = test10 - (12204/1000 * x2)<br /> test10 = test10 + (13987/1000 * x3)<br /> test10 = test10 + (0304/1000 * x4)<br /> test10 = test10 - (17075/1000 * x5)<br /> test10 = test10 + (4459/1000 * x6)<br /> test10 = test10 + (30204/1000 * x7)<br /> test10 = test10 - (42185/1000 * x8)<br /> test10 = test10 + (23255/1000 * x9)<br /> test10 = test10 - (4849/1000 * x10)<br /><br /> test3 = -(944/1000) <br /> test3 = test3 + (1814/1000 * x)<br /> test3 = test3 - (1241/1000 * x2)<br /> test3 = test3 + (371/1000 * x3)<br /><br /><br /> log10 = p * .toval(log10_2,"",10) + round(test10,1/1000000) - shift<br /> log3 = p * .toval(log10_2,"",10) + round(test3,1/1000000) - shift <br /> log = round((log10 + log3 )/2,1/10000)<br />end function log<br /><br /><br /><br 
/>jdkhttps://www.blogger.com/profile/17987574304860090197noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-91818969077766173592017-09-08T14:48:46.531-07:002017-09-08T14:48:46.531-07:00Great! ThanksGreat! ThanksRavihttps://www.blogger.com/profile/03453457907385341473noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-31660378319331792752017-09-08T14:01:57.445-07:002017-09-08T14:01:57.445-07:00Consider slide 84 of http://www.thespermwhale.com/...Consider slide 84 of http://www.thespermwhale.com/jaseweston/icml2016/icml2016-memnn-tutorial.pdf ... the primary task is story comprehension but adding the additional task of predicting the (exact) teacher response helps.<br /><br />I also just noticed slide 27 ... but I'm not familiar with that part.Paul Mineirohttps://www.blogger.com/profile/05439062526157173163noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-46636133846070234032017-09-08T10:47:37.755-07:002017-09-08T10:47:37.755-07:00Hi Paul,
For your first point - i.e. multitask re...Hi Paul,<br /> For your first point - i.e. multitask regularization, could you give some citations from both the RL world and the dialog world?<br />Ravihttps://www.blogger.com/profile/03453457907385341473noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-19525310934857485322017-08-30T12:50:53.190-07:002017-08-30T12:50:53.190-07:00Great question!
The decision service (which is do...Great question!<br /><br />The decision service (which is done out of MSR-NY, I'm just a user/contributor) has a problem which plagues many machine learning toolkit products: the customer is either so sophisticated that they want to write their own, or so unsophisticated that they are unable to operate the tool. Concentrating on a specific vertical and providing a simple interface is a way to bridge the gap, hence the DS has been focusing on news recommendations as a vertical scenario for go-to-market.<br /><br />Internally, we are using the decision service for a variety of scenarios, and the open source version is capable of handling general use cases and is production ready.Paul Mineirohttps://www.blogger.com/profile/05439062526157173163noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-87092216821554309902017-08-30T09:18:14.414-07:002017-08-30T09:18:14.414-07:00Hi Paul,
I'm very interested in RL as a service. ...Hi Paul,<br /><br />I'm very interested in RL as a service. I had already read about the service you worked on, and the paper is very interesting. <br /><br />But what I found among the machine learning options in Azure is the Custom option, which seems to be for specific use cases, like news recommendations (taking the article content into consideration).<br /><br />Is the service being fully used in more general use cases (as cited in the paper)? I mean, is it ready for production or is it in a kind of beta?<br /><br />Congratulations on your work!Rubens Santoshttps://www.blogger.com/profile/03307012403677741970noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-65106021894661893572017-07-10T11:40:07.118-07:002017-07-10T11:40:07.118-07:00It's been a while since I've thought about...It's been a while since I've thought about this, but the bit manipulations are essentially extracting a multiple of a power of two, so that might work better than explicit division and squaring.Paul Mineirohttps://www.blogger.com/profile/05439062526157173163noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-23949771676078276722017-07-08T09:39:41.309-07:002017-07-08T09:39:41.309-07:00The power series for the exponential converges mor...The power series for the exponential converges more rapidly for arguments near zero. So divide your argument by some power of 2 to bring it close to zero, do the truncated power series, and then square the result the appropriate number of times. I don't know if this would beat the stock Exp function, but it is worth a look. 
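A minimal sketch of the divide, sum, and square recipe described in the comment above, in Python for brevity. The function name and the defaults of 8 halvings and 6 series terms are illustrative assumptions, not tuned values, and nothing here claims to be faster than a stock exp.

```python
import math


def exp_scale_and_square(x, squarings=8, terms=6):
    """Approximate exp(x): shrink x, sum a short power series, square back up."""
    # Divide the argument by 2**squarings to bring it close to zero.
    r = x / (1 << squarings)
    # Truncated power series: sum of r**k / k! for k = 0 .. terms - 1.
    total = 0.0
    term = 1.0
    for k in range(terms):
        total += term
        term *= r / (k + 1)
    # Undo the scaling, since exp(x) = exp(r) ** (2**squarings).
    for _ in range(squarings):
        total *= total
    return total


if __name__ == "__main__":
    for x in (-5.0, 0.0, 1.0, 3.0):
        print(x, exp_scale_and_square(x), math.exp(x))
```

Each squaring roughly doubles the relative error carried over from the series, so more halvings allow fewer series terms but amplify rounding error; the right balance is something to measure rather than assume.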
stymiedhttps://www.blogger.com/profile/17031413084599382303noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-82303018017217566542017-02-16T08:05:08.157-08:002017-02-16T08:05:08.157-08:00Under these conditions, I would still train end-to...Under these conditions, I would still train end-to-end, but with explicit regularization to control overfitting.Paul Mineirohttps://www.blogger.com/profile/05439062526157173163noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-15375821005192932052017-02-15T18:51:01.820-08:002017-02-15T18:51:01.820-08:00Sometimes end-to-end system is not a good idea sin...Sometimes an end-to-end system is not a good idea, since we found it tends to overfit the dataset. If we train each sub-system with different data, it tends to perform better in unknown environments.cnxhttps://www.blogger.com/profile/12614847138980399016noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-56939114981009547092017-01-18T08:37:24.595-08:002017-01-18T08:37:24.595-08:00Not out yet (afaik). To be fair, it is just a wor...Not out yet (afaik). To be fair, it is just a workshop paper. If you review the conference version of the paper, demand a code release!Paul Mineirohttps://www.blogger.com/profile/05439062526157173163noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-69580984750384199532017-01-18T05:10:01.305-08:002017-01-18T05:10:01.305-08:00But where's the code to verify?But where's the code to verify?Carlos Perezhttps://www.blogger.com/profile/01488838149594154679noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-45123320729363656902016-12-12T22:51:58.957-08:002016-12-12T22:51:58.957-08:00The Alexa prize (https://developer.amazon.com/alex...The Alexa prize (https://developer.amazon.com/alexaprize) is pretty cool. They acquired mxnet which will remain an open project afaik. 
Alex Smola's team has a mandate to do research, so given typical latencies I expect to see things out of there hitting the major conferences. Charles Elkan and Ralf Herbrich are active NIPS contributors. Amazon had a paper at NIPS this year (main conference).<br /><br />So, not as much as other companies with more history, but the direction is encouraging.Paul Mineirohttps://www.blogger.com/profile/05439062526157173163noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-442121453712609562016-12-12T22:18:33.120-08:002016-12-12T22:18:33.120-08:00This comment has been removed by the author.Paul Mineirohttps://www.blogger.com/profile/05439062526157173163noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-63190319200439789572016-12-12T17:41:05.882-08:002016-12-12T17:41:05.882-08:00What research has Amazon "opened up"?What research has Amazon "opened up"?Anonymoushttps://www.blogger.com/profile/17867461482990681857noreply@blogger.comtag:blogger.com,1999:blog-4446292666398344382.post-50687384551473814042016-07-20T16:33:45.394-07:002016-07-20T16:33:45.394-07:00Yes, it is released under the New BSD License.Yes, it is released under the New BSD License.Paul Mineirohttps://www.blogger.com/profile/05439062526157173163noreply@blogger.com