mahout-user mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: To all the recommendation people..
Date Tue, 17 May 2011 20:43:39 GMT
More contests at: http://challenge.gov/NIH/132-nlm-show-off-your-apps-innovative-uses-of-nlm-information


On May 15, 2011, at 10:25 PM, Alex Kozlov wrote:

> On Sat, May 14, 2011 at 9:11 PM, Jake Mannix <jake.mannix@gmail.com> wrote:
> 
>> Due to the whole Netflix data lawsuit, the training data is synthetic, which
>> puts the contestants at a disadvantage, and another interesting fact: runtime
>> performance is at issue: your code will be run *live*, with your model being
>> used to produce recommendations with a hard timeout of 50ms - if you
>> miss this more than 20% of the time, you fail to progress to the end of
>> the semi-final round.
>> 
> 
> If the dataset is synthetic (and I assume not random) is the goal to just
> guess the model that generated the dataset?  Assuming it performs well, how
> far is the 'synthetic' model from the actual customer behavior so that there
> are no 'surprises' when it runs 'live'?
> 
> Potentially, there are more avenues for a lawsuit than in the Netflix case
> since money is involved (just a thought).
> 
> Alex K

--------------------------------------------
Grant Ingersoll
Join the LUCENE REVOLUTION
Lucene & Solr User Conference
May 25-26, San Francisco
www.lucenerevolution.org

