mahout-user mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject RecommenderJob and NaN
Date Tue, 11 Oct 2011 16:34:26 GMT
I'm running trunk RecommenderJob (via build-asf-email.sh) and am not getting any recommendations
due to NaNs being calculated in the AggregateAndRecommend step.  I'm not quite sure what is
going on as it seems like this was working as little as two weeks ago (post Sebastian's big
change to RecJob), but I don't see a whole lot of changes in that part of the code.

The data maps user ids to email thread ids.  My input data is simply a triple of user
id, thread id, 1 (meaning that user participated in that thread).  It seems like I have
a lot of good values in the inputs to the AggregateAndRecommend step, except that one id
will be NaN, and this then seems to get added in and makes everything NaN (I realize this
is a very naive understanding).  I sense that I should be looking upstream in the process
for a fix, but I am not sure where that is.
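
To make the symptom concrete, here is a minimal sketch of how a single NaN can poison
a weighted average.  It is only an illustration and not the actual AggregateAndRecommend
code: the class name, the method names, and the formula sum(sim * pref) / sum(|sim|) are
assumptions chosen for the example.  In IEEE 754 arithmetic any sum that includes a NaN
is NaN, and 0.0 / 0.0 is also NaN.

```java
// Hypothetical illustration only; this is NOT Mahout's AggregateAndRecommend code.
final class NanPropagationSketch {

    // Assumed formula: sum(sim * pref) / sum(|sim|).
    // A single NaN similarity makes both sums NaN, so the result is NaN.
    static double naiveWeightedAverage(double[] similarities, double[] prefs) {
        double numerator = 0.0;
        double denominator = 0.0;
        for (int i = 0; i < similarities.length; i++) {
            numerator += similarities[i] * prefs[i];
            denominator += Math.abs(similarities[i]);
        }
        return numerator / denominator;
    }

    // Same formula, but terms containing NaN are skipped.
    // Returns NaN only when no usable term remains (avoids 0.0 / 0.0 silently).
    static double guardedWeightedAverage(double[] similarities, double[] prefs) {
        double numerator = 0.0;
        double denominator = 0.0;
        for (int i = 0; i < similarities.length; i++) {
            if (Double.isNaN(similarities[i]) || Double.isNaN(prefs[i])) {
                continue; // drop the bad term instead of letting it propagate
            }
            numerator += similarities[i] * prefs[i];
            denominator += Math.abs(similarities[i]);
        }
        return denominator == 0.0 ? Double.NaN : numerator / denominator;
    }

    public static void main(String[] args) {
        double[] sims = {0.8, Double.NaN, 0.5};   // one bad similarity value
        double[] prefs = {1.0, 1.0, 1.0};         // "user participated" flags
        System.out.println(naiveWeightedAverage(sims, prefs));   // NaN
        System.out.println(guardedWeightedAverage(sims, prefs)); // 1.0
    }
}
```

Skipping the NaN term only hides the problem downstream; the point of the sketch is that
one bad input is enough to explain every output being NaN.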

Any ideas where I should be looking to eliminate these NaNs?  If you want to try this with
a small data set, you can get it here: http://www.lucidimagination.com/devzone/technical-articles/scaling-mahout
(but note the companion article is not published yet.)

Thanks,
Grant