mahout-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: RecommenderJob and NaN
Date Tue, 11 Oct 2011 16:36:33 GMT
Where does the NaN come up -- which value is NaN?
It should be propagated in some cases but not others. I'm not aware of
any changes here.

Generally small data sets will have this problem of not being able to
compute much of anything useful, so NaN might be right here.
But you say it was different recently, which seems to rule that out.
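
As one illustration of how sparse or degenerate input can legitimately yield NaN, here is a minimal sketch of a Pearson-style correlation. It is an assumption for illustration only: the class and method names are hypothetical, and this is not necessarily the similarity that RecommenderJob computes. When every value is the same constant, the variance is zero and the result is 0.0 / 0.0, which is NaN in Java.

```java
// Hypothetical sketch, not Mahout code: a plain Pearson correlation.
final class PearsonNaNSketch {

    // Returns the Pearson correlation of x and y (arrays of equal length).
    // If either array has zero variance the result is 0.0 / 0.0 = NaN.
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0;
        double sxx = 0.0;
        double syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        // All preferences equal to 1: no variance, so no defined correlation.
        double[] ones = {1.0, 1.0, 1.0};
        System.out.println(Double.isNaN(pearson(ones, ones)));

        // Values that vary give a finite correlation.
        double[] a = {1.0, 2.0, 3.0};
        double[] b = {2.0, 4.0, 6.0};
        System.out.println(pearson(a, b));
    }
}
```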

On Tue, Oct 11, 2011 at 5:34 PM, Grant Ingersoll <gsingers@apache.org> wrote:
> I'm running trunk RecommenderJob (via build-asf-email.sh) and am not getting any recommendations
> due to NaNs being calculated in the AggregateAndRecommend step.  I'm not quite sure what
> is going on as it seems like this was working as little as two weeks ago (post Sebastian's
> big change to RecJob), but I don't see a whole lot of changes in that part of the code.
>
> The data is user ids mapping to email thread ids.  My input data is simply a triple
> of user id, thread id, 1 (meaning that user participated in that thread).  It seems like I
> will have a lot of good values in the inputs to the AggregateAndRecommend step, except one
> id will be NaN and this then seems to get added in and makes everything NaN (I realize this
> is a very naive understanding).  I sense that I should be looking upstream in the process
> for a fix, but I am not sure where that is.
>
> Any ideas where I should be looking to eliminate these NaNs?  If you want to try this
> with a small data set, you can get it here: http://www.lucidimagination.com/devzone/technical-articles/scaling-mahout
> (but note the companion article is not published yet.)
>
> Thanks,
> Grant
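
The behaviour described in the quoted message, where a single NaN input turns the whole aggregate into NaN, can be reproduced with plain floating-point arithmetic. The sketch below is a hypothetical similarity-weighted average, not the actual AggregateAndRecommend code; the class name, method name, and the `skipNaN` flag are assumptions made for illustration. It shows that any arithmetic involving NaN yields NaN, and that skipping NaN terms keeps the result finite.

```java
// Hypothetical sketch, not Mahout code: a similarity-weighted average.
final class NaNPropagationSketch {

    // Computes sum(sim * pref) / sum(|sim|).
    // With skipNaN == false, one NaN similarity makes the result NaN.
    // With skipNaN == true, NaN similarities are left out of both sums.
    static double weightedAverage(double[] sims, double[] prefs, boolean skipNaN) {
        double numerator = 0.0;
        double denominator = 0.0;
        for (int i = 0; i < sims.length; i++) {
            if (skipNaN && Double.isNaN(sims[i])) {
                continue;
            }
            numerator += sims[i] * prefs[i];
            denominator += Math.abs(sims[i]);
        }
        // If nothing usable was summed, 0.0 / 0.0 is NaN, which is a fair answer.
        return numerator / denominator;
    }

    public static void main(String[] args) {
        double[] sims = {0.5, Double.NaN, 0.25};
        double[] prefs = {1.0, 1.0, 1.0};

        // One bad term poisons the whole sum.
        System.out.println(Double.isNaN(weightedAverage(sims, prefs, false)));

        // Dropping the bad term leaves a finite estimate.
        System.out.println(weightedAverage(sims, prefs, true));
    }
}
```

Whether skipping is the right fix depends on where the NaN originates; filtering at aggregation time only hides an upstream value that could not be computed.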
