Subject: Re: RecommenderJob and NaN
From: Grant Ingersoll
Date: Thu, 13 Oct 2011 06:47:18 -0400
To: user@mahout.apache.org

On Oct 13, 2011, at 4:01 AM, Sebastian Schelter wrote:

> Grant,
>
> Can you share a little more details about the results, do you get any
> exceptions? Or do you just get no results?

No results.

>
> Using the NaNs inside the similarity matrix vectors has been included in
> the job for a very long time and should not cause any problems. As Sean
> already mentioned we have unit tests with toy data that should catch the
> very obvious errors in this code.

Yeah, I don't know what happened. I know I was getting results as little
as two weeks ago. I will try rolling back to an earlier commit.

>
> Can you share the dataset? I can do a testrun on my research cluster.

I already have earlier in this thread. There is a small set via the link
below or you can use the ASF email public dataset on Amazon or any subset
of it.

>
> --sebastian
>
> On 13.10.2011 08:37, Sean Owen wrote:
>> RecommenderJob? The unit tests run it all the time.
>> There should not be any glitches with static variables -- don't think
>> there are any.
>>
>> On Thu, Oct 13, 2011 at 7:33 AM, Lance Norskog wrote:
>>> Is this job working well for anyone now?
>>> When was the last time this job worked for someone?
>>>
>>> On Wed, Oct 12, 2011 at 11:30 AM, Grant Ingersoll wrote:
>>>
>>>> Both local and on EC2
>>>>
>>>> On Oct 12, 2011, at 2:10 PM, Ken Krugler wrote:
>>>>
>>>>> Hi Grant,
>>>>>
>>>>> Just curious, are you running this locally or distributed?
>>>>>
>>>>> I'd run into a similar issue, though in a completely different
>>>>> algorithm (Jimmy Lin's PageRank implementation) due to the use of a
>>>>> static variable.
>>>>>
>>>>> When running locally, this wasn't getting cleared between loops, and
>>>>> thus I got wonky results.
>>>>>
>>>>> The same thing would have happened with JVM reuse enabled.
>>>>>
>>>>> -- Ken
>>>>>
>>>>> On Oct 12, 2011, at 3:28pm, Grant Ingersoll wrote:
>>>>>
>>>>>> Digging some more:
>>>>>>
>>>>>> In AggregateAndRecommend, around line 143, I have, for userId 0, a
>>>>>> simColumn of:
>>>>>>
>>>>>> {22966:0.9566912651062012,81901:0.9566912651062012,263375:0.9566912651062012,263374:0.9566912651062012,263376:NaN}
>>>>>>
>>>>>> Which then becomes the numerator and the denom.
>>>>>>
>>>>>> Looping, my next simCol is:
>>>>>>
>>>>>> {22966:0.9566912651062012,81901:0.9566912651062012,263375:NaN,263374:0.9566912651062012,263376:0.9566912651062012}
>>>>>>
>>>>>> and then
>>>>>>
>>>>>> {22966:0.9566912651062012,81901:0.9566912651062012,263375:0.9566912651062012,263374:NaN,263376:0.9566912651062012}
>>>>>>
>>>>>> ...
>>>>>>
>>>>>> Each time, those are getting added into the numerators/denoms value,
>>>>>> such that by the time we are done looping (line 161), we have:
>>>>>> numerators: {22966:NaN,81901:NaN,263376:NaN,263375:NaN,263374:NaN}
>>>>>> denoms: {22966:NaN,81901:NaN,263376:NaN,263375:NaN,263374:NaN}
>>>>>>
>>>>>> numberOfSimilarItemsUsed:
>>>>>> {81901:5.0,22966:5.0,263376:5.0,263375:5.0,263374:5.0}
>>>>>>
>>>>>> Not sure how to interpret this as I haven't dug into the math here
>>>>>> yet or figured out where those NaNs are coming from originally.
>>>>>>
>>>>>> On Oct 11, 2011, at 2:55 PM, Grant Ingersoll wrote:
>>>>>>
>>>>>>>
>>>>>>> On Oct 11, 2011, at 2:49 PM, Grant Ingersoll wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> On Oct 11, 2011, at 12:36 PM, Sean Owen wrote:
>>>>>>>>
>>>>>>>>> Where is the NaN coming up -- what has this value?
>>>>>>>>
>>>>>>>> simColumn seems to be the originator in the Aggregate step. For
>>>>>>>> instance, my current breakpoint shows:
>>>>>>>> {309682:0.9566912651062012,42938:0.9566912651062012,309672:NaN}
>>>>>>>>
>>>>>>>> I can also see some in the PartialMultiplyMapper via the
>>>>>>>> similarityMatrixColumn.
>>>>>>>>
>>>>>>>> Is that set by SimilarityMatrixRowWrapperMapper?
>>>>>>>>
>>>>>>>>   /* remove self similarity */
>>>>>>>>   similarityMatrixRow.set(key.get(), Double.NaN);
>>>>>>>>
>>>>>>>
>>>>>>> Ah, but that is just taking care of itself, so maybe not the issue.
>>>>>>>
>>>>>>>>
>>>>>>>>> It should be propagated in some cases but not others. I'm not
>>>>>>>>> aware of any changes here.
>>>>>>>>
>>>>>>>> yeah, me neither. This is all related to MAHOUT-798.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Generally small data sets will have this problem of not being
>>>>>>>>> able to compute much of anything useful, so NaN might be right
>>>>>>>>> here. But you say it was different recently, which seems to rule
>>>>>>>>> that out.
>>>>>>>>
>>>>>>>> I also _believe_ I'm seeing it in a much larger data set on
>>>>>>>> Hadoop, it's just that's a whole lot harder to debug.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Oct 11, 2011 at 5:34 PM, Grant Ingersoll
>>>>>>>>> <gsingers@apache.org> wrote:
>>>>>>>>>> I'm running trunk RecommenderJob (via build-asf-email.sh) and am
>>>>>>>>>> not getting any recommendations due to NaNs being calculated in
>>>>>>>>>> the AggregateAndRecommend step. I'm not quite sure what is going
>>>>>>>>>> on as it seems like this was working as little as two weeks ago
>>>>>>>>>> (post Sebastian's big change to RecJob), but I don't see a whole
>>>>>>>>>> lot of changes in that part of the code.
>>>>>>>>>>
>>>>>>>>>> The data is user ids mapping to email thread ids. My input data
>>>>>>>>>> is simply a triple of user id, thread id, 1 (meaning that user
>>>>>>>>>> participated in that thread). It seems like I will have a lot of
>>>>>>>>>> good values in the inputs to the AggregateAndRecommend step,
>>>>>>>>>> except one id will be NaN and this then seems to get added in
>>>>>>>>>> and makes everything NaN (I realize this is a very naive
>>>>>>>>>> understanding). I sense that I should be looking upstream in the
>>>>>>>>>> process for a fix, but I am not sure where that is.
>>>>>>>>>>
>>>>>>>>>> Any ideas where I should be looking to eliminate these NaNs? If
>>>>>>>>>> you want to try this with a small data set, you can get it here:
>>>>>>>>>> http://www.lucidimagination.com/devzone/technical-articles/scaling-mahout
>>>>>>>>>> (but note the companion article is not published yet.)
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Grant
>>>>>
>>>>> --------------------------
>>>>> Ken Krugler
>>>>> +1 530-210-6378
>>>>> http://bixolabs.com
>>>>> custom big data solutions & training
>>>>> Hadoop, Cascading, Mahout & Solr
>>>
>>> --
>>> Lance Norskog
>>> goksron@gmail.com

--------------------------
Grant Ingersoll
http://www.lucidimagination.com
Lucene Eurocon 2011: http://www.lucene-eurocon.com
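
P.S. For anyone following the arithmetic in the quoted debugging notes: the all-NaN numerators/denoms are what plain floating-point addition produces if every similarity column carries a NaN in a different position and the columns are summed entry by entry. Below is a minimal standalone sketch of just that effect. It is not Mahout's code; the class and method names, the item ids 1..3, and the similarity 0.5 are all made up for illustration, and whether the real job is supposed to skip those entries is exactly the open question in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not part of Mahout: shows how one NaN per column
// poisons every per-item running sum when columns are added naively.
final class NanAccumulationSketch {

    // Builds a column where the entry for selfId is NaN (the "self similarity"
    // marker) and every other entry holds the same made-up similarity.
    static Map<Integer, Double> column(int selfId, int[] ids, double sim) {
        Map<Integer, Double> col = new LinkedHashMap<>();
        for (int id : ids) {
            col.put(id, id == selfId ? Double.NaN : sim);
        }
        return col;
    }

    // Naive accumulation: x + NaN is NaN, and NaN + anything stays NaN.
    static void addAll(Map<Integer, Double> totals, Map<Integer, Double> col) {
        for (Map.Entry<Integer, Double> e : col.entrySet()) {
            totals.merge(e.getKey(), e.getValue(), Double::sum);
        }
    }

    // Guarded accumulation: NaN entries are skipped, so they never reach the sum.
    static void addSkippingNaN(Map<Integer, Double> totals, Map<Integer, Double> col) {
        for (Map.Entry<Integer, Double> e : col.entrySet()) {
            if (!Double.isNaN(e.getValue())) {
                totals.merge(e.getKey(), e.getValue(), Double::sum);
            }
        }
    }

    public static void main(String[] args) {
        int[] ids = {1, 2, 3};
        Map<Integer, Double> naive = new TreeMap<>();
        Map<Integer, Double> guarded = new TreeMap<>();
        for (int id : ids) {
            addAll(naive, column(id, ids, 0.5));
            addSkippingNaN(guarded, column(id, ids, 0.5));
        }
        System.out.println("naive:   " + naive);   // {1=NaN, 2=NaN, 3=NaN}
        System.out.println("guarded: " + guarded); // {1=1.0, 2=1.0, 3=1.0}
    }
}
```

With three columns, each item hits its own NaN exactly once, which is enough to turn every naive total into NaN. That matches the shape of the five-item dump quoted above, but it does not by itself say where the job's behavior changed.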