Mailing-List: contact user-help@mahout.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@mahout.apache.org
Received-SPF: pass (nike.apache.org: domain of joshdevins@gmail.com designates
 209.85.219.53 as permitted sender)
MIME-Version: 1.0
Sender: joshdevins@gmail.com
In-Reply-To: <5138FA8E.8010508@apache.org>
References: 
 <CAJuyzr1v=VWh1Kr+M2DpQ9HHFWrJNRvmJ-e62=FVg0vRrvvVEw@mail.gmail.com>
 <CAJwFCa3KecZOf0whLBiYNNw=R7Ej8324e_e-dw1XLTSFfBHKjg@mail.gmail.com>
 <CAEccTyz_Vr65zL8Gd0ZeQODmHA9g4rymfHM8ToxL4kYw73GG1Q@mail.gmail.com>
 <71EEBFD7-22E4-499B-BA44-1932274445E3@gmail.com>
 <CAJuyzr2WgaY+BuUB1t_n-37N_r=BDWFRnXSpMwOwWE32twrkCQ@mail.gmail.com>
 <CAEccTywYwZp9xdqjsHAk0FufDrMNwsJsdmu8dLE8jJufDzrfEA@mail.gmail.com>
 <CAJuyzr0DxnRYxwV3kf39ENVwnzxSaeyQWDN7vWDNdC9ZSV=AwA@mail.gmail.com>
 <CAEccTywSj4ox9A73_7G0EPeSfJB0daWwt3pXY9kfO9AXRKg12Q@mail.gmail.com>
 <CAJwFCa39KpFrsVKEGZM7U6n+LumLvLMDuUxH9GV1ZJ81Zw_92g@mail.gmail.com>
 <513769CB.6050509@apache.org>
 <CAJuyzr1JUJv+-jnutEZbrL=wfKghqJx69wm7rJfde93y3gDpdQ@mail.gmail.com>
 <CAJuyzr1u0L=oq3aLCat9gsdEJ51XGSYFmt4n1O5M_Aa_51uCdg@mail.gmail.com>
 <5138AAA9.5030203@apache.org>
 <CAJuyzr3pZjbWLxwf1r4OBVg-3HSgip+8tV+Xu+yC9cHon9EQTg@mail.gmail.com>
 <5138FA8E.8010508@apache.org>
From: Josh Devins <hi@joshdevins.com>
Date: Thu, 7 Mar 2013 21:53:04 +0100
Message-ID: 
 <CAJuyzr3W0-AAqCMMoKp7s_GB3NMsZbTZd8wGopWXVTvg1Xcm_Q@mail.gmail.com>
Subject: Re: Top-N recommendations from SVD
To: user@mahout.apache.org
Content-Type: multipart/alternative; boundary=e89a8ff1c6d6d0eede04d75be587

--e89a8ff1c6d6d0eede04d75be587
Content-Type: text/plain; charset=UTF-8

I'm running a job right now that uses your static `dot` method from your
previous post, on top of v0.7 (nothing from trunk). This has cut the time
down by about 1/3 but it's still around 500ms per user. I'll give your
latest patch a go, hopefully tomorrow, and report back.
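
For anyone skimming the thread, the general idea discussed below (a plain
dot product per item, plus Ted's suggestion of keeping a score threshold so
that a result object is only allocated when it can actually enter the top k)
looks roughly like the sketch that follows. This is not the code from the
patch or from Mahout: `TopKSketch`, `Scored` and the dense `double[]` feature
layout are made-up names and assumptions, purely for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative sketch only; names and data layout are hypothetical.
class TopKSketch {

    /** Plain dot product over two dense arrays of equal length. */
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Hypothetical immutable (item index, score) pair, ordered by score. */
    static final class Scored implements Comparable<Scored> {
        final int itemIndex;
        final double score;

        Scored(int itemIndex, double score) {
            this.itemIndex = itemIndex;
            this.score = score;
        }

        @Override
        public int compareTo(Scored other) {
            return Double.compare(score, other.score);
        }
    }

    /** Returns the k highest-scoring items for one user, best first. */
    static List<Scored> topK(double[] userFeatures, double[][] itemFeatures, int k) {
        List<Scored> result = new ArrayList<>();
        if (k <= 0) {
            return result;
        }
        // Min-heap: the head is the weakest of the current top k.
        PriorityQueue<Scored> heap = new PriorityQueue<>(k);
        double threshold = Double.NEGATIVE_INFINITY;
        for (int item = 0; item < itemFeatures.length; item++) {
            double score = dot(userFeatures, itemFeatures[item]);
            if (heap.size() < k) {
                heap.add(new Scored(item, score));
                if (heap.size() == k) {
                    threshold = heap.peek().score;
                }
            } else if (score > threshold) {
                // Only now do we allocate: the score beats the current minimum.
                heap.poll();
                heap.add(new Scored(item, score));
                threshold = heap.peek().score;
            }
        }
        result.addAll(heap);
        result.sort(Collections.reverseOrder());
        return result;
    }

    public static void main(String[] args) {
        double[] user = {1.0, 2.0};
        double[][] items = {{1.0, 0.0}, {0.0, 1.0}, {1.0, 1.0}, {-1.0, -1.0}};
        for (Scored s : topK(user, items, 2)) {
            System.out.println(s.itemIndex + " " + s.score);
        }
    }
}
```

With this shape, once the heap is full most items cost one dot product and
one comparison, with no allocation at all.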

We're working on another approach too. Will email you off thread if it
proves fruitful, perhaps whip up a patch as well.

Josh


On 7 March 2013 21:37, Sebastian Schelter <ssc@apache.org> wrote:

> Hi Josh,
>
> I made another attempt today. It directly computes the dot products,
> introduces a mutable version of RecommendedItem and uses Lucene's
> PriorityQueue to keep the top k.
>
> I hope this gives you some improvements.
>
> Here's the patch (must be applied against trunk):
>
>
> https://issues.apache.org/jira/secure/attachment/12572605/MAHOUT-1151-2.patch
>
> Best,
> Sebastian
>
> On 07.03.2013 16:00, Josh Devins wrote:
> > I ran from what's in trunk as of this morning. I didn't dig in further to
> > see where that extra time was coming from but can do so when I get some
> > time soon.
> >
> >
> > On 7 March 2013 15:56, Sebastian Schelter <ssc@apache.org> wrote:
> >
> >> Hi Josh,
> >>
> >> Did you run the patch from the jira issue or did you run the trunk? I
> >> made some follow up changes after uploading the patch. I can't imagine
> >> why those small changes would lead to an increase of 50% in the runtime.
> >>
> >> /s
> >>
> >>
> >>
> >> On 07.03.2013 15:02, Josh Devins wrote:
> >>> So the good news is that the patch runs ;)  The bad news is that it's
> >>> slower, going from 1600-1800ms to ~2500ms to calculate a single user's
> >>> topK recommendations. For kicks, I ran a couple other experiments,
> >>> progressively removing code to isolate the problem area. Results are
> >>> detailed here:
> >>> https://gist.github.com/joshdevins/5106930
> >>>
> >>> Conclusions thus far:
> >>>  * the patch is not helpful (for performance) and should be reverted or
> >>> fixed again (sorry Sebastian)
> >>>  * the dot product operation in `Vector` is not efficient enough for
> >>> large vectors/matrices, when used as it is in the ALS `RecommenderJob`,
> >>> inside a loop over `M`
> >>>
> >>> I've tried a few other experiments with Colt (for example) but there
> >>> was no noticeable gain. Parallelizing inside the map task (manually or
> >>> with Parallel Colt) is possible but obviously is not ideal in an
> >>> environment like Hadoop -- this would save memory since you only need a
> >>> few map tasks loading the matrices, but isn't playing very nicely within
> >>> a shared cluster :)
> >>>
> >>> Next step at this point is to look at either reducing the number of
> >>> items to recommend over, LSH or a third secret plan that "the PhD's"
> >>> are thinking about. Paper forthcoming, no doubt :D
> >>>
> >>> @Sebastian, happy to run any patches on our cluster/dataset before
> >>> making more commits.
> >>>
> >>>
> >>>
> >>> On 6 March 2013 20:58, Josh Devins <hi@joshdevins.com> wrote:
> >>>
> >>>> Got sidetracked today but I'll run Sebastian's version in trunk
> >>>> tomorrow and report back.
> >>>>
> >>>>
> >>>> On 6 March 2013 17:07, Sebastian Schelter <ssc@apache.org> wrote:
> >>>>
> >>>>> I already committed a fix in that direction. I modified our
> >>>>> FixedSizePriorityQueue to allow inspection of its head for direct
> >>>>> comparison. This obviates the need to instantiate a Comparable and
> >>>>> offer it to the queue.
> >>>>>
> >>>>> /s
> >>>>>
> >>>>>
> >>>>> On 06.03.2013 17:01, Ted Dunning wrote:
> >>>>>> I would recommend against a mutable object on maintenance grounds.
> >>>>>>
> >>>>>> Better is to keep the threshold that a new score must meet and only
> >>>>>> construct the object on need.  That cuts the allocation down to
> >>>>>> negligible levels.
> >>>>>>
> >>>>>> On Wed, Mar 6, 2013 at 6:11 AM, Sean Owen <srowen@gmail.com> wrote:
> >>>>>>
> >>>>>>> OK, that's reasonable on 35 machines. (You can turn up to 70
> >>>>>>> reducers, probably, as most machines can handle 2 reducers at once).
> >>>>>>> I think the recommendation step loads one whole matrix into memory.
> >>>>>>> You're not running out of memory but if you're turning up the heap
> >>>>>>> size to accommodate, you might be hitting swapping, yes. I think (?)
> >>>>>>> the conventional wisdom is to turn off swap for Hadoop.
> >>>>>>>
> >>>>>>> Sebastian yes that is probably a good optimization; I've had good
> >>>>>>> results reusing a mutable object in this context.
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, Mar 6, 2013 at 10:54 AM, Josh Devins <hi@joshdevins.com> wrote:
> >>>>>>>
> >>>>>>>> The factorization at 2-hours is kind of a non-issue (certainly
> >>>>>>>> fast enough). It was run with (if I recall correctly) 30 reducers
> >>>>>>>> across a 35 node cluster, with 10 iterations.
> >>>>>>>>
> >>>>>>>> I was a bit shocked at how long the recommendation step took and
> >>>>>>>> will throw some timing debug in to see where the problem lies
> >>>>>>>> exactly. There were no other jobs running on the cluster during
> >>>>>>>> these attempts, but it's certainly possible that something is
> >>>>>>>> swapping or the like. I'll be looking more closely today before I
> >>>>>>>> start to consider other options for calculating the
> >>>>>>>> recommendations.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>
> >>
> >
>
>

--e89a8ff1c6d6d0eede04d75be587--