mahout-dev mailing list archives

From Sebastian Schelter <...@apache.org>
Subject Re: 0.8 progress
Date Mon, 08 Jul 2013 17:34:30 GMT
Hi Peng,

You cannot "inject" the ParallelALSFactorizationJob into a recommender
class. Have a look at factorize-netflix.sh in examples to see how to use it
for hold-out tests.

Best,
Sebastian


2013/7/8 Peng Cheng <pc175@uowmail.edu.au>

> Hi Sebastian,
>
> I'm sorry for the entirely noobish questions: where can I download the
> judging.txt ground truth set? (Netflix is pulling it off everywhere; so far
> I can only get the legacy trainingSet and qualifying.txt.)
> And how do I inject the ParallelALSFactorizationJob into a common
> recommender class?
> I was trying to reproduce your result (I own a small cluster), but I don't
> even know where to start. The only related thing I found in mahout-example
> is a format converter.
>
> Thanks a lot if you can give me a hint.
>
> - Yours Peng
>
>
> On 13-07-01 01:24 AM, Sebastian Schelter wrote:
>
>> I successfully ran the ALS and cooccurrence-based recommenders on the
>> Netflix dataset on a 26 machine cluster using Hadoop 1.0.4.
>>
>> --sebastian
>>
>>
>> On 28.06.2013 21:31, Jake Mannix wrote:
>>
>>> I can run LDA on Twitter's cluster, on both reuters and some "real data",
>>> as well as LR/SGD.
>>>
>>>
>>> On Fri, Jun 28, 2013 at 11:51 AM, Grant Ingersoll <gsingers@apache.org> wrote:
>>>
>>>> We really should set up a VM that we can run a couple of nodes (perhaps at
>>>> ASF?) on that we can share w/ everyone that makes it easy to test our stuff
>>>> on Hadoop for the specific version that we ship.
>>>>
>>>> On Jun 28, 2013, at 2:41 PM, Robin Anil <robin.anil@gmail.com> wrote:
>>>>
>>>>> Can someone (if you have time and experience) write a small shim to run
>>>>> all examples one after the other on a cluster and write up instructions on
>>>>> how to do it?
>>>>>
>>>>> Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
>>>>>
>>>>>
>>>>> On Fri, Jun 28, 2013 at 1:11 PM, Sebastian Schelter <ssc@apache.org> wrote:
>>>>>
>>>>>> It's crucial that we retest everything on a real cluster before the release.
>>>>>> I will do this for the recommenders code next week.
>>>>>>
>>>>>> --sebastian
>>>>>> On 28.06.2013 14:03, "Grant Ingersoll" <gsingers@apache.org> wrote:
>>>>>>
>>>>>>> I should have time next week to do the release, if we can get these
>>>>>>> knocked out.  If not next week, the following.
>>>>>>>
>>>>>>> On Jun 28, 2013, at 5:46 AM, Suneel Marthi <suneel_marthi@yahoo.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> 1. Could someone look at Mahout-1257? There is a patch that's been
>>>>>>>> submitted but I am not sure if this has been superseded by Sean's against
>>>>>>>> Mahout-1239.
>>>>>>>
>>>>>>>> 2. Stevo, I am for fixing the findbugs excludes as part of 0.8 release,
>>>>>>>> I see that the number of warnings has gone up over the last few builds.
>>>>>>>
>>>>>>>> 3. I am more concerned about the cause of the mysterious cosmic rays
>>>>>>>> that randomly fail unit tests (since we have moved to running parallel
>>>>>>>> tests).  I see that happening on my local repository too.
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> ________________________________
>>>>>>>> From: Stevo Slavić <sslavic@gmail.com>
>>>>>>>> To: dev@mahout.apache.org
>>>>>>>> Sent: Friday, June 28, 2013 3:21 AM
>>>>>>>> Subject: Re: 0.8 progress
>>>>>>>>
>>>>>>>>
>>>>>>>> Well done team!
>>>>>>>>
>>>>>>>> Build is unstable, oscillates, IMO regardless of changes made. Judging from
>>>>>>>> logs I suspect that some of the Jenkins nodes are not configured well, /tmp
>>>>>>>> directory security related issues, and file size constraints. Could be also
>>>>>>>> issue with our tests.
>>>>>>>>
>>>>>>>> Javadoc was reported earlier not to be OK (not all modules in aggregated
>>>>>>>> javadoc), and code quality reports are not working OK, e.g. findbugs
>>>>>>>> doesn't respect excludes - plan to work on this during weekend.
>>>>>>>>
>>>>>>>> Do we want to fix these before or after 0.8 release?
>>>>>>>>
>>>>>>>> Kind regards,
>>>>>>>> Stevo Slavić.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Jun 28, 2013 at 12:32 AM, Robin Anil <robin.anil@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> All Done
>>>>>>>>>
>>>>>>>>> Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Jun 23, 2013 at 11:36 PM, Robin Anil <robin.anil@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I sent the comments. The code is good. But without the matrix/vector input
>>>>>>>>>> we can't ship it in the release. Hope Yiqun and Da Zhang can make those
>>>>>>>>>> changes quickly.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sun, Jun 23, 2013 at 8:46 PM, Grant Ingersoll <gsingers@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> I see 1 issue left: MAHOUT-1214.  It is assigned to Robin.  Any chance we
>>>>>>>>>>> can finish this up this week?
>>>>>>>>>>>
>>>>>>>>>>> -Grant
>>>>>>>>>>>
>>>>>>>>>>> On Jun 23, 2013, at 9:26 AM, Suneel Marthi <suneel_marthi@yahoo.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Finally got to finishing up M-833, the changes can be reviewed at
>>>>>>>>>>>> https://reviews.apache.org/r/11774/diff/3/.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> ________________________________
>>>>>>>>>>>> From: Grant Ingersoll <gsingers@apache.org>
>>>>>>>>>>>> To: dev@mahout.apache.org
>>>>>>>>>>>> Sent: Tuesday, June 11, 2013 10:09 AM
>>>>>>>>>>>> Subject: Re: 0.8 progress
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I pushed M-1030 and M-1233.  If we can get M-833 and M-1214 in by
>>>>>>>>>>>> Thursday, I can roll an RC on Thursday.
>>>>>>>>>>>
>>>>>>>>>>>> -Grant
>>>>>>>>>>>>
>>>>>>>>>>>> On Jun 11, 2013, at 8:56 AM, Grant Ingersoll <gsingers@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Down to 4 issues!  I would say what they are, but JIRA is flaking out
>>>>>>>>>>>>> again.
>>>>>>>>>>>>>
>>>>>>>>>>>>> My instinct is that 1030 and 1233 can be pushed.  Suneel has been
>>>>>>>>>>>>> working hard to get M-833 in.  Not sure on M-1214, Robin?
>>>>>>>>>>>>>
>>>>>>>>>>>>> -G
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Jun 9, 2013, at 6:10 PM, Grant Ingersoll <gsingers@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Jun 9, 2013, at 6:02 PM, Grant Ingersoll <gsingers@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> M-1067 -- Dmitriy  --  This is an enhancement, should we push?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Looks like this was committed already.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>> --------------------------------------------
>>>>>>>>>>>> Grant Ingersoll | @gsingers
>>>>>>>>>>>> http://www.lucidworks.com
>>>>>>>>>>>>
>>>>>>>>>>> --------------------------------------------
>>>>>>>>>>> Grant Ingersoll | @gsingers
>>>>>>>>>>> http://www.lucidworks.com
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --------------------------------------------
>>>>>>> Grant Ingersoll | @gsingers
>>>>>>> http://www.lucidworks.com
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --------------------------------------------
>>>> Grant Ingersoll | @gsingers
>>>> http://www.lucidworks.com
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>
>
