mahout-dev mailing list archives

From Sean Owen <>
Subject Re: Discussion Of ML environment/MR, Mahout
Date Wed, 13 Mar 2013 10:00:07 GMT
On Wed, Mar 13, 2013 at 2:04 AM, Dmitriy Lyubimov <> wrote:

> Yeah. The stuck point for me is page-rankish-finding stationary
> distributions and extremely popular ALS based stuff. We've beaten the heck
> out of it a year ago and Sebastian conclusively stated Giraph ALS knocks
> the socks off MR version. Add to that a bisect search for a good

This keeps being said, but I thought Sebastian just said that the M/R
version he described as much slower was a different version, since deleted
from this project? See my other email. The current version is similar to
the one I just benchmarked, and that appeared to be about as fast as
GraphLab (though it's still not clear whether the same amount of work is
being compared).

This matches my hunch that these things are about the same, modulo some
extra disk I/O, which is not most of the runtime.

I point it out in case this is underpinning many people's logic for
rebuilding a bunch of stuff because it will be a *lot* faster. Surely some
stuff can be done more naturally in a graph paradigm, but not everything, or
even most of it? I'm worried about the conclusion because of cases like this.
