mahout-user mailing list archives

From Chris K Wensel <ch...@wensel.net>
Subject Re: Has anyone tried Spark with Mahout?
Date Tue, 01 Nov 2011 15:35:02 GMT
I've made a few comments on the differences here.

http://www.quora.com/Apache-Hadoop/What-are-the-differences-between-Crunch-and-Cascading/answer/Chris-K-Wensel

chris

On Oct 31, 2011, at 2:44 PM, Ted Dunning wrote:

> +Chris Wensel
> 
> The biggest difference between Cascading and Plume/Crunch/FlumeJava (PCFJ) is that the
> latter all do more lazy evaluation and more program restructuring, and much less
> large-scale scheduling. Certainly the PCFJ group does much more to make the results look
> like a Java collection and is better at talking to conventional Java types.
> 
> I think that Cascading could do the more extensive job graph rewrites. It would be hard
> for Cascading to generalize its data structures, though, without major backward
> compatibility issues.
> 
> In sum, I think that the difference between Cascading and PCFJ is largely a matter of
> taste, not inherent system design.
> 
> 
> On Mon, Oct 31, 2011 at 2:36 PM, Charles Earl <charlescearl@me.com> wrote:
> Thanks. This is an insightful discussion. Having just glanced at both Plume and Crunch,
> these seem similar to Cascading in the sense of being dataflow languages. I wonder
> whether you can comment on any important distinctions.

--
Chris K Wensel
chris@concurrentinc.com
http://www.concurrentinc.com

-- Concurrent, Inc. offers mentoring, support for Cascading

