mahout-dev mailing list archives

From David Hall
Subject Re: Introduction for student interested in GSoC
Date Thu, 26 Mar 2009 01:53:57 GMT
On Wed, Mar 25, 2009 at 6:41 PM, Ted Dunning wrote:
> Groovy closures are just objects as well, but they can't easily be
> serialized because they can capture references to other objects which are
> unlikely to exist on the far machine.

Same problem in Scala... But I just punt and assume people behave.
Strong assumption, but the one heavy user of SMR (me) has so far not
had much trouble doing that. :-)
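
A minimal sketch of the capture problem, not SMR's actual code. It
assumes plain Java serialization, and AddA, Logger, LogAndAdd and Wire
are hypothetical names made up for illustration. A closure-like object
that captures only an Int survives a round trip, while one that captures
a reference to a non-serializable object fails at write time:

```scala
import java.io._

// Hypothetical stand-in for the class a compiler would generate for a
// closure that captures the Int `a`.
class AddA(a: Int) extends (Int => Int) with Serializable {
  def apply(k: Int): Int = k + a
}

// A helper that is deliberately NOT Serializable.
class Logger {
  def note(k: Int): Unit = ()
}

// A closure-like object that captures a reference to a Logger.
class LogAndAdd(log: Logger, a: Int) extends (Int => Int) with Serializable {
  def apply(k: Int): Int = { log.note(k); k + a }
}

object Wire {
  // Serialize to bytes and read back, standing in for a trip to another node.
  def roundTrip[T <: AnyRef](obj: T): T = {
    val buf = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buf)
    try out.writeObject(obj) finally out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  // True if the function object can be serialized and read back.
  def shipsCleanly(f: Int => Int): Boolean =
    try { roundTrip(f); true }
    catch { case _: NotSerializableException => false }
}
```

Here Wire.roundTrip(new AddA(3)) returns a working copy, while
Wire.shipsCleanly(new LogAndAdd(new Logger, 3)) is false because the
captured Logger cannot be written.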

> Can you say more about the compiler plugin?  Or provide a pointer?

All it does is make all anonymous closures implement [...]. The Scala
compiler is unnecessarily picky by [...]

> Also, in your example here, how would you deal with the situation where a is
> incremented in map closure?  Just punt and say undefined?

Scala encourages immutability for a reason :-)
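
A small sketch of why mutation is a problem once closures are shipped,
again assuming plain Java serialization. IncrementingAdd and MutationDemo
are hypothetical names, with a class holding a mutable field standing in
for a closure that captured a mutable `a`. The far side increments its
own deserialized copy, so the sender never sees the change:

```scala
import java.io._

// Hypothetical stand-in for a closure that captured a mutable `a`
// and increments it on every call.
class IncrementingAdd(var a: Int) extends (Int => Int) with Serializable {
  def apply(k: Int): Int = { a += 1; k + a }
}

object MutationDemo {
  // Copy an object the way a remote node would receive it.
  def copyViaSerialization[T <: AnyRef](obj: T): T = {
    val buf = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buf)
    try out.writeObject(obj) finally out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  // Returns (a on the sender, a on the "remote" copy) after two remote calls.
  def run(): (Int, Int) = {
    val local = new IncrementingAdd(3)
    val remote = copyViaSerialization(local)
    remote(10)
    remote(10)
    (local.a, remote.a)
  }

  def main(args: Array[String]): Unit = println(run())
}
```

MutationDemo.run() returns (3, 5): the copy's counter advanced twice and
the original stayed at 3, which is the sense in which the result of
incrementing `a` inside a map closure is effectively undefined.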

The "right" answer is almost certainly to define easy-to-use
constructs to communicate between the nodes when needed. I have
ThreadLocal[T], which avoids serialization.

Actually, since we're on this topic: the Wolfe et al. paper I cited at
the beginning raises the concern that MapReduce actually isn't the
right paradigm for a lot of ML, and that you need to do clever things
like using a junction tree topology to get better performance.

-- David

> On Wed, Mar 25, 2009 at 1:23 PM, David Hall wrote:
>> Scala closures are just objects. With the compiler plugin I wrote, it's
>> trivial to serialize closures and send them down the wire. In fact,
>> that's how SMR works at the moment.
>> val a = 3
>> for ((k, v) <- pairs) yield (v, k + a)
>> translates to
>> new anonfun$obfuscationgarbage$1(a)
> --
> Ted Dunning, CTO
> DeepDyve
