crunch-dev mailing list archives

From Josh Wills <>
Subject Re: Alternative strategy for incorporating Java 8 lambdas into Crunch
Date Fri, 11 Dec 2015 17:43:06 GMT
I think it's kind of awesome, but the attachment didn't go through- PR or
On Fri, Dec 11, 2015 at 7:42 AM David Whiting <> wrote:

> While fixing the bug where the IFn version of mapValues on PGroupedTable
> was missing, I got to thinking that the current approach is quite an
> inefficient way of including support for lambdas and method references,
> and it still doesn't support quite a few of the features that would make
> it easy to code against.
> Downsides of the existing lambda implementation:
> 1) Explosion of already-crowded PCollection, PTable and PGroupedTable
> interfaces, and having to implement those methods in all implementations.
> 2) Not supporting flatMap to Optional or Stream types.
> 3) Not exposing convenient types for reduce-type operations (Stream
> instead of Iterable, for example).
> Something that would solve all three of these is to build lambda support
> as a separate artifact (so we can use all java8 types), and instead of the
> API being directly on the PSomething interfaces, we just have convenient
> ways to wrap up lambdas into DoFns or MapFns via statically-imported
> methods.
> The usage then becomes
> import static org.apache.crunch.Lambda.*;
> ...
> someCollection.parallelDo(flatMap(d -> someFnOf(d)), pt);
> ...
> otherGroupedTable.mapValues(reduce(seq -> seq.mapToInt(i -> i).sum()),
> ints());
> Where flatMap and reduce are static methods on Lambda, and Lambda goes in
> its own artifact (to preserve compatibility with Java 6 and 7 for the rest
> of Crunch).
> I've attached a basic proof-of-concept implementation which I've tested a
> few things with, and I'm very happy to sketch out a more substantial
> implementation if people here think it's a good idea in general.
> Thoughts? Ideas? Suggestions? Please tell me if this is crazy.
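
Since the attachment did not come through, here is a minimal sketch of how static wrappers like the flatMap and reduce described above could work. It is not David's proof-of-concept. Emitter, DoFn and MapFn below are simplified stand-ins defined only for this example, not the real org.apache.crunch classes, and the exact signatures of the Lambda methods are assumptions. The stand-ins are abstract classes (as the Crunch function types are), so a bare lambda cannot be passed where one is expected, and that is the gap the wrappers fill.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Simplified stand-ins for illustration only (NOT the real Crunch types).
interface Emitter<T> {
    void emit(T value);
}

abstract class DoFn<S, T> {
    abstract void process(S input, Emitter<T> emitter);
}

abstract class MapFn<S, T> {
    abstract T map(S input);
}

// Hypothetical helper class; names follow the proposal, signatures are assumed.
final class Lambda {
    private Lambda() {}

    // Wraps a lambda returning a Stream into a DoFn that emits every element.
    static <S, T> DoFn<S, T> flatMap(final Function<S, Stream<T>> fn) {
        return new DoFn<S, T>() {
            @Override
            void process(S input, Emitter<T> emitter) {
                fn.apply(input).forEach(emitter::emit);
            }
        };
    }

    // Wraps a lambda over a Stream of grouped values into a MapFn over an Iterable.
    static <V, R> MapFn<Iterable<V>, R> reduce(final Function<Stream<V>, R> fn) {
        return new MapFn<Iterable<V>, R>() {
            @Override
            R map(Iterable<V> values) {
                // Sequential stream view over the Iterable of grouped values.
                return fn.apply(StreamSupport.stream(values.spliterator(), false));
            }
        };
    }
}

class LambdaSketchDemo {
    public static void main(String[] args) {
        // flatMap: one input line becomes several emitted words.
        DoFn<String, String> splitter =
            Lambda.flatMap(line -> Arrays.stream(line.split(" ")));
        List<String> words = new ArrayList<>();
        splitter.process("a b c", words::add);
        System.out.println(words);

        // reduce: the grouped values are seen as a Stream instead of an Iterable.
        MapFn<Iterable<Integer>, Integer> summer =
            Lambda.reduce(seq -> seq.mapToInt(i -> i).sum());
        System.out.println(summer.map(Arrays.asList(1, 2, 3)));
    }
}
```

With wrappers of this shape, all of the Java 8 types stay inside the separate artifact and the PCollection, PTable and PGroupedTable interfaces need no new overloads. A real implementation would also have to consider serialization of the captured lambdas, which this sketch ignores.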
