crunch-dev mailing list archives

From Josh Wills <josh.wi...@gmail.com>
Subject Re: Alternative strategy for incorporating Java 8 lambdas into Crunch
Date Fri, 11 Dec 2015 21:57:16 GMT
That's the sexiest thing I've seen in some time. +1 for a lambda module,
but how does that work in Maven-fu? Is it like a conditional compile or
something?

On Fri, Dec 11, 2015 at 1:20 PM, David Whiting <davw@apache.org> wrote:

> Oops, my bad. Here's a Gist:
> https://gist.github.com/DavW/e2588e42c45ad8c06038
>
> On 11 December 2015 at 18:43, Josh Wills <josh.wills@gmail.com> wrote:
>
> > I think it's kind of awesome, but the attachment didn't go through- PR or
> > gist?
> > On Fri, Dec 11, 2015 at 7:42 AM David Whiting <davw@apache.org> wrote:
> >
> > > While fixing the bug where the IFn version of mapValues on PGroupedTable
> > > was missing, I got thinking that this is quite an inefficient way of
> > > including support for lambdas and method references, and it still didn't
> > > actually support quite a few of the features that would make it easy to
> > > code against.
> > >
> > > Negative parts of existing lambda implementation:
> > > 1) Explosion of already-crowded PCollection, PTable and PGroupedTable
> > > interfaces, and having to implement those methods in all implementations.
> > > 2) Not supporting flatMap to Optional or Stream types.
> > > 3) Not exposing convenient types for reduce-type operations (Stream
> > > instead of Iterable, for example).
> > >
> > > Something that would solve all three of these is to build lambda support
> > > as a separate artifact (so we can use all java8 types), and instead of the
> > > API being directly on the PSomething interfaces, we just have convenient
> > > ways to wrap up lambdas into DoFns or MapFns via statically-imported
> > > methods.
> > >
> > > The usage then becomes:
> > > import static org.apache.crunch.Lambda.*;
> > > ...
> > > someCollection.parallelDo(flatMap(d -> someFnOf(d)), pt)
> > > ...
> > > otherGroupedTable.mapValues(reduce(seq -> seq.mapToInt(i -> i).sum()), ints())
> > >
> > > Where flatMap and reduce are static methods on Lambda, and Lambda goes in
> > > its own artifact (to preserve compatibility with Java 6 and 7 for the rest
> > > of Crunch).
> > > I've attached a basic proof-of-concept implementation which I've tested a
> > > few things with, and I'm very happy to sketch out a more substantial
> > > implementation if people here think it's a good idea in general.
> > >
> > > Thoughts? Ideas? Suggestions? Please tell me if this is crazy.
> > >
> > >
> >
>
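
For readers of the archive, here is a minimal sketch of the wrapping idea
proposed in the quoted message: static factory methods that turn Java 8
lambdas into function objects. It is not the code from the Gist. Emitter,
DoFn and MapFn below are hypothetical, simplified stand-ins so the example
compiles without Crunch on the classpath, and the bodies of flatMap and
reduce are assumptions about how such wrappers might look.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical, simplified stand-ins for Crunch's Emitter, DoFn and MapFn.
// The real Crunch classes are Serializable and carry more lifecycle methods;
// a real wrapper would also need serializable function types.
interface Emitter<T> {
    void emit(T value);
}

abstract class DoFn<S, T> {
    public abstract void process(S input, Emitter<T> emitter);
}

abstract class MapFn<S, T> extends DoFn<S, T> {
    public abstract T map(S input);

    @Override
    public void process(S input, Emitter<T> emitter) {
        emitter.emit(map(input));
    }
}

// Static factories in the spirit of the proposed Lambda class (assumed shape).
final class Lambda {
    private Lambda() {}

    // Wraps a lambda returning a Stream into a DoFn that emits every element.
    public static <S, T> DoFn<S, T> flatMap(Function<S, Stream<T>> fn) {
        return new DoFn<S, T>() {
            @Override
            public void process(S input, Emitter<T> emitter) {
                fn.apply(input).forEach(emitter::emit);
            }
        };
    }

    // Wraps a lambda over a Stream into a MapFn over an Iterable, so grouped
    // values can be reduced with the Stream API instead of a manual loop.
    public static <V, U> MapFn<Iterable<V>, U> reduce(Function<Stream<V>, U> fn) {
        return new MapFn<Iterable<V>, U>() {
            @Override
            public U map(Iterable<V> input) {
                return fn.apply(StreamSupport.stream(input.spliterator(), false));
            }
        };
    }
}

class LambdaDemo {
    public static void main(String[] args) {
        // flatMap: one input line fans out to several emitted words.
        List<String> words = new ArrayList<>();
        DoFn<String, String> split =
            Lambda.flatMap(line -> Arrays.stream(line.split(" ")));
        split.process("a b c", words::add);
        System.out.println(words);

        // reduce: the grouped values are summed through the Stream API.
        MapFn<Iterable<Integer>, Integer> sum =
            Lambda.reduce(seq -> seq.mapToInt(i -> i).sum());
        System.out.println(sum.map(Arrays.asList(1, 2, 3)));
    }
}
```

Because the wrappers return ordinary function objects, the existing
parallelDo and mapValues entry points can accept them unchanged, which is
how the proposal avoids adding lambda overloads to every PCollection,
PTable and PGroupedTable implementation.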
