From: Josh Wills
Date: Fri, 11 Dec 2015 17:43:06 +0000
Subject: Re: Alternative strategy for incorporating Java 8 lambdas into Crunch
To: dev@crunch.apache.org

I think it's kind of awesome, but the attachment didn't go through - PR or gist?

On Fri, Dec 11, 2015 at 7:42 AM David Whiting wrote:

> While fixing the bug where the IFn version of mapValues on PGroupedTable
> was missing, I got thinking that this is quite an inefficient way of
> including support for lambdas and method references, and it still didn't
> actually support quite a few of the features that would make it easy to
> code against.
>
> Negative parts of existing lambda implementation:
> 1) Explosion of already-crowded PCollection, PTable and PGroupedTable
>    interfaces, and having to implement those methods in all implementations.
> 2) Not supporting flatMap to Optional or Stream types.
> 3) Not exposing convenient types for reduce-type operations (Stream
>    instead of Iterable, for example).
>
> Something that would solve all three of these is to build lambda support
> as a separate artifact (so we can use all Java 8 types), and instead of the
> API being directly on the PSomething interfaces, we just have convenient
> ways to wrap up lambdas into DoFns or MapFns via statically-imported
> methods.
>
> The usage then becomes
>
>     import static org.apache.crunch.Lambda.*;
>     ...
>     someCollection.parallelDo(flatMap(d -> someFnOf(d)), pt)
>     ...
>     otherGroupedTable.mapValues(reduce(seq -> seq.mapToInt(i -> i).sum()),
>         ints())
>
> Where flatMap and reduce are static methods on Lambda, and Lambda goes in
> its own artifact (to preserve compatibility with Java 6 and 7 for the rest
> of Crunch).
>
> I've attached a basic proof-of-concept implementation which I've tested a
> few things with, and I'm very happy to sketch out a more substantial
> implementation if people here think it's a good idea in general.
>
> Thoughts? Ideas? Suggestions? Please tell me if this is crazy.
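
Since the attachment did not come through, here is a rough sketch of what the wrapping approach described above might look like. It is not the proof-of-concept from the thread. Everything in it is an assumption made for illustration: LambdaSketch, and the nested Emitter, DoFn and MapFn types, are simplified stand-ins rather than Crunch's real classes, and the signatures of flatMap and reduce are guesses based only on the usage shown in the quoted message.

```java
import java.util.function.Function;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical sketch: the nested types are simplified stand-ins,
// NOT the real org.apache.crunch classes.
final class LambdaSketch {

    /** Stand-in for an emitter that collects a function's output. */
    public interface Emitter<T> {
        void emit(T value);
    }

    /** Stand-in for a function producing zero or more outputs per input. */
    public abstract static class DoFn<S, T> {
        public abstract void process(S input, Emitter<T> emitter);
    }

    /** Stand-in for a function producing exactly one output per input. */
    public abstract static class MapFn<S, T> extends DoFn<S, T> {
        public abstract T map(S input);

        @Override
        public void process(S input, Emitter<T> emitter) {
            emitter.emit(map(input));
        }
    }

    /** Wraps a lambda returning a Stream so that every element is emitted. */
    public static <S, T> DoFn<S, T> flatMap(Function<S, Stream<T>> fn) {
        return new DoFn<S, T>() {
            @Override
            public void process(S input, Emitter<T> emitter) {
                try (Stream<T> out = fn.apply(input)) {
                    out.forEach(emitter::emit);
                }
            }
        };
    }

    /** Exposes grouped values as a Stream instead of an Iterable. */
    public static <V, R> MapFn<Iterable<V>, R> reduce(Function<Stream<V>, R> fn) {
        return new MapFn<Iterable<V>, R>() {
            @Override
            public R map(Iterable<V> values) {
                // Sequential stream over the grouped values.
                return fn.apply(StreamSupport.stream(values.spliterator(), false));
            }
        };
    }
}
```

With these stand-ins, LambdaSketch.flatMap(s -> Arrays.stream(s.split(" "))) yields a function that emits each word of its input, and LambdaSketch.reduce(seq -> seq.mapToInt(i -> i).sum()) assigned to a MapFn<Iterable<Integer>, Integer> sums a group of integers. A real version would also have to account for Crunch serializing its functions to ship them to tasks, which probably means accepting a Serializable functional interface rather than plain java.util.function.Function.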