Date: Wed, 28 Jan 2015 14:21:34 -0500
Subject: Re: .materialize() returns empty collection on pipeline error?
From: David Whiting
To: dev@crunch.apache.org

I think "fail catastrophically" is probably exactly what should happen here. You can always catch the exception and use an empty iterable if it fails.

A common use case here is to do one step, materialize it into a collection or map, then pass that into a DoFn to use as a small lookup table. With the current behaviour (returning an empty collection on failure), future steps silently continue to execute with empty lookup tables as part of their processing on the cluster.

On 28 January 2015 at 13:45, Josh Wills wrote:

> Yeah, I think that before, we would just fail catastrophically by throwing
> a CrunchRuntimeException, which I found annoying. Do you prefer that
> behavior? It's certainly something that could be configurable.
>
> J
>
> On Wed, Jan 28, 2015 at 10:36 AM, Jinal Shah
> wrote:
>
> > I think it was intended, judging from these commits:
> >
> > https://github.com/apache/crunch/commit/3711cea61bded4c90b235a01163ae5f855089917
> > and
> >
> > https://github.com/apache/crunch/commit/ded504eb133fa0814e2d90ff2a662e72a67e04bb
> > .
> > Josh can expand on this more.
> >
> > On Wed, Jan 28, 2015 at 9:26 AM, Mārtiņš Kalvāns <martins.kalvans@gmail.com>
> > wrote:
> >
> > > Hi.
> > >
> > > When a pipeline fails on the cluster with some exception, materialize()
> > > returns an empty collection and just logs an error message.
> > >
> > > I'm (very, very) puzzled about this behaviour:
> > >
> > > https://github.com/apache/crunch/blob/master/crunch-core/src/main/java/org/apache/crunch/materialize/MaterializableIterable.java#L92
> > >
> > > Is this really intended behaviour?
> > >
> > > If so, then some documentation for the materialize() function about this
> > > behaviour would be really nice to have. :)
> > >
> > > --
> > > Mārtiņš
>
> --
> Director of Data Science
> Cloudera
> Twitter: @josh_wills
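
For illustration, here is a minimal sketch of the "catch and use an empty iterable" fallback mentioned in the reply at the top of this message. It assumes the materializing call throws on pipeline failure, as in the older behaviour discussed above. It does not use the Crunch API: PipelineFailedException is a hypothetical stand-in for CrunchRuntimeException, orEmpty is a made-up helper name, and the Supplier stands in for whatever call produces the materialized data.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.function.Supplier;

// Hypothetical stand-in for CrunchRuntimeException, so that this sketch
// compiles without Crunch on the classpath.
class PipelineFailedException extends RuntimeException {
    PipelineFailedException(String message) {
        super(message);
    }
}

class MaterializeFallbackSketch {

    // Opt-in fallback: the caller explicitly asks for "empty on failure".
    // The Supplier is an assumption standing in for a materializing call
    // that throws when the pipeline fails.
    static <T> Iterable<T> orEmpty(Supplier<Iterable<T>> materialize) {
        try {
            return materialize.get();
        } catch (PipelineFailedException e) {
            // The failure is still reported, but the decision to continue
            // with no data is made here, visibly, by the caller.
            System.err.println("materialize failed, using empty fallback: " + e.getMessage());
            return Collections.emptyList();
        }
    }

    public static void main(String[] args) {
        Supplier<Iterable<String>> healthy = () -> Arrays.asList("a", "b");
        Supplier<Iterable<String>> broken = () -> {
            throw new PipelineFailedException("job failed on cluster");
        };

        // Healthy pipeline: the data passes through unchanged.
        System.out.println(orEmpty(healthy));

        // Failed pipeline, fallback requested: the result has no elements.
        System.out.println(orEmpty(broken).iterator().hasNext());

        // Failed pipeline, no fallback requested: the exception propagates,
        // so a lookup table is never silently built from nothing.
        try {
            broken.get();
        } catch (PipelineFailedException e) {
            System.out.println("failed loudly: " + e.getMessage());
        }
    }
}
```

The point of the design is that throwing by default keeps the lookup-table case safe, while callers who really want an empty result can ask for it in one line.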