Date: Thu, 27 Feb 2014 23:34:18 +0800
Subject: Re: Support OutputCommitter?
From: Chao Shi
To: dev@crunch.apache.org

Hi Tom,

I will have to use named-output. About your example DatasetTarget, is it safe to call setOutputFormat() explicitly here? I guess this may conflict with other targets that only use the same trick.

Is it possible for us to have a general approach to get OutputCommitter to work?

Hi Chao,

Crunch doesn't call the output committer explicitly itself; it's called by the MR framework as a normal part of running a job. However, in Crunch's MapReduceTarget#configureForMapReduce the output format is not typically set for the named-output case (which is the only case that is executed now, as I discovered in the thread mentioned below), so it defaults to FileOutputFormat, with its semantics.
(This is why HBaseTarget calls FileOutputFormat.setOutputPath, which it wouldn't have to if it set the output format explicitly to HBase's TableOutputFormat.)

Are you setting the HCatOutputFormat in the named-output case? In the Crunch Target I'm writing I've set the OutputFormat explicitly:

https://github.com/tomwhite/kite/blob/CDK-308-dataset-output-format/kite-data/kite-data-crunch/src/main/java/org/kitesdk/data/crunch/DatasetTarget.java#L106

Cheers,
Tom

On Thu, Feb 27, 2014 at 7:54 AM, Gabriel Reid wrote:
> For reference, here's the link to the previous thread on this:
> http://mail-archives.apache.org/mod_mbox/crunch-dev/201401.mbox/%3cCAF-WD4Sig2n7yMxiZSji8trQy-8wfUy5_7dnKC=dkSxmrfSPVA@mail.gmail.com%3e
>
> On Thu, Feb 27, 2014 at 7:56 AM, Josh Wills wrote:
>> +tom
>>
>> Didn't Tom have a thing like this a little while ago?
>>
>>
>> On Wed, Feb 26, 2014 at 8:04 PM, Chao Shi wrote:
>>
>>> Hi crunch devs,
>>>
>>> I'm developing a target wrapper for HCatOutputFormat, which uses a custom
>>> OutputCommitter to get results committed to Hive. It seems its
>>> OutputCommitter is not called at all. Looking into the code, I can't find
>>> where Crunch calls it. Is it really supported?
>>>
>>> Thanks,
>>> Chao
>>>
>>
>>
>>
>> --
>> Director of Data Science
>> Cloudera
>> Twitter: @josh_wills