Subject: Re: OutputFormat and Reduce Task
From: Dhruv <dhruv21@gmail.com>
To: user@hadoop.apache.org
Date: Fri, 2 Nov 2012 10:35:19 -0700

Thanks Harsh, just to be clear: if I have a large key set and I run with just one reducer (the default), the OutputFormat and the RecordWriter will be constructed only once?

On Thu, Nov 1, 2012 at 8:14 PM, Harsh J wrote:

> Hi Dhruv,
>
> Inline.
>
> On Fri, Nov 2, 2012 at 4:15 AM, Dhruv wrote:
> > I'm trying to optimize the performance of my OutputFormat's
> > implementation. I'm doing something similar to HBase's
> > TableOutputFormat: sending the reducer's output to a distributed k-v
> > store, so the context.write() call basically winds up doing a Put()
> > on the store.
> >
> > Although I haven't profiled, a sequence of thread dumps on the reduce
> > tasks reveals that the threads are RUNNABLE and sitting in put() and
> > its subsequent method calls. So I proceeded to decouple the two by
> > implementing the producer (context.write()) / consumer
> > (RecordWriter.write()) pattern using an ExecutorService.
>
> With HBase involved, this is only partly correct. The HTable API, which
> the regular TableOutputFormat uses, provides an "AutoFlush" option
> which, if disabled, buffers writes to the regionservers instead of
> flushing Puts/Deletes on every single invocation.
>
> TableOutputFormat disables AutoFlush by default to provide this
> behavior.
>
> Read more on that at
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#setAutoFlush(boolean,%20boolean)
> and/or in Lars' book, "HBase: The Definitive Guide".
>
> > My understanding is that Context.write() calls RecordWriter.write(),
> > and that these two are synchronous calls: the first blocks until the
> > second completes. Each reduce call blocks until context.write()
> > finishes, so the reduce on the next key also blocks, making things
> > run slowly in my case. Is this correct?
>
> Given the above explanation, this is untrue if HBase's
> TableOutputFormat is involved, but true otherwise for general
> FS-interacting OutputFormats.
>
> > Does this mean that the OutputFormat is instantiated once by the
> > TaskTracker for the job's reduce logic, and all keys operated on by
> > the reducers get the same instance of the OutputFormat? Or is a new
> > OutputFormat instantiated for each key the reducer operates on?
>
> The TaskTracker is a service daemon that does not execute any user
> code. Only a single OutputFormat object is instantiated in a single
> Task. The RecordWriter wrapped in it, too, is instantiated only once
> per Task.
>
> > Thanks,
> > Dhruv
>
> --
> Harsh J
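The producer/consumer decoupling discussed in the thread (the reduce thread enqueues records while a single background thread performs the slow store writes) can be sketched in plain Java. This is a minimal, self-contained illustration under stated assumptions: the class name, queue capacity, and the in-memory `store` list standing in for the distributed k-v store are all illustrative, not Hadoop or HBase API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;

class AsyncWriterSketch {
    private static final String POISON = "";          // sentinel that stops the consumer
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>(1024);
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final List<String> store = new ArrayList<>(); // stand-in for the slow k-v store
    private final Future<?> drain;

    AsyncWriterSketch() {
        // Consumer: drains the queue and performs the slow writes, so the
        // reducer thread never waits on the store directly.
        drain = executor.submit(() -> {
            try {
                while (true) {
                    String rec = queue.take();
                    if (rec.equals(POISON)) break;
                    store.add(rec);                    // a real impl would call the store's put()
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    // Producer side, i.e. what RecordWriter.write() would do: a cheap enqueue
    // that blocks only when the buffer is full (back-pressure).
    void write(String record) throws InterruptedException {
        queue.put(record);
    }

    // What RecordWriter.close() would do: flush everything before the task ends.
    List<String> close() throws Exception {
        queue.put(POISON);
        drain.get();                                   // wait for the consumer to finish
        executor.shutdown();
        return store;
    }
}
```

The sentinel in close() guarantees every buffered record reaches the store before the task reports success; as Harsh points out, HBase's client-side write buffer achieves a similar batching effect without a second thread.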
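For reference, the buffering Harsh describes looks roughly like the fragment below with the HTable client API of that era (per the setAutoFlush(boolean, boolean) javadoc linked above). This is an illustrative fragment, not a runnable program: the table name and buffer size are assumptions, TableOutputFormat does the equivalent internally, and later HBase releases moved this responsibility to BufferedMutator.

```
HTable table = new HTable(conf, "my_table");   // "my_table" is illustrative
table.setAutoFlush(false, false);              // buffer Puts client-side instead of flushing per call
table.setWriteBufferSize(2 * 1024 * 1024);     // flush roughly every 2 MB of buffered Puts
// ... table.put(put) calls now accumulate in the client-side write buffer ...
table.flushCommits();                          // push any remaining buffered mutations
table.close();
```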