From: jason hadoop
To: common-user@hadoop.apache.org
Subject: Re: Merging many output files from reducer
Date: Wed, 8 Jul 2009 22:54:24 -0700
Message-ID: <314098690907082254n69254779rf4804bfc012dba8a@mail.gmail.com>
References: <445c748b0907081513r278b7157l250a8f2c0f28211f@mail.gmail.com> <1E45DBB0-A0D7-40D4-B457-4D84274C8471@apache.org>

The example code from Pro Hadoop includes a sample MapReduce job that uses a
map-side join to merge the files into a single output. It is part of the
Chapter 9 examples.

On Wed, Jul 8, 2009 at 4:55 PM, Ted Dunning wrote:

> On Wed, Jul 8, 2009 at 3:38 PM, Owen O'Malley wrote:
>
> > On Jul 8, 2009, at 3:13 PM, Pankil Doshi wrote:
> >
> >> Can anyone guide me to merge my output files from reducer to single
> >> file in HDFS.
> >
> > The usual approach is to leave them as separate files.
>
> Also, the need to merge often arises from a need to import the data into an
> external database. That doesn't sound like your need because you already
> know and have rejected dfs -cat.
>
> It may help to think of the containing directory as the actual file and the
> files inside that directory as no more interesting than the inodes and
> blocks that make up a normal unix file.

--
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals
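
Ted's suggestion to treat the containing directory as the actual file can be sketched roughly as below. This is only an illustration under stated assumptions: it reads from a local directory rather than HDFS, and the function names (read_output_dir, make_demo_output) and the sample records are made up for the example. Against a real cluster the same loop would go through Hadoop's own filesystem client instead of os.listdir and open.

```python
import os
import tempfile


def read_output_dir(path):
    """Yield every record from the part-* files in `path`, in file-name order.

    The caller sees one logical stream and never needs a merged file.
    """
    for name in sorted(os.listdir(path)):
        if not name.startswith("part-"):
            continue  # skip non-data entries such as a _logs directory
        with open(os.path.join(path, name)) as f:
            for line in f:
                yield line.rstrip("\n")


def make_demo_output():
    """Create a hypothetical job output directory holding two reducer files."""
    out_dir = tempfile.mkdtemp()
    with open(os.path.join(out_dir, "part-00000"), "w") as f:
        f.write("apple\t3\nbanana\t1\n")
    with open(os.path.join(out_dir, "part-00001"), "w") as f:
        f.write("cherry\t7\n")
    return out_dir


demo_dir = make_demo_output()
records = list(read_output_dir(demo_dir))
print(records)
```

A consumer written this way works the same whether the job ran with one reducer or many, which is the reason the files are usually left separate.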