Message-ID: <4F264E4B.9030003@gmail.com>
Date: Mon, 30 Jan 2012 10:01:15 +0200
From: Ioan Eugen Stan
To: mapreduce-user@hadoop.apache.org
Subject: Re: Fw: reducers outputs

On 30.01.2012 07:47, aliyeh saeedi wrote:
> I want to save them with my own names, How NameNode will keep their names?
>
> ________________________________
> From: Joey Echeverria
> To: mapreduce-user@hadoop.apache.org; aliyeh saeedi
> Sent: Sunday, 29 January 2012, 17:10
> Subject: Re: reducers outputs
>
> Reduce output is normally stored in HDFS, just like your other files.
> Are you seeing different behavior?
>
> -Joey
>
> On Sun, Jan 29, 2012 at 1:05 AM, aliyeh saeedi wrote:
>> Hi
>> I want to save reducers outputs like other files in Hadoop. Does NameNode
>> keep any information about them? How can I do this?
>> Or can I add a new component to Hadoop like NameNode and make JobTracker to
>> consult with it too (I mean I want to make JobTracker to consult with
>> NameNode AND myNewComponent both)?

You aren't making a lot of sense, at least to me :).
But if you wish to save reducer output in some different way, you will have to write your own class that implements OutputFormat:
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/OutputFormat.html

It's easier to extend FileOutputFormat and override the parts that you need:
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/FileOutputFormat.html

The framework will call methods from that class to persist the data from the reducers. You tell the framework which output format to use by calling job.setOutputFormatClass(...). For example, job.setOutputFormatClass(SequenceFileOutputFormat.class); makes the output a SequenceFile; pass your own class there to use it instead.

Example to save under a different name:

    // Uses the newer org.apache.hadoop.mapreduce API.
    import java.io.IOException;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.TaskAttemptContext;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter;
    import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

    // Declare this as a nested class of your job class.
    public static class RenamedSequenceFile extends SequenceFileOutputFormat {
        @Override
        public Path getDefaultWorkFile(TaskAttemptContext context, String extension)
                throws IOException {
            // The committer knows the task's temporary work directory.
            FileOutputCommitter committer = (FileOutputCommitter) getOutputCommitter(context);
            // Use a custom file name instead of the default one.
            return new Path(committer.getWorkPath(), "myBetterName");
        }
    }

This will write your reducer data into a file named "myBetterName" as key/value pairs (behaviour inherited from SequenceFileOutputFormat).

I hope this helps,
--
Ioan Eugen Stan
http://ieugen.blogspot.com
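
P.S. One caveat with a fixed name like "myBetterName": as far as I can tell, with more than one reducer every task would produce a file with that same name, and they would likely clash in the job's output directory. Below is a minimal sketch of one way around that, assuming you are happy with names shaped like Hadoop's default "part-r-00000". The class and method names (ReducerOutputNames, forPartition) are made up for illustration and are not part of Hadoop; the helper is plain Java so it can be tried without a cluster.

```java
import java.util.Locale;

// Hypothetical helper (not a Hadoop class): builds a distinct output
// file name for each reducer partition.
final class ReducerOutputNames {
    private ReducerOutputNames() {}

    // Returns "<base>-r-<5-digit partition><extension>", for example
    // "myBetterName-r-00003". A null extension is treated as empty.
    static String forPartition(String base, int partition, String extension) {
        if (base == null || base.isEmpty()) {
            throw new IllegalArgumentException("base name must not be empty");
        }
        if (partition < 0) {
            throw new IllegalArgumentException("partition must be >= 0");
        }
        String ext = (extension == null) ? "" : extension;
        // Locale.ROOT keeps the digits ASCII regardless of the JVM locale.
        return String.format(Locale.ROOT, "%s-r-%05d%s", base, partition, ext);
    }
}
```

Inside getDefaultWorkFile you could then, I believe, take the partition number from context.getTaskAttemptID().getTaskID().getId() and pass the resulting name to new Path(committer.getWorkPath(), name) in place of the fixed string.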