From: Sonal Goyal
To: user@hadoop.apache.org
Date: Sat, 12 Oct 2013 20:23:40 +0530
Subject: Re: Writing to multiple directories in hadoop

Hi Jamal,

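If I remember correctly, you can use the write(key, value, basePath) method of MultipleOutputs in your reducer to write to different directories.

http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.html#write(KEYOUT, VALUEOUT, java.lang.String)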

Here is what the API documentation says:

Use MultipleOutputs.write(KEYOUT key, VALUEOUT value, String baseOutputPath) to write key and value to a path specified by baseOutputPath, with no need to specify a named output:

 private MultipleOutputs<Text, Text> out;

 public void setup(Context context) {
   out = new MultipleOutputs<Text, Text>(context);
   ...
 }

 public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
   for (Text t : values) {
     out.write(key, t, generateFileName(<parameter list...>));
   }
 }

 protected void cleanup(Context context) throws IOException, InterruptedException {
   out.close();
 }
 

Use your own code in generateFileName() to create a custom path to your results. '/' characters in baseOutputPath will be translated into directory levels in your file system. Also, append your custom-generated path with "part" or similar, otherwise your output will be -00000, -00001 etc. No call to context.write() is necessary. See example generateFileName() code below.

 private String generateFileName(Text k) {
   // expect Text k in format "Surname|Forename"
   String[] kStr = k.toString().split("\\|");

   String sName = kStr[0];
   String fName = kStr[1];

   // example for k = Smith|John
   // output written to /user/hadoop/path/to/output/Smith/John-r-00000 (etc)
   return sName + "/" + fName;
 }
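
As an aside, the example above assumes every key contains a '|' and names the output files after the forename. A minimal sketch of a more defensive variant, written in plain Java with no Hadoop dependency so it can be run on its own, might look like the following. The class name KeyPathSketch, the String parameter (instead of Text), and the "_malformed" fallback directory are assumptions made for illustration, not part of the Hadoop API:

```java
/** Illustration only: a String-based, defensive variant of generateFileName(). */
class KeyPathSketch {

    // Hypothetical fallback directory for keys that are not "Surname|Forename".
    static final String MALFORMED_DIR = "_malformed";

    /** Returns a base output path such as "Smith/John/part" for the key "Smith|John". */
    static String generateFileName(String key) {
        String[] parts = key.split("\\|");
        // split() drops trailing empty strings, so "Smith|" yields a single element.
        if (parts.length < 2 || parts[0].isEmpty() || parts[1].isEmpty()) {
            return MALFORMED_DIR + "/part";
        }
        // Ending the base path with "part" keeps the familiar part-r-NNNNN file names.
        return parts[0] + "/" + parts[1] + "/part";
    }

    public static void main(String[] args) {
        for (String key : new String[] {"Smith|John", "Smith|", "no-separator"}) {
            System.out.println(key + " -> " + generateFileName(key));
        }
    }
}
```

In a reducer, the result would be passed as the third argument, for example out.write(key, t, KeyPathSketch.generateFileName(key.toString())), which should give files such as Smith/John/part-r-00000.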

Best Regards,
Sonal
Nube Technologies

On Sat, Oct 12, 2013 at 3:49 AM, jamal sasha <jamalshasha@gmail.com> wrote:
Hi,

I am trying to separate the output from my reducer into different folders.

My driver has the following code:
    FileOutputFormat.setOutputPath(job, new Path(output));
    //MultipleOutputs.addNamedOutput(job, namedOutput, outputFormatClass, keyClass, valueClass)
    MultipleOutputs.addNamedOutput(job, "foo", TextOutputFormat.class, NullWritable.class, Text.class);
    MultipleOutputs.addNamedOutput(job, "bar", TextOutputFormat.class, Text.class, NullWritable.class);
    MultipleOutputs.addNamedOutput(job, "foobar", TextOutputFormat.class, Text.class, NullWritable.class);

And then my reducer has the following code:
    mos.write("foo", NullWritable.get(), new Text(jsn.toString()));
    mos.write("bar", key, NullWritable.get());
    mos.write("foobar", key, NullWritable.get());

But in the output, I see:

output/foo-r-0001
output/foo-r-0002
output/foobar-r-0001
output/bar-r-0001


But what I am trying to get is:

output/foo/part-r-0001
output/foo/part-r-0002
output/bar/part-r-0001
output/foobar/part-r-0001

How do I do this?
Thanks
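
For the layout asked for above, one option might be to pass a base path of the form "<name>/part" as the last argument of write(). The sketch below is plain Java with no Hadoop dependency; it only computes the paths. The class and method names (LayoutSketch, basePathFor, reduceFile) are hypothetical, and the file-name pattern is the base-r-NNNNN form from the generateFileName() example, with the five-digit partition number that Hadoop uses as far as I know:

```java
import java.util.Locale;

/** Illustration only: predicts where reducer output would land for a given base path. */
class LayoutSketch {

    /** Base path to pass as the last argument of write(): "<name>/part". */
    static String basePathFor(String namedOutput) {
        return namedOutput + "/part";
    }

    /** Predicted file for a reduce task, following the <base>-r-NNNNN pattern. */
    static String reduceFile(String outputDir, String basePath, int partition) {
        return String.format(Locale.ROOT, "%s/%s-r-%05d", outputDir, basePath, partition);
    }

    public static void main(String[] args) {
        for (String name : new String[] {"foo", "bar", "foobar"}) {
            // In the reducer this would correspond to a call along the lines of:
            //   mos.write(name, key, value, basePathFor(name));
            System.out.println(reduceFile("output", basePathFor(name), 1));
        }
    }
}
```

Running it prints output/foo/part-r-00001, output/bar/part-r-00001 and output/foobar/part-r-00001. In the reducer, the four-argument write(namedOutput, key, value, baseOutputPath) of MultipleOutputs should let the existing named outputs keep their own key and value classes while using these base paths.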
