hadoop-mapreduce-user mailing list archives

From Joey Echeverria <j...@cloudera.com>
Subject Re: Multiple Outputs Not Being Written to File
Date Fri, 06 May 2011 18:15:32 GMT
You need to add a call to MultipleOutputs.close() in your reducer's cleanup:

 @Override
 protected void cleanup(Context context) throws IOException, InterruptedException {
   mos.close();
   ...
 }
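
The symptom in the quoted message (files created and named correctly, but empty) is what one would expect if written records sit in an in-memory buffer and the writer is never closed. The sketch below is a plain-Java analogy, not Hadoop code: it uses a java.io BufferedWriter as a stand-in for the record writers that MultipleOutputs holds open, and the class and method names (CloseDemo, sizesBeforeAndAfterClose) are made up for illustration. It assumes the record is small enough to fit in the writer's default buffer.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class CloseDemo {

    // Writes one short record to a temp file and returns the file size
    // measured before and after the writer is closed.
    static long[] sizesBeforeAndAfterClose() throws IOException {
        Path file = Files.createTempFile("part-", ".txt");
        try {
            BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
            writer.write("id-1\tsome,record,fields\n");

            // The record is still in the writer's buffer, so the file exists but is empty.
            long before = Files.size(file);

            // close() flushes the buffer to disk, the step the reducer was missing.
            writer.close();
            long after = Files.size(file);

            return new long[] {before, after};
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        long[] sizes = sizesBeforeAndAfterClose();
        System.out.println("bytes on disk before close: " + sizes[0]);
        System.out.println("bytes on disk after close:  " + sizes[1]);
    }
}
```

In the reducer, mos.close() in cleanup() plays the role of writer.close() here. Large outputs can overflow the buffer and reach disk without it, which would be consistent with most, but not all, of the files being empty.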

On Fri, May 6, 2011 at 1:55 PM, Geoffry Roberts
<geoffry.roberts@gmail.com> wrote:
> All,
>
> I am attempting to take a large file and split it up into a series of
> smaller files.  I want the smaller files to be named based on values taken
> from the large file.  I am using
> org.apache.hadoop.mapreduce.lib.output.MultipleOutputs to do this.
>
> The job runs without error and produces a set of files as expected and each
> file is named as expected.  But most of the files are empty.  Apparently, no
> data was written to them.  The fact that the file was created at all should
> confirm that there was data coming in from the mapper.  My reducer counts
> the values as it iterates through them, then logs the count.  I am seeing
> reasonable counts in my logs.  The number of lines in an output file should
> equal the count.   I have counts but no lines.
>
> What could be causing this?
>
> My Mapper:
> protected void map(LongWritable key, Text value, Context ctx) throws
> IOException,
>             InterruptedException {
>         String[] ss = value.toString().split(",");
>         String locale = ss[F.DEPARTURE_LOCALE];
>         ctx.write(new Text(locale), value);
>     }
>
> My Reducer:
> private MultipleOutputs<Text, Text> mos;
>
> @Override
>  protected void setup(Context ctx) throws IOException, InterruptedException
> {
>         mos = new MultipleOutputs<Text, Text>(ctx);
>  }
>
>     @Override
>     protected void reduce(Text key, Iterable<Text> values, Context ctx)
>             throws IOException, InterruptedException {
>         int k = 0;
>         /*
>          * The key at this point can have blanks and slashes. Let us get rid
>          * of both.
>          */
>         String blankless = key.toString().replace(' ', '+');
>         String path = blankless.toString().replace("/", "");
>         try {
>             for (Text value : values) {
>                 k++;
>                 String[] ss = value.toString().split(F.DELIMITER);
>                 String id = ss[F.ID];
>                 String[] sslessid = Arrays.copyOfRange(ss, 1, ss.length);
>                 String line = UT.array2String(sslessid);
>
> // An output file is being created,
>                 mos.write(new Text(id), new Text(line), path);
>             }
>         } catch (NullPointerException e) {
>             LOG.error("<br/>" + "blankless=" + blankless);
>             LOG.error("<br/>" + "values=" + values.toString());
>         }
>
> // In my logs, I see reasonable counts even when the output file is empty.
>         LOG.info("<br/>key=" + path + " count=" + k);
>     }
> --
> Geoffry Roberts
>
>



-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434
