hadoop-common-user mailing list archives

From "Andy Li" <annndy....@gmail.com>
Subject Re: FileOutputFormat which does not write key value?
Date Wed, 20 Feb 2008 02:01:21 GMT
Shouldn't the official way to do this be to implement your own RecordWriter
and your own OutputFormat class?

conf.setOutputFormat(YourOutputFormat.class);

Inside YourOutputFormat, you can return your own RecordWriter class from the
getRecordWriter method.
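
A minimal plain-Java sketch of that arrangement, so it can run without a
cluster. It assumes nothing about the real Hadoop signatures:
SimpleRecordWriter, ValueOnlyRecordWriter, ValueOnlyOutputFormat and
ValueOnlyDemo are hypothetical stand-ins that only mirror the roles of
RecordWriter and OutputFormat. In a real job the writer would implement
Hadoop's RecordWriter and be returned from getRecordWriter.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

/** Hypothetical stand-in for the role of Hadoop's RecordWriter (not the real interface). */
interface SimpleRecordWriter<K, V> {
    void write(K key, V value);
    void close();
}

/** Writes one line per record containing only the value; the key is ignored. */
class ValueOnlyRecordWriter<K, V> implements SimpleRecordWriter<K, V> {
    private final PrintWriter out;

    ValueOnlyRecordWriter(PrintWriter out) {
        this.out = out;
    }

    @Override
    public void write(K key, V value) {
        if (value == null) {
            return; // nothing to emit for this record
        }
        out.print(value);
        out.print('\n');
    }

    @Override
    public void close() {
        out.close(); // flushes buffered output before closing
    }
}

/** Hypothetical stand-in for an OutputFormat: its only job is to hand out the writer. */
class ValueOnlyOutputFormat<K, V> {
    SimpleRecordWriter<K, V> getRecordWriter(PrintWriter out) {
        return new ValueOnlyRecordWriter<>(out);
    }
}

class ValueOnlyDemo {
    /** Writes two records into an in-memory buffer and returns the resulting text. */
    static String render() {
        StringWriter buffer = new StringWriter();
        SimpleRecordWriter<String, String> writer =
                new ValueOnlyOutputFormat<String, String>()
                        .getRecordWriter(new PrintWriter(buffer));
        writer.write("key1", "abc");
        writer.write("key2", "bca");
        writer.close();
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(render());
    }
}
```

The keys never reach the output because the writer, not the reducer, decides
what a record looks like on disk.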

I did the same thing on the input side, using FileInputFormat with my own
RecordReader, and it worked for me: the Mapper received a KEY and a null
VALUE.  I believe the same approach works in the other direction for output.

But there should be a formal way to do this, instead of relying on trial and
error to find out what the system default is.  I guess there is no standard
spec that defines the default values?
Maybe this is why Ted is so concerned about incompatibility in 0.16.*?

-Andy

On Feb 19, 2008 3:02 PM, Lukas Vlcek <lukas.vlcek@gmail.com> wrote:

> Hmmm...
>
> Maybe I should go to bed (it is just midnight in my part of the
> world...) but I think I did what you are saying:
>
> Configuration:
>         conf.setOutputKeyClass(NullWritable.class);
>         conf.setOutputValueClass(Text.class);
>
> And the reducer:
> public class PermutationReduce extends MapReduceBase
>         implements Reducer<Text, Text, NullWritable, Text> {
>
>    public void reduce(Text key, Iterator<Text> values,
>            OutputCollector<NullWritable, Text> output, Reporter reporter)
>            throws IOException {
>        while (values.hasNext()) {
>            output.collect(NullWritable.get(), values.next());
>        }
>    }
> }
>
> Regards,
> Lukas
>
> On 2/19/08, Owen O'Malley <oom@yahoo-inc.com> wrote:
> >
> >
> > On Feb 19, 2008, at 1:52 PM, Lukas Vlcek wrote:
> >
> > > Hi,
> > >
> > > I don't care about the key in the output file. Is there any way
> > > to suppress the key in the output?
> > > Is there a way to tell (Text)OutputFormat to write only the value
> > > and not the key? Or can I pass my own implementation of RecordWriter
> > > into FileOutputFormat?
> >
> > The easiest way is to put either null or a NullWritable in for the
> > key coming out of the reduce. The TextOutputFormat will drop the tab
> > character. You can also define your own OutputFormat and encode them
> > as you wish.
> >
> > -- Owen
> >
>
>
>
> --
> http://blog.lukas-vlcek.com/
>
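
The rule Owen describes (no key means no tab) can be modelled in plain Java.
This is only a sketch of the described behaviour, not Hadoop's source:
TabDroppingLineWriter is a hypothetical class, and NULL_MARKER is a
hypothetical sentinel standing in for NullWritable.get().

```java
import java.io.PrintWriter;
import java.io.StringWriter;

/** Hypothetical model of a text line writer that omits an absent key and its tab. */
class TabDroppingLineWriter {
    /** Hypothetical sentinel playing the role of NullWritable.get(). */
    static final Object NULL_MARKER = new Object();

    private final PrintWriter out;

    TabDroppingLineWriter(PrintWriter out) {
        this.out = out;
    }

    void write(Object key, Object value) {
        boolean noKey = (key == null) || (key == NULL_MARKER);
        if (!noKey) {
            out.print(key);
            out.print('\t'); // the separator is written only when a key is present
        }
        out.print(value);
        out.print('\n');
    }

    /** Convenience helper: formats a single record and returns it as a string. */
    static String format(Object key, Object value) {
        StringWriter buffer = new StringWriter();
        PrintWriter writer = new PrintWriter(buffer);
        new TabDroppingLineWriter(writer).write(key, value);
        writer.flush();
        return buffer.toString();
    }
}
```

With a real key, format("k", "v") yields the key, a tab, and the value on one
line; with null or NULL_MARKER as the key, only the value is written.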
