hadoop-common-user mailing list archives

From Ted Dunning <tdunn...@veoh.com>
Subject Re: FileOutputFormat which does not write key value?
Date Wed, 20 Feb 2008 02:13:56 GMT

Re-reading the thread convinces me that this is a difference between
TextOutputFormat and other output formats.
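
To make the difference concrete, here is a minimal sketch of the line-writing
rule Owen describes further down the thread: when the key is null or a
NullWritable, only the value is written and the tab separator is dropped. It
is plain Java rather than Hadoop code. NullMarker is a hypothetical stand-in
for org.apache.hadoop.io.NullWritable, and TextLineSketch is a name made up
for illustration, so treat it as an approximation of the behaviour, not as
the actual TextOutputFormat source.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical stand-in for org.apache.hadoop.io.NullWritable,
// so the sketch compiles without Hadoop on the classpath.
final class NullMarker {
    private static final NullMarker INSTANCE = new NullMarker();
    private NullMarker() {}
    static NullMarker get() { return INSTANCE; }
}

// Hypothetical helper that mimics the rule described in this thread.
final class TextLineSketch {

    // Writes one record as a text line; an absent key or value drops the tab.
    static void writeRecord(Writer out, Object key, Object value) throws IOException {
        boolean nullKey = key == null || key instanceof NullMarker;
        boolean nullValue = value == null || value instanceof NullMarker;
        if (nullKey && nullValue) {
            return; // nothing to write for this record
        }
        if (!nullKey) {
            out.write(key.toString());
        }
        if (!nullKey && !nullValue) {
            out.write('\t'); // separator only when both sides are present
        }
        if (!nullValue) {
            out.write(value.toString());
        }
        out.write('\n');
    }

    // Convenience wrapper that returns the line as a String.
    static String render(Object key, Object value) {
        StringWriter sink = new StringWriter();
        try {
            writeRecord(sink, key, value);
        } catch (IOException e) {
            throw new IllegalStateException(e); // StringWriter does not throw in practice
        }
        return sink.toString();
    }
}
```

With this rule, render(NullMarker.get(), "abc") yields the value alone on its
line, while render("k", "abc") yields the key, a tab, and the value. An output
format that serializes key and value in a fixed binary layout has no such
shortcut, which is the difference I mean.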


On 2/19/08 6:01 PM, "Andy Li" <annndy.lee@gmail.com> wrote:

> Shouldn't the official way to do this be to implement your own RecordWriter
> and your own OutputFormat class?
> 
> conf.setOutputFormat(yourClass);
> 
> Inside yourClass, you can return your own RecordWriter class from the
> getRecordWriter method.
> 
> I did this for FileInputFormat with my own RecordReader, and it worked for
> me: the Mapper received a KEY and a null VALUE.  I believe the same approach
> works in the other direction.
> 
> But there should be a formal way to find out what the system default is,
> instead of trial and error.  I guess the system does not have a standard
> spec that defines the default values?
> Maybe this is why Ted is so concerned about incompatibility in 0.16.*?
> 
> -Andy
> 
> On Feb 19, 2008 3:02 PM, Lukas Vlcek <lukas.vlcek@gmail.com> wrote:
> 
>> Hmmm...
>> 
>> Maybe I should rather go to bed (it is just midnight in my part of the
>> world...), but I think I did what you are saying:
>> 
>> Configuration:
>>         conf.setOutputKeyClass(NullWritable.class);
>>         conf.setOutputValueClass(Text.class);
>> 
>> And the reducer:
>> public class PermutationReduce extends MapReduceBase
>>         implements Reducer<Text, Text, NullWritable, Text> {
>>
>>     public void reduce(Text key, Iterator<Text> values,
>>             OutputCollector<NullWritable, Text> output, Reporter reporter)
>>             throws IOException {
>>         // Emit every value with a NullWritable key so only the value is written.
>>         while (values.hasNext()) {
>>             output.collect(NullWritable.get(), values.next());
>>         }
>>     }
>> }
>> 
>> Regards,
>> Lukas
>> 
>> On 2/19/08, Owen O'Malley <oom@yahoo-inc.com> wrote:
>>> 
>>> 
>>> On Feb 19, 2008, at 1:52 PM, Lukas Vlcek wrote:
>>> 
>>>> Hi,
>>>> 
>>>> I don't care about the key in the output file. Is there any way I can
>>>> suppress the key in the output?
>>>> Is there a way to tell (Text)OutputFormat to write only the value, not
>>>> the key? Or can I pass my own implementation of RecordWriter into
>>>> FileOutputFormat?
>>> 
>>> The easiest way is to put either null or a NullWritable in for the
>>> key coming out of the reduce. The TextOutputFormat will drop the tab
>>> character. You can also define your own OutputFormat and encode them
>>> as you wish.
>>> 
>>> -- Owen
>>> 
>> 
>> 
>> 
>> --
>> http://blog.lukas-vlcek.com/
>> 
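
For the other route mentioned in the thread (returning your own RecordWriter
from getRecordWriter), the writer itself could look roughly like the sketch
below. SketchRecordWriter is a hypothetical stand-in for Hadoop's
org.apache.hadoop.mapred.RecordWriter interface, whose real close() method
takes a Reporter argument. ValueOnlyWriter and ValueOnlyDemo are likewise
names invented for illustration, and the writer targets a java.io.Writer
instead of a file in HDFS so that the example stays self-contained.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical stand-in for Hadoop's RecordWriter interface
// (the real close() takes a Reporter argument).
interface SketchRecordWriter<K, V> {
    void write(K key, V value) throws IOException;
    void close() throws IOException;
}

// Writes only the value of each record, one per line; the key is ignored.
final class ValueOnlyWriter<K, V> implements SketchRecordWriter<K, V> {
    private final Writer out;

    ValueOnlyWriter(Writer out) {
        this.out = out;
    }

    @Override
    public void write(K key, V value) throws IOException {
        if (value == null) {
            return; // skip records that carry no value
        }
        out.write(value.toString());
        out.write('\n');
    }

    @Override
    public void close() throws IOException {
        out.close();
    }
}

// Small driver that feeds two records through the writer.
final class ValueOnlyDemo {
    static String run() {
        StringWriter sink = new StringWriter();
        SketchRecordWriter<String, String> writer = new ValueOnlyWriter<>(sink);
        try {
            writer.write("k1", "first");
            writer.write("k2", "second");
            writer.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }
}
```

In a real job, a class like this would be returned from the getRecordWriter
method of a custom OutputFormat registered with conf.setOutputFormat, as Andy
describes. The advantage over relying on the NullWritable key is that the
output no longer depends on what a particular output format does with an
absent key.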

