hadoop-common-user mailing list archives

From Mark <static.void....@gmail.com>
Subject Re: KeyValueTextInputFormat
Date Fri, 27 Aug 2010 14:41:42 GMT
  On 8/26/10 7:47 PM, newpant wrote:
> Hi, did you call JobConf.setInputFormat(KeyValueTextInputFormat.class) to set
> the input format class? The default input format class is TextInputFormat,
> whose key type is LongWritable; the key stores the byte offset of each line
> in the file.
>
> If your mapper's output key or value type differs from the reducer's output
> type, you need to call setMapOutputKeyClass and setMapOutputValueClass.
>
> 2010/8/27 Mark <static.void.dev@gmail.com>
>
>> When I configure my job to use a KeyValueTextInputFormat, doesn't that
>> imply that the key and value passed to my mapper will both be Text?
>>
>> I have it set up like this, and I am using the default Mapper.class, i.e.
>> IdentityMapper:
>> - KeyValueTextInputFormat.addInputPath(job, new Path(otherArgs[0]));
>>
>> but I keep receiving this error:
>> - java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be
>> cast to org.apache.hadoop.io.Text
>>
>> I would expect this error if I were using FileInputFormat, because that
>> returns the key as a LongWritable and the value as Text, but I am unsure
>> why it's happening here.
>>
>> On the same note, when I supply FileInputFormat or
>> KeyValueTextInputFormat, does that implicitly set job.setMapOutputKeyClass
>> and job.setMapOutputValueClass? When are these used?
>>
>> Thanks for the clarification
>>
No, I didn't set that, and when I did, everything worked as expected. I
thought that if I used:

KeyValueTextInputFormat.addInputPath(job, new Path(otherArgs[0]))

it would set that for me, or at least know that the input would be
Text/Text. I'm guessing that is wrong.
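
A likely explanation, sketched here under stated assumptions: addInputPath is a static helper declared on FileInputFormat, so calling it through the KeyValueTextInputFormat class name only records the path and leaves the job's input format at its default. The classes below (FakeJob, BaseInputFormat, KeyValueInputFormat) are hypothetical stand-ins written in plain Java, not Hadoop classes; they only illustrate how an inherited static method behaves.

```java
import java.util.ArrayList;
import java.util.List;

public class StaticInheritanceDemo {
    // Hypothetical stand-in for a job configuration; not a Hadoop class.
    static class FakeJob {
        final List<String> inputPaths = new ArrayList<>();
        String inputFormat = "TextInputFormat"; // assumed default, mirroring Hadoop
    }

    // Hypothetical stand-in for FileInputFormat: it owns the static helper.
    static class BaseInputFormat {
        static void addInputPath(FakeJob job, String path) {
            job.inputPaths.add(path); // records the path and nothing else
        }
    }

    // Hypothetical stand-in for KeyValueTextInputFormat: inherits the helper.
    static class KeyValueInputFormat extends BaseInputFormat { }

    static FakeJob configure(boolean setFormatExplicitly) {
        FakeJob job = new FakeJob();
        // Resolves to BaseInputFormat.addInputPath; the subclass name used
        // in the call has no effect on what the method does.
        KeyValueInputFormat.addInputPath(job, "/data/in");
        if (setFormatExplicitly) {
            job.inputFormat = "KeyValueInputFormat"; // the separate, explicit step
        }
        return job;
    }

    public static void main(String[] args) {
        System.out.println(configure(false).inputFormat); // TextInputFormat
        System.out.println(configure(true).inputFormat);  // KeyValueInputFormat
    }
}
```

In a real job the explicit step would be JobConf.setInputFormat(KeyValueTextInputFormat.class) in the old API, or job.setInputFormatClass(KeyValueTextInputFormat.class) in the new one.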

> If your mapper's output key or value type differs from the reducer's output
> type, you need to call setMapOutputKeyClass and setMapOutputValueClass.

When would this ever come up? Does it just cast to the appropriate 
classes then?
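
One case where this could come up, as a rough sketch: a job that averages counts per key. The map side emits integer values, while the reduce side emits a floating-point average, so the intermediate value type differs from the final one. The code below is plain Java with hypothetical names (MapOutputTypesDemo, mapAndGroup, reduce) and does not use Hadoop. In a real job this situation would presumably call for job.setMapOutputValueClass(IntWritable.class) alongside job.setOutputValueClass(DoubleWritable.class).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MapOutputTypesDemo {
    // "Map" phase: each line "key<TAB>count" becomes a (String, Integer) pair,
    // and the pairs are grouped by key as the shuffle would do.
    static Map<String, List<Integer>> mapAndGroup(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.split("\t", 2);
            grouped.computeIfAbsent(parts[0], k -> new ArrayList<>())
                   .add(Integer.parseInt(parts[1]));
        }
        return grouped;
    }

    // "Reduce" phase: Integer values come in, a Double average goes out,
    // so the intermediate and final value types differ.
    static Map<String, Double> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Double> averages = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            int sum = 0;
            for (int v : e.getValue()) {
                sum += v;
            }
            averages.put(e.getKey(), (double) sum / e.getValue().size());
        }
        return averages;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a\t1", "b\t4", "a\t2");
        System.out.println(reduce(mapAndGroup(lines))); // {a=1.5, b=4.0}
    }
}
```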

Thanks

