hadoop-mapreduce-user mailing list archives

From Bejoy Ks <bejoy.had...@gmail.com>
Subject Re: Reading fields from a Text line
Date Thu, 02 Aug 2012 15:27:13 GMT
Hi Tariq

Again, I strongly suspect the IdentityMapper is in play here. My reasoning is this:

When the output file contains the whole input data unchanged, it is the
IdentityMapper that ran. Because the input key type at the class level
(IntWritable) does not match the one at the method level (LongWritable),
your map() overloads Mapper.map() instead of overriding it, and the
framework silently falls back to the identity behaviour. I have noticed
this fallback while using the new mapreduce API.
public static class XPTMapper
        extends Mapper<IntWritable, Text, LongWritable, Text> {  // class says IntWritable

    public void map(LongWritable key, Text value, Context context)  // method says LongWritable
            throws IOException, InterruptedException {

When you change the input key type at the class level to LongWritable, it is
your custom mapper (XPTMapper) that gets called. In some exceptional cases it
takes only the if branch where you write nothing out of the mapper, and hence
you get an empty output file.

public static class XPTMapper
        extends Mapper<LongWritable, Text, LongWritable, Text> {  // key types now match

    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {

To cross-check this, enable some logging in your code to see exactly what is
happening.

By the way, do you see the output of this line in your logs when you change
the input key type to LongWritable?
context.setStatus("INVALID LINE..SKIPPING........");
If so, that confirms my assumption. :)

Try adding more logs to trace the flow and see what is going wrong, or use
MRUnit to unit-test your mapper as a first step.

Hope it helps!

Regards
Bejoy KS
