hadoop-common-user mailing list archives

From "Ryan LeCompte" <lecom...@gmail.com>
Subject Re: Reduce hanging with custom value objects?
Date Sat, 30 Aug 2008 19:41:29 GMT
I see this in the syslog for the map task:

2008-08-30 14:34:57,186 INFO org.apache.hadoop.mapred.MapTask: Starting flush of map output
2008-08-30 14:34:57,186 INFO org.apache.hadoop.mapred.MapTask: bufstart = 0; bufend = 121688; bufvoid = 99614720
2008-08-30 14:34:57,186 INFO org.apache.hadoop.mapred.MapTask: kvstart = 0; kvend = 2173; length = 327680
2008-08-30 14:34:57,365 INFO org.apache.hadoop.mapred.MapTask: Index: (0, 3076, 3076)
2008-08-30 14:34:57,365 INFO org.apache.hadoop.mapred.MapTask: Finished spill 0
2008-08-30 14:34:57,396 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200808301358_0012_m_000000_0: No outputs to promote from hdfs://localhost:54310/user/ryan/output/_temporary/_attempt_200808301358_0012_m_000000_0

My guess is that this is why the reduce task is timing out: it's not
getting any data to process. I added print statements to the custom
Writable's readFields()/write() methods, and their output shows up in
the stdout logs, so both methods are definitely being called. Any
ideas why this could be happening? (My job config is quoted below,
and at the end of this message I've sketched the full driver with the
defaults written out.)
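
In case it's useful for spotting a serialization bug, here's the
minimal shape a value class like this needs. The field names below
are made up for illustration -- my real class has different ones. The
important contract is that readFields() reads back exactly the bytes
that write() wrote, in the same order, and overwrites every field,
since Hadoop reuses Writable instances between records:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.Writable;

    public class CustomWritable implements Writable {
      // Illustrative fields only.
      private long count;
      private String name;

      // The framework needs a no-arg constructor so it can
      // instantiate the class reflectively on the reduce side.
      public CustomWritable() {}

      public void write(DataOutput out) throws IOException {
        out.writeLong(count);
        out.writeUTF(name);
      }

      public void readFields(DataInput in) throws IOException {
        // Must consume exactly what write() produced, in the same
        // order, and reset all state (instances get reused).
        count = in.readLong();
        name = in.readUTF();
      }
    }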

Thanks,
Ryan


On Sat, Aug 30, 2008 at 10:32 AM, Ryan LeCompte <lecompte@gmail.com> wrote:
> The job finally came back with output. Note that I don't get the
> errors below when I use just the primitive writables:
>
> 08/08/30 09:00:57 INFO mapred.FileInputFormat: Total input paths to process : 3
> 08/08/30 09:00:57 INFO mapred.FileInputFormat: Total input paths to process : 3
> 08/08/30 09:00:58 INFO mapred.JobClient: Running job: job_200808300858_0003
> 08/08/30 09:00:59 INFO mapred.JobClient:  map 0% reduce 0%
> 08/08/30 09:01:05 INFO mapred.JobClient:  map 33% reduce 0%
> 08/08/30 09:01:09 INFO mapred.JobClient:  map 100% reduce 0%
> 08/08/30 09:24:55 INFO mapred.JobClient: Task Id : attempt_200808300858_0003_m_000001_0, Status : FAILED
> Too many fetch-failures
> 08/08/30 09:28:04 WARN mapred.JobClient: Error reading task outputConnection timed out
>
> Any ideas?
>
> On Sat, Aug 30, 2008 at 10:10 AM, Ryan LeCompte <lecompte@gmail.com> wrote:
>> Hello all,
>>
>> I'm new to Hadoop. I'm trying to write a small Hadoop map/reduce
>> program that, instead of reading and writing the primitive writables
>> (LongWritable, IntWritable, etc.), uses a custom object I wrote that
>> implements the Writable interface. I'm still using LongWritable for
>> the keys, but my CustomWritable for the values. The input is still
>> LongWritable,Text because I'm parsing a raw log file; the map
>> output, the reduce input, and the reduce output are all
>> LongWritable,CustomWritable.
>>
>> I'm noticing that the map phase runs fine, but when the job gets to
>> the reduce phase it just hangs at 0%, and I'm not seeing any useful
>> output in the logs either. Here's my job configuration:
>>
>>      conf.setJobName("test");
>>      conf.setOutputKeyClass(LongWritable.class);
>>      conf.setOutputValueClass(CustomWritable.class);
>>      conf.setMapperClass(MapClass.class);
>>      conf.setCombinerClass(Reduce.class);
>>      conf.setReducerClass(Reduce.class);
>>      FileInputFormat.setInputPaths(conf, remainingArguments.get(0));
>>      FileOutputFormat.setOutputPath(conf, new Path(remainingArguments.get(1)));
>>
>> Am I missing something here? Do I need to set an output format
>> explicitly via conf.setOutputFormat()?
>>
>> Thanks,
>> Ryan
>>
>
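
For anyone comparing setups, here's a sketch of how I understand the
whole thing wires together. TestDriver is a made-up name, the
MapClass/Reduce bodies are stubs, and the setInputFormat and
setOutputFormat calls are (as far as I know) just what Hadoop assumes
when they're left unset -- so this is an illustration of my setup,
not a known fix:

    import java.io.IOException;
    import java.util.Iterator;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reducer;
    import org.apache.hadoop.mapred.Reporter;
    import org.apache.hadoop.mapred.TextInputFormat;
    import org.apache.hadoop.mapred.TextOutputFormat;

    public class TestDriver {

      // Map: LongWritable,Text in (raw log lines);
      // LongWritable,CustomWritable out.
      public static class MapClass extends MapReduceBase
          implements Mapper<LongWritable, Text, LongWritable, CustomWritable> {
        public void map(LongWritable key, Text value,
                        OutputCollector<LongWritable, CustomWritable> output,
                        Reporter reporter) throws IOException {
          // ... parse the line and emit a CustomWritable ...
        }
      }

      // Reduce (also used as the combiner):
      // LongWritable,CustomWritable in and out.
      public static class Reduce extends MapReduceBase
          implements Reducer<LongWritable, CustomWritable,
                             LongWritable, CustomWritable> {
        public void reduce(LongWritable key, Iterator<CustomWritable> values,
                           OutputCollector<LongWritable, CustomWritable> output,
                           Reporter reporter) throws IOException {
          // ... merge the values and emit one CustomWritable ...
        }
      }

      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(TestDriver.class);
        conf.setJobName("test");

        // With no setMapOutputKeyClass/setMapOutputValueClass calls,
        // these two also declare the map output types.
        conf.setOutputKeyClass(LongWritable.class);
        conf.setOutputValueClass(CustomWritable.class);

        conf.setMapperClass(MapClass.class);
        conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reduce.class);

        // Spelled out for clarity: these are already the defaults,
        // so leaving them unset shouldn't by itself hang anything.
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
      }
    }

Note that TextOutputFormat writes key \t value.toString() lines; if
the output should stay as serialized CustomWritables, swapping in
SequenceFileOutputFormat would be the usual choice.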
