hadoop-common-user mailing list archives

From Chris Douglas <chri...@yahoo-inc.com>
Subject Re: Stackoverflow
Date Tue, 03 Jun 2008 06:35:10 GMT
> I have no Java implementation of my job, sorry.

Since it's all on the map side, IdentityMapper/IdentityReducer is
fine, as long as both the splits and the number of reduce tasks are
the same as in your job.

> The data is a representation for loglines, and not exactly small,
> e.g. the stuff has already been reduced once.

By "not exactly small," do you mean each line is long or that there
are many records?

> The interesting thing is that it happens inside the last Map task,
> not in the reducer tasks.
> As you can see above the mapper cmd is rather on the simple side.

util.QuickSort is only used on the map side, so this shouldn't have  
anything to do with the reduce. Is it always and only the *last* map  
task that fails? If I sent you a patch that would print a trace with  
the partitions, would you mind running it? Do you have any other  
settings that differ from the defaults? -C
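
To give an idea of what a partition trace could look like, here is a
rough standalone sketch. It is not Hadoop's util.QuickSort and not the
patch itself; PartitionTrace, Step, sortWithTrace and maxDepth are
hypothetical names made up for illustration. It assumes a plain
recursive quicksort that takes the last element as pivot, and records
the bounds and recursion depth of every partition it visits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: a standalone recursive quicksort that records each
// partition it visits. This is NOT Hadoop's util.QuickSort.
public class PartitionTrace {

    /** One recorded partition: inclusive bounds plus recursion depth. */
    public static final class Step {
        public final int left, right, depth;

        Step(int left, int right, int depth) {
            this.left = left;
            this.right = right;
            this.depth = depth;
        }

        @Override
        public String toString() {
            return "depth=" + depth + " [" + left + "," + right + "]";
        }
    }

    /** Sorts the array in place and returns the partitions visited, in order. */
    public static List<Step> sortWithTrace(int[] a) {
        List<Step> trace = new ArrayList<Step>();
        sort(a, 0, a.length - 1, 0, trace);
        return trace;
    }

    /** Deepest recursion level seen in a trace, or -1 for an empty trace. */
    public static int maxDepth(List<Step> trace) {
        int max = -1;
        for (Step s : trace) {
            max = Math.max(max, s.depth);
        }
        return max;
    }

    private static void sort(int[] a, int left, int right, int depth, List<Step> trace) {
        if (left >= right) {
            return; // zero or one element: nothing to partition
        }
        trace.add(new Step(left, right, depth));
        int p = partition(a, left, right);
        sort(a, left, p - 1, depth + 1, trace);
        sort(a, p + 1, right, depth + 1, trace);
    }

    // Lomuto partition using the last element as the pivot (an assumption
    // of this sketch, chosen because it is easy to follow).
    private static int partition(int[] a, int left, int right) {
        int pivot = a[right];
        int i = left;
        for (int j = left; j < right; j++) {
            if (a[j] < pivot) {
                swap(a, i, j);
                i++;
            }
        }
        swap(a, i, right);
        return i;
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 8, 1, 9, 2};
        for (Step s : sortWithTrace(data)) {
            System.out.println(s);
        }
        System.out.println(Arrays.toString(data));
    }
}
```

With a trace like this, partitions that shrink by only one element per
level show up as a depth that grows with the number of records, which
is the sort of pattern worth looking for in the failing task.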
