hadoop-mapreduce-user mailing list archives

From Koert Kuipers <ko...@tresata.com>
Subject Re: output files are empty when i turn compression on
Date Wed, 11 Apr 2012 17:33:05 GMT
in case someone else ever runs into this: the issue was that my reducer
used a hadoop FileSystem and closed it when it was done with it.
one shouldn't close these: instances obtained via FileSystem.get() are cached
and shared within the jvm, so closing one also closes it for any other code
holding the same instance, including the code that writes the job output.
i used it to open a file from hdfs for a parallel merge sort. i created the
FileSystem in the reducer's configure() method and closed it in its close()
method. bad idea. removing the fs.close() call solved the issue.
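
to illustrate why closing a shared instance hurts other code, here is a toy model in plain java. it is only a sketch and not hadoop's implementation: SharedHandleDemo, Handle and get() are hypothetical names standing in for a cached filesystem client, and the real failure mode in a job may look different (in my case the output was silently empty rather than an exception).

```java
import java.util.HashMap;
import java.util.Map;

class SharedHandleDemo {
    /** Hypothetical stand-in for a cached filesystem client. */
    static final class Handle {
        private boolean open = true;
        private final StringBuilder written = new StringBuilder();

        void write(String record) {
            if (!open) {
                throw new IllegalStateException("handle is closed");
            }
            written.append(record).append('\n');
        }

        void close() {
            open = false;
        }

        String contents() {
            return written.toString();
        }
    }

    // One handle per URI, shared by every caller (assumed cache key: the URI string).
    private static final Map<String, Handle> CACHE = new HashMap<>();

    static synchronized Handle get(String uri) {
        return CACHE.computeIfAbsent(uri, u -> new Handle());
    }

    public static void main(String[] args) {
        // The framework's output writer and the user's reducer both ask for a handle.
        Handle frameworkFs = get("hdfs://namenode");
        Handle reducerFs = get("hdfs://namenode");

        // Both callers received the very same object.
        System.out.println(frameworkFs == reducerFs);

        // The reducer "cleans up" in its close() method ...
        reducerFs.close();

        // ... and the framework can no longer write through its reference.
        try {
            frameworkFs.write("key\tvalue");
        } catch (IllegalStateException e) {
            System.out.println("write failed: " + e.getMessage());
        }
    }
}
```

running main prints `true` followed by `write failed: handle is closed`. the point is that the reducer does not own the handle, so it should leave closing it to whoever manages the cache.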

On Wed, Apr 11, 2012 at 1:02 PM, Koert Kuipers <koert@tresata.com> wrote:

> i have a simple map-reduce job that i test with only 2 mappers, 2 reducers
> and very small input (10 lines of text).
> it runs fine without compression. but as soon as i turn on compression
> (mapred.compress.map.output=true), the output files (part-00000.snappy,
> etc.) are empty. zero records. using logging i can see that my reducer
> successfully calls output.collect(key, value), yet the records don't show up in the
> file. i tried both snappy and gzip. do i need to do some sort of flushing?
> i am on hadoop 0.20.2
