hadoop-mapreduce-user mailing list archives

From: Zac Shepherd <zsheph...@about.com>
Subject: bz2 decompress in place
Date: Wed, 21 Aug 2013 18:00:52 GMT
Hello,

I'm using an ancient version of Hadoop (0.20.2+228) and trying to run a
MapReduce job over an 18 GB bz2-compressed file.  Since splittable bzip2
support wasn't added until 0.21.0, the whole file goes to a single
mapper, which will take far too long to complete.  Is there a way that I
can decompress the file in place, or am I going to have to copy it down,
decompress it locally, and then copy it back up to the cluster?
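By "in place" I mean something like the following untested sketch, which
streams through Hadoop's codec API so the data never touches local disk
(the class name and paths are just placeholders I made up):

import java.io.InputStream;
import java.io.OutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

// Untested sketch: decompress an HDFS .bz2 file back into HDFS
// without staging it on local disk.  Paths are placeholders, e.g.
//   args[0] = /data/big.bz2, args[1] = /data/big.txt
public class HdfsBunzip2 {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    Path in = new Path(args[0]);   // compressed input
    Path out = new Path(args[1]);  // decompressed output

    // Pick the codec from the file extension (.bz2 -> BZip2Codec).
    CompressionCodecFactory factory = new CompressionCodecFactory(conf);
    CompressionCodec codec = factory.getCodec(in);

    InputStream decompressed = codec.createInputStream(fs.open(in));
    OutputStream plain = fs.create(out);

    // Stream the decompressed bytes straight back into HDFS; the
    // final argument closes both streams when the copy finishes.
    IOUtils.copyBytes(decompressed, plain, conf, true);
  }
}

All 18 GB would still funnel through the one JVM doing the copy, so I
don't know that it's any faster than the local round trip, but it would
at least skip the local disk and one of the transfers.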

Thanks for any help,
Zac Shepherd
