hadoop-user mailing list archives

From x6i4uybz labs <x6i4uyzbz.l...@gmail.com>
Subject M/R, Strange behavior with multiple Gzip files
Date Wed, 05 Dec 2012 16:02:25 GMT
Hi everybody,

I have an M/R job that does a bulk import into HBase.
It has to process many gzip files (2800 files of ~100 MB each).

I don't understand why my job instantiates 80 maps but runs them
sequentially, one after another, as if there were only one big gz file.
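
As a point of reference, here is a rough sketch of the split arithmetic, assuming the stock FileInputFormat behavior in which a gzip file cannot be split and therefore yields one input split (and one map task) per file. The class SplitEstimate and the method estimateSplits are hypothetical names used only for illustration and are not part of Hadoop; the splittable branch is an approximation that ignores Hadoop's split slop factor.

```java
import java.util.Collections;
import java.util.List;

public class SplitEstimate {

    /**
     * Hypothetical helper (not a Hadoop API): approximates the number of
     * input splits for a set of files.
     * Assumption: a non-splittable file (e.g. gzip) gives exactly one split;
     * a splittable file gives roughly ceil(size / splitSize) splits.
     */
    static long estimateSplits(List<Long> fileSizes, boolean splittable, long splitSize) {
        long splits = 0;
        for (long size : fileSizes) {
            if (!splittable) {
                splits += 1; // whole file goes to a single map task
            } else {
                // ceiling division, with at least one split per file
                splits += Math.max(1, (size + splitSize - 1) / splitSize);
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // 2800 files of about 100 MB each, as described in this message
        List<Long> sizes = Collections.nCopies(2800, 100 * mb);

        // gzip input: one split per file
        System.out.println(estimateSplits(sizes, false, 64 * mb)); // prints 2800

        // same sizes if the files were splittable, with a 64 MB split size (assumed)
        System.out.println(estimateSplits(sizes, true, 64 * mb));  // prints 5600
    }
}
```

Under these assumptions, 2800 gzip files would give about 2800 splits, so a count of 80 maps may reflect something else, such as a different input format or the number of map slots available.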

Is there a problem in my driver? Or maybe I am missing something.
I use "FileInputFormat.addInputPath(job, new Path(args[0]))", where args[0]
is a directory.

Can you help me, please?

Thanks, Guillaume
