hive-user mailing list archives

From "Tatarinov, Igor" <itatari...@ebay.com>
Subject too many mappers
Date Wed, 16 Apr 2014 21:03:26 GMT
For some reason, I can't reduce the number of mappers with Hive 0.12 on Hadoop 2.2. I believe
I was able to do that in 0.10.

My table has 170K rows stored in 2000 small (about 20 KB each) uncompressed files (I'll try to
make Hive merge these small files in the future).

The relevant Hive settings are below:

hive> SET hive.input.format;
hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
hive> SET mapreduce.input.fileinputformat.split.maxsize;
mapreduce.input.fileinputformat.split.maxsize=1073741824
hive> SET hive.hadoop.supports.splittable.combineinputformat;
hive.hadoop.supports.splittable.combineinputformat=true
hive> SET mapred.max.split.size;
mapred.max.split.size=1073741824

When I run select count(1), I get 658 mappers (roughly one for every 3 files?):
Hadoop job information for Stage-1: number of mappers: 658; number of reducers: 1
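
As a rough sanity check of the numbers above, the sketch below works out how many mappers one would naively expect if the 1 GiB max split size were the only limit on combining files. That is an assumption for illustration: it ignores any other grouping the input format may apply (for example by node or rack locality), and it treats every file as exactly 20 KB. The variable names are made up for this example.

```python
import math

# Figures taken from the message above; file size is approximated as 20 KB each.
num_files = 2000
file_size_bytes = 20 * 1024
max_split_bytes = 1073741824  # mapred.max.split.size (1 GiB)
observed_mappers = 658

# Total input size across all files.
total_bytes = num_files * file_size_bytes

# Naive expectation: splits are limited only by the max split size
# (an assumption; locality-based grouping is not modeled here).
expected_mappers = math.ceil(total_bytes / max_split_bytes)

# What the observed mapper count implies per mapper.
files_per_mapper = num_files / observed_mappers
bytes_per_mapper = total_bytes / observed_mappers

print("total input bytes:", total_bytes)
print("naively expected mappers:", expected_mappers)
print("observed files per mapper:", round(files_per_mapper, 2))
print("observed bytes per mapper:", round(bytes_per_mapper))
```

Under that assumption the whole table (about 40 MB) fits in a single split, so the observed 658 mappers suggest that something other than the max split size is deciding how files are grouped.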

The table is regular and uncompressed:

# Storage Information
SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat:            org.apache.hadoop.mapred.TextInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed:             No

What am I missing?

Thanks!

