hadoop-hive-dev mailing list archives

From Matt Pestritto <m...@pestritto.com>
Subject Fwd: Hive-74
Date Wed, 30 Sep 2009 13:51:04 GMT
Including hive-user in case someone has experience with this.
Thanks
-Matt

---------- Forwarded message ----------
From: Matt Pestritto <matt@pestritto.com>
Date: Tue, Sep 29, 2009 at 5:26 PM
Subject: Hive-74
To: hive-dev@hadoop.apache.org


Hi-

I'm having a problem using CombineHiveInputSplit.  I believe this was
patched in http://issues.apache.org/jira/browse/HIVE-74

I'm currently running hadoop 20.1 using hive trunk.

hive-default.xml has the following property:
<property>
  <name>hive.input.format</name>
  <value></value>
  <description>The default input format, if it is not specified, the system
assigns it. It is set to HiveInputFormat for hadoop versions 17, 18 and 19,
whereas it is set to CombinedHiveInputFormat for hadoop 20. The user can
always overwrite it - if there is a bug in CombinedHiveInputFormat, it can
always be manually set to HiveInputFormat. </description>
</property>

I added the following to hive-site.xml. (Note: the description in
hive-default.xml refers to CombinedHiveInputFormat, which does not work for me;
the actual class name appears to be CombineHiveInputFormat, without the "d".)
<property>
  <name>hive.input.format</name>
  <value>org.apache.hadoop.hive.ql.io.CombineHiveInputFormat</value>
  <description>The default input format, if it is not specified, the system
assigns it. It is set to HiveInputFormat for hadoop versions 17, 18 and 19,
whereas it is set to CombinedHiveInputFormat for hadoop 20. The user can
always overwrite it - if there is a bug in CombinedHiveInputFormat, it can
always be manually set to HiveInputFormat. </description>
</property>
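In case it helps anyone reproduce this, the same property can also be set per-session from the CLI instead of hive-site.xml (this is just the interactive equivalent of the XML above, using the same value):

```
hive> set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
hive> select count(1) from my_table;
```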

When I launch a job, the CLI fails immediately:
hive> select count(1) from my_table;
Total MapReduce jobs = 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.ExecDriver
hive> exit ;

If I set the property value to org.apache.hadoop.hive.ql.io.HiveInputFormat,
the job runs fine.
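Since HiveInputFormat works and only the Combine variant fails, one thing I have been checking (a guess on my part; the jar names and paths below are assumptions, adjust for your install) is whether the combine-related classes are actually present in the jars on the classpath, since CombineHiveInputFormat depends on Hadoop-side combine support:

```
jar tf $HIVE_HOME/lib/hive-exec-*.jar | grep CombineHiveInputFormat
jar tf $HADOOP_HOME/hadoop-*-core.jar | grep CombineFileInputFormat
```

The "return code 2 from ExecDriver" message itself is generic; the underlying exception, if any, would presumably be in the task logs rather than the CLI output.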

Suggestions? Is there something that I am missing?

Thanks
-Matt
