hive-issues mailing list archives

From "Vsevolod Ostapenko (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-13280) Error when more than 1 mapper for HBase storage handler
Date Wed, 12 Oct 2016 15:08:20 GMT

    [ https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568866#comment-15568866 ]

Vsevolod Ostapenko edited comment on HIVE-13280 at 10/12/16 3:08 PM:
---------------------------------------------------------------------

The Hive-HBase integration documentation (https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration)
states that the hbase.mapred.output.outputtable property is optional and needed only when one
wants to insert into a table. The latter statement is obviously incorrect: prior to Feb
26, 2016, this property wasn't even documented, and inserts into HBase-backed tables were working
just fine with the MR engine.

If TEZ does require hbase.mapred.output.outputtable property to be explicitly set, documentation
needs to be updated to indicate that fact.

One more thing: all the existing samples have hbase.mapred.output.outputtable and hbase.table.name
set to the same value. If there is no use case in which they differ, why is the former even
needed?
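
For reference, the samples follow roughly this shape, with both properties carrying the same
HBase table name. This is only a sketch: the Hive table name hbase_table_1, the column mapping,
and the HBase table name xyz are illustrative placeholders, not taken from this issue.

{code:sql}
-- Illustrative Hive table backed by HBase; all names here are placeholders.
-- hbase.table.name is the HBase table the Hive table maps to.
-- hbase.mapred.output.outputtable is the table used for output on insert;
-- in the samples it is always set to the same value as hbase.table.name.
CREATE TABLE hbase_table_1 (key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES (
  "hbase.table.name" = "xyz",
  "hbase.mapred.output.outputtable" = "xyz"
);
{code}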


was (Author: seva_ostapenko):
Hive-Hbase integration documentation (https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration)
claims that the hbase.mapred.output.outputtable property is optional, and provides no good explanation
of the circumstances under which one would want or need to define it. In all the provided samples,
the values of hbase.mapred.output.outputtable and hbase.table.name are the same, so the samples are
not helpful and not self-explanatory.

If TEZ does require hbase.mapred.output.outputtable property to be explicitly set, documentation
needs to be updated to indicate that fact.
Also, it would be helpful to provide some background on why this property exists in the first
place.


> Error when more than 1 mapper for HBase storage handler
> -------------------------------------------------------
>
>                 Key: HIVE-13280
>                 URL: https://issues.apache.org/jira/browse/HIVE-13280
>             Project: Hive
>          Issue Type: Bug
>          Components: HBase Handler
>    Affects Versions: 2.0.0
>            Reporter: Damien Carol
>            Assignee: Damien Carol
>
> With a simple query (select from orc table and insert into HBase external table):
> {code:sql}
> insert into table register.register  select * from aa_temp
> {code}
> The aa_temp table has 45 ORC files, which generates 45 mappers.
> Some mappers fail with this error:
> {noformat}
> Caused by: java.lang.IllegalArgumentException: Must specify table name
>         at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188)
>         at org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101)
>         at org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87)
>         at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300)
>         at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290)
>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126)
>         ... 25 more
> ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0 (state=08S01,code=2)
> {noformat}
> If I do an ALTER CONCATENATE on aa_temp and rerun the query, everything is fine because
> there is only one mapper.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
