hive-dev mailing list archives

From "Yuming Wang (JIRA)" <>
Subject [jira] [Created] (HIVE-14112) Join a HBase mapped big table shouldn't convert to MapJoin
Date Tue, 28 Jun 2016 02:01:57 GMT
Yuming Wang created HIVE-14112:

             Summary: Join a HBase mapped big table shouldn't convert to MapJoin
                 Key: HIVE-14112
             Project: Hive
          Issue Type: Bug
          Components: StorageHandler
    Affects Versions: 1.1.0, 1.2.0
            Reporter: Yuming Wang
            Assignee: Yuming Wang
            Priority: Minor

There are two tables; _hbasetable_risk_control_defense_idx_uid_ is an HBase-mapped table:
[root@dev01 ~]# hadoop fs -du -s -h /hbase/data/tandem/hbase-table-risk-control-defense-idx-uid
3.0 G  9.0 G  /hbase/data/tandem/hbase-table-risk-control-defense-idx-uid
[root@dev01 ~]# hadoop fs -du -s -h /user/hive/warehouse/openapi_invoke_base
6.6 G  19.7 G  /user/hive/warehouse/openapi_invoke_base
The smaller table is 3.0 G, which is greater than _hive.mapjoin.smalltable.filesize_, yet
when these tables are joined, Hive automatically converts the join to a MapJoin:
hive> select count(*) from hbasetable_risk_control_defense_idx_uid t1 join openapi_invoke_base t2 on (t1.key=t2.merchantid);
Query ID = root_20160628092222_9f9d3f25-857b-412c-8a75-3d9228bd5ee5
Total jobs = 1
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
Execution log at: /tmp/root/root_20160628092222_9f9d3f25-857b-412c-8a75-3d9228bd5ee5.log
2016-06-28 09:22:10	Starting to launch local task to process map join;	maximum memory = 1908932608
The root cause is that Hive uses _/user/hive/warehouse/hbasetable_risk_control_defense_idx_uid_
as the table's location, but that directory is empty, so Hive treats the table as small and
automatically converts the join to a MapJoin.
My suggestion is to set the correct location when mapping an HBase table.
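
The size check behind this behaviour can be sketched roughly as follows. This is only an illustrative model, not Hive's actual planner code: the function name looks_small_enough_for_mapjoin is hypothetical, the comparison is simplified, and the threshold is assumed to be the documented default of hive.mapjoin.smalltable.filesize (25000000 bytes). The sizes are taken from the hadoop fs -du output above, with the warehouse directory taken as empty (0 bytes).

```python
# Assumed default of hive.mapjoin.smalltable.filesize, in bytes (about 25 MB).
SMALLTABLE_FILESIZE_DEFAULT = 25_000_000


def looks_small_enough_for_mapjoin(location_size_bytes,
                                   threshold=SMALLTABLE_FILESIZE_DEFAULT):
    """Hypothetical, simplified stand-in for a size-based MapJoin check.

    Returns True when the files found under the table's location fit
    within the small-table threshold.
    """
    return location_size_bytes <= threshold


GIB = 1024 ** 3

# Actual HBase data size reported by `hadoop fs -du -s -h` (about 3.0 G).
hbase_data_size = int(3.0 * GIB)

# Size of the empty warehouse directory that Hive uses as the table location.
warehouse_dir_size = 0

# Measured at the empty warehouse directory, the table looks tiny,
# so the join would qualify for conversion.
converted_with_wrong_location = looks_small_enough_for_mapjoin(warehouse_dir_size)

# Measured at the real HBase data, the table is far above the threshold.
converted_with_real_size = looks_small_enough_for_mapjoin(hbase_data_size)
```

Under these assumptions the check passes for the empty directory and fails for the real data size, which matches the behaviour described in this report.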

This message was sent by Atlassian JIRA
