hive-dev mailing list archives

From "Romain Thibaux (JIRA)" <j...@apache.org>
Subject [jira] Created: (HIVE-1814) Mapjoin fails on multiple partitions
Date Tue, 30 Nov 2010 08:16:11 GMT
Mapjoin fails on multiple partitions
------------------------------------

                 Key: HIVE-1814
                 URL: https://issues.apache.org/jira/browse/HIVE-1814
             Project: Hive
          Issue Type: Bug
            Reporter: Romain Thibaux


This query, which joins a single partition, works:

set hive.optimize.bucketmapjoin = true;
set hive.optimize.bucketmapjoin.sortedmerge = true;
set hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
SELECT /*+ MAPJOIN(b) */ a.field_a, b.field_b
FROM table_a a
JOIN table_b b
ON a.ds = '2010-08-30' AND b.ds = '2010-08-30' AND a.user = b.user;


This query, which joins across a range of partitions, fails with a NullPointerException:

set hive.optimize.bucketmapjoin = true;
set hive.optimize.bucketmapjoin.sortedmerge = true;
set hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
SELECT /*+ MAPJOIN(b) */ a.field_a, b.field_b
FROM table_a a
JOIN table_b b
ON a.ds >= '2010-08-30' AND b.ds <= '2010-09-30'
AND b.ds >= '2010-08-30' AND b.ds <= '2010-09-30'
AND a.ds = b.ds AND a.user = b.user;


java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:622)
        at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:121)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:118)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:55)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:47)
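
The report does not include the table definitions. For anyone trying to reproduce it, here is a minimal sketch of a schema under which the bucketed map join settings above would apply. It assumes both tables are partitioned by ds and bucketed and sorted on user. The column types and the bucket count are hypothetical and are not taken from the report.

```sql
-- Hypothetical schema for reproduction; types and bucket count are assumptions.
-- Both tables are partitioned by ds and bucketed/sorted on the join key (user),
-- which is what the bucketmapjoin and sortedmerge settings rely on.
CREATE TABLE table_a (
  user    STRING,
  field_a STRING
)
PARTITIONED BY (ds STRING)
CLUSTERED BY (user) SORTED BY (user ASC) INTO 32 BUCKETS;

CREATE TABLE table_b (
  user    STRING,
  field_b STRING
)
PARTITIONED BY (ds STRING)
CLUSTERED BY (user) SORTED BY (user ASC) INTO 32 BUCKETS;
```

With at least two ds partitions loaded into each table, the first query should touch one partition per table and the second should touch several, which is the difference the summary points at. On newer Hive versions where user is a reserved word, the column name may need backticks.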



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

