hive-dev mailing list archives

From "Chao (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HIVE-8216) auto_smb_mapjoin_14.q failed test with exception. [Spark Branch]
Date Mon, 05 Jan 2015 18:54:34 GMT

     [ https://issues.apache.org/jira/browse/HIVE-8216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chao resolved HIVE-8216.
------------------------
    Resolution: Fixed

Resolved via HIVE-8202.

> auto_smb_mapjoin_14.q failed test with exception. [Spark Branch]
> ----------------------------------------------------------------
>
>                 Key: HIVE-8216
>                 URL: https://issues.apache.org/jira/browse/HIVE-8216
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Chao
>
> While trying to enable auto_smb_mapjoin_14.q, the following query:
> {code}
> select count(*) from (
>   select a.key as key, a.value as val1, b.value as val2 from tbl1 a join tbl2 b on a.key = b.key
> ) subq1;
> {code}
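> (In auto_smb_mapjoin_14.q, tbl1 and tbl2 are bucketed, sorted tables; the sketch below shows the kind of setup that triggers the automatic SMB map join conversion. The DDL follows the usual q-file pattern and is an assumption, not copied from this report:)
> {code}
> -- Bucketed and sorted on the join key, so the join qualifies for SMB conversion
> CREATE TABLE tbl1 (key INT, value STRING)
>   CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS;
> CREATE TABLE tbl2 (key INT, value STRING)
>   CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS;
>
> -- Settings that enable the automatic sort-merge bucket map join
> set hive.auto.convert.sortmerge.join=true;
> set hive.optimize.bucketmapjoin=true;
> set hive.optimize.bucketmapjoin.sortedmerge=true;
> {code}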
> failed with exception:
> {noformat}
> 2014-09-22 11:42:56,157 ERROR [Executor task launch worker-2]: spark.SparkMapRecordHandler (SparkMapRecordHandler.java:processRow(150)) - org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":0,"value":"val_0"}
>   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
>   at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:140)
>   at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:47)
>   at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:28)
>   at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at org.apache.spark.shuffle.hash.HashShuffleWriter.write(HashShuffleWriter.scala:65)
>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:54)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>   at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.processOp(SMBMapJoinOperator.java:258)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
>   at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:137)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
>   at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
>   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
>   ... 15 more
> {noformat}
> The query plan doesn't look correct: the Sorted Merge Bucket Map Join Operator appears under both Map 1 and Map 3, even though only one vertex should drive the join:
> {noformat}
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Spark
>       Edges:
>         Reducer 2 <- Map 1 (GROUP)
>       DagName: chao_20140922113636_e90b1567-df72-43f4-b9ea-15f986de35c2:11
>       Vertices:
>         Map 1 
>             Map Operator Tree:
>                 TableScan
>                   alias: a
>                   Statistics: Num rows: 10 Data size: 70 Basic stats: COMPLETE Column stats: NONE
>                   Filter Operator
>                     predicate: key is not null (type: boolean)
>                     Statistics: Num rows: 5 Data size: 35 Basic stats: COMPLETE Column stats: NONE
>                     Sorted Merge Bucket Map Join Operator
>                       condition map:
>                            Inner Join 0 to 1
>                       condition expressions:
>                         0 
>                         1 
>                       keys:
>                         0 key (type: int)
>                         1 key (type: int)
>                       Select Operator
>                         Group By Operator
>                           aggregations: count()
>                           mode: hash
>                           outputColumnNames: _col0
>                           Reduce Output Operator
>                             sort order: 
>                             value expressions: _col0 (type: bigint)
>         Map 3 
>             Map Operator Tree:
>                 TableScan
>                   alias: b
>                   Statistics: Num rows: 10 Data size: 70 Basic stats: COMPLETE Column stats: NONE
>                   Filter Operator
>                     predicate: key is not null (type: boolean)
>                     Statistics: Num rows: 5 Data size: 35 Basic stats: COMPLETE Column stats: NONE
>                       Sorted Merge Bucket Map Join Operator
>                         condition map:
>                              Inner Join 0 to 1
>                         condition expressions:
>                           0 
>                           1 
>                         keys:
>                           0 key (type: int)
>                           1 key (type: int)
>                         Select Operator
>                           Group By Operator
>                             aggregations: count()
>                             mode: hash
>                             outputColumnNames: _col0
>                             Reduce Output Operator
>                               sort order: 
>                               value expressions: _col0 (type: bigint)
>         Reducer 2 
>             Reduce Operator Tree:
>               Group By Operator
>                 aggregations: count(VALUE._col0)
>                 mode: mergepartial
>                 outputColumnNames: _col0
>                 Select Operator
>                   expressions: _col0 (type: bigint)
>                   outputColumnNames: _col0
>                   File Output Operator
>                     compressed: false
>                     table:
>                         input format: org.apache.hadoop.mapred.TextInputFormat
>                         output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                         serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink
> {noformat}
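> (A hedged workaround sketch, assuming the failure is specific to the SMB conversion: disabling the automatic sort-merge join conversion should make the planner fall back to a regular join for this query:)
> {code}
> -- Fall back to a common join instead of the sorted merge bucket map join
> set hive.auto.convert.sortmerge.join=false;
> {code}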
> I think it's related to SMB join support on the Spark branch, so this JIRA should be resolved once that work is done.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
