hive-dev mailing list archives

From "Venki Korukanti (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-7799) TRANSFORM failed in transform_ppr1.q[Spark Branch]
Date Fri, 22 Aug 2014 20:50:11 GMT

    [ https://issues.apache.org/jira/browse/HIVE-7799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107462#comment-14107462 ]

Venki Korukanti commented on HIVE-7799:
---------------------------------------

ScriptOperator spawns a separate thread that adds records to the collector. The iterator
thread tries to read from the RowContainer while the thread spawned by ScriptOperator is
still adding records. Other operators may spawn processing threads as well. It looks like
we need a synchronized queue with persistence support.
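The producer/consumer pattern described above can be sketched with a standard java.util.concurrent BlockingQueue. This is a hypothetical illustration, not Hive code: the class and names below are made up, and a real replacement for RowContainer would also need the spill-to-disk (persistence) support mentioned above, which a plain in-memory queue does not provide.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch only: a producer thread adds rows while a consumer iterates
// concurrently. The BlockingQueue supplies the synchronization that
// RowContainer lacks (RowContainer forbids writes once a read has started).
public class SyncQueueSketch {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(16);
        final String EOF = "<EOF>"; // sentinel marking end of input

        // Producer: stands in for the thread ScriptOperator spawns to
        // feed records into the collector.
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 5; i++) {
                    queue.put("row-" + i); // blocks if the queue is full
                }
                queue.put(EOF);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        // Consumer: stands in for the result iterator; reading is safe
        // even while the producer is still writing.
        int count = 0;
        for (String row = queue.take(); !row.equals(EOF); row = queue.take()) {
            count++;
        }
        producer.join();
        System.out.println("consumed " + count + " rows");
    }
}
```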

> TRANSFORM failed in transform_ppr1.q[Spark Branch]
> --------------------------------------------------
>
>                 Key: HIVE-7799
>                 URL: https://issues.apache.org/jira/browse/HIVE-7799
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Chengxiang Li
>            Assignee: Chengxiang Li
>              Labels: Spark-M1
>         Attachments: HIVE-7799.1-spark.patch, HIVE-7799.2-spark.patch
>
>
> Here is the exception:
> {noformat}
> 2014-08-20 01:14:36,594 ERROR executor.Executor (Logging.scala:logError(96)) - Exception in task 0.0 in stage 1.0 (TID 0)
> java.lang.NullPointerException
>         at org.apache.hadoop.hive.ql.exec.spark.HiveKVResultCache.next(HiveKVResultCache.java:113)
>         at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.next(HiveBaseFunctionResultList.java:124)
>         at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.next(HiveBaseFunctionResultList.java:82)
>         at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:42)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>         at org.apache.spark.shuffle.hash.HashShuffleWriter.write(HashShuffleWriter.scala:65)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>         at org.apache.spark.scheduler.Task.run(Task.scala:54)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:722)
> {noformat}
> Basically, the cause is that RowContainer is misused (writing to it is not allowed once
> someone has read a row from it). I'm trying to figure out whether this is a general Hive
> issue or specific to Hive on Spark.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
