hive-dev mailing list archives

From "Chengxiang Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-7799) TRANSFORM failed in transform_ppr1.q[Spark Branch]
Date Mon, 25 Aug 2014 07:25:58 GMT

    [ https://issues.apache.org/jira/browse/HIVE-7799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108843#comment-14108843 ]

Chengxiang Li commented on HIVE-7799:
-------------------------------------

Because of how {{ResultIterator.hasNext()}} is implemented, ResultIterator is designed to be a
lazy iterator: it only calls {{processNextRecord()}} when RowContainer is empty. However, RowContainer
does not support adding more rows once it has been read, as mentioned in previous comments. Here
is what happens when different kinds of queries are executed:
# For a map-only job, the map output is written to a file directly, so no Collector is needed in this case.
# For a MapReduce job with GroupByOperator, {{HiveBaseFunctionResultList.collect()}} is triggered
by {{closeRecordProcessor()}}, which is outside the lazy-computing logic, so ResultIterator
does no lazy computing in this case.
# For a MapReduce job without GroupByOperator (such as cluster by queries), ResultIterator does lazy
computing, and it clears RowContainer each time before calling {{processNextRecord()}}. When
HiveBaseFunctionResultList is read and written in the same thread, the access pattern on RowContainer is .....clear()->addRow()->first()->clear()->addRow()->first()......
so it does not violate RowContainer's access rule. But when multiple threads read and write HiveBaseFunctionResultList,
as ScriptOperator does (which venki mentioned above), it would definitely hit this JIRA
issue.
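
To make the access rule concrete, here is a minimal single-threaded sketch in Java. It is not Hive
code: {{ToyRowContainer}} and {{LazyResultIterator}} are hypothetical stand-ins, the toy container
throws an explicit exception on write-after-read only to make the rule visible, and the toy
{{processNextRecord()}} simply emits two rows per input record.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class LazyIteratorSketch {

    /** Toy buffer (NOT Hive's RowContainer): no addRow() after first() until clear(). */
    static class ToyRowContainer {
        private final List<String> rows = new ArrayList<>();
        private boolean readStarted = false;
        private int cursor = 0;

        void clear() { rows.clear(); readStarted = false; cursor = 0; }

        void addRow(String row) {
            if (readStarted) {
                throw new IllegalStateException("addRow() called after first()");
            }
            rows.add(row);
        }

        String first() { readStarted = true; cursor = 0; return next(); }

        /** Returns the next buffered row, or null when the buffer is drained. */
        String next() { return cursor < rows.size() ? rows.get(cursor++) : null; }
    }

    /** Lazy iterator: processes the next input record only when the buffer is drained. */
    static class LazyResultIterator implements Iterator<String> {
        private final Iterator<String> input;
        private final ToyRowContainer buffer = new ToyRowContainer();
        private String pending;

        LazyResultIterator(Iterator<String> input) { this.input = input; }

        // Hypothetical stand-in for processNextRecord(): two output rows per record.
        private void processNextRecord(String record) {
            buffer.addRow(record + "#0");
            buffer.addRow(record + "#1");
        }

        @Override
        public boolean hasNext() {
            if (pending != null) return true;
            pending = buffer.next();               // drain rows that are already buffered
            while (pending == null && input.hasNext()) {
                buffer.clear();                    // clear() ...
                processNextRecord(input.next());   // ... addRow() ...
                pending = buffer.first();          // ... first()
            }
            return pending != null;
        }

        @Override
        public String next() {
            if (!hasNext()) throw new NoSuchElementException();
            String row = pending;
            pending = null;
            return row;
        }
    }

    static List<String> drain(List<String> records) {
        List<String> out = new ArrayList<>();
        Iterator<String> it = new LazyResultIterator(records.iterator());
        while (it.hasNext()) out.add(it.next());
        return out;
    }

    /** Writes after a read without clear(), as a second writer thread could do. */
    static boolean writeAfterReadRejected() {
        ToyRowContainer c = new ToyRowContainer();
        c.addRow("a");
        c.first();
        try {
            c.addRow("b");
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(drain(Arrays.asList("r1", "r2"))); // [r1#0, r1#1, r2#0, r2#1]
        System.out.println(writeAfterReadRejected());         // true
    }
}
```

In the single-threaded path every {{addRow()}} is preceded by {{clear()}}, so the rule holds. The
second helper shows the interleaving that a separate writer thread can produce, where a row is
added after {{first()}} with no {{clear()}} in between.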

In my opinion, there are 2 solutions:
# Remove the lazy-computing feature of ResultIterator, as patch1 does.
# Implement a RowContainer-like class that supports the current RowContainer features. It would
also need to be thread-safe and support adding rows after {{first()}} has already been called.
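
As a rough illustration of what the second option asks for, the following Java sketch shows a
buffer that is thread-safe and accepts rows after reading has started. It is only a sketch under
assumptions: {{AppendableRowBuffer}} and its {{close()}} method are hypothetical names, it keeps
all rows in memory, and it does not reproduce the other features a real RowContainer replacement
would need.

```java
import java.util.ArrayList;
import java.util.List;

class AppendableBufferSketch {

    /** Hypothetical thread-safe buffer that allows addRow() after first(). */
    static class AppendableRowBuffer {
        private final List<String> rows = new ArrayList<>();
        private int cursor = 0;
        private boolean closed = false;

        synchronized void addRow(String row) {
            if (closed) throw new IllegalStateException("buffer is closed");
            rows.add(row);
            notifyAll();                 // wake a reader waiting for more rows
        }

        /** Signals that no more rows will be added. */
        synchronized void close() {
            closed = true;
            notifyAll();
        }

        synchronized String first() throws InterruptedException {
            cursor = 0;
            return next();
        }

        /** Blocks until a row is available; returns null once closed and drained. */
        synchronized String next() throws InterruptedException {
            while (cursor == rows.size() && !closed) {
                wait();
            }
            return cursor < rows.size() ? rows.get(cursor++) : null;
        }
    }

    /** One writer thread adds n rows while the calling thread reads them. */
    static List<String> produceAndConsume(int n) {
        AppendableRowBuffer buffer = new AppendableRowBuffer();
        Thread producer = new Thread(() -> {
            for (int i = 0; i < n; i++) buffer.addRow("row-" + i);
            buffer.close();
        });
        producer.start();

        List<String> out = new ArrayList<>();
        try {
            for (String row = buffer.first(); row != null; row = buffer.next()) {
                out.add(row);
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(produceAndConsume(5)); // [row-0, row-1, row-2, row-3, row-4]
    }
}
```

Every call takes the buffer's monitor, which is the kind of synchronization cost that has to be
weighed against the gain from lazy computing.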

The second solution is quite complex, and supporting thread-safe access and write-after-read may
degrade performance. Weighed against the performance gain from lazy computing, it is hard to say
right now whether it is worthwhile. So I suggest we take the first solution
to fix this issue, and leave the possible optimization to milestone 4.

> TRANSFORM failed in transform_ppr1.q[Spark Branch]
> --------------------------------------------------
>
>                 Key: HIVE-7799
>                 URL: https://issues.apache.org/jira/browse/HIVE-7799
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Chengxiang Li
>            Assignee: Chengxiang Li
>              Labels: Spark-M1
>         Attachments: HIVE-7799.1-spark.patch, HIVE-7799.2-spark.patch, HIVE-7799.3-spark.patch
>
>
> Here is the exception:
> {noformat}
> 2014-08-20 01:14:36,594 ERROR executor.Executor (Logging.scala:logError(96)) - Exception in task 0.0 in stage 1.0 (TID 0)
> java.lang.NullPointerException
>         at org.apache.hadoop.hive.ql.exec.spark.HiveKVResultCache.next(HiveKVResultCache.java:113)
>         at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.next(HiveBaseFunctionResultList.java:124)
>         at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.next(HiveBaseFunctionResultList.java:82)
>         at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:42)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>         at org.apache.spark.shuffle.hash.HashShuffleWriter.write(HashShuffleWriter.scala:65)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>         at org.apache.spark.scheduler.Task.run(Task.scala:54)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:722)
> {noformat}
> Basically, the cause is that RowContainer is misused (writing is not allowed once someone
> has read a row from it). I'm trying to figure out whether it's a Hive issue or one that occurs
> only in Hive on Spark mode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
