hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-16078) improve abort checking in Tez/LLAP
Date Wed, 01 Mar 2017 23:43:45 GMT

    [ https://issues.apache.org/jira/browse/HIVE-16078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891303#comment-15891303
] 

Sergey Shelukhin edited comment on HIVE-16078 at 3/1/17 11:43 PM:
------------------------------------------------------------------

[~sseth] will test in the cluster eventually; the constant is already double; not sure how
nRows can be reversed - can you elaborate? the limit is only changed when nRows is set to
0. As for long ops, not sure. In this case each item is a VRB, not row, so assuming no interrupt
checks in the pipeline 1000-row check will be ~1M rows processed between check. I also suspect
based on the method names (forwardOverflow etc.) that some batching may be happening internally,
making it even more rows. Perhaps [~mmccline] can comment given the callstack above


was (Author: sershe):
[~sseth] will test in the cluster eventually; the constant is already double; not sure how
nRows can be reversed - can you elaborate? the count is only changed when nRows is set to
0. As for long ops, not sure. In this case each item is a VRB, not row, so assuming no interrupt
checks in the pipeline 1000-row check will be ~1M rows processed between check. I also suspect
based on the method names (forwardOverflow etc.) that some batching may be happening internally,
making it even more rows. Perhaps [~mmccline] can comment given the callstack above

> improve abort checking in Tez/LLAP
> ----------------------------------
>
>                 Key: HIVE-16078
>                 URL: https://issues.apache.org/jira/browse/HIVE-16078
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-16078.patch
>
>
> Sometimes, a fragment can run for a long time after a query fails. It looks from logs
like the abort/interrupt were called correctly on the thread, yet the thread hangs around
minutes after, doing the below. Other tasks for the same job appear to have exited correctly,
after the same abort logic (at least, the same log lines, fwiw)
> {noformat}
> 	at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.copyByValue(VectorCopyRow.java:317)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:263)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:189)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:389)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardOverflow(VectorMapJoinGenerateResultOperator.java:628)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:277)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:189)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:389)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardOverflow(VectorMapJoinGenerateResultOperator.java:628)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:277)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:189)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:389)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardOverflow(VectorMapJoinGenerateResultOperator.java:628)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:277)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:189)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:389)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardOverflow(VectorMapJoinGenerateResultOperator.java:628)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:277)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:189)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:389)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardOverflow(VectorMapJoinGenerateResultOperator.java:628)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:277)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:189)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:389)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardOverflow(VectorMapJoinGenerateResultOperator.java:628)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:277)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:189)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:389)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardOverflow(VectorMapJoinGenerateResultOperator.java:628)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:277)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:189)
> 	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:389)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
> 	at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:137)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
> 	at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:123)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
> 	at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
> 	at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:783)
> 	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:86)
> 	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:70)
> 	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:420)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message