drill-issues mailing list archives

From "Sudheesh Katkam (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-3967) Broken Test: TestDrillbitResilience.cancelAfterEverythingIsCompleted()
Date Fri, 23 Oct 2015 15:23:27 GMT

    [ https://issues.apache.org/jira/browse/DRILL-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971150#comment-14971150
] 

Sudheesh Katkam commented on DRILL-3967:
----------------------------------------

Note that after planning is done, the Foreman thread exits; the state transitions are handled
by RPC threads.

The cancel signal is sent as soon as the three batches arrive (not ideal); at this point the
query state is either RUNNING or COMPLETED.

The test passes if the state transition RUNNING --> CANCELLATION_REQUESTED is handled
by a BitServer thread. The resume signal from the UserServer thread is ignored.

The test hangs if the state transitions RUNNING --> CANCELLATION_REQUESTED -->
CANCELLED are handled by the UserServer thread. While closing, the thread waits (on itself)
for a resume signal.

The test fails if the state transition RUNNING --> COMPLETED is handled by a
BitServer thread. The CANCELLATION_REQUESTED message is simply ignored, and the resume signal
allows the query to move to the COMPLETED state.
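
The hang case can be illustrated in isolation. The following is a minimal sketch, not Drill
code: SelfWaitSketch, closeSeesResume, and the two single-thread executors standing in for
the UserServer and BitServer threads are hypothetical names chosen for illustration. It only
shows why a thread that waits for a signal which must be delivered on that same thread never
sees it. A timeout is used so the sketch terminates instead of hanging.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names come from Drill.
public class SelfWaitSketch {

    /**
     * Queues a "close" task that waits for a resume signal on closeThread,
     * then queues the task that delivers the signal on resumeThread.
     * Returns true if the signal arrived before the timeout expired.
     */
    public static boolean closeSeesResume(ExecutorService closeThread,
                                          ExecutorService resumeThread,
                                          long timeoutMillis) throws Exception {
        CountDownLatch resume = new CountDownLatch(1);

        // The close task blocks until the latch is released or the timeout expires.
        Callable<Boolean> closeTask = () -> resume.await(timeoutMillis, TimeUnit.MILLISECONDS);
        Future<Boolean> close = closeThread.submit(closeTask);

        // The resume task releases the latch, but only once its thread is free to run it.
        Runnable resumeTask = resume::countDown;
        resumeThread.execute(resumeTask);

        return close.get();
    }

    public static void main(String[] args) throws Exception {
        // Single-thread executors stand in for the RPC threads (an assumption of this sketch).
        ExecutorService userServer = Executors.newSingleThreadExecutor();
        ExecutorService bitServer = Executors.newSingleThreadExecutor();
        try {
            // Same thread: the resume task is stuck in the queue behind the waiting close task.
            System.out.println("same thread:      " + closeSeesResume(userServer, userServer, 500));
            // Different threads: the resume task runs while the close task waits.
            System.out.println("different thread: " + closeSeesResume(bitServer, userServer, 500));
        } finally {
            userServer.shutdownNow();
            bitServer.shutdownNow();
        }
    }
}
```

Without the timeout, the same-thread call would block indefinitely, which mirrors the hang
described above. The different-thread call is included only as a contrast; it does not model
the pass or fail cases, which depend on which transition the BitServer thread handles first.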

> Broken Test: TestDrillbitResilience.cancelAfterEverythingIsCompleted()
> ----------------------------------------------------------------------
>
>                 Key: DRILL-3967
>                 URL: https://issues.apache.org/jira/browse/DRILL-3967
>             Project: Apache Drill
>          Issue Type: Test
>          Components: Execution - Flow, Execution - RPC
>    Affects Versions: 1.2.0
>            Reporter: Andrew
>            Assignee: Sudheesh Katkam
>            Priority: Minor
>
> TestDrillbitResilience.cancelAfterEverythingIsCompleted() can sometimes fail. I've noticed
> that running this test on an m2.xlarge on AWS causes a reproducible failure when running against
> the patch for https://issues.apache.org/jira/browse/DRILL-3749 (Upgraded Hadoop and Curator
> libraries).
> When running this test with the same patch on my laptop, this test passes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
