drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sudheesh Katkam" <skat...@maprtech.com>
Subject Re: Review Request 34603: DRILL-3167: When a query fails, Foreman should wait for all fragments to finish cleaning up before sending a FAILED state to the client
Date Wed, 01 Jul 2015 17:42:38 GMT


> On June 30, 2015, 7:29 p.m., Sudheesh Katkam wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java, line
874
> > <https://reviews.apache.org/r/34603/diff/3/?file=996134#file996134line874>
> >
> >     Can you add unit tests in TestDrillbitResilience?
> 
> abdelhakim deneche wrote:
>     what do you want to test for ?

Pause on a drillbit, fail on another drillbit, and resume the pause. Check if the state is
returned to the user only after the resume.

There must be other cases that you could test, given that it's a change?


> On June 30, 2015, 7:29 p.m., Sudheesh Katkam wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java,
line 315
> > <https://reviews.apache.org/r/34603/diff/3/?file=996135#file996135line315>
> >
> >     Discussion: Why is this the right thing to do? Is FAILED state a fair assumption
for the fragment?
> 
> abdelhakim deneche wrote:
>     I didn't realize fragmentDone() also updates the fragment status. I will update the
patch to fix that.
> 
> abdelhakim deneche wrote:
>     I updated CancelMessageHandler.cancelFragment() to return OK if the fragment we are
trying to cancel already finished executing. So the only fragments that will be reported as
FAILED are fragments that never got the cancellation request for some reason, as I said in
another comment a better alternative would be to add a separate fragment state but I don't
know if this will be useful from a user perspective.
> 
> abdelhakim deneche wrote:
>     another alternative would be to NOT udpate the fragment state so it will show up
as the last known state and have a separate method calls node.fragmentComplete() instead of
fragmentDone()

Looks good with the updated patch.


- Sudheesh


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34603/#review89946
-----------------------------------------------------------


On June 30, 2015, 10:37 p.m., abdelhakim deneche wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34603/
> -----------------------------------------------------------
> 
> (Updated June 30, 2015, 10:37 p.m.)
> 
> 
> Review request for drill, Jacques Nadeau, Jason Altekruse, and Sudheesh Katkam.
> 
> 
> Bugs: DRILL-3167
>     https://issues.apache.org/jira/browse/DRILL-3167
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> - In case of a failure the Foreman will cancel all fragments and move to a FAILING state
until all fragments are terminated
> - QueryManager.cancelExecutingFragments() returns false if no fragment available
> - Web UI still displays FAILED when Foreman state is FAILING
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java 3e461ef

>   exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java
6656bf6 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java
dd26a76 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java
9f302a2 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 716fb66

>   exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java 9318233

>   exec/java-exec/src/main/resources/rest/profile/list.ftl cf92ede 
>   exec/java-exec/src/main/resources/rest/profile/profile.ftl 46cdc83 
>   protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java e76d748 
>   protocol/src/main/java/org/apache/drill/exec/proto/beans/QueryResult.java 474e330 
>   protocol/src/main/protobuf/UserBitShared.proto 0451fd2 
> 
> Diff: https://reviews.apache.org/r/34603/diff/
> 
> 
> Testing
> -------
> 
> unit tests are passing along with functional and tpch100
> 
> 
> Thanks,
> 
> abdelhakim deneche
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message