helix-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HELIX-681) Participant should not fail state transition on fail to delete / relay message
Date Wed, 21 Mar 2018 18:01:00 GMT

    [ https://issues.apache.org/jira/browse/HELIX-681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16408337#comment-16408337 ]

ASF GitHub Bot commented on HELIX-681:
--------------------------------------

Github user zhan849 commented on a diff in the pull request:

    https://github.com/apache/helix/pull/152#discussion_r176182830
  
    --- Diff: helix-core/src/main/java/org/apache/helix/messaging/handling/HelixTask.java ---
    @@ -168,7 +169,14 @@ public HelixTaskResult call() {
     
           // forward relay messages attached to this message to other participants
           if (taskResult.isSuccess()) {
    -        forwardRelayMessages(accessor, _message, taskResult.getCompleteTime());
    +        try {
    +          forwardRelayMessages(accessor, _message, taskResult.getCompleteTime());
    +        } catch (Exception e) {
    +          // Fail to send relay message should not result in a task execution failure
    +          // Currently we don't log error to ZK to reduce writes as when accessor throws
    +          // exception, ZK might not be in good condition.
    +          logger.error("Failed to send relay messages.", e);
    --- End diff --
    
    will change


> Participant should not fail state transition on fail to delete / relay message
> ------------------------------------------------------------------------------
>
>                 Key: HELIX-681
>                 URL: https://issues.apache.org/jira/browse/HELIX-681
>             Project: Apache Helix
>          Issue Type: Bug
>            Reporter: Hao Zhang
>            Priority: Major
>
> Currently, HelixTask and HelixTaskExecutor each wrap the state transition routine in a general
> try-catch block, so any exception thrown from that routine fails the state transition. However,
> there are at least two cases in which the state transition should still be considered successful:
>  * We fail to delete the message after successfully handling it and updating the current state.
>    The state transition has already completed, and the current state is consistent between the
>    participant and ZK.
>  * We fail to send out a relay message. Relay messages are only a best-effort delivery mechanism
>    and have nothing to do with the state transition's result. If relaying fails, the controller
>    will resend the message, which ensures correctness.
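
The two cases in the issue description can be illustrated with a minimal, self-contained sketch. This is not the Helix code: BestEffortTransitionSketch, FollowUpStep, Result, execute and runBestEffort are hypothetical names made up for the example. It only assumes that the transition itself decides success or failure, while message deletion and relaying run afterwards in their own try-catch so that their failures are recorded without changing the result.

```java
import java.util.ArrayList;
import java.util.List;

public class BestEffortTransitionSketch {

    /** Hypothetical stand-in for one step of message handling (not a Helix type). */
    interface FollowUpStep {
        void run() throws Exception;
    }

    /** Hypothetical result holder; it is not Helix's HelixTaskResult. */
    static final class Result {
        final boolean success;
        final List<String> warnings = new ArrayList<>();

        Result(boolean success) {
            this.success = success;
        }
    }

    static Result execute(FollowUpStep transition,
                          FollowUpStep deleteMessage,
                          FollowUpStep relayMessages) {
        try {
            // Only a failure here counts as a failed state transition.
            transition.run();
        } catch (Exception e) {
            return new Result(false);
        }

        Result result = new Result(true);
        // The transition is already complete, so the steps below are best effort.
        runBestEffort("delete message", deleteMessage, result);
        runBestEffort("relay messages", relayMessages, result);
        return result;
    }

    private static void runBestEffort(String name, FollowUpStep step, Result result) {
        try {
            step.run();
        } catch (Exception e) {
            // Record the problem locally instead of failing the task.
            result.warnings.add("Failed to " + name + ": " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        Result r = execute(
            () -> { },                                                     // transition succeeds
            () -> { throw new IllegalStateException("ZK unavailable"); }, // deletion fails
            () -> { });                                                    // relay succeeds
        System.out.println(r.success + " " + r.warnings);
    }
}
```

In this sketch a failed deletion leaves success set to true and only adds a warning, whereas an exception from the transition step itself still returns a failed result.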



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
