impala-reviews mailing list archives

From "Sailesh Mukil (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-5749: coordinator race hits DCHECK 'num_remaining_backends_ > 0'
Date Thu, 03 Aug 2017 20:14:56 GMT
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5749: coordinator race hits DCHECK 'num_remaining_backends_ > 0'

Patch Set 1:

> Does this trigger only when there are two concurrent calls to
> UpdateBackendExecStatus() from the same backend? If so, do we
> understand why that happens so often?

My understanding is this:
A fragment instance sends a report every 'n' seconds. On a congested network, two of these
reports for the same fragment instance from a backend can arrive at the coordinator and start
being processed at around the same time, which leads to this issue.

Ideally, a second report cannot be sent until the coordinator has ACKed the first one, since
ReportProfileThread() holds a lock until the report is ACKed. There is only one case where a
second report is sent before the first one is responded to: the report sent from
FragmentInstanceState::Finalize().

So ReportProfileThread() sends one report for the last fragment instance (finstance), and then
Finalize() sends a second report for the same finstance before the first one has been
responded to.
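
To make the scenario concrete, here is a minimal sketch of two final reports for the same
backend reaching the coordinator concurrently. It is not the actual patch and not Impala's real
code: the BackendState and Coordinator classes below are simplified, hypothetical stand-ins,
ApplyReport() and SimulateDuplicateFinalReports() are names made up for illustration, and
assert() stands in for the DCHECK. The assumption is that the remaining-backends counter should
be decremented only by the first report that marks a backend as done.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the per-backend state kept by the coordinator.
class BackendState {
 public:
  // Returns true only for the first report that marks this backend done.
  // The check and the update happen under one lock, so a concurrent
  // duplicate final report sees is_done_ == true and returns false.
  bool ApplyReport(bool instance_done) {
    std::lock_guard<std::mutex> l(lock_);
    if (is_done_ || !instance_done) return false;
    is_done_ = true;
    return true;
  }

 private:
  std::mutex lock_;
  bool is_done_ = false;
};

// Hypothetical, heavily simplified coordinator.
class Coordinator {
 public:
  explicit Coordinator(std::size_t num_backends)
      : backends_(num_backends),
        num_remaining_backends_(static_cast<int>(num_backends)) {}

  void UpdateBackendExecStatus(std::size_t backend_idx, bool instance_done) {
    // Without this guard, both reports would reach the decrement below and
    // the second one would trip the assertion.
    if (!backends_[backend_idx].ApplyReport(instance_done)) return;
    std::lock_guard<std::mutex> l(lock_);
    assert(num_remaining_backends_ > 0);  // stand-in for the DCHECK
    --num_remaining_backends_;
  }

  int num_remaining_backends() {
    std::lock_guard<std::mutex> l(lock_);
    return num_remaining_backends_;
  }

 private:
  std::vector<BackendState> backends_;
  std::mutex lock_;
  int num_remaining_backends_;
};

// Sends two concurrent final reports for backend 0, mimicking one report from
// the periodic reporting thread and one from Finalize(). Returns the number of
// backends the coordinator still considers outstanding.
int SimulateDuplicateFinalReports(std::size_t num_backends) {
  Coordinator coord(num_backends);
  std::thread periodic([&coord] { coord.UpdateBackendExecStatus(0, true); });
  std::thread finalize([&coord] { coord.UpdateBackendExecStatus(0, true); });
  periodic.join();
  finalize.join();
  return coord.num_remaining_backends();
}
```

In this sketch, whichever of the two reports acquires the backend's lock first performs the
decrement and the other becomes a no-op, so the count drops by exactly one per backend
regardless of arrival order.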


Gerrit-MessageType: comment
Gerrit-Change-Id: I1528661e5df6d9732ebfeb414576c82ec5c92241
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <>
Gerrit-Reviewer: Henry Robinson <>
Gerrit-Reviewer: Sailesh Mukil <>
Gerrit-HasComments: No
