impala-reviews mailing list archives

From "Thomas Tauber-Marshall (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-5749: coordinator race hits DCHECK 'num_remaining_backends_ > 0'
Date Fri, 04 Aug 2017 18:08:34 GMT
Hello Michael Ho, Sailesh Mukil,

I'd like you to reexamine a change.  Please visit

to look at the new patch set (#2).

Change subject: IMPALA-5749: coordinator race hits DCHECK 'num_remaining_backends_ > 0'

IMPALA-5749: coordinator race hits DCHECK 'num_remaining_backends_ > 0'

In Coordinator::UpdateBackendExecStatus(), we check whether the backend
has already completed with BackendState::IsDone() and, if so, return
without applying the update. This avoids updating
num_remaining_backends_ twice for the same completed backend.

The problem is that the value of BackendState::IsDone() is changed by
the later call to BackendState::ApplyExecStatusReport(), and the check
and the update are not performed atomically. If there are two
simultaneous calls to UpdateBackendExecStatus(), both can call
IsDone(), both get 'false', and both then erroneously update
num_remaining_backends_, hitting a DCHECK.

The solution is to hold BackendState::lock_ across both the IsDone()
check and the update that changes its value, so that the two happen
atomically.
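
As a rough illustration of the locking pattern described above (not the
actual patch), the sketch below folds the "is it done?" check and the
state update into one critical section. BackendStateSketch,
CoordinatorSketch, ApplyReportIfNotDone() and RunConcurrentFinalReports()
are hypothetical names invented for this example, and assert() stands in
for the DCHECK.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical, simplified stand-in for the per-backend state.
class BackendStateSketch {
 public:
  // Checks the done flag and applies the report under a single lock
  // acquisition. Returns true only for the one call that moves this
  // backend from "not done" to "done".
  bool ApplyReportIfNotDone(bool report_is_final) {
    std::lock_guard<std::mutex> l(lock_);
    if (is_done_) return false;  // already completed: ignore the report
    if (report_is_final) is_done_ = true;
    return is_done_;
  }

 private:
  std::mutex lock_;
  bool is_done_ = false;
};

// Hypothetical, simplified stand-in for the coordinator's bookkeeping.
class CoordinatorSketch {
 public:
  explicit CoordinatorSketch(int num_backends)
      : backends_(num_backends), num_remaining_backends_(num_backends) {}

  void UpdateBackendExecStatus(int idx, bool report_is_final) {
    // Only the caller that completed the backend may touch the counter.
    if (!backends_[idx].ApplyReportIfNotDone(report_is_final)) return;
    std::lock_guard<std::mutex> l(lock_);
    assert(num_remaining_backends_ > 0);  // analogue of the DCHECK
    --num_remaining_backends_;
  }

  int num_remaining_backends() {
    std::lock_guard<std::mutex> l(lock_);
    return num_remaining_backends_;
  }

 private:
  std::vector<BackendStateSketch> backends_;
  std::mutex lock_;
  int num_remaining_backends_;
};

// Sends a final report for the same backend from several threads at once
// and returns the remaining-backend count afterwards.
int RunConcurrentFinalReports(int num_threads) {
  CoordinatorSketch coord(1);
  std::vector<std::thread> threads;
  for (int i = 0; i < num_threads; ++i) {
    threads.emplace_back([&coord] { coord.UpdateBackendExecStatus(0, true); });
  }
  for (auto& t : threads) t.join();
  return coord.num_remaining_backends();
}
```

In this sketch, if the check and the update were instead two separately
locked calls, two threads could both see "not done" and both decrement
the counter, which is the race described above.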

- Ran test_finst_cancel_when_query_complete 10,000 times without
  hitting the DCHECK (previously, it would hit about once per 300
  runs).

Change-Id: I1528661e5df6d9732ebfeb414576c82ec5c92241
M be/src/runtime/
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/
3 files changed, 18 insertions(+), 11 deletions(-)

  git pull ssh:// refs/changes/77/7577/2

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1528661e5df6d9732ebfeb414576c82ec5c92241
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <>
Gerrit-Reviewer: Henry Robinson <>
Gerrit-Reviewer: Michael Ho <>
Gerrit-Reviewer: Sailesh Mukil <>
