impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bharath Vissapragada (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances
Date Wed, 15 Nov 2017 20:30:50 GMT
Hello Sailesh Mukil, Dimitris Tsirogiannis, Alex Behm, Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8449

to look at the new patch set (#9).

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber instances
......................................................................

IMPALA-3613: Avoid topic updates to unregistered subscriber instances

Bug:

Without this patch, when a subscriber repeatedly reconnects to the
statestore, the latter queues the initial heartbeat message and a
bunch of topic updates to every instance of the registered subscriber.
These queued updates are eventually picked up by the heartbeating/topic
update threads and the corresponding RPCs are made to the subscribers.
The subscriber then rejects these updates since they were meant for an
earlier registration. This is usually possible if the subscriber has
some network problems leading to failing RPCs.

Such a node is eventually marked by the statestore as bad, but depending
on the configurations set, the issue can snowball into DDOS kind of
attack when the entire thread pool of heartbeating/topic updates is
filled with instances from the problematic host. This can result in
the statestore missing timely heartbeats to other subscribers making
them reconnect. This worsens the situation and the resulting topic
updates for the reconnects will fully saturate the network on the
statestore host, until the statestore daemon is restarted.

Fix:

This patch maps topic updates/heartbeats to a specific subscriber
registered instance rather to a subscriber id (that stays same across
reconnects). That way, when we encounter a topic update that was meant to
a stale subscriber, we can simply reject it.

Testing:

Tested this locally by adding relevant logging. I made the subscribers
to reconnect aggressively(a) and delaying heartbeats from the statestore
side (b,c).

(a) --statestore_subscriber_timeout_seconds=1
(b) --statestore_max_missed_heartbeats=1000
(c) --statestore_heartbeat_frequency_ms=60000

Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
---
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
4 files changed, 86 insertions(+), 45 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/8449/9
-- 
To view, visit http://gerrit.cloudera.org:8080/8449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 9
Gerrit-Owner: Bharath Vissapragada <bharathv@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bharathv@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sailesh@cloudera.com>

Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message