hadoop-common-issues mailing list archives

From "Arpit Agarwal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13029) Have FairCallQueue try all lower priority sub queues before backoff
Date Thu, 21 Apr 2016 19:46:25 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15252533#comment-15252533 ]

Arpit Agarwal commented on HADOOP-13029:

Hi [~mingma], backoff has two goals:
# Prevent stalling the IPC reader thread indefinitely.
# Throttle clients when the NameNode is under load by signaling congestion.

Spilling to lower priority queues delays the congestion signal, which makes (2) less effective.
The networking world seems to favor earlier notification of congestion, e.g. [TCP RED|https://en.wikipedia.org/wiki/Random_early_detection], [ECN|https://en.wikipedia.org/wiki/Explicit_Congestion_Notification]
and delay-based congestion control.
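
To illustrate what "earlier notification" means in the RED case, here is a minimal sketch of the basic RED marking curve, in which the chance of signaling congestion rises before the queue is actually full. It is not Hadoop code: the class {{EarlyBackoffSketch}}, the method {{signalProbability}} and all parameter names are hypothetical, and the count-based correction from the full RED algorithm is left out.

```java
// Hypothetical sketch of a RED-style early congestion signal; not part of Hadoop.
final class EarlyBackoffSketch {

  private EarlyBackoffSketch() {
  }

  /**
   * Probability of signaling congestion for a given average queue length.
   * Below minTh nothing is signaled, at or above maxTh everything is,
   * and in between the probability grows linearly from 0 up to maxP.
   * Assumes minTh < maxTh.
   */
  static double signalProbability(double avgQueueLen, double minTh,
                                  double maxTh, double maxP) {
    if (avgQueueLen < minTh) {
      return 0.0;
    }
    if (avgQueueLen >= maxTh) {
      return 1.0;
    }
    return maxP * (avgQueueLen - minTh) / (maxTh - minTh);
  }
}
```

The point of the curve is that some clients are told to slow down while the queue still has room, instead of everyone being told at once when it overflows.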

bq. A heavy user generates lots of rpc requests, but it only filled up 1/4 of the lowest priority sub queue. However that is enough to cause lock contention with DN RPC requests.

[~xyao] recently introduced HADOOP-12916 with the goal of addressing the same problem.

> Have FairCallQueue try all lower priority sub queues before backoff
> -------------------------------------------------------------------
>                 Key: HADOOP-13029
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13029
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Ming Ma
> Currently if FairCallQueue and backoff are enabled, backoff will kick in as soon as the assigned sub queue is filled up.
> {noformat}
>   /**
>    * Put and offer follow the same pattern:
>    * 1. Get the assigned priorityLevel from the call by scheduler
>    * 2. Get the nth sub-queue matching this priorityLevel
>    * 3. delegate the call to this sub-queue.
>    *
>    * But differ in how they handle overflow:
>    * - Put will move on to the next queue until it lands on the last queue
>    * - Offer does not attempt other queues on overflow
>    */
> {noformat}
> Seems it is better to try lower priority sub queues when the assigned sub queue is full, just like the case when backoff is disabled. This will give regular users more opportunities and allow the cluster to be configured with smaller call queue length. [~chrili], [~arpitagarwal], what do you think?
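
For reference, the put/offer difference described in the quoted code comment, together with the spill-before-backoff variant proposed in the issue, can be sketched roughly as follows. This is an illustration only and not the FairCallQueue source: {{SubQueueSketch}} and its method names are hypothetical, index 0 is assumed to be the highest priority, and "lands on the last queue" is assumed to mean a blocking put on the last sub-queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration only; not the Hadoop FairCallQueue source.
final class SubQueueSketch<E> {
  // Assumption: index 0 is the highest priority, the last index the lowest.
  private final List<BlockingQueue<E>> queues = new ArrayList<>();

  SubQueueSketch(int levels, int capacityPerLevel) {
    for (int i = 0; i < levels; i++) {
      queues.add(new ArrayBlockingQueue<>(capacityPerLevel));
    }
  }

  // Put: on overflow move on to the next sub-queue; block only on the last one.
  void put(E call, int priorityLevel) throws InterruptedException {
    int last = queues.size() - 1;
    for (int i = priorityLevel; i < last; i++) {
      if (queues.get(i).offer(call)) {
        return;
      }
    }
    queues.get(last).put(call);
  }

  // Offer as described in the quoted comment: only the assigned sub-queue is
  // tried, so a full sub-queue fails immediately (the caller can then back off).
  boolean offerAssignedOnly(E call, int priorityLevel) {
    return queues.get(priorityLevel).offer(call);
  }

  // Variant proposed in the issue: try every lower priority sub-queue
  // before reporting failure.
  boolean offerWithSpill(E call, int priorityLevel) {
    for (int i = priorityLevel; i < queues.size(); i++) {
      if (queues.get(i).offer(call)) {
        return true;
      }
    }
    return false;
  }

  int size(int level) {
    return queues.get(level).size();
  }
}
```

With two levels of capacity 1, a second call at level 0 is rejected by {{offerAssignedOnly}} but accepted into level 1 by {{offerWithSpill}}. That is the trade-off under discussion: the caller gets more room, but the congestion signal arrives later.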

