hadoop-common-issues mailing list archives

From "Chris Li (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-9640) RPC Congestion Control with FairCallQueue
Date Fri, 06 Dec 2013 19:55:41 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-9640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Li updated HADOOP-9640:
-----------------------------

    Attachment: faircallqueue2.patch

[~daryn] Definitely. This new patch makes the call queue pluggable, and it defaults to
FIFOCallQueue, which uses a LinkedBlockingQueue. We will also be testing performance on larger
clusters in January.

Please let me know your thoughts on this new patch.

In this new patch (faircallqueue2.patch):
*Architecture*
The FairCallQueue is responsible for its Scheduler and Mux, which in the future will be pluggable
as well. They are not pluggable yet since there is only one option for each today.

Changes to NameNodeRPCServer (and others) are no longer necessary.

*Scheduling Token*
Using username right now, but will switch to jobID when a good way of including it is decided
upon.

*Cross-server scheduling*
Scheduling across servers (for instance, the Namenode can have 2 RPC Servers for users and
service calls) will be supported in a future patch.

*Configuration*
Configuration keys are keyed by port, so for a server running on 8020:

_ipc.8020.callqueue.impl_
Defaults to FIFOCallQueue.class, which uses a LinkedBlockingQueue. To enable priority, set it to
"org.apache.hadoop.ipc.FairCallQueue".

_ipc.8020.faircallqueue.priority-levels_
Controls the number of priority levels in the FairCallQueue. Defaults to 4.

_ipc.8020.history-scheduler.service-users_
A comma-separated list of users that will be exempt from scheduling and always given top priority,
e.g. "hadoop,hdfs". Intended for service users such as hadoop or hdfs.

_ipc.8020.history-scheduler.history-length_
The number of past calls to remember. HistoryRpcScheduler will schedule requests based on
this pool. Defaults to 1000.
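
To illustrate the history pool described above, here is a minimal sketch of a sliding window over the most recent callers. It is not the HistoryRpcScheduler from the patch: the class and method names are hypothetical, and it assumes that service users are never recorded in the history and always report a count of 0.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the HistoryRpcScheduler from the patch.
class HistoryWindowSketch {
    private final int historyLength;         // cf. history-scheduler.history-length
    private final Set<String> serviceUsers;  // cf. history-scheduler.service-users
    private final Deque<String> history = new ArrayDeque<>();
    private final Map<String, Integer> counts = new HashMap<>();

    HistoryWindowSketch(int historyLength, Set<String> serviceUsers) {
        if (historyLength <= 0) {
            throw new IllegalArgumentException("historyLength must be positive");
        }
        this.historyLength = historyLength;
        this.serviceUsers = serviceUsers;
    }

    // Records one call and returns how many of the remembered calls
    // (including this one) came from the same user.
    // Assumption: service users bypass the history and always report 0.
    int recordCall(String user) {
        if (serviceUsers.contains(user)) {
            return 0;
        }
        history.addLast(user);
        counts.merge(user, 1, Integer::sum);
        if (history.size() > historyLength) {
            // Forget the oldest call so the window never exceeds historyLength.
            String evicted = history.removeFirst();
            counts.computeIfPresent(evicted, (u, c) -> c == 1 ? null : c - 1);
        }
        return counts.getOrDefault(user, 0);
    }
}
```

The sketch is not thread-safe; a real RPC server would need to guard the window against concurrent handlers.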

_ipc.8020.history-scheduler.thresholds_
A comma-separated list of ints that specify the thresholds for scheduling in the history scheduler.
For instance, with 4 queues and a history-length of 1000, "50,400,750" will schedule callers with
more than 750 calls in the history into queue 3, > 400 into queue 2, > 50 into queue 1, else into
queue 0. Defaults to an even split: for a history-length of 200 and 4 queues, each band is 50 wide,
giving "50,100,150".
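
The threshold rule above can be sketched as follows. The names are hypothetical and not taken from the patch, and the sketch assumes the thresholds are listed in ascending order and that the even split uses integer division.

```java
// Hypothetical illustration only; names are not taken from the patch.
class ThresholdSketch {
    // Even split of the history length across the queues,
    // e.g. a history-length of 200 with 4 queues gives {50, 100, 150}.
    static int[] defaultThresholds(int historyLength, int numQueues) {
        if (numQueues < 1) {
            throw new IllegalArgumentException("numQueues must be at least 1");
        }
        int[] thresholds = new int[numQueues - 1];
        int step = historyLength / numQueues; // assumption: rounds down
        for (int i = 0; i < thresholds.length; i++) {
            thresholds[i] = step * (i + 1);
        }
        return thresholds;
    }

    // The queue index is the number of thresholds the count strictly exceeds,
    // so counts at or below the first threshold land in queue 0.
    static int queueFor(int callCount, int[] thresholds) {
        int queue = 0;
        for (int threshold : thresholds) {
            if (callCount > threshold) {
                queue++;
            }
        }
        return queue;
    }
}
```

With thresholds {50, 400, 750}, `queueFor(751, ...)` selects queue 3 and `queueFor(50, ...)` selects queue 0, matching the example in the text.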

_ipc.8020.wrr-multiplexer.weights_
A comma separated list of ints that specify weights for each queue. For instance with 4 queues:
"10,5,5,1", which sets the handlers to draw from the queues with the following pattern:
* Read queue0 10 times
* Read queue1 5 times
* Read queue2 5 times
* Read queue3 1 time
And then repeat. Defaults to a log2 split: For 4 queues, it would be 8,4,2,1
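
The read pattern above can be sketched as a small weighted round-robin counter. This is not the multiplexer from the patch: the class and method names are hypothetical, and the sketch ignores the case where the selected queue is empty, which a real multiplexer would have to handle.

```java
// Hypothetical illustration only; not the multiplexer from the patch.
class WeightedRoundRobinSketch {
    private final int[] weights;
    private int queueIndex = 0; // queue currently being read
    private int readsLeft;      // reads remaining before moving to the next queue

    WeightedRoundRobinSketch(int[] weights) {
        if (weights.length == 0) {
            throw new IllegalArgumentException("at least one weight is required");
        }
        for (int w : weights) {
            if (w <= 0) {
                throw new IllegalArgumentException("weights must be positive");
            }
        }
        this.weights = weights.clone();
        this.readsLeft = this.weights[0];
    }

    // Power-of-two default, highest weight first: 4 queues gives {8, 4, 2, 1}.
    static int[] defaultWeights(int numQueues) {
        if (numQueues < 1 || numQueues > 30) {
            throw new IllegalArgumentException("numQueues must be in 1..30");
        }
        int[] result = new int[numQueues];
        for (int i = 0; i < numQueues; i++) {
            result[i] = 1 << (numQueues - 1 - i);
        }
        return result;
    }

    // Returns the index of the queue to read next, then advances the cycle.
    int nextQueue() {
        int current = queueIndex;
        readsLeft--;
        if (readsLeft == 0) {
            queueIndex = (queueIndex + 1) % weights.length;
            readsLeft = weights[queueIndex];
        }
        return current;
    }
}
```

With weights {10, 5, 5, 1}, one full cycle is 21 calls to `nextQueue()`, after which the pattern starts again at queue 0. The sketch is single-threaded; sharing it between handlers would require synchronization.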

> RPC Congestion Control with FairCallQueue
> -----------------------------------------
>
>                 Key: HADOOP-9640
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9640
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 3.0.0, 2.2.0
>            Reporter: Xiaobo Peng
>              Labels: hdfs, qos, rpc
>         Attachments: MinorityMajorityPerformance.pdf, NN-denial-of-service-updated-plan.pdf, faircallqueue.patch, faircallqueue2.patch, rpc-congestion-control-draft-plan.pdf
>
>
> Several production Hadoop cluster incidents occurred where the Namenode was overloaded and failed to respond.
> We can improve quality of service for users during namenode peak loads by replacing the FIFO call queue with a [Fair Call Queue|https://issues.apache.org/jira/secure/attachment/12616864/NN-denial-of-service-updated-plan.pdf]. (this plan supersedes rpc-congestion-control-draft-plan).
> Excerpted from the communication of one incident, “The map task of a user was creating huge number of small files in the user directory. Due to the heavy load on NN, the JT also was unable to communicate with NN...The cluster became responsive only once the job was killed.”
> Excerpted from the communication of another incident, “Namenode was overloaded by GetBlockLocation requests (Correction: should be getFileInfo requests. the job had a bug that called getFileInfo for a nonexistent file in an endless loop). All other requests to namenode were also affected by this and hence all jobs slowed down. Cluster almost came to a grinding halt…Eventually killed jobtracker to kill all jobs that are running.”
> Excerpted from HDFS-945, “We've seen defective applications cause havoc on the NameNode, for e.g. by doing 100k+ 'listStatus' on very large directories (60k files) etc.”



--
This message was sent by Atlassian JIRA
(v6.1#6144)
