Date: Wed, 26 Jul 2017 08:36:00 +0000 (UTC)
From: "Jiandan Yang (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Created] (HDFS-12200) Optimize CachedDNSToSwitchMapping to avoid high CPU utilization

Jiandan Yang created HDFS-12200:
------------------------------------

             Summary: Optimize CachedDNSToSwitchMapping to avoid high CPU utilization
                 Key: HDFS-12200
                 URL: https://issues.apache.org/jira/browse/HDFS-12200
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: namenode
            Reporter: Jiandan Yang
1. Background

Our Hadoop cluster separates storage and compute: HDFS is deployed on 600+ machines, while YARN is deployed on a separate machine pool that runs both offline jobs and online services. YARN's offline jobs access HDFS, but the set of machines running offline jobs changes dynamically because online services have higher priority: when an online service is idle, its machines are assigned to offline tasks, and when it becomes busy, it preempts the offline jobs' resources.

We found that NameNode CPU utilization sometimes reached 90% or even 100%. Worst of all, prolonged 100% CPU utilization caused JournalNode writes to time out, eventually making the NameNode hang. The reason is that offline tasks running on several hundred servers accessed HDFS at the same time, and the NameNode, resolving the rack of each client machine, forked several hundred sub-processes:

{code:java}
"process reaper" #10864 daemon prio=10 os_prio=0 tid=0x00007fe270a31800 nid=0x38d93 runnable [0x00007fcdc36fc000]
   java.lang.Thread.State: RUNNABLE
        at java.lang.UNIXProcess.waitForProcessExit(Native Method)
        at java.lang.UNIXProcess.lambda$initStreams$4(UNIXProcess.java:301)
        at java.lang.UNIXProcess$$Lambda$7/1447689627.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1147)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
        at java.lang.Thread.run(Thread.java:834)
{code}

Our configuration is as follows:

{code:java}
net.topology.node.switch.mapping.impl = ScriptBasedMapping
net.topology.script.file.name = 'a python script'
{code}

2. Optimization

To solve these problems, we optimized CachedDNSToSwitchMapping as follows (a minimal sketch of the resulting logic appears after the list):

(1) Add the DataNode IP list to the file configured by dfs.hosts. When the NameNode starts, it preloads the DataNodes' rack information into the cache, resolving a batch of hosts per script invocation (the batch size is controlled by net.topology.script.number.args, whose default value is 100).

(2) Step (1) ensures the cache holds the racks of all DataNodes, so if a lookup misses the cache, the host must be a client machine, and we directly return /default-rack.

(3) Each time new DataNodes are added, add their IP addresses to the file specified by dfs.hosts and run bin/hdfs dfsadmin -refreshNodes; this loads the newly added DataNodes' racks into the cache.

(4) Add a new configuration item, dfs.namenode.topology.resolve-non-cache-host: setting it to false enables the behavior above, setting it to true turns it off, and the default is true to preserve compatibility.
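To make steps (1), (2) and (4) concrete, here is a minimal, self-contained sketch of the preload-and-fallback logic. It is an illustration only, not the actual patch: the class name PreloadedRackCache, the rawMapping function, and the method names are all invented here, standing in for CachedDNSToSwitchMapping and ScriptBasedMapping in the real code.

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class PreloadedRackCache {
    private static final String DEFAULT_RACK = "/default-rack";

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Stands in for ScriptBasedMapping: one call = one fork of the
    // topology script, resolving a whole batch of hosts at once.
    private final Function<List<String>, List<String>> rawMapping;
    // Mirrors the proposed dfs.namenode.topology.resolve-non-cache-host:
    // false = short-circuit cache misses to /default-rack (new behavior),
    // true  = fork the script on a miss as before (default, compatible).
    private final boolean resolveNonCacheHost;

    public PreloadedRackCache(Function<List<String>, List<String>> rawMapping,
                              boolean resolveNonCacheHost) {
        this.rawMapping = rawMapping;
        this.resolveNonCacheHost = resolveNonCacheHost;
    }

    /** Step (1): preload the racks of every host in dfs.hosts,
     *  batchSize hosts per script run (net.topology.script.number.args). */
    public void preload(String dfsHostsFile, int batchSize) throws IOException {
        List<String> hosts = Files.readAllLines(Paths.get(dfsHostsFile));
        hosts.removeIf(String::isEmpty);
        for (int i = 0; i < hosts.size(); i += batchSize) {
            List<String> batch = hosts.subList(i, Math.min(i + batchSize, hosts.size()));
            List<String> racks = rawMapping.apply(batch);
            for (int j = 0; j < batch.size(); j++) {
                cache.put(batch.get(j), racks.get(j));
            }
        }
    }

    /** Steps (2) and (4): DataNodes hit the cache; a miss means the
     *  host is a client machine, so no sub-process is forked. */
    public String resolve(String host) {
        String rack = cache.get(host);
        if (rack != null) {
            return rack;
        }
        if (!resolveNonCacheHost) {
            return DEFAULT_RACK;  // client machine: answer without forking
        }
        // Compatibility path: resolve the miss with one script invocation.
        return rawMapping.apply(Collections.singletonList(host)).get(0);
    }

    /** Step (3): rerun the preload after dfsadmin -refreshNodes has
     *  picked up new DataNodes added to dfs.hosts. */
    public void refreshNodes(String dfsHostsFile, int batchSize) throws IOException {
        preload(dfsHostsFile, batchSize);
    }
}
{code}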
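For example, the sketch above could be exercised as follows (the dfs.hosts path and the stub "script" are hypothetical):

{code:java}
import java.util.Collections;

public class PreloadedRackCacheDemo {
    public static void main(String[] args) throws Exception {
        // A stub standing in for the topology script: maps every host
        // in a batch to /rack-a with a single (simulated) invocation.
        PreloadedRackCache mapping = new PreloadedRackCache(
                hosts -> Collections.nCopies(hosts.size(), "/rack-a"),
                false);  // dfs.namenode.topology.resolve-non-cache-host = false
        mapping.preload("/etc/hadoop/conf/dfs.hosts", 100);  // step (1)
        // Hosts listed in dfs.hosts resolve from the cache; any other
        // host is treated as a client and mapped to /default-rack.
        System.out.println(mapping.resolve("192.0.2.77"));  // /default-rack
    }
}
{code}

Note that the correctness of step (2) depends entirely on dfs.hosts being complete: any DataNode missing from the file would silently land in /default-rack, which is why the feature is off by default (step (4)).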