hadoop-hdfs-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6022) Moving deadNodes from being thread local. Improving dead datanode handling in DFSClient
Date Wed, 26 Feb 2014 21:30:23 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13913530#comment-13913530 ]

Hadoop QA commented on HDFS-6022:
---------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12631314/HADOOP-6022.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 1 new or modified
test files.

    {color:red}-1 javac{color}.  The applied patch generated 1546 javac compiler warnings
(more than the trunk's current 1545 warnings).

    {color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:red}-1 findbugs{color}.  The patch appears to introduce 3 new Findbugs (version
1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number
of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6246//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/6246//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/6246//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6246//console

This message is automatically generated.

> Moving deadNodes from being thread local. Improving dead datanode handling in DFSClient
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-6022
>                 URL: https://issues.apache.org/jira/browse/HDFS-6022
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 0.23.9, 0.23.10, 2.2.0, 2.3.0
>            Reporter: Jack Levin
>              Labels: patch
>             Fix For: 3.0.0
>
>         Attachments: HADOOP-6022.patch
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> This patch fixes the problem of the deadNodes list being thread local.  DFSClient adds a
datanode to deadNodes when it has problems writing to, reading from, or contacting that
datanode.  Because deadNodes is not visible to other DFSInputStream threads, every
DFSInputStream ends up building its own deadNodes.  This affects DFSClient performance to a
large degree, especially when a datanode goes completely offline (all DFSInputStream threads
experience a TCP connect delay, which affects the performance of the whole cluster).
> This patch makes deadNodes global in the DFSClient class, so that as soon as a single
DFSInputStream thread reports a dead datanode, all other DFSInputStream threads are informed
and no longer need to build their own independent lists (really a concurrent Map).
> Further, a global deadNodes health check manager thread (DeadNodeVerifier) is created
to verify all dead datanodes every 5 seconds and to remove a datanode from the list as soon
as it is back up.  Under normal conditions (deadNodes empty) that thread is sleeping.  If
deadNodes is not empty, the thread attempts to open a TCP connection to each affected
datanode every 5 seconds.
> This patch has a test (TestDFSClientDeadNodes) that is quite simple: since the creation of
deadNodes is not affected by the patch, we only test datanode removal from deadNodes by the
health check manager thread.  The test creates a file in a DFS minicluster, reads from the
same file rapidly, causes a datanode to restart, and checks that the health check manager
thread does the right thing, removing the live datanode from the global deadNodes list.
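
For readers who do not have the attachment open, the idea in the description can be sketched
roughly as follows. This is only an illustration under assumptions, not the code from
HADOOP-6022.patch: the class name SharedDeadNodes, its methods, and the connect timeout
parameter are hypothetical, and plain java.net sockets stand in for whatever the patch
actually uses to probe a datanode. Only the shared concurrent collection, the periodic
TCP probe, and the 5-second period come from the description.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a client-wide dead node list; not the patch's classes. */
class SharedDeadNodes {
    // One concurrent set shared by all reader threads of a client,
    // instead of one private list per input stream.
    private final Set<InetSocketAddress> deadNodes = ConcurrentHashMap.newKeySet();
    private final int connectTimeoutMs; // assumed parameter, not from the description

    SharedDeadNodes(int connectTimeoutMs) {
        this.connectTimeoutMs = connectTimeoutMs;
    }

    /** Called by any reader thread that fails to reach a datanode. */
    void markDead(InetSocketAddress node) {
        deadNodes.add(node);
    }

    /** Lets every other reader skip a node that some thread already reported. */
    boolean isDead(InetSocketAddress node) {
        return deadNodes.contains(node);
    }

    /** One verification pass: drop every node that accepts a TCP connection again. */
    void verifyOnce() {
        for (InetSocketAddress node : deadNodes) {
            try (Socket socket = new Socket()) {
                socket.connect(node, connectTimeoutMs);
                deadNodes.remove(node); // reachable again, so no longer dead
            } catch (IOException e) {
                // Still unreachable: keep it in the set until the next pass.
            }
        }
    }

    /** Runs verifyOnce periodically on a daemon thread. */
    ScheduledExecutorService startVerifier(long periodSeconds) {
        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "dead-node-verifier");
                thread.setDaemon(true);
                return thread;
            });
        scheduler.scheduleWithFixedDelay(
            this::verifyOnce, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

A caller would invoke startVerifier(5) to match the 5-second interval in the description.
When the set is empty a pass does no network work, which approximates the "sleeping"
behaviour described for the verifier thread.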



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
