hadoop-hdfs-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4246) The exclude node list should be more forgiving, for each output stream
Date Sat, 01 Dec 2012 20:01:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508038#comment-13508038 ]

Hadoop QA commented on HDFS-4246:

{color:green}+1 overall{color}.  Here are the results of testing the latest attachment 
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 1 new or modified
test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of
javac compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version
1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number
of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3586//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3586//console

This message is automatically generated.
> The exclude node list should be more forgiving, for each output stream
> ----------------------------------------------------------------------
>                 Key: HDFS-4246
>                 URL: https://issues.apache.org/jira/browse/HDFS-4246
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 2.0.0-alpha
>            Reporter: Harsh J
>            Assignee: Harsh J
>            Priority: Minor
>         Attachments: HDFS-4246.patch
> Originally observed by Inder on the mailing lists:
> {quote}
> Folks,
> I was wondering if there is any mechanism/logic to move a node back from the excludedNodeList to live nodes to be tried for new block creation.
> In the current DFSOutputStream code I do not see this. The use case is: if the write timeout is reduced, certain nodes get aggressively added to the excludedNodeList, and the client caches the DFSOutputStream, then the excluded nodes never get tried again in the lifetime of the application caching the DFSOutputStream.
> {quote}
> What this leads to is a special scenario that may impact smaller clusters more than larger ones:
> 1. A file is opened for continuous hflush/sync-based writes, such as an HBase WAL. This file is going to be kept open for a very long time, by design.
> 2. Over time, nodes are excluded for various errors, such as DN crashes, network failures,
> 3. Eventually, the exclude list equals (or nearly equals) the live nodes list, and the write suffers. At the point of equality, the write also fails with an error of not being able to get a block allocation.
> We should perhaps make the excludeNodes list a timed-cache collection, so that even if it begins filling up, the older excludes are pruned away, giving those nodes a try again for
> One place we have to be careful about, though, is rack failures. Racks sometimes do not come back fast enough, and retrying them with such an eventually-forgiving list can be problematic. Perhaps we can retain forgiven nodes and, if they are excluded again, double or triple the forgiveness value (in time units) to counter this? It's just one idea.
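
The idea in the description above (a timed exclude list whose forgiveness window grows for repeat offenders) could be sketched roughly as follows. This is only an illustration and not the attached HDFS-4246.patch: the class name ForgivingExcludeList, its methods, the String node identifiers, and the injected millisecond clock are all assumptions made for the example, and doubling is just one of the growth factors the description suggests.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/**
 * Hypothetical sketch of a "forgiving" exclude list. Not part of HDFS;
 * all names here are invented for illustration.
 */
final class ForgivingExcludeList {
    private final long baseForgivenessMillis;
    private final long maxForgivenessMillis;
    private final LongSupplier clock; // returns the current time in millis

    // node -> time at which the exclusion expires
    private final Map<String, Long> excluded = new HashMap<>();
    // node -> forgiveness window used for its most recent exclusion;
    // kept after the node is forgiven so that repeat offenders wait longer
    private final Map<String, Long> lastWindow = new HashMap<>();

    ForgivingExcludeList(long baseForgivenessMillis,
                         long maxForgivenessMillis,
                         LongSupplier clock) {
        this.baseForgivenessMillis = baseForgivenessMillis;
        this.maxForgivenessMillis = maxForgivenessMillis;
        this.clock = clock;
    }

    /** Excludes a node; each repeat exclusion doubles its window, up to the cap. */
    void exclude(String node) {
        long now = clock.getAsLong();
        prune(now);
        Long previous = lastWindow.get(node);
        long window;
        if (previous == null) {
            window = baseForgivenessMillis;
        } else if (previous > maxForgivenessMillis / 2) {
            window = maxForgivenessMillis; // cap, and avoid overflow when doubling
        } else {
            window = previous * 2;
        }
        lastWindow.put(node, window);
        excluded.put(node, now + window);
    }

    /** Returns true while the node's forgiveness window has not yet elapsed. */
    boolean isExcluded(String node) {
        prune(clock.getAsLong());
        return excluded.containsKey(node);
    }

    /** Drops every exclusion whose expiry time has been reached. */
    private void prune(long now) {
        excluded.values().removeIf(expiry -> expiry <= now);
    }
}
```

A caller would construct it with something like new ForgivingExcludeList(baseMillis, maxMillis, System::currentTimeMillis) and consult isExcluded(node) before picking a pipeline target. In this sketch the lastWindow map is never pruned, so a long-lived stream would need some bound on it; a rack outage would still cause retries, only at increasingly long intervals.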

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
