hadoop-hdfs-issues mailing list archives

From "James Clampffer (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-11014) libhdfs++: Make connection to HA clusters faster
Date Fri, 14 Oct 2016 13:58:20 GMT
James Clampffer created HDFS-11014:

             Summary: libhdfs++: Make connection to HA clusters faster
                 Key: HDFS-11014
                 URL: https://issues.apache.org/jira/browse/HDFS-11014
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: James Clampffer
            Assignee: James Clampffer
            Priority: Minor

Right now, when we get a StandbyException from the NN, we inject a 20 second delay before
trying the alternate NN, even on the first failover.  The first failover shouldn't have a
delay (the Java client skips the delay on its first failover).
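
A minimal sketch of the proposed behavior, assuming a per-request count of failovers already performed. FailoverPolicy, DelayBeforeFailover and ShouldFailover are hypothetical names for illustration and are not the libhdfs++ API; only the 20 second delay and the proposed limit of 4 come from this issue.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical policy type for illustration; not part of libhdfs++.
struct FailoverPolicy {
  std::chrono::milliseconds delay{20000};  // the 20 second delay described above
  int max_failovers{4};                    // the default proposed in this issue
};

// How long to wait before the next failover.
// failovers_so_far: failovers already performed for this request.
inline std::chrono::milliseconds DelayBeforeFailover(
    const FailoverPolicy& policy, int failovers_so_far) {
  assert(failovers_so_far >= 0);
  if (failovers_so_far == 0) {
    // First failover: try the alternate NN immediately.
    return std::chrono::milliseconds(0);
  }
  // Later failovers keep the configured delay.
  return policy.delay;
}

// Whether another failover is still allowed under the policy.
inline bool ShouldFailover(const FailoverPolicy& policy, int failovers_so_far) {
  return failovers_so_far < policy.max_failovers;
}
```

With the defaults in this sketch, the first failover is attempted with no wait, later ones wait 20 seconds, and the client gives up once 4 failovers have been performed.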

Another minor change I'd like to make is to reduce the default number of failover attempts
from 15 (the value used in the Apache config) to 4.  My impression is that a higher failover
count is handy for long-running batch jobs, but in the libhdfs++ case the client is often
an interactive application.  In that case it's generally preferable to fail sooner so a user
doesn't have to wait ~8 minutes for a timeout when using the default settings.

The choice of 4 failovers is based on the assumption that if we can't connect immediately,
it is either a GC pause, which will most likely be finished before the second connection
attempt, or a network or config issue that will take some sorting out by an admin.  It would
still be possible to override these values in the config for more tuning if a specific
deployment tends to have more or fewer network issues.
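
To get a rough feel for how the failover count affects waiting time, the sketch below adds up only the injected delays, assuming a fixed 20 second delay per failover. TotalInjectedDelay is a hypothetical helper, not libhdfs++ code. It does not model connection or RPC timeouts, so it will not reproduce the ~8 minute figure quoted above.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: total time spent in injected delays (and nothing else)
// across max_failovers failovers, each preceded by a fixed delay.
inline std::chrono::seconds TotalInjectedDelay(int max_failovers,
                                               std::chrono::seconds delay,
                                               bool skip_first_delay) {
  assert(max_failovers >= 0);
  int delayed_failovers = max_failovers;
  if (skip_first_delay && delayed_failovers > 0) {
    // The first failover happens immediately, so it adds no delay.
    --delayed_failovers;
  }
  return delay * delayed_failovers;
}
```

Under these assumptions, 15 failovers with no skipped delay spend 300 seconds waiting, while 4 failovers with the first delay skipped spend 60 seconds.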
