hadoop-hdfs-issues mailing list archives

From "Jesse Yates (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6440) Support more than 2 NameNodes
Date Sat, 17 Jan 2015 20:54:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281537#comment-14281537 ]

Jesse Yates commented on HDFS-6440:

Some follow up after actually looking at the code:

bq. Is it possible that doWork throws IOException other than RemoteException?
Yup. In fact, the implementation of doWork at EditLogTailer#ln291 can throw an IOException if
the call to the proxy for rollEditLog throws an IOException. Sure, this is a bit brittle -
a RemoteException could be thrown by that call (or any other) as a plain IOException, but that
really can't be helped because we have no other way of differentiating right now.
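
A minimal sketch of the ambiguity described above, not the actual EditLogTailer code. It assumes a hypothetical stand-in class, RemoteExceptionStandIn, in place of Hadoop's RemoteException so that it compiles without Hadoop on the classpath; DoWorkSketch, Work and classify are likewise made-up names. A remote failure that arrives with its own type can be caught separately, but one that arrives wrapped in a plain IOException falls through to the generic branch:

```java
import java.io.IOException;

// Hypothetical stand-in for Hadoop's RemoteException (assumption: like the
// real class, it is a subclass of IOException).
class RemoteExceptionStandIn extends IOException {
    RemoteExceptionStandIn(String msg) { super(msg); }
}

class DoWorkSketch {
    // Hypothetical callback type standing in for the body of doWork.
    interface Work { void run() throws IOException; }

    // Reports which catch branch handled the call.
    static String classify(Work work) {
        try {
            work.run();
            return "ok";
        } catch (RemoteExceptionStandIn e) {
            // Only reached when the remote failure keeps its own type.
            return "remote";
        } catch (IOException e) {
            // Local I/O failures land here, and so does a remote failure
            // that was rethrown or wrapped as a plain IOException.
            return "io";
        }
    }
}
```

The subclass catch has to come first; with the order reversed the code would not compile, since the IOException branch would already cover it.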

bq. 6. needCheckpoint == true implies sendRequest == true, thus when calling doCheckpoint(),
sendRequest is always true.

Yup, that was a slight logic bug. I think setting sendRequest should look like:
          // On all nodes, we build the checkpoint. However, we only ship the checkpoint if
          // we have a rollback request, are the checkpointer, or are outside the quiet period.
          boolean sendRequest = needCheckpoint && (isPrimaryCheckPointer
              || secsSinceLast >= checkpointConf.getQuietPeriod());
to actually not send the request every time - it wasn't going to break anything before, but
now it should actually conserve bandwidth :) 
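
To make the effect of that condition concrete, here is a small standalone sketch. The class and method names (SendRequestSketch, shouldSendRequest) and the quietPeriodSecs parameter are hypothetical, introduced only for illustration; the boolean expression itself is the one proposed above, with the quiet period passed in directly instead of read from checkpointConf:

```java
class SendRequestSketch {
    // Mirrors the proposed expression: ship the checkpoint only when one is
    // needed AND this node is either the primary checkpointer or has waited
    // at least the quiet period since the last upload.
    static boolean shouldSendRequest(boolean needCheckpoint,
                                     boolean isPrimaryCheckPointer,
                                     long secsSinceLast,
                                     long quietPeriodSecs) {
        return needCheckpoint
            && (isPrimaryCheckPointer || secsSinceLast >= quietPeriodSecs);
    }
}
```

With this shape, a non-primary standby that needs a checkpoint but is still inside the quiet period builds the checkpoint locally without uploading it, which is where the bandwidth saving would come from.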

bq. 7. Could you break this line
My IDE has that at 99 chars long - isn't 100 chars the standard line width? However, I moved
the IOE from the rest of the signature up to the second half of the method declaration.

bq. 11. Finally, could you reduce the changes in `MiniDFSCluster.java`, as many of them are
not changed, e.g. `MiniDFSCluster.java:911-986`.
I think I'm at the minimal number of changes there. Git frequently reports line adds and removes
when things move around a bit, as this patch necessitates. Fortunately, they should
be easy to ignore... but let me know if I'm missing what you are getting at.

> Support more than 2 NameNodes
> -----------------------------
>                 Key: HDFS-6440
>                 URL: https://issues.apache.org/jira/browse/HDFS-6440
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: auto-failover, ha, namenode
>    Affects Versions: 2.4.0
>            Reporter: Jesse Yates
>            Assignee: Jesse Yates
>         Attachments: Multiple-Standby-NameNodes_V1.pdf, hdfs-6440-cdh-4.5-full.patch, hdfs-6440-trunk-v1.patch, hdfs-multiple-snn-trunk-v0.patch
>
> Most of the work is already done to support more than 2 NameNodes (one active, one standby). This would be the last bit to support running multiple _standby_ NameNodes; one of the standbys should be available for fail-over.
> Mostly, this is a matter of updating how we parse configurations, some complexity around managing the checkpointing, and updating a whole lot of tests.
