hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-10702) Add a Client API and Proxy Provider to enable stale read from Standby
Date Thu, 18 Aug 2016 07:49:22 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426044#comment-15426044

Andrew Wang commented on HDFS-10702:

Hi Jiayi, thanks for working on this. I didn't get through all of it, notably the StaleReadProxyProvider
or the unit tests, but I did do a pass over the other stuff. This is mostly nitty, but a few
bigger questions fell out too:

* Typo: "chekc"
* Client#getStaleBound is unused
* In the proto, I'd prefer we add a new struct rather than just an int64, that way we can
add more fields later if necessary. Like how we have RPCCallerContext, this is essentially
also a form of context. Same for how we have a CallerContext object rather than fields directly.
* Do you have any thoughts about an HDFS-specific RPC header? I doubt staleness is useful
to other users of the RPC subsystem, and really we only need it for ClientNameNodeProtocol.
* StandbyState, we can unpack the if statements to simplify the logic.
* Please don't use a wildcard imports (e.g. ClientNNTranslatorPB, HdfsAdmin)
* DFSClient.java, more accurate javadoc would be to say "returns SyncInfo from the..." rather
than saying transaction ID.
* EditLogTailer has some unnecessary whitespace changes
* The new FSNamesystem methods, these are just for testing? It'd be better if we did this
by reaching into the Server instead, or found some way to test this via mocking. Whitespace
could also be improved.
* FSN#checkOperation, the comments don't match up anymore
* Do we really need a SyncInfo? There's a NamenodeProtocol#getTransactionID which might be
sufficient, if we also expose it to clients. The HA state only seems to be used in a test.
* Related, in NNRpcServer, getSyncInfo is checkOperation WRITE. If this is so, won't this
always return "active"? Unless we're calling getSyncInfo on standby NNs too?
* Also related, since getSyncInfo calls getTransactionID, which in turn checks for superuser
permissions, I don't think this will work as a normal user. Could you add a unit test?
* ProtoUtil, RemoteEditLogManifest have some unnecessary whitespace changes
* Server: typo requet
* MethodCategory, is there no way to do this with Java annotations instead? People are likely
not going to remember to update this file when appropriate.

> Add a Client API and Proxy Provider to enable stale read from Standby
> ---------------------------------------------------------------------
>                 Key: HDFS-10702
>                 URL: https://issues.apache.org/jira/browse/HDFS-10702
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Jiayi Zhou
>            Assignee: Jiayi Zhou
>            Priority: Minor
>         Attachments: HDFS-10702.001.patch, HDFS-10702.002.patch, HDFS-10702.003.patch,
> Currently, clients must always talk to the active NameNode when performing any metadata
operation, which means active NameNode could be a bottleneck for scalability. One way to solve
this problem is to send read-only operations to Standby NameNode. The disadvantage is that
it might be a stale read. 
> Here, I'm thinking of adding a Client API to enable/disable stale read from Standby which
gives Client the power to set the staleness restriction.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

View raw message