hadoop-hdfs-issues mailing list archives

From "Ming Ma (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-10702) Add a Client API and Proxy Provider to enable stale read from Standby
Date Fri, 04 Nov 2016 19:01:59 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15637336#comment-15637336 ]

Ming Ma commented on HDFS-10702:

Thanks [~zhz] for the ping. Thanks [~clouderajiayi] [~mackrorysd] for the great work.

Yes, it might be useful to leverage inotify, or at least to evaluate it. For this SbNN polling
approach, I would like to know more about how applications plan to use it, specifically
when they will decide to call getSyncInfo. In a multi-tenant environment, an application might
care about specific files/directories, not necessarily whether the namespace has changed at a global

Here are some comments specific to the patch.

* The standby NameNode has its own checkpoint lock to reduce the checkpoint's impact on block reports.
There could therefore be an assumption that the checkpointer is the only reader of the namespace
on the standby. You might want to confirm whether this has any implications.
* In the case of multiple standbys, one is the checkpointer, so you could consider allowing
the client to connect to standbys that are not doing the checkpoint.
* If the server config "dfs.ha.allow.stale.reads" is set to false and the client side enables
stale read, it seems the client will still keep trying. I wonder whether the client side should
take the server-side config into account as well.
* Federation configuration support might need some more work. It could depend on how you want
to enable it on the client side. The current patch is based on runtime config per client instance.
You could also allow defining a client-side config like "dfs.client.<name-service>.ha.allow.stale.reads".
* After NN failover, does StaleReadProxyProvider#standbyProxies get refreshed? If not, a
long-running client could keep using the old standby.
* The RPC layer is more general than HDFS, so it would be better if allowStandbyRead can be refactored
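
As a rough illustration of the per-nameservice client config suggested in the federation
comment above, the lookup could be sketched as follows. This is only a sketch under stated
assumptions: the StaleReadConfig class, its method names, and the use of a plain Map in place
of Hadoop's Configuration are hypothetical and not taken from the patch; only the key pattern
"dfs.client.<name-service>.ha.allow.stale.reads" comes from the comment.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of the HDFS-10702 patch): resolves a
 * per-nameservice stale-read flag from a key/value config.
 */
final class StaleReadConfig {

    /** Builds the key following the pattern suggested in the comment. */
    static String keyFor(String nameService) {
        return "dfs.client." + nameService + ".ha.allow.stale.reads";
    }

    /**
     * Returns the flag for the given nameservice, or defaultValue when the
     * key is absent. A plain Map stands in for a real configuration object.
     */
    static boolean allowStaleReads(Map<String, String> conf,
                                   String nameService,
                                   boolean defaultValue) {
        String value = conf.get(keyFor(nameService));
        return value == null ? defaultValue : Boolean.parseBoolean(value.trim());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // "ns1" is an example nameservice name, chosen for illustration.
        conf.put(keyFor("ns1"), "true");

        System.out.println("ns1: " + allowStaleReads(conf, "ns1", false));
        System.out.println("ns2: " + allowStaleReads(conf, "ns2", false));
    }
}
```

With a lookup of this shape, a client in a federated cluster could enable stale reads for one
nameservice while leaving the others on the default.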

> Add a Client API and Proxy Provider to enable stale read from Standby
> ---------------------------------------------------------------------
>                 Key: HDFS-10702
>                 URL: https://issues.apache.org/jira/browse/HDFS-10702
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Jiayi Zhou
>            Assignee: Jiayi Zhou
>            Priority: Minor
>         Attachments: HDFS-10702.001.patch, HDFS-10702.002.patch, HDFS-10702.003.patch,
HDFS-10702.004.patch, HDFS-10702.005.patch, HDFS-10702.006.patch, StaleReadfromStandbyNN.pdf
> Currently, clients must always talk to the active NameNode when performing any metadata
operation, which means active NameNode could be a bottleneck for scalability. One way to solve
this problem is to send read-only operations to Standby NameNode. The disadvantage is that
it might be a stale read. 
> Here, I'm thinking of adding a Client API to enable/disable stale read from Standby which
gives Client the power to set the staleness restriction.
