hadoop-hdfs-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Douglas <cdoug...@apache.org>
Subject Re: [DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk
Date Thu, 24 Aug 2017 20:39:53 GMT
I'd definitely support merging this to trunk. The implementation is
almost entirely outside of HDFS and, as Inigo detailed, has been
tested at scale. The branch is in a functional state with
documentation and tests. -C

On Mon, Aug 21, 2017 at 6:11 PM, Iñigo Goiri <elgoiri@gmail.com> wrote:
> Hi all,
>
>
>
> We would like to open a discussion on merging the Router-based Federation
> feature to trunk.
>
> Last week, there was a thread about which branches would go into 3.0 and
> given that YARN federation is going, this might be a good time for this to
> be merged too.
>
>
> We have been running "Router-based federation" in production for a year.
>
> Meanwhile, we have been releasing it in a feature branch (HDFS-10467 [1])
> for a while.
>
> We are reasonably confident that the state of the branch is about to meet
> the criteria to be merged onto trunk.
>
>
> *Feature*:
>
> This feature aggregates multiple namespaces into a single one transparently
> to the user.
>
> It has a similar architecture to YARN federation (YARN-2915).
>
> It consists on Routers that handle requests from the clients and forwards
> them to the right subcluster and exposes the same API as the Namenode.
>
> Currently we use a mount table (similar to ViewFs) but can be replaced by
> other approaches.
>
> The Routers share their state in a State Store.
>
>
>
> The main advantage is that clients interact with the Routers as they were
> Namenode so there is no changes in the client required other than poiting
> to the right address.
>
> In addition, all the management is moved to the server side so changes to
> the Mount Table can be done without having to sync the clients (pull/push).
>
>
>
> *Status*:
>
> The branch already contains all the features required to work end-to-end.
>
> There are a couple open JIRAs that would be required for the merged (i.e.,
> Web UI) but they should be finished soon.
>
> We have been running it in production for the last year and we have a paper
> with some of the details of our production deployment [2].
>
> We have 4 production deployments with the largest one spanning more than
> 20k servers across 6 subclusters.
>
> In addition, the guys at LinkedIn had started testing Router-based
> federation and they will be adding security to the branch.
>
>
>
> The modifications to the rest of HDFS are minimal:
>
>    - Changed visibility for some methods (e.g., MiniDFSCluster)
>    - Added some utilities to extract addresses
>    - Modified hdfs and hdfs.cmd to start the Router and manager the
>    federation
>    - Modified hdfs-default.xml
>
> Everything else is self-contained in a federation package.
>
> In addition, all the functionality is in the Router so it’s disabled by
> default.
>
> Even when enabled, there is no impact for regular HDFS and it would only
> require to configure the trust between the Namenode and the Router once
> security is enabled.
>
>
>
> I have been continuously rebasing the feature branch (updated up to 1 week
> ago) so the merge should be a straightforward cherry-pick.
>
>
>
> *Problems*:
>
> The problems I’m aware of are the following:
>
>    - We implement ClientProtocol so anytime a new method is added there, we
>    would need to add it to the Router. However, it’s straightforward to add
>    unimplemented methods.
>    - There is some argument about naming the feature as “Router-based
>    federation” but I’m open for better names.
>
>
>
> *Credits*:
>
> I’d like to thank the people at Microsoft (specially, Jason, Ricardo,
> Chris, Subru, Jakob, Carlo and Giovanni), Twitter (Ming and Gera), and
> LinkedIn (Zhe, Erik and Konstantin) for the discussion and the ideas.
>
> Special thanks to Chris Douglas for the thorough reviews!
>
>
>
> Please look through the branch; feedback is welcome. Thanks!
>
>
> Cheers,
>
> Inigo
>
>
>
>
> [1] https://issues.apache.org/jira/browse/HDFS-10467
>
> [2] https://www.usenix.org/conference/atc17/technical-
> sessions/presentation/misra

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org


Mime
View raw message