hadoop-hdfs-dev mailing list archives

From Manoj Govindassamy <man...@cloudera.com>
Subject Re: [DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk
Date Tue, 29 Aug 2017 22:02:16 GMT
Hi Inigo and team,

Great work guys. Good to know that you have this feature already running in production at
a massive scale. 

1. A consolidated patch will be very useful for looking at the implementation details of the
feature end-to-end. 
2. The design doc has good details on the feature. Do you have any other docs/write-ups detailing
pros/cons compared to the existing HDFS Federation feature?
3. Are there any recommended best-practice mount table configurations for downstream projects?


Thanks,
Manoj G

> On Aug 28, 2017, at 8:02 PM, Iñigo Goiri <elgoiri@gmail.com> wrote:
> 
> Brahma, thank you for the comments.
> i) I can send a patch with the diff between branches.
> ii) Working with Giovanni for the review.
> iii) We had some numbers in our cluster.
> iv) We could have a Router just for giving a view of all the namespaces
> without giving RPC access. Another case might be only allowing WebHDFS
> and not RPC. We could consolidate nevertheless.
> I will open a JIRA to extend the documentation with the configuration keys.
> v) I'm open to doing more tests. I think the guys from LinkedIn wanted to test
> some more frameworks in their dev setup. In addition, before merging, I'd
> run the version in trunk for a few days.
> v) Good catches, I'll open JIRAs for those.
> 
> On Mon, Aug 28, 2017 at 6:12 AM, Brahma Reddy Battula <
> brahmareddy.battula@huawei.com> wrote:
> 
>> Nice feature, great work guys. Looking forward to getting this in, as
>> YARN federation is already in.
>> 
>> At first glance I have a few questions:
>> 
>> i) Could we have a consolidated patch for easier review?
>> 
>> ii) I hope "Federation Metrics" and "Federation UI" will be included.
>> 
>> iii) Do we have RPC benchmarks?
>> 
>> iv) As of now, "dfs.federation.router.rpc.enable" and
>> "dfs.federation.router.store.enable" are set to "true". Do we need to keep
>> these configs, since the Router might not be useful without them?
>> 
>> iv) bq. The rest of the options are documented in [hdfs-default.xml]
>> I feel it would be better to document all the configurations. There are
>> quite a few, so how about documenting them in tabular format?
>> 
>> v) Downstream project (Spark, HBase, Hive, ...) integration testing: is
>> what you mentioned enough?
>> 
>> v) mvn install (and package) is failing with the following error:
>> 
>> [INFO]   Adding ignore: *
>> [WARNING] Rule 1: org.apache.maven.plugins.enforcer.BanDuplicateClasses
>> failed with message:
>> Duplicate classes found:
>> 
>>  Found in:
>>    org.apache.hadoop:hadoop-client-minicluster:jar:3.0.0-beta1-SNAPSHOT:compile
>>    org.apache.hadoop:hadoop-client-runtime:jar:3.0.0-beta1-SNAPSHOT:compile
>>  Duplicate classes:
>>    org/apache/hadoop/shaded/org/apache/curator/framework/api/DeleteBuilder.class
>>    org/apache/hadoop/shaded/org/apache/curator/framework/CuratorFramework.class
>> 
>> 
>> I added "hadoop-client-minicluster" to the ignore list to get the build to succeed:
>> 
>> hadoop\hadoop-client-modules\hadoop-client-integration-tests\pom.xml
>> 
>>                  <dependencies>
>>                    <dependency>
>>                      <groupId>org.apache.hadoop</groupId>
>>                      <artifactId>hadoop-annotations</artifactId>
>>                      <ignoreClasses>
>>                        <ignoreClass>*</ignoreClass>
>>                      </ignoreClasses>
>>                    </dependency>
>>                    <dependency>
>>                      <groupId>org.apache.hadoop</groupId>
>>                      <artifactId>hadoop-client-minicluster</artifactId>
>>                      <ignoreClasses>
>>                        <ignoreClass>*</ignoreClass>
>>                      </ignoreClasses>
>>                    </dependency>
>> 
>> 
>> Please correct me if I am wrong.
>> 
>> 
>> --Brahma Reddy Battula
>> 
>> -----Original Message-----
>> From: Chris Douglas [mailto:cdouglas@apache.org]
>> Sent: 25 August 2017 06:37
>> To: Andrew Wang
>> Cc: Iñigo Goiri; hdfs-dev@hadoop.apache.org; subru@apache.org
>> Subject: Re: [DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk
>> 
>> On Thu, Aug 24, 2017 at 2:25 PM, Andrew Wang <andrew.wang@cloudera.com>
>> wrote:
>>> Do you mind holding this until 3.1? Same reasoning as for the other
>>> branch merge proposals, we're simply too late in the 3.0.0 release cycle.
>> 
>> That wouldn't be too dire.
>> 
>> That said, this has the same design and impact as YARN federation.
>> Specifically, it sits almost entirely outside core HDFS, so it will not
>> affect clusters running without R-BF.
>> 
>> Merging would allow the two router implementations to converge on a common
>> backend, which has started with HADOOP-14741 [1]. If the HDFS side only
>> exists in 3.1, then that work would complicate maintenance of YARN in
>> 3.0.x, which may require bug fixes as it stabilizes.
>> 
>> Merging lowers costs for maintenance with a nominal risk to stability.
>> The feature is well tested, deployed, and actively developed. The
>> modifications to core HDFS [2] (~23k) are trivial.
>> 
>> So I'd still advocate for this particular merge on those merits. -C
>> 
>> [1] https://issues.apache.org/jira/browse/HADOOP-14741
>> [2] git diff --diff-filter=M $(git merge-base apache/HDFS-10467 apache/trunk)..apache/HDFS-10467
>> 
>>> On Thu, Aug 24, 2017 at 1:39 PM, Chris Douglas <cdouglas@apache.org> wrote:
>>>> 
>>>> I'd definitely support merging this to trunk. The implementation is
>>>> almost entirely outside of HDFS and, as Inigo detailed, has been
>>>> tested at scale. The branch is in a functional state with
>>>> documentation and tests. -C
>>>> 
>>>> On Mon, Aug 21, 2017 at 6:11 PM, Iñigo Goiri <elgoiri@gmail.com> wrote:
>>>>> Hi all,
>>>>> 
>>>>> 
>>>>> 
>>>>> We would like to open a discussion on merging the Router-based
>>>>> Federation feature to trunk.
>>>>> 
>>>>> Last week, there was a thread about which branches would go into
>>>>> 3.0 and, given that YARN federation is going in, this might be a good
>>>>> time for this to be merged too.
>>>>> 
>>>>> 
>>>>> We have been running "Router-based federation" in production for a
>>>>> year.
>>>>> 
>>>>> Meanwhile, we have been releasing it in a feature branch
>>>>> (HDFS-10467 [1]) for a while.
>>>>> 
>>>>> We are reasonably confident that the state of the branch is about
>>>>> to meet the criteria to be merged onto trunk.
>>>>> 
>>>>> 
>>>>> *Feature*:
>>>>> 
>>>>> This feature aggregates multiple namespaces into a single one
>>>>> transparently to the user.
>>>>> 
>>>>> It has a similar architecture to YARN federation (YARN-2915).
>>>>> 
>>>>> It consists of Routers that handle requests from the clients,
>>>>> forward them to the right subcluster, and expose the same API as
>>>>> the Namenode.
>>>>> 
>>>>> Currently we use a mount table (similar to ViewFs), but it can be
>>>>> replaced by other approaches.
>>>>> 
>>>>> The Routers share their state in a State Store.
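To make the mount table idea concrete, here is a minimal sketch of how a Router-like component might map a client path to a subcluster using longest-prefix matching. This is only an illustration under assumptions: the table entries, the subcluster names (ns0, ns1, ns2) and the resolve() helper are hypothetical and are not taken from the HDFS-10467 branch.

```python
from typing import Dict, Optional, Tuple

# Hypothetical mount table: mount point -> (subcluster, path inside that subcluster).
# All entries are made up for illustration.
MOUNT_TABLE: Dict[str, Tuple[str, str]] = {
    "/": ("ns0", "/"),
    "/data": ("ns1", "/data"),
    "/data/logs": ("ns2", "/logs"),
}


def resolve(
    path: str, table: Dict[str, Tuple[str, str]] = MOUNT_TABLE
) -> Optional[Tuple[str, str]]:
    """Return (subcluster, remote path) for the longest mount point covering path."""
    best = None
    for mount in table:
        # A mount point matches if it equals the path or is a parent directory of it.
        prefix = mount if mount.endswith("/") else mount + "/"
        if path == mount or path.startswith(prefix):
            if best is None or len(mount) > len(best):
                best = mount
    if best is None:
        return None
    subcluster, target = table[best]
    # Re-root the part of the path below the mount point onto the target path.
    remainder = path[len(best):].lstrip("/")
    if not remainder:
        return subcluster, target
    return subcluster, target.rstrip("/") + "/" + remainder
```

With this table, resolve("/data/logs/app.log") picks the "/data/logs" entry rather than "/data" or "/", because the most specific mount point wins. Since the table lives on the server side, changing an entry changes the routing without touching any client.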
>>>>> 
>>>>> 
>>>>> 
>>>>> The main advantage is that clients interact with the Routers as if
>>>>> they were Namenodes, so no changes are required in the client
>>>>> other than pointing to the right address.
>>>>> 
>>>>> In addition, all the management is moved to the server side so
>>>>> changes to the Mount Table can be done without having to sync the
>>>>> clients (pull/push).
>>>>> 
>>>>> 
>>>>> 
>>>>> *Status*:
>>>>> 
>>>>> The branch already contains all the features required to work
>>>>> end-to-end.
>>>>> 
>>>>> There are a couple of open JIRAs that would be required for the merge
>>>>> (i.e., Web UI), but they should be finished soon.
>>>>> 
>>>>> We have been running it in production for the last year and we have
>>>>> a paper with some of the details of our production deployment [2].
>>>>> 
>>>>> We have 4 production deployments with the largest one spanning more
>>>>> than 20k servers across 6 subclusters.
>>>>> 
>>>>> In addition, the guys at LinkedIn have started testing Router-based
>>>>> federation and they will be adding security to the branch.
>>>>> 
>>>>> 
>>>>> 
>>>>> The modifications to the rest of HDFS are minimal:
>>>>> 
>>>>>   - Changed visibility for some methods (e.g., MiniDFSCluster)
>>>>>   - Added some utilities to extract addresses
>>>>>   - Modified hdfs and hdfs.cmd to start the Router and manage the
>>>>>   federation
>>>>>   - Modified hdfs-default.xml
>>>>> 
>>>>> Everything else is self-contained in a federation package.
>>>>> 
>>>>> In addition, all the functionality is in the Router so it’s
>>>>> disabled by default.
>>>>> 
>>>>> Even when enabled, there is no impact on regular HDFS and it would
>>>>> only require configuring trust between the Namenode and the
>>>>> Router once security is enabled.
>>>>> 
>>>>> 
>>>>> 
>>>>> I have been continuously rebasing the feature branch (updated up to
>>>>> 1 week ago) so the merge should be a straightforward cherry-pick.
>>>>> 
>>>>> 
>>>>> 
>>>>> *Problems*:
>>>>> 
>>>>> The problems I’m aware of are the following:
>>>>> 
>>>>>   - We implement ClientProtocol, so any time a new method is added
>>>>>   there, we would need to add it to the Router. However, it’s
>>>>>   straightforward to add unimplemented methods.
>>>>>   - There is some argument about naming the feature “Router-based
>>>>>   federation”, but I’m open to better names.
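The ClientProtocol point can be illustrated with a small sketch. Assuming a hypothetical two-method interface (ClientProtocolSketch below is a stand-in, not the real ClientProtocol, and the method names are chosen only for illustration), a Router-like class must provide every method of the interface, and a method it cannot serve yet can be added as a stub that fails explicitly.

```python
from abc import ABC, abstractmethod


class ClientProtocolSketch(ABC):
    """Hypothetical stand-in for the client-facing interface."""

    @abstractmethod
    def get_file_info(self, path: str) -> dict:
        ...

    @abstractmethod
    def set_quota(self, path: str, quota: int) -> None:
        ...


class FakeNamenode(ClientProtocolSketch):
    """Toy backend that implements the whole interface."""

    def __init__(self) -> None:
        self.quotas = {}

    def get_file_info(self, path: str) -> dict:
        return {"path": path, "length": 0}

    def set_quota(self, path: str, quota: int) -> None:
        self.quotas[path] = quota


class RouterSketch(ClientProtocolSketch):
    """Toy Router: forwards what it supports and stubs out the rest."""

    def __init__(self, namenode: ClientProtocolSketch) -> None:
        self.namenode = namenode

    def get_file_info(self, path: str) -> dict:
        # Supported call: forward it to the backing Namenode unchanged.
        return self.namenode.get_file_info(path)

    def set_quota(self, path: str, quota: int) -> None:
        # Newly added interface method: present, but explicitly unimplemented.
        raise NotImplementedError("set_quota is not supported by this Router sketch")
```

If the set_quota stub were left out of RouterSketch, Python would refuse to instantiate the class, which loosely mirrors the situation described above: each new interface method needs at least a placeholder in the Router.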
>>>>> 
>>>>> 
>>>>> 
>>>>> *Credits*:
>>>>> 
>>>>> I’d like to thank the people at Microsoft (especially Jason,
>>>>> Ricardo, Chris, Subru, Jakob, Carlo and Giovanni), Twitter (Ming
>>>>> and Gera), and LinkedIn (Zhe, Erik and Konstantin) for the
>>>>> discussion and the ideas.
>>>>> 
>>>>> Special thanks to Chris Douglas for the thorough reviews!
>>>>> 
>>>>> 
>>>>> 
>>>>> Please look through the branch; feedback is welcome. Thanks!
>>>>> 
>>>>> 
>>>>> Cheers,
>>>>> 
>>>>> Inigo
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> [1] https://issues.apache.org/jira/browse/HDFS-10467
>>>>> 
>>>>> [2] https://www.usenix.org/conference/atc17/technical-sessions/presentation/misra
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
>>>> For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org
>>>> 
>>> 
>> 



