hadoop-common-issues mailing list archives

From "Anu Engineer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15774) Discovery of HA servers
Date Fri, 21 Sep 2018 18:22:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624003#comment-16624003 ]

Anu Engineer commented on HADOOP-15774:
---------------------------------------

{quote}[~anu] any thoughts on the general approach? I'm not very familiar with the Ozone architecture;
do you guys also need to specify servers manually?
{quote}
[~elgoiri] Thanks for posting the design docs. My apologies for the delayed response; I have been
busy with the Ozone Alpha release.

I have proposed the same idea a couple of times internally. Each time it got shot down because
of the issue that [~stevel@apache.org] is referring to.
{quote}Being part of YARN may bring some weird dependencies (HDFS->YARN), any ideas
here? Should it be moved to a separate place? Maybe parts?
{quote}
The most important concern was always HDFS taking a dependency on YARN. It creates a cyclical
dependency.

Ozone needs and plans to build something very similar to this, but our thought has been to
create something like this in SCM.

From both the Hadoop and the Ozone perspective, the considerations for such a service should be
slightly different. It is possible for someone to run a cluster without HDFS or even without
YARN. A discovery service should be independent of these specific services.

If we are planning to make this a Hadoop-level discovery service, we should probably build
it independently of all other services. If possible, even independently of ZooKeeper.

Then this service can be the bootup service, and all other services, including HDFS, YARN and
even ZooKeeper, can use it to discover endpoints. Many services need to find the
address of ZooKeeper too.
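
As a rough illustration of the idea (a minimal sketch only; `DiscoveryRegistry`, its methods, and the example.com addresses are hypothetical and not an existing Hadoop or Ozone API), a standalone service of this kind could expose little more than register and lookup operations keyed by service name:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DiscoveryRegistry:
    """Hypothetical in-memory registry; it depends on no other service."""

    _endpoints: Dict[str, Set[str]] = field(default_factory=dict)

    def register(self, service: str, address: str) -> None:
        # A server announces one of its endpoints under a service name.
        self._endpoints.setdefault(service, set()).add(address)

    def deregister(self, service: str, address: str) -> None:
        # Removing an unknown address is a no-op.
        self._endpoints.get(service, set()).discard(address)

    def lookup(self, service: str) -> List[str]:
        # Clients ask by service name instead of reading a config file.
        return sorted(self._endpoints.get(service, set()))


registry = DiscoveryRegistry()
registry.register("zookeeper", "zk1.example.com:2181")
registry.register("namenode", "nn1.example.com:8020")
registry.register("namenode", "nn2.example.com:8020")
print(registry.lookup("namenode"))
```

A real service would also need persistence, its own HA story, and a way for clients to find the discovery service itself; none of that is shown here.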

If this service can be built independent of all other services, Ozone can certainly use this,
and we don't need to reinvent discovery.

Another critical issue that Ozone struggles with is config changes: for
example, in the HA case, if a server fails and a new server is added, Datanodes need to be told
of that.

Ozone solves this by reading these changes from SCM via heartbeat, not only from config.
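
To make the heartbeat approach concrete, here is a minimal sketch under stated assumptions: `ConfigSource` stands in for the server side (SCM in Ozone's case), `Node` for a Datanode, and the version counter and hostnames are invented for illustration. It is not Ozone's actual protocol.

```python
from typing import List, Optional, Tuple


class ConfigSource:
    """Hypothetical server side that owns the current list of HA servers."""

    def __init__(self, servers: List[str]) -> None:
        self.version = 1
        self.servers = list(servers)

    def replace_server(self, old: str, new: str) -> None:
        # A failed server is swapped for a new one; bump the version.
        self.servers = [new if s == old else s for s in self.servers]
        self.version += 1

    def heartbeat(self, known_version: int) -> Optional[Tuple[int, List[str]]]:
        # Reply with the server list only when the caller is out of date.
        if known_version == self.version:
            return None
        return self.version, list(self.servers)


class Node:
    """Hypothetical node that starts from static config, then follows heartbeats."""

    def __init__(self, bootstrap_servers: List[str]) -> None:
        self.version = 0
        self.servers = list(bootstrap_servers)

    def send_heartbeat(self, source: ConfigSource) -> None:
        update = source.heartbeat(self.version)
        if update is not None:
            self.version, self.servers = update


source = ConfigSource(["scm1.example.com", "scm2.example.com"])
node = Node(["scm1.example.com"])
node.send_heartbeat(source)
source.replace_server("scm1.example.com", "scm3.example.com")
node.send_heartbeat(source)
print(node.servers)
```

The static config is used only to bootstrap; after the first heartbeat the node tracks whatever the server side reports.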

But this problem is more generic, in the sense that there is a class of configuration changes
that need to be pushed to all entities in the cluster. We have been thinking about building
a discovery and config store for Ozone.

> Discovery of HA servers
> -----------------------
>
>                 Key: HADOOP-15774
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15774
>             Project: Hadoop Common
>          Issue Type: New Feature
>            Reporter: Íñigo Goiri
>            Priority: Major
>         Attachments: Discovery Service.pdf
>
>
> Currently, Hadoop relies on configuration files to specify the servers.
> This requires maintaining these configuration files and propagating the changes.
> Hadoop should have a framework to provide discovery.
> For example, in HDFS, we could define the Namenodes in a shared location and the DNs
> would use the framework to find the Namenodes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

