hadoop-mapreduce-user mailing list archives

From Brahma Reddy Battula <brahmareddy.batt...@huawei.com>
Subject RE: NameNode HA from a client perspective
Date Wed, 04 May 2016 09:10:15 GMT
1. Have a list of namenodes, built from the configuration.
2. Execute the operation on each namenode until it succeeds.
3. Record the successful namenode URL as the active namenode, and use it for subsequent operations.
4. Whenever a StandbyException or a network exception (other than remote exceptions) occurs,
repeat #2 and #3, starting from the next namenode URL in the list.
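
A minimal sketch of these four steps, using only the Python 3 standard library; the class name, the default web port 50070, and the simplified error handling (any failure triggers failover to the next namenode) are assumptions to adapt, not a confirmed recipe:

import urllib.error
import urllib.request

class ActiveNameNodeTracker(object):
    def __init__(self, namenode_hosts, port=50070):
        # 1. List of namenodes, built from configuration.
        self.hosts = list(namenode_hosts)
        self.port = port
        self.active = None  # 3. Cached once an operation succeeds.

    def request(self, path, op, user):
        # Try the cached active namenode first, then the rest of the list.
        candidates = [self.active] if self.active else []
        candidates += [h for h in self.hosts if h != self.active]
        for host in candidates:
            url = ('http://%s:%d/webhdfs/v1%s?user.name=%s&op=%s'
                   % (host, self.port, path, user, op))
            try:
                # 2. Execute the op on this namenode.
                body = urllib.request.urlopen(url, timeout=5).read()
                self.active = host  # 3. Remember it as the active one.
                return body
            except (urllib.error.HTTPError, urllib.error.URLError):
                # 4. Standby rejection or network error: try the next one.
                continue
        raise RuntimeError('No namenode accepted the request')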


--Brahma Reddy Battula

From: Cecile, Adam [mailto:Adam.Cecile@hitec.lu]
Sent: 04 May 2016 16:26
To: Sandeep Nemuri
Cc: user@hadoop.apache.org
Subject: RE: NameNode HA from a client perspective


Hello,



I'm not sure I understand your answer; let me add a small piece of code:



def _build_hdfs_url(self, hdfs_path, hdfs_operation, opt_query_param_tuples=None):
    """
    :type hdfs_path: str
    :type hdfs_operation: str
    """
    if not hdfs_path.startswith("/"):
        raise WebHdfsException("The web hdfs path must start with / but found " + hdfs_path,
                               None, None)

    url = ('http://' + self.host + ':' + str(self.port) + '/webhdfs/v1' + hdfs_path
           + '?user.name=' + self.user + '&op=' + hdfs_operation)
    # Append any optional query parameters; a None default avoids the
    # shared mutable default argument pitfall.
    for key, value in (opt_query_param_tuples or []):
        url += "&{}={}".format(key, str(value))
    return url



Here is a plain Python standard-distribution function extracted from an app. The problem
is "self.host": it has to be the IP address or DNS name of the NameNode, whereas I'd like to turn
this into something dynamic that resolves to the current active master.
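
One hedged way to make "self.host" dynamic: Hadoop 2.x NameNodes expose a NameNodeStatus JMX bean on their web port whose State attribute reports "active" or "standby", so a client can probe each candidate at startup. A minimal sketch; the host names and default port 50070 are placeholders, and the bean's availability should be verified against your Hadoop version:

import json
import urllib.request

def resolve_active_namenode(hosts, port=50070):
    # Probe the NameNodeStatus JMX bean exposed by the namenode web UI
    # and return the first host that reports itself as active.
    for host in hosts:
        url = ('http://%s:%d/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus'
               % (host, port))
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                beans = json.load(response)['beans']
            if beans and beans[0].get('State') == 'active':
                return host
        except (IOError, ValueError, KeyError):
            continue  # Unreachable or unexpected payload: try the next host.
    raise RuntimeError('No active NameNode among: %s' % ', '.join(hosts))

# Example usage (placeholder hosts):
# self.host = resolve_active_namenode(['namenode1.example.com', 'namenode2.example.com'])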



Regards, Adam.



________________________________
From: Sandeep Nemuri <nhsandeep6@gmail.com<mailto:nhsandeep6@gmail.com>>
Sent: Wednesday, 4 May 2016 09:15
To: Cecile, Adam
Cc: user@hadoop.apache.org<mailto:user@hadoop.apache.org>
Subject: Re: NameNode HA from a client perspective

I think you can simply use the nameservice (dfs.nameservices) defined in hdfs-site.xml.
The HDFS client should be able to resolve the current active namenode and get the necessary
information.
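
For reference, the client-side entries for an HA nameservice in hdfs-site.xml typically look like the sketch below; the nameservice name "mycluster" and the host names are placeholders. Note that this drives the native HDFS RPC client, so a hand-rolled HTTP/WebHDFS client would still need its own failover logic:

<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>namenode1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>namenode2.example.com:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>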

Thanks,
Sandeep Nemuri

On Wed, May 4, 2016 at 12:04 PM, Cecile, Adam <Adam.Cecile@hitec.lu<mailto:Adam.Cecile@hitec.lu>>
wrote:

Hello All,


I'd like some advice on how my HDFS clients should handle the NameNode
high-availability feature.
I have a complete setup running with ZKFC, and I can see one active and one standby NameNode.
When I kill the active one, the standby becomes active, and when the original one gets back online
it turns into a standby node. Perfect.

However, I'm not sure how my client apps should handle this. A couple of ideas:
* Handle the bad HTTP code from the standby node and switch to the other one (see the sketch after this list)
* Integrate a ZooKeeper client to query for the current active node
* Hack something like a shared IP linked to the active node
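
A rough sketch of the first idea: a standby NameNode typically rejects WebHDFS operations with an HTTP error whose JSON body wraps a StandbyException, which a client can detect and use to fail over. The exact status code and JSON layout should be verified against your Hadoop version:

import json
import urllib.error
import urllib.request

def is_standby_response(http_error):
    # True if the HTTP error body wraps a StandbyException, i.e. we hit
    # the standby namenode rather than a genuine failure.
    try:
        body = json.loads(http_error.read().decode('utf-8'))
        return body['RemoteException']['exception'] == 'StandbyException'
    except (ValueError, KeyError):
        return False

def get_with_failover(urls):
    # Try each candidate URL, skipping hosts that answer as standby.
    for url in urls:
        try:
            return urllib.request.urlopen(url, timeout=5).read()
        except urllib.error.HTTPError as err:
            if not is_standby_response(err):
                raise  # A real error, not a standby rejection.
        except urllib.error.URLError:
            pass  # Host unreachable: try the next candidate.
    raise RuntimeError('All NameNode candidates failed or are standby')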

Then I'll have to handle a switchover that may occur during the execution of a client app: should
I just crash and rely on the cluster to restart the job?


Thanks in advance,

Best regards from Luxembourg.



--
  Regards
  Sandeep Nemuri