hadoop-user mailing list archives

From "Cecile, Adam" <Adam.Cec...@hitec.lu>
Subject RE: NameNode HA from a client perspective
Date Wed, 04 May 2016 08:26:13 GMT

I'm not sure I understand your answer. May I add a small piece of code:

def _build_hdfs_url(self, hdfs_path, hdfs_operation, opt_query_param_tuples=()):
    """
    :type hdfs_path: str
    :type hdfs_operation: str
    """
    if not hdfs_path.startswith("/"):
        raise WebHdfsException("The web hdfs path must start with / but found " + hdfs_path,
                               None, None)

    # self.host is the fixed NameNode address discussed below
    url = ('http://' + self.host + ':' + str(self.port) + '/webhdfs/v1' + hdfs_path
           + '?user.name=' + self.user + '&op=' + hdfs_operation)
    # Append the optional (key, value) query parameters
    for key, value in opt_query_param_tuples:
        url += "&{}={}".format(key, value)
    return url

This is a plain Python function, using only the standard distribution, extracted from an app.
The problem is "self.host": it has to be the IP address or DNS name of a NameNode, but I'd like
to turn it into something dynamic that resolves to the current active master.
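
One way to make "self.host" dynamic could be sketched as follows. This is only an illustration, not part of the original app: it assumes each NameNode exposes the NameNodeStatus JMX bean over HTTP on the same port as WebHDFS and that the bean reports a "State" of "active" or "standby". The names fetch_state and find_active_host are hypothetical.

```python
import json
from http.client import HTTPException
from urllib.request import urlopen

# Assumption: the NameNode status bean is served at this JMX path.
STATUS_QUERY = "/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus"


def fetch_state(host, port, timeout=5):
    """Return the HA state reported by one NameNode, or None if it cannot be read."""
    url = "http://{}:{}{}".format(host, port, STATUS_QUERY)
    try:
        with urlopen(url, timeout=timeout) as response:
            beans = json.loads(response.read().decode("utf-8")).get("beans", [])
    except (OSError, ValueError, HTTPException):
        # Unreachable host, HTTP error, or a body that is not valid JSON
        return None
    return beans[0].get("State") if beans else None


def find_active_host(candidates, port, probe=fetch_state):
    """Return the first candidate whose reported state is 'active'."""
    for host in candidates:
        if probe(host, port) == "active":
            return host
    raise RuntimeError("no active NameNode among " + ", ".join(candidates))
```

The URL builder could then call find_active_host with the configured list of NameNode hosts instead of reading a fixed self.host. The probe parameter exists only so that the lookup can be replaced in tests.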

Regards, Adam.

From: Sandeep Nemuri <nhsandeep6@gmail.com>
Sent: Wednesday, 4 May 2016 09:15
To: Cecile, Adam
Cc: user@hadoop.apache.org
Subject: Re: NameNode HA from a client perspective

I think you can simply use the nameservice (dfs.nameservices), which is defined in hdfs-site.xml.
The HDFS client should be able to resolve the current active NameNode and get the necessary
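
A plain HTTP client does not read hdfs-site.xml by itself, so the nameservice would have to be resolved by hand. The sketch below shows one possible way to list the NameNode HTTP addresses of a nameservice from the configuration text. It assumes the usual HA properties (dfs.nameservices, dfs.ha.namenodes.<nameservice>, dfs.namenode.http-address.<nameservice>.<id>) are set explicitly in the file; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Parse the text of an hdfs-site.xml file into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def namenode_http_addresses(props, nameservice=None):
    """List the host:port HTTP address of every NameNode in a nameservice."""
    # Default to the first nameservice declared in dfs.nameservices
    ns = nameservice or props["dfs.nameservices"].split(",")[0].strip()
    namenode_ids = props["dfs.ha.namenodes." + ns].split(",")
    return [
        props["dfs.namenode.http-address.{}.{}".format(ns, nn_id.strip())]
        for nn_id in namenode_ids
    ]
```

The resulting list only says which NameNodes exist, not which one is active; the client still has to probe them or fail over between them.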

Sandeep Nemuri

On Wed, May 4, 2016 at 12:04 PM, Cecile, Adam <Adam.Cecile@hitec.lu> wrote:

Hello All,

I'd like some advice on how my HDFS clients should handle the NameNode high availability
feature.
I have a complete setup running with ZKFC and I can see one active and one standby NameNode.
When I kill the active one, the standby becomes active, and when the original one comes back
online it turns into a standby node. Perfect.

However, I'm not sure how my client apps should handle this. A couple of ideas:
* Handle the bad HTTP code returned by the standby node and switch to the other one
* Integrate a ZooKeeper client to query for the current active node
* Hack something like a shared IP linked to the active node
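
The first idea could be sketched roughly as below. This is an illustration under assumptions, not a tested client: it assumes a standby NameNode answers WebHDFS requests with an HTTP error whose JSON body names a StandbyException inside a RemoteException object, and the helper names are hypothetical.

```python
import json
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def is_standby_error(error):
    """True if the HTTPError body looks like a WebHDFS StandbyException."""
    try:
        body = json.loads(error.read().decode("utf-8"))
        return body["RemoteException"]["exception"] == "StandbyException"
    except (ValueError, KeyError, TypeError):
        return False


def open_on_active(hosts, build_url, opener=urlopen):
    """Try each NameNode in turn; skip the ones that are standby or unreachable."""
    last_error = None
    for host in hosts:
        try:
            return opener(build_url(host))
        except HTTPError as error:
            # Any HTTP error other than the standby answer is a real failure
            if not is_standby_error(error):
                raise
            last_error = error
        except URLError as error:
            # Host down or connection refused: try the next NameNode
            last_error = error
    raise RuntimeError("no active NameNode reachable: {}".format(last_error))
```

Here build_url is any callable that turns a host into a full WebHDFS URL, so a URL-building function that takes the host as an argument could be passed in directly.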

Then I'll have to handle a switch that may occur during the execution of a client app: should
I just crash and rely on the cluster to restart the job?

Thanks in advance,

Best regards from Luxembourg.

  Sandeep Nemuri
