From "Qi Yun Liu" <>
Subject Re:Re: Re:Re: Re:Re: Re:How to get datanode numbers in
Date Tue, 21 Apr 2015 02:24:11 GMT
Hi Siddharth,

Per your comments, I created a review request for jira AMBARI-10515:

I am sure I used the correct method, #getHostsWithComponent(), and not #getHostWithComponent().
In my test I found that the method currently returns only
one host. I then looked at its code:
      if (len(components) > 0 and len(components[0]["StackServiceComponents"]["hostnames"]) > 0):
        # component available - determine hosts and memory
        componentHostname = components[0]["StackServiceComponents"]["hostnames"][0]
        componentHosts = [host for host in hosts["items"] if host["Hosts"]["host_name"] == componentHostname]
        return componentHosts
We can see that it returns only one host because of the line 'componentHostname = components[0]["StackServiceComponents"]["hostnames"][0]'.

So, to get all the hosts, I modified the code to add a for loop as below:
      if (len(components) > 0 and len(components[0]["StackServiceComponents"]["hostnames"]) > 0):
        componentHosts = []
        for index in range(len(components[0]["StackServiceComponents"]["hostnames"])):
          # component available - determine hosts and memory
          componentHostname = components[0]["StackServiceComponents"]["hostnames"][index]
          componentHosts.append([host for host in hosts["items"] if host["Hosts"]["host_name"] == componentHostname][0])
        return componentHosts
And now it returns all hosts as expected.
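
For comparison, the same result can be obtained without an index loop by filtering hosts["items"] against the whole hostname list. This is only a sketch and not the code in the AMBARI-10515 patch: the function name hosts_with_component and the sample host names are hypothetical, while the dictionary keys are the ones that appear in the snippets in this message.

```python
def hosts_with_component(components, hosts):
    """Return every host entry whose host_name is listed for the first component.

    `components` and `hosts` are assumed to have the shape used in this
    thread; the function name itself is hypothetical.
    """
    if not components:
        return []
    hostnames = set(components[0]["StackServiceComponents"]["hostnames"])
    # Result follows the order of hosts["items"]; hostnames that have no
    # matching entry in hosts["items"] are silently skipped.
    return [host for host in hosts["items"]
            if host["Hosts"]["host_name"] in hostnames]


# Made-up sample data, for illustration only.
components = [{"StackServiceComponents": {"hostnames": ["host-a", "host-b"]}}]
hosts = {"items": [
    {"Hosts": {"host_name": "host-a"}},
    {"Hosts": {"host_name": "host-b"}},
    {"Hosts": {"host_name": "host-c"}},
]}

datanode_hosts = hosts_with_component(components, hosts)
print(len(datanode_hosts))
```

One difference from the for-loop version: indexing with [0] raises IndexError when a hostname has no entry in hosts["items"], whereas the membership filter simply leaves that hostname out.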


At 2015-04-18 02:28:59, "Srimanth Gunturi" <> wrote:

Hi Qi Yun Liu,

On the trunk source #getHostsWithComponent() returns all hosts containing the component.

There is also #getHostWithComponent() which returns the first of those hosts.

I am hoping the right method is being used.

Also, #recommendHDFSConfigurations() has access to the whole component layout (in the services variable)
and all details about the hosts (in the hosts variable), so you can make additional improvements.
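
As a rough illustration of using the services variable directly, the number of datanodes could be read from the component layout without touching the hosts variable at all. This is a sketch under assumptions: the keys "services", "StackServices", "service_name", "components" and "component_name" are assumed names for the layout structure and are not confirmed anywhere in this thread (only "StackServiceComponents" and "hostnames" are), and count_component_hosts is a hypothetical helper.

```python
def count_component_hosts(services, service_name, component_name):
    """Count the hosts assigned to one component in the layout.

    The key names used to walk `services` are assumptions about the
    layout structure; adjust them to the real payload.
    """
    for service in services.get("services", []):
        if service["StackServices"]["service_name"] != service_name:
            continue
        for component in service.get("components", []):
            info = component["StackServiceComponents"]
            if info["component_name"] == component_name:
                return len(info["hostnames"])
    # Service or component not present in the layout.
    return 0


# Made-up layout, for illustration only.
services = {"services": [{
    "StackServices": {"service_name": "HDFS"},
    "components": [
        {"StackServiceComponents": {"component_name": "NAMENODE",
                                    "hostnames": ["host-a"]}},
        {"StackServiceComponents": {"component_name": "DATANODE",
                                    "hostnames": ["host-a", "host-b"]}},
    ],
}]}

datanode_count = count_component_hosts(services, "HDFS", "DATANODE")
print(datanode_count)
```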

Hope that helps.



From: Qi Yun Liu <>
Sent: Friday, April 17, 2015 4:38 AM
Subject: Re:Re: Re:Re: Re:How to get datanode numbers in
Hi Siddharth,

Yes, I agree that it made sense to get only 1 host when we were not supporting heterogeneous
environments. At present, we might support heterogeneous environments, so it also makes sense
to update this method accordingly. In addition, I really need this method to return all hosts,
and it is possible that others have similar requirements. Therefore, I created
a patch for jira AMBARI-10515 and ran some tests to ensure that it introduces no regressions. Could
you please help review it?

Thanks a lot!

At 2015-04-17 00:34:51, "Siddharth Wagle" <> wrote:

By we, I meant the stack advisor feature only.


From: Siddharth Wagle
Sent: Thursday, April 16, 2015 9:34 AM
Subject: Re: Re:Re: Re:How to get datanode numbers in

Hi Qi Yun Liu,

I believe returning 1 host is the intended behavior, since at the moment we are not supporting heterogeneous
environments. That is why any 1 of the candidate hosts is chosen to represent the cpu / memory
/ disk characteristics used for recommending configurations for a component.
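
To make the trade-off concrete, here is a small sketch contrasting "take the first candidate host" with a more conservative choice that looks at every host. It is an illustration only and does not describe what the stack advisor actually does: both helper names are hypothetical and the memory figures are made up; the "total_mem" key is the one that appears in the host details elsewhere in this thread.

```python
def first_host_memory(component_hosts):
    """Memory of the first candidate host (fine when all hosts are alike)."""
    if not component_hosts:
        return None
    return component_hosts[0]["Hosts"]["total_mem"]


def smallest_host_memory(component_hosts):
    """Memory of the smallest host (a conservative pick for mixed hardware)."""
    return min((host["Hosts"]["total_mem"] for host in component_hosts),
               default=None)


# Made-up hosts with different amounts of memory.
component_hosts = [
    {"Hosts": {"host_name": "host-a", "total_mem": 8062836}},
    {"Hosts": {"host_name": "host-b", "total_mem": 4031416}},
]

print(first_host_memory(component_hosts))
print(smallest_host_memory(component_hosts))
```

On identical hosts the two helpers agree; they diverge only when the hosts differ, which is the heterogeneous case discussed here.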

Srimanth can attest to this.



From: Qi Yun Liu <>
Sent: Wednesday, April 15, 2015 11:34 PM
Subject: Re:Re: Re:How to get datanode numbers in
Hi Siddharth,

Thanks a lot for your comments! Following your suggestions, I did a test:
1. Launch Ambari server GUI and start a brand new cluster installation
2. In 'Assign Slaves and Clients' page, select two hosts(, as
the datanodes
3. After clicking Next button, I found in the ambari-server/resources/stacks/HDP/2.2/services/,
the value of datanodeHosts got by line 'datanodeHosts = self.getHostsWithComponent("HDFS",
"DATANODE", services, hosts)' only includes 1 host info as below and its length is 1 not 2:
datanodeHosts= [{u'href': u'/api/v1/hosts/', u'Hosts': {u'last_heartbeat_time':
1429155431532, u'recovery_summary': u'DISABLED', u'host_health_report': u'', u'disk_info':
[{u'available': u'83808976', u'used': u'7951800', u'percent': u'9%', u'device': u'/dev/mapper/vg_sdsvm923094-lv_root',
u'mountpoint': u'/', u'type': u'ext4', u'size': u'96671468'}, {u'available': u'4031416', u'used':
u'0', u'percent': u'0%', u'device': u'tmpfs', u'mountpoint': u'/dev/shm', u'type': u'tmpfs',
u'size': u'4031416'}, {u'available': u'378216', u'used': u'92028', u'percent': u'20%', u'device':
u'/dev/sda1', u'mountpoint': u'/boot', u'type': u'ext4', u'size': u'495844'}], u'desired_configs':
None, u'cpu_count': 2, u'recovery_report': {u'component_reports': [], u'summary': u'DISABLED'},
u'host_state': u'HEALTHY', u'os_arch': u'x86_64', u'total_mem': 8062836, u'host_status': u'HEALTHY',
u'last_registration_time': 1429153847302, u'os_family': u'redhat6', u'host_name': u'',
u'ip': u'', u'rack_info': u'/default-rack', u'os_type': u'redhat6', u'last_agent_env':
{u'transparentHugePage': u'never', u'hostHealth': {u'agentTimeStampAtReporting': 1429155391291,
u'activeJavaProcs': [], u'serverTimeStampAtReporting': 1429155391348, u'liveServices': [{u'status':
u'Healthy', u'name': u'ntpd', u'desc': u''}]}, u'umask': 18, u'reverseLookup': True, u'alternatives':
[], u'existingUsers': [], u'firewallName': u'iptables', u'stackFoldersAndFiles': [{u'type':
u'directory', u'name': u'/etc/slider'}], u'existingRepos': [], u'installedPackages': [], u'firewallRunning':
False}, u'public_host_name': u'', u'ph_cpu_count': 2}}]

I think the correct value of datanodeHosts should include not only '' but also
'', and the length of datanodeHosts should be 2, not 1, because both hosts
were selected as datanode hosts.

Is it a bug? 

At 2015-04-11 13:09:33, "Siddharth Wagle" <> wrote:

Hi Qi Yun Liu,

This method is what you are looking for : stacks/HDP/2.0.6/services



datanodeHosts = self.getHostsWithComponent("HDFS", "DATANODE", services, hosts)

In: stacks/HDP/2.2/services/




From: Qi Yun Liu <>
Sent: Friday, April 10, 2015 7:13 PM
Subject: Re:How to get datanode numbers in
I just want to get the number of datanodes in,
configurations, clusterData)?

Could anyone help me?

Thanks in advance!

At 2015-04-10 15:09:18, "Qi Yun Liu" <> wrote:

Hi Experts,

How can I get the number of datanodes in, configurations,
clusterData), using its input parameters 'self', 'configurations' or 'clusterData'? At the
same time, another method, 'def getComponentLayoutValidations(self, services, hosts)', has an
input parameter 'services', so it can get nameNodeHosts using 'services', but I failed to
get the parameter 'services' in method recommendHDFSConfigurations.

Any comments?

Thanks a lot!

