Subject: Re: decommission multiple nodes issue
From: Yusaku Sako <yusaku@hortonworks.com>
To: "user@ambari.apache.org" <user@ambari.apache.org>, Sean Roberts <sroberts@hortonworks.com>
Date: Wed, 4 Mar 2015 04:49:41 +0000
BTW, I've started a new Wiki on decommissioning DataNodes: https://cwiki.apache.org/confluence/display/AMBARI/API+to+decommission+DataNodes

Yusaku

From: Yusaku Sako <yusaku@hortonworks.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Tuesday, March 3, 2015 7:41 PM
To: "user@ambari.apache.org" <user@ambari.apache.org>, Sean Roberts <sroberts@hortonworks.com>
Subject: Re: decommission multiple nodes issue

Hi Greg,

This is actually by design.
If you want to decommission all DataNodes regardless of their host maintenance mode, you need to change "RequestInfo/level" from "CLUSTER" to "HOST_COMPONENT".
When you set the "level" to "CLUSTER", bulk operations (in this case decommission) are skipped on the matching target resources when the host(s) are in maintenance mode.
If you set it to "HOST_COMPONENT", it ignores any host-level maintenance mode.
This is a really mysterious, undocumented part of Ambari, unfortunately.

Yusaku

From: Greg Hill <greg.hill@RACKSPACE.COM>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Tuesday, March 3, 2015 9:32 AM
To: Sean Roberts <sroberts@hortonworks.com>, "user@ambari.apache.org" <user@ambari.apache.org>
Subject: Re: decommission multiple nodes issue

I have verified that if maintenance mode is set on a host, then it is ignored by the decommission process, but only if you try to decommission multiple hosts at the same time. I'll open a bug.

Greg

From: Sean Roberts <sroberts@hortonworks.com>
Date: Monday, March 2, 2015 at 1:34 PM
To: Greg <greg.hill@rackspace.com>, "user@ambari.apache.org" <user@ambari.apache.org>
Subject: Re: decommission multiple nodes issue

Greg - Same here on submitting JSON. Although they are JSON documents, you have to submit them as plain form. This is true across all of Ambari. I opened a bug for it a month back.


-- 
Hortonworks - We do Hadoop

Sean Roberts
Partner Solutions Engineer - EMEA
@seano

From: Greg Hill <greg.hill@rackspace.com>
Date: March 2, 2015 at 19:32:34
To: Sean Roberts <sroberts@hortonworks.com>, user@ambari.apache.org <user@ambari.apache.org>
Subject: Re: decommission multiple nodes issue

That causes a server error. I've yet to see any part of the API that accepts JSON arrays like that as input; it's almost always, if not always, a comma-separated string like I posted. Many methods even return double-encoded JSON values (i.e. "key": "[\"value1\",\"value2\"]"). It's kind of annoying and inconsistent, honestly, and not documented anywhere. You just have to have your client code choke on it and then go add another data[key] = json.loads(data[key]) in the client to account for it.
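For what it's worth, the workaround described above can be made defensive instead of reactive. A small sketch (the helper name is mine, not from any Ambari client):

```python
import json

def maybe_decode(value):
    """Return json.loads(value) when a string field holds embedded JSON
    (some responses double-encode, e.g. "key": "[\"value1\",\"value2\"]");
    otherwise return the value unchanged."""
    if isinstance(value, str):
        try:
            return json.loads(value)
        except ValueError:  # plain string, not embedded JSON
            return value
    return value

# Double-encoded list comes back as a real list; plain strings pass through.
print(maybe_decode('["value1","value2"]'))
print(maybe_decode('slave-1.local'))
```

Applying this to every value in a response dict avoids sprinkling one-off json.loads calls per key.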

I am starting to think it's because I set the nodes into maintenance mode first, as doing the decommission command manually from the client works fine when the nodes aren't in maintenance mode. I'll keep digging, I guess, but it is weird that the exact same command worked this time (the commandArgs are identical to the one that did nothing).

Greg

From: Sean Roberts <sroberts@hortonworks.com>
Date: Monday, March 2, 2015 at 1:22 PM
To: Greg <greg.hill@rackspace.com>, "user@ambari.apache.org" <user@ambari.apache.org>
Subject: Re: decommission multiple nodes issue

Racker Greg - I'm not familiar with the decommissioning API, but if it's consistent with the rest of Ambari, you'll need to change from this:

"excluded_hosts": =93slave-1.local,slave-2.local"

To this:

"excluded_hosts" : [ "slave-1.local","sla= ve-2.local" ]



-- 
Hortonworks - We do Hadoop

Sean Roberts
Partner Solutions Engineer - EMEA
@seano

From: Greg Hill <greg.hill@rackspace.com>
Reply: user@ambari.apache.org <user@ambari.apache.org>
Date: March 2, 2015 at 19:08:13
To: user@ambari.apache.org <user@ambari.apache.org>
Subject: decommission multiple nodes issue

I have some code for decommissioning datanodes prior to removal. It seems to work fine with a single node, but with multiple nodes it fails. When passing multiple hosts, I am putting the names in a comma-separated string, as seems to be the custom with other Ambari API commands. I attempted to send it as a JSON array, but the server complained about that. Let me know if that is the wrong format. The decommission request completes successfully, it just never writes the excludes file, so no nodes are decommissioned.

This fails for multiple nodes:

"RequestInfo": {
                "co= mmand": "DECOMMISSION",
                "co= ntext": "Decommission DataNode=94),
                "pa= rameters": {"slave_type": =93DATANODE", "excluded_= hosts": =93slave-1.local,slave-2.local"},
                "op= eration_level": {
=93level=94: =93CLUSTER=94,
=93cluster_name=94: cluster_name
},
            },
            "Requests/resourc= e_filters": [{
                "se= rvice_name": =93HDFS",
                "co= mponent_name": =93NAMENODE",
            }],

But this works for a single node:

"RequestInfo": {
                "co= mmand": "DECOMMISSION",
                "co= ntext": "Decommission DataNode=94),
                "pa= rameters": {"slave_type": =93DATANODE", "excluded_= hosts": =93slave-1.local"},
                "op= eration_level": {
=93level=94: =93HOST_COMPONENT=94,
=93cluster_name=94: cluster_name,
=93host_name=94: =93slave-1.local=94,
=93service_name=94: =93HDFS=94
},
            },
            "Requests/resourc= e_filters": [{
                "se= rvice_name": =93HDFS",
                "co= mponent_name": =93NAMENODE",
            }],

Looking on the actual node, it's obvious from the command output that the file isn't being written:

(multiple hosts, notice there is no 'Writing File' line)
File['/etc/hadoop/conf/dfs.exclude'] {'owner': 'hdfs', 'content': Template('exclude_hosts_list.j2'), 'group': 'hadoop'}
Execute[''] {'user': 'hdfs'}
ExecuteHadoop['dfsadmin -refreshNodes'] {'bin_dir': '/usr/hdp/current/hadoop-client/bin', 'conf_dir': '/etc/hadoop/conf', 'kinit_override': True, 'user': 'hdfs'}
Execute['hadoop --config /etc/hadoop/conf dfsadmin -refreshNodes'] {'logoutput': False, 'path': ['/usr/hdp/current/hadoop-client/bin'], 'tries': 1, 'user': 'hdfs', 'try_sleep': 0}

(single host, it writes the exclude file)
File['/etc/hadoop/conf/dfs.exclude'] {'owner': 'hdfs', 'content': Template('exclude_hosts_list.j2'), 'group': 'hadoop'}
Writing File['/etc/hadoop/conf/dfs.exclude'] because contents don't match
Execute[''] {'user': 'hdfs'}
ExecuteHadoop['dfsadmin -refreshNodes'] {'bin_dir': '/usr/hdp/current/hadoop-client/bin', 'conf_dir': '/etc/hadoop/conf', 'kinit_override': True, 'user': 'hdfs'}
Execute['hadoop --config /etc/hadoop/conf dfsadmin -refreshNodes'] {'logoutput': False, 'path': ['/usr/hdp/current/hadoop-client/bin'], 'tries': 1, 'user': 'hdfs', 'try_sleep': 0}

The only notable difference in the command.json is the commandParams/excluded_hosts param, so it's not like the request is passing the information along incorrectly. I'm going to play around with the format I use to pass it in and take some wild guesses, like it's expecting double-encoded JSON, as I've seen that in other places, but if someone knows the answer offhand and can help out, that would be appreciated. If it turns out to be a bug in Ambari, I'll open a JIRA and rewrite our code to issue the decommission call independently for each host.
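If it does turn out to be a bug, the per-host fallback could look roughly like this (send_request stands in for whatever HTTP helper the client already has; all names here are illustrative):

```python
def decommission_each(cluster_name, hosts, send_request):
    """Fallback: issue one DECOMMISSION request per host at
    HOST_COMPONENT level, instead of a single multi-host request."""
    for host in hosts:
        send_request({
            "RequestInfo": {
                "command": "DECOMMISSION",
                "context": "Decommission DataNode",
                "parameters": {"slave_type": "DATANODE",
                               "excluded_hosts": host},
                "operation_level": {
                    "level": "HOST_COMPONENT",
                    "cluster_name": cluster_name,
                    "host_name": host,
                    "service_name": "HDFS",
                },
            },
            "Requests/resource_filters": [
                {"service_name": "HDFS", "component_name": "NAMENODE"}
            ],
        })

# Example: collect the payloads instead of actually sending them.
sent = []
decommission_each("c1", ["slave-1.local", "slave-2.local"], sent.append)
print(len(sent))
```

This trades one bulk request for N small ones, but each matches the single-host shape that is confirmed to write the exclude file.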

Greg