ambari-dev mailing list archives

From Alma Bob <almab...@gmail.com>
Subject Re: Decommission does not move data
Date Wed, 26 Nov 2014 21:21:52 GMT
That's really helpful. Thank you both.

On November 26, 2014 at 10:17:01 PM, Jaimin Jetly (jaimin@hortonworks.com) wrote:


After triggering a decommission request, the Ambari client checks the NameNode JMX metrics,
channeled via the ambari-server API, to determine whether the datanode has been successfully
decommissioned.

The API that the ambari-web client checks:
http://localhost:8080/api/v1/clusters/${clusterName}/services/HDFS/components/NAMENODE?fields=metrics/dfs/namenode/
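A minimal Python sketch of querying that endpoint, for illustration only. The function names, the use of HTTP basic auth, and the credentials passed in are assumptions that do not come from this thread; only the URL path is taken from the message above.

```python
import base64
import json
import urllib.request
from urllib.parse import quote


def namenode_metrics_url(base_url, cluster_name):
    """Build the NameNode metrics URL quoted in the message above."""
    return (
        f"{base_url.rstrip('/')}/api/v1/clusters/{quote(cluster_name, safe='')}"
        "/services/HDFS/components/NAMENODE?fields=metrics/dfs/namenode/"
    )


def fetch_namenode_metrics(base_url, cluster_name, user, password, timeout=10):
    """GET the metrics and return the decoded JSON document.

    Assumption: the server accepts HTTP basic auth for this request.
    """
    request = urllib.request.Request(namenode_metrics_url(base_url, cluster_name))
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_namenode_metrics("http://localhost:8080", "mycluster", user, password) would return the parsed response; the cluster name here is a placeholder.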

I also found an HDFS bug that was fixed only recently; the fix would not have
been part of the HDP stack in the Ambari-1.6.0 release:
https://issues.apache.org/jira/browse/HDFS-3087



Thanks

Jaimin Jetly






On Wed, Nov 26, 2014 at 1:02 PM, Alma Bob <almabob1@gmail.com> wrote:
Hi,

Yes, that's probably the case. Is it possible to check the state of the decommission via Ambari
or by some other means?

Thanks for the help anyway.

On November 26, 2014 at 9:45:25 PM, Yusaku Sako (yusaku@hortonworks.com) wrote:

Could this just be confusion about how Ambari responds to
DataNode decommission requests?
When Ambari receives a decommission request, it essentially updates the
"exclude" file to mark the host(s) for decommissioning and invokes the
NameNode "refreshNodes" command.
This happens quickly, within seconds, and Ambari then responds to the API
client saying the request was performed successfully.
At this point, however, the decommission is not done; it has only just begun.
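
Since a successful response only means the decommission has started, a client that needs to know when it has finished has to poll. Below is a rough Python sketch of such a loop. The function name, the parameters, and the get_admin_state callable are hypothetical; the caller would supply a function that looks up the node's current admin state, for example from the NameNode JMX metrics.

```python
import time


def wait_for_decommission(get_admin_state, host, interval=30.0, timeout=3600.0,
                          sleep=time.sleep, clock=time.monotonic):
    """Poll until `host` reports "Decommissioned" or the timeout expires.

    get_admin_state is a caller-supplied (hypothetical) callable that takes a
    host name and returns its current admin state as a string.
    Returns True if the node finished decommissioning, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if get_admin_state(host) == "Decommissioned":
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The sleep and clock parameters exist only so the loop can be exercised without real waiting.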

Yusaku

On Wed, Nov 26, 2014 at 12:36 PM, Alma Bob <almabob1@gmail.com> wrote:
> Hi,
>
> I'll start a new cluster to check it again and I'll report it back here.
> Maybe I can give access as well.
>
> @Yusaku, the decommission looks the same whether it is triggered from the UI
> or through the REST API, as expected.
>
> Generally the decommission should take some time for obvious reasons, but it
> finished in a few seconds, which made me wonder.
>
> On November 26, 2014 at 9:26:58 PM, Jaimin Jetly (jaimin@hortonworks.com)
> wrote:
>
> Hi Alma,
>
> In the ideal scenario, decommission is expected to move data off the
> decommissioned node.
>
> Can you please tell us what the datanode status was according to the
> NameNode JMX metrics when you noticed this behavior? This information will
> help in debugging the issue further.
>
> To check the decommission status of a node via the NameNode JMX metrics,
> look for the LiveNodes key at the following URL:
> http://c6401.ambari.apache.org:50070/jmx
>
> The LiveNodes key keeps the status of each node under the "adminState"
> attribute (In Service | Decommission In Progress | Decommissioned).
>
> Note:
> Decommission can take a while to finish even for a moderate amount of data,
> so it is expected to take longer for 3TB of data.
>
> If the datanode status was "Decommission In Progress" at that time, then
> this behavior is expected from HDFS, as not all the block copies had been
> moved to other nodes at that point.
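
A small Python sketch of reading "adminState" out of a parsed /jmx response, for illustration. It assumes the response has a top-level "beans" list and that the LiveNodes value is a JSON-encoded string keyed by host name; the function name, the bean name, and the sample data are made up.

```python
import json


def admin_states(jmx_payload):
    """Map each live DataNode to its adminState from a parsed /jmx response."""
    for bean in jmx_payload.get("beans", []):
        live_nodes = bean.get("LiveNodes")
        if live_nodes is None:
            continue
        # Assumption: LiveNodes is itself a JSON-encoded string.
        if isinstance(live_nodes, str):
            live_nodes = json.loads(live_nodes)
        return {host: info.get("adminState") for host, info in live_nodes.items()}
    return {}


# Made-up sample payload, not captured from a real cluster.
SAMPLE = {
    "beans": [
        {"name": "example-bean-without-live-nodes"},
        {
            "name": "example-bean",
            "LiveNodes": json.dumps({
                "dn1.example.com": {"adminState": "In Service"},
                "dn2.example.com": {"adminState": "Decommission In Progress"},
            }),
        },
    ]
}

for host, state in sorted(admin_states(SAMPLE).items()):
    print(host, state)
```

A node is fully decommissioned only once its state reads "Decommissioned"; "Decommission In Progress" means blocks are still being copied elsewhere.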
>
>
> Thanks
>
> Jaimin Jetly
>
>
>
>
> On Wed, Nov 26, 2014 at 12:15 PM, Yusaku Sako <yusaku@hortonworks.com>
> wrote:
>>
>> I see. Decommission of DataNodes via Ambari should automatically
>> start the process of moving blocks off to the remaining DataNodes to
>> ensure that the replication factor of 3 is maintained.
>> How did you trigger decommission on the 5 DataNodes?
>> Once you trigger decommission, Ambari should show the DataNodes as
>> "Decommissioning" if you drill down to the Host Detail page of the
>> said hosts. The decommission process can take a long time, depending on
>> the number of blocks involved. You can also check the NameNode Web UI
>> (available from QuickLinks) to verify that the DataNodes are indeed
>> decommissioning.
>>
>> Yusaku
>>
>> On Wed, Nov 26, 2014 at 11:41 AM, Alma Bob <almabob1@gmail.com> wrote:
>> > In this test I used 20 nodes with replication 3. I generated 3TB of
>> > data and started to decommission 5 nodes; fsck reported that the target
>> > replication is 3 but found only 2 replicas in many cases.
>> >
>> > On November 26, 2014 at 8:34:20 PM, Yusaku Sako (yusaku@hortonworks.com)
>> > wrote:
>> >
>> > How many DataNodes do you have in your cluster, and what is your
>> > replication factor (dfs.replication in hdfs-site.xml)?
>> >
>> > Yusaku
>> >
>> > On Wed, Nov 26, 2014 at 11:11 AM, Alma Bob <almabob1@gmail.com> wrote:
>> >> Hi,
>> >>
>> >> I've been trying to remove nodes from the cluster, and it seems to me
>> >> that the DataNode decommission does not move any data off the nodes.
>> >> If I check with fsck, it reports missing blocks. I'm using Ambari 1.6.0.
>> >> Is it supposed to move data at all, or should I take care of it myself?
>> >>
>> >> Best Regards,
>> >> Bob
>> >>
>> >>
>> >
>> > --
>> > CONFIDENTIALITY NOTICE
>> > NOTICE: This message is intended for the use of the individual or entity
>> > to
>> > which it is addressed and may contain information that is confidential,
>> > privileged and exempt from disclosure under applicable law. If the
>> > reader
>> > of this message is not the intended recipient, you are hereby notified
>> > that
>> > any printing, copying, dissemination, distribution, disclosure or
>> > forwarding of this communication is strictly prohibited. If you have
>> > received this communication in error, please contact the sender
>> > immediately
>> > and delete it from your system. Thank You.
>>
>
>
>
