ambari-dev mailing list archives

From Jaimin Jetly <jai...@hortonworks.com>
Subject Re: Decommission does not move data
Date Wed, 26 Nov 2014 21:17:01 GMT
After triggering a decommission request, the Ambari web client checks the
NameNode JMX metrics, channeled via the ambari-server API, to determine
whether the DataNode has been successfully decommissioned.

The API that ambari-web client checks:
http://localhost:8080/api/v1/clusters/${clusterName}/services/HDFS/components/NAMENODE?fields=metrics/dfs/namenode/
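
As a rough illustration, a script could poll the same endpoint and read each
DataNode's adminState. This is only a sketch under assumptions that are not
confirmed in this thread: that the response nests a LiveNodes value under
metrics/dfs/namenode, that LiveNodes is a JSON-encoded string keyed by host,
and that the server accepts HTTP basic auth. The function names are
hypothetical.

```python
import base64
import json
from urllib.request import Request, urlopen


def namenode_metrics_url(cluster_name, base="http://localhost:8080"):
    """Build the NameNode metrics URL that ambari-web checks."""
    return (f"{base}/api/v1/clusters/{cluster_name}"
            "/services/HDFS/components/NAMENODE?fields=metrics/dfs/namenode/")


def admin_states(response_body):
    """Map each live DataNode to its adminState.

    Assumption: the body looks like
    {"metrics": {"dfs": {"namenode": {"LiveNodes": "<JSON string>"}}}}.
    """
    payload = json.loads(response_body)
    live_nodes = payload["metrics"]["dfs"]["namenode"]["LiveNodes"]
    # LiveNodes is assumed to be a JSON-encoded string; tolerate a dict too.
    if isinstance(live_nodes, str):
        live_nodes = json.loads(live_nodes)
    return {host: info.get("adminState") for host, info in live_nodes.items()}


def fetch_admin_states(cluster_name, user, password):
    """Fetch the metrics over HTTP (basic auth assumed) and parse them."""
    request = Request(namenode_metrics_url(cluster_name))
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urlopen(request) as response:
        return admin_states(response.read().decode("utf-8"))
```

A node whose state is still "Decommission In Progress" has not finished moving
its blocks, so a caller would keep polling until it reads "Decommissioned".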

Also, I found an HDFS bug that was fixed recently; the fix would not have
been part of the HDP stack shipped with the Ambari 1.6.0 release:
https://issues.apache.org/jira/browse/HDFS-3087



Thanks

Jaimin Jetly




On Wed, Nov 26, 2014 at 1:02 PM, Alma Bob <almabob1@gmail.com> wrote:

> Hi,
>
> Yes, that's probably the case. Is it possible to check the state of the
> decommission via Ambari or some other way?
>
> Thanks for the help anyway.
>
> On November 26, 2014 at 9:45:25 PM, Yusaku Sako (yusaku@hortonworks.com)
> wrote:
>
> Could this be just confusion resulting from how Ambari responds to
> DataNode decommission requests?
> When Ambari receives a decommission request, it essentially updates the
> "exclude" file to mark the host(s) for decommissioning and invokes the
> NameNode "refreshNodes" command.
> This happens quickly, in seconds, and Ambari responds to the API client
> saying the request was performed successfully.
> At this point, however, the decommission is not done; it has just begun.
>
> Yusaku
>
> On Wed, Nov 26, 2014 at 12:36 PM, Alma Bob <almabob1@gmail.com> wrote:
> > Hi,
> >
> > I'll start a new cluster to check it again and I'll report it back here.
> > Maybe I can give access as well.
> >
> > @Yusaku, the decommission looks the same whether it is triggered from the
> > UI or through the REST API, as expected.
> >
> > Generally the decommission should take some time for obvious reasons, but
> > it finished in a few seconds, which made me wonder.
> >
> > On November 26, 2014 at 9:26:58 PM, Jaimin Jetly (jaimin@hortonworks.com)
> > wrote:
> >
> > Hi Alma,
> >
> > In the ideal scenario, decommissioning is expected to move data off the
> > decommissioned node.
> >
> > Can you please tell us what the DataNode status was, per the NameNode JMX
> > metrics, when you noticed this behavior? This information will help in
> > debugging the issue further.
> >
> > To check the decommission status of a node via the NameNode JMX metrics,
> > look for the LiveNodes key at the following URL:
> > http://c6401.ambari.apache.org:50070/jmx
> >
> > The LiveNodes key keeps the status of each node under the "adminState"
> > attribute (In Service | Decommission In Progress | Decommissioned).
> >
> > Note:
> > Decommission can take a while to finish even for a moderate amount of
> > data, so it is expected to take more time for 3TB of data.
> >
> > If the DataNode status was "Decommission In Progress" at that time, then
> > this behavior is expected from HDFS, as not all the block copies have been
> > moved to other nodes at that point.
> >
> >
> > Thanks
> >
> > Jaimin Jetly
> >
> >
> >
> >
> > On Wed, Nov 26, 2014 at 12:15 PM, Yusaku Sako <yusaku@hortonworks.com>
> > wrote:
> >>
> >> I see. Decommissioning DataNodes via Ambari should automatically
> >> start the process of moving blocks off to the remaining DataNodes to
> >> ensure that the replication factor of 3 is reached.
> >> How did you trigger decommission on the 5 DataNodes?
> >> Once you trigger decommission, Ambari should show the DataNodes as
> >> "Decommissioning" if you drill down to the Host Detail page of the
> >> said hosts. The decommission process can take a long time, depending on
> >> the number of blocks involved. You can also check the NameNode Web UI
> >> (available from QuickLinks) to verify that the DataNodes are indeed
> >> decommissioning.
> >>
> >> Yusaku
> >>
> >> On Wed, Nov 26, 2014 at 11:41 AM, Alma Bob <almabob1@gmail.com> wrote:
> >> > In this test I tried with 20 nodes and replication 3. I generated 3TB
> >> > of data and started to decommission 5 nodes, and fsck reported the
> >> > target replication as 3 but found only 2 replicas in many cases.
> >> >
> >> > On November 26, 2014 at 8:34:20 PM, Yusaku Sako (yusaku@hortonworks.com)
> >> > wrote:
> >> >
> >> > How many DataNodes do you have in your cluster, and what is your
> >> > replication factor (dfs.replication in hdfs-site.xml)?
> >> >
> >> > Yusaku
> >> >
> >> > On Wed, Nov 26, 2014 at 11:11 AM, Alma Bob <almabob1@gmail.com> wrote:
> >> >> Hi,
> >> >>
> >> >> I've been trying to remove nodes from the cluster, and it seems to me
> >> >> that the DataNode decommission does not move any data from the nodes.
> >> >> If I check with fsck, it reports missing blocks. I'm using Ambari
> >> >> 1.6.0. Is it supposed to move data at all, or should I take care of it?
> >> >>
> >> >> Best Regards,
> >> >> Bob
> >> >>
> >> >>
> >> >
> >> > --
> >> > CONFIDENTIALITY NOTICE
> >> > NOTICE: This message is intended for the use of the individual or
> entity
> >> > to
> >> > which it is addressed and may contain information that is
> confidential,
> >> > privileged and exempt from disclosure under applicable law. If the
> >> > reader
> >> > of this message is not the intended recipient, you are hereby
> notified
> >> > that
> >> > any printing, copying, dissemination, distribution, disclosure or
> >> > forwarding of this communication is strictly prohibited. If you have
> >> > received this communication in error, please contact the sender
> >> > immediately
> >> > and delete it from your system. Thank You.
> >>
> >
> >
> >
>
>
>

