hbase-user mailing list archives

From Aleksandr Shulman <al...@cloudera.com>
Subject Re: Full stack rolling restart
Date Thu, 29 May 2014 17:57:41 GMT
Yes, a full-stack rolling upgrade is possible. We added and tested
full-stack rolling restart of the CDH platform in Cloudera Manager,
starting with CM4 running CDH4 and onward.
For HBase rolling upgrades, the only Cloudera-supported path is through
Cloudera Manager (though we've tested it without CM as well). For
HDFS/MR/YARN/ZK, rolling upgrades are also supported with CDH alone,
though you can use CM for those as well.

For the full-stack rolling upgrade including HBase, here is how Cloudera
Manager orchestrates the process at a high level:

1. Restart all master nodes by restarting services in reverse-dependency
order.
 -- Master services = HMaster, NN, ZK, JT, etc.
 -- Reverse-dependency order: for example, HBase, then HDFS, then ZK (since
HDFS depends on ZK and HBase depends on HDFS).

This gets a bit more complicated when High Availability is enabled.
Also, as a general rule, backup master services (e.g. the HBase backup
master) are upgraded before the active master services.
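
To make the ordering in step 1 concrete, here is a minimal Python sketch. It
is not how Cloudera Manager is implemented; the dependency map, the role
layout, and the function names are all assumptions made up for illustration.
It derives the reverse-dependency order from a "depends on" map and puts
backup roles ahead of active ones.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: service -> services it depends on.
DEPENDS_ON = {
    "HBase": {"HDFS"},
    "HDFS": {"ZK"},
    "ZK": set(),
}


def reverse_dependency_order(depends_on):
    """Return services so that dependents come before their dependencies."""
    # static_order() yields dependencies first (ZK, HDFS, HBase);
    # reversing it gives the reverse-dependency order.
    start_order = list(TopologicalSorter(depends_on).static_order())
    return list(reversed(start_order))


def master_restart_plan(depends_on, roles):
    """Build (service, role, host) tuples, backups before actives.

    `roles` is a hypothetical layout:
    {service: {"backup": [hosts], "active": [hosts]}}.
    """
    plan = []
    for service in reverse_dependency_order(depends_on):
        for role in ("backup", "active"):
            for host in roles.get(service, {}).get(role, []):
                plan.append((service, role, host))
    return plan
```

With the map above, reverse_dependency_order(DEPENDS_ON) gives HBase, then
HDFS, then ZK, matching the example in step 1.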

2. Restart all worker nodes (nodes that run worker services) in batches
(the default batch size is 1, but it is configurable).
 -- Worker services = DN, RS, TT, etc.
 -- Reverse-dependency order: Turn off the balancer. Decommission the RS (by
closing and moving off all the regions one by one), gracefully shut down the
DN, start the DN back up, start the RS back up, load that RS with regions.
Repeat for each worker node.

Once the master and worker services have been restarted on all nodes of the
cluster, the HBase balancer is turned back on and the cluster is
considered upgraded.
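
As a rough illustration of step 2, the sketch below only generates the
sequence of actions per worker batch; it does not call any real HBase, HDFS,
or Cloudera Manager API. The action labels, host names, and function names
are assumptions for illustration.

```python
from typing import Iterator, List, Optional, Tuple

Step = Tuple[str, Optional[str]]  # (action label, host or None)

# Per-host actions in the order described in step 2 (labels are illustrative).
PER_HOST_ACTIONS = [
    "decommission RS (move regions off)",
    "gracefully stop DN",
    "start DN",
    "start RS",
    "load RS with regions",
]


def batches(hosts: List[str], size: int) -> Iterator[List[str]]:
    """Split the host list into consecutive batches of at most `size`."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for i in range(0, len(hosts), size):
        yield hosts[i:i + size]


def worker_restart_steps(hosts: List[str], batch_size: int = 1) -> List[Step]:
    """List the steps of a rolling worker restart, one batch at a time."""
    steps: List[Step] = [("turn off HBase balancer", None)]
    for batch in batches(hosts, batch_size):
        # Within a batch, each action is applied to every host in the batch
        # before moving on to the next action.
        for action in PER_HOST_ACTIONS:
            for host in batch:
                steps.append((action, host))
    steps.append(("turn on HBase balancer", None))
    return steps
```

With the default batch size of 1, each worker goes through all five actions
before the next worker is touched, so only one node is out of service at a
time.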

Caveat: Rolling upgrades are only supported between minor versions within
the same CDH major version: 4.x to 4.y or 5.x to 5.y, but not 4.x to 5.y.
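
The caveat can be expressed as a small pre-flight check. This is only a
sketch: the function names are made up, and it assumes version strings of the
form "major.minor"; it checks nothing beyond the major version.

```python
def major_version(version: str) -> int:
    """Extract the major version from a string such as '4.6'."""
    return int(version.split(".")[0])


def rolling_upgrade_supported(current: str, target: str) -> bool:
    """True only when both versions share the same CDH major version."""
    return major_version(current) == major_version(target)
```

For example, rolling_upgrade_supported("4.2", "4.5") is True, while
rolling_upgrade_supported("4.7", "5.0") is False.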

Did that answer your question?

On Thu, May 29, 2014 at 9:11 AM, Jeremy Carroll <phobos182@gmail.com> wrote:

> We have taken the approach of graceful stop of the RegionServer in
> maintenance. Then restarting the DataNode. Once it has registered and back
> online we start the RegionServer and move its regions back. We do not
> compact before or after the operation since it takes a short period of
> time, and minor compactions will regain the small amount of locality lost
> during the maintenance operation.
> I believe it's doubtful that the Hadoop project itself will release an
> official cluster management / operations framework. So we built a lot of
> this ourselves.
> Sent from my iPhone
> > On May 28, 2014, at 11:37 PM, sameerv <smv1@hotmail.com> wrote:
> >
> > I am curious to know what the industry folks think of rolling restart on
> > the full stack. Envisioning something like each node which services it runs,
> > stop all services, use new configs and start all services. Is it feasible to
> > do this? Can folks who have tried, share their experiences please?
> >
> > Thanks,
> > Sameer
> >
> >
> >
> >
> > --
> > View this message in context:
> > http://apache-hbase.679495.n3.nabble.com/Full-stack-rolling-restart-tp4059877.html
> > Sent from the HBase User mailing list archive at Nabble.com.

Best Regards,

Aleks Shulman
