ambari-dev mailing list archives

From "Andrew Onischuk" <aonis...@hortonworks.com>
Subject Re: Review Request 35168: NameNode is forced to leave safemode, which causes HBMaster master to crash if done too quickly
Date Sat, 06 Jun 2015 01:00:52 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35168/#review86887
-----------------------------------------------------------



ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py
<https://reviews.apache.org/r/35168/#comment139027>

    We already have functionality for that:
    
    Execute("...",
      tries = 60,
      try_sleep = 10,
      ignore_failures = True
    )
    
    It would be much better to reuse it than to duplicate that logic in new code.
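
To make the intent of `tries`, `try_sleep`, and `ignore_failures` concrete, here is a minimal standalone sketch of the retry behaviour those parameters suggest. It is an illustration only: `run_with_retries` and its injectable `sleep` argument are hypothetical names, and this is not the code behind Ambari's `Execute` resource.

```python
import subprocess
import time


def run_with_retries(command, tries=60, try_sleep=10,
                     ignore_failures=True, sleep=time.sleep):
    """Run a shell command until it exits 0 or the attempts run out.

    Hypothetical helper for illustration; not Ambari's implementation.
    Returns True on success, False if every attempt failed and
    ignore_failures is set, and raises RuntimeError otherwise.
    """
    last_code = None
    for attempt in range(1, tries + 1):
        last_code = subprocess.call(command, shell=True)
        if last_code == 0:
            return True
        # Wait between attempts, but not after the final one.
        if attempt < tries:
            sleep(try_sleep)
    if ignore_failures:
        return False
    raise RuntimeError(
        "command failed after %d tries (exit code %s): %s"
        % (tries, last_code, command)
    )
```

With the values from the snippet above (60 tries, 10 seconds apart), such a loop would wait for up to roughly ten minutes before giving up, and with `ignore_failures = True` a final failure would not abort the surrounding script.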


- Andrew Onischuk


On June 6, 2015, 12:55 a.m., Alejandro Fernandez wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35168/
> -----------------------------------------------------------
> 
> (Updated June 6, 2015, 12:55 a.m.)
> 
> 
> Review request for Ambari, Andrew Onischuk, Jonathan Hurley, Nate Cole, and Sid Wagle.
> 
> 
> Bugs: AMBARI-11743
>     https://issues.apache.org/jira/browse/AMBARI-11743
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> 1. Install cluster with Ambari 2.1 and HDP 2.3
> 2. Add services HDFS, YARN, MR, ZK, and HBase
> 3. Perform several Stop All and Start All on HDFS service
> 4. Periodically, HBase Master will crash
> 
> 
> Diffs
> -----
> 
>   ambari-common/src/main/python/resource_management/libraries/functions/copy_tarball.py de05da2 
>   ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py d26d145 
>   ambari-server/src/test/python/stacks/2.0.6/HDFS/test_namenode.py b920c17 
> 
> Diff: https://reviews.apache.org/r/35168/diff/
> 
> 
> Testing
> -------
> 
> Tested this on a live cluster. Several attempts at restarting NameNode worked, and HBase Master was still up.
> Also tested it with NameNode HA.
> 
> Still need to test Rolling Upgrade and run full set of unit tests.
> 
> 
> Thanks,
> 
> Alejandro Fernandez
> 
>

