ambari-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "John Speidel" <jspei...@hortonworks.com>
Subject Re: Review Request 37294: HDFS check fails after move NameNode on NN HA cluster
Date Tue, 11 Aug 2015 16:44:31 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37294/#review94929
-----------------------------------------------------------


Please don't commit this patch until resolving the issue raised in this review.
If you have any questions/concerns and want to talk, please let me know and we can set up
a call to discuss.


ambari-server/src/main/java/org/apache/ambari/server/topology/TopologyManager.java (line 295)
<https://reviews.apache.org/r/37294/#comment149642>

    This is not the correct place to determine which hosts/components to add.
    A LogicalRequest represents a user request such as a provision cluster or scale cluster
request.
    A LogicalRequest is composed of one or more HostRequest's.  A HostRequest represents work
done as part of the logical request on a specific host.  Host requests are completed as they
are matched to available hosts and are processed independently of all other host requests.
 Therefore using LogicalRequest.hasCompleted() is incorrect when determining which host/components
to add to the pending components collection as it is very possible that some of the host requests
have completed while others are still pending.  Consider the case of a provision cluster request
with one master and 100 slaves where no hosts are available at the time of the request.  After
the request, the host request associated with the master components is completed followed
by some number of slave host requests but not all of them.  It could be any amount of time
until all of the slave host requests have completed; in fact they may never all complete.
 Now the NN on the master is moved to another host.  Because
  the logical request is still not complete, we would still return stale data regarding the
NN from this method.
    
    What we really want to do is to only include host/component information for pending host
requests.  So, the change here in TopologyManager.getPendingHostComponents() where LocicalRequest.hasCompleted()
is called should be rolled  back and LogicalRequest.getProjectedTopology() should be modified
to only include information for pending host requests, instead of the current behavior of
returning information for all host requests.  Pending host requests may be contained in both
the requestsWithReservedHosts and outstandingHostRequests collections.  This way, each host
request is individually checked so that we don't return information for completed host requests.


- John Speidel


On Aug. 10, 2015, 3:42 p.m., Dmytro Sen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/37294/
> -----------------------------------------------------------
> 
> (Updated Aug. 10, 2015, 3:42 p.m.)
> 
> 
> Review request for Ambari, John Speidel and Robert Nettleton.
> 
> 
> Bugs: AMBARI-12688
>     https://issues.apache.org/jira/browse/AMBARI-12688
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> This ticket is related to
> https://issues.apache.org/jira/browse/AMBARI-10750 Initial merge of advanced api provisioning
work
> Detailed scenario
> 1. Deploy 3-node cluster from a blueprint (NameNode is on host1. Here NN is mapped to
host1 in the cluster topology)
> 2. Enable NN HA (now NameNodes are on host1 and host2) 
> 3. Move NN from host1 to host3(now NameNodes are on host3 and host2, but according to
cluster topology NN also mapped to host1)
> 4. HDFS service check fails
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/java/org/apache/ambari/server/topology/TopologyManager.java
8e7fb7f 
>   ambari-server/src/main/java/org/apache/ambari/server/utils/StageUtils.java 3da0fe2

>   ambari-server/src/test/java/org/apache/ambari/server/controller/KerberosHelperTest.java
5d84fbc 
>   ambari-server/src/test/java/org/apache/ambari/server/utils/StageUtilsTest.java e85d9a1

> 
> Diff: https://reviews.apache.org/r/37294/diff/
> 
> 
> Testing
> -------
> 
> [INFO] Ambari Server ..................................... SUCCESS [1:14:15.296s]
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> 
> 
> Thanks,
> 
> Dmytro Sen
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message