ambari-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alejandro Fernandez" <afernan...@hortonworks.com>
Subject Re: Review Request 30306: RU - Cluster install did not transition cluster_version to CURRENT because host has only AMS
Date Tue, 27 Jan 2015 18:14:50 GMT


> On Jan. 27, 2015, 3:02 p.m., Jonathan Hurley wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java,
lines 491-494
> > <https://reviews.apache.org/r/30306/diff/1/?file=836148#file836148line491>
> >
> >     Manual database edits are not something that we should support because they
mess with JPA's caching. This work seems dangerous to do on a heartbeat as it involves a lot
of work and queries. I'd rather you use an event to do this work asynchronously.

I wasn't expecting manual edits while the server was running, but if the value in the DB does
change and it happens to equal the same value that the agent reports, which could certainly
happen with support, then we still need a way for the changes to the hostcomponentstate to
bubble up through the host_version and then the cluster_version tables. So this ensures consistency
for very little additional overhead.


> On Jan. 27, 2015, 3:02 p.m., Jonathan Hurley wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java,
line 1411
> > <https://reviews.apache.org/r/30306/diff/1/?file=836150#file836150line1411>
> >
> >     Naming; this is confusing and hard to read. Instead, maybe make this compInfo.isVersionAdvertised()

versionAdvertised is a boolean, so I'm following the convention.


> On Jan. 27, 2015, 3:02 p.m., Jonathan Hurley wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java,
lines 1451-1456
> > <https://reviews.apache.org/r/30306/diff/1/?file=836150#file836150line1451>
> >
> >     Does this code need a lock? You're querying and then adding if absent, but couldn't
2 heartbeats come in while you're doing this work? Locking would probably be good, especially
since I'm recommending that you do this work in an async event, not from the heartbeat handler
directly.

Technically yes. I had an earlier patch for the state transitions that used locking code but
I ditched it to avoid performance hits. The heartbeats we are expecting can come from any
host, so we should lock around creating the RepoVersion object. After that, we only expect
a status heartbeat for one component at a time in a host, so the host_version updates may
not need it, but the cluster_vesrion updates will also need locking code. I'll fix this in
the next patch.


> On Jan. 27, 2015, 3:02 p.m., Jonathan Hurley wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java,
lines 1722-1727
> > <https://reviews.apache.org/r/30306/diff/1/?file=836151#file836151line1722>
> >
> >     Shouldn't the upgrade logic take care of this?

The new logic is for a brand new cluster install to detect the exact version installed when
components advertise the version on a heartbeat. Originally, we were going to bootstrap everything
in one shot, but we decided we need the heartbeat handler to recognize the version in order
for a RU to also detect when a component has moved to a version. So this logic handles both
brand new install and RU.


- Alejandro


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30306/#review69800
-----------------------------------------------------------


On Jan. 27, 2015, 4:10 a.m., Alejandro Fernandez wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30306/
> -----------------------------------------------------------
> 
> (Updated Jan. 27, 2015, 4:10 a.m.)
> 
> 
> Review request for Ambari, Jonathan Hurley, Nate Cole, and Yurii Shylov.
> 
> 
> Bugs: AMBARI-9340
>     https://issues.apache.org/jira/browse/AMBARI-9340
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> New cluster install through the Install Wizard correctly transitioned all of the host_versions
into CURRENT, but did not propagate the changes to the cluster_version, which stayed stuck
on UPGRADING, because one host only had AMS, so it could not create a host_version record.
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 092f9d54db36686412d7ccaa3318330688cf67cd

>   ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java b344a052549bb393028a3d60ea9fdb0c2f70fde0

>   ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java
220a5af9923444703eab4cff1875f19aeee08742 
>   ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java
372325df8b1e1191b87441304dc9a994f8b56f72 
> 
> Diff: https://reviews.apache.org/r/30306/diff/
> 
> 
> Testing
> -------
> 
> Installed a 3-node cluster with only HDFS, ZK, and AMS. One of the hosts only had the
AMS component. When the install finishes, the other 2 hosts correctly had host_version records
in CURRENT, and the cluster_version was in CURRENT.
> After the install finishes, added a 4th node with all 3 components, and it correctly
created a host_version record in the CURRENT state.
> 
> This means that the UI needs to be robust enough to handle the case where some hosts
don't have host_version records.
> 
> Unit tests are in progress.
> 
> 
> Thanks,
> 
> Alejandro Fernandez
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message