ambari-dev mailing list archives

From "Jonathan Hurley" <jhur...@hortonworks.com>
Subject Re: Review Request 35814: Cluster deployment is missing tez.tar.gz in HDFS since service responsible for uploading tarball is not co-hosted with Tez Client
Date Wed, 24 Jun 2015 02:47:11 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35814/#review89126
-----------------------------------------------------------

Ship it!


I'm not very clear on the inter-dependencies of tez and the rest of the stack. What you proposed
seemed fine on the surface and implementation matches your description. With that said, did
you get signoff from an engineer with a better understanding of the dependencies of these
components?

**Also, what about stacks that are already deployed? Won't they have problems with this?**


ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/scripts/historyserver.py
(line 83)
<https://reviews.apache.org/r/35814/#comment141725>

    Are you even old enough for that reference?



ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/scripts/historyserver.py
(lines 84 - 85)
<https://reviews.apache.org/r/35814/#comment141726>

    These seem to happen in `pre_rolling_restart()` and then again in `start()`. Are they
needed in both places?
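
One way to make a repeated call from both `pre_rolling_restart()` and `start()` harmless is to have the copy step check whether the tarball is already in HDFS before uploading. The following is a minimal sketch of that idea, not code from the patch: `FakeHdfs`, `copy_if_missing`, and both paths are hypothetical names introduced only for illustration, and the in-memory class stands in for a real HDFS client.

```python
class FakeHdfs:
    """Hypothetical in-memory stand-in for HDFS (not an Ambari class)."""

    def __init__(self):
        self.files = {}    # maps HDFS path -> local source path
        self.uploads = 0   # counts how many uploads actually happened

    def exists(self, hdfs_path):
        return hdfs_path in self.files

    def put(self, local_path, hdfs_path):
        self.files[hdfs_path] = local_path
        self.uploads += 1


def copy_if_missing(hdfs, local_path, hdfs_path):
    """Upload local_path only when hdfs_path is absent.

    Returns True if an upload happened, False if the file was already there.
    """
    if hdfs.exists(hdfs_path):
        return False
    hdfs.put(local_path, hdfs_path)
    return True


if __name__ == "__main__":
    hdfs = FakeHdfs()
    # Both paths are made up for this example.
    local, remote = "/tmp/example/tez.tar.gz", "/example/apps/tez/tez.tar.gz"
    copy_if_missing(hdfs, local, remote)  # e.g. called from pre_rolling_restart()
    copy_if_missing(hdfs, local, remote)  # e.g. called again from start()
    print(hdfs.uploads)
```

With a guard like this, whether the call appears in one place or both affects only the number of existence checks, not the number of uploads.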


- Jonathan Hurley


On June 23, 2015, 8:53 p.m., Alejandro Fernandez wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35814/
> -----------------------------------------------------------
> 
> (Updated June 23, 2015, 8:53 p.m.)
> 
> 
> Review request for Ambari, Andrew Onischuk, Dmytro Sen, Jonathan Hurley, Nate Cole, and Vitalyi Brodetskyi.
> 
> 
> Bugs: AMBARI-12113
>     https://issues.apache.org/jira/browse/AMBARI-12113
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> STR:
> - Deploy cluster with HDFS, YARN, MR, and Tez on 4 hosts as follows,
> -- Host 1: NameNode, ResourceManager, ZK Server, DataNode, NodeManager
> -- Host 2: Secondary NameNode, App Timeline Server, ZK Server, DataNode, NodeManager.
> -- Host 3: ZK Server, DataNode, NodeManager.
> -- Host 4: Clients
> -- Host 5: Clients
> 
> In this case, Host 1 has RM but no Tez client, so it cannot possibly upload the tez tarball to HDFS.
> Also, consider the following 2 use cases:
> 1. Install Tez first, which will require YARN.
> 2. Install YARN first, which does not require Tez, but still needs to upload tez.tar.gz when the Tez Service Check runs.
> 
> 
> tez.tar.gz needs to be copied to HDFS. The problem is that we don't have a way right now to copy it after all services have been installed and started during cluster deployment, so instead, we rely on services starting to copy the tarball.
> In order for this to work, the host with Tez Client also needs to have HDFS Client, Yarn Client, and MR Client. Further, copying to HDFS requires NameNode to be up, and DataNodes to be functional.
> AMBARI-9997 had ResourceManager copy the tez tarball; the problem was that if the host with RM didn't have Tez client, it wouldn't find the tarball.
> The change I'm proposing is to
> - Switch this to HistoryServer instead of RM since HistoryServer already copies the mapreduce tarball.
> - Installing Tez also requires YARN service, including HistoryServer. HistoryServer is now co-hosted with Tez Client, so this guarantees it can copy the tarball.
> - Installing HistoryServer by itself will not copy the tarball. However, if Tez is installed later, then its Service Check is responsible for copying the tarball to HDFS, and this host is also guaranteed to have HDFS Client.
> 
> 
> Diffs
> -----
> 
>   ambari-common/src/main/python/resource_management/libraries/functions/copy_tarball.py 8eab473
>   ambari-server/src/main/resources/common-services/HIVE/0.12.0.2.0/package/scripts/params_linux.py 935f3a2
>   ambari-server/src/main/resources/common-services/SPARK/1.2.0.2.2/package/scripts/job_history_server.py 4b0bbfa
>   ambari-server/src/main/resources/common-services/TEZ/0.4.0.2.1/metainfo.xml f42af02
>   ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/metainfo.xml 01c3c26
>   ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/scripts/historyserver.py af37153
>   ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/scripts/params_linux.py d74340f
>   ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/scripts/resourcemanager.py 88a3cba
>   ambari-server/src/main/resources/stacks/HDP/2.1/role_command_order.json ec38ee2
>   ambari-server/src/test/python/stacks/2.0.6/YARN/test_historyserver.py 3457315
>   ambari-server/src/test/python/stacks/2.0.6/YARN/test_resourcemanager.py 94e26b5
> 
> Diff: https://reviews.apache.org/r/35814/diff/
> 
> 
> Testing
> -------
> 
> Deployed a cluster with the following combinations:
> 1. HDFS, ZK, YARN, MR. Then installed Tez, which added Tez Client to the host with HistoryServer. And when the Tez Service Check ran as part of the Install, it was able to copy the tez tarball to HDFS since that host also contained HDFS client.
> 2. HDFS, ZK, Tez. This required installing YARN/MR as well. So HistoryServer Start was responsible for copying the tez tarball to HDFS, and was able to do so since Tez client was co-hosted on the HistoryServer host.
> 
> Total run:764
> Total errors:0
> Total failures:0
> OK
> 
> 
> Thanks,
> 
> Alejandro Fernandez
> 
>
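
The co-hosting requirement in the quoted description (the uploading host needs Tez Client plus the HDFS, YARN, and MR clients) can be expressed as a simple set check. This is a rough sketch under stated assumptions, not Ambari code: the component labels, the two-host layout, and the function names are illustrative and do not come from the patch or from the layout in the reproduction steps.

```python
# Components the quoted description says must be on the host that uploads
# tez.tar.gz. The string labels are illustrative, chosen for this sketch.
REQUIRED_FOR_UPLOAD = {"TEZ_CLIENT", "HDFS_CLIENT", "YARN_CLIENT", "MAPREDUCE2_CLIENT"}


def can_upload_tez_tarball(components_on_host):
    """Return True if the host has every client needed to upload the tarball."""
    return REQUIRED_FOR_UPLOAD.issubset(components_on_host)


def hosts_able_to_upload(layout):
    """Return the sorted names of hosts in the layout that could do the upload."""
    return sorted(
        host for host, components in layout.items()
        if can_upload_tez_tarball(components)
    )


if __name__ == "__main__":
    # Hypothetical layout: the ResourceManager host lacks Tez Client, while
    # the HistoryServer host is co-hosted with all four clients.
    layout = {
        "host1": {"NAMENODE", "RESOURCEMANAGER", "DATANODE", "NODEMANAGER"},
        "host2": {"HISTORYSERVER"} | REQUIRED_FOR_UPLOAD,
    }
    print(hosts_able_to_upload(layout))
```

In this made-up layout only the HistoryServer host qualifies, which is the situation the proposal relies on when it moves the copy step away from ResourceManager.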

