ambari-dev mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMBARI-12113) Cluster deployment is missing tez.tar.gz in HDFS since service responsible for uploading tarball is not co-hosted with Tez Client
Date Wed, 24 Jun 2015 23:48:05 GMT

    [ https://issues.apache.org/jira/browse/AMBARI-12113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600390#comment-14600390 ]

Hadoop QA commented on AMBARI-12113:
------------------------------------

{color:green}+1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12741704/AMBARI-12113.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 2 new or modified
test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of
javac compiler warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number
of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in ambari-server.

Test results: https://builds.apache.org/job/Ambari-trunk-test-patch/3256//testReport/
Console output: https://builds.apache.org/job/Ambari-trunk-test-patch/3256//console

This message is automatically generated.

> Cluster deployment is missing tez.tar.gz in HDFS since service responsible for uploading tarball is not co-hosted with Tez Client
> ---------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMBARI-12113
>                 URL: https://issues.apache.org/jira/browse/AMBARI-12113
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.1.0
>            Reporter: Alejandro Fernandez
>            Assignee: Alejandro Fernandez
>            Priority: Critical
>             Fix For: 2.1.0
>
>         Attachments: AMBARI-12113.branch-2.1.patch, AMBARI-12113.patch
>
>
> STR:
> * Deploy cluster with HDFS, YARN, MR, and Tez on 5 hosts as follows,
> ** Host 1: NameNode, ResourceManager, ZK Server, DataNode, NodeManager
> ** Host 2: Secondary NameNode, App Timeline Server, ZK Server, DataNode, NodeManager.
> ** Host 3: ZK Server, DataNode, NodeManager.
> ** Host 4: Clients
> ** Host 5: Clients
> In this case, Host 1 has RM but no Tez client, so it cannot possibly upload the tez tarball to HDFS.
> Also, consider the following 2 use cases:
> 1. Install Tez first, which will require YARN.
> 2. Install YARN first, which does not require Tez, but tez.tar.gz still needs to be uploaded when the Tez Service Check runs.
> {code}
> Traceback (most recent call last):
>   File "/var/lib/ambari-agent/cache/common-services/TEZ/0.4.0.2.1/package/scripts/service_check.py",
line 98, in <module>
>     TezServiceCheck().execute()
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py",
line 216, in execute
>     method(env)
>   File "/var/lib/ambari-agent/cache/common-services/TEZ/0.4.0.2.1/package/scripts/service_check.py",
line 75, in service_check
>     bin_dir = params.hadoop_bin_dir
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 157,
in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line
152, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line
118, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/execute_hadoop.py",
line 55, in action_run
>     environment = self.resource.environment,
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 157,
in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line
152, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line
118, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py",
line 254, in action_run
>     tries=self.resource.tries, try_sleep=self.resource.try_sleep)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70,
in inner
>     result = function(command, **kwargs)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92,
in checked_call
>     tries=tries, try_sleep=try_sleep)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140,
in _call_wrapper
>     result = _call(command, **kwargs_copy)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 290,
in _call
>     raise Fail(err_msg)
> resource_management.core.exceptions.Fail: Execution of 'hadoop --config /usr/hdp/2.2.6.0-2800/hadoop/conf
jar /usr/hdp/current/tez-client/tez-examples*.jar orderedwordcount /tmp/tezsmokeinput/sample-tez-test
/tmp/tezsmokeoutput/' returned 255. Running OrderedWordCount
> 15/06/17 04:21:50 INFO client.TezClient: Tez Client Version: [ component=tez-api, version=0.5.2.2.2.6.0-2800,
revision=790e651b4a64f7589008208580c9790548c2baf8, SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git,
buildTIme=20150518-1651 ]
> 15/06/17 04:21:51 INFO impl.TimelineClientImpl: Timeline service address: http://c6405.ambari.apache.org:8188/ws/v1/timeline/
> 15/06/17 04:21:51 INFO client.RMProxy: Connecting to ResourceManager at c6405.ambari.apache.org/192.168.64.105:8050
> 15/06/17 04:21:53 INFO client.TezClient: Submitting DAG application with id: application_1434514777618_0005
> 15/06/17 04:21:53 INFO client.TezClientUtils: Using tez.lib.uris value from configuration:
/hdp/apps/2.2.6.0-2800/tez/tez.tar.gz
> java.io.FileNotFoundException: File does not exist: /hdp/apps/2.2.6.0-2800/tez/tez.tar.gz
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1140)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1132)
> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1132)
> 	at org.apache.hadoop.fs.FileSystem.resolvePath(FileSystem.java:750)
> 	at org.apache.tez.client.TezClientUtils.getLRFileStatus(TezClientUtils.java:127)
> 	at org.apache.tez.client.TezClientUtils.setupTezJarsLocalResources(TezClientUtils.java:178)
> 	at org.apache.tez.client.TezClient.getTezJarResources(TezClient.java:721)
> 	at org.apache.tez.client.TezClient.submitDAGApplication(TezClient.java:689)
> 	at org.apache.tez.client.TezClient.submitDAGApplication(TezClient.java:667)
> 	at org.apache.tez.client.TezClient.submitDAG(TezClient.java:353)
> 	at org.apache.tez.examples.OrderedWordCount.run(OrderedWordCount.java:208)
> 	at org.apache.tez.examples.OrderedWordCount.run(OrderedWordCount.java:232)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> 	at org.apache.tez.examples.OrderedWordCount.main(OrderedWordCount.java:240)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
> 	at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
> 	at org.apache.tez.examples.ExampleDriver.main(ExampleDriver.java:61)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {code}
> Analysis:
> tez.tar.gz needs to be copied to HDFS. The problem is that we don't have a way right now to copy it after all services have been installed and started during cluster deployment, so instead, we rely on services copying the tarball when they start.
> In order for this to work, the host with Tez Client also needs to have HDFS Client, YARN Client, and MR Client. Further, copying to HDFS requires NameNode to be up, and DataNodes to be functional.
> AMBARI-9997 had ResourceManager copy the tez tarball; the problem was that if the host with RM didn't have Tez Client, it wouldn't find the tarball.
> The change I'm proposing is as follows:
> * Switch this to HistoryServer instead of RM, since HistoryServer already copies the mapreduce tarball.
> * Installing Tez also requires the YARN service, including HistoryServer. HistoryServer is now co-hosted with Tez Client, so this guarantees it can copy the tarball.
> * Installing HistoryServer by itself will not copy the tarball. However, if Tez is installed later, then its Service Check is responsible for copying the tarball to HDFS, and this host is also guaranteed to have HDFS Client.
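
The host-placement reasoning in the analysis above can be sketched in Python. This is only an illustration under stated assumptions: the component name strings, the `can_upload_tez_tarball` and `uploader_hosts` helpers, and the placement of HistoryServer on Host 4 are hypothetical and are not taken from Ambari's code or from the attached patch.

```python
# Hypothetical component names, chosen for illustration only.
TEZ_CLIENT = "TEZ_CLIENT"
HDFS_CLIENT = "HDFS_CLIENT"
RESOURCE_MANAGER = "RESOURCEMANAGER"
HISTORY_SERVER = "HISTORYSERVER"

CLIENTS = {TEZ_CLIENT, HDFS_CLIENT, "YARN_CLIENT", "MAPREDUCE2_CLIENT"}

# Layout from the STR: Host 1 runs ResourceManager but has no Tez Client.
STR_LAYOUT = {
    "host1": {"NAMENODE", RESOURCE_MANAGER, "ZOOKEEPER_SERVER", "DATANODE", "NODEMANAGER"},
    "host2": {"SECONDARY_NAMENODE", "APP_TIMELINE_SERVER", "ZOOKEEPER_SERVER",
              "DATANODE", "NODEMANAGER"},
    "host3": {"ZOOKEEPER_SERVER", "DATANODE", "NODEMANAGER"},
    "host4": set(CLIENTS),
    "host5": set(CLIENTS),
}

# Assumption: after the proposed change, HistoryServer sits on a host
# that also has Tez Client (Host 4 is picked arbitrarily here).
PROPOSED_LAYOUT = dict(STR_LAYOUT, host4=STR_LAYOUT["host4"] | {HISTORY_SERVER})


def can_upload_tez_tarball(components):
    """A host needs the local tarball (Tez Client) and HDFS access (HDFS Client)."""
    return TEZ_CLIENT in components and HDFS_CLIENT in components


def uploader_hosts(layout, uploader_component):
    """Return hosts that run the uploading component and can actually upload."""
    return sorted(
        host
        for host, components in layout.items()
        if uploader_component in components and can_upload_tez_tarball(components)
    )


if __name__ == "__main__":
    print(uploader_hosts(STR_LAYOUT, RESOURCE_MANAGER))
    print(uploader_hosts(PROPOSED_LAYOUT, HISTORY_SERVER))
```

With ResourceManager as the uploader, no host in the STR layout qualifies, which matches the missing tez.tar.gz. With HistoryServer co-hosted with Tez Client, at least one host qualifies.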



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
