hadoop-user mailing list archives

From Arun C Murthy <...@hortonworks.com>
Subject Re: YARN: LocalResources and file distribution
Date Mon, 02 Dec 2013 23:27:49 GMT

 YARN, by default, will only download *resources* from a shared namespace (e.g. HDFS).

 If /home/hadoop/robert/large_jar.jar is available on each node, then you can specify the path
as file:///home/hadoop/robert/large_jar.jar and it should work.

 Else, you'll need to copy /home/hadoop/robert/large_jar.jar to HDFS and then specify hdfs://host:port/path/to/large_jar.jar.
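
 For reference, registering a jar that has already been copied to HDFS as a LocalResource
might look roughly like the sketch below. This is a hedged illustration, not code from the
thread: the helper name and the "large_jar.jar" link name are made up, and it assumes the
Hadoop 2.2-era YARN client API (Records, ConverterUtils, LocalResource):

```java
import java.util.Collections;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.api.records.LocalResourceType;
import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
import org.apache.hadoop.yarn.util.ConverterUtils;
import org.apache.hadoop.yarn.util.Records;

public class JarResourceHelper {

    // Build a LocalResource entry for a jar that already lives on HDFS.
    // The size and timestamp must match the file on HDFS exactly, or the
    // NodeManager will refuse to localize it.
    public static Map<String, LocalResource> jarResource(Configuration conf, Path hdfsJar)
            throws Exception {
        FileSystem fs = FileSystem.get(conf);
        FileStatus status = fs.getFileStatus(hdfsJar);

        LocalResource jar = Records.newRecord(LocalResource.class);
        jar.setResource(ConverterUtils.getYarnUrlFromPath(status.getPath()));
        jar.setSize(status.getLen());
        jar.setTimestamp(status.getModificationTime());
        jar.setType(LocalResourceType.FILE);
        jar.setVisibility(LocalResourceVisibility.APPLICATION);

        // The map key is the link name the container sees in its working directory.
        return Collections.singletonMap("large_jar.jar", jar);
    }
}
```

 The returned map would then be passed to the ApplicationMaster's container via
ContainerLaunchContext.setLocalResources(), and YARN localizes the file onto whichever
node the container runs on.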


On Dec 1, 2013, at 12:03 PM, Robert Metzger <metrobert@gmail.com> wrote:

> Hello,
> I'm currently writing code to run my application using Yarn (Hadoop 2.2.0).
> I used this code as a skeleton: https://github.com/hortonworks/simple-yarn-app
> Everything works fine on my local machine or on a cluster with shared directories,
> but when I want to access resources outside of commonly accessible locations, my application fails.
> I have my application in a large jar file, containing everything (Submission Client,
Application Master, and Workers). 
> The submission client registers the large jar file as a local resource for the Application
master's context.
> In my understanding, Yarn takes care of transferring the client-local resources to the
application master's container.
> This is also stated here: http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
> You can use the LocalResource to add resources to your application request. This will
cause YARN to distribute the resource to the ApplicationMaster node.
> If I'm starting my jar from the dir "/home/hadoop/robert/large_jar.jar", I'll get the
following error from the nodemanager (another node in the cluster):
> 2013-12-01 20:13:00,810 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
Failed to download rsrc { { file:/home/hadoop/robert/large_jar.jar, ..
> So it seems as if this node tries to access the file from its local file system.
> Do I have to use another "protocol" for the file, something like "file://host:port/home/blabla"?
> Is it true that Yarn is able to distribute files (without using HDFS, obviously)?
> The distributedshell-example suggests that I have to use HDFS: https://github.com/apache/hadoop-common/blob/50f0de14e377091c308c3a74ed089a7e4a7f0bfe/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java
> Sincerely,
> Robert

Arun C. Murthy
Hortonworks Inc.

NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.
