hadoop-hdfs-user mailing list archives

From Robert Metzger <metrob...@gmail.com>
Subject YARN: LocalResources and file distribution
Date Sun, 01 Dec 2013 20:03:46 GMT

I'm currently writing code to run my application on YARN (Hadoop 2.2.0).
I used this code as a skeleton:

Everything works fine on my local machine and on a cluster with shared
directories, but when I want to access resources outside of commonly
accessible locations, my application fails.

I have my application in a large jar file, containing everything
(Submission Client, Application Master, and Workers).
The submission client registers the large jar file as a local resource for
the Application master's context.
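Roughly, the registration code looks like this (a simplified sketch of what I do with the Hadoop 2.2 YARN API; class and path names are placeholders):

```java
import java.util.Collections;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.api.records.LocalResourceType;
import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
import org.apache.hadoop.yarn.util.ConverterUtils;
import org.apache.hadoop.yarn.util.Records;

public class JarRegistration {
    // Register the client-local jar as a LocalResource on the AM's
    // ContainerLaunchContext. Sketch only; error handling omitted.
    static void addJar(ContainerLaunchContext amContainer, Configuration conf)
            throws Exception {
        Path jarPath = new Path("/home/hadoop/robert/large_jar.jar");
        // For a plain path this resolves against the client's local filesystem
        FileSystem fs = jarPath.getFileSystem(conf);
        FileStatus status = fs.getFileStatus(jarPath);

        LocalResource jar = Records.newRecord(LocalResource.class);
        jar.setResource(ConverterUtils.getYarnUrlFromPath(jarPath));
        jar.setSize(status.getLen());
        jar.setTimestamp(status.getModificationTime());
        jar.setType(LocalResourceType.FILE);
        jar.setVisibility(LocalResourceVisibility.APPLICATION);

        amContainer.setLocalResources(
                Collections.singletonMap("large_jar.jar", jar));
    }
}
```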

In my understanding, Yarn takes care of transferring the client-local
resources to the application master's container.
This is also stated here:

> You can use the LocalResource to add resources to your application request.
> This will cause YARN to distribute the resource to the ApplicationMaster
> node.

If I'm starting my jar from the dir "/home/hadoop/robert/large_jar.jar",
I'll get the following error from the NodeManager (another node in the
cluster):

> 2013-12-01 20:13:00,810 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
> Failed to download rsrc { { file:/home/hadoop/robert/large_jar.jar, ..

So it seems as if this node tries to access the file from its local file
system.
Do I have to use another "protocol" for the file, something like
"file://host:port/home/blabla" ?

Is it true that YARN is able to distribute files by itself (i.e., without
going through HDFS)?
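Looking at the URI from the log, a plain "file:" URI carries no host at all, which is presumably why each NodeManager resolves the path against its own disk. A small stdlib-only check of what I mean:

```java
import java.net.URI;

public class UriCheck {
    public static void main(String[] args) throws Exception {
        // The URI from the NodeManager log: scheme "file", but no
        // host/authority, so every node resolves the path locally.
        URI u = new URI("file:/home/hadoop/robert/large_jar.jar");
        System.out.println(u.getScheme()); // file
        System.out.println(u.getHost());   // null
        System.out.println(u.getPath());   // /home/hadoop/robert/large_jar.jar
    }
}
```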

The distributedshell-example suggests that I have to use HDFS:
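If that's the case, I guess the pattern would be: upload the jar to HDFS first, then register the HDFS path as the LocalResource. A sketch of what I have in mind (untested; the destination path and names are placeholders):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.api.records.LocalResourceType;
import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
import org.apache.hadoop.yarn.util.ConverterUtils;
import org.apache.hadoop.yarn.util.Records;

public class HdfsUpload {
    // Copy the client-local jar into HDFS and build a LocalResource
    // pointing at the HDFS copy, which every NodeManager can fetch.
    static LocalResource uploadJar(Configuration conf, String appId)
            throws Exception {
        FileSystem fs = FileSystem.get(conf); // default fs, i.e. HDFS
        Path src = new Path("/home/hadoop/robert/large_jar.jar");
        Path dst = new Path(fs.getHomeDirectory(), appId + "/large_jar.jar");
        fs.copyFromLocalFile(false, true, src, dst); // keep src, overwrite dst

        FileStatus status = fs.getFileStatus(dst);
        LocalResource jar = Records.newRecord(LocalResource.class);
        jar.setResource(ConverterUtils.getYarnUrlFromPath(dst));
        jar.setSize(status.getLen());
        jar.setTimestamp(status.getModificationTime());
        jar.setType(LocalResourceType.FILE);
        jar.setVisibility(LocalResourceVisibility.APPLICATION);
        return jar;
    }
}
```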

