hadoop-yarn-issues mailing list archives

From "Chris Trezzo (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1492) truly shared cache for jars (jobjar/libjar)
Date Tue, 18 Feb 2014 19:32:28 GMT

    [ https://issues.apache.org/jira/browse/YARN-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904451#comment-13904451 ]

Chris Trezzo commented on YARN-1492:
------------------------------------

Thanks for the comments [~jlowe]! [~sjlee0] and I will update the doc to incorporate the helpful
feedback. Specific comments are in-line.

bq. The public localizer will only localize files that are publicly available, however the
staging directory is not publicly available. Clients must upload publicly localized files
elsewhere in order for that to work, but files outside of the staging directory won't be automatically
cleaned when the job exits.

Good point. I will update the doc. One way to address this is to modify the conditions for
uploading a resource to the shared cache: the resource either has to be publicly readable
or owned by the user requesting the localization. The latter condition would require strong
authentication to run securely; the staging directory example would fall into this category.
I am not extremely familiar with the security part of the code base, but I will look into
this more and update the document.
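
To make the condition concrete, here is a rough sketch of the check (class and method names
are illustrative, not from the doc; note the real public-ness test would also need to verify
that all ancestor directories are world-executable, which is omitted here):

{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsAction;

public class SharedCacheEligibility {
  /** A resource may be uploaded to the shared cache if it is
   *  world-readable or owned by the user requesting the localization. */
  public static boolean isEligible(FileSystem fs, Path resource,
      String requestingUser) throws IOException {
    FileStatus status = fs.getFileStatus(resource);
    boolean publiclyReadable =
        status.getPermission().getOtherAction().implies(FsAction.READ);
    // Ownership alone is only safe under strong (Kerberos) authentication.
    boolean ownedByRequester = status.getOwner().equals(requestingUser);
    return publiclyReadable || ownedByRequester;
  }
}
{code}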

bq. There's a race between the NM uploading the file to the shared cache area and the local
dist cache cleaner removing the local file.

Since uploading a resource to the shared cache is asynchronous and not required for job submission/execution
to succeed, adding the resource to the cache can be best-effort. If the localization service
loses the race with the cache cleaner, then the resource simply won't make it into the cache.
Does that sound reasonable?
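
In code terms, the upload path could look roughly like this (the uploadToSharedCache/notifySCM
hooks are hypothetical, not from the doc):

{code:java}
import java.io.FileNotFoundException;
import java.io.IOException;

import org.apache.hadoop.fs.Path;

public abstract class BestEffortUploader {
  void uploadBestEffort(Path localizedPath, String checksum) {
    try {
      uploadToSharedCache(localizedPath, checksum);
      notifySCM(checksum);
    } catch (FileNotFoundException e) {
      // Lost the race with the local dist cache cleaner; the resource
      // simply won't make it into the shared cache this time.
    } catch (IOException e) {
      // Any other failure is equally non-fatal; the running job never
      // depends on the upload succeeding.
    }
  }

  abstract void uploadToSharedCache(Path p, String checksum) throws IOException;
  abstract void notifySCM(String checksum) throws IOException;
}
{code}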

bq. How parallel will the NM upload process be – is it serially uploading the resources
for each container and between containers?

This is a good question. One option would be to make this tunable using a thread pool. The
important point is that, because the NM upload process is asynchronous and not critical for
application execution, the degree of parallelism becomes an implementation detail.
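
For example, something along these lines (the property name and default are made up for
illustration):

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.hadoop.conf.Configuration;

public class SharedCacheUploadPool {
  private final ExecutorService pool;

  public SharedCacheUploadPool(Configuration conf) {
    // Tunable degree of parallelism; correctness never depends on it.
    int threads = conf.getInt("yarn.sharedcache.nm.uploader.thread-count", 20);
    this.pool = Executors.newFixedThreadPool(threads);
  }

  public void submitUpload(Runnable upload) {
    pool.submit(upload); // fire-and-forget
  }
}
{code}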

bq. Is the cleaner running as part of the SCM? If so I don't think it necessary to store the
cleaner flag in the persisted state, and that would be a bit less traffic to the store while
cleaning.

Agreed. I will update the doc.

bq. It might be nice to provide a simpler store setup for the SCM for smaller clusters or
those not already using ZK for other things (e.g., HA). Something like a leveldb store or simple
local filesystem storage would suffice since those don't require separate setup.

Agreed. The plan is to have a very clean interface between the storage mechanism and the rest
of the SCM. This will allow us to support multiple store implementations, and we can definitely
add a simpler store.
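
Very roughly, the boundary could be an interface like the following (method names are
illustrative), so that a ZK-backed store, a leveldb store, or a simple local-filesystem store
can be dropped in without touching the rest of the SCM:

{code:java}
public interface SCMStore {
  void addResource(String checksum, String fileName);
  void removeResource(String checksum);
  void addResourceReference(String checksum, String appId);
  void removeResourceReference(String checksum, String appId);
  boolean isResourceReferenced(String checksum);
}
{code}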

bq. The cleaner should handle files that are orphaned in the cache if the NM fails to complete
the upload. Could use a timeout based on the file timestamp or other mechanisms to accomplish
this.

Agreed. I will make this more explicit in the doc. The design is that the cleaner service
iterates over all entries in HDFS, not just the entries the SCM knows about. This ensures
that orphaned files/entries are handled by the cleaner service. The modification time of the
entry directory in HDFS can be used by the cleaner service to determine staleness.
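
A sketch of that scan (all names illustrative):

{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CleanerScan {
  /** Walk every entry directory under the cache root, not just entries the
   *  SCM knows about, so orphans from failed NM uploads are covered. */
  public static void scan(FileSystem fs, Path cacheRoot, long staleMillis)
      throws IOException {
    long now = System.currentTimeMillis();
    for (FileStatus entry : fs.listStatus(cacheRoot)) {
      if (entry.isDirectory()
          && now - entry.getModificationTime() > staleMillis) {
        // Stale candidate; still subject to an in-use check with the SCM.
        evaluateForRemoval(entry.getPath());
      }
    }
  }

  private static void evaluateForRemoval(Path entryDir) { /* ... */ }
}
{code}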

bq. What criteria will clients use to decide if files are public? As-is this doesn't seem
to address the original goals of the JIRA since hardly anything is declared public unless
already in a well-known place in HDFS today. I'd like the design to also state any proposed
changes to the behavior of the job submitter's handling of the dist cache during job submission
if there are any.

I will add a section to the document about application-specific changes to MapReduce. The
shared cache API allows an application to add resources to the shared cache on a per-resource
basis, which should allow for any application-level resource caching policy. That said, we
can elaborate more on how MapReduce will specifically support the shared cache.
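
As a strawman, MapReduce job submission could consult the cache per resource and fall back
to the normal staging-dir upload on a miss (all names below are illustrative, not a final API):

{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.Path;

public abstract class PerResourceCaching {
  /** Returns the path the job should localize: the shared copy on a hit,
   *  or the usual staging-dir upload on a miss. */
  Path resolve(Path localJar, String appId) throws IOException {
    String checksum = checksumOf(localJar);
    Path cached = useSharedCacheResource(appId, checksum); // null on miss
    return cached != null ? cached : uploadToStagingDir(localJar);
  }

  abstract String checksumOf(Path p) throws IOException;
  abstract Path useSharedCacheResource(String appId, String checksum)
      throws IOException;
  abstract Path uploadToStagingDir(Path p) throws IOException;
}
{code}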

bq. Nit: It should be made clearer that the client cannot notify the SCM that an application
is not using a resource until the application has completed, or we risk the cleaner removing
the resource while it is still in use by the application. The client protocol steps read as
if the client can submit and then immediately notify the SCM if desired.

Thanks. I will clarify this.
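
In other words, the release call has to be tied to application completion, roughly like this
(illustrative names):

{code:java}
import java.io.IOException;
import java.util.List;

public abstract class ReleaseOnCompletion {
  /** Invoked only once the application has finished; releasing earlier
   *  would let the cleaner reclaim a resource that is still in use. */
  void onApplicationFinished(String appId, List<String> usedChecksums)
      throws IOException {
    for (String checksum : usedChecksums) {
      releaseFromSCM(appId, checksum);
    }
  }

  abstract void releaseFromSCM(String appId, String checksum) throws IOException;
}
{code}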

> truly shared cache for jars (jobjar/libjar)
> -------------------------------------------
>
>                 Key: YARN-1492
>                 URL: https://issues.apache.org/jira/browse/YARN-1492
>             Project: Hadoop YARN
>          Issue Type: New Feature
>    Affects Versions: 2.0.4-alpha
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>         Attachments: shared_cache_design.pdf, shared_cache_design_v2.pdf, shared_cache_design_v3.pdf,
>                      shared_cache_design_v4.pdf, shared_cache_design_v5.pdf
>
>
> Currently there is the distributed cache that enables you to cache jars and files so
> that attempts from the same job can reuse them. However, sharing is limited with the distributed
> cache because it is normally on a per-job basis. On a large cluster, sometimes copying of
> jobjars and libjars becomes so prevalent that it consumes a large portion of the network bandwidth,
> not to mention defeating the purpose of "bringing compute to where data is". This is wasteful
> because in most cases code doesn't change much across many jobs.
> I'd like to propose and discuss the feasibility of introducing a truly shared cache so that
> multiple jobs from multiple users can share and cache jars. This JIRA is to open the discussion.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
