hadoop-mapreduce-user mailing list archives

From John Lilley <john.lil...@redpoint.net>
Subject efficiency of LocalResources and archives
Date Thu, 06 Jun 2013 20:10:54 GMT
Suppose I have a large archive in HDFS, say 500 files totaling 4 GB, and I want to make it
available as a YARN LocalResource.  The archive doesn't change very often (maybe once per
month).  Will YARN optimize for this?  Does the expanded per-node cache persist across
application runs, using something like the modification time to decide whether re-expansion
is needed?

If the archive is instead re-expanded on each node every time the app is launched, should I
raise the archive's replication factor to reduce cross-rack traffic?
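
To make the caching question concrete, here is a minimal sketch of the behavior being asked
about: a per-node cache keyed by archive path plus modification time, so an unchanged archive
is expanded once and reused. This is a hypothetical model only, not YARN's localizer; the
class, the method names, and the example path are all invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a per-node archive cache. NOT YARN code.
class ArchiveCacheSketch {
    // Maps "path@modificationTime" to the local directory holding the expanded archive.
    private final Map<String, String> expandedDirs = new HashMap<>();
    private int expansions = 0;

    // Returns the local directory for the archive, expanding only on a cache miss.
    String localize(String path, long modificationTime) {
        String key = path + "@" + modificationTime;
        return expandedDirs.computeIfAbsent(key, k -> {
            expansions++; // stands in for the costly download and unpack
            return "/local/cache/" + expansions; // assumed local layout
        });
    }

    // Number of times an archive actually had to be expanded.
    int expansionCount() {
        return expansions;
    }

    public static void main(String[] args) {
        ArchiveCacheSketch cache = new ArchiveCacheSketch();
        cache.localize("hdfs:///apps/data.zip", 1000L); // first run: expands
        cache.localize("hdfs:///apps/data.zip", 1000L); // second run: cache hit
        cache.localize("hdfs:///apps/data.zip", 2000L); // archive changed: expands again
        System.out.println(cache.expansionCount());
    }
}
```

In this model a monthly update costs one re-expansion per node, and every other launch is a
cache hit. The sketch never evicts stale entries, which a real cache would need to do.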

