hadoop-common-user mailing list archives

From "Mikhail Bautin" <mbau...@gmail.com>
Subject jar files on NFS instead of DistributedCache
Date Fri, 18 Apr 2008 22:02:49 GMT

We are using Hadoop here at Stony Brook University to power the
next-generation text analytics backend for www.textmap.com.  We also have an
NFS partition that is mounted on all machines of our 100-node cluster.  I
found it much more convenient to store manually created files (e.g.
configuration) on the NFS partition and just use them from my mappers and
reducers rather than copying them to HDFS every time I change them, which is
necessary when using DistributedCache.  Is there a way to do the same for
jar files?

Specifically, I just need a way to alter the child JVM's classpath via
JobConf, without having the framework copy anything in and out of HDFS,
because all my files are already accessible from all nodes.  I see how to do
that by adding a couple of lines to TaskRunner's run() method, e.g.:


or something similar.  Is there already such a feature or should I just go
ahead and implement it?
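
A minimal sketch of the idea described above, not the actual TaskRunner change: it shows only how extra classpath entries could be read from the job configuration and appended to the child JVM's classpath string. It makes several assumptions. The property name mapred.child.extra.classpath is hypothetical and is not an existing Hadoop setting, java.util.Properties stands in for JobConf so the example runs without Hadoop, and the /nfs/shared paths are placeholders.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class ExtraClasspathSketch {
    // Hypothetical property name, invented for this sketch.
    static final String EXTRA_CLASSPATH_KEY = "mapred.child.extra.classpath";

    /**
     * Returns the base classpath followed by the comma-separated entries
     * found under EXTRA_CLASSPATH_KEY, joined with the platform's path
     * separator. Nothing is copied; the entries are assumed to be readable
     * at the same location on every node (e.g. a shared NFS mount).
     */
    static String buildChildClasspath(String baseClasspath, Properties jobConf) {
        List<String> entries = new ArrayList<>();
        if (baseClasspath != null && !baseClasspath.isEmpty()) {
            entries.add(baseClasspath);
        }
        String extra = jobConf.getProperty(EXTRA_CLASSPATH_KEY, "");
        for (String entry : extra.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                entries.add(trimmed);
            }
        }
        return String.join(File.pathSeparator, entries);
    }

    public static void main(String[] args) {
        // Properties stands in for JobConf; the paths are placeholders.
        Properties jobConf = new Properties();
        jobConf.setProperty(EXTRA_CLASSPATH_KEY,
                "/nfs/shared/lib/analytics.jar, /nfs/shared/conf");
        System.out.println(buildChildClasspath("job.jar", jobConf));
    }
}
```

In a real patch the resulting string would presumably be passed to the child JVM's -classpath argument where the task runner assembles its command line; that integration is not shown here.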


Mikhail Bautin
