hadoop-common-user mailing list archives

From Allen Wittenauer ...@yahoo-inc.com>
Subject Re: jar files on NFS instead of DistributedCache
Date Mon, 21 Apr 2008 23:59:34 GMT
On 4/21/08 2:18 PM, "Ted Dunning" <tdunning@veoh.com> wrote:
> I agree with the "fair and balanced" part.  I always try to keep my clusters
> fair and balanced!
> 
> Joydeep should mention his background.  In any case, I agree that high-end
> filers may provide good enough NFS service, but I would also contend that
> HDFS has been better for me than NFS from generic servers.

    We take a mixed approach to the NFS problem.

    For grids that have some sort of service level agreement associated with
them, we do not allow NFS connections.  The jobs must be reasonably
self-contained.

    For other grids (research, development, etc), we do allow NFS
connections and hope that people don't do stupid things.

    It is probably worth pointing out that it is much easier for a user to
do stupid things with, say, 500 nodes than with 5.  So we take a much more
conservative view for "grids we care about".

    As Joydeep said, the implementation of the NFS stack does make a huge
difference.  NetApp and Sun are leaps and bounds better than most.  Linux
has made great strides forward, but I'd be leery of using it for the sorts
of workloads we have.

