hadoop-common-user mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Running on multiple CPU's
Date Mon, 16 Apr 2007 16:41:24 GMT

Eelco Lempsink wrote:
> Inspired by 
> http://www.mail-archive.com/nutch-user@lucene.apache.org/msg02394.html 
> I'm trying to run Hadoop on multiple CPU's, but without using HDFS.

To be clear: you need some sort of shared filesystem, if not HDFS, then 
NFS, S3, or something else.  For example, the job client interacts with 
the job tracker by copying files to the shared filesystem named by 
fs.default.name, and job inputs and outputs are assumed to come from a 
shared filesystem.

So, if you're using NFS, then you'd set fs.default.name to something 
like "file:///mnt/shared/hadoop/".  Note also that as your cluster 
grows, NFS will soon become a bottleneck.  That's why HDFS is provided: 
there aren't other readily available shared filesystems that scale 
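
As an illustration of the NFS case described above, the setting might look like the following in a site configuration file. This is only a sketch: the file name hadoop-site.xml is an assumption about the release in use, the mount point is the example path from the message, and every node is assumed to have the same NFS export mounted at that same path.

```xml
<?xml version="1.0"?>
<!-- Assumed location: conf/hadoop-site.xml on every node and on the job client. -->
<configuration>
  <property>
    <!-- Names the shared filesystem used by the job client and job tracker. -->
    <name>fs.default.name</name>
    <!-- Example path from the message; must resolve to the same NFS mount on all nodes. -->
    <value>file:///mnt/shared/hadoop/</value>
  </property>
</configuration>
```

Job input and output paths would then also need to live under a directory that all nodes can reach through that mount.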

