hadoop-common-user mailing list archives

From Peeyush Bishnoi <peeyu...@yahoo-inc.com>
Subject Re: Using NFS without HDFS
Date Fri, 11 Apr 2008 12:43:53 GMT
Hello,

To execute a Hadoop Map-Reduce job, the input data should be on HDFS, not
on NFS.
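For example, a minimal sketch of moving the input onto HDFS and re-running
the job (assuming the 0.15.x shell syntax; this also assumes fs.default.name
points at a running NameNode as host:port, rather than "local"):

    bin/hadoop dfs -put /home/slitz/warehouse/input input
    bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'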

Thanks

---
Peeyush



On Fri, 2008-04-11 at 12:40 +0100, slitz wrote:

> Hello,
> I'm trying to assemble a simple setup of 3 nodes using NFS as the
> distributed filesystem.
> 
> Box A: 192.168.2.3, this box is both the NFS server and a slave node
> Box B: 192.168.2.30, this box is only the JobTracker
> Box C: 192.168.2.31, this box is only a slave
> 
> Obviously all three nodes can access the NFS share, and the path to the
> share is /home/slitz/warehouse on all three.
> 
> My hadoop-site.xml file was copied to all nodes and looks like this:
> 
> <configuration>
>   <property>
>     <name>fs.default.name</name>
>     <value>local</value>
>     <description>
>       The name of the default file system. Either the literal string
>       "local" or a host:port for NDFS.
>     </description>
>   </property>
>   <property>
>     <name>mapred.job.tracker</name>
>     <value>192.168.2.30:9001</value>
>     <description>
>       The host and port that the MapReduce job tracker runs at. If
>       "local", then jobs are run in-process as a single map and reduce
>       task.
>     </description>
>   </property>
>   <property>
>     <name>mapred.system.dir</name>
>     <value>/home/slitz/warehouse/hadoop_service/system</value>
>     <description>omgrotfcopterlol.</description>
>   </property>
> </configuration>
> 
> 
> As one can see, I'm not using HDFS at all
> (because all the free space I have is located on only one node, so using
> HDFS would be unnecessary overhead).
> 
> I've copied the input folder from the Hadoop directory to
> /home/slitz/warehouse/input. When I try to run the example line
> 
> bin/hadoop jar hadoop-*-examples.jar grep /home/slitz/warehouse/input/
> /home/slitz/warehouse/output 'dfs[a-z.]+'
> 
> the job starts and finishes okay, but at the end I get this error:
> 
> org.apache.hadoop.mapred.InvalidInputException: Input path doesn't exist :
> /home/slitz/hadoop-0.15.3/grep-temp-141595661
> at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputFormat.java:154)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:508)
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:753)
> (...the error stack continues...)
> 
> I don't know why the input path being looked up is under the local path
> /home/slitz/hadoop(...) instead of /home/slitz/warehouse/(...)
> 
> Maybe something is missing in my hadoop-site.xml?
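> 
> Perhaps something like this? (just a guess on my part; I'm assuming
> hadoop.tmp.dir is what controls where the grep-temp-* directory ends up,
> and the tmp path below is made up):
> 
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value>/home/slitz/warehouse/hadoop_service/tmp</value>
> </property>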
> 
> 
> 
> slitz
