hadoop-common-user mailing list archives

From "Scott Simpson" <Scott.Simp...@computer.org>
Subject Re: Confusion about the Hadoop conf/slaves file
Date Tue, 11 Apr 2006 00:27:17 GMT
Doug Cutting wrote:

>Scott Simpson wrote:
>> Suppose I want to run Nutch 0.8 searches on separate machines than I 
>> crawl on. Is there a way to separate this so my crawling operation
>> (MapReduce) doesn't happen on my search machines?

>You could have two different configuration directories and set
>HADOOP_CONF_DIR (or use cd).
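
For what it's worth, Doug's suggestion amounts to something like the
following sketch. The directory names (conf-crawl, conf-search) and host
names are invented for illustration; only which hosts appear in each
conf/slaves file differs between the two directories:

```shell
# Sketch only: two Hadoop configuration directories whose conf/slaves
# files list different sets of machines.
mkdir -p conf-crawl conf-search

# Crawl daemons run on just two nodes (hypothetical host names):
printf 'node1\nnode2\n' > conf-crawl/slaves

# Search spans all five nodes:
printf 'node1\nnode2\nnode3\nnode4\nnode5\n' > conf-search/slaves

# Pick a role before starting the daemons:
export HADOOP_CONF_DIR="$PWD/conf-crawl"
# bin/start-all.sh   # would now read $HADOOP_CONF_DIR/slaves
```

The start script is left commented out since it only applies on a real
cluster; the point is that the slave list is chosen per configuration
directory, not per job.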

Excuse my ignorance on this issue. Say I have 5 machines in my Hadoop
cluster and I list only two of them in the configuration file when I do a
"fetch" or a "generate". Won't this store the data on just those two nodes,
since they are all I've listed as my crawling machines? I'm trying to crawl
on two but store my data across all five.
