Mailing-List: contact core-user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: core-user@hadoop.apache.org
Received-SPF: pass (nike.apache.org: domain of lists@nabble.com designates
 216.139.236.158 as permitted sender)
Message-ID: <22733399.post@talk.nabble.com>
Date: Thu, 26 Mar 2009 16:38:17 -0700 (PDT)
From: Sid123 <itissid@gmail.com>
To: core-user@hadoop.apache.org
Subject: How many nodes does one man want?
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit


Hi,
I am working on implementing some machine learning algorithms using MapReduce.
Suppose I have data that takes 5-6 hours to train on a single ordinary
machine. Will adding 2-3 more nodes make a noticeable difference? I read the
following in the Yahoo! Hadoop tutorial:
"Executing Hadoop on a limited amount of data on a small number of nodes may
not demonstrate particularly stellar performance as the overhead involved in
starting Hadoop programs is relatively high. Other parallel/distributed
programming paradigms such as MPI (Message Passing Interface) may perform
much better on two, four, or perhaps a dozen machines."

I have at my disposal 3 laptops, each with 4 GB of RAM and 150 GB of hard disk
space, and I have 600 MB of training data.
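
To make the trade-off in the quoted passage concrete, here is a rough
back-of-the-envelope sketch in Python. It is an Amdahl-style estimate, not a
measurement: the 5.5 hour baseline is the midpoint of the 5-6 hours mentioned
above, while the parallel fraction (0.9) and the fixed Hadoop overhead (0.25
hours) are assumed values chosen only for illustration. The function names are
hypothetical as well.

```python
def estimated_hours(serial_hours, nodes, parallel_fraction, overhead_hours):
    """Amdahl-style estimate of wall-clock time on `nodes` machines.

    The non-parallel part of the job is unchanged, the parallel part is
    divided evenly across nodes, and a fixed overhead (job startup,
    data shuffling, coordination) is added on top.
    """
    if nodes < 1:
        raise ValueError("nodes must be >= 1")
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_part = serial_hours * (1.0 - parallel_fraction)
    parallel_part = serial_hours * parallel_fraction / nodes
    return serial_part + parallel_part + overhead_hours


def speedup(serial_hours, nodes, parallel_fraction, overhead_hours):
    """Ratio of the single-machine time to the estimated distributed time."""
    return serial_hours / estimated_hours(
        serial_hours, nodes, parallel_fraction, overhead_hours
    )


if __name__ == "__main__":
    BASELINE_HOURS = 5.5      # midpoint of the 5-6 hours from the post
    PARALLEL_FRACTION = 0.9   # assumption: 90% of the work parallelizes
    OVERHEAD_HOURS = 0.25     # assumption: fixed cost of running under Hadoop

    for nodes in (1, 2, 3, 4):
        hours = estimated_hours(
            BASELINE_HOURS, nodes, PARALLEL_FRACTION, OVERHEAD_HOURS
        )
        ratio = speedup(
            BASELINE_HOURS, nodes, PARALLEL_FRACTION, OVERHEAD_HOURS
        )
        print(f"{nodes} node(s): ~{hours:.2f} h, speedup ~{ratio:.2f}x")
```

Under these assumptions the fixed overhead matters little for a job that runs
for hours, but the estimate is only as good as the guessed parallel fraction
and overhead; measuring a small run on the actual laptops would be more
reliable.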