mahout-user mailing list archives

From Isabel Drost <isa...@apache.org>
Subject Re: Load Dataset and Instances from database
Date Fri, 25 Nov 2011 12:46:02 GMT
On 24.11.2011 Ted Dunning wrote:
> Actually, one of the most reliable ways to kill a database is to use it as
> input or output for even a small Hadoop cluster.  Having hundreds of
> processes all open connections and read at once is fairly abusive.

Though that does not mean that data cannot be synced to HDFS before being used 
in a map/reduce job. Tools like Sqoop help with that.

Isabel
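
A rough sketch of the sync step suggested above, added for illustration and not
part of the original message: copy the table into HDFS once with a small, fixed
number of parallel readers, then point the map/reduce job at the HDFS copy
instead of the database. The JDBC URL, table name, user name and target
directory are hypothetical placeholders, and the helper name
build_sqoop_import is an assumption. The Sqoop options used (--connect,
--username, -P, --table, --target-dir, --num-mappers) are standard options of
"sqoop import".

```python
import shlex


def build_sqoop_import(jdbc_url, table, target_dir, username, num_mappers=4):
    """Return the argument list for a one-off Sqoop import into HDFS.

    num_mappers caps how many parallel connections Sqoop opens against the
    database, which keeps the load bounded regardless of cluster size.
    """
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "-P",  # prompt for the password instead of putting it on the command line
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(num_mappers),
    ]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    cmd = build_sqoop_import(
        jdbc_url="jdbc:mysql://db.example.com/shop",
        table="ratings",
        target_dir="/data/ratings",
        username="etl",
        num_mappers=4,
    )
    # Print the command for inspection; run it with subprocess.run(cmd, check=True)
    # on a machine where Sqoop is installed.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The point of the fixed mapper count is that the database sees a handful of
connections during the import, and none at all while the actual job runs.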
