hadoop-common-user mailing list archives

From "Miles Osborne" <mi...@inf.ed.ac.uk>
Subject Re: Hadoop for computationally intensive tasks (no data)
Date Thu, 04 Sep 2008 17:17:05 GMT
have a look at the various machine learning applications of MapReduce:
they do lots of computation, and here the "data" corresponds to
intermediate values being used to update counts etc.
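to make that concrete, here is a minimal sketch (not from this thread, and
not using the actual Hadoop API) of how a compute-only job fits the
map-reduce pattern: the "input" is nothing but task ids, each mapper does
independent number crunching (Monte Carlo sampling for pi, chosen purely
as an illustration), and the reducer aggregates the partial counts. The
function names and the local driver loop are all hypothetical stand-ins
for what a Hadoop Streaming mapper/reducer pair would do:

```python
import random

def mapper(task_id, samples_per_task=10000):
    """Each map task receives only a task id as 'input'; the real work
    is pure computation -- here, Monte Carlo sampling for pi."""
    rng = random.Random(task_id)  # seed deterministically from the task id
    hits = sum(1 for _ in range(samples_per_task)
               if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    # Emit a key plus partial counts, as a streaming mapper would.
    yield ("pi", hits, samples_per_task)

def reducer(key, values):
    """Sum the partial counts from every mapper and finish the estimate."""
    hits = sum(h for h, _ in values)
    total = sum(n for _, n in values)
    return key, 4.0 * hits / total

# Local simulation of the job: the "input file" is just task ids 0..9.
partials = [rec for tid in range(10) for rec in mapper(tid)]
key, estimate = reducer("pi", [(h, n) for _, h, n in partials])
print(key, estimate)
```

on a real cluster you would put one task id per input line, so Hadoop
splits the ids across mappers and the computation is distributed even
though there is essentially no data to read.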

bedtime reading:

Mahout: (machine learning under Hadoop)

http://lucene.apache.org/mahout/

some machine learning papers:

Fully Distributed EM for Very Large Datasets.
Jason Wolfe, Aria Haghighi and Dan Klein

www.cs.berkeley.edu/~aria42/pubs/icml08-distributedem.pdf

another one:

www.cs.stanford.edu/people/ang//papers/nips06-mapreducemulticore.pdf

Miles

2008/9/4 Tenaali Ram <tenaaliram@gmail.com>:
> Hi,
>
> I am new to Hadoop. What I have understood so far is: Hadoop is used to
> process huge data using the map-reduce paradigm.
>
> I am working on a problem where I need to perform a large number of
> computations, most of which can be done independently of each other (so
> I think each mapper can handle one or more such computations). However,
> there is no data involved; it's just a number-crunching job. Is it
> suited for Hadoop?
>
> Has anyone used Hadoop for merely number crunching? If yes, how should I
> define the input for the job and ensure that the computations are
> distributed to all nodes in the grid?
>
> Thanks,
> Tenaali
>



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
