hadoop-common-user mailing list archives

From "Miles Osborne" <mi...@inf.ed.ac.uk>
Subject Re: Hadoop for computationally intensive tasks (no data)
Date Thu, 04 Sep 2008 17:17:05 GMT
Have a look at the various machine-learning applications of MapReduce:
they do lots of computation, and there the "data" corresponds to
intermediate values being used to update counts etc.
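To make the pattern concrete, here is a minimal sketch (plain Python, not actual Hadoop) of the counting idiom those machine-learning jobs use: mappers emit partial counts, a shuffle groups them by key, and the reducer sums them into the intermediate values that update the model. The function names and the toy word-count payload are illustrative, not from any real job.

```python
from collections import defaultdict
from itertools import groupby

def mapper(record):
    # Illustrative "E-step": emit a (key, partial_count) pair per token.
    for token in record.split():
        yield (token, 1)

def reducer(key, values):
    # Sum all partial counts seen for one key.
    return (key, sum(values))

def run_job(records):
    # Simulated shuffle phase: sort mapper output and group by key,
    # as the Hadoop framework would do between map and reduce.
    pairs = sorted(kv for r in records for kv in mapper(r))
    return dict(reducer(k, (v for _, v in group))
                for k, group in groupby(pairs, key=lambda kv: kv[0]))
```

In a real Hadoop job the same two functions become the Mapper and Reducer classes, and the framework handles the sorting and grouping.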

bedtime reading:

Mahout (machine learning under Hadoop)

some machine learning papers:

Fully Distributed EM for Very Large Datasets.
Jason Wolfe, Aria Haghighi and Dan Klein

another one:


2008/9/4 Tenaali Ram <tenaaliram@gmail.com>:
> Hi,
> I am new to Hadoop. What I have understood so far is: Hadoop is used to
> process huge data sets using the map-reduce paradigm.
> I am working on a problem where I need to perform a large number of
> computations; most of them can be done independently of each other (so
> I think each mapper can handle one or more such computations). However,
> there is no data involved. It's just a number-crunching job. Is it suited
> for Hadoop?
> Has anyone used Hadoop for mere number crunching? If yes, how should I
> define the input for the job and ensure that the computations are
> distributed to all nodes in the grid?
> Thanks,
> Tenaali
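The usual answer to the question quoted above: make the input itself a list of task descriptions, one per line, so the framework splits the lines across mappers and each mapper runs one independent computation. This mirrors what Hadoop's bundled pi-estimation example does. A hedged sketch in plain Python (the helper names and the Monte Carlo payload are illustrative, not Hadoop API):

```python
import random

def make_input(num_tasks, samples_per_task):
    # Each "input record" is just a task spec, not real data;
    # in Hadoop these lines would be written to a file in HDFS.
    return [f"{task_id}\t{samples_per_task}" for task_id in range(num_tasks)]

def mapper(line):
    # Pure number crunching: Monte Carlo sampling for pi.
    task_id, samples = line.split("\t")
    rng = random.Random(int(task_id))  # deterministic per task
    inside = sum(rng.random() ** 2 + rng.random() ** 2 <= 1.0
                 for _ in range(int(samples)))
    return inside, int(samples)

def reducer(results):
    # Combine per-task (inside, total) counts into one estimate.
    inside = sum(i for i, _ in results)
    total = sum(n for _, n in results)
    return 4.0 * inside / total
```

With enough input lines (more than the number of nodes), the framework's normal split-and-schedule machinery spreads the tasks across the grid; no actual data distribution is needed beyond the tiny task file.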

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
