mahout-user mailing list archives

From Sreejith S <srssreej...@gmail.com>
Subject Re: Huge classification engine
Date Fri, 01 Apr 2011 07:37:37 GMT
Mahout can handle very large data sets. From personal experience:
yesterday I ran Mahout classification on 400,000 reviews, and it
took only 10-15 minutes. I don't expect a huge data set to be a
problem, since Mahout is scalable.
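
A rough back-of-the-envelope check of those numbers, as a sketch only. It
assumes the whole 10-15 minutes went into classification and that one review
costs about as much to classify as one page; neither was measured, and the
function and constant names are just illustrative.

```python
def throughput_per_second(items: int, minutes: float) -> float:
    """Average number of items processed per second over a run."""
    return items / (minutes * 60.0)


REVIEWS = 400_000         # size of the run described in this message
REQUIRED_PER_SECOND = 50  # target mentioned earlier in the thread

fastest = throughput_per_second(REVIEWS, 10)  # if the run took 10 minutes
slowest = throughput_per_second(REVIEWS, 15)  # if the run took 15 minutes

print(f"{slowest:.0f} to {fastest:.0f} reviews per second")
print("meets the 50/s target:", slowest >= REQUIRED_PER_SECOND)
```

Under those assumptions, even the slower end of the range is well above
50 per second.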


On Fri, Apr 1, 2011 at 2:26 AM, Ted Dunning <ted.dunning@gmail.com> wrote:

> This will be the easiest part if you parallelize the parsing and
> tokenization.  The classifier will be able to handle hundreds of pages per
> second per machine.
>
> On Thu, Mar 31, 2011 at 11:31 AM, Martin Provencher <mprovencher86@gmail.com> wrote:
>
> > For now, we need to be able to classify at least 50 pages per second, but
> > we intend to increase that number a lot (hadoop will be useful for that).
> >
>



-- 
*********************************
Sreejith.S

http://sreejiths.emurse.com/
http://srijiths.wordpress.com/
tweet2sree@twitter

*********************************
ILUGCBE
http://ilugcbe.techstud.org
