mahout-user mailing list archives

From Ted Dunning <>
Subject Re: TU Berlin Winter of Code Project
Date Fri, 06 Nov 2009 19:57:27 GMT
The question that I don't see addressed is whether you will use a fully
streaming approach, as is done in Bixo, or a document repository approach,
as is more common in search engines.

HBase is reputedly ready enough to serve as a document repository. Using
such an approach would be very helpful for the incremental nature of web

What is the plan in this regard?

On Fri, Nov 6, 2009 at 11:47 AM, Grant Ingersoll <> wrote:

> This is obviously only a first draft of what we think would be a suited
> overall
> architecture

Ted Dunning, CTO
