hadoop-common-user mailing list archives

From Eugeny N Dzhurinsky <b...@redwerk.com>
Subject Re: map/reduce and Lucene integration question
Date Thu, 13 Dec 2007 20:22:52 GMT
On Thu, Dec 13, 2007 at 11:31:49AM -0800, Ted Dunning wrote:
> After indexing, indexes are moved to multiple query servers.  The indexes on
> the local query servers are all on local disk.
> There are two dimensions to scaling search.  The first dimension is query
> rate.  To get that scaling, you simply replicate your basic search operator
> and balance using a simple load balancer.
> The second dimension is collection size.  If you have more than about 20
> million documents, you need to have several machines cooperate in a search.
> To scale in this dimension you have front end engines that do multi-searches
> against farms that each scale in the first dimension using load balancing.
> You need load balancing in front of your front end engines as well.
> With this architecture, you get good scaling in both queries per second and
> collection size and you maintain full HA.

Would it be correct to assume that the things you described above are out of
scope for Hadoop?
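
To check that I am reading the architecture correctly, here is a toy Python
sketch of the two scaling dimensions. Nothing in it is Hadoop or Lucene API:
the class names (Replica, Farm, FrontEnd), the in-memory dict standing in for
a local index, and the term-count scoring are all assumptions made purely for
illustration.

```python
import heapq
import itertools


class Replica:
    """One query server with a local copy of a shard (hypothetical stand-in:
    a dict of doc id -> text instead of a real on-disk index)."""

    def __init__(self, name, docs):
        self.name = name
        self.docs = docs
        self.served = 0  # number of queries this replica has answered

    def search(self, term, k):
        self.served += 1
        term = term.lower()
        # Toy scoring: how many times the term occurs in the document.
        hits = [(text.lower().split().count(term), doc_id)
                for doc_id, text in self.docs.items()]
        return heapq.nlargest(k, [h for h in hits if h[0] > 0])


class Farm:
    """Identical replicas of one shard behind a round-robin balancer.
    Adding replicas scales query rate (the first dimension)."""

    def __init__(self, replicas):
        self._next_replica = itertools.cycle(replicas)

    def search(self, term, k):
        return next(self._next_replica).search(term, k)


class FrontEnd:
    """Sends each query to every farm and merges the partial results.
    Adding farms scales collection size (the second dimension)."""

    def __init__(self, farms):
        self.farms = farms

    def search(self, term, k=10):
        partial = []
        for farm in self.farms:
            partial.extend(farm.search(term, k))
        # Keep the k best (score, doc_id) pairs across all shards.
        return heapq.nlargest(k, partial)
```

In this sketch the front end only queries and merges; building the indexes
and copying them to the replicas' local disks happens elsewhere, before any
of this code runs. Several FrontEnd instances behind another balancer would
cover the last point about balancing the front end engines.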

Eugene N Dzhurinsky
