lucene-dev mailing list archives

From Chris Hostetter
Subject Re: Distributed Search Components
Date Mon, 21 Jun 2010 20:44:45 GMT

: I mean the implementation of distributed search in Solr: the classes
: that are responsible for the search logic. I mean, from somewhere the
: searcher (or whatever) must get the knowledge about which shards exist,
: which of them to query, and what their addresses are.
: I want to learn more about the class that manages this logic. Unfortunately,
: I don't know which class it is.
: By "those" implementations I mean "MultiSearcher" and "Solr's
: implementation of distributed search".

Ah ... ok ... my confusion was that I thought you were referring to multiple
implementations of the same API (or the same conceptual functionality) ... the
fundamental approach is so completely different between those two systems
that I didn't realize that by "those" you meant "Solr's way of doing distributed
search" and "MultiSearcher's way of searching multiple Searchers" ... they
are completely different concepts, let alone different implementations.

For a better understanding of how Solr implements distributed searching,
start by looking at the SearchHandler -- it's the first part of the
process that looks at a request and decides whether it should be executed
locally (in which case one code path is followed for each
SearchComponent) or whether it needs to be distributed to multiple shards (in
which case an entirely different code path is executed for the local
components) ... then take a look at how some of the SearchComponents deal
with local vs distributed queries (QueryComponent and FacetComponent are
two of the more interesting/complex ones, as I recall).
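
To make the local-vs-distributed split concrete, here is a toy sketch in
plain Java. It is not Solr's code: the class and method names
(DistributedDispatchSketch, Shard, Component, processLocal,
processDistributed, handle) and the in-memory shard registry are all
hypothetical. The one detail borrowed from Solr is the assumption that the
request names its shards in a "shards" parameter; everything else only
illustrates one handler choosing between two code paths, with a
facet-style component that either counts local values or merges counts
that each shard has already computed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only; none of these types are Solr classes.
class DistributedDispatchSketch {

    /** Hypothetical shard: all we can ask it for is its own facet counts. */
    interface Shard {
        Map<String, Integer> facetCounts();
    }

    /** Hypothetical component with one code path per execution mode. */
    interface Component {
        Map<String, Integer> processLocal(List<String> localValues);
        Map<String, Integer> processDistributed(List<Shard> shards);
    }

    static class FacetSketchComponent implements Component {
        /** Local path: count each field value in the local index. */
        @Override
        public Map<String, Integer> processLocal(List<String> localValues) {
            Map<String, Integer> counts = new TreeMap<>();
            for (String value : localValues) {
                counts.merge(value, 1, Integer::sum);
            }
            return counts;
        }

        /** Distributed path: each shard counts for itself; we only sum the results. */
        @Override
        public Map<String, Integer> processDistributed(List<Shard> shards) {
            Map<String, Integer> merged = new TreeMap<>();
            for (Shard shard : shards) {
                for (Map.Entry<String, Integer> e : shard.facetCounts().entrySet()) {
                    merged.merge(e.getKey(), e.getValue(), Integer::sum);
                }
            }
            return merged;
        }
    }

    /** Picks the code path: no "shards" parameter means local execution. */
    static Map<String, Integer> handle(Map<String, String> params,
                                       List<String> localValues,
                                       Map<String, Shard> knownShards,
                                       Component component) {
        String shardsParam = params.get("shards");
        if (shardsParam == null || shardsParam.trim().isEmpty()) {
            return component.processLocal(localValues);
        }
        List<Shard> targets = new ArrayList<>();
        for (String address : shardsParam.split(",")) {
            Shard shard = knownShards.get(address.trim());
            if (shard == null) {
                throw new IllegalArgumentException("unknown shard: " + address.trim());
            }
            targets.add(shard);
        }
        return component.processDistributed(targets);
    }

    public static void main(String[] args) {
        Component facets = new FacetSketchComponent();
        // In-memory stand-ins for remote shards (hypothetical registry).
        Map<String, Shard> known = new HashMap<>();
        known.put("shardA", () -> facets.processLocal(Arrays.asList("red", "red", "blue")));
        known.put("shardB", () -> facets.processLocal(Arrays.asList("red", "green")));

        Map<String, String> params = new HashMap<>();
        System.out.println(handle(params, Arrays.asList("blue"), known, facets));
        params.put("shards", "shardA,shardB");
        System.out.println(handle(params, Arrays.asList("blue"), known, facets));
    }
}
```

The point of the sketch is the shape of the distributed path: only small
per-shard count maps reach the coordinator, never the underlying term
data.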

: > On the other hand, if coordinatorX just dealt with shardA and shardB using
: > an abstraction at the Searcher level, using something like MultiSearcher,
: > then things like distributed faceting would require a *huge* amount of
: > network IO, as things like using the TermEnums and TermDocs on coordinatorX
: > would result in all of that data being streamed from the individual
: > (remote) searchers for each shard so the coordinator could execute the
: > necessary counting logic.
: I honestly thought that the MultiSearcher would do exactly what you
: described here. What a misunderstanding of mine.

Nope.  MultiSearcher is a relatively "low level" abstraction as far as Solr
is concerned -- it has no knowledge of things like stats and faceting
-- it just provides an abstraction across several Searchers to treat them
as one big index -- it doesn't even know if/when those individual
Searchers are remote.
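
As a rough illustration of "an abstraction across several Searchers", the
sketch below wraps several sub-searchers behind the same interface,
offsets their document ids into one combined id space, and merges their
top hits by score. The names (MultiSearcherSketch, SimpleSearcher, Hit,
CombinedSearcher, StubSearcher) are hypothetical and this is not Lucene's
actual API; it only shows that a wrapper like this sees hits and ids,
with no notion of facets, stats, or whether a sub-searcher is remote.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only; these are not Lucene classes.
class MultiSearcherSketch {

    /** A scored hit; docId is relative to the searcher that returned it. */
    static final class Hit {
        final int docId;
        final float score;
        Hit(int docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /** Hypothetical minimal searcher interface. */
    interface SimpleSearcher {
        int maxDoc();
        List<Hit> search(String query, int n);
    }

    /** Presents several sub-searchers as one big index. */
    static final class CombinedSearcher implements SimpleSearcher {
        private final List<SimpleSearcher> subs;
        private final int[] starts; // starts[i] = first combined doc id of sub-searcher i

        CombinedSearcher(List<SimpleSearcher> subs) {
            this.subs = subs;
            this.starts = new int[subs.size() + 1];
            for (int i = 0; i < subs.size(); i++) {
                starts[i + 1] = starts[i] + subs.get(i).maxDoc();
            }
        }

        @Override
        public int maxDoc() {
            return starts[subs.size()];
        }

        @Override
        public List<Hit> search(String query, int n) {
            List<Hit> all = new ArrayList<>();
            for (int i = 0; i < subs.size(); i++) {
                // The wrapper cannot tell whether this call is local or remote.
                for (Hit h : subs.get(i).search(query, n)) {
                    all.add(new Hit(starts[i] + h.docId, h.score));
                }
            }
            // Highest score first; the sort is stable, so ties keep sub-searcher order.
            all.sort((a, b) -> Float.compare(b.score, a.score));
            return new ArrayList<>(all.subList(0, Math.min(n, all.size())));
        }
    }

    /** Stub that returns preset hits (already sorted by descending score). */
    static final class StubSearcher implements SimpleSearcher {
        private final int maxDoc;
        private final List<Hit> hits;

        StubSearcher(int maxDoc, Hit... hits) {
            this.maxDoc = maxDoc;
            this.hits = Arrays.asList(hits);
        }

        @Override
        public int maxDoc() {
            return maxDoc;
        }

        @Override
        public List<Hit> search(String query, int n) {
            return hits.subList(0, Math.min(n, hits.size()));
        }
    }

    public static void main(String[] args) {
        SimpleSearcher a = new StubSearcher(10, new Hit(3, 0.9f), new Hit(7, 0.2f));
        SimpleSearcher b = new StubSearcher(5, new Hit(1, 0.5f));
        SimpleSearcher combined = new CombinedSearcher(Arrays.asList(a, b));
        for (Hit h : combined.search("anything", 2)) {
            System.out.println(h.docId + " " + h.score);
        }
    }
}
```

Anything beyond top hits, such as facet counts, would have to be computed
on top of this interface by pulling data out of every sub-searcher, which
is the network IO problem described in the quoted text above.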

