lucene-solr-user mailing list archives

From Marc Sturlese <marc.sturl...@gmail.com>
Subject Re: big index vs. lots of small ones
Date Wed, 20 Jan 2010 16:38:20 GMT

Check out this patch, which addresses the distributed IDF problem:
https://issues.apache.org/jira/browse/SOLR-1632
I think it fixes what you are describing. The price you pay is that there
are 2 requests per shard. If I am not wrong, the first gets the term
frequencies and other needed info, and the second is the actual search request.
The patch also includes caching for the terms fetched in the first request.
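
To illustrate the skew and the two-pass idea, here is a minimal Python sketch.
It is not the code from SOLR-1632. It assumes the classic Lucene IDF formula,
1 + ln(numDocs / (docFreq + 1)), and made-up shard sizes of 10,000 documents
each; only the 300-vs-1 match counts come from the example below. The names
ShardStats, classic_idf and global_idf are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ShardStats:
    """Per-shard statistics a first-pass request could return (assumed shape)."""
    num_docs: int   # total documents in the shard (assumed value)
    doc_freq: int   # documents in the shard containing the query term


def classic_idf(num_docs: int, doc_freq: int) -> float:
    """Classic Lucene-style IDF: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def global_idf(shards: list[ShardStats]) -> float:
    """Pass 1: sum the statistics of all shards, then compute one shared IDF."""
    total_docs = sum(s.num_docs for s in shards)
    total_df = sum(s.doc_freq for s in shards)
    return classic_idf(total_docs, total_df)


# Shard sizes are assumptions; match counts (300 vs. 1) follow the example.
shards = {
    "A": ShardStats(num_docs=10_000, doc_freq=300),  # public health department
    "B": ShardStats(num_docs=10_000, doc_freq=1),    # press relations department
}

# Without shared statistics, each shard scores with its own local IDF.
local = {name: classic_idf(s.num_docs, s.doc_freq) for name, s in shards.items()}

# Pass 2 would run the real search on every shard using this single value.
shared = global_idf(list(shards.values()))

for name, value in local.items():
    print(f"shard {name}: local idf = {value:.3f}")
print(f"shared idf used by all shards = {shared:.3f}")
```

With local statistics the term looks much rarer on shard B than on shard A, so
B's single document gets the larger IDF weight. With the shared value both
shards weight the term identically, which is the effect a merged index would
have, at the cost of the extra request per shard.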


Thorsten Scherler-3 wrote:
> 
> Hi all,
> 
> I have to do an analysis of the following use case.
> 
> I am working as a consultant in a public company. We are talking about
> offering each public institution its own search server in the future,
> (probably) based on Apache Solr. However, the users of our portal should
> be able to search all indexes.
> 
> The problematic part for our customer is that a meta search across the
> various indexes, which later merges the responses, will change the scoring.
> 
> Imagine you have the two indexes
> - public health department (A)
> - press relations department (B)
> 
> Now you have 300 documents in A and only one in B about "influenza A".
> The B server will return the only document in its index with a very high
> score, since being the only one it gets a very high "base" score,
> correct?
> 
> On the other hand A may have much more important documents but they will
> not get the same "base" score.
> 
> Meaning that on a merge the document from server B will most likely be at
> the top of the list.
> 
> To prevent this phenomenon we are looking into merging all the
> standalone indexes into one big index, but that will lead us into other
> problems because it will become pretty big pretty fast.
> 
> So here are my questions:
> 
> - What are other people doing to solve this problem?
> - What is the best way with Solr to solve the problem of the "base"
> scoring?
> - What is the best way to have multiple indexes in Solr?
> - Is it possible to get rid of the "base" scoring in Solr?
> 
> TIA for any information.
> 
> salu2
> -- 
> Thorsten Scherler <thorsten.at.apache.org>
> Open Source Java <consulting, training and solutions>
> 
> Sociedad Andaluza para el Desarrollo de la Sociedad 
> de la Información, S.A.U. (SADESI)
> 

-- 
View this message in context: http://old.nabble.com/big-index-vs.-lots-of-small-ones-tp27241203p27244706.html
Sent from the Solr - User mailing list archive at Nabble.com.

