lucene-solr-user mailing list archives

From ku3ia <dem...@gmail.com>
Subject Re: Poor performance on distributed search
Date Mon, 19 Dec 2011 21:32:22 GMT
>>Uhm, either I misunderstand your question or you're doing 
>>a lot of extra work for nothing.... 

>>The whole point of sharding is exactly to collect the top N docs 
>>from each shard and merge them into a single result. So if 
>>you want 10 docs, just specify rows=10. Solr will query all 
>>the shards, get the top 10 docs from each and then 
>>merge them into a final list 10 items long. Both the initial 
>>fetch and the final merge respect the 
>>sort criteria. 

>>Score is the default "sort". If you specify other sort criteria, 
>>i.e. a field, then that sort is respected by the merge process. 

>>So why do you have this 2,000 requirement in the first 
>>place? This really sounds like an XY problem.
As I wrote, it is a minimum for me and I can't change it. The final response must
contain the top 2K docs from all shards for the query, so I specify rows=2000. Yes, Solr
collects the top N docs from each shard. In my case N=2000, so in production I
have 2000x30=60K docs, and on my own machine 2000x4=8K docs. It's true that this is
extra work, but otherwise it seems to be the only way to get the top 2K docs from
all shards. Am I right?
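
The arithmetic above can be illustrated with a small simulation. This is only a sketch of the fetch-and-merge idea, not Solr's actual code: the random scores, the document ids and DOCS_PER_SHARD are made-up values, and only rows=2000 and the four shards come from this message.

```python
import heapq
import random
from itertools import islice

ROWS = 2000            # rows=2000 from the request
NUM_SHARDS = 4         # the four shards on the local machine
DOCS_PER_SHARD = 5000  # hypothetical number of matching docs per shard

# Hypothetical data: every shard holds (score, doc_id) pairs with random scores.
rng = random.Random(0)
shards = [
    [(rng.random(), f"shard{s + 1}-doc{i}") for i in range(DOCS_PER_SHARD)]
    for s in range(NUM_SHARDS)
]

# Each shard returns its own top ROWS docs, best score first.
per_shard = [heapq.nlargest(ROWS, docs) for docs in shards]
candidates = sum(len(top) for top in per_shard)

# The coordinator merges the sorted lists and keeps only the global top ROWS.
final = list(islice(heapq.merge(*per_shard, reverse=True), ROWS))

print(candidates, len(final))  # 8000 2000
```

With 30 shards the same loop would collect 60000 candidates to return 2000 rows.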

>>>>P.S. Is there any mechanism to, for example, get the top 100 rows from each shard,
>>>>merge only those, sort them by the field defined in the query or by score, and return
>>>>the result to the user?
>>Uhm, either I misunderstand your question
For example, I have 4 shards and, in the end, I need 2000 docs. Currently, when I use
&shards=127.0.0.1:8080/solr/shard1,127.0.0.1:8080/solr/shard2,127.0.0.1:8080/solr/shard3,127.0.0.1:8080/solr/shard4
Solr gets 2000 docs from each shard (shard1,2,3,4, so 8000 docs in total),
merges and sorts them, for example by the default sort (score), and returns
only the 2000 rows I specified in the request (not all 8000).
So my question was: is there any mechanism in Solr that does not fetch 2000
rows from each shard? Say I specify 2000 docs in the request; Solr would
count how many shards I have (four), divide the total rows among the
shards (2000/4=500) and send each shard a query with rows=500 instead of
rows=2000, so that after merging and sorting I would have 2000 rows in total
(maybe fewer), but not 8000. That was my question.
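
A small sketch of the trade-off in such a split, with made-up data and a hypothetical helper fetch_and_merge (this is not a Solr API): when the best documents are spread unevenly across shards, asking each shard for only rows/shards docs still fills the response, but it may not contain the true global top rows.

```python
import heapq
from itertools import islice

def fetch_and_merge(shards, rows, per_shard_rows):
    """Ask every shard for per_shard_rows docs, merge them, keep the best rows."""
    partial = [heapq.nlargest(per_shard_rows, docs) for docs in shards]
    merged = heapq.merge(*partial, reverse=True)
    return list(islice(merged, rows))

# Hypothetical (score, doc_id) data: the first shard holds most of the best docs.
shards = [
    [(0.99, "a1"), (0.98, "a2"), (0.97, "a3"), (0.96, "a4"), (0.95, "a5")],
    [(0.90, "b1"), (0.40, "b2"), (0.30, "b3")],
    [(0.85, "c1"), (0.35, "c2"), (0.25, "c3")],
    [(0.80, "d1"), (0.20, "d2"), (0.10, "d3")],
]
rows = 8

# Current behaviour: every shard is asked for the full `rows`.
full = fetch_and_merge(shards, rows, per_shard_rows=rows)

# Proposed behaviour: every shard is asked for rows / number_of_shards.
split = fetch_and_merge(shards, rows, per_shard_rows=rows // len(shards))

# Docs that belong to the true top `rows` but are lost by the split.
missed = [doc_id for _, doc_id in full if (_, doc_id) not in split]
print(missed)  # ['a3', 'a4', 'a5']
```

Here the split request returns 8 rows, but three of them (b2, c2, d2) score lower than a3, a4 and a5, which never left the first shard. The split is only exact when the top docs happen to be distributed evenly.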

Thanks.

