lucene-dev mailing list archives

From Earwin Burrfoot
Subject Re: is multi-threads searcher feasible idea to speed up?
Date Mon, 04 Oct 2010 06:33:03 GMT
Thread-per-segment approach should run well with Zoie MergePolicy.
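
For reference, the scheme discussed in the quoted messages below (each thread collects its own top N over one slice of the index, then the per-thread queues are merged) can be sketched in plain Java. This is only an illustration under stated assumptions, not Lucene code: the class `ChunkedTopN`, the `Hit` type, and the `scorer` callback are hypothetical stand-ins, and a contiguous doc-id range stands in for a segment or for the "chunk" a Scorer would be advanced to.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntToDoubleFunction;

// Hypothetical illustration only; none of these names come from Lucene.
class ChunkedTopN {
    // A scored hit: document id plus score.
    static final class Hit {
        final int doc;
        final double score;
        Hit(int doc, double score) { this.doc = doc; this.score = score; }
    }

    // Orders the worst hit first: lowest score, and on ties the larger doc id.
    static final Comparator<Hit> WORST_FIRST = (a, b) -> {
        int c = Double.compare(a.score, b.score);
        return c != 0 ? c : Integer.compare(b.doc, a.doc);
    };

    // Keeps at most n hits in the queue, evicting the current worst when a better hit arrives.
    static void offer(PriorityQueue<Hit> pq, Hit h, int n) {
        if (pq.size() < n) {
            pq.add(h);
        } else if (WORST_FIRST.compare(h, pq.peek()) > 0) {
            pq.poll();
            pq.add(h);
        }
    }

    // Splits [0, maxDoc) into one contiguous chunk per thread, collects a top-n queue
    // per chunk, then merges the queues. A score <= 0 means "no match" (an assumption).
    static List<Hit> search(int maxDoc, int threads, int n, IntToDoubleFunction scorer) {
        if (n < 1 || threads < 1) throw new IllegalArgumentException("n and threads must be >= 1");
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<PriorityQueue<Hit>>> parts = new ArrayList<>();
            int chunk = (maxDoc + threads - 1) / threads;
            for (int t = 0; t < threads; t++) {
                final int start = Math.min(maxDoc, t * chunk);
                final int end = Math.min(maxDoc, start + chunk);
                parts.add(pool.submit(() -> {
                    PriorityQueue<Hit> pq = new PriorityQueue<>(WORST_FIRST);
                    for (int doc = start; doc < end; doc++) {
                        double s = scorer.applyAsDouble(doc);
                        if (s > 0) offer(pq, new Hit(doc, s), n);
                    }
                    return pq;
                }));
            }
            // Merge step: feed every per-thread hit through one more bounded queue.
            PriorityQueue<Hit> merged = new PriorityQueue<>(WORST_FIRST);
            for (Future<PriorityQueue<Hit>> f : parts) {
                for (Hit h : f.get()) offer(merged, h, n);
            }
            List<Hit> out = new ArrayList<>(merged);
            out.sort(WORST_FIRST.reversed()); // best hit first
            return out;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

A call such as `ChunkedTopN.search(100, 4, 3, doc -> doc % 10)` scores 100 documents on 4 threads and keeps the best 3. Because each chunk's queue already holds that chunk's top N, the merge only has to look at `threads * n` hits, which is why the final step stays cheap regardless of index size.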

On Tue, Sep 28, 2010 at 16:17, Michael McCandless wrote:
> This is an excellent idea!
> And, desperately needed.
> It's high time Lucene can take advantage of concurrency when running a
> single query.  Machines have tons of cores these days!  (My dev box
> has 24!).
> Note that one simple way to do this is use ParallelMultiSearcher: it
> uses one thread per segment in your index.
> But, note that [perversely] this means if your index is optimized you
> get no concurrency gain!  So, you have to create your index w/ a
> carefully picked maxMergeDocs/MB to ensure you can use concurrency.
> I don't like having concurrency tied to index structure.  So a better
> approach would be to have each thread pull its own Scorer for the same
> query, but then each one does a .advance to its "chunk" of the index,
> and then iterates from there.  Then merge PQs in the end just like
> MultiSearcher.
> Mike
> On Tue, Sep 28, 2010 at 7:24 AM, Li Li wrote:
>> hi all
>>    I want to reduce search latency for my application. For a query, most
>> of the time is spent reading posting lists (IO on the .frq files),
>> computing scores, and collecting results (CPU, with a priority queue). The
>> IO is hard to optimize further, or is already partly optimized by NIO. So
>> I want to use multiple threads to make use of the CPU. Of course, this may
>> decrease QPS, but the response time will also decrease, which is what I
>> want, because CPU is easier to obtain than a faster hard disk.
>>    I read through the search code roughly and found that it is not an easy
>> task to modify the search process. So I want to use some other, easier method.
>>    One is to use Solr distributed search and dispatch documents to many
>> shards. But due to the network overhead and the global IDF problem, it
>> does not seem a good method for me.
>>    Another is to modify the index structure and distribute the postings
>> in the .frq files evenly.
>>    e.g.   term1 -> doc1, doc2, doc3, doc4, doc5 in _1.frq
>>    I create 2 indexes with
>>            term1 -> doc1, doc3, doc5
>>            term1 -> doc2, doc4
>>    When searching, I create 2 threads with 2 PriorityQueues to
>> collect the top N docs and then merge their results.
>>    Is the 2nd idea feasible? Or does anyone have a related idea? Thanks.

Kirill Zakharenko/Кирилл Захаренко
Phone: +7 (495) 683-567-4
ICQ: 104465785
