lucene-dev mailing list archives

From robert engels <reng...@ix.netcom.com>
Subject Re: TermInfosReader lazy term index reading
Date Fri, 02 Feb 2007 22:56:14 GMT
I think that is much more involved... I don't think there is an easy  
way to move a query between threads/pools once it has started unless  
you restart the entire query.
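
A minimal sketch of the "restart the entire query" fallback, assuming the query can be modeled as a plain Callable rather than any particular Lucene API. The class name RestartingSearcher, the two pools, their sizes, and the time budget are all hypothetical choices for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class RestartingSearcher {
    // Hypothetical pools; the sizes are illustrative, not tuned.
    private final ExecutorService fastPool = Executors.newFixedThreadPool(4);
    private final ExecutorService slowPool = Executors.newFixedThreadPool(2);
    private final long fastBudgetMillis;

    public RestartingSearcher(long fastBudgetMillis) {
        this.fastBudgetMillis = fastBudgetMillis;
    }

    /**
     * Runs the query in the fast pool. If it exceeds the budget, the attempt
     * is cancelled and the query is restarted from scratch in the slow pool.
     */
    public <T> T search(Callable<T> query) throws Exception {
        Future<T> attempt = fastPool.submit(query);
        try {
            return attempt.get(fastBudgetMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupts the worker; the query only stops early if it
            // responds to interruption.
            attempt.cancel(true);
            // All work done by the first attempt is discarded.
            return slowPool.submit(query).get();
        }
    }

    public void shutdown() {
        fastPool.shutdownNow();
        slowPool.shutdownNow();
    }
}
```

The cost is visible in the catch block: a long query pays for the fast-pool budget and then runs again in full, which is why this is a fallback and not a true move between pools.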

However, you might be able to dynamically lower the thread priority
when you detect a long query, so that smaller (faster) queries would
have priority.
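
A rough sketch of that idea, assuming a "long query" is simply one that is still running after a fixed threshold. PriorityDemotingRunner and thresholdMillis are hypothetical names, and the query is again modeled as a Callable. Thread priorities are only a hint to the scheduler, and some JVM/OS combinations largely ignore them, so the real effect would need to be measured.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PriorityDemotingRunner {
    /**
     * Runs the query on the calling thread. If it is still running after
     * thresholdMillis, a watchdog lowers that thread's priority. The
     * original priority is restored once the query finishes.
     */
    public static <T> T run(Callable<T> query, long thresholdMillis) throws Exception {
        final Thread worker = Thread.currentThread();
        final int originalPriority = worker.getPriority();
        // One scheduler per call keeps the sketch short; a real server
        // would more likely share a single watchdog across requests.
        ScheduledExecutorService watchdog = Executors.newSingleThreadScheduledExecutor();
        watchdog.schedule(() -> worker.setPriority(Thread.MIN_PRIORITY),
                thresholdMillis, TimeUnit.MILLISECONDS);
        try {
            return query.call();
        } finally {
            // Drop the pending demotion (if any) and wait for the watchdog
            // to stop, so it cannot demote the thread after the restore.
            watchdog.shutdownNow();
            watchdog.awaitTermination(1, TimeUnit.SECONDS);
            worker.setPriority(originalPriority);
        }
    }
}
```

In a pooled setup, run(...) would be called from inside the pool task so that the demoted thread is the pool worker executing the query.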


On Feb 2, 2007, at 4:44 PM, Doron Cohen wrote:

> robert engels <rengels@ix.netcom.com> wrote on 02/02/2007 14:08:46:
>
>> You might be able to quantify the search request ahead of time (# of
>> terms, # of high frequency terms, etc.) and assign the request to the
>> appropriate pool (quick, normal, lengthy).
>>
>> Then you can assign an appropriate # of threads to each pool.
>
> Or, to avoid pre-computation, requests can first be assigned to a
> 'faster' queue, assuming they are short, and only later, if a
> request turns out to be longer, it can be dynamically moved to a
> 'slower' queue, maybe less prioritized. (Similar I think to OS
> job scheduling.) (Can have more than 2 queues.)
>
> I wonder if there's danger that queueing queries would increase the
> avg time-to-complete, even if the total time is reduced?
>
>>
>> Most people understand that complex queries might take longer to
>> execute.
>>
>>
>> On Feb 2, 2007, at 4:01 PM, Yonik Seeley wrote:
>>
>>> On 2/2/07, robert engels <rengels@ix.netcom.com> wrote:
>>>> For a process that is mostly CPU bound (which is the case with Lucene
>>>> if the index is in the OS cache), having so many "active" threads
>>>> will actually hurt performance due to the context switching and
>>>> synchronization.
>>>
>>> Sure... it certainly wasn't by design to have that many threads all
>>> trying to do something.
>>>
>>>> Better to use a request queue / thread pool. (I think I read
>>>> somewhere that a good rule of thumb is 2x the number of
>>>> processors).
>>>
>>> You might hit a scenario where a couple of threads are doing long
>>> running queries, and that could lock out other queries that might
>>> otherwise execute quickly.  But overall, it's not a bad idea.
>>>
>>>> If most of the searches are IO bound, having so many disparate
>>>> requests will hurt performance as well since the disk heads will be
>>>> seeking all over the place and losing any locality of data that
>>>> Lucene provides (postings, sequential term reads, etc.).
>>>
>>> We're not hitting disk... plenty of RAM.
>>>
>>> -Yonik
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>
>>
