lucene-dev mailing list archives

From Scott Stanley <sastanl...@yahoo.com>
Subject RE: RE: Query performance with DateFilter
Date Thu, 06 Dec 2001 17:09:53 GMT
Doug,

Thanks for the fix.  For a single query at a time, this does indeed
solve the problem.  The query performance comes back up to the level it
was before (a change from 17,000ms to 800ms with the test query we are
using).  As you mentioned, however, performance does indeed suffer when
more than one query is being performed at a time.  With several 
simultaneous queries, the timing degrades back to around 17s.

Thanks,

Scott


> -----Original Message-----
> From: Doug Cutting [mailto:DCutting@grandcentral.com]
> Sent: Monday, December 03, 2001 8:49 AM
> To: 'Lucene Developers List'
> Subject: RE: Query performance with DateFilter
> 
> 
> I have a guess about what the problem is.  Lucene used to do a better job of
> re-using TermFreq input streams.  I've attached new versions of a few files
> which should restore the earlier behavior.  Try running with these.
> 
> This isn't actually a very good fix, since it uses a single element cache
> (as was done before).  For example, performance will suffer again if more
> than one thread uses a DateFilter at the same time.  A scalable fix would
> not be much harder to implement.  So if this fixes your problem, I will
> check in the more scalable version.

> Doug
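
The trade-off Doug describes, a single-element cache versus a more scalable
one, can be sketched in Java roughly as follows.  This is an illustration
only, not the code attached to his message: CostlyStream, StreamCache,
SingleSlotCache, PooledCache and CacheDemo are hypothetical names, and the
pooled variant is just one assumed way a scalable fix might look.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a stream that is costly to open; not a Lucene class. */
final class CostlyStream {
    static final AtomicInteger OPENED = new AtomicInteger();  // counts constructions for the demo
    CostlyStream() { OPENED.incrementAndGet(); }
}

/** Common shape of both caches, so the demo can drive either one. */
interface StreamCache {
    CostlyStream acquire();
    void release(CostlyStream s);
}

/** Single-element cache: remembers at most one idle stream. */
final class SingleSlotCache implements StreamCache {
    private CostlyStream slot;

    public synchronized CostlyStream acquire() {
        CostlyStream s = slot;
        slot = null;
        return (s != null) ? s : new CostlyStream();
    }

    public synchronized void release(CostlyStream s) {
        slot = s;  // a stream already sitting in the slot is discarded
    }
}

/** Pooled cache (assumed design): keeps every released stream for reuse. */
final class PooledCache implements StreamCache {
    private final Deque<CostlyStream> idle = new ArrayDeque<>();

    public synchronized CostlyStream acquire() {
        CostlyStream s = idle.pollFirst();
        return (s != null) ? s : new CostlyStream();
    }

    public synchronized void release(CostlyStream s) {
        idle.addFirst(s);
    }
}

final class CacheDemo {
    /** Returns how many streams get opened when two users overlap in every round. */
    static int opensWithTwoOverlappingUsers(StreamCache cache, int rounds) {
        CostlyStream.OPENED.set(0);
        for (int i = 0; i < rounds; i++) {
            CostlyStream a = cache.acquire();  // first user takes a stream
            CostlyStream b = cache.acquire();  // second user arrives before the first is done
            cache.release(a);
            cache.release(b);
        }
        return CostlyStream.OPENED.get();
    }
}
```

Calling CacheDemo.opensWithTwoOverlappingUsers(new SingleSlotCache(), 10)
returns 11, because one stream has to be opened afresh in every round, while
the same call with new PooledCache() returns 2.  With only one user at a
time, both caches would open a single stream.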

>> -----Original Message-----
>> From: Scott Stanley [mailto:sastanley3@yahoo.com]
>> Sent: Friday, November 30, 2001 2:58 PM
>> To: lucene-dev
>> Subject: Query performance with DateFilter
>> 
>> 
>> I have found that searching with date filtering is much slower since
>> shifting from Lucene 1.1b to Lucene 1.2 rc2 (basically from com.lucene
>> to org.apache.lucene).
>>  
>> With 1.1b, search time was : 700ms
>> With 1.2rc2 : 11,000 ms!
>> (15 times slower)
>> (with 50,000 files indexed)
>>  
>> However, searching with no filtering seems to be a bit faster with
>> 1.2rc2.
>>  
>> To be sure that the DateFilter was responsible for the performance
>> hit, I tested this:
>>  
>>     DateFilter df = new DateFilter("DOC_DATE", 1000087883595L,
>>                                    1009087883595L);
>>     BitSet bs = df.bits(IndexReader.open("/index"));
>>  
>> With Lucene 1.1b : 668 ms
>> With Lucene 1.2 rc2 : 9000 ms
>>  
>> Running this under JProbe, I noticed that the performance difference
>> was coming from the call to SegmentTermDocs.next().  This method call
>> seems to be much slower because InputStream.readByte() is slower...
>>  
>> I noticed that InputStream.refill() and InputStream.readInternal() take
>> much more time.  I finally narrowed it down to
>> RandomAccessFile.read(byte[], int, int), which is called around 50 times
>> more often in 1.2 RC2 than in the earlier version.
>>  
>> Is there an issue with the way FSDirectory handles buffering of the
>> bytes read from the index files?  Is all of this related to the Thread
>> Safety fix?  I guess the bottom line is, is there anything we can do
>> to bring the performance back up with the DateFilter?
>> 
>> Scott
>> 

