lucene-dev mailing list archives

From Doug Cutting <DCutt...@grandcentral.com>
Subject RE: Query performance with DateFilter
Date Mon, 03 Dec 2001 16:49:03 GMT
I have a guess about what the problem is.  Lucene used to do a better job of
re-using TermFreq input streams.  I've attached new versions of a few files
which should restore the earlier behavior.  Try running with these.

This isn't actually a very good fix, since it uses a single-element cache
(as was done before).  For example, performance will suffer again if more
than one thread uses a DateFilter at the same time.  A scalable fix would
not be much harder to implement.  So if this fixes your problem, I will
check in the more scalable version.
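
To illustrate the trade-off, here is a minimal sketch of a single-element
cache next to one possible scalable alternative.  This is not the attached
code or Lucene's actual implementation: Stream, SingleSlotCache, PooledCache
and overlappingUsers are hypothetical names, and the small pool of idle
streams is only an assumption about what a more scalable version might
look like.

```java
import java.util.ArrayDeque;

class StreamCacheSketch {
    /** Hypothetical stand-in for an input stream that is expensive to open. */
    static final class Stream {}

    interface Cache {
        Stream acquire();
        void release(Stream s);
        int opened();
    }

    /** Single-element cache: remembers at most one idle stream. */
    static final class SingleSlotCache implements Cache {
        private Stream idle;
        private int opened;

        public synchronized Stream acquire() {
            if (idle != null) {
                Stream s = idle;
                idle = null;
                return s;
            }
            opened++;               // cache miss: open a new stream
            return new Stream();
        }

        // A second release overwrites the slot, so the earlier stream is lost.
        public synchronized void release(Stream s) { idle = s; }

        public synchronized int opened() { return opened; }
    }

    /** Assumed alternative: keeps every released stream for later re-use. */
    static final class PooledCache implements Cache {
        private final ArrayDeque<Stream> idle = new ArrayDeque<>();
        private int opened;

        public synchronized Stream acquire() {
            Stream s = idle.poll();
            if (s != null) return s;
            opened++;               // pool empty: open a new stream
            return new Stream();
        }

        public synchronized void release(Stream s) { idle.push(s); }

        public synchronized int opened() { return opened; }
    }

    /** Simulates two overlapping users per round; returns streams opened. */
    static int overlappingUsers(Cache cache, int rounds) {
        for (int i = 0; i < rounds; i++) {
            Stream a = cache.acquire();
            Stream b = cache.acquire();   // second user arrives before a is released
            cache.release(a);
            cache.release(b);
        }
        return cache.opened();
    }

    public static void main(String[] args) {
        System.out.println("single slot: "
            + overlappingUsers(new SingleSlotCache(), 100));
        System.out.println("pooled:      "
            + overlappingUsers(new PooledCache(), 100));
    }
}
```

With two overlapping users, the single-slot version has to open a new
stream on every round, while the pooled version opens two streams in total
and re-uses them from then on.  The simulation runs on one thread only to
keep the counts deterministic; real contention would come from concurrent
threads.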

Doug

> -----Original Message-----
> From: Scott Stanley [mailto:sastanley3@yahoo.com]
> Sent: Friday, November 30, 2001 2:58 PM
> To: lucene-dev
> Subject: Query performance with DateFilter
> 
> 
> I have found that searching with date filtering has been much slower
> since moving from Lucene 1.1b to Lucene 1.2 rc2 (basically from
> com.lucene to org.apache.lucene).
>  
> With 1.1b, search time was: 700 ms
> With 1.2 rc2: 11,000 ms!
> (15 times slower)
> (with 50,000 files indexed)
>  
> However, searching with no filtering seems to be a bit faster with
> 1.2 rc2.
>  
> To be sure that the DateFilter was responsible for the performance
> hit, I tested this:
>  
>     DateFilter df = new DateFilter("DOC_DATE", 1000087883595L,
>                                    1009087883595L);
>     BitSet bs = df.bits(IndexReader.open("/index"));
>  
> With Lucene 1.1b : 668 ms
> With Lucene 1.2 rc2 : 9000 ms
>  
> Running this under JProbe, I noticed that the performance difference
> was coming from the call to SegmentTermDocs.next().  This method call
> seems to be much slower because InputStream.readByte() is slower...
>  
> I noticed that InputStream.refill() and InputStream.readInternal()
> take much more time.  I finally narrowed it down to
> RandomAccessFile.read(byte[], int, int), which is called around 50
> times more often in 1.2 rc2 than in the earlier version.
>  
> Is there an issue with the way FSDirectory handles buffering of the
> bytes read from the index files?  Is all of this related to the thread
> safety fix?  I guess the bottom line is: is there anything we can do
> to bring the performance back up with the DateFilter?
> 
> Scott
> 