lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Created: (LUCENE-2922) Optimize BlockTermsReader.seek
Date Tue, 15 Feb 2011 16:11:57 GMT
Optimize BlockTermsReader.seek
------------------------------

                 Key: LUCENE-2922
                 URL: https://issues.apache.org/jira/browse/LUCENE-2922
             Project: Lucene - Java
          Issue Type: Improvement
          Components: Index
            Reporter: Michael McCandless
            Assignee: Michael McCandless
            Priority: Minor
             Fix For: 4.0


When we seek, we first consult the terms index to find the right block
of 32 (default) terms that may hold the target term.  Then, we scan
that block looking for an exact match.

The scanning just uses next() and then compares the full term, but
this is actually rather wasteful.  First off, since all terms in the
block share a common prefix, we should compare the target against that
common prefix once, and then only compare the new suffix of each
term.  Second, since the term suffixes have already been read up front
into a byte[], we should do a no-copy comparison (vs today, where we
first read a copy into the local BytesRef and then compare).

With this opto, I removed the ability for BlockTermsWriter/Reader to
support arbitrary term sort order -- it's now hardwired to
BytesRef.utf8SortedAsUnicode.


-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message