lucene-dev mailing list archives

From "Martin Sevigny" <sevi...@ajlsm.com>
Subject Reading terms performance
Date Thu, 05 Sep 2002 14:58:54 GMT
Lucene developers,

If an application using Lucene wants to read the list of values for a
field, it must use (I think) the IndexReader.terms() method. But this
method is costly, because it returns all values for all fields, even
when we only want the values of a single field.

Are there any tricks here to increase performance? Are there any plans?
For instance, all field values are stored in a single file per segment
(.tis). Maybe splitting the values into a separate file per field would
make it work better?
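
To make the idea concrete, here is a minimal sketch of reading only one
field's values without a full scan. It is a stand-alone model, not Lucene
code: the Term class, the ORDER comparator and the valuesOf() helper are
all hypothetical, and it assumes the terms are kept sorted by field name
first and by text second, so that each field's values form one contiguous
range. If the terms(Term) overload of IndexReader positions the
enumeration at a given term in the same way, the same seek-and-stop
pattern might apply there, but that is not tested here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Stand-alone model of a sorted term dictionary; this is NOT Lucene code.
class FieldTermScan {

    // Hypothetical stand-in for a term: a (field, text) pair.
    static final class Term {
        final String field;
        final String text;

        Term(String field, String text) {
            this.field = field;
            this.text = text;
        }
    }

    // Assumed ordering: field name first, then text, both by String.compareTo().
    static final Comparator<Term> ORDER = (a, b) -> {
        int c = a.field.compareTo(b.field);
        return c != 0 ? c : a.text.compareTo(b.text);
    };

    // Collects the values of one field by seeking to the first possible term
    // of that field and stopping as soon as the field name changes.
    static List<String> valuesOf(TreeSet<Term> dictionary, String field) {
        List<String> values = new ArrayList<>();
        // (field, "") sorts at or before every term of the same field.
        for (Term t : dictionary.tailSet(new Term(field, ""), true)) {
            if (!t.field.equals(field)) {
                break; // left the contiguous range of this field
            }
            values.add(t.text);
        }
        return values;
    }

    // Small made-up dictionary covering three fields.
    static TreeSet<Term> sampleDictionary() {
        TreeSet<Term> d = new TreeSet<>(ORDER);
        d.add(new Term("title", "lucene"));
        d.add(new Term("author", "sevigny"));
        d.add(new Term("body", "term"));
        d.add(new Term("title", "index"));
        d.add(new Term("author", "hugo"));
        return d;
    }

    public static void main(String[] args) {
        System.out.println(valuesOf(sampleDictionary(), "author"));
    }
}
```

With this layout the cost depends on the number of values in the requested
field plus one seek, not on the total number of terms in the segment.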

The other thing I was wondering about is the sorting of these terms. They
are returned in the order defined by Java's compareTo() method. This
means that they are in alphabetical order for English or English-like
languages, but not in general. Is this ordering really significant in
the internals of Lucene? Or is it just there as a convenience for the
application developer?
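
A small sketch of the ordering problem, using only the standard library.
The class and method names (TermOrdering, byCompareTo, byCollator) and
the sample words are made up for illustration; the resorting step uses
java.text.Collator, which is one possible way to get a language-aware
order and not necessarily the best one for large term lists.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Compares the compareTo() order with a locale-aware order.
class TermOrdering {

    // Natural String order: compares UTF-16 code unit values.
    static List<String> byCompareTo(List<String> terms) {
        List<String> sorted = new ArrayList<>(terms);
        Collections.sort(sorted);
        return sorted;
    }

    // Language-aware order for the given locale.
    static List<String> byCollator(List<String> terms, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> sorted = new ArrayList<>(terms);
        sorted.sort(collator);
        return sorted;
    }

    public static void main(String[] args) {
        // "\u00e9cole" is "ecole" with an acute accent on the first letter.
        List<String> terms = Arrays.asList("zebre", "\u00e9cole", "etude", "ecole");
        System.out.println(byCompareTo(terms));
        System.out.println(byCollator(terms, Locale.FRENCH));
    }
}
```

With compareTo() the accented word sorts after "zebre", because the code
of the accented letter is greater than the code of "z"; the collator
places it next to "ecole", which is what a French reader expects.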

I'm asking because we have an application that makes heavy use of these
lists of terms, for non-English values, and the performance of reading
the values and resorting them is a problem right now.

Thanks for any clues,

Martin Sévigny

