lucene-dev mailing list archives

From "Michael McCandless" <luc...@mikemccandless.com>
Subject Re: possible segment merge improvement?
Date Thu, 01 Nov 2007 10:04:56 GMT

"robert engels" <rengels@ix.netcom.com> wrote:

> Why not check the fields dictionary for the segments being merged,
> and if the same, just copy the binary data directly?

+1

Unlike, e.g., KinoSearch, Lucene does not have a global field
schema/semantics, but I think for many apps the fields are in fact
static.

In KinoSearch, merging of stored fields & term vectors is always a
fast concatenation of the entry for each document, whereas Lucene
must, in general, re-interpret/re-number all fields on the doc.  In
fact I think that KinoSearch stores field names directly in the index
(i.e., not numbers).

If we make this change to Lucene then for those apps that effectively
have a static field schema (because all docs always have matching
fields), we can get the same performance that KinoSearch always gets
during its merging of stored fields & term vectors.  For all other
apps we must continue to re-interpret each field on each document.
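
To make the two paths concrete, here is a minimal sketch of the idea,
not Lucene's actual merge code or file format.  Everything in it is
hypothetical: the Segment and MergeResult classes, the merge() method,
and the toy encoding in which a stored document is a byte array of
(field number, value) pairs.  A segment whose name-to-number mapping
equals the merged mapping has its documents copied as-is; any other
segment has each field number rewritten.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StoredFieldsMergeSketch {

    /** Hypothetical stand-in for a segment (not a Lucene class). */
    static final class Segment {
        final List<String> fieldNames; // position in the list == field number
        final List<byte[]> docs;       // toy encoding: (fieldNumber, value) byte pairs

        Segment(List<String> fieldNames, List<byte[]> docs) {
            this.fieldNames = fieldNames;
            this.docs = docs;
        }
    }

    /** Hypothetical merge output, with counters showing which path each doc took. */
    static final class MergeResult {
        final List<String> fieldNames = new ArrayList<>();
        final List<byte[]> docs = new ArrayList<>();
        int bulkCopiedDocs;
        int remappedDocs;
    }

    static MergeResult merge(List<Segment> segments) {
        MergeResult out = new MergeResult();

        // Build the merged name -> number mapping first (order of first appearance).
        for (Segment s : segments) {
            for (String name : s.fieldNames) {
                if (!out.fieldNames.contains(name)) {
                    out.fieldNames.add(name);
                }
            }
        }

        for (Segment s : segments) {
            if (s.fieldNames.equals(out.fieldNames)) {
                // Fast path: identical field numbering, so the bytes are already valid.
                for (byte[] doc : s.docs) {
                    out.docs.add(doc.clone());
                    out.bulkCopiedDocs++;
                }
            } else {
                // Slow path: rewrite every field number on every document.
                for (byte[] doc : s.docs) {
                    byte[] copy = doc.clone();
                    for (int i = 0; i < copy.length; i += 2) {
                        String name = s.fieldNames.get(copy[i]);
                        copy[i] = (byte) out.fieldNames.indexOf(name);
                    }
                    out.docs.add(copy);
                    out.remappedDocs++;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Segment a = new Segment(Arrays.asList("title", "body"),
                Arrays.asList(new byte[] {0, 10, 1, 20}));
        Segment b = new Segment(Arrays.asList("body", "title"),
                Arrays.asList(new byte[] {0, 30, 1, 40}));
        MergeResult r = merge(Arrays.asList(a, b));
        System.out.println(r.bulkCopiedDocs + " bulk-copied, " + r.remappedDocs + " remapped");
    }
}
```

In this toy setup segment a matches the merged numbering and is copied
untouched, while segment b numbers the same fields in the opposite
order and has to be rewritten.  A real implementation would presumably
copy whole byte ranges from the stored-fields file rather than one
document at a time.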

Mike


