lucene-dev mailing list archives

From "Michael McCandless" <>
Subject Re: possible segment merge improvement?
Date Thu, 01 Nov 2007 10:04:56 GMT
"robert engels" <> wrote:

> Why not check the fields dictionary for the segments being merged,
> and if the same, just copy the binary data directly?


While Lucene, unlike e.g. KinoSearch, does not have a global field
schema/semantics, I think for many apps the fields are in fact static.

In KinoSearch, merging of stored fields & term vectors is always a
fast concatenation of the entry for that document, whereas Lucene must,
in general, re-interpret/re-number all fields on the doc.  In fact I
think that KinoSearch stores field names directly in the index (i.e.,
not numbers).

If we make this change to Lucene then for those apps that effectively
have a static field schema (because all docs always have matching
fields), we can get the same performance that KinoSearch always gets
during its merging of stored fields & term vectors.  For all other
apps we must continue to re-interpret each field on each document.
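
A rough sketch of the idea in Java, to make the two paths concrete.
This is not Lucene's actual merge code: StoredFieldsMergeSketch,
Segment, StoredField and MergeResult are hypothetical stand-ins, and
copying a list of entries stands in for copying the raw bytes.  It
assumes the caller supplies the merged field list and that every
field name in every segment appears in it.

```java
import java.util.ArrayList;
import java.util.List;

public class StoredFieldsMergeSketch {

    /** One stored field of a doc: segment-local field number plus value. */
    public static final class StoredField {
        public final int fieldNumber;
        public final String value;
        public StoredField(int fieldNumber, String value) {
            this.fieldNumber = fieldNumber;
            this.value = value;
        }
    }

    /** Hypothetical segment: field names by number, plus each doc's stored fields. */
    public static final class Segment {
        public final List<String> fieldNames;        // list index == field number
        public final List<List<StoredField>> docs;
        public Segment(List<String> fieldNames, List<List<StoredField>> docs) {
            this.fieldNames = fieldNames;
            this.docs = docs;
        }
    }

    /** Merged docs plus counters showing which path each doc took. */
    public static final class MergeResult {
        public final List<List<StoredField>> docs = new ArrayList<>();
        public int bulkCopied = 0;
        public int renumbered = 0;
    }

    public static MergeResult merge(List<String> mergedFieldNames, List<Segment> segments) {
        MergeResult result = new MergeResult();
        for (Segment seg : segments) {
            if (seg.fieldNames.equals(mergedFieldNames)) {
                // Same name -> number mapping: the entries are already valid
                // in the merged segment, so copy them without decoding.
                result.docs.addAll(seg.docs);
                result.bulkCopied += seg.docs.size();
            } else {
                // Mapping differs: re-interpret every field of every doc.
                for (List<StoredField> doc : seg.docs) {
                    List<StoredField> remapped = new ArrayList<>();
                    for (StoredField f : doc) {
                        String name = seg.fieldNames.get(f.fieldNumber);
                        int newNumber = mergedFieldNames.indexOf(name);
                        if (newNumber < 0) {
                            throw new IllegalArgumentException("unknown field: " + name);
                        }
                        remapped.add(new StoredField(newNumber, f.value));
                    }
                    result.docs.add(remapped);
                    result.renumbered++;
                }
            }
        }
        return result;
    }
}
```

With a static schema every segment takes the first branch, which is
the case where the merge reduces to concatenation.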
