lucene-dev mailing list archives

From Doron Cohen <>
Subject Re: possible segment merge improvement?
Date Thu, 01 Nov 2007 14:33:14 GMT

wrote on 01/11/2007 16:10:27:

> > If we make this change to Lucene then for those apps that effectively
> > have a static field schema (because all docs always have matching
> > fields), we can get the same performance that KinoSearch always gets
> > during its merging of stored fields & term vectors.
> Does "all docs have matching fields" mean that the fields must be
> present (as well as identically typed) on each doc, or could they
> still be sparse?  If they can be sparse, how do you avoid
> renumbering???

Perhaps I misinterpreted this optimization proposal.

My understanding is that this is for stored fields data
in the field data (.fdt) file, where FieldNum might
need to be changed, in:

   DocFieldData --> FieldCount, <FieldNum, Bits, Value>^FieldCount

My reading of Robert's suggestion is that when we know that
the FieldInfos of the resulting segment is identical to the
FieldInfos of a certain (sub)segment being merged, there is
no need to parse and rewrite the field data for all docs of
that (sub)segment; rather, they can be written as is.
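
To make that reading concrete, here is a rough sketch in plain Java.
It is not Lucene's merge code: the class StoredFieldsMergeSketch, the
method remapFieldNums, and the idea of modelling FieldInfos as a list
of field names indexed by field number are all assumptions made for
illustration. Only the FieldNum part of each stored field is modelled;
Bits and Value are treated as opaque and left out.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Lucene.
final class StoredFieldsMergeSketch {

    /**
     * Returns the field numbers to write for one document in the merged segment.
     * subFields.get(n) and mergedFields.get(n) give the name of field number n
     * in the (sub)segment and in the resulting segment, respectively.
     */
    static int[] remapFieldNums(int[] docFieldNums,
                                List<String> subFields,
                                List<String> mergedFields) {
        if (subFields.equals(mergedFields)) {
            // Identical name-to-number mapping: nothing to renumber,
            // so the document's field data could be written as is.
            return docFieldNums;
        }
        // Slow path: look up each field's name and renumber it.
        int[] remapped = new int[docFieldNums.length];
        for (int i = 0; i < docFieldNums.length; i++) {
            String name = subFields.get(docFieldNums[i]);
            int newNum = mergedFields.indexOf(name);
            if (newNum < 0) {
                throw new IllegalStateException("field missing from merged segment: " + name);
            }
            remapped[i] = newNum;
        }
        return remapped;
    }

    public static void main(String[] args) {
        List<String> sub = Arrays.asList("id", "title", "body");
        List<String> merged = Arrays.asList("title", "id", "body");
        int[] sparseDoc = {0, 2}; // a doc that stores only "id" and "body"

        // Different numbering: "id" moves from 0 to 1, "body" stays at 2.
        System.out.println(Arrays.toString(remapFieldNums(sparseDoc, sub, merged))); // [1, 2]
        // Identical numbering: the same array comes back untouched.
        System.out.println(remapFieldNums(sparseDoc, sub, sub) == sparseDoc); // true
    }
}
```

In this sketch the check is made on the segment-level mapping, not on
which fields an individual doc happens to carry, so a sparse doc still
takes the fast path whenever the two mappings match.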