lucene-dev mailing list archives

From "Doron Cohen (JIRA)" <>
Subject [jira] Updated: (LUCENE-756) Maintain norms in a single file .nrm
Date Thu, 21 Dec 2006 04:28:22 GMT

Doron Cohen updated LUCENE-756:

    Attachment: nrm.patch.txt

Replacing the patch file (prev file was garbage - "svn stat" instead of "svn diff").

A few words on how this patch works:
- <segment>.nrm file was added.
- addDocument  (DocumentWriter) still writes each norm to a separate file - but that's in
- at merge, all norms are written to a single file.
- CFS now also maintains all norms in a single file.
- The IndexWriter merge decision now considers hasSeparateNorms() not only for CFS but also
for non-compound segments.
- SegmentReader.openNorms() still creates ready-to-use/load Norm objects (which would read
the norms only when needed). But the Norm object is now assigned a normSeek value, which is
nonzero if the norm file is <segment>.nrm.
- existing indexes, created prior to this change, are managed the same way as segments
resulting from addDocument.
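
The single-file layout described in the list above can be sketched roughly as follows. This is only an illustration, not code from the patch: the class name, the 4-byte header, and the one-byte-per-document, field-by-field layout are assumptions made for the example. It shows why a per-field normSeek offset is enough to locate one field's norms inside a shared file, and why that offset would be nonzero whenever the file starts with a header.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical sketch; names and layout are assumptions, not the patch itself.
final class SingleNormsFileSketch {
    // Assumed fixed-size header at the start of the single norms file.
    static final byte[] HEADER = {'N', 'R', 'M', 1};

    // Concatenates the norms of all fields (one byte per document) after the header.
    static byte[] writeSingleFile(byte[][] normsPerField) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(HEADER, 0, HEADER.length);
        for (byte[] norms : normsPerField) {
            out.write(norms, 0, norms.length);
        }
        return out.toByteArray();
    }

    // Offset of a field's norms within the single file; never zero because of the header.
    static long normSeek(int fieldNumber, int maxDoc) {
        return HEADER.length + (long) fieldNumber * maxDoc;
    }

    // Reads the norms of one field by jumping straight to its offset.
    static byte[] readNorms(byte[] file, int fieldNumber, int maxDoc) {
        int start = (int) normSeek(fieldNumber, maxDoc);
        return Arrays.copyOfRange(file, start, start + maxDoc);
    }

    public static void main(String[] args) {
        byte[][] norms = {{10, 11, 12}, {20, 21, 22}}; // two fields, three documents
        byte[] file = writeSingleFile(norms);
        System.out.println("normSeek(field 1) = " + normSeek(1, 3));
        System.out.println("norms(field 1)    = " + Arrays.toString(readNorms(file, 1, 3)));
    }
}
```

With a layout like this, a reader needs only one open file per segment for norms, regardless of the number of fields, and can still load each field's norms lazily.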

- I also verified that the (contrib) tests for FieldNormModifier and LengthNormModifier are

- I might add a test.
- more benchmarking?
- update fileFormat document.

> Maintain norms in a single file .nrm
> ------------------------------------
>                 Key: LUCENE-756
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Doron Cohen
>         Assigned To: Doron Cohen
>            Priority: Minor
>         Attachments: nrm.patch.txt
> Non-compound indexes are ~10% faster at indexing, and perform 50% IO activity compared
> to compound indexes. But their file descriptor footprint is much higher.
> By maintaining all field norms in a single .nrm file, we can bound the number of files
> used by non-compound indexes, and possibly allow more applications to use this format.
> More details on the motivation for this in:
(in particular

This message is automatically generated by JIRA.