lucene-java-commits mailing list archives

Subject svn commit: r415851 - /lucene/java/trunk/xdocs/fileformats.xml
Date Wed, 21 Jun 2006 00:29:32 GMT
Author: gsingers
Date: Tue Jun 20 17:29:32 2006
New Revision: 415851

Updated the 1.9 reference at the top of the file and added in some cross references to the


Modified: lucene/java/trunk/xdocs/fileformats.xml
--- lucene/java/trunk/xdocs/fileformats.xml (original)
+++ lucene/java/trunk/xdocs/fileformats.xml Tue Jun 20 17:29:32 2006
@@ -14,7 +14,7 @@
                 This document defines the index file formats used
-                in Lucene version 1.9.  If you are using a different
+                in Lucene version 2.0.  If you are using a different
 		version of Lucene, please consult the copy of
 		<code>docs/fileformats.html</code> that was distributed
 		with the version you are using.
@@ -107,7 +107,7 @@
                     tokenized, but sometimes it is useful for certain identifier fields
                     to be indexed literally.
+                <p>See the <a href="">Field</a> java docs for more information on Fields.</p>
             <subsection name="Segments">
@@ -230,8 +230,9 @@
                 <li><p>Term Vectors.  For each field in each document, the term
-                       (sometimes called document vector) is stored.  A term vector consists
-                       of term text and term frequency.
+                       (sometimes called document vector) may be stored.  A term vector consists
+                       of term text and term frequency.  To add Term Vectors to your index see the
+                    <a href="">Field</a>
                 <li><p>Deleted documents.
@@ -249,7 +250,8 @@
                 All files belonging to a segment have the same name with varying
                 extensions.  The extensions correspond to the different file formats
-                described below.
+                described below. When using the Compound File format (default in 1.4 and greater) these files are
+                collapsed into a single .cfs file (see below for details)
@@ -814,6 +816,7 @@
             	<p>FileName --&gt; String</p>
             	<p>FileData --&gt; raw file data</p>
+                <p>The raw file data is the data from the individual files named above.</p>
@@ -1096,7 +1099,10 @@
                             particular, it is the difference between the position of this
                             entry in that file and the position of the previous term's entry.
-                        <p>TODO: document skipInterval information</p>
+                        <p>SkipInterval is the fraction of TermDocs stored in skip tables. It is used to accelerate TermDocs.skipTo(int).
+                            Larger values result in smaller indexes, greater acceleration, but fewer accelerable cases, while
+                            smaller values result in bigger indexes, less acceleration and
+                            more accelerable cases.</p>
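
The SkipInterval trade-off described in the last hunk can be illustrated with a toy in-memory posting list. This is only a sketch under stated assumptions: the class SkipIntervalSketch and all of its members are hypothetical names, it keeps a single-level skip table in arrays, and it does not reproduce Lucene's on-disk layout or its actual skipTo implementation. It shows only that a larger interval means fewer skip entries and a longer residual scan after each jump.

```java
/** Toy posting list with a one-level skip table (hypothetical, not Lucene code). */
class SkipIntervalSketch {
    private final int[] docs;          // sorted document numbers for one term
    private final int[] skipDocs;      // document number stored at each skip point
    private final int[] skipPositions; // index into docs for each skip point
    private int pos = -1;              // current position; -1 means before the first doc
    private int lastScanned = 0;       // docs stepped over by the most recent skipTo

    SkipIntervalSketch(int[] sortedDocs, int skipInterval) {
        if (skipInterval < 1) {
            throw new IllegalArgumentException("skipInterval must be >= 1");
        }
        this.docs = sortedDocs.clone();
        int n = docs.length / skipInterval; // one skip entry per full block of docs
        skipDocs = new int[n];
        skipPositions = new int[n];
        for (int i = 0; i < n; i++) {
            int p = (i + 1) * skipInterval - 1; // last doc of block i
            skipPositions[i] = p;
            skipDocs[i] = docs[p];
        }
    }

    int skipEntryCount() { return skipDocs.length; }

    int lastScanned() { return lastScanned; }

    /** Moves beyond the current doc to the first doc >= target; returns -1 if none. */
    int skipTo(int target) {
        // Jump to the furthest skip point that is still before the target
        // and ahead of the current position (linear search keeps the toy short).
        for (int i = skipDocs.length - 1; i >= 0; i--) {
            if (skipDocs[i] < target && skipPositions[i] > pos) {
                pos = skipPositions[i];
                break;
            }
        }
        // Finish with a sequential scan from the jump point.
        int scanned = 0;
        do {
            pos++;
            scanned++;
        } while (pos < docs.length && docs[pos] < target);
        lastScanned = scanned;
        return pos < docs.length ? docs[pos] : -1;
    }

    public static void main(String[] args) {
        int[] docs = new int[100];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = 2 * i; // even document numbers 0..198
        }
        for (int interval : new int[] {4, 16}) {
            SkipIntervalSketch list = new SkipIntervalSketch(docs, interval);
            int doc = list.skipTo(151);
            System.out.println("interval=" + interval
                    + " skipEntries=" + list.skipEntryCount()
                    + " doc=" + doc
                    + " scanned=" + list.lastScanned());
        }
    }
}
```

In this toy, an interval of 16 stores fewer skip entries than an interval of 4 but leaves more documents to scan after the jump, and a target that falls before the first skip point gets no help from the table at all.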
