lucene-dev mailing list archives

From Christoph Goller <gol...@detego-software.de>
Subject Re: optimized disk usage when creating a compound index
Date Fri, 06 Aug 2004 13:22:23 GMT
It will not be lost. I have already reviewed it.
There are some open issues concerning the changes in
TestCompoundFile that I want to discuss with Bernhard,
and then (hopefully next week) I will commit it.

Christoph

Erik Hatcher wrote:
> Bernhard,
> 
> Impressive work.  In order to prevent this from being lost in e-mail,  
> could you please create a new Bugzilla issue for each of your great  
> patches and attach the differences as CVS patches (cvs diff -Nu)?
> 
> Many thanks for these contributions.
> 
>     Erik
> 
> On Aug 6, 2004, at 3:52 AM, Bernhard Messer wrote:
> 
>> hi developers,
>>
>> I made some measurements of Lucene's disk usage during index creation.
>> It's no surprise that during index creation, and in particular during
>> index optimization, more disk space is needed than the final index will
>> occupy. What I didn't expect was such a large difference in disk usage
>> depending on whether the compound file option is switched on or off.
>> With the compound file option enabled, disk usage during index creation
>> is more than 3 times the final index size. This can be a pain in the
>> neck for projects like Nutch, where huge datasets are indexed. The
>> growth comes from the fact that SegmentMerger creates the complete
>> compound file first and only then deletes the original, now unused
>> files.
>>
>> So I patched the SegmentMerger and CompoundFileWriter classes so that
>> each source file is deleted immediately after its data has been copied
>> into the compound file. As a result, the disk space needed during index
>> creation drops from roughly 3 times to roughly 2 times the final index
>> size.
>>
>> The change also requires some modifications to the TestCompoundFile
>> class. Several test methods compared an original file to its
>> counterpart inside the compound file; with the modified SegmentMerger
>> and CompoundFileWriter, the original file has already been deleted and
>> can no longer be opened, so the tests now open the original files
>> before the compound file is built.
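>>
>> For illustration only, here is a minimal sketch of how an index can be
>> built with the compound file option switched on or off (this is not
>> part of the patch; it assumes the Lucene 1.4 IndexWriter API, and the
>> analyzer and documents are placeholders):
>>
>> import java.io.IOException;
>> import org.apache.lucene.analysis.standard.StandardAnalyzer;
>> import org.apache.lucene.document.Document;
>> import org.apache.lucene.document.Field;
>> import org.apache.lucene.index.IndexWriter;
>>
>> public class CompoundOptionDemo {
>>     public static void main(String[] args) throws IOException {
>>         boolean useCompound = true;   // set to false for the multi-file format
>>
>>         IndexWriter writer = new IndexWriter("testIndex",
>>                 new StandardAnalyzer(), true);
>>         writer.setUseCompoundFile(useCompound);
>>
>>         for (int i = 0; i < 100000; i++) {
>>             Document doc = new Document();
>>             doc.add(Field.Text("body", "sample text for document " + i));
>>             writer.addDocument(doc);
>>         }
>>
>>         // the final merge is where the peak disk usage shows up
>>         writer.optimize();
>>         writer.close();
>>     }
>> }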
>>
>> Here are some statistics about disk usage during index creation:
>>
>> compound option is off:
>> final index size:    380 KB      max. diskspace used:    408 KB
>> final index size:  11079 KB      max. diskspace used:  11381 KB
>> final index size: 204148 KB      max. diskspace used:  20739 KB
>>
>> using compound index:
>> final index size:    380 KB      max. diskspace used:   1145 KB
>> final index size:  11079 KB      max. diskspace used:  33544 KB
>> final index size: 204148 KB      max. diskspace used: 614977 KB
>>
>> using compound index with patch:
>> final index size:    380 KB      max. diskspace used:    777 KB
>> final index size:  11079 KB      max. diskspace used:  22464 KB
>> final index size: 204148 KB      max. diskspace used: 410829 KB
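>>
>> One simple way to collect a "max. diskspace used" figure like the ones
>> above is a background thread that polls the total size of the index
>> directory while indexing runs. The sketch below only illustrates the
>> idea and is not necessarily the harness used for these numbers:
>>
>> import java.io.File;
>>
>> /** Polls an index directory and remembers the largest total size seen. */
>> public class DiskUsageMonitor extends Thread {
>>     private final File indexDir;
>>     private long maxBytes = 0;
>>
>>     public DiskUsageMonitor(File indexDir) {
>>         this.indexDir = indexDir;
>>         setDaemon(true);   // don't keep the JVM alive after indexing ends
>>     }
>>
>>     public void run() {
>>         while (true) {
>>             File[] files = indexDir.listFiles();
>>             long total = 0;
>>             if (files != null) {
>>                 for (int i = 0; i < files.length; i++) {
>>                     total += files[i].length();
>>                 }
>>             }
>>             if (total > maxBytes) {
>>                 maxBytes = total;
>>             }
>>             try {
>>                 Thread.sleep(50);   // sample every 50 ms
>>             } catch (InterruptedException e) {
>>                 return;
>>             }
>>         }
>>     }
>>
>>     public long getMaxBytes() {
>>         return maxBytes;
>>     }
>> }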
>>
>> The change was tested under Windows and Linux without any negative
>> side effects. All JUnit test cases pass. In the attachment you'll find
>> all the necessary files:
>>
>> SegmentMerger.java
>> CompoundFileWriter.java
>> TestCompoundFile.java
>>
>> SegmentMerger.diff
>> CompoundFileWriter.diff
>> TestCompoundFile.diff
>>
>> keep moving
>> Bernhard
>>
>>
>> Index: src/java/org/apache/lucene/index/CompoundFileWriter.java
>> ===================================================================
>> RCS file:  
>> /home/cvspublic/jakarta-lucene/src/java/org/apache/lucene/index/ 
>> CompoundFileWriter.java,v
>> retrieving revision 1.3
>> diff -r1.3 CompoundFileWriter.java
>> 163a164,166
>>
>>>
>>>                 // immediately delete the copied file to save disk space
>>>                 directory.deleteFile((String) fe.file);
>>
>> package org.apache.lucene.index;
>>
>> /**
>>  * Copyright 2004 The Apache Software Foundation
>>  *
>>  * Licensed under the Apache License, Version 2.0 (the "License");
>>  * you may not use this file except in compliance with the License.
>>  * You may obtain a copy of the License at
>>  *
>>  *     http://www.apache.org/licenses/LICENSE-2.0
>>  *
>>  * Unless required by applicable law or agreed to in writing, software
>>  * distributed under the License is distributed on an "AS IS" BASIS,
>>  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or  
>> implied.
>>  * See the License for the specific language governing permissions and
>>  * limitations under the License.
>>  */
>>
>> import org.apache.lucene.store.Directory;
>> import org.apache.lucene.store.OutputStream;
>> import org.apache.lucene.store.InputStream;
>> import java.util.LinkedList;
>> import java.util.HashSet;
>> import java.util.Iterator;
>> import java.io.IOException;
>>
>>
>> /**
>>  * Combines multiple files into a single compound file.
>>  * The file format:<br>
>>  * <ul>
>>  *     <li>VInt fileCount</li>
>>  *     <li>{Directory}
>>  *         fileCount entries with the following structure:</li>
>>  *         <ul>
>>  *             <li>long dataOffset</li>
>>  *             <li>UTFString extension</li>
>>  *         </ul>
>>  *     <li>{File Data}
>>  *         fileCount entries with the raw data of the corresponding  
>> file</li>
>>  * </ul>
>>  *
>>  * The fileCount integer indicates how many files are contained in  
>> this compound
>>  * file. The {directory} that follows has that many entries. Each  
>> directory entry
>>  * contains an encoding identifier, a long pointer to the start of
>> this file's
>>  * data section, and a UTF String with that file's extension.
>>  *
>>  * @author Dmitry Serebrennikov
>>  * @version $Id: CompoundFileWriter.java,v 1.3 2004/03/29 22:48:02  
>> cutting Exp $
>>  */
>> final class CompoundFileWriter {
>>
>>     private static final class FileEntry {
>>         /** source file */
>>         String file;
>>
>>         /** temporary holder for the start of directory entry for 
>> this  file */
>>         long directoryOffset;
>>
>>         /** temporary holder for the start of this file's data 
>> section  */
>>         long dataOffset;
>>     }
>>
>>
>>     private Directory directory;
>>     private String fileName;
>>     private HashSet ids;
>>     private LinkedList entries;
>>     private boolean merged = false;
>>
>>
>>     /** Create the compound stream in the specified file. The file  
>> name is the
>>      *  entire name (no extensions are added).
>>      */
>>     public CompoundFileWriter(Directory dir, String name) {
>>         if (dir == null)
>>             throw new IllegalArgumentException("Missing directory");
>>         if (name == null)
>>             throw new IllegalArgumentException("Missing name");
>>
>>         directory = dir;
>>         fileName = name;
>>         ids = new HashSet();
>>         entries = new LinkedList();
>>     }
>>
>>     /** Returns the directory of the compound file. */
>>     public Directory getDirectory() {
>>         return directory;
>>     }
>>
>>     /** Returns the name of the compound file. */
>>     public String getName() {
>>         return fileName;
>>     }
>>
>>     /** Add a source stream. If sourceDir is null, it is set to the
>>      *  same value as the directory where this compound stream exists.
>>      *  The id is the string by which the sub-stream will be known in the
>>      *  compound stream. The caller must ensure that the ID is 
>> unique.  If the
>>      *  id is null, it is set to the name of the source file.
>>      */
>>     public void addFile(String file) {
>>         if (merged)
>>             throw new IllegalStateException(
>>                 "Can't add extensions after merge has been called");
>>
>>         if (file == null)
>>             throw new IllegalArgumentException(
>>                 "Missing source file");
>>
>>         if (! ids.add(file))
>>             throw new IllegalArgumentException(
>>                 "File " + file + " already added");
>>
>>         FileEntry entry = new FileEntry();
>>         entry.file = file;
>>         entries.add(entry);
>>     }
>>
>>     /** Merge files with the extensions added up to now.
>>      *  All files with these extensions are combined sequentially 
>> into  the
>>      *  compound stream. After successful merge, the source files
>>      *  are deleted.
>>      */
>>     public void close() throws IOException {
>>         if (merged)
>>             throw new IllegalStateException(
>>                 "Merge already performed");
>>
>>         if (entries.isEmpty())
>>             throw new IllegalStateException(
>>                 "No entries to merge have been defined");
>>
>>         merged = true;
>>
>>         // open the compound stream
>>         OutputStream os = null;
>>         try {
>>             os = directory.createFile(fileName);
>>
>>             // Write the number of entries
>>             os.writeVInt(entries.size());
>>
>>             // Write the directory with all offsets at 0.
>>             // Remember the positions of directory entries so that we  
>> can
>>             // adjust the offsets later
>>             Iterator it = entries.iterator();
>>             while(it.hasNext()) {
>>                 FileEntry fe = (FileEntry) it.next();
>>                 fe.directoryOffset = os.getFilePointer();
>>                 os.writeLong(0);    // for now
>>                 os.writeString(fe.file);
>>             }
>>
>>             // Open the files and copy their data into the stream.
>>             // Remember the locations of each file's data section.
>>             byte buffer[] = new byte[1024];
>>             it = entries.iterator();
>>             while(it.hasNext()) {
>>                 FileEntry fe = (FileEntry) it.next();
>>                 fe.dataOffset = os.getFilePointer();
>>                 copyFile(fe, os, buffer);
>>
>>                 // immediately delete the copied file to save disk space
>>                 directory.deleteFile((String) fe.file);
>>             }
>>
>>             // Write the data offsets into the directory of the  
>> compound stream
>>             it = entries.iterator();
>>             while(it.hasNext()) {
>>                 FileEntry fe = (FileEntry) it.next();
>>                 os.seek(fe.directoryOffset);
>>                 os.writeLong(fe.dataOffset);
>>             }
>>
>>             // Close the output stream. Set the os to null before  
>> trying to
>>             // close so that if an exception occurs during the close,  
>> the
>>             // finally clause below will not attempt to close the  stream
>>             // the second time.
>>             OutputStream tmp = os;
>>             os = null;
>>             tmp.close();
>>
>>         } finally {
>>             if (os != null) try { os.close(); } catch (IOException e)  
>> { }
>>         }
>>     }
>>
>>     /** Copy the contents of the file with specified extension into the
>>      *  provided output stream. Use the provided buffer for moving data
>>      *  to reduce memory allocation.
>>      */
>>     private void copyFile(FileEntry source, OutputStream os, byte  
>> buffer[])
>>     throws IOException
>>     {
>>         InputStream is = null;
>>         try {
>>             long startPtr = os.getFilePointer();
>>
>>             is = directory.openFile(source.file);
>>             long length = is.length();
>>             long remainder = length;
>>             int chunk = buffer.length;
>>
>>             while(remainder > 0) {
>>                 int len = (int) Math.min(chunk, remainder);
>>                 is.readBytes(buffer, 0, len);
>>                 os.writeBytes(buffer, len);
>>                 remainder -= len;
>>             }
>>
>>             // Verify that remainder is 0
>>             if (remainder != 0)
>>                 throw new IOException(
>>                     "Non-zero remainder length after copying: " +  
>> remainder
>>                     + " (id: " + source.file + ", length: " + length
>>                     + ", buffer size: " + chunk + ")");
>>
>>             // Verify that the output length diff is equal to 
>> original  file
>>             long endPtr = os.getFilePointer();
>>             long diff = endPtr - startPtr;
>>             if (diff != length)
>>                 throw new IOException(
>>                     "Difference in the output file offsets " + diff
>>                     + " does not match the original file length " +  
>> length);
>>
>>         } finally {
>>             if (is != null) is.close();
>>         }
>>     }
>> }
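>>
>> As a side note, the table of contents that CompoundFileWriter writes
>> (a VInt file count, then one long data offset and one string per entry,
>> as written by writeLong/writeString above) can be read back with the
>> org.apache.lucene.store.InputStream API. A purely illustrative sketch,
>> not part of the patch (CompoundFileReader is the real consumer of this
>> format):
>>
>> import java.io.IOException;
>> import org.apache.lucene.store.Directory;
>> import org.apache.lucene.store.InputStream;
>>
>> /** Dumps the directory section of a compound file. */
>> public class CompoundFileToc {
>>     public static void dump(Directory dir, String name) throws IOException {
>>         InputStream in = dir.openFile(name);
>>         try {
>>             int count = in.readVInt();              // number of sub-files
>>             for (int i = 0; i < count; i++) {
>>                 long dataOffset = in.readLong();    // start of this entry's data
>>                 String fileName = in.readString();  // name written via writeString(fe.file)
>>                 System.out.println(fileName + " @ " + dataOffset);
>>             }
>>         } finally {
>>             in.close();
>>         }
>>     }
>> }
>>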
>> Index: src/java/org/apache/lucene/index/SegmentMerger.java
>> ===================================================================
>> RCS file:  
>> /home/cvspublic/jakarta-lucene/src/java/org/apache/lucene/index/ 
>> SegmentMerger.java,v
>> retrieving revision 1.11
>> diff -r1.11 SegmentMerger.java
>> 151c151
>> <     // Perform the merge
>> ---
>>
>>>     // Perform the merge. Files will be deleted within  
>>> CompoundFileWriter.close()
>>
>> 153,158c153
>> <
>> <     // Now delete the source files
>> <     it = files.iterator();
>> <     while (it.hasNext()) {
>> <       directory.deleteFile((String) it.next());
>> <     }
>> ---
>>
>>>
>> package org.apache.lucene.index;
>>
>> /**
>>  * Copyright 2004 The Apache Software Foundation
>>  *
>>  * Licensed under the Apache License, Version 2.0 (the "License");
>>  * you may not use this file except in compliance with the License.
>>  * You may obtain a copy of the License at
>>  *
>>  *     http://www.apache.org/licenses/LICENSE-2.0
>>  *
>>  * Unless required by applicable law or agreed to in writing, software
>>  * distributed under the License is distributed on an "AS IS" BASIS,
>>  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or  
>> implied.
>>  * See the License for the specific language governing permissions and
>>  * limitations under the License.
>>  */
>>
>> import java.util.Vector;
>> import java.util.ArrayList;
>> import java.util.Iterator;
>> import java.io.IOException;
>>
>> import org.apache.lucene.store.Directory;
>> import org.apache.lucene.store.OutputStream;
>> import org.apache.lucene.store.RAMOutputStream;
>>
>> /**
>>  * The SegmentMerger class combines two or more Segments, represented  
>> by an IndexReader ({@link #add}),
>>  * into a single Segment.  After adding the appropriate readers, call  
>> the merge method to combine the
>>  * segments.
>>  *<P>
>>  * If the compoundFile flag is set, then the segments will be merged  
>> into a compound file.
>>  *
>>  *
>>  * @see #merge
>>  * @see #add
>>  */
>> final class SegmentMerger {
>>   private boolean useCompoundFile;
>>   private Directory directory;
>>   private String segment;
>>
>>   private Vector readers = new Vector();
>>   private FieldInfos fieldInfos;
>>
>>   // File extensions of old-style index files
>>   private static final String COMPOUND_EXTENSIONS[] = new String[] {
>>     "fnm", "frq", "prx", "fdx", "fdt", "tii", "tis"
>>   };
>>   private static final String VECTOR_EXTENSIONS[] = new String[] {
>>     "tvx", "tvd", "tvf"
>>   };
>>
>>   /**
>>    *
>>    * @param dir The Directory to merge the other segments into
>>    * @param name The name of the new segment
>>    * @param compoundFile true if the new segment should use a  
>> compoundFile
>>    */
>>   SegmentMerger(Directory dir, String name, boolean compoundFile) {
>>     directory = dir;
>>     segment = name;
>>     useCompoundFile = compoundFile;
>>   }
>>
>>   /**
>>    * Add an IndexReader to the collection of readers that are to be  
>> merged
>>    * @param reader
>>    */
>>   final void add(IndexReader reader) {
>>     readers.addElement(reader);
>>   }
>>
>>   /**
>>    *
>>    * @param i The index of the reader to return
>>    * @return The ith reader to be merged
>>    */
>>   final IndexReader segmentReader(int i) {
>>     return (IndexReader) readers.elementAt(i);
>>   }
>>
>>   /**
>>    * Merges the readers specified by the {@link #add} method into the  
>> directory passed to the constructor
>>    * @return The number of documents that were merged
>>    * @throws IOException
>>    */
>>   final int merge() throws IOException {
>>     int value;
>>
>>     value = mergeFields();
>>     mergeTerms();
>>     mergeNorms();
>>
>>     if (fieldInfos.hasVectors())
>>       mergeVectors();
>>
>>     if (useCompoundFile)
>>       createCompoundFile();
>>
>>     return value;
>>   }
>>
>>   /**
>>    * close all IndexReaders that have been added.
>>    * Should not be called before merge().
>>    * @throws IOException
>>    */
>>   final void closeReaders() throws IOException {
>>     for (int i = 0; i < readers.size(); i++) {  // close readers
>>       IndexReader reader = (IndexReader) readers.elementAt(i);
>>       reader.close();
>>     }
>>   }
>>
>>   private final void createCompoundFile()
>>           throws IOException {
>>     CompoundFileWriter cfsWriter =
>>             new CompoundFileWriter(directory, segment + ".cfs");
>>
>>     ArrayList files =
>>       new ArrayList(COMPOUND_EXTENSIONS.length + fieldInfos.size());
>>
>>     // Basic files
>>     for (int i = 0; i < COMPOUND_EXTENSIONS.length; i++) {
>>       files.add(segment + "." + COMPOUND_EXTENSIONS[i]);
>>     }
>>
>>     // Field norm files
>>     for (int i = 0; i < fieldInfos.size(); i++) {
>>       FieldInfo fi = fieldInfos.fieldInfo(i);
>>       if (fi.isIndexed) {
>>         files.add(segment + ".f" + i);
>>       }
>>     }
>>
>>     // Vector files
>>     if (fieldInfos.hasVectors()) {
>>       for (int i = 0; i < VECTOR_EXTENSIONS.length; i++) {
>>         files.add(segment + "." + VECTOR_EXTENSIONS[i]);
>>       }
>>     }
>>
>>     // Now merge all added files
>>     Iterator it = files.iterator();
>>     while (it.hasNext()) {
>>       cfsWriter.addFile((String) it.next());
>>     }
>>
>>     // Perform the merge. Files will be deleted within  
>> CompoundFileWriter.close()
>>     cfsWriter.close();
>>
>>   }
>>
>>   /**
>>    *
>>    * @return The number of documents in all of the readers
>>    * @throws IOException
>>    */
>>   private final int mergeFields() throws IOException {
>>     fieldInfos = new FieldInfos();          // merge field names
>>     int docCount = 0;
>>     for (int i = 0; i < readers.size(); i++) {
>>       IndexReader reader = (IndexReader) readers.elementAt(i);
>>       fieldInfos.addIndexed(reader.getIndexedFieldNames(true), true);
>>       fieldInfos.addIndexed(reader.getIndexedFieldNames(false), false);
>>       fieldInfos.add(reader.getFieldNames(false), false);
>>     }
>>     fieldInfos.write(directory, segment + ".fnm");
>>
>>     FieldsWriter fieldsWriter = // merge field values
>>             new FieldsWriter(directory, segment, fieldInfos);
>>     try {
>>       for (int i = 0; i < readers.size(); i++) {
>>         IndexReader reader = (IndexReader) readers.elementAt(i);
>>         int maxDoc = reader.maxDoc();
>>         for (int j = 0; j < maxDoc; j++)
>>           if (!reader.isDeleted(j)) {               // skip deleted  docs
>>             fieldsWriter.addDocument(reader.document(j));
>>             docCount++;
>>           }
>>       }
>>     } finally {
>>       fieldsWriter.close();
>>     }
>>     return docCount;
>>   }
>>
>>   /**
>>    * Merge the TermVectors from each of the segments into the new one.
>>    * @throws IOException
>>    */
>>   private final void mergeVectors() throws IOException {
>>     TermVectorsWriter termVectorsWriter =
>>       new TermVectorsWriter(directory, segment, fieldInfos);
>>
>>     try {
>>       for (int r = 0; r < readers.size(); r++) {
>>         IndexReader reader = (IndexReader) readers.elementAt(r);
>>         int maxDoc = reader.maxDoc();
>>         for (int docNum = 0; docNum < maxDoc; docNum++) {
>>           // skip deleted docs
>>           if (reader.isDeleted(docNum)) {
>>             continue;
>>           }
>>           termVectorsWriter.openDocument();
>>
>>           // get all term vectors
>>           TermFreqVector[] sourceTermVector =
>>             reader.getTermFreqVectors(docNum);
>>
>>           if (sourceTermVector != null) {
>>             for (int f = 0; f < sourceTermVector.length; f++) {
>>               // translate field numbers
>>               TermFreqVector termVector = sourceTermVector[f];
>>               termVectorsWriter.openField(termVector.getField());
>>               String [] terms = termVector.getTerms();
>>               int [] freqs = termVector.getTermFrequencies();
>>
>>               for (int t = 0; t < terms.length; t++) {
>>                 termVectorsWriter.addTerm(terms[t], freqs[t]);
>>               }
>>             }
>>             termVectorsWriter.closeDocument();
>>           }
>>         }
>>       }
>>     } finally {
>>       termVectorsWriter.close();
>>     }
>>   }
>>
>>   private OutputStream freqOutput = null;
>>   private OutputStream proxOutput = null;
>>   private TermInfosWriter termInfosWriter = null;
>>   private int skipInterval;
>>   private SegmentMergeQueue queue = null;
>>
>>   private final void mergeTerms() throws IOException {
>>     try {
>>       freqOutput = directory.createFile(segment + ".frq");
>>       proxOutput = directory.createFile(segment + ".prx");
>>       termInfosWriter =
>>               new TermInfosWriter(directory, segment, fieldInfos);
>>       skipInterval = termInfosWriter.skipInterval;
>>       queue = new SegmentMergeQueue(readers.size());
>>
>>       mergeTermInfos();
>>
>>     } finally {
>>       if (freqOutput != null) freqOutput.close();
>>       if (proxOutput != null) proxOutput.close();
>>       if (termInfosWriter != null) termInfosWriter.close();
>>       if (queue != null) queue.close();
>>     }
>>   }
>>
>>   private final void mergeTermInfos() throws IOException {
>>     int base = 0;
>>     for (int i = 0; i < readers.size(); i++) {
>>       IndexReader reader = (IndexReader) readers.elementAt(i);
>>       TermEnum termEnum = reader.terms();
>>       SegmentMergeInfo smi = new SegmentMergeInfo(base, termEnum,  
>> reader);
>>       base += reader.numDocs();
>>       if (smi.next())
>>         queue.put(smi);                  // initialize queue
>>       else
>>         smi.close();
>>     }
>>
>>     SegmentMergeInfo[] match = new SegmentMergeInfo[readers.size()];
>>
>>     while (queue.size() > 0) {
>>       int matchSize = 0;              // pop matching terms
>>       match[matchSize++] = (SegmentMergeInfo) queue.pop();
>>       Term term = match[0].term;
>>       SegmentMergeInfo top = (SegmentMergeInfo) queue.top();
>>
>>       while (top != null && term.compareTo(top.term) == 0) {
>>         match[matchSize++] = (SegmentMergeInfo) queue.pop();
>>         top = (SegmentMergeInfo) queue.top();
>>       }
>>
>>       mergeTermInfo(match, matchSize);          // add new TermInfo
>>
>>       while (matchSize > 0) {
>>         SegmentMergeInfo smi = match[--matchSize];
>>         if (smi.next())
>>           queue.put(smi);              // restore queue
>>         else
>>           smi.close();                  // done with a segment
>>       }
>>     }
>>   }
>>
>>   private final TermInfo termInfo = new TermInfo(); // minimize consing
>>
>>   /** Merge one term found in one or more segments. The array  
>> <code>smis</code>
>>    *  contains segments that are positioned at the same term.  
>> <code>N</code>
>>    *  is the number of cells in the array actually occupied.
>>    *
>>    * @param smis array of segments
>>    * @param n number of cells in the array actually occupied
>>    */
>>   private final void mergeTermInfo(SegmentMergeInfo[] smis, int n)
>>           throws IOException {
>>     long freqPointer = freqOutput.getFilePointer();
>>     long proxPointer = proxOutput.getFilePointer();
>>
>>     int df = appendPostings(smis, n);          // append posting data
>>
>>     long skipPointer = writeSkip();
>>
>>     if (df > 0) {
>>       // add an entry to the dictionary with pointers to prox and 
>> freq  files
>>       termInfo.set(df, freqPointer, proxPointer, (int) (skipPointer -  
>> freqPointer));
>>       termInfosWriter.add(smis[0].term, termInfo);
>>     }
>>   }
>>
>>   /** Process postings from multiple segments all positioned on the
>>    *  same term. Writes out merged entries into freqOutput and
>>    *  the proxOutput streams.
>>    *
>>    * @param smis array of segments
>>    * @param n number of cells in the array actually occupied
>>    * @return number of documents across all segments where this term  
>> was found
>>    */
>>   private final int appendPostings(SegmentMergeInfo[] smis, int n)
>>           throws IOException {
>>     int lastDoc = 0;
>>     int df = 0;                      // number of docs w/ term
>>     resetSkip();
>>     for (int i = 0; i < n; i++) {
>>       SegmentMergeInfo smi = smis[i];
>>       TermPositions postings = smi.postings;
>>       int base = smi.base;
>>       int[] docMap = smi.docMap;
>>       postings.seek(smi.termEnum);
>>       while (postings.next()) {
>>         int doc = postings.doc();
>>         if (docMap != null)
>>           doc = docMap[doc];                      // map around  
>> deletions
>>         doc += base;                              // convert to 
>> merged  space
>>
>>         if (doc < lastDoc)
>>           throw new IllegalStateException("docs out of order");
>>
>>         df++;
>>
>>         if ((df % skipInterval) == 0) {
>>           bufferSkip(lastDoc);
>>         }
>>
>>         int docCode = (doc - lastDoc) << 1;      // use low bit to 
>> flag  freq=1
>>         lastDoc = doc;
>>
>>         int freq = postings.freq();
>>         if (freq == 1) {
>>           freqOutput.writeVInt(docCode | 1);      // write doc & freq=1
>>         } else {
>>           freqOutput.writeVInt(docCode);      // write doc
>>           freqOutput.writeVInt(freq);          // write frequency in doc
>>         }
>>
>>         int lastPosition = 0;              // write position deltas
>>         for (int j = 0; j < freq; j++) {
>>           int position = postings.nextPosition();
>>           proxOutput.writeVInt(position - lastPosition);
>>           lastPosition = position;
>>         }
>>       }
>>     }
>>     return df;
>>   }
>>
>>   private RAMOutputStream skipBuffer = new RAMOutputStream();
>>   private int lastSkipDoc;
>>   private long lastSkipFreqPointer;
>>   private long lastSkipProxPointer;
>>
>>   private void resetSkip() throws IOException {
>>     skipBuffer.reset();
>>     lastSkipDoc = 0;
>>     lastSkipFreqPointer = freqOutput.getFilePointer();
>>     lastSkipProxPointer = proxOutput.getFilePointer();
>>   }
>>
>>   private void bufferSkip(int doc) throws IOException {
>>     long freqPointer = freqOutput.getFilePointer();
>>     long proxPointer = proxOutput.getFilePointer();
>>
>>     skipBuffer.writeVInt(doc - lastSkipDoc);
>>     skipBuffer.writeVInt((int) (freqPointer - lastSkipFreqPointer));
>>     skipBuffer.writeVInt((int) (proxPointer - lastSkipProxPointer));
>>
>>     lastSkipDoc = doc;
>>     lastSkipFreqPointer = freqPointer;
>>     lastSkipProxPointer = proxPointer;
>>   }
>>
>>   private long writeSkip() throws IOException {
>>     long skipPointer = freqOutput.getFilePointer();
>>     skipBuffer.writeTo(freqOutput);
>>     return skipPointer;
>>   }
>>
>>   private void mergeNorms() throws IOException {
>>     for (int i = 0; i < fieldInfos.size(); i++) {
>>       FieldInfo fi = fieldInfos.fieldInfo(i);
>>       if (fi.isIndexed) {
>>         OutputStream output = directory.createFile(segment + ".f" + i);
>>         try {
>>           for (int j = 0; j < readers.size(); j++) {
>>             IndexReader reader = (IndexReader) readers.elementAt(j);
>>             byte[] input = reader.norms(fi.name);
>>             int maxDoc = reader.maxDoc();
>>             for (int k = 0; k < maxDoc; k++) {
>>               byte norm = input != null ? input[k] : (byte) 0;
>>               if (!reader.isDeleted(k)) {
>>                 output.writeByte(norm);
>>               }
>>             }
>>           }
>>         } finally {
>>           output.close();
>>         }
>>       }
>>     }
>>   }
>>
>> }
>> Index: src/test/org/apache/lucene/index/TestCompoundFile.java
>> ===================================================================
>> RCS file:  
>> /home/cvspublic/jakarta-lucene/src/test/org/apache/lucene/index/ 
>> TestCompoundFile.java,v
>> retrieving revision 1.5
>> diff -r1.5 TestCompoundFile.java
>> 20a21,24
>>
>>> import java.util.Collection;
>>> import java.util.HashMap;
>>> import java.util.Iterator;
>>> import java.util.Map;
>>
>> 197a202,204
>>
>>>
>>>             InputStream expected = dir.openFile(name);
>>>
>> 203c210
>> <             InputStream expected = dir.openFile(name);
>> ---
>>
>>>
>> 206a214
>>
>>>
>> 220a229,231
>>
>>>         InputStream expected1 = dir.openFile("d1");
>>>         InputStream expected2 = dir.openFile("d2");
>>>
>> 227c238
>> <         InputStream expected = dir.openFile("d1");
>> ---
>>
>>>
>> 229,231c240,242
>> <         assertSameStreams("d1", expected, actual);
>> <         assertSameSeekBehavior("d1", expected, actual);
>> <         expected.close();
>> ---
>>
>>>         assertSameStreams("d1", expected1, actual);
>>>         assertSameSeekBehavior("d1", expected1, actual);
>>>         expected1.close();
>>
>> 234c245
>> <         expected = dir.openFile("d2");
>> ---
>>
>>>
>> 236,238c247,249
>> <         assertSameStreams("d2", expected, actual);
>> <         assertSameSeekBehavior("d2", expected, actual);
>> <         expected.close();
>> ---
>>
>>>         assertSameStreams("d2", expected2, actual);
>>>         assertSameSeekBehavior("d2", expected2, actual);
>>>         expected2.close();
>>
>> 270,271d280
>> <         // Now test
>> <         CompoundFileWriter csw = new CompoundFileWriter(dir,  
>> "test.cfs");
>> 275a285,292
>>
>>>
>>>         InputStream[] check = new InputStream[data.length];
>>>         for (int i=0; i<data.length; i++) {
>>>            check[i] = dir.openFile(segment + data[i]);
>>>         }
>>>
>>>         // Now test
>>>         CompoundFileWriter csw = new CompoundFileWriter(dir,  
>>> "test.cfs");
>>
>> 283d299
>> <             InputStream check = dir.openFile(segment + data[i]);
>> 285,286c301,302
>> <             assertSameStreams(data[i], check, test);
>> <             assertSameSeekBehavior(data[i], check, test);
>> ---
>>
>>>             assertSameStreams(data[i], check[i], test);
>>>             assertSameSeekBehavior(data[i], check[i], test);
>>
>> 288c304
>> <             check.close();
>> ---
>>
>>>             check[i].close();
>>
>> 299c315,316
>> <     private void setUp_2() throws IOException {
>> ---
>>
>>>     private Map setUp_2() throws IOException {
>>>             Map streams = new HashMap(20);
>>
>> 303a321,322
>>
>>>
>>>             streams.put("f" + i, dir.openFile("f" + i));
>>
>> 305a325,326
>>
>>>
>>>         return streams;
>>
>> 308c329,336
>> <
>> ---
>>
>>>     private void closeUp(Map streams) throws IOException {
>>>         Iterator it = streams.values().iterator();
>>>         while (it.hasNext()) {
>>>             InputStream stream = (InputStream)it.next();
>>>             stream.close();
>>>         }
>>>     }
>>>
>> 364c392
>> <         setUp_2();
>> ---
>>
>>>         Map streams = setUp_2();
>>
>> 368c396
>> <         InputStream expected = dir.openFile("f11");
>> ---
>>
>>>         InputStream expected = (InputStream)streams.get("f11");
>>
>> 410c438,439
>> <         expected.close();
>> ---
>>
>>>         closeUp(streams);
>>>
>> 418c447
>> <         setUp_2();
>> ---
>>
>>>         Map streams = setUp_2();
>>
>> 422,423c451,452
>> <         InputStream e1 = dir.openFile("f11");
>> <         InputStream e2 = dir.openFile("f3");
>> ---
>>
>>>         InputStream e1 = (InputStream)streams.get("f11");
>>>         InputStream e2 = (InputStream)streams.get("f3");
>>
>> 426c455
>> <         InputStream a2 = dir.openFile("f3");
>> ---
>>
>>>         InputStream a2 = cr.openFile("f3");
>>
>> 486,487d514
>> <         e1.close();
>> <         e2.close();
>> 490a518,519
>>
>>>
>>>         closeUp(streams);
>>
>> 497c526
>> <         setUp_2();
>> ---
>>
>>>         Map streams = setUp_2();
>>
>> 569a599,600
>>
>>>
>>>         closeUp(streams);
>>
>> 574c605
>> <         setUp_2();
>> ---
>>
>>>         Map streams = setUp_2();
>>
>> 587a619,620
>>
>>>
>>>         closeUp(streams);
>>
>> 592c625
>> <         setUp_2();
>> ---
>>
>>>         Map streams = setUp_2();
>>
>> 617a651,652
>>
>>>
>>>         closeUp(streams);
>>
>> package org.apache.lucene.index;
>>
>> /**
>>  * Copyright 2004 The Apache Software Foundation
>>  *
>>  * Licensed under the Apache License, Version 2.0 (the "License");
>>  * you may not use this file except in compliance with the License.
>>  * You may obtain a copy of the License at
>>  *
>>  *     http://www.apache.org/licenses/LICENSE-2.0
>>  *
>>  * Unless required by applicable law or agreed to in writing, software
>>  * distributed under the License is distributed on an "AS IS" BASIS,
>>  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or  
>> implied.
>>  * See the License for the specific language governing permissions and
>>  * limitations under the License.
>>  */
>>
>> import java.io.IOException;
>> import java.io.File;
>> import java.util.Collection;
>> import java.util.HashMap;
>> import java.util.Iterator;
>> import java.util.Map;
>>
>> import junit.framework.TestCase;
>> import junit.framework.TestSuite;
>> import junit.textui.TestRunner;
>> import org.apache.lucene.store.OutputStream;
>> import org.apache.lucene.store.Directory;
>> import org.apache.lucene.store.InputStream;
>> import org.apache.lucene.store.FSDirectory;
>> import org.apache.lucene.store.RAMDirectory;
>> import org.apache.lucene.store._TestHelper;
>>
>>
>> /**
>>  * @author dmitrys@earthlink.net
>>  * @version $Id: TestCompoundFile.java,v 1.5 2004/03/29 22:48:06  
>> cutting Exp $
>>  */
>> public class TestCompoundFile extends TestCase
>> {
>>     /** Main for running test case by itself. */
>>     public static void main(String args[]) {
>>         TestRunner.run (new TestSuite(TestCompoundFile.class));
>> //        TestRunner.run (new TestCompoundFile("testSingleFile"));
>> //        TestRunner.run (new TestCompoundFile("testTwoFiles"));
>> //        TestRunner.run (new TestCompoundFile("testRandomFiles"));
>> //        TestRunner.run (new  
>> TestCompoundFile("testClonedStreamsClosing"));
>> //        TestRunner.run (new TestCompoundFile("testReadAfterClose"));
>> //        TestRunner.run (new TestCompoundFile("testRandomAccess"));
>> //        TestRunner.run (new  
>> TestCompoundFile("testRandomAccessClones"));
>> //        TestRunner.run (new TestCompoundFile("testFileNotFound"));
>> //        TestRunner.run (new TestCompoundFile("testReadPastEOF"));
>>
>> //        TestRunner.run (new TestCompoundFile("testIWCreate"));
>>
>>     }
>>
>>
>>     private Directory dir;
>>
>>
>>     public void setUp() throws IOException {
>>         //dir = new RAMDirectory();
>>         dir = FSDirectory.getDirectory(new  
>> File(System.getProperty("tempDir"), "testIndex"), true);
>>     }
>>
>>
>>     /** Creates a file of the specified size with random data. */
>>     private void createRandomFile(Directory dir, String name, int size)
>>     throws IOException
>>     {
>>         OutputStream os = dir.createFile(name);
>>         for (int i=0; i<size; i++) {
>>             byte b = (byte) (Math.random() * 256);
>>             os.writeByte(b);
>>         }
>>         os.close();
>>     }
>>
>>     /** Creates a file of the specified size with sequential data. 
>> The  first
>>      *  byte is written as the start byte provided. All subsequent  
>> bytes are
>>      *  computed as start + offset where offset is the number of the  
>> byte.
>>      */
>>     private void createSequenceFile(Directory dir,
>>                                     String name,
>>                                     byte start,
>>                                     int size)
>>     throws IOException
>>     {
>>         OutputStream os = dir.createFile(name);
>>         for (int i=0; i < size; i++) {
>>             os.writeByte(start);
>>             start ++;
>>         }
>>         os.close();
>>     }
>>
>>
>>     private void assertSameStreams(String msg,
>>                                    InputStream expected,
>>                                    InputStream test)
>>     throws IOException
>>     {
>>         assertNotNull(msg + " null expected", expected);
>>         assertNotNull(msg + " null test", test);
>>         assertEquals(msg + " length", expected.length(),  test.length());
>>         assertEquals(msg + " position", expected.getFilePointer(),
>>                                         test.getFilePointer());
>>
>>         byte expectedBuffer[] = new byte[512];
>>         byte testBuffer[] = new byte[expectedBuffer.length];
>>
>>         long remainder = expected.length() - expected.getFilePointer();
>>         while(remainder > 0) {
>>             int readLen = (int) Math.min(remainder,  
>> expectedBuffer.length);
>>             expected.readBytes(expectedBuffer, 0, readLen);
>>             test.readBytes(testBuffer, 0, readLen);
>>             assertEqualArrays(msg + ", remainder " + remainder,  
>> expectedBuffer,
>>                 testBuffer, 0, readLen);
>>             remainder -= readLen;
>>         }
>>     }
>>
>>
>>     private void assertSameStreams(String msg,
>>                                    InputStream expected,
>>                                    InputStream actual,
>>                                    long seekTo)
>>     throws IOException
>>     {
>>         if(seekTo >= 0 && seekTo < expected.length())
>>         {
>>             expected.seek(seekTo);
>>             actual.seek(seekTo);
>>             assertSameStreams(msg + ", seek(mid)", expected, actual);
>>         }
>>     }
>>
>>
>>
>>     private void assertSameSeekBehavior(String msg,
>>                                         InputStream expected,
>>                                         InputStream actual)
>>     throws IOException
>>     {
>>         // seek to 0
>>         long point = 0;
>>         assertSameStreams(msg + ", seek(0)", expected, actual, point);
>>
>>         // seek to middle
>>         point = expected.length() / 2l;
>>         assertSameStreams(msg + ", seek(mid)", expected, actual,  point);
>>
>>         // seek to end - 2
>>         point = expected.length() - 2;
>>         assertSameStreams(msg + ", seek(end-2)", expected, actual,  
>> point);
>>
>>         // seek to end - 1
>>         point = expected.length() - 1;
>>         assertSameStreams(msg + ", seek(end-1)", expected, actual,  
>> point);
>>
>>         // seek to the end
>>         point = expected.length();
>>         assertSameStreams(msg + ", seek(end)", expected, actual,  point);
>>
>>         // seek past end
>>         point = expected.length() + 1;
>>         assertSameStreams(msg + ", seek(end+1)", expected, actual,  
>> point);
>>     }
>>
>>
>>     private void assertEqualArrays(String msg,
>>                                    byte[] expected,
>>                                    byte[] test,
>>                                    int start,
>>                                    int len)
>>     {
>>         assertNotNull(msg + " null expected", expected);
>>         assertNotNull(msg + " null test", test);
>>
>>         for (int i=start; i<len; i++) {
>>             assertEquals(msg + " " + i, expected[i], test[i]);
>>         }
>>     }
>>
>>
>>     // ===========================================================
>>     //  Tests of the basic CompoundFile functionality
>>     // ===========================================================
>>
>>
>>     /** This test creates a compound file based on a single file.
>>      *  Files of different sizes are tested: 0, 1, 10, 100 bytes.
>>      */
>>     public void testSingleFile() throws IOException {
>>         int data[] = new int[] { 0, 1, 10, 100 };
>>         for (int i=0; i<data.length; i++) {
>>             String name = "t" + data[i];
>>             createSequenceFile(dir, name, (byte) 0, data[i]);
>>
>>             InputStream expected = dir.openFile(name);
>>
>>             CompoundFileWriter csw = new CompoundFileWriter(dir, name  
>> + ".cfs");
>>             csw.addFile(name);
>>             csw.close();
>>
>>             CompoundFileReader csr = new CompoundFileReader(dir, name  
>> + ".cfs");
>>
>>             InputStream actual = csr.openFile(name);
>>             assertSameStreams(name, expected, actual);
>>             assertSameSeekBehavior(name, expected, actual);
>>
>>             expected.close();
>>             actual.close();
>>             csr.close();
>>         }
>>     }
>>
>>
>>     /** This test creates a compound file based on two files.
>>      *
>>      */
>>     public void testTwoFiles() throws IOException {
>>         createSequenceFile(dir, "d1", (byte) 0, 15);
>>         createSequenceFile(dir, "d2", (byte) 0, 114);
>>
>>         InputStream expected1 = dir.openFile("d1");
>>         InputStream expected2 = dir.openFile("d2");
>>
>>         CompoundFileWriter csw = new CompoundFileWriter(dir, "d.csf");
>>         csw.addFile("d1");
>>         csw.addFile("d2");
>>         csw.close();
>>
>>         CompoundFileReader csr = new CompoundFileReader(dir, "d.csf");
>>
>>         InputStream actual = csr.openFile("d1");
>>         assertSameStreams("d1", expected1, actual);
>>         assertSameSeekBehavior("d1", expected1, actual);
>>         expected1.close();
>>         actual.close();
>>
>>
>>         actual = csr.openFile("d2");
>>         assertSameStreams("d2", expected2, actual);
>>         assertSameSeekBehavior("d2", expected2, actual);
>>         expected2.close();
>>         actual.close();
>>         csr.close();
>>     }
>>
>>     /** This test creates a compound file based on a large number of  
>> files of
>>      *  various length. The file content is generated randomly. The  
>> sizes range
>>      *  from 0 to 1Mb. Some of the sizes are selected to test the  
>> buffering
>>      *  logic in the file reading code. For this the chunk variable 
>> is  set to
>>      *  the length of the buffer used internally by the compound file  
>> logic.
>>      */
>>     public void testRandomFiles() throws IOException {
>>         // Setup the test segment
>>         String segment = "test";
>>         int chunk = 1024; // internal buffer size used by the stream
>>         createRandomFile(dir, segment + ".zero", 0);
>>         createRandomFile(dir, segment + ".one", 1);
>>         createRandomFile(dir, segment + ".ten", 10);
>>         createRandomFile(dir, segment + ".hundred", 100);
>>         createRandomFile(dir, segment + ".big1", chunk);
>>         createRandomFile(dir, segment + ".big2", chunk - 1);
>>         createRandomFile(dir, segment + ".big3", chunk + 1);
>>         createRandomFile(dir, segment + ".big4", 3 * chunk);
>>         createRandomFile(dir, segment + ".big5", 3 * chunk - 1);
>>         createRandomFile(dir, segment + ".big6", 3 * chunk + 1);
>>         createRandomFile(dir, segment + ".big7", 1000 * chunk);
>>
>>         // Setup extraneous files
>>         createRandomFile(dir, "onetwothree", 100);
>>         createRandomFile(dir, segment + ".notIn", 50);
>>         createRandomFile(dir, segment + ".notIn2", 51);
>>
>>         final String data[] = new String[] {
>>             ".zero", ".one", ".ten", ".hundred", ".big1", ".big2",  
>> ".big3",
>>             ".big4", ".big5", ".big6", ".big7"
>>         };
>>
>>         InputStream[] check = new InputStream[data.length];
>>         for (int i=0; i<data.length; i++) {
>>            check[i] = dir.openFile(segment + data[i]);
>>         }
>>
>>         // Now test
>>         CompoundFileWriter csw = new CompoundFileWriter(dir,  
>> "test.cfs");
>>         for (int i=0; i<data.length; i++) {
>>             csw.addFile(segment + data[i]);
>>         }
>>         csw.close();
>>
>>         CompoundFileReader csr = new CompoundFileReader(dir,  
>> "test.cfs");
>>         for (int i=0; i<data.length; i++) {
>>             InputStream test = csr.openFile(segment + data[i]);
>>             assertSameStreams(data[i], check[i], test);
>>             assertSameSeekBehavior(data[i], check[i], test);
>>             test.close();
>>             check[i].close();
>>         }
>>         csr.close();
>>     }
>>
>>
>>     /** Setup a larger compound file with a number of components, 
>> each  of
>>      *  which is a sequential file (so that we can easily tell that 
>> we  are
>>      *  reading in the right byte). The method sets up 20 files, f0 to
>>      *  f19; each file is 2000 bytes long.
>>      */
>>     private Map setUp_2() throws IOException {
>>             Map streams = new HashMap(20);
>>         CompoundFileWriter cw = new CompoundFileWriter(dir, "f.comp");
>>         for (int i=0; i<20; i++) {
>>             createSequenceFile(dir, "f" + i, (byte) 0, 2000);
>>             cw.addFile("f" + i);
>>
>>             streams.put("f" + i, dir.openFile("f" + i));
>>         }
>>         cw.close();
>>
>>         return streams;
>>     }
>>
>>     private void closeUp(Map streams) throws IOException {
>>         Iterator it = streams.values().iterator();
>>         while (it.hasNext()) {
>>             InputStream stream = (InputStream)it.next();
>>             stream.close();
>>         }
>>     }
>>
>>     public void testReadAfterClose() throws IOException {
>>         demo_FSInputStreamBug((FSDirectory) dir, "test");
>>     }
>>
>>     private void demo_FSInputStreamBug(FSDirectory fsdir, String file)
>>     throws IOException
>>     {
>>         // Setup the test file - we need more than 1024 bytes
>>         OutputStream os = fsdir.createFile(file);
>>         for(int i=0; i<2000; i++) {
>>             os.writeByte((byte) i);
>>         }
>>         os.close();
>>
>>         InputStream in = fsdir.openFile(file);
>>
>>         // This read primes the buffer in InputStream
>>         byte b = in.readByte();
>>
>>         // Close the file
>>         in.close();
>>
>>         // ERROR: this call should fail, but succeeds because the  buffer
>>         // is still filled
>>         b = in.readByte();
>>
>>         // ERROR: this call should fail, but succeeds for some reason  
>> as well
>>         in.seek(1099);
>>
>>         try {
>>             // OK: this call correctly fails. We are now past the 
>> 1024  internal
>>             // buffer, so an actual IO is attempted, which fails
>>             b = in.readByte();
>>         } catch (IOException e) {
>>         }
>>     }
>>
>>
>>     static boolean isCSInputStream(InputStream is) {
>>         return is instanceof CompoundFileReader.CSInputStream;
>>     }
>>
>>     static boolean isCSInputStreamOpen(InputStream is) throws  
>> IOException {
>>         if (isCSInputStream(is)) {
>>             CompoundFileReader.CSInputStream cis =
>>             (CompoundFileReader.CSInputStream) is;
>>
>>             return _TestHelper.isFSInputStreamOpen(cis.base);
>>         } else {
>>             return false;
>>         }
>>     }
>>
>>
>>     public void testClonedStreamsClosing() throws IOException {
>>         Map streams = setUp_2();
>>         CompoundFileReader cr = new CompoundFileReader(dir, "f.comp");
>>
>>         // basic clone
>>         InputStream expected = (InputStream)streams.get("f11");
>>         assertTrue(_TestHelper.isFSInputStreamOpen(expected));
>>
>>         InputStream one = cr.openFile("f11");
>>         assertTrue(isCSInputStreamOpen(one));
>>
>>         InputStream two = (InputStream) one.clone();
>>         assertTrue(isCSInputStreamOpen(two));
>>
>>         assertSameStreams("basic clone one", expected, one);
>>         expected.seek(0);
>>         assertSameStreams("basic clone two", expected, two);
>>
>>         // Now close the first stream
>>         one.close();
>>         assertTrue("Only close when cr is closed",  
>> isCSInputStreamOpen(one));
>>
>>         // The following should really fail since we couldn't expect to
>>         // access a file once close has been called on it (regardless  of
>>         // buffering and/or clone magic)
>>         expected.seek(0);
>>         two.seek(0);
>>         assertSameStreams("basic clone two/2", expected, two);
>>
>>
>>         // Now close the compound reader
>>         cr.close();
>>         assertFalse("Now closed one", isCSInputStreamOpen(one));
>>         assertFalse("Now closed two", isCSInputStreamOpen(two));
>>
>>         // The following may also fail since the compound stream is  
>> closed
>>         expected.seek(0);
>>         two.seek(0);
>>         //assertSameStreams("basic clone two/3", expected, two);
>>
>>
>>         // Now close the second clone
>>         two.close();
>>         expected.seek(0);
>>         two.seek(0);
>>         //assertSameStreams("basic clone two/4", expected, two);
>>
>>         closeUp(streams);
>>
>>     }
>>
>>
>>     /** This test opens two files from a compound stream and verifies  
>> that
>>      *  their file positions are independent of each other.
>>      */
>>     public void testRandomAccess() throws IOException {
>>         Map streams = setUp_2();
>>         CompoundFileReader cr = new CompoundFileReader(dir, "f.comp");
>>
>>         // Open two files
>>         InputStream e1 = (InputStream)streams.get("f11");
>>         InputStream e2 = (InputStream)streams.get("f3");
>>
>>         InputStream a1 = cr.openFile("f11");
>>         InputStream a2 = cr.openFile("f3");
>>
>>         // Seek the first pair
>>         e1.seek(100);
>>         a1.seek(100);
>>         assertEquals(100, e1.getFilePointer());
>>         assertEquals(100, a1.getFilePointer());
>>         byte be1 = e1.readByte();
>>         byte ba1 = a1.readByte();
>>         assertEquals(be1, ba1);
>>
>>         // Now seek the second pair
>>         e2.seek(1027);
>>         a2.seek(1027);
>>         assertEquals(1027, e2.getFilePointer());
>>         assertEquals(1027, a2.getFilePointer());
>>         byte be2 = e2.readByte();
>>         byte ba2 = a2.readByte();
>>         assertEquals(be2, ba2);
>>
>>         // Now make sure the first one didn't move
>>         assertEquals(101, e1.getFilePointer());
>>         assertEquals(101, a1.getFilePointer());
>>         be1 = e1.readByte();
>>         ba1 = a1.readByte();
>>         assertEquals(be1, ba1);
>>
>>         // Now move the first one again, past the buffer length
>>         e1.seek(1910);
>>         a1.seek(1910);
>>         assertEquals(1910, e1.getFilePointer());
>>         assertEquals(1910, a1.getFilePointer());
>>         be1 = e1.readByte();
>>         ba1 = a1.readByte();
>>         assertEquals(be1, ba1);
>>
>>         // Now make sure the second set didn't move
>>         assertEquals(1028, e2.getFilePointer());
>>         assertEquals(1028, a2.getFilePointer());
>>         be2 = e2.readByte();
>>         ba2 = a2.readByte();
>>         assertEquals(be2, ba2);
>>
>>         // Move the second set back, again cross the buffer size
>>         e2.seek(17);
>>         a2.seek(17);
>>         assertEquals(17, e2.getFilePointer());
>>         assertEquals(17, a2.getFilePointer());
>>         be2 = e2.readByte();
>>         ba2 = a2.readByte();
>>         assertEquals(be2, ba2);
>>
>>         // Finally, make sure the first set didn't move
>>         // Now make sure the first one didn't move
>>         assertEquals(1911, e1.getFilePointer());
>>         assertEquals(1911, a1.getFilePointer());
>>         be1 = e1.readByte();
>>         ba1 = a1.readByte();
>>         assertEquals(be1, ba1);
>>
>>         a1.close();
>>         a2.close();
>>         cr.close();
>>
>>         closeUp(streams);
>>     }
>>
>>     /** This test opens two files from a compound stream and verifies  
>> that
>>      *  their file positions are independent of each other.
>>      */
>>     public void testRandomAccessClones() throws IOException {
>>         Map streams = setUp_2();
>>         CompoundFileReader cr = new CompoundFileReader(dir, "f.comp");
>>
>>         // Open two files
>>         InputStream e1 = cr.openFile("f11");
>>         InputStream e2 = cr.openFile("f3");
>>
>>         InputStream a1 = (InputStream) e1.clone();
>>         InputStream a2 = (InputStream) e2.clone();
>>
>>         // Seek the first pair
>>         e1.seek(100);
>>         a1.seek(100);
>>         assertEquals(100, e1.getFilePointer());
>>         assertEquals(100, a1.getFilePointer());
>>         byte be1 = e1.readByte();
>>         byte ba1 = a1.readByte();
>>         assertEquals(be1, ba1);
>>
>>         // Now seek the second pair
>>         e2.seek(1027);
>>         a2.seek(1027);
>>         assertEquals(1027, e2.getFilePointer());
>>         assertEquals(1027, a2.getFilePointer());
>>         byte be2 = e2.readByte();
>>         byte ba2 = a2.readByte();
>>         assertEquals(be2, ba2);
>>
>>         // Now make sure the first one didn't move
>>         assertEquals(101, e1.getFilePointer());
>>         assertEquals(101, a1.getFilePointer());
>>         be1 = e1.readByte();
>>         ba1 = a1.readByte();
>>         assertEquals(be1, ba1);
>>
>>         // Now move the first one again, past the buffer length
>>         e1.seek(1910);
>>         a1.seek(1910);
>>         assertEquals(1910, e1.getFilePointer());
>>         assertEquals(1910, a1.getFilePointer());
>>         be1 = e1.readByte();
>>         ba1 = a1.readByte();
>>         assertEquals(be1, ba1);
>>
>>         // Now make sure the second set didn't move
>>         assertEquals(1028, e2.getFilePointer());
>>         assertEquals(1028, a2.getFilePointer());
>>         be2 = e2.readByte();
>>         ba2 = a2.readByte();
>>         assertEquals(be2, ba2);
>>
>>         // Move the second set back, again cross the buffer size
>>         e2.seek(17);
>>         a2.seek(17);
>>         assertEquals(17, e2.getFilePointer());
>>         assertEquals(17, a2.getFilePointer());
>>         be2 = e2.readByte();
>>         ba2 = a2.readByte();
>>         assertEquals(be2, ba2);
>>
>>         // Finally, make sure the first set didn't move
>>         // Now make sure the first one didn't move
>>         assertEquals(1911, e1.getFilePointer());
>>         assertEquals(1911, a1.getFilePointer());
>>         be1 = e1.readByte();
>>         ba1 = a1.readByte();
>>         assertEquals(be1, ba1);
>>
>>         e1.close();
>>         e2.close();
>>         a1.close();
>>         a2.close();
>>         cr.close();
>>
>>         closeUp(streams);
>>     }
>>
>>
>>     public void testFileNotFound() throws IOException {
>>         Map streams = setUp_2();
>>         CompoundFileReader cr = new CompoundFileReader(dir, "f.comp");
>>
>>         // Open two files
>>         try {
>>             InputStream e1 = cr.openFile("bogus");
>>             fail("File not found");
>>
>>         } catch (IOException e) {
>>             /* success */
>>             //System.out.println("SUCCESS: File Not Found: " + e);
>>         }
>>
>>         cr.close();
>>
>>         closeUp(streams);
>>     }
>>
>>
>>     public void testReadPastEOF() throws IOException {
>>         Map streams = setUp_2();
>>         CompoundFileReader cr = new CompoundFileReader(dir, "f.comp");
>>         InputStream is = cr.openFile("f2");
>>         is.seek(is.length() - 10);
>>         byte b[] = new byte[100];
>>         is.readBytes(b, 0, 10);
>>
>>         try {
>>             byte test = is.readByte();
>>             fail("Single byte read past end of file");
>>         } catch (IOException e) {
>>             /* success */
>>             //System.out.println("SUCCESS: single byte read past end  
>> of file: " + e);
>>         }
>>
>>         is.seek(is.length() - 10);
>>         try {
>>             is.readBytes(b, 0, 50);
>>             fail("Block read past end of file");
>>         } catch (IOException e) {
>>             /* success */
>>             //System.out.println("SUCCESS: block read past end of  
>> file: " + e);
>>         }
>>
>>         is.close();
>>         cr.close();
>>
>>         closeUp(streams);
>>     }
>> }
>>
> 
> 
> 
> 
> 

-- 
*************************************************************
* Dr. Christoph Goller     Tel. : +49 89 203 45734          *
* Geschäftsführer          Email: goller@detego-software.de *
* Detego Software GmbH     Mail : Keuslinstr. 13,           *
*                                 80798 München, Germany    *
*************************************************************


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org

