lucene-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 35029] New: - Inconsistent Read and write behavior
Date Mon, 23 May 2005 18:27:51 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=35029>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=35029

           Summary: Inconsistent Read and write behavior
           Product: Lucene
           Version: 1.4
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Index
        AssignedTo: lucene-dev@jakarta.apache.org
        ReportedBy: lucene@ziplip.com


When a term is written for an undefined field, the field is inserted into the 
index as field number -1, and an exception is thrown when the same index is 
read back.
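
To illustrate the failure mode, here is a minimal, self-contained sketch. It 
does not use the Lucene classes: the class name, the list of known fields and 
the helper methods are hypothetical stand-ins, and the variable-length integer 
encoding is written out by hand (7 data bits per byte, high bit set when more 
bytes follow). Under those assumptions, a field number of -1 is written 
without complaint, reads back as -1, and only fails at the index-based lookup.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.List;

class VIntRoundTripSketch {
    // Hypothetical stand-in for a field-name lookup: returns -1 for unknown names.
    static int fieldNumber(List<String> knownFields, String name) {
        return knownFields.indexOf(name);
    }

    // Variable-length int: 7 data bits per byte, high bit means "more bytes follow".
    static void writeVInt(ByteArrayOutputStream out, int i) {
        while ((i & ~0x7F) != 0) {
            out.write((i & 0x7F) | 0x80);
            i >>>= 7;
        }
        out.write(i);
    }

    // Inverse of writeVInt: accumulate 7 bits at a time until the high bit is clear.
    static int readVInt(ByteArrayInputStream in) {
        int b = in.read();
        int i = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = in.read();
            i |= (b & 0x7F) << shift;
        }
        return i;
    }

    public static void main(String[] args) {
        List<String> known = Arrays.asList("title", "body");

        // The write side does not validate the lookup result.
        int number = fieldNumber(known, "no-such-field");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, number);

        // The read side gets the same negative number back...
        int readBack = readVInt(new ByteArrayInputStream(out.toByteArray()));
        System.out.println("bytes written: " + out.size() + ", read back: " + readBack);

        // ...and fails only when it is used as an index.
        try {
            known.get(readBack);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("lookup failed: " + e);
        }
    }
}
```

The point of the sketch is that the write succeeds and the error surfaces 
later, far from the call that introduced the bad value.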

First, IndexWriter should not allow the operation to succeed if the field is 
not known.

Second, if the data is allowed to be written, we should at least be able to 
read it back without any problem.
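
The two points above can be sketched independently of Lucene as follows. This 
is only an illustration under stated assumptions: the class, the KNOWN list 
and both method names are hypothetical, and the upper-bound check on the read 
side is an extra precaution that is not part of the changes proposed in this 
report.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class UnknownFieldPolicySketch {
    // Hypothetical registry of known field names; the position is the field number.
    static final List<String> KNOWN = Arrays.asList("title", "body");

    // Write side: fail fast instead of persisting -1.
    static int checkedFieldNumber(String name) throws IOException {
        int number = KNOWN.indexOf(name);
        if (number < 0) {
            throw new IOException("Unknown field " + name);
        }
        return number;
    }

    // Read side: skip stored entries whose number does not resolve to a known field.
    static List<String> resolveNames(int[] storedNumbers) {
        List<String> names = new ArrayList<>();
        for (int number : storedNumbers) {
            if (number >= 0 && number < KNOWN.size()) {
                names.add(KNOWN.get(number));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        try {
            checkedFieldNumber("no-such-field");
        } catch (IOException e) {
            System.out.println("write rejected: " + e.getMessage());
        }
        // An index that already contains a bad entry can still be read.
        System.out.println(resolveNames(new int[] {0, -1, 1}));
    }
}
```

Rejecting the value on write keeps new indexes clean, while skipping on read 
lets indexes that already contain a bad entry remain usable.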

If one uses the default IndexReader, IndexWriter and SegmentMerger, this 
error may not occur.  However, it is a simple fix for the code not to accept 
bad data. Please review and commit the changes.  I am not sure whether there 
are any other classes that require a similar fix. Our usage uncovered the 
following files:

-- TermInfosWriter.java

private final void writeTerm(Term term)
    throws IOException {
    int iField = fieldInfos.fieldNumber(term.field);
    if (iField < 0) {
        throw new IOException("Unknown field " + term.field + "; term=" + term.text);
    }
    int start = stringDifference(lastTerm.text, term.text);
    int length = term.text.length() - start;

    output.writeVInt(start); // write shared prefix length
    output.writeVInt(length); // write delta length
    output.writeChars(term.text, start, length); // write delta chars

    output.writeVInt(iField); // write field num

    lastTerm = term;
}

 

-- FieldsReader.java

 final Document doc(int n) throws IOException {
    indexStream.seek(n * 8L);
    long position = indexStream.readLong();
    fieldsStream.seek(position);

    Document doc = new Document();
    int numFields = fieldsStream.readVInt();
    for (int i = 0; i < numFields; i++) {
      int fieldNumber = fieldsStream.readVInt();
      byte bits = fieldsStream.readByte();
      String stFieldValue = fieldsStream.readString();
      if (fieldNumber >= 0) {
        FieldInfo fi = fieldInfos.fieldInfo(fieldNumber);

        doc.add(new Field(fi.name, // name
                          stFieldValue, // read value
                          true, // stored
                          fi.isIndexed, // indexed
                          (bits & 1) != 0)); // tokenized
      }
    }

    return doc;
  }

-- FieldsWriter.java

  final void addDocument(Document doc) throws IOException {
    indexStream.writeLong(fieldsStream.getFilePointer());

    int storedCount = 0;
    Enumeration fields = doc.fields();
    while (fields.hasMoreElements()) {
      Field field = (Field)fields.nextElement();
      if (field.isStored())
        storedCount++;
    }
    fieldsStream.writeVInt(storedCount);

    fields = doc.fields();
    while (fields.hasMoreElements()) {
      Field field = (Field)fields.nextElement();
      if (field.isStored()) {
        int iField = fieldInfos.fieldNumber(field.name());
        if (iField == -1) {
          throw new IOException("Unknown field " + field.name());
        }
        fieldsStream.writeVInt(iField);

        byte bits = 0;
        if (field.isTokenized())
          bits |= 1;
        fieldsStream.writeByte(bits);

        fieldsStream.writeString(field.stringValue());
      }
    }
  }


