lucene-dev mailing list archives

From "Benson Margulies (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (LUCENE-3854) Non-tokenized fields become tokenized when a document is deleted and added back
Date Tue, 06 Mar 2012 14:52:57 GMT
Non-tokenized fields become tokenized when a document is deleted and added back
-------------------------------------------------------------------------------

                 Key: LUCENE-3854
                 URL: https://issues.apache.org/jira/browse/LUCENE-3854
             Project: Lucene - Java
          Issue Type: Bug
          Components: core/index
    Affects Versions: 4.0
            Reporter: Benson Margulies


https://github.com/bimargulies/lucene-4-update-case is a JUnit test case that appears to
demonstrate a problem with the current trunk. It creates a Document containing a Field of type
StringField.TYPE_STORED whose value contains a "-". Initially, a TermQuery finds the value,
because the field is not tokenized.

Then, the test case reads the Document back out through a reader. In the copy of the Document
that is returned, the Field has the tokenized bit turned on.

Next, the test case deletes the Document and adds the copy it read back. The 'tokenized' bit is
respected, so the field is now tokenized, and the query on the term containing the "-" no
longer matches.
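
The sequence above can be sketched with a small stand-alone model. This is not Lucene code
and does not use Lucene's API: ToyField, ToyIndex, and the split-on-non-alphanumerics
tokenizer are hypothetical stand-ins, and readStored() simply hard-codes the reported
behavior (the returned field has tokenized = true). It only illustrates why an exact-term
lookup stops matching once the read-back copy is re-added.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the reported sequence; none of these types exist in Lucene.
final class TokenizedRoundTrip {

    // Hypothetical stand-in for a stored field with a "tokenized" flag.
    static final class ToyField {
        final String name;
        final String value;
        final boolean tokenized;

        ToyField(String name, String value, boolean tokenized) {
            this.name = name;
            this.value = value;
            this.tokenized = tokenized;
        }
    }

    // Hypothetical single-document index: a set of "field:term" entries
    // plus the stored field name and value.
    static final class ToyIndex {
        private final Set<String> terms = new HashSet<>();
        private String storedName;
        private String storedValue;

        void add(ToyField f) {
            storedName = f.name;
            storedValue = f.value;
            if (f.tokenized) {
                // Assumed tokenizer: split on runs of non-alphanumeric characters.
                for (String t : f.value.split("[^\\p{Alnum}]+")) {
                    if (!t.isEmpty()) {
                        terms.add(f.name + ":" + t);
                    }
                }
            } else {
                // Non-tokenized: the whole value is indexed as a single term.
                terms.add(f.name + ":" + f.value);
            }
        }

        void deleteAll() {
            terms.clear();
            storedName = null;
            storedValue = null;
        }

        // Exact-term lookup, analogous in spirit to a term query.
        boolean termQuery(String name, String text) {
            return terms.contains(name + ":" + text);
        }

        // Models the behavior described in this report: the field that is
        // read back has the tokenized flag turned on.
        ToyField readStored() {
            return new ToyField(storedName, storedValue, true);
        }
    }

    // Returns { found before the round trip, found after the round trip }.
    static boolean[] run() {
        ToyIndex index = new ToyIndex();
        index.add(new ToyField("id", "abc-def", false));
        boolean before = index.termQuery("id", "abc-def");

        ToyField copy = index.readStored(); // tokenized flag is now set
        index.deleteAll();
        index.add(copy);                    // value is split into "abc" and "def"
        boolean after = index.termQuery("id", "abc-def");

        return new boolean[] { before, after };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("found before round trip: " + r[0]);
        System.out.println("found after round trip:  " + r[1]);
    }
}
```

In this model the only thing that changes between the two lookups is the flag carried by
the copy that was read back; the stored value itself is identical.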

So I think the defect is in the code that reconstructs the Document when it is read from the
index, which turns on the tokenized bit.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

