lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "David Medinets" <medi...@mtolive.com>
Subject Bug in TestPhrasePrefixQuery in 2003.04.21 Build?
Date Tue, 22 Apr 2003 18:14:25 GMT
The TestPhrasePrefixQuery looks like it is searching for "blueberry pi*" and it even seems
to work at first glance. However, the test data is not extensive enough to show what is really
happening.

The searching technique implemented in TestPhrasePrefixQuery will find not only "blueberry
pie" but also "blueberry strudel" if both exist in the documents.

The reason is that the IndexReader.terms(Term termToMatch) method looks for the first term
equal or larger than termToMatch and then returns *all* terms from that point in the index
to the end.

One potential solution might be something like the following:
String pattern = "pi*";
TermEnum te = ir.terms(new Term("body", pattern));
while (te.term().text().matches(pattern)) {
    termsWithPrefix.add(te.term());
    if (te.next() == false)
        break;
    }
}

Of course, the code above only works with JDK1.4 because of the pattern matching.

Comments?

David Medinets
Quality = Resource Multiplication
http://www.codebits.com


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message