lucene-java-user mailing list archives

From Jamie
Subject Lucene 4.7 intermittently not applying query filter
Date Fri, 28 Mar 2014 11:00:20 GMT

We have a problem whereby Lucene 4.7 occasionally does not apply a
filter query during searching. The problem is intermittent: roughly one
in thirty searches returns what appears to be an unfiltered result set.
No exceptions or errors occur, just incorrect results. We are using
realtime search with multiple index readers. Our software had been
working fine with earlier versions of Lucene. I've double-checked the
query submitted to Lucene and it appears to be correct. The query looks
as follows:

2014-03-28 21:16:38 t.c.s.a.s.StandardSearch [DEBUG] start search 
TO 201403282115] +cat:email +(to:" 
john.douglas john douglas mycompany com au" 
to:" john.doe john doe 
mycompany com au" from:" 
john.douglas john douglas mycompany com au" 
from:" john.doe john doe 
mycompany com au" cc:" john.douglas john douglas mycompany com au" 
cc:" john.doe john doe 
mycompany com au"))',sort='<long: "mydate">!'}

The string " john.doe john doe 
mycompany com au" is the required expansion for the 
UAX29URLEmailTokenizer. By using quotes, I am aiming for an exact match. 
This works most of the time, but it should work every time.
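
As a rough illustration of the expansion and the quoted match described
above, here is a minimal sketch in plain Java with no Lucene dependency.
It is not the EmailFilter used in the analyzer, whose source is not shown
here. The names EmailExpansion, expand and containsPhrase are
hypothetical, and the input address is an assumed example. The sketch
assumes the expansion consists of the local part, its dot-separated
pieces, and the domain labels, and that a quoted phrase matches only when
its terms occur consecutively and in order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

final class EmailExpansion {

    // Hypothetical helper: expands an address into the local part, the
    // dot-separated pieces of the local part, and the domain labels.
    // This mirrors the shape of the string in the query above; it is an
    // assumption, not the actual EmailFilter logic.
    static List<String> expand(String address) {
        int at = address.indexOf('@');
        if (at <= 0 || at == address.length() - 1) {
            throw new IllegalArgumentException("not an email address: " + address);
        }
        String local = address.substring(0, at).toLowerCase(Locale.ROOT);
        String domain = address.substring(at + 1).toLowerCase(Locale.ROOT);

        List<String> tokens = new ArrayList<>();
        tokens.add(local);
        String[] localParts = local.split("\\.");
        if (localParts.length > 1) {
            // Only add the pieces when the local part actually contains a dot.
            Collections.addAll(tokens, localParts);
        }
        Collections.addAll(tokens, domain.split("\\."));
        return tokens;
    }

    // Illustrates exact phrase matching: the phrase terms must appear in
    // the field's token list consecutively and in the same order.
    static boolean containsPhrase(List<String> fieldTokens, List<String> phrase) {
        return !phrase.isEmpty()
                && Collections.indexOfSubList(fieldTokens, phrase) >= 0;
    }
}
```

Joining the result of expand with single spaces gives a string of the
same shape as the quoted terms in the query. Under these assumptions, a
field that contains the same tokens in a different order does not match
the phrase.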

I came across: and
applied it, but it makes no difference. I tried to downgrade Lucene, but
it won't read the 4.6 indexes. Can anyone suggest a way forward?

Thanks for your recommendations



public final class EmailAnalyzer extends StopwordAnalyzerBase {

   public static final int DEFAULT_MAX_TOKEN_LENGTH = 
   private int maxTokenLength = DEFAULT_MAX_TOKEN_LENGTH;
   public static final CharArraySet STOP_WORDS_SET = 

   public EmailAnalyzer(Version matchVersion, CharArraySet stopWords) {
     super(matchVersion, stopWords);
   }

   public EmailAnalyzer(Version matchVersion) {
     this(matchVersion, STOP_WORDS_SET);
   }

   public EmailAnalyzer(Version matchVersion, Reader stopwords) throws IOException {
     this(matchVersion, loadStopwordSet(stopwords, matchVersion));
   }

   public void setMaxTokenLength(int length) {
     maxTokenLength = length;
   }

   public int getMaxTokenLength() {
     return maxTokenLength;
   }

   protected TokenStreamComponents createComponents(final String fieldName, final Reader reader) {
     final UAX29URLEmailTokenizer src = new UAX29URLEmailTokenizer(matchVersion, reader);
     TokenStream tok = new EmailFilter(src);
     tok = new LowerCaseFilter(matchVersion, tok);
     return new TokenStreamComponents(src, tok) {
       protected void setReader(final Reader reader) throws IOException {
