lucene-java-user mailing list archives

From Jamie <ja...@mailarchiva.com>
Subject Re: Lucene 4.7 intermittently not applying query filter
Date Fri, 28 Mar 2014 14:54:10 GMT
Steve

Thanks for the reply. I believe UAX29URLEmailTokenizer tokenizes email
addresses as follows: john.doe@mycompany.com.au john.doe
mycompany.com.au john doe mycompany com au com.au. We have an overridden
query parser that replaces anyaddress: with to, from, cc, bcc, etc.
Inside the overridden query parser, we call getFieldQuery() to build the
clauses...

Query q = super.getFieldQuery(field, emailAddress, true);
if (slop != -1) {
    applySlop(q, slop);
}
clauses.add(new BooleanClause(q, BooleanClause.Occur.SHOULD));
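
To make the anyaddress: expansion concrete, here is a minimal plain-Java sketch of the idea. It does not use Lucene at all: it only builds the query string you would get if each address field became one optional clause. The class name, the expand() helper and the exact field list are assumptions for illustration, not our real parser code.

```java
import java.util.ArrayList;
import java.util.List;

public class AnyAddressExpansion {
    // Assumed field list; the real parser also covers further address fields ("etc.").
    static final String[] ADDRESS_FIELDS = {"to", "from", "cc", "bcc"};

    // Expands the pseudo-field "anyaddress" into one quoted clause per address field.
    // Any other field is returned as a single quoted clause, unchanged.
    static String expand(String field, String value) {
        if (!"anyaddress".equals(field)) {
            return field + ":\"" + value + "\"";
        }
        List<String> clauses = new ArrayList<>();
        for (String f : ADDRESS_FIELDS) {
            clauses.add(f + ":\"" + value + "\"");
        }
        // Space-separated clauses in a group; these are optional (SHOULD)
        // only when the parser's default operator is OR.
        return "(" + String.join(" ", clauses) + ")";
    }

    public static void main(String[] args) {
        System.out.println(expand("anyaddress", "john.doe@mycompany.com.au"));
    }
}
```

In the real parser each clause comes from getFieldQuery() rather than string concatenation, so the analyzer for each field still decides how the address is tokenized.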

The resulting query is output below. Sometimes, when Lucene executes it,
the filter is ignored.

I am still trying to isolate the issue, since the code runs inside a
larger system with other complexities.

Jamie

On 2014/03/28, 4:08 PM, Steve Rowe wrote:
> Hi Jamie,
>
> What does EmailFilter do?
>
> Why is the expanded form "required for the UAX29URLEmailTokenizer"?  Seems like an exact
> match would work on the email address alone, without the expanded components?
>
> Do you have an example of a query that reproducibly matches more documents than it should,
> and a document that matched but shouldn’t have?
>
> Steve  	


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

