lucene-java-user mailing list archives

From Chris Hostetter
Subject Re: Can this regex be done?
Date Thu, 03 Sep 2009 21:55:44 GMT

: Because some of the queries that I have to convert (without modifying
: them, unfortunately) have a half literally a page of statements
: expressed like that that, if expanded, would equal a several page long
: lucene query.

FWIW: the RegexQuery (in contrib) applies the regex input to every term in 
the field (in some cases it can skip ahead if it can find a 
constant prefix) to see if it matches, and then essentially builds a giant 
"OR" query of all the terms that match.
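
A rough model of that term walk, in plain Java rather than the Lucene API: the sorted set stands in for the field's term dictionary, and the class name, the method name and the choice of a full-string match are all assumptions made for illustration, not the contrib code itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.regex.Pattern;

// Hypothetical stand-in for the term walk described above; not Lucene code.
class RegexTermScanSketch {

    // Returns the terms that would end up as clauses of the big "OR" query.
    static List<String> matchingTerms(SortedSet<String> terms,
                                      String constantPrefix,
                                      String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> matches = new ArrayList<>();
        // Skip ahead: start the walk at the first term >= the constant prefix.
        for (String term : terms.tailSet(constantPrefix)) {
            // Terms are sorted, so once the prefix stops matching we are done.
            if (!term.startsWith(constantPrefix)) {
                break;
            }
            // Assumption: the whole term must match the regex.
            if (pattern.matcher(term).matches()) {
                matches.add(term);
            }
        }
        return matches;
    }
}
```

With an empty prefix the loop visits every term in the set, which is the expensive case.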

so if your concern is that you don't want to translate the regex 
expressions yourself, the RegexQuery can handle it for you ... but if 
your concern is that the equivalent query will be really big, well... it's 
going to be really big anyway, but if you convert it you might be able to 
make some optimizations during your conversion process (i.e., if you know 
you only need to worry about groupings and alternates, and that you'll 
never have any wildcards (like your one example) then you can build up all 
of the possible clauses without doing a full TermEnum walk ... even for 
100s of clauses, that's probably faster if your field has thousands of 
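
A minimal sketch of that kind of conversion, assuming the expressions really do contain nothing but literal characters, parenthesized groups and "|" alternates. The class and method names are made up for illustration, and anything that looks like a wildcard or quantifier is rejected rather than handled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: expands groupings and alternates into literal terms
// without looking at the index at all.
class AlternationExpander {
    private final String src;
    private int pos = 0;

    private AlternationExpander(String src) {
        this.src = src;
    }

    static List<String> expand(String pattern) {
        AlternationExpander parser = new AlternationExpander(pattern);
        List<String> result = parser.parseAlternation();
        if (parser.pos != pattern.length()) {
            throw new IllegalArgumentException("unbalanced ')' at " + parser.pos);
        }
        return result;
    }

    // alternation := sequence ('|' sequence)*
    private List<String> parseAlternation() {
        List<String> result = new ArrayList<>(parseSequence());
        while (pos < src.length() && src.charAt(pos) == '|') {
            pos++;
            result.addAll(parseSequence());
        }
        return result;
    }

    // sequence := (group | literal)*
    private List<String> parseSequence() {
        List<String> acc = new ArrayList<>();
        acc.add("");
        while (pos < src.length()) {
            char c = src.charAt(pos);
            if (c == '|' || c == ')') {
                break;
            }
            List<String> next;
            if (c == '(') {
                pos++;
                next = parseAlternation();
                if (pos >= src.length() || src.charAt(pos) != ')') {
                    throw new IllegalArgumentException("missing ')'");
                }
                pos++;
            } else if ("*+?.[]{}\\".indexOf(c) >= 0) {
                throw new IllegalArgumentException("unsupported character: " + c);
            } else {
                pos++;
                next = Collections.singletonList(String.valueOf(c));
            }
            // Cross product: every string so far followed by every new piece.
            List<String> combined = new ArrayList<>();
            for (String head : acc) {
                for (String tail : next) {
                    combined.add(head + tail);
                }
            }
            acc = combined;
        }
        return acc;
    }
}
```

For example, AlternationExpander.expand("ab(c|d)e") yields "abce" and "abde". Each resulting string could then become one optional TermQuery clause in a BooleanQuery, though that wiring is left out here.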

