lucene-dev mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Test code for regex queries
Date Wed, 23 Nov 2005 23:06:24 GMT

On 23 Nov 2005, at 15:42, Paul Elschot wrote:
> I refactored it to have a few more tests, and all seems to work well.
> It also includes the tests from TestSpanRegexQuery.java .
>
> Two questions:
>
> Can I assume the APL2 on Test{,Span}RegexQuery.java?
> If so, I'll post the refactored version with it.

Yes, those files should have had APL2 in them; my apologies for
missing it.

> To parse a regex query term, the surround parser will have to
> be extended a bit so it recognizes a reasonable subset of the
> java regular expressions.
> Any preferences for the syntax for a regex term in the
> surround parser?

I must admit that I haven't used the surround parser.  For my custom
parser (a legacy syntax that no one here would want), I treat any term
that contains *, ?, or [...] as a regex term.
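
A minimal sketch of that kind of check, for illustration only. The class
and method names (RegexTermDetector, looksLikeRegex) are hypothetical and
are not part of Lucene or of the custom parser mentioned above; the only
assumption is that a term counts as a regex term when it contains *, ?,
or a bracketed [...] group.

```java
public final class RegexTermDetector {
    private RegexTermDetector() {}

    /**
     * Hypothetical helper: returns true if the term contains '*', '?',
     * or an opening '[' that is followed somewhere by a closing ']'.
     */
    public static boolean looksLikeRegex(String term) {
        if (term.indexOf('*') >= 0 || term.indexOf('?') >= 0) {
            return true;
        }
        int open = term.indexOf('[');
        // Only treat brackets as regex syntax when the class is closed.
        return open >= 0 && term.indexOf(']', open + 1) > open;
    }
}
```

A term such as "luc[ae]ne" or "luc*" would be routed to a regex query,
while a plain term such as "lucene" would stay an ordinary term query.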

There are still some TODOs with the (Span)RegexQuery, such as being
smarter about the prefix length; right now it is not smart enough.  I've
spent some time looking for a parser that could turn a regular
expression into an AST, which could then be used to find the last
static character from which to start term enumeration.  An AST would
also come in very handy for rotating a regular expression string to
maximize the static prefix when indexing with an analyzer that rotates
terms.  If anyone has suggestions or pointers on how this could be
accomplished, they'd be most appreciated!
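
Short of a full AST, a conservative character scan can already recover
a usable static prefix for simple patterns. The sketch below is one
possible approach under stated assumptions, not what (Span)RegexQuery
does: it assumes java.util.regex-style syntax, the names
(RegexStaticPrefix, staticPrefix) are hypothetical, and it gives up
(returns an empty prefix) whenever the pattern contains an alternation.

```java
public final class RegexStaticPrefix {
    // Characters treated as regex syntax; the scan stops at the first one.
    private static final String META = ".[](){}*+?^$|\\";
    // Quantifiers make the preceding literal optional or repeated.
    private static final String QUANTIFIERS = "*+?{";

    private RegexStaticPrefix() {}

    /** Hypothetical helper: longest literal prefix every match must start with. */
    public static String staticPrefix(String regex) {
        // An alternation may offer a branch with a different prefix,
        // so be conservative and report no static prefix at all.
        if (regex.indexOf('|') >= 0) {
            return "";
        }
        int end = 0;
        while (end < regex.length() && META.indexOf(regex.charAt(end)) < 0) {
            end++;
        }
        // If the scan stopped at a quantifier, the last literal before it
        // is not guaranteed to appear exactly once, so drop it.
        if (end > 0 && end < regex.length()
                && QUANTIFIERS.indexOf(regex.charAt(end)) >= 0) {
            end = regex.offsetByCodePoints(end, -1);
        }
        return regex.substring(0, end);
    }
}
```

For "luc.*" this yields "luc", and for "luce?ne" it yields "luc", so
term enumeration could start at that prefix. Patterns that begin with
syntax, such as ".*ene", yield an empty prefix, which is the case where
rotating the expression would help.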

	Erik



