lucene-java-user mailing list archives

From: Steven A Rowe
Subject: RE: RegexQuery Incomplete Results
Date: Fri, 08 May 2009 15:32:04 GMT
On 5/8/2009 at 9:13 AM, Ian Lee wrote:
> I'm surprised that it matches either - don't you need ".*in" where .*
> means match any character zero or more times?  See the javadoc for
> java.util.regex.Pattern, or for Jakarta Regexp if you are using that
> package.
> Unless you're an expert in regexps it is probably worth playing with
> them outside your lucene code to start with e.g. with simple
> String.matches(regexp) calls.  They can take some getting used to.
> And try to avoid anything with backslashes if you can!

The java.util.regex.Pattern implementation (the default RegexQuery implementation) actually
uses Matcher.lookingAt(), which is equivalent to prepending a "^" anchor to the beginning
of the pattern, so if Huntsman84 is using the default implementation, then I agree with Ian:
I'm surprised it matches either.  
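
To make the anchoring difference concrete, here is a minimal sketch using only java.util.regex on whole strings. It does not call Lucene or Jakarta Regexp; the class name AnchoringDemo and the helper names are made up for illustration, and find() stands in for any unanchored match.

```java
import java.util.regex.Pattern;

class AnchoringDemo {
    // lookingAt(): the match must start at index 0 (an implicit "^"),
    // but it does not have to consume the whole input.
    static boolean prefixMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).lookingAt();
    }

    // find(): the match may start anywhere in the input (no implicit anchor).
    static boolean anywhereMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String[] records = {"Knowing yourself", "Old clinic"};
        for (String record : records) {
            System.out.println(record
                + " | lookingAt(\".in.\")=" + prefixMatch(".in.", record)
                + " | find(\".in.\")=" + anywhereMatch(".in.", record)
                + " | lookingAt(\".*in\")=" + prefixMatch(".*in", record));
        }
    }
}
```

With lookingAt(), ".in." fails on both sample phrases because neither has "in" as its second and third characters, while ".*in" succeeds on both. With an unanchored find(), ".in." succeeds on both.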

However, the Jakarta Regexp implementation uses RE.match(), which does *not* require a beginning-of-string match.

Huntsman84, are you using the Jakarta Regexp implementation?  If so, then like you, I'm surprised
it's not matching both :).

It would be useful to see some real code, including how you index your records.


> On Fri, May 8, 2009 at 1:42 PM, Huntsman84 wrote:
> >
> > Hi,
> >
> > I am using RegexQuery for searching in a set of records which are
> > phrases of several words each. My aim is to find any phrase that
> > contains the given group of letters (e.g. "in"). For that case,
> > I am building the query with the regular expression ".in.", so it
> > should return all phrases that contain "in", but the search only
> > matches with the first word of the phrase.
> >
> > For example, if my records are "Knowing yourself" and "Old
> > clinic", the correct search would return 2 matches, but it only
> > matches with "Knowing yourself".
> >
> > How could I fix this?
