lucene-java-user mailing list archives

From Trejkaz <trej...@trypticon.org>
Subject Re: Range queries in successive positions
Date Fri, 02 Mar 2012 07:26:20 GMT
On Fri, Mar 2, 2012 at 6:22 PM, su ha <s_hangal@yahoo.com> wrote:
> Hi,
> I'm new to Lucene. I've indexed some documents with Lucene and need to sanitize them to
> ensure that they do not contain any social security numbers (3 digits, 2 digits, 4 digits).
>
> (How) Can I write a query (with the QueryParser) that searches for this pattern?
>
> e.g. I can do [000 TO 999] or [00 TO 99] or [0000 TO 9999], but this causes hits with
> any 2, 3 or 4 digit number.
> With something like "[000 TO 999] [00 TO 99] [0000 TO 9999]", I get no hits at all.
>
> Is this possible with the default QueryParser?
> Or is there some other programmatic way to do it?

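The programmatic way is to wrap each range query (TermRangeQuery in
current releases) in a SpanMultiTermQueryWrapper and then put a
SpanNearQuery around the lot, with slop 0 and inOrder set to true so
that the three terms have to be adjacent and in sequence.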
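
To make the matching semantics concrete, here is a minimal plain-Java
sketch with no Lucene dependency. It only mimics what three term ranges
in successive positions would match; it is not Lucene code, and the
names SuccessiveRangeSketch, TermRange, ssnRanges and findAdjacent are
made up for illustration. It assumes the analyzer has already split
"123-45-6789" into three tokens. It also shows a caveat: term ranges
compare terms as strings, so [00 TO 99] accepts "5" or "6789" as well,
and a length check (the sameLength flag here, again hypothetical) is
one way to tighten that.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SuccessiveRangeSketch {

    /** Inclusive range over terms, compared as strings (lexicographically). */
    static final class TermRange {
        final String lower;
        final String upper;

        TermRange(String lower, String upper) {
            this.lower = lower;
            this.upper = upper;
        }

        /** With sameLength set, the term must also be as long as the bounds. */
        boolean contains(String term, boolean sameLength) {
            if (sameLength && term.length() != lower.length()) {
                return false;
            }
            return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
        }
    }

    /** The three ranges from the question: 3 digits, 2 digits, 4 digits. */
    static TermRange[] ssnRanges() {
        return new TermRange[] {
            new TermRange("000", "999"),
            new TermRange("00", "99"),
            new TermRange("0000", "9999")
        };
    }

    /**
     * Returns every start position at which the ranges match adjacent
     * tokens in order, i.e. the "slop 0, in order" case.
     */
    static List<Integer> findAdjacent(List<String> tokens, TermRange[] ranges,
                                      boolean sameLength) {
        List<Integer> starts = new ArrayList<>();
        for (int i = 0; i + ranges.length <= tokens.size(); i++) {
            boolean allMatch = true;
            for (int j = 0; j < ranges.length; j++) {
                if (!ranges[j].contains(tokens.get(i + j), sameLength)) {
                    allMatch = false;
                    break;
                }
            }
            if (allMatch) {
                starts.add(i);
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        List<String> tokens =
            Arrays.asList("ssn", "123", "45", "6789", "call", "5", "5", "5");
        // String comparison alone also accepts "5 5 5" at position 5.
        System.out.println(findAdjacent(tokens, ssnRanges(), false)); // [1, 5]
        // Adding the length check leaves only the real candidate.
        System.out.println(findAdjacent(tokens, ssnRanges(), true));  // [1]
    }
}
```

So even once the span query finds hits, expect some false positives
and check the matched text before treating it as a real number.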

The default QueryParser probably can't do it. I believe someone was
enhancing it for wildcards but I'm not sure if range queries were
included in all that.

TX
