lucene-java-user mailing list archives

From Ian Lea <ian....@gmail.com>
Subject Re: Range queries in successive positions
Date Fri, 02 Mar 2012 09:21:28 GMT
Or take a look at the search.regex.RegexQuery contrib module.  You won't
be able to use that via QueryParser either.

It might make more sense to do the sanitizing before indexing rather than after.
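
As a rough illustration of sanitizing before indexing, here is a minimal
sketch in plain Java. It assumes the numbers appear as 3 digits, 2 digits
and 4 digits separated by a hyphen or a single space; the class name
SsnSanitizer, its methods and the XXX-XX-XXXX placeholder are hypothetical
and not part of Lucene. Adjust the pattern to whatever formats actually
occur in your documents.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not a Lucene class): scrub SSN-like patterns
// from raw text before it is handed to the indexer.
final class SsnSanitizer {
    // Assumed format: 3 digits, 2 digits, 4 digits, each group separated
    // by '-' or a single space. The lookbehind/lookahead stop the pattern
    // from matching inside a longer run of digits.
    private static final Pattern SSN =
        Pattern.compile("(?<!\\d)\\d{3}[- ]\\d{2}[- ]\\d{4}(?!\\d)");

    private SsnSanitizer() {}

    // True if the text contains at least one SSN-like pattern.
    static boolean containsSsn(String text) {
        return SSN.matcher(text).find();
    }

    // Replaces every SSN-like pattern with a fixed placeholder.
    static String redact(String text) {
        return SSN.matcher(text).replaceAll("XXX-XX-XXXX");
    }
}
```

You would call SsnSanitizer.redact(text) on each document's text before
adding it to the index, or use containsSsn(text) to flag documents for
review instead of rewriting them.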


--
Ian.


On Fri, Mar 2, 2012 at 7:26 AM, Trejkaz <trejkaz@trypticon.org> wrote:
> On Fri, Mar 2, 2012 at 6:22 PM, su ha <s_hangal@yahoo.com> wrote:
>> Hi,
>> I'm new to Lucene. I've indexed some documents with Lucene and need to sanitize them
>> to ensure that they do not contain any social security numbers (3 digits, 2 digits,
>> 4 digits).
>>
>> (How) Can I write a query (with the QueryParser) that searches for this pattern?
>>
>> e.g. I can do [000 TO 999] or [00 TO 99] or [0000 TO 9999], but this causes hits
>> with any 2, 3 or 4 digit number.
>> With something like "[000 TO 999] [00 TO 99] [0000 TO 9999]", I get no hits at all.
>>
>> Is this possible with the default QueryParser?
>> Or is there some other programmatic way to do it?
>
> The programmatic way is to use SpanMultiTermQueryWrapper around each
> RangeQuery and then SpanNearQuery around the lot.
>
> The default QueryParser probably can't do it. I believe someone was
> enhancing it for wildcards but I'm not sure if range queries were
> included in all that.
>
> TX
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
