lucene-java-user mailing list archives

From Valentin Popov <valentin...@gmail.com>
Subject Re: multiterm numbers regexp search
Date Mon, 15 Dec 2014 08:58:53 GMT
No, this is for a compliance requirement for a banking system; have a look at PCI DSS.

@wmartinusa, please do not add traffic to the list if you have nothing to say on the subject.


 
> On 15 Dec 2014, at 11:54, wmartinusa <wmartinusa@gmail.com> wrote:
> 
> Sounds crooked. R u a criminal?
>  
>  
> Sent from my LG Optimus G™, an AT&T 4G LTE smartphone
> 
> ------ Original message ------
> From: Valentin Popov 
> Date: 12/15/2014 3:46 AM
> To: java-user@lucene.apache.org;
> Subject:multiterm numbers regexp search
> 
> I need to find MasterCard numbers with a regular expression.
> 
> I’m using Query query = new RegexpQuery(new Term("body", "5{1}<1-5>{1}<0-9>{14}"),
> RegExp.ALL) to search for numbers in the email body, and StandardAnalyzer is used to
> index the body. A number like 5106792294698422 is indexed as a single token, so all such
> MasterCard numbers appear in the search results. But a number like 5106 7922 9469 8422
> is indexed as 4 tokens (5106, 7922, 9469, 8422), and similarly for 5106-7922-9469-8422.
> 
> Any ideas how to find a sequence of digits written with spaces, dashes, etc.? Maybe a
> multiterm regexp search query?
> 
> 
> Regards,
> Valentin Popov
> 
> 
> 
> 
> 
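A minimal sketch of one possible direction for the question quoted above, under stated assumptions: it uses plain java.util.regex on the raw body text rather than a Lucene query, and the class and method names (CardNumberScan, findCandidates) are hypothetical. The pattern mirrors the quoted expression (a 5, then a digit from 1 to 5, then 14 more digits) but tolerates a single space or dash between digits and strips those separators from each match. The same separator-stripping idea could perhaps be applied before tokenization (for example in a char filter) so that the existing RegexpQuery sees one 16-digit token, but that variant is an assumption and is not shown or tested here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CardNumberScan {
    // 16 digits starting with 51-55; one optional space or dash may precede
    // each of the last 14 digits. The lookarounds reject longer digit runs.
    private static final Pattern CANDIDATE =
            Pattern.compile("(?<!\\d)5[1-5](?:[ -]?\\d){14}(?!\\d)");

    // Returns every candidate found in the text, with separators removed.
    static List<String> findCandidates(String body) {
        List<String> found = new ArrayList<>();
        Matcher m = CANDIDATE.matcher(body);
        while (m.find()) {
            found.add(m.group().replaceAll("[ -]", ""));
        }
        return found;
    }

    public static void main(String[] args) {
        String body = "Cards: 5106 7922 9469 8422, 5106-7922-9469-8422 "
                + "and 5106792294698422.";
        System.out.println(findCandidates(body));
    }
}
```

This only flags candidates by shape; it does not validate a checksum, and it would need adjusting if other separators (tabs, dots, line breaks) have to be recognized.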


Best regards,
Valentin Popov





