lucene-java-user mailing list archives

From "Gordin, Ira" <ira.gor...@sap.com>
Subject RE: Search in lines, so need to index lines?
Date Wed, 01 Aug 2018 11:01:33 GMT
Hi Tomoko,

I need to search in many files and we use Lucene for this purpose.

Thanks,
Ira

-----Original Message-----
From: Tomoko Uchida <tomoko.uchida.1111@gmail.com> 
Sent: Wednesday, August 1, 2018 1:49 PM
To: java-user@lucene.apache.org
Subject: Re: Search in lines, so need to index lines?

Hi Ira,

> I am trying to implement regex search in files

Why are you using Lucene for regular expression search?
Couldn't you implement this simply with the java.util.regex package?
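
For illustration, here is a minimal sketch of that approach, assuming the files are UTF-8 text and small enough to read into memory. The class and method names (LineRegexSearch, matchLines, searchFile) are made up for this example and are not part of any library. It tests the pattern against each whole line, so an expression such as "nice day" can match across a space:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class LineRegexSearch {

    /** Returns "lineNumber: text" for every line in which the pattern finds a match. */
    public static List<String> matchLines(List<String> lines, Pattern pattern) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            // find() looks for the pattern anywhere in the line, like an editor search.
            if (pattern.matcher(lines.get(i)).find()) {
                hits.add((i + 1) + ": " + lines.get(i));
            }
        }
        return hits;
    }

    /** Reads the whole file (assumed to be UTF-8) and searches it line by line. */
    public static List<String> searchFile(Path file, Pattern pattern) throws IOException {
        return matchLines(Files.readAllLines(file, StandardCharsets.UTF_8), pattern);
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: LineRegexSearch <regex> <file>...");
            return;
        }
        Pattern pattern = Pattern.compile(args[0]);
        for (int i = 1; i < args.length; i++) {
            Path file = Paths.get(args[i]);
            for (String hit : searchFile(file, pattern)) {
                System.out.println(file + ":" + hit);
            }
        }
    }
}
```

This rescans every file on each search, so it may be too slow for a very large set of files; whether that matters depends on your data.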

Regards,
Tomoko

On Wed, Aug 1, 2018 at 0:18, Gordin, Ira <ira.gordin@sap.com> wrote:

> Hi Uwe,
>
> I am trying to implement regex search in files, the same as in editors
> such as Notepad++.
>
> Thanks,
> Ira
>
> -----Original Message-----
> From: Uwe Schindler <uwe@thetaphi.de>
> Sent: Tuesday, July 31, 2018 6:12 PM
> To: java-user@lucene.apache.org
> Subject: RE: Search in lines, so need to index lines?
>
> Hi,
>
> You need to create your own tokenizer that splits tokens on \n or \r.
> Instead of using WhitespaceTokenizer, you can use:
>
> Tokenizer tok = CharTokenizer.fromSeparatorCharPredicate(ch -> ch == '\r' || ch == '\n');
>
> But I would first think of how to implement the whole thing correctly.
> Using a regular expression as "default" query is slow and does not look
> correct. What are you trying to do?
>
> Uwe
>
> -----
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
> > -----Original Message-----
> > From: Gordin, Ira <ira.gordin@sap.com>
> > Sent: Tuesday, July 31, 2018 4:08 PM
> > To: java-user@lucene.apache.org
> > Subject: Search in lines, so need to index lines?
> >
> > Hi all,
> >
> > I understand that Lucene finds query matches within tokens. For example, if I
> > use WhitespaceTokenizer and search with the regular expression /.*nice day.*/,
> > I will never find anything. Am I correct?
> > In my project I need to find matches inside lines rather than inside words, so I
> > am considering tokenizing by line. How should I implement this idea?
> > I would really appreciate any other ideas or implementations.
> >
> > Thanks in advance,
> > Ira
> >

-- 
Tomoko Uchida