lucene-java-user mailing list archives

From "Jack Krupansky"
Subject Re: How to Index each file and then each Line for Complete Phrase Match. Sample Data shown.
Date Sat, 03 Aug 2013 13:55:33 GMT
Why not start with something simple? Like, index each log line as a 
tokenized text field and then do PhraseQuery against that text field? Is 
there something else you need beyond that?
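
To make the suggestion concrete, here is a minimal plain-Java sketch of what a tokenized per-line field plus a phrase query amounts to. It deliberately does not use Lucene: the class `LinePhraseIndex`, its methods, and the simple tokenizer are hypothetical names invented for illustration, standing in for a Lucene tokenized text field and PhraseQuery. Each log line is treated as one document, term positions are recorded, and a phrase matches only when its terms occur at consecutive positions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Illustrative stand-in for "tokenized text field + phrase query"; not Lucene's API.
class LinePhraseIndex {
    // term -> (line number -> positions of that term within the line)
    private final Map<String, Map<Integer, List<Integer>>> postings = new HashMap<>();
    private final List<String> lines = new ArrayList<>();

    // Assumed tokenizer: lowercase, split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Index one log line as one document, remembering each term's position.
    public void addLine(String line) {
        int lineNo = lines.size();
        lines.add(line);
        List<String> tokens = tokenize(line);
        for (int pos = 0; pos < tokens.size(); pos++) {
            postings.computeIfAbsent(tokens.get(pos), k -> new HashMap<>())
                    .computeIfAbsent(lineNo, k -> new ArrayList<>())
                    .add(pos);
        }
    }

    // Return, in insertion order, every line containing the phrase terms back to back.
    public List<String> searchPhrase(String phrase) {
        List<String> terms = tokenize(phrase);
        List<String> hits = new ArrayList<>();
        if (terms.isEmpty()) return hits;
        Map<Integer, List<Integer>> first =
                postings.getOrDefault(terms.get(0), Collections.emptyMap());
        for (int lineNo : new TreeSet<>(first.keySet())) {
            for (int start : first.get(lineNo)) {
                if (matchesAt(terms, lineNo, start)) {
                    hits.add(lines.get(lineNo));
                    break; // one hit per line is enough
                }
            }
        }
        return hits;
    }

    // True if terms[1..] appear at positions start+1, start+2, ... in the given line.
    private boolean matchesAt(List<String> terms, int lineNo, int start) {
        for (int i = 1; i < terms.size(); i++) {
            List<Integer> positions = postings
                    .getOrDefault(terms.get(i), Collections.emptyMap())
                    .get(lineNo);
            if (positions == null || !positions.contains(start + i)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        LinePhraseIndex index = new LinePhraseIndex();
        index.addLine("7/20/2013 7:45 package execution happening-1 FATAL check request has been sent for instance");
        index.addLine("7/20/2013 7:45 This is not working perfectly DEBUG check request for instance being received is status=200");
        System.out.println(index.searchPhrase("package execution"));
    }
}
```

In this sketch a query such as "check request" matches both sample lines, while "request check" matches neither, because word order and adjacency both matter. A real Lucene setup would add an analyzer choice and scoring on top of the same idea.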

-- Jack Krupansky

-----Original Message----- 
From: Ankit Murarka
Sent: Saturday, August 03, 2013 3:22 AM
Subject: How to Index each file and then each Line for Complete Phrase 
Match. Sample Data shown.

Hello All,

My log files contain data like the sample shown below. Until now I have been
indexing the complete directory containing these files.

Now I need to index each line of each file to implement complete phrase
search. I intend to store the phrases in the index and then use the
SpellChecker API to suggest similar phrases.

7/20/2013 7:45 *package execution happening-1* FATAL *check request has been sent for instance* Ip:Port
7/20/2013 7:45 *This is not working perfectly* DEBUG *check request for instance being received is status=200
7/20/2013 7:45 *Encountering a constant error.* DEBUG *response is not proper.Expecting some more information on this detail.
7/20/2013 7:45 *This needs urgent attention* FATAL *I am still trying to ensure it is running perfectly. Encountering some issues.

7/20/2013 8:01 *Job is running fine.*

*Exception Occured in ClassFactory* * Function() java.nullPointerException: Value is null * *Should not be null*

To implement complete phrase search, I reckon I need to index each line and
store the phrase. The phrases in question are the highlighted ones in the
table above.

If I can index these lines and store the phrases in the index, then when a
user searches for "package executing", Lucene should be able to offer
"package execution happening-1" as a valid suggestion.
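
The kind of suggestion described above can be sketched in plain Java as follows. This is not Lucene's SpellChecker: the class `PhraseSuggester` and the scoring rule are assumptions made up for illustration. Each stored phrase is scored by averaging, over the words of the query, the best Levenshtein-based similarity to any word in the phrase, and the highest-scoring phrases are returned.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Illustrative phrase suggester; a hypothetical stand-in, not Lucene's SpellChecker.
class PhraseSuggester {
    private final List<String> phrases = new ArrayList<>();

    public void addPhrase(String phrase) {
        phrases.add(phrase);
    }

    // Classic dynamic-programming Levenshtein edit distance using two rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Similarity in [0, 1]; 1.0 means the two words are identical.
    static double wordSimilarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        return longest == 0 ? 1.0 : 1.0 - (double) editDistance(a, b) / longest;
    }

    // Assumed tokenizer: lowercase, split on anything that is not a letter or digit.
    static List<String> words(String text) {
        List<String> out = new ArrayList<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!w.isEmpty()) out.add(w);
        }
        return out;
    }

    // Average, over the query words, of the best similarity to any word of the phrase.
    static double score(String query, String phrase) {
        List<String> q = words(query);
        List<String> p = words(phrase);
        if (q.isEmpty() || p.isEmpty()) return 0.0;
        double total = 0.0;
        for (String qw : q) {
            double best = 0.0;
            for (String pw : p) best = Math.max(best, wordSimilarity(qw, pw));
            total += best;
        }
        return total / q.size();
    }

    // Return up to max stored phrases, best score first.
    public List<String> suggest(String query, int max) {
        List<String> ranked = new ArrayList<>(phrases);
        ranked.sort(Comparator.comparingDouble((String p) -> score(query, p)).reversed());
        return new ArrayList<>(ranked.subList(0, Math.min(max, ranked.size())));
    }

    public static void main(String[] args) {
        PhraseSuggester suggester = new PhraseSuggester();
        suggester.addPhrase("package execution happening-1");
        suggester.addPhrase("This is not working perfectly");
        suggester.addPhrase("Job is running fine.");
        System.out.println(suggester.suggest("package executing", 1));
    }
}
```

With the sample phrases loaded, the query "package executing" ranks "package execution happening-1" first, since "package" matches exactly and "executing" is two edits away from "execution". Scanning every stored phrase is only reasonable for small phrase sets; an index-backed approach would be needed at scale.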

These columns do not have names, so I cannot index by column name. Also, as
shown in the table above, the first column may contain a date/time or a
phrase (as in the last row).

Please suggest how this can be done using Lucene and its API. The Javadoc
does not seem to offer any guidance for this case.


Ankit Murarka

"What lies behind us and what lies before us are tiny matters compared with 
what lies within us"
