lucene-java-user mailing list archives

From Ankit Murarka <ankit.mura...@rancoretech.com>
Subject Re: How to Index each file and then each Line for Complete Phrase Match. Sample Data shown.
Date Mon, 05 Aug 2013 08:49:44 GMT
Hello.

1. What I am trying to implement is a complete-suggestion "Did You Mean" 
feature for a phrase. I have done it for a single word, and now I want to 
do it for a sentence.

2. My understanding of how to index each line of a particular file as a 
valid phrase is as follows:

a. Instead of providing a directory name to index, give a file name.
b. Use the following code to read each line. This might be wrong, as I am 
not fully aware of how to index each log line as a valid phrase rather 
than as individual words.


      // doc is a Document created earlier (not shown).
      LineNumberReader lnr = new LineNumberReader(new FileReader(
          new File("D:\\Lucene\\FileSearch\\Memo-1094.20130722-005200_10761334-10771333.txt")));
      String line = null;
      while (null != (line = lnr.readLine())) {
          // Each line becomes another "contents" value on the same document.
          doc.add(new TextField("contents", line, Field.Store.YES));
      }

c. Using StandardAnalyzer and storing the index in a separate location.

Obviously, after this I ran into a problem. I gave this index to 
SpellChecker so that it could build its own index from it, and then 
invoked SpellChecker's suggestSimilar method to give me suggestions. I 
got only one word back as the suggestion.
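
A likely explanation for the single-word suggestion: a TextField analyzed 
with StandardAnalyzer is broken into individual word terms, so a 
spellchecker dictionary built from that field only ever sees words, never 
the whole line. The plain-Java sketch below illustrates the difference. It 
does not use Lucene; the class and method names (DictionaryTermsSketch, 
wordTerms, lineTerms) are hypothetical, and the regex split is only a rough 
stand-in for real analysis (for example, it does not remove stop words).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not part of Lucene.
class DictionaryTermsSketch {

    // Rough stand-in for word-level analysis: lower-case the line and
    // split on anything that is not a letter or digit.
    static Set<String> wordTerms(List<String> lines) {
        Set<String> terms = new TreeSet<>();
        for (String line : lines) {
            for (String token : line.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
                if (!token.isEmpty()) {
                    terms.add(token);
                }
            }
        }
        return terms;
    }

    // Alternative: keep each non-empty line as one single dictionary entry.
    static Set<String> lineTerms(List<String> lines) {
        Set<String> terms = new TreeSet<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                terms.add(trimmed);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "package execution happening-1",
                "Job is running fine.");
        System.out.println("word-level entries: " + wordTerms(lines));
        System.out.println("line-level entries: " + lineTerms(lines));
    }
}
```

With word-level entries, the best a suggester can return is a single word 
such as "execution"; only the line-level dictionary contains 
"package execution happening-1" as something that can be suggested.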

I know I have made a terrible mistake here, but I cannot figure out what 
it is.

I guess I need to index each whole line of the file as a phrase, so that 
the spellchecker can offer it as a suggestion. I am wondering what the 
possible approach might be. Any help will be highly appreciated.
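
To make the goal concrete, here is a minimal plain-Java sketch of 
whole-line suggestion: every line is kept as one entry, and entries are 
ranked by a normalized edit-distance similarity to the query. This is a 
stand-in for the idea only, not Lucene's SpellChecker (which maintains its 
own n-gram index and pluggable string-distance measure). The names 
PhraseSuggestSketch, editDistance, similarity and suggest are hypothetical. 
On the Lucene side, one option that may be worth checking is 
PlainTextDictionary, which is documented as reading one dictionary entry 
per line; whether it suits long log lines is an assumption to verify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not part of Lucene.
class PhraseSuggestSketch {

    // Classic dynamic-programming Levenshtein edit distance (two rows).
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Similarity in [0, 1]: 1 minus distance divided by the longer length.
    static double similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 1.0;
        }
        return 1.0 - (double) editDistance(a, b) / longest;
    }

    // Returns up to max whole phrases, most similar to the query first.
    static List<String> suggest(String query, List<String> phrases, int max) {
        final String q = query.toLowerCase(Locale.ROOT);
        List<String> ranked = new ArrayList<>(phrases);
        ranked.sort(Comparator.comparingDouble(
                (String p) -> -similarity(q, p.toLowerCase(Locale.ROOT))));
        return new ArrayList<>(ranked.subList(0, Math.min(max, ranked.size())));
    }

    public static void main(String[] args) {
        List<String> phrases = Arrays.asList(
                "package execution happening-1",
                "This is not working perfectly",
                "This needs urgent attention",
                "Job is running fine.");
        System.out.println(suggest("package executing", phrases, 1));
    }
}
```

Because each entry is the complete line, the suggestion for 
"package executing" is a complete phrase rather than a single word. 
Comparing the query against every stored line is fine for a sketch but 
would be slow on large logs, which is the problem an n-gram index is meant 
to address.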

On 8/3/2013 7:25 PM, Jack Krupansky wrote:
> Why not start with something simple? Like, index each log line as a 
> tokenized text field and then do PhraseQuery against that text field? 
> Is there something else you need beyond that?
>
> -- Jack Krupansky
>
> -----Original Message----- From: Ankit Murarka
> Sent: Saturday, August 03, 2013 3:22 AM
> To: java-user@lucene.apache.org
> Subject: How to Index each file and then each Line for Complete Phrase 
> Match. Sample Data shown.
>
> Hello All,
>
> I have this mentioned in the log file. Till now I am indexing the
> complete directory containing files which contain data like this:
>
> Now I need to index each line of the file to implement complete phrase
> search. I intend to store phrases in index and then use SpellChecker API
> to suggest me similar phrases.
>
> 7/20/2013 7:45 *package execution happening-1
> * FATAL *check request has been sent for instance* Ip:Port
> *EXCEPTION*
> 7/20/2013 7:45 *This is not working perfectly
> * DEBUG *check request for instance being received is status=200
> * Ip:Port *EXCEPTION*
> 7/20/2013 7:45 *Encountering a constant error.
> * DEBUG *response is not proper.Expecting some more information on
> this detail.
> * Ip:Port *EXCEPTION*
> 7/20/2013 7:45 *This needs urgent attention
> * FATAL *I am still trying to ensure it is running perfectly.
> Encountering some issues.
> * Ip:Port *EXCEPTION*
>
> 7/20/2013 8:01 *Job is running fine.*
> INFO
> *************************************************************************\ 
>
>
> *Exception Occured in ClassFactory* * Function()
> java.nullPointerException: Value is null
> * *Should not be null*
>
> To implement complete phrase search I reckon I need to index each line 
> and store the phrase .*Phrases in the above mentioned table are 
> highlighted in Bold.*
>
> So, if I am able to index these and store these phrases as indexes, so 
> when User tries to search for "package executing",
>
> the Lucene would be able to provide me "package execution happening-1" 
> as a valid suggestion..
>
> These columns does not have a name to them and hence I cannot index 
> based on column name. Also as shown in the table above, first column 
> may contain time/date or a phrase in itself (shown in last row).
>
> Please suggest. How is it possible using Lucene and its API. Javadoc 
> does not seem to guide me anywhere for this case.
>


-- 
Regards

Ankit Murarka

"What lies behind us and what lies before us are tiny matters compared with what lies within
us"



