lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: How to Index each file and then each Line for Complete Phrase Match. Sample Data shown.
Date Mon, 05 Aug 2013 11:38:26 GMT
Why not use one of the suggesters under lucene/suggest/*?
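
To illustrate the difference between suggesting a whole phrase and correcting a single word, here is a minimal plain-Java sketch. It is not Lucene's suggest API: the class PhraseSuggestSketch and its methods are hypothetical, and the matching rule (edit distance between the query and the start of each stored line) is an assumption chosen only to show the idea. The suggesters in the suggest module, such as AnalyzingSuggester or FuzzySuggester, are the real implementations to look at.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical illustration only; this is not Lucene's suggest API. */
public class PhraseSuggestSketch {

    /** Classic Levenshtein edit distance between two strings. */
    public static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            // Swap rows so that prev always holds the most recently computed row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Returns every stored phrase whose leading characters are within
     * maxEdits of the query. Whole phrases are returned, never single words.
     */
    public static List<String> suggest(List<String> phrases, String query, int maxEdits) {
        String q = query.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String phrase : phrases) {
            String p = phrase.toLowerCase(Locale.ROOT);
            // Compare the query against the start of the phrase only.
            String head = p.substring(0, Math.min(p.length(), q.length()));
            if (editDistance(q, head) <= maxEdits) {
                hits.add(phrase);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Phrases taken from the sample log lines quoted below.
        List<String> phrases = Arrays.asList(
            "package execution happening-1",
            "This is not working perfectly",
            "Job is running fine.");
        // Prints: [package execution happening-1]
        System.out.println(suggest(phrases, "package executing", 2));
    }
}
```

The point of the sketch is that each log line is kept as one unit, so a suggestion is always a complete line. A linear scan like this would not scale to large logs, which is one reason to prefer the prebuilt suggesters.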

Mike McCandless

http://blog.mikemccandless.com


On Mon, Aug 5, 2013 at 4:49 AM, Ankit Murarka
<ankit.murarka@rancoretech.com> wrote:
> Hello.
>
> 1. What I am trying to implement is a "Complete Suggestion Match / Did You
> Mean" feature for a phrase. I have done it for a single word; now I want to
> do it for a sentence.
>
> 2. My understanding of how to index each line of a particular file as a
> phrase is as follows:
>
> a. Instead of providing a directory name to index, give a file name.
> b. Use the following code to read each line. This might be wrong, as I am
> not sure how to index each log line as a complete phrase rather than as
> individual words.
>
>
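>     // Needs java.io.* and org.apache.lucene.document.{Document, Field, TextField}.
>     // Assumes an open IndexWriter named "writer" (not shown in the original).
>     File file = new File("D:\\Lucene\\FileSearch\\Memo-1094.20130722-005200_10761334-10771333.txt");
>     try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
>         String line;
>         while ((line = reader.readLine()) != null) {
>             // One Document per line, so each log line is a separate unit
>             // instead of every line being added to a single shared document.
>             Document doc = new Document();
>             doc.add(new TextField("contents", line, Field.Store.YES));
>             writer.addDocument(doc);
>         }
>     }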
>
> c. Use StandardAnalyzer and store the index in a separate location.
>
> Obviously, I then ran into a problem. I passed this index to SpellChecker
> so that it could build its own index from it, and then called its
> suggestSimilar method as before. I got only a single word back as the
> suggestion.
>
> I know I have made a serious mistake somewhere, but I cannot figure out
> what it is.
>
> I guess I need to index each whole line in the file as a phrase so that the
> spellchecker can suggest it. What would be a possible approach? Any help
> will be highly appreciated.
>
>
> On 8/3/2013 7:25 PM, Jack Krupansky wrote:
>>
>> Why not start with something simple? Like, index each log line as a
>> tokenized text field and then do PhraseQuery against that text field? Is
>> there something else you need beyond that?
>>
>> -- Jack Krupansky
>>
>> -----Original Message----- From: Ankit Murarka
>> Sent: Saturday, August 03, 2013 3:22 AM
>> To: java-user@lucene.apache.org
>> Subject: How to Index each file and then each Line for Complete Phrase
>> Match. Sample Data shown.
>>
>> Hello All,
>>
>> The log file contains the data shown below. Until now I have been indexing
>> the complete directory of files that contain data like this.
>>
>> Now I need to index each line of the file to implement complete phrase
>> search. I intend to store phrases in the index and then use the
>> SpellChecker API to suggest similar phrases.
>>
>> 7/20/2013 7:45 *package execution happening-1
>> * FATAL *check request has been sent for instance* Ip:Port
>> *EXCEPTION*
>> 7/20/2013 7:45 *This is not working perfectly
>> * DEBUG *check request for instance being received is status=200
>> * Ip:Port *EXCEPTION*
>> 7/20/2013 7:45 *Encountering a constant error.
>> * DEBUG *response is not proper.Expecting some more information on
>> this detail.
>> * Ip:Port *EXCEPTION*
>> 7/20/2013 7:45 *This needs urgent attention
>> * FATAL *I am still trying to ensure it is running perfectly.
>> Encountering some issues.
>> * Ip:Port *EXCEPTION*
>>
>> 7/20/2013 8:01 *Job is running fine.*
>> INFO
>> *************************************************************************\
>>
>> *Exception Occured in ClassFactory* * Function()
>> java.nullPointerException: Value is null
>> * *Should not be null*
>>
>> To implement complete phrase search, I reckon I need to index each line and
>> store the phrase. *Phrases in the table above are highlighted in bold.*
>>
>> If I can index these lines and store the phrases in the index, then when a
>> user searches for "package executing", Lucene should be able to offer
>> "package execution happening-1" as a valid suggestion.
>>
>> These columns do not have names, so I cannot index by column name. Also, as
>> shown in the table above, the first column may contain a time/date or a
>> phrase (as in the last row).
>>
>> Please suggest how this is possible using Lucene and its API. The Javadoc
>> does not seem to cover this case.
>>
>
>
> --
> Regards
>
> Ankit Murarka
>
> "What lies behind us and what lies before us are tiny matters compared with
> what lies within us"
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
