lucene-java-user mailing list archives

From Ankit Murarka <ankit.mura...@rancoretech.com>
Subject Re: How to Index each file and then each Line for Complete Phrase Match. Sample Data shown.
Date Tue, 06 Aug 2013 07:39:24 GMT
Hello.

I can't seem to figure out what to use. I started with AnalyzingSuggester 
and passed a StandardAnalyzer to its constructor.

But in order to get suggestions, it seems I will have to index the 
already indexed documents again. How do I index them again using 
AnalyzingSuggester?

I cannot use SpellChecker with this, as it seems to accept only an Analyzer 
and not an AnalyzingSuggester.

Is there a different way of using AnalyzingSuggester to get search 
suggestions?

Also, I verified in Luke that indexing the document with 
LineNumberReader is working properly: each line is being indexed 
separately.
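
As a rough illustration of the one-record-per-line approach described above, here is a minimal plain-Java sketch. It does not use Lucene at all; LinePerRecordSketch, LineRecord, and readRecords are hypothetical names introduced only for this example, and the sample text is adapted from the log excerpt quoted further down. In a Lucene setup, each record would presumably become its own Document rather than another field on a shared one.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: turn each non-blank line of a log into its own record.
final class LinePerRecordSketch {

    // One record per log line: the 1-based line number plus the raw text.
    static final class LineRecord {
        final int lineNumber;
        final String text;

        LineRecord(int lineNumber, String text) {
            this.lineNumber = lineNumber;
            this.text = text;
        }
    }

    // Reads the source line by line and returns one record per non-blank line.
    static List<LineRecord> readRecords(Reader source) {
        List<LineRecord> records = new ArrayList<>();
        try (LineNumberReader lnr = new LineNumberReader(source)) {
            String line;
            while ((line = lnr.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // blank lines carry no phrase, so skip them
                }
                // getLineNumber() is the count of lines read so far
                records.add(new LineRecord(lnr.getLineNumber(), line));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        String sample = "7/20/2013 7:45 package execution happening-1\n"
                + "\n"
                + "7/20/2013 7:45 This is not working perfectly\n";
        for (LineRecord r : readRecords(new StringReader(sample))) {
            System.out.println(r.lineNumber + ": " + r.text);
        }
    }
}
```

Keeping the line number with each record makes it possible to point back to the original position in the file once a phrase has been matched.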

Now how do I go about implementing this phrase "did you mean" search?
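
One way to think about a phrase-level "did you mean" is as a fuzzy match of the query against the beginning of each stored line. The sketch below is a plain-Java stand-in for that idea under that assumption. It is not the AnalyzingSuggester API, and it makes no claim about how the Lucene suggesters work internally. PhraseSuggestSketch, prefixDistance, suggest, and the maxEdits and limit parameters are hypothetical names chosen for illustration; the phrases come from the sample data quoted below.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: suggest whole stored lines for a possibly misspelled query.
final class PhraseSuggestSketch {

    // Classic Levenshtein edit distance, computed with two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Distance between the query and the equally long prefix of the phrase,
    // ignoring case.
    static int prefixDistance(String query, String phrase) {
        String q = query.toLowerCase();
        String p = phrase.toLowerCase();
        String prefix = p.substring(0, Math.min(q.length(), p.length()));
        return editDistance(q, prefix);
    }

    // Returns up to 'limit' phrases whose prefix is within 'maxEdits' edits
    // of the query, closest first.
    static List<String> suggest(String query, List<String> phrases,
                                int maxEdits, int limit) {
        List<String> hits = new ArrayList<>();
        for (String phrase : phrases) {
            if (prefixDistance(query, phrase) <= maxEdits) {
                hits.add(phrase);
            }
        }
        hits.sort(Comparator.comparingInt(p -> prefixDistance(query, p)));
        return hits.size() > limit ? new ArrayList<>(hits.subList(0, limit)) : hits;
    }

    public static void main(String[] args) {
        List<String> phrases = Arrays.asList(
                "package execution happening-1",
                "This is not working perfectly",
                "Encountering a constant error.",
                "This needs urgent attention",
                "Job is running fine.");
        System.out.println(suggest("package executing", phrases, 2, 5));
    }
}
```

With these sample phrases, the query "package executing" is two edits away from the prefix "package execution", so that whole line is the only one returned. A linear scan like this would not scale to large logs; it is only meant to show the shape of the problem.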

On 8/5/2013 5:08 PM, Michael McCandless wrote:
> Why not use one of the suggesters under lucene/suggest/*?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Mon, Aug 5, 2013 at 4:49 AM, Ankit Murarka
> <ankit.murarka@rancoretech.com>  wrote:
>    
>> Hello.
>>
>> 1. What I am trying to implement is a complete-suggestion "Did You Mean"
>> feature for a phrase. I have done it for a single word and now want to do
>> it for a sentence.
>>
>> 2. My understanding of how to index each line of a particular file as a
>> valid phrase is as follows:
>>
>> a. Instead of providing a directory name to index, give a file name.
>> b. Use the following code to read each line. This might be wrong, as I am
>> not fully aware of how to index each log line as a valid phrase rather than
>> as individual words.
>>
>>
>>     LineNumberReader lnr = new LineNumberReader(new FileReader(new File(
>>         "D:\\Lucene\\FileSearch\\Memo-1094.20130722-005200_10761334-10771333.txt")));
>>     String line = null;
>>     while (null != (line = lnr.readLine())) {
>>         doc.add(new TextField("contents", line, Field.Store.YES));
>>     }
>>
>> c. Use StandardAnalyzer and store the index in a separate location.
>>
>> Obviously, after this I ran into a problem. I gave this index to
>> SpellChecker so it could build its own index from it, and then invoked the
>> SpellChecker "suggest similar" method to get suggestions. I got only a
>> single word back as the suggestion.
>>
>> I know I have made a terrible mistake here, but I can't seem to figure out
>> what it is.
>>
>> I guess I need to index each whole line (as present in the file) as a
>> phrase for the spellchecker to suggest. I am wondering what a possible
>> approach would be. Any help will be highly appreciated.
>>
>>
>> On 8/3/2013 7:25 PM, Jack Krupansky wrote:
>>      
>>> Why not start with something simple? Like, index each log line as a
>>> tokenized text field and then do PhraseQuery against that text field? Is
>>> there something else you need beyond that?
>>>
>>> -- Jack Krupansky
>>>
>>> -----Original Message----- From: Ankit Murarka
>>> Sent: Saturday, August 03, 2013 3:22 AM
>>> To: java-user@lucene.apache.org
>>> Subject: How to Index each file and then each Line for Complete Phrase
>>> Match. Sample Data shown.
>>>
>>> Hello All,
>>>
>>> I have the data shown below in a log file. Until now I have been indexing
>>> the complete directory containing files with data like this.
>>>
>>> Now I need to index each line of the file to implement complete phrase
>>> search. I intend to store phrases in the index and then use the
>>> SpellChecker API to suggest similar phrases.
>>>
>>> 7/20/2013 7:45 *package execution happening-1
>>> * FATAL *check request has been sent for instance* Ip:Port
>>> *EXCEPTION*
>>> 7/20/2013 7:45 *This is not working perfectly
>>> * DEBUG *check request for instance being received is status=200
>>> * Ip:Port *EXCEPTION*
>>> 7/20/2013 7:45 *Encountering a constant error.
>>> * DEBUG *response is not proper.Expecting some more information on
>>> this detail.
>>> * Ip:Port *EXCEPTION*
>>> 7/20/2013 7:45 *This needs urgent attention
>>> * FATAL *I am still trying to ensure it is running perfectly.
>>> Encountering some issues.
>>> * Ip:Port *EXCEPTION*
>>>
>>> 7/20/2013 8:01 *Job is running fine.*
>>> INFO
>>> *************************************************************************\
>>>
>>> *Exception Occured in ClassFactory* * Function()
>>> java.nullPointerException: Value is null
>>> * *Should not be null*
>>>
>>> To implement complete phrase search, I reckon I need to index each line and
>>> store the phrase. *Phrases in the table above are highlighted in
>>> bold.*
>>>
>>> So, if I am able to index these lines and store the phrases in the index,
>>> then when a user searches for "package executing",
>>>
>>> Lucene would be able to offer "package execution happening-1" as
>>> a valid suggestion.
>>>
>>> These columns do not have names, and hence I cannot index based on
>>> column name. Also, as shown in the table above, the first column may contain
>>> a time/date or a phrase (as in the last row).
>>>
>>> Please suggest how this is possible using Lucene and its API. The Javadoc
>>> does not seem to cover this case.
>>>
>>>        
>>
>> --
>> Regards
>>
>> Ankit Murarka
>>
>> "What lies behind us and what lies before us are tiny matters compared with
>> what lies within us"
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>


-- 
Regards

Ankit Murarka

"What lies behind us and what lies before us are tiny matters compared with what lies within
us"

