lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: AnalyZer HELP Please
Date Tue, 17 Aug 2004 13:32:32 GMT
Further on this, Karthik, is that you need to really understand what 
you indexed.  For example... take a document that has "New Year" in it, 
and follow it through your indexing process.  See what your analyzer at 
indexing time actually indexed.  And if "new year" are side-by-side 
tokens emitted from that process, then querying for "New Year" through 
QueryParser should find a match.

You can easily put together a 10-line JUnit test case using 
RAMDirectory and your favorite Analyzer to see how this works.  I 
highly recommend you do this in order to isolate the situation even 
further.

	Erik


On Aug 17, 2004, at 9:25 AM, Patrick Burleson wrote:

> Karthik,
>
> What you would want to do with the split tokens ( "New" and "Year" )
> is then create a PhraseQuery containing a Term object for each token.
> This should do what you want. As Erik said, QueryParser would have
> done this internally, only if you actually sent in the quotes...not
> just "New Year", but "\"New Year\"".
>
> Patrick
>
> On Tue, 17 Aug 2004 18:53:01 +0530, Karthik N S
> <karthik@controlnet.co.in> wrote:
>> Hi
>>
>> Erik
>>
>>   Apologies.......
>>
>>   What I ment to Say was,  a word such as "New Year"  (Quotes means  
>> "\"" )
>>   on  QueryParser.parse(word, "contents", analyzer) should return me 
>> hits
>> for the full word,
>>   but it did not.
>>
>>  So when I  did a quick run on Analyzer process and
>>  found that it was splitting the Word
>>
>>   "New Year"  =  [New]  [Year]
>>
>>  Am I doing some thing wrong in here....
>>
>> Thx in advance.....
>> Karthik
>>
>>
>>
>> -----Original Message-----
>> From: Erik Hatcher [mailto:erik@ehatchersolutions.com]
>> Sent: Tuesday, August 17, 2004 6:18 PM
>> To: Lucene Users List
>> Subject: Re: AnalyZer HELP Please
>>
>> This is what analyzers do.  I don't know of any analyzer that deals
>> with quotes in the way you're requesting, by keeping the contents
>> together as a complete token.  You'll have to write your own variant
>> that does this.
>>
>> QueryParser, however, uses quotes to denote a phrase query, and will
>> query for the words together.  Perhaps this is sufficient for your
>> needs?
>>
>>         Erik
>>
>> On Aug 17, 2004, at 8:40 AM, Karthik N S wrote:
>>
>>>
>>> Hey Guys.....
>>>
>>> Apologies......
>>>
>>>
>>> Some small Help needed
>>>
>>> When I Run the Analyzer's for the word  "New Year" (with Quotes) on
>>> Lucene1-4 final.jar on win 2k O/s
>>> Why is the SimpleAnalyzer splitting it into 2 words ???
>>>
>>> or
>>>
>>>
>>> am i missing something in here......
>>>
>>>
>>>
>>> Analzying " New  Year "
>>> org.apache.lucene.analysis.WhitespaceAnalyzer:
>>>
>>> ["] [New] [+] [Year] ["]
>>>
>>> org.apache.lucene.analysis.SimpleAnalyzer:
>>>
>>> [new] [year]
>>>
>>> org.apache.lucene.analysis.StopAnalyzer:
>>>
>>> [new] [year]
>>>
>>> org.apache.lucene.analysis.standard.StandardAnalyzer:
>>>
>>> [new] [year]
>>>
>>> com.controlnet.indexing.analyzers.GrammerAnalyzer:
>>>
>>> [year]
>>>
>>>
>>>
>>>
>>>
>>>       WITH WARM REGARDS
>>>       HAVE A NICE DAY
>>>       [ N.S.KARTHIK]
>>>
>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>>> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message