lucene-java-user mailing list archives

From tare...@controldocs.com
Subject Re: Anomaly in defining search phrase
Date Wed, 22 Jun 2005 15:35:03 GMT
>
> On Jun 21, 2005, at 2:59 PM, tareque@controldocs.com wrote:
>
>> I found a discrepancy in results for an identical search
>> ("processing") done with Lucene and MySQL. It seems Lucene is not
>> returning results where the search word is adjacent to a "-" (hyphen)
>> or "." (period). For example, it didn't return results for text that
>> contained "processing-7-bit" and "straighforwerd.processing", but MySQL
>> did. Is this a settings issue, or is it something unavoidable?
>>
>> Thanks
>> Tareque
>> ControlDOCS
>>
>> PS: In contrast, I previously found Lucene returning some results
>> that MySQL didn't, for example when the search phrase is adjacent to
>> "'" (apostrophe) or "_" (underscore). I am not complaining about this;
>> rather, I found it preferable for my purpose.
>
> These all boil down to your choice of analyzer.  What analyzer are
> you using?
>
> As you can see below, "processing-7-bit" is tokenized quite
> differently depending on the analyzer:
>
> $ ant AnalyzerDemo
> Buildfile: build.xml
>
>      [input] String to analyze: [This string will be analyzed.]
> processing-7-bit
>       [echo] Running lia.analysis.AnalyzerDemo...
>       [java] Analyzing "processing-7-bit"
>       [java]   WhitespaceAnalyzer:
>       [java]     [processing-7-bit]
>
>       [java]   SimpleAnalyzer:
>       [java]     [processing] [bit]
>
>       [java]   StopAnalyzer:
>       [java]     [processing] [bit]
>
>       [java]   StandardAnalyzer:
>       [java]     [processing-7-bit]
>
> If you're using the StandardAnalyzer, you are not indexing the word
> "processing" at all.  Grab the source code from Lucene in Action at
> lucenebook.com and type "ant AnalyzerDemo" to try out the basic
> analyzers.
>
>      Erik
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


Thanks! Switching to StopAnalyzer solved the problem. Is there any detailed
documentation of what each of these analyzers does?

Tareque
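
For illustration only, here is a rough sketch of why the choice of
tokenizer decides whether "processing" is findable. It does not use Lucene
at all: the class AnalyzerSketch, its method names, and the tiny stop list
are all hypothetical stand-ins that only imitate the splitting rules of
WhitespaceAnalyzer, SimpleAnalyzer, and StopAnalyzer as shown in the
AnalyzerDemo output quoted above. StandardAnalyzer's grammar is more
involved and is not imitated here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Simplified stand-ins for three analyzers; NOT Lucene's implementation.
class AnalyzerSketch {

    // Hypothetical stop list for illustration; Lucene ships its own list.
    static final Set<String> STOP_WORDS = Set.of("a", "an", "and", "is", "the", "to");

    // Imitates WhitespaceAnalyzer: split on whitespace only, keep everything else.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Imitates SimpleAnalyzer: maximal runs of letters, lowercased.
    // Hyphens, periods, and digits all end the current token.
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) tokens.add(current.toString());
        return tokens;
    }

    // Imitates StopAnalyzer: letter tokens with stop words removed.
    static List<String> stopTokens(String text) {
        List<String> tokens = new ArrayList<>(letterTokens(text));
        tokens.removeIf(STOP_WORDS::contains);
        return tokens;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"processing-7-bit", "straighforwerd.processing"}) {
            System.out.println(s);
            System.out.println("  whitespace: " + whitespaceTokens(s));
            System.out.println("  letter:     " + letterTokens(s));
            System.out.println("  stop:       " + stopTokens(s));
        }
    }
}
```

With letter-based splitting, "processing" becomes its own token in both
sample strings, so a search for it can match. With whitespace-only
splitting, each string stays a single token and the search misses.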



