lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "sandeep chawla" <sand.cha...@gmail.com>
Subject Re: Query Analyzer Issue
Date Fri, 31 Aug 2007 19:07:21 GMT
Which analyzer are you using in your query parser ?

can you share the one line of code in which you construct a QueryParser Object.

As you might be  parsing  a  query string made of different fields, I
suggest you use a

PerFieldAnalyzerWrapper  which lets you do unique analysis for different fields.


~sandeep


On 31/08/2007, Harini Raghavan <harini.raghavan@insideview.com> wrote:
> Hi Everyone,
>
> I am facing some strange behaviour with Analyzers. I am using SimpleAnalyzer
> for some fields in my Compass entity, but I also wrote a custom Analyzer
> that is slightly different from the SimpleAnalyzer as I wanted to allow even
> letters and digits in company name column.
> So custom analyzer CompanyNameAnalyzer uses the following tokenizer.
>
> public class CompanyNameTokenizer extends CharTokenizer {
>   /** Construct a new CompanyNameTokenizer. */
>   public CompanyNameTokenizer(Reader in) {
>     super(in);
>   }
>
>   protected char normalize(char c) {
>     return Character.toLowerCase(c);
>   }
>
>   /** Collects only characters which are numbers or letters */
>   protected boolean isTokenChar(char c) {
>     Character ch = new Character(c);
>     // Exclude @ special character while tokenizing
>     if(Character.isLetterOrDigit(c) || ch.equals('@'))
>         return true;
>     else
>         return false;
>   }
> }
>
> But for some reason when I search for +companyName:good +companyName:will,
> the word 'will' is ignored in the search, I get results that match only
> good. I guess this means that 'will' is being stripped off as it is a stop
> word. I don't understand why this should happen because the custom Analyzer
> I am using does not use the StopFilter. So why should it filter the stop
> words?
>
> I tried looking at the Lucene source too, but no luck. Any suggestions would
> be appreciated.
>
> Thanks,
> Harini
>


-- 
SANDEEP CHAWLA
House No- 23                     			
10th main 					
BTM 1st  Stage     					
Bangalore						Mobile: 91-9986150603

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message