lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Leandro Saad" <leandro.s...@gmail.com>
Subject Re: Search with accents
Date Thu, 03 Aug 2006 14:21:20 GMT
I'm using StandardAnalyser all over, so, yes, portuguese stopwords won't be
eliminated

-- 
Leandro Rodrigo Saad Cruz
CTO - InterBusiness Technologies
db.apache.org/ojb
guara-framework.sf.net
xingu.sf.net

On 8/2/06, Eduardo S. Cordeiro <escordeiro@gmail.com> wrote:
>
> But was your index created with BrazilianAnalyzer? Because otherwise
> you wouldn't have portuguese stopwords eliminated, like "e", "ou",
> etc.
>
> 2006/8/2, Leandro Saad <leandro.saad@gmail.com>:
> > Hi Eduardo. I'm using the StandardAnalyser and I can search for words
> with
> > accents. In my case "saúde"
> >
> > --
> > Leandro Rodrigo Saad Cruz
> > CTO - InterBusiness Technologies
> > db.apache.org/ojb
> > guara-framework.sf.net
> > xingu.sf.net
> >
> > On 8/1/06, Eduardo S. Cordeiro <escordeiro@gmail.com> wrote:
> > >
> > > Yes...here's how I create my QueryParser:
> > >
> > > QueryParser parser = new QueryParser("text", new BrazilianAnalyzer());
> > >
> > > 2006/8/1, Zhang, Lisheng <Lisheng.Zhang@broadvision.com>:
> > > > Hi,
> > > >
> > > > Have you used the same BrazilianAnalyzer when
> > > > searching?
> > > >
> > > > Best regards, Lisheng
> > > >
> > > > -----Original Message-----
> > > > From: Eduardo S. Cordeiro [mailto:escordeiro@gmail.com]
> > > > Sent: Tuesday, August 01, 2006 1:40 PM
> > > > To: java-user@lucene.apache.org
> > > > Subject: Search with accents
> > > >
> > > >
> > > > Hello there,
> > > >
> > > > I have a brazilian portuguese index, which has been analyzed with
> > > > BrazilianAnalyzer. When searching words with accents, however,
> they're
> > > > not found -- for instance, if the index contains some text with the
> > > > word "maçã" and I search for that very word, I get no hits, but if I
> > > > search "maca" (which is another portuguese word) then the document
> > > > containing "maçã" is found.
> > > >
> > > > I've seen posts in the archive indicating that I should use
> > > > ISOLatin1AccentFilter to handle this, but I don't quite see how:
> > > > should I leave indexation as it is and use this filter only for
> search
> > > > queries or should I apply it in both cases?
> > > >
> > > > Thank you,
> > > > Eduardo Cordeiro
> > > >
> > > >
> ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > > >
> > > >
> > >
> >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message