lucene-java-user mailing list archives

From "prabin meitei" <prabin.mei...@gmail.com>
Subject Re: Problem with special charecters
Date Wed, 03 Dec 2008 11:48:13 GMT
I don't think there is a way to do that with Lucene's StandardAnalyzer,
because StandardAnalyzer is designed to split text on a fixed set of
delimiter characters. For custom analysis it is better to use your own
analyzer. You could try SimpleAnalyzer, but check its output first: it
splits on non-letter characters, so it may drop '+' as well.
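
To see why the choice of tokenizer matters for a value like "ABC+S", here is
a minimal plain-Java sketch. It is not Lucene code: the class and method
names are made up for illustration, and the first method only mimics the
splitting behaviour reported in this thread rather than StandardAnalyzer's
real grammar.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class TokenizeSketch {

    // Hypothetical helper: split on every run of non-letter, non-digit
    // characters, then lowercase. '+' and '-' act as delimiters and vanish.
    public static List<String> splitOnPunctuation(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Hypothetical helper: split on whitespace only, then lowercase.
    // Special characters stay inside the token.
    public static List<String> splitOnWhitespace(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(splitOnPunctuation("ABC+S")); // prints [abc, s]
        System.out.println(splitOnWhitespace("ABC+S"));  // prints [abc+s]
    }
}
```

With the second approach "ABC+S" and "ABC-S" remain distinct terms and
matching is still case-insensitive, which is what a whitespace tokenizer
followed by a lowercase filter gives you in a custom analyzer.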

Prabin
toostep.com

On Wed, Dec 3, 2008 at 4:36 PM, Ravichandra <
ravichandra.thiruveedhula@fmr.com> wrote:

>
> It worked out well.
> Thanks
>
> Is there any way to use StandardAnalyzer and tell it not to split tokens on
> these characters?
>
> Thanks
> Ravichandra
>
>
> prabin meitei wrote:
> >
> > Use your own analyzer. Write a class extending Lucene's Analyzer. You can
> > override the tokenStream method to include whatever you want and exclude
> > what you don't want.
> >
> > E.g., a tokenStream method which may work for you:
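> >  public TokenStream tokenStream(String fieldName, Reader reader) {
> >         // Split on whitespace only (keeps '+' and '-'), then lowercase.
> >         // StopFilter needs a stop-word list; StopAnalyzer's default is assumed here.
> >         TokenStream result = new StopFilter(
> >                     new LowerCaseFilter(
> >                         new WhitespaceTokenizer(reader)),
> >                     StopAnalyzer.ENGLISH_STOP_WORDS);
> >         return result;
> >     }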
> >
> > Prabin
> > toostep.com
> >
> > On Wed, Dec 3, 2008 at 3:17 PM, Ravichandra <
> > ravichandra.thiruveedhula@fmr.com> wrote:
> >
> >>
> >> Hi
> >>
> >> I tried that approach. I escaped the special characters with "\", and the
> >> query did contain the special character, but I got no results that time.
> >>
> >> What I found out is that when I use StandardAnalyzer on "ABC+S", the terms
> >> generated are "ABC" and "S", and the '+' is lost.
> >> When I use WhitespaceAnalyzer or KeywordAnalyzer (or use StandardAnalyzer
> >> but don't analyze this field), it works, but it is case sensitive.
> >>
> >> Can anybody suggest how to use WhitespaceAnalyzer/KeywordAnalyzer along
> >> with a lowercase filter,
> >> or
> >> how to make StandardAnalyzer not split tokens on these special
> >> characters?
> >>
> >> Thanks
> >> Ravichandra
> >>
> >>
> >> prabin meitei wrote:
> >> >
> >> > Try manually escaping the search string by adding "\" in front of the
> >> > special characters (you can do this easily with a string replace).
> >> > This will make sure that your query contains the special characters.
> >> >
> >> >
> >> > Prabin
> >> > toostep.com
> >> >
> >> > On Wed, Dec 3, 2008 at 12:03 PM, Ravichandra <
> >> > ravichandra.thiruveedhula@fmr.com> wrote:
> >> >
> >> >>
> >> >> Hi
> >> >>
> >> >> I am having problems searching for special characters like %, *, +, -
> >> >> etc.
> >> >> I am using StandardAnalyzer. I create queries using the query
> >> >> constructor.
> >> >>
> >> >> I read in the forums and tried the QueryParser.escape and parse methods,
> >> >> but the problem is this: if I have two field values, "ABC+S" and "ABC-S",
> >> >> and I search for "ABC+S", the results come back in the order "ABC+S",
> >> >> "ABC-S"; when I search for "ABC-S", the order is still "ABC+S", "ABC-S".
> >> >>
> >> >> What I have seen is that when we escape the search string and then parse
> >> >> it using the query parser, the generated query doesn't contain the
> >> >> special character '+' or '-'.
> >> >>
> >> >> Please suggest how to deal with special characters.
> >> >>
> >> >> Lucene : 2.4
> >> >>
> >> >> Thanks
> >> >> Ravichandra
> >> >>
> >> >> -----
> >> >> Thanks
> >> >> Ravichandra
> >> >> --
> >> >> View this message in context:
> >> >> http://www.nabble.com/Problem-with-special-charecters-tp20807546p20807546.html
> >> >> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> >> >>
> >> >>
> >> >> ---------------------------------------------------------------------
> >> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >> >>
> >> >>
> >> >
> >> >
> >>
> >>
> >>
> >>
> >
> >
>
>
>
>
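
The manual escaping suggested earlier in the thread (putting "\" in front of
each special character) can be sketched in plain Java as below. The helper
name and the exact character list are assumptions made for illustration; in
real code the QueryParser.escape method mentioned in the original question is
the better choice. As reported above, escaping only protects the characters
from the query parser. If the analyzer drops '+' when the field is indexed or
when the query text is analyzed, escaping alone will not bring it back.

```java
public class EscapeSketch {

    // Assumed set of characters that the query parser treats as syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    // Hypothetical helper: prefix every special character with a backslash.
    public static String escapeSpecial(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeSpecial("ABC+S")); // prints ABC\+S
        System.out.println(escapeSpecial("ABC-S")); // prints ABC\-S
    }
}
```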
