lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: WhitespaceAnalyzer vs StandardAnalyzer
Date Mon, 18 Nov 2013 16:28:34 GMT
What are you getting from looking at your admin/analysis page? That should
help a lot. Otherwise you haven't provided much info about what's failing,
for instance debug=query output to see what gets through your parser.

You might review:
http://wiki.apache.org/solr/UsingMailingLists

Best,
Erick


On Sun, Nov 17, 2013 at 9:42 PM, <raghavendra.k.rao@barclays.com> wrote:

> Hi All,
>
> Could anyone please suggest whether it is possible to perform leading and/or
> trailing wildcard searches using WhitespaceAnalyzer?
>
> As noted below, WhitespaceAnalyzer works well for my use case, but I need to
> support wildcard searches. Initial results suggest that it isn't possible.
>
> BTW, I escape all the special characters and then, at the end, I tried
> suffixing "*". It didn't help.
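
One possible reason the suffixed "*" has no effect is the order of operations: if the asterisk is added before escaping, it gets escaped too and is treated as a literal character, and if it ends up inside double quotes, the classic QueryParser does not expand it. The sketch below is only an illustration under assumptions: `escapeSpecials` is a hypothetical stand-in for `QueryParser.escape()`, the list of special characters is assumed to match the classic query syntax, and the class name is made up. It handles a single term without spaces.

```java
final class WildcardQueryText {
    // Assumed set of query-syntax special characters (classic QueryParser).
    private static final String SPECIALS = "\\+-!():^[]\"{}~*?|&/";

    /** Hypothetical stand-in for QueryParser.escape(): backslash-escapes each special character. */
    static String escapeSpecials(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIALS.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Escapes the user's text first, then appends an unescaped '*' so it remains a wildcard. */
    static String prefixQueryText(String userText) {
        return escapeSpecials(userText.toUpperCase()) + "*";
    }

    public static void main(String[] args) {
        // '*' added before escaping becomes a literal character:
        System.out.println(escapeSpecials("ABC/X:12*"));  // prints ABC\/X\:12\*
        // '*' added after escaping is still a wildcard:
        System.out.println(prefixQueryText("ABC/x:12"));  // prints ABC\/X\:12*
    }
}
```

With the real parser in Lucene 4.x, two settings are probably also relevant: `setAllowLeadingWildcard(true)` for leading wildcards, and `setLowercaseExpandedTerms(false)`, since wildcard terms are lowercased by default and would then miss upper-case tokens if the index holds them, as the `toUpperCase()` call in this thread suggests.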
>
> Please suggest. It is really urgent. I would appreciate any possible help!
>
> Regards,
> Raghu
>
>
> -----Original Message-----
> From: Rao, Raghavendra: IT (NYK)
> Sent: Sunday, November 17, 2013 12:54 PM
> To: java-user@lucene.apache.org
> Subject: RE: WhitespaceAnalyzer vs StandardAnalyzer
>
> The solution clicked for me as soon as I sent the email :)
>
> The problem was that I was enclosing the search text in double quotes
> (for PhraseQuery) before passing it to QueryParser. Since the double quote
> is one of Lucene's special characters, I guess the enclosing quotes were
> being escaped as well, which broke the phrase. I have now changed the code
> as follows.
>
> Analyzer analyzer = new WhitespaceAnalyzer(Version.LUCENE_43);
> QueryParser parser = new QueryParser(Version.LUCENE_43, "CONTENTS", analyzer);
> query = parser.parse(
>     "\"" + QueryParser.escape(strTxtSearchString.toUpperCase()) + "\"");
>
> Regards,
> Raghu
>
>
> -----Original Message-----
> From: Rao, Raghavendra: IT (NYK)
> Sent: Sunday, November 17, 2013 12:36 PM
> To: java-user@lucene.apache.org
> Subject: RE: WhitespaceAnalyzer vs StandardAnalyzer
>
> Thank you very much, Erick.
>
> WhitespaceAnalyzer is going pretty well. I am now trying to search for
> values with special characters that need escaping for Lucene, but I am
> facing some issues.
>
> I have used the QueryParser.escape() method in the past with
> StandardAnalyzer and it worked fine. But now, with WhitespaceAnalyzer, the
> final query is altered once I use the escape() method. Below is an
> example.
>
> ***Code***
> Analyzer analyzer = new WhitespaceAnalyzer(Version.LUCENE_43);
> QueryParser parser = new QueryParser(Version.LUCENE_43, "CONTENTS", analyzer);
> query = parser.parse(QueryParser.escape(strTxtSearchString.toUpperCase()));
>
> ***Result***
> Raw search string passed: modern corporation
> It is provided to Lucene as: "modern corporation" for PhraseQuery
>
> Type of query: BooleanQuery
> query.toString: CONTENTS:"MODERN CONTENTS:CORPORATION"
>
> whereas I am expecting:
>
> Type of query: PhraseQuery
> query.toString: CONTENTS:"MODERN CORPORATION"
>
> Please let me know if I am doing anything wrong. As a last resort, I am
> planning to escape the special characters manually by preceding them with a
> "\".
>
> Regards,
> Raghu
>
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerickson@gmail.com]
> Sent: Friday, November 15, 2013 4:45 PM
> To: java-user
> Subject: Re: WhitespaceAnalyzer vs StandardAnalyzer
>
> Well, your example will work exactly as you want. And if your input is
> strictly controlled, that's fine. But if you're indexing free text, for
> instance, punctuation will be part of the token. I.e., in the sentence just
> before this one, "token" would not be found, but "token." would.
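
To make the point about punctuation concrete, here is a rough sketch that approximates whitespace-only tokenization with `String.split`. It is not Lucene's WhitespaceAnalyzer itself, only an assumption-laden imitation of its behaviour, and the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class WhitespaceSplitSketch {
    /** Splits on runs of whitespace only; punctuation and case are left untouched. */
    static List<String> tokens(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    public static void main(String[] args) {
        // The model-number example from this thread yields exactly two tokens.
        System.out.println(tokens("ModelNumber: ABC/x:123"));  // prints [ModelNumber:, ABC/x:123]

        // In free text, trailing punctuation stays attached to the word.
        List<String> sentence = tokens("punctuation will be part of the token.");
        System.out.println(sentence.contains("token"));   // prints false
        System.out.println(sentence.contains("token."));  // prints true
    }
}
```

The same effect applies to case: "Token." and "token." would be different tokens unless a lowercasing step is added on both the indexing and the query side.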
>
> The admin/analysis page is your friend :).
>
> You might want to consider following with a LowerCaseFilterFactory here
> unless you want your searches to be case sensitive.
>
> And do watch querying in this case. You need to escape things like the
> colon and other special characters, see:
> http://lucene.apache.org/core/2_9_4/queryparsersyntax.html#EscapingSpecial Characters
>
> Best,
> Erick
>
>
> On Fri, Nov 15, 2013 at 3:21 PM, <raghavendra.k.rao@barclays.com> wrote:
>
> > Hi,
> >
> > I implemented my Lucene solution using StandardAnalyzer for both
> > indexing and searching. While testing, I noticed that special
> > characters such as hyphens, forward slashes, etc. are omitted by this
> > Analyzer.
> >
> > In plain English, the requirement is to search for individual words;
> > in Lucene terms, whitespace should be the only token delimiter. Also,
> > no part of the text should be modified / omitted.
> >
> > For example: ModelNumber: ABC/x:123
> > Here there should be only 2 tokens, "ModelNumber:" and "ABC/x:123".
> >
> > Based on what I read about WhitespaceAnalyzer, it sounds as though it
> > can do exactly what I am looking for. Before I make this big decision,
> > I also wanted to run this by you folks to check if there are any
> > side-effects of switching the Analyzer - keeping in mind my requirements.
> >
> > Any suggestions as always would be greatly appreciated.
> >
> > Regards,
> > Raghu
> >
> >
> > _______________________________________________
> >
> > This message is for information purposes only, it is not a
> > recommendation, advice, offer or solicitation to buy or sell a product
> > or service nor an official confirmation of any transaction. It is
> > directed at persons who are professionals and is not intended for
> > retail customer use. Intended for recipient only. This message is
> > subject to the terms at: www.barclays.com/emaildisclaimer.
> >
> > For important disclosures, please see:
> > www.barclays.com/salesandtradingdisclaimer regarding market commentary
> > from Barclays Sales and/or Trading, who are active market
> > participants; and in respect of Barclays Research, including
> > disclosures relating to specific issuers, please see
> > http://publicresearch.barclays.com.
> >
> > _______________________________________________
> >
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
