lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Memo: RE: RE: Query parser and minus signs
Date Wed, 26 May 2004 14:11:42 GMT
What is the value of your "Parsed query:" output?
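
That output matters because the analyzer decides whether the hyphenated code
survives as a single term. As a rough illustration only, the sketch below
imitates in plain Java (no Lucene classes) the two behaviours described
further down the thread: splitting on whitespace alone, versus splitting on
every non-letter and lowercasing. The class and method names
(TokenizerSketch, whitespaceTokens, letterTokens) are made up for this
example and are not part of Lucene.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for analyzer behaviour; it does not call Lucene.
public class TokenizerSketch {

    // Splits on whitespace only and keeps case, so "zh-HK" stays one token.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Splits at every non-letter and lowercases, so the hyphen is lost.
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(whitespaceTokens("zh-HK")); // prints [zh-HK]
        System.out.println(letterTokens("zh-HK"));     // prints [zh, hk]
    }
}
```

If the parsed query shows language:zh-HK but the field was indexed through an
analyzer that behaves like the second method, the index would hold zh and hk
rather than zh-HK, which is one possible reason for getting no hits.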


On May 26, 2004, at 8:39 AM, alex.bourne@hsbcam.com wrote:

>
> I switched to indexing using a text field instead of keyword, then I tried
> the following based on various pieces of advice:
>
>             PerFieldAnalyzerWrapper pfaw =
>                 new PerFieldAnalyzerWrapper(new ChineseAnalyzer());
>             pfaw.addAnalyzer("language", new WhitespaceAnalyzer());
>
>             try
>             {
>                 query = MultiFieldQueryParser.parse(queryString,
>                     new String[]{"contents", "keywords", "title", "language"},
>                     (Analyzer) pfaw);
>                 System.out.println("Parsed query: " + query.toString());
>             }
>             catch (ParseException e)
>             {
>                 error = true;
>                 e.printStackTrace();
>             }
>
> I have tried both "language:zh-HK" and "language:zh\-HK" (which appears in
> the debugger as "language:zh\\-HK") as the query, and neither returns any
> hits. I've tried stepping through the code to see what is being indexed
> (which looks OK, at least to a relative beginner like myself), and also
> through the search code, but I'm still none the wiser.
>
> Am I doing something wrong, or have I completely missed the point?
>
>
>
> To:    Alex BOURNE/IBEU/HSBC@HSBC
> cc:
> bcc:
>
> Subject:    RE: RE: Query parser and minus signs
>
>
> Remember that Luke displays the stored field value, not the indexed
> tokens, so you would expect to see en-uk in the field.
>
> doc.add(Field.Keyword("locale","test-uk"));
>
> Are you adding the field to the document like this?
>
> Also, which analyzer are you using to parse the query?
>
> org.apache.lucene.analysis.WhitespaceAnalyzer : parses as locale:en-uk
> org.apache.lucene.analysis.SimpleAnalyzer : parses as locale:en uk
> org.apache.lucene.analysis.standard.StandardAnalyzer : parses as locale:en uk
>
> Try using the whitespace analyzer in Luke and see how it interprets the
> query.  If you are indexing the field as a keyword but the query is being
> tokenized, that may be your problem.
>
>
>
> -----Original Message-----
> From: alex.bourne@hsbcam.com [mailto:alex.bourne@hsbcam.com]
> Sent: 24 May 2004 09:50
> To: Lucene Users List
> Subject: RE: RE: Query parser and minus signs
>
> I tried this, but no, it does not work. I'm concerned that escaping the
> minus symbol does not appear to work. The field is indexed as a keyword, so
> it is not tokenized. I've checked the contents using Luke, which confirms
> this.
>
>
>
>
> "David Townsend" <david.townsend@magus.co.uk> on 21 May 2004 17:02
>
> Please respond to "Lucene Users List" <lucene-user@jakarta.apache.org>
>
> To:    "Lucene Users List" <lucene-user@jakarta.apache.org>
> cc:
> bcc:
>
> Subject:    RE: RE: Query parser and minus signs
>
>
> Doesn't "en UK" as a phrase query work?
>
> You're probably indexing it as a text field so it's being tokenised.
>
> -----Original Message-----
> From: alex.bourne@hsbcam.com [mailto:alex.bourne@hsbcam.com]
> Sent: 21 May 2004 16:47
> To: Lucene Users List
> Subject: Memo: RE: Query parser and minus signs
>
> Hmm, we may have to if there is no workaround. We're not using Java
> locales, but were trying to stick to the ISO standard, which uses hyphens.
>
>
>
>
> "Ryan Sonnek" <rsonnek@DigitalRiver.com> on 21 May 2004 16:38
>
> Please respond to "Lucene Users List" <lucene-user@jakarta.apache.org>
>
> To:    "Lucene Users List" <lucene-user@jakarta.apache.org>
> cc:
> bcc:
>
> Subject:    RE: Query parser and minus signs
>
>
> If you're dealing with locales, why not use Java's built-in locale syntax
> (e.g. en_UK, zh_HK)?
>
>> -----Original Message-----
>> From: alex.bourne@hsbcam.com [mailto:alex.bourne@hsbcam.com]
>> Sent: Friday, May 21, 2004 10:36 AM
>> To: lucene-user@jakarta.apache.org
>> Subject: Query parser and minus signs
>>
>> Hi All,
>>
>> I'm using Lucene on a site that has split content, with one branch
>> containing pages in English and a separate branch in Chinese.  Some of the
>> Chinese pages include some (untranslatable) English words, so a search
>> carried out in either language can return pages from the wrong branch. To
>> combat this we introduced a language field into the index, which contains
>> the standard language codes: en-UK and zh-HK.
>>
>> When you parse a query such as language:"en\-UK", you could reasonably
>> expect the search to return all pages with the language field set to
>> "en-UK" (the minus symbol should be escaped by the backslash, according to
>> the FAQ). Unfortunately the parser seems to return "en UK" as the parsed
>> query, and hence the search returns no documents.
>>
>> Has anyone else had this problem, or could anyone suggest a workaround? I
>> have yet to find a solution in the mailing list archives or elsewhere.
>>
>> Many thanks in advance,
>>
>> Alex Bourne
>>
>>
>>
>> _____________________________________________________
>>
>> This transmission has been issued by a member of the HSBC Group
>> ("HSBC") for the information of the addressee only and should not be
>> reproduced and / or distributed to any other person. Each page
>> attached hereto must be read in conjunction with any disclaimer which
>> forms part of it. This transmission is neither an offer nor
>> the solicitation
>> of an offer to sell or purchase any investment. Its contents
>> are based
>> on information obtained from sources believed to be reliable but HSBC
>> makes no representation and accepts no responsibility or
>> liability as to
>> its completeness or accuracy.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>
>>
> ******************************************************************
>  This message originated from the Internet. Its originator may or
>  may not be who they claim to be and the information contained in
>  the message and any attachments may or may not be accurate.
> ******************************************************************



