lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Murdoch, Paul" <PAUL.B.MURD...@saic.com>
Subject RE: StandardAnalyzer and comma
Date Wed, 24 Feb 2010 17:10:44 GMT
Thanks for the input.  I'll give the WhitespaceAnalyzer a shot.  Also,
AFAIK, Field.Index.NOT_ANALYZED means that the content you index is not
split into separate tokens so it is searchable, but only for exact
matches.  I may be able to get what I want with the WhitespaceAnalyzer
and Field.Index.NOT_ANALYZED.  Thanks again.

Paul 

-----Original Message-----
From: java-user-return-45134-PAUL.B.MURDOCH=saic.com@lucene.apache.org
[mailto:java-user-return-45134-PAUL.B.MURDOCH=saic.com@lucene.apache.org
] On Behalf Of Max Lynch
Sent: Wednesday, February 24, 2010 11:42 AM
To: java-user@lucene.apache.org
Subject: Re: StandardAnalyzer and comma

Personally punctuation matters in my queries so I use
WhitespaceAnalyzer.  I
also only want exact hits, so that analyzer works well for me.

Also, AFAIK you don't set NOT_ANALYZED if you want to search through it.

On Wed, Feb 24, 2010 at 10:33 AM, Murdoch, Paul
<PAUL.B.MURDOCH@saic.com>wrote:

> I'm using Lucene 2.9.  How do I make a comma behave like a regular
> character using the StandardAnalyzer?  Example:
>
>
>
> I have a field called "choice" and some field values:
>
>
>
> groupA, morning
>
> groupB, noon
>
> groupC, night
>
> morning
>
> noon
>
> night
>
>
>
> So a query choice:night returns "groupC, night" and "night".  Well, I
> only wanted "night".  The StandardAnalyzer strips the commas from
> phrases and splits on whitespace.  A phrase query choice:"night"
> produces the same results.  I think indexing the field values as
> NOT_ANALYZED and making the comma behave as a regular character will
> solve this.
>
>
>
> Of course I have thought about choice:(night -groupC).  That is not an
> option because the contents of the index are hidden from the front end
> where queries are made by users.  I looked into changing
> StandardTokenizerImpl punctuation, but I'm hoping for a more simple
> solution.  Also, changing analyzers is not an option.  I could
possibly
> extend the StandardAnalyzer, but how do I set the punctuation
settings?
> Any help here would be great.  This seems like it should be an easy
fix
> so I hope I've missed something simple.
>
>
>
> Thanks,
>
> Paul
>
>
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message