lucene-java-user mailing list archives

From "Murdoch, Paul" <PAUL.B.MURD...@saic.com>
Subject StandardAnalyzer and comma
Date Wed, 24 Feb 2010 16:33:09 GMT
I'm using Lucene 2.9.  How do I make a comma behave like a regular
character using the StandardAnalyzer?  Example:

I have a field called "choice" and some field values:

groupA, morning
groupB, noon
groupC, night
morning
noon
night

 

A query for choice:night returns both "groupC, night" and "night", but I
only want "night".  The StandardAnalyzer strips the commas from the
values and splits on whitespace, so "night" ends up as a term in both.
A phrase query choice:"night" produces the same results.  I think
indexing the field values as NOT_ANALYZED and making the comma behave as
a regular character will solve this.
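
To make the behaviour above concrete, here is a minimal sketch in plain Java with no Lucene dependency.  It only imitates the two indexing modes and is not Lucene's actual tokenization: analyzedTerms is a hypothetical stand-in that lowercases, drops commas and splits on whitespace, and notAnalyzedTerms is a hypothetical stand-in that keeps the whole value as one term, which is roughly what NOT_ANALYZED indexing would do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustration only: imitates analyzed vs. not-analyzed indexing of the
// "choice" field. All names here are hypothetical, not Lucene API.
class ChoiceFieldSketch {

    // Rough stand-in for analyzed indexing: lowercase, drop commas,
    // split on whitespace. Real StandardAnalyzer rules are more involved.
    public static List<String> analyzedTerms(String value) {
        List<String> terms = new ArrayList<>();
        String cleaned = value.toLowerCase(Locale.ROOT).replace(",", " ").trim();
        for (String t : cleaned.split("\\s+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Rough stand-in for not-analyzed indexing: the whole value,
    // comma included, is a single term.
    public static List<String> notAnalyzedTerms(String value) {
        return Arrays.asList(value);
    }

    // Returns the field values whose term list contains the query term exactly.
    public static List<String> matches(List<String> values, String term, boolean analyzed) {
        List<String> hits = new ArrayList<>();
        for (String v : values) {
            List<String> terms = analyzed ? analyzedTerms(v) : notAnalyzedTerms(v);
            if (terms.contains(term)) {
                hits.add(v);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList(
                "groupA, morning", "groupB, noon", "groupC, night",
                "morning", "noon", "night");
        // Analyzed: two values contain the term "night".
        System.out.println(matches(values, "night", true));
        // Not analyzed: only the value that is exactly "night" matches.
        System.out.println(matches(values, "night", false));
    }
}
```

In the not-analyzed case the query term has to equal the stored value exactly, including case and the comma, so a search for "groupC, night" would have to supply that whole string as one term.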

 

Of course I have thought about choice:(night -groupC).  That is not an
option because the contents of the index are hidden from the front end
where users make their queries.  I looked into changing the punctuation
handling in StandardTokenizerImpl, but I'm hoping for a simpler
solution.  Also, changing analyzers is not an option.  I could possibly
extend the StandardAnalyzer, but how would I set the punctuation
settings?  Any help here would be great.  This seems like it should be
an easy fix, so I hope I've missed something simple.

 

Thanks,

Paul 

 

