lucene-java-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Modifying the StandardAnalyzer to not strip ;-signs
Date Fri, 26 Nov 2004 01:04:21 GMT
You could use Field.Keyword.
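
A minimal sketch of that suggestion, assuming the Lucene 1.4.x jar is on the classpath. The class name, the field names "raw" and "contents", and the sample string are made up for illustration. Field.Keyword stores the value and indexes it as a single term without running it through the analyzer, so the stored copy should come back unchanged. Field.Text is kept alongside it here only so the text remains searchable through the StandardAnalyzer.

```java
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.RAMDirectory;

public class KeywordFieldSketch {

    // Indexes one document in memory and returns the stored value of
    // its "raw" field (a hypothetical field name).
    static String roundTrip(String text) throws IOException {
        RAMDirectory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);

        Document doc = new Document();
        // Stored and indexed as one term; the analyzer never sees it.
        doc.add(Field.Keyword("raw", text));
        // Stored and tokenized by the analyzer, for full-text search.
        doc.add(Field.Text("contents", text));
        writer.addDocument(doc);
        writer.close();

        IndexReader reader = IndexReader.open(dir);
        try {
            // Only one document was added, so its number is 0.
            return reader.document(0).get("raw");
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // Sample input containing a semicolon (an assumption, not real data).
        System.out.println(roundTrip("ee3e st&#xE4;der"));
    }
}
```

If the field never needs to be searched at all, Field.UnIndexed (stored only) would be another option in the same API.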

Otis


--- clas <nicclas@gmail.com> wrote:

> Hi all,
> 
> I'm using the StandardAnalyzer in an application based on Lucene
> 1.4.2. 
> 
> Currently, and by default, the StandardAnalyzer "throws
> semicolon-signs away" at index and store time. For example, a
> document like "ee3e st&#xE4;der" looks like "ee3e st&#xE4der" when
> retrieved from the index (that is, the ;-sign is missing). The
> document is stored as a Field.Text in the index.
> 
> What I would like to do is to index, and store, words like
> "st&#xE4;der" and retrieve them in exactly the same form, i.e. as
> "st&#xE4;der".
> 
> I can imagine that the result I would like to achieve can be produced
> by some modifications to the StandardTokenizer.jj (or somewhere else).
> Can someone please help me by showing me where and how such a change
> can be made?
> 
> (Note: It is not necessary to be able to search for text with
> semicolon-sign included, just to retrieve them in their original
> format.)
> 
> cheers
> Clas / Frisim.com
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 
> 

