lucene-java-user mailing list archives

From Doug Cutting <cutt...@lucene.com>
Subject Re: search item with '-' in it
Date Wed, 04 Jun 2003 23:26:48 GMT
Lixin Meng wrote:
> Therefore, it would be preferable to treat all hyphens in the same way:
> either as a delimiter or as part of the word (maybe with a flag in the API).

If we change StandardTokenizer in this way then we risk breaking all the
applications that currently use it and depend on its current behaviour.
So I'm reluctant to make this change.

From the StandardTokenizer documentation:

http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/analysis/standard/StandardTokenizer.html

"Many applications have specific tokenizer needs. If this tokenizer does 
not suit your application, please consider copying this source code 
directory to your project and maintaining your own grammar-based tokenizer."

Also, if you construct a tokenizer that you think is more generally 
useful than StandardTokenizer, please contribute it by mailing it to one 
of the Lucene mailing lists.
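
To make the suggestion concrete, here is a minimal sketch of hyphen
handling controlled by a flag, as proposed above. It is plain Java and
does not use or reproduce StandardTokenizer's grammar; the class name
HyphenTokenizerSketch, the method tokenize, and the keepHyphens flag are
all hypothetical names chosen for illustration. It assumes a token is a
run of letters and digits, and that a hyphen counts as part of a token
only when it sits directly between two such characters.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: NOT Lucene's StandardTokenizer and not its API.
final class HyphenTokenizerSketch {

    /**
     * Splits text into tokens made of letters and digits.
     * If keepHyphens is true, a hyphen directly between two
     * letter/digit characters stays inside the token; otherwise
     * every hyphen acts as a delimiter.
     */
    static List<String> tokenize(String text, boolean keepHyphens) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean wordChar = Character.isLetterOrDigit(c);
            // Hyphen is kept only when the flag is set, a token is in
            // progress, and the next character continues the word.
            boolean innerHyphen = keepHyphens && c == '-'
                    && current.length() > 0
                    && i + 1 < text.length()
                    && Character.isLetterOrDigit(text.charAt(i + 1));
            if (wordChar || innerHyphen) {
                current.append(c);
            } else if (current.length() > 0) {
                // Any other character ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [wi-fi, and, x-ray-123]
        System.out.println(tokenize("wi-fi and x-ray-123", true));
        // prints [wi, fi, and, x, ray, 123]
        System.out.println(tokenize("wi-fi and x-ray-123", false));
    }
}
```

Whichever setting you choose, the same one has to be applied at both
index time and query time, or hyphenated terms will not match.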

Thanks,

Doug

