lucene-java-user mailing list archives

From Daniel Naber <daniel.na...@t-online.de>
Subject Re: Why does the StandardTokenizer split hyphenated words?
Date Wed, 15 Dec 2004 20:49:48 GMT
On Wednesday 15 December 2004 21:14, Mike Snare wrote:

> Also, the phrase query
> would place the same value on a doc that simply had the two words as a
> doc that had the hyphenated version, wouldn't it?  This seems odd.

It isn't odd if the two forms are spelling variations of the same concept, 
which doesn't seem unlikely.

> In addition, why do we assume that a-1 is a "typical product name" but
> a-b isn't?

Maybe for "a-b", but what about English words like "half-baked"?
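
To make the a-1 versus a-b distinction concrete, here is a rough sketch of
the heuristic in plain Java. It is not Lucene's actual grammar, only an
approximation of the behaviour discussed in this thread: a hyphenated word
is kept whole when it contains a digit (treated as a product name or serial
number) and split at the hyphens otherwise. The class and method names
(HyphenSplitSketch, tokenize, hasDigit) are made up for illustration, and
the input is assumed to be whitespace-separated with no other punctuation.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration only; the real StandardTokenizer grammar is more involved.
final class HyphenSplitSketch {

    // Returns true if the string contains at least one decimal digit.
    static boolean hasDigit(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (Character.isDigit(s.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    // Splits on whitespace, then applies the assumed hyphen rule to each word.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // empty input yields a single empty string
            }
            if (!word.contains("-") || hasDigit(word)) {
                // Plain words and digit-bearing words such as "a-1" stay whole.
                tokens.add(word);
            } else {
                // Purely alphabetic hyphenated words such as "half-baked" are split.
                for (String part : word.split("-")) {
                    if (!part.isEmpty()) {
                        tokens.add(part);
                    }
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a-1 a-b half-baked"));
    }
}
```

Under this rule "half-baked" is indexed as the two terms "half" and "baked",
so a phrase query for "half baked" matches both spellings, while "a-1"
remains a single term.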

Regards
 Daniel

-- 
http://www.danielnaber.de

