lucene-java-user mailing list archives

From Daniel Naber
Subject Re: Why does the StandardTokenizer split hyphenated words?
Date Wed, 15 Dec 2004 20:49:48 GMT
On Wednesday 15 December 2004 21:14, Mike Snare wrote:

> Also, the phrase query
> would place the same value on a doc that simply had the two words as a
> doc that had the hyphenated version, wouldn't it?  This seems odd.

Not if these words are spelling variations of the same concept, which 
doesn't seem unlikely.

> In addition, why do we assume that a-1 is a "typical product name" but
> a-b isn't?

Maybe for "a-b", but what about English words like "half-baked"?
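
To make the distinction concrete, here is a rough sketch of the behavior under discussion. It is not Lucene's StandardTokenizer or its grammar, only a simplified stand-in that assumes a hyphenated word is kept whole when it contains a digit (the "product name" case) and is split at the hyphens otherwise. The class name HyphenSplitSketch and the methods hasDigit and tokenize are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class HyphenSplitSketch {

    // Hypothetical helper: true if the text contains at least one digit.
    static boolean hasDigit(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (Character.isDigit(s.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    // Simplified assumption, not Lucene's actual rule set:
    // a hyphenated word containing a digit stays one token ("a-1"),
    // any other hyphenated word is split at its hyphens ("a-b").
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // empty input yields a single empty string
            }
            if (word.indexOf('-') >= 0 && hasDigit(word)) {
                tokens.add(word);
            } else {
                for (String part : word.split("-")) {
                    if (!part.isEmpty()) {
                        tokens.add(part);
                    }
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a-1 a-b half-baked"));
    }
}
```

Under this assumption "half-baked" becomes the two adjacent tokens "half" and "baked", which is why a phrase query for "half baked" would match both spellings.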


