lucene-java-user mailing list archives

From Tatu Saloranta
Subject Re: inter-term correlation [was Re: Vector Space Model in Lucene?]
Date Tue, 18 Nov 2003 02:59:46 GMT
On Monday 17 November 2003 07:40, Chong, Herb wrote:
> i don't know what the Java implementation is like but the C++ one is very
> fast.
>> I personally do not have any experience with the BreakIterator in Java. Has
>> anyone used it in any production environment? I'd be very interested to
>> learn more about its efficiency.

Even if that implementation weren't fast (which it should be), it should be
fairly easy to write one that is pretty much as efficient as any of the basic
tokenizers; i.e. not much slower than a full scan over the text data plus the
token creation overhead.
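
For what it's worth, here is a minimal sketch of what sentence splitting with
java.text.BreakIterator might look like. The class name SentenceSplitter, the
sentences() helper and the choice of Locale.US are illustrative assumptions
only, not anything taken from Lucene. The loop makes a single forward pass
over the text, so the cost on top of the iterator itself should be little more
than the substring creation.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, for illustration only (not part of Lucene).
class SentenceSplitter {

    // Split text into sentences in one forward pass using the JDK BreakIterator.
    // Locale.US is an assumption; pick whatever locale matches the text.
    static List<String> sentences(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.US);
        it.setText(text);

        List<String> out = new ArrayList<>();
        int start = it.first();
        // next() returns the following boundary, or DONE at the end of the text.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String s = text.substring(start, end).trim();
            if (!s.isEmpty()) {
                out.add(s);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "Lucene is fast. Is BreakIterator fast too? It should be.";
        for (String s : sentences(text)) {
            System.out.println(s);
        }
    }
}
```

Timing this against a plain tokenizer over the same text would be the simplest
way to check whether the overhead matters for a given corpus.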

-+ Tatu +-
