lucene-java-user mailing list archives

From "Chong, Herb" <HCho...@bloomberg.com>
Subject RE: inter-term correlation [was Re: Vector Space Model in Lucene?]
Date Tue, 18 Nov 2003 14:55:17 GMT
I haven't tested it, but I know that it is not too hard to change the tables that specify the
break decisions.

Herb....

-----Original Message-----
From: Philippe Laflamme [mailto:plaflamme@konova.com]
Sent: Tuesday, November 18, 2003 9:53 AM
To: Lucene Users List
Subject: RE: inter-term correlation [was Re: Vector Space Model in Lucene?]


In terms of speed I would tend to agree with you.

My question regarding efficiency was directed more towards the quality of
the results it provides. Is the BreakIterator breaking on correct sentence
boundaries, or is it being confused by dots at the end of acronyms and such?

Karsten was mentioning that its results are of higher quality when you
prevent it from breaking after a number. Are there any other tips you can
provide?
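
For anyone who wants to check the boundary quality themselves, here is a minimal sketch using the JDK's java.text.BreakIterator. The class name SentenceSplitter, the split helper, and the sample text are made up for illustration; the sample just mixes abbreviations and numbers, which are the cases asked about above. Where the breaks actually fall depends on the rules shipped with your JDK, so run it and inspect the output rather than assuming a particular result.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for inspecting sentence boundaries; not part of Lucene.
public class SentenceSplitter {

    // Splits text into sentences using the JDK's default sentence rules for the locale.
    public static List<String> split(String text, Locale locale) {
        BreakIterator it = BreakIterator.getSentenceInstance(locale);
        it.setText(text);

        List<String> sentences = new ArrayList<>();
        int start = it.first();
        // Walk consecutive boundaries; each [start, end) span is one candidate sentence.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String candidate = text.substring(start, end).trim();
            if (!candidate.isEmpty()) {
                sentences.add(candidate);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        // Assumed sample text with abbreviations and trailing numbers.
        String sample = "The U.S. market fell 3.5 percent. Dr. Smith sold at 10. "
                + "Prices recovered later.";
        for (String sentence : split(sample, Locale.US)) {
            System.out.println("[" + sentence + "]");
        }
    }
}
```

Printing each candidate in brackets makes spurious breaks easy to spot. Running this over a hand-segmented sample and counting mismatches would give a rough precision estimate.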

Has anybody tested the implementation to estimate its precision?

Regards,
Phil
