lucene-java-user mailing list archives

From Asad Sayeed
Subject Stable score scaling; LSI again
Date Tue, 15 Jul 2008 02:15:16 GMT
Hi, I have a couple of questions about how to alter the similarity scores.
I need scores that can be thresholded, and whose thresholds remain stable
even when I add documents through the IndexWriter; i.e., identity should be a
fixed value such as 1.0.  I know that for efficiency reasons, Lucene
doesn't do this.  However, that level of efficiency is not as big a concern
for me as getting a stable, thresholdable similarity score from, e.g.,
"normal" cosine similarity.  Is there a way to change DefaultSimilarity
trivially to get this feature, or is it a major overhaul?  The reason is that
the search results from Lucene are fed to another analyzer, so when the "identity"
score changes by adding docs to the index, it messes up the rest of the
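
To make concrete what I mean by "normal" cosine similarity with a fixed
identity score, here is a minimal sketch in plain Java. It does not use any
Lucene API: the class CosineSketch and its methods termFreqs and cosine are
hypothetical names of my own, and the whitespace tokenizer is only a stand-in
for a real Analyzer. Because it works on raw term counts, with no IDF or other
collection-wide statistic, the value for a given pair of texts does not depend
on how many documents are in the index.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of Lucene: cosine similarity over raw term counts. */
public class CosineSketch {

    /** Counts lowercased, whitespace-separated tokens (a stand-in for a real Analyzer). */
    static Map<String, Integer> termFreqs(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String tok : text.toLowerCase().trim().split("\\s+")) {
            if (!tok.isEmpty()) {
                tf.merge(tok, 1, Integer::sum);
            }
        }
        return tf;
    }

    /** Cosine of the angle between two term-count vectors; 0.0 if either vector is empty. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * (double) other;
            }
        }
        double na = norm(a);
        double nb = norm(b);
        return (na == 0.0 || nb == 0.0) ? 0.0 : dot / (na * nb);
    }

    /** Euclidean length of a term-count vector. */
    private static double norm(Map<String, Integer> v) {
        double sum = 0.0;
        for (int c : v.values()) {
            sum += (double) c * c;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        Map<String, Integer> q = termFreqs("stable score scaling");
        Map<String, Integer> d = termFreqs("score scaling in lucene");
        System.out.println(cosine(q, q)); // identity: 1.0 up to floating-point rounding
        System.out.println(cosine(q, d)); // partial overlap: strictly between 0 and 1
    }
}
```

A score like this can be compared against a fixed threshold, since a text
compared with itself always scores 1.0 (up to rounding) and disjoint texts
score 0.0.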

The other question I had was about scoring via Latent Semantic Indexing.  I
read in the archives of this list from way back when that LSI was hard to
integrate into Lucene.  Is that still the case?  I mean, from what I
understand, it is just transforming the index in some way.
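
For reference, the transformation I have in mind is the usual textbook
formulation of LSI as a truncated SVD of the term-document matrix. The symbols
below are my own notation and the sketch assumes nothing about how Lucene
stores its index: A is the m-by-n matrix of term weights (m terms, n
documents), k is the number of latent dimensions kept, and q is a query
expressed as an m-dimensional term vector.

```latex
A \approx A_k = U_k \Sigma_k V_k^{\top},
\qquad U_k \in \mathbb{R}^{m \times k},\;
\Sigma_k \in \mathbb{R}^{k \times k},\;
V_k \in \mathbb{R}^{n \times k}

\hat{q} = \Sigma_k^{-1} U_k^{\top} q,
\qquad
\operatorname{sim}(q, d_j) = \cos\bigl(\hat{q},\, (V_k)_{j,:}\bigr)
```

Document j is represented by row j of V_k, and the query is folded into the
same k-dimensional space before the cosine is taken.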
