lucene-java-user mailing list archives

Subject result explanations / how to get the current document id inside a similarity subclass
Date Fri, 10 Nov 2006 11:48:53 GMT

Hello folks,

we want to work with explanations of document scores in result lists.
In this context we are interested in the scores of the individual terms of a
query, for each document in the result list:

"termA termB"

doc1  => overall score 2.3
doc2  => overall score 1.6

We would love to have explanations like this:
overall score doc1 (2.3) = score termA (1.2) + score termB (1.1)
overall score doc2 (1.6) = score termA (1.1) + score termB (0.5)
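
To make the desired output concrete, here is a minimal sketch of the kind of
collector we have in mind. The class TermScoreBreakdown and its methods are
hypothetical and not part of Lucene. It also assumes the simple additive model
from the example above, whereas real Lucene scores include further factors.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper (not a Lucene class): per-document, per-term score contributions. */
final class TermScoreBreakdown {

    // docId -> (term -> score contribution); insertion order is preserved
    private final Map<Integer, Map<String, Float>> scores = new LinkedHashMap<>();

    /** Records the contribution of one term to one document's score. */
    void record(int docId, String term, float score) {
        scores.computeIfAbsent(docId, id -> new LinkedHashMap<>())
              .merge(term, score, Float::sum);
    }

    /** Sums all recorded term contributions for a document (assumed additive). */
    float overall(int docId) {
        float sum = 0f;
        for (float s : termsOf(docId).values()) {
            sum += s;
        }
        return sum;
    }

    /** Formats one line in the style of the example explanations. */
    String explain(int docId) {
        StringJoiner parts = new StringJoiner(" + ");
        for (Map.Entry<String, Float> e : termsOf(docId).entrySet()) {
            parts.add(String.format(Locale.ROOT, "score %s (%.1f)", e.getKey(), e.getValue()));
        }
        return String.format(Locale.ROOT, "overall score doc%d (%.1f) = %s",
                docId, overall(docId), parts);
    }

    // Returns the recorded terms of a document, or an empty map if none exist.
    private Map<String, Float> termsOf(int docId) {
        return scores.getOrDefault(docId, Collections.<String, Float>emptyMap());
    }
}
```

Calling record(1, "termA", 1.2f) and record(1, "termB", 1.1f) and then
explain(1) would yield the first line of the example. What is missing is a
way to learn the docId at the point where the term scores are computed.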

In the past we worked with the Searcher.explain(..) method, which is appropriate
for explaining the results for a small number of documents. But since this
method takes as much time as a whole search (as stated in the API docs), it
is of course not feasible for whole result lists.

Nevertheless, all values should be available during the calculation of the overall
score, which is done inside the Similarity class. Thus, collecting them should
add nearly no runtime overhead; it is mainly a question of memory.

We have looked inside Similarity, and everything is available except the current document
id, so we have the term score values but don't know which documents they belong
to. So this is our question:
Does anybody know how to get the current document number/id inside a subclass
implementation of Similarity?

Thanks in advance!


--

Christian Reuschling, Dipl.-Ing.(BA)
Software Engineer

Knowledge Management Department
German Research Center for Artificial Intelligence DFKI GmbH
Erwin-Schrödinger-Straße 57, D-67663 Kaiserslautern, Germany

Phone: +49.631.205-3441