lucene-solr-user mailing list archives

From J-Pro <jpro....@gmail.com>
Subject Getting unique key of a document inside of a Similarity class.
Date Thu, 19 Feb 2015 21:30:44 GMT
Good afternoon.

I need to uniquely identify a document inside a Similarity class 
during scoring. Is it possible to get the value of a document's unique 
key at this point?

For some time I thought I could use the internal docID for that. The 
method score(int doc, float freq) is called after every query execution 
for each matched doc. For the indexed docs, the doc argument equals 0, 
1, 2, etc., but only when the documents are indexed in bulk, i.e. in a 
single HTTP request. When the docs are indexed in separate requests, 
the docID equals 0 for all documents.

To summarize, here are 2 final questions:

1. Is the docID behavior described above a bug or a feature? Obviously, 
if it's a bug and I can use docID to uniquely identify a document, then 
my question is answered once this bug is fixed.
2. If the docID behavior described above is normal, then what is an 
alternative way of uniquely identifying a document inside a Similarity 
class during scoring? Can I get the unique key of the document being 
scored in Similarity?

FYI: I asked the first question in the #solr IRC channel. A user named 
hoss answered the following: "you're seeing the *internal* docIds ... 
you can't assign any special meaning to them ... i believe that at the 
level of the Similarity class, these may even be per segment, which 
means that in the context of a SegmentReader they can be used to get 
things like docValues, but they don't have any meaning compared to your 
uniqueKey (for example)". This makes me think that the answer to the 
first question is "it's a feature". But I am still not sure, and I don't 
know the answer to the second question. Please help.
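
To illustrate what hoss's explanation would imply, here is a minimal toy 
model in plain Java. It does not use the Lucene or Solr API: the names 
Segment, docBase, and buildIndex are hypothetical, and the sketch assumes 
that every separate indexing request ends up in its own segment, which 
may not hold in a real index (segments can also be merged later). Under 
that assumption, per-segment numbering would explain why every document 
shows docID 0 when the docs are indexed one request at a time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of per-segment document numbering. This is NOT the Lucene API.
public class SegmentDocIds {

    /** Hypothetical segment: unique keys stored in indexing order. */
    static final class Segment {
        final int docBase;             // offset of this segment within the whole index
        final List<String> uniqueKeys; // list position == segment-local docID

        Segment(int docBase, List<String> uniqueKeys) {
            this.docBase = docBase;
            this.uniqueKeys = uniqueKeys;
        }

        /** Resolves a segment-local docID to the document's unique key. */
        String uniqueKey(int localDocId) {
            return uniqueKeys.get(localDocId);
        }

        /** Converts a segment-local docID to an index-wide docID. */
        int globalDocId(int localDocId) {
            return docBase + localDocId;
        }
    }

    /** Assumption: one segment per indexing request (batch). */
    static List<Segment> buildIndex(List<List<String>> batches) {
        List<Segment> segments = new ArrayList<>();
        int docBase = 0;
        for (List<String> batch : batches) {
            segments.add(new Segment(docBase, batch));
            docBase += batch.size();
        }
        return segments;
    }

    public static void main(String[] args) {
        // Three docs sent in three separate requests: three one-doc segments.
        List<Segment> separate = buildIndex(Arrays.asList(
                Arrays.asList("a"), Arrays.asList("b"), Arrays.asList("c")));
        for (Segment s : separate) {
            // The local docID is 0 in every segment; only docBase differs.
            System.out.println("local=0 global=" + s.globalDocId(0)
                    + " key=" + s.uniqueKey(0));
        }
    }
}
```

In this model the local docID alone cannot identify a document; it only 
becomes meaningful together with the segment it belongs to, either by 
adding docBase or by looking up a per-segment stored value such as the 
unique key.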

Thank you very much in advance.
