lucene-java-user mailing list archives

From Ian Lea
Subject Re: recording a universal ID from DocID in a CustomScoreQuery
Date Mon, 06 Feb 2012 11:53:52 GMT
The int doc will be relative to the subreader, not the entire index.
There is a setNextReader(IndexReader reader, int docBase) hook which
you might somehow be able to use.  Failing that I'd go for FieldCache,
or store the docids in a Set in a Map keyed by the current reader, if
that would give you what you need for the subsequent messing around.
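
The docBase idea above can be sketched in plain Java. This is only an illustration of the arithmetic (index-wide id = docBase + segment-local doc), not Lucene code: the class GlobalDocIdRecorder and its methods setNextSegment and record are hypothetical names, standing in for whatever per-segment callback and per-document callback are actually available.

```java
import java.util.Set;
import java.util.TreeSet;

// Plain-Java stand-in for a per-segment callback pattern; NOT a Lucene class.
// All names here are hypothetical and exist only for illustration.
class GlobalDocIdRecorder {
    // Offset of the current segment within the whole index.
    private int docBase = 0;
    // Index-wide doc ids seen so far, kept sorted for easy inspection.
    private final Set<Integer> globalDocIds = new TreeSet<Integer>();

    // Assumed to be called once whenever scoring moves to a new segment,
    // in the spirit of setNextReader(IndexReader reader, int docBase).
    void setNextSegment(int docBase) {
        this.docBase = docBase;
    }

    // Assumed to be called with the segment-local doc number.
    void record(int localDoc) {
        globalDocIds.add(docBase + localDoc);
    }

    Set<Integer> recorded() {
        return globalDocIds;
    }

    public static void main(String[] args) {
        GlobalDocIdRecorder recorder = new GlobalDocIdRecorder();
        recorder.setNextSegment(0);    // first segment starts at 0
        recorder.record(3);
        recorder.setNextSegment(100);  // second segment starts at 100
        recorder.record(3);            // same local number, different document
        recorder.record(7);
        System.out.println(recorder.recorded());
    }
}
```

The point of the sketch is that the same local number (3 here) maps to different index-wide ids once the segment offset is added, which is why recording the bare int doc is not enough.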


On Sat, Feb 4, 2012 at 12:09 AM, Paul Allan Hill wrote:
> My Index does NOT have a simple UID, it uses the file PATH to the file as the unique
> I was implementing a CustomScoreQuery which not only tweaked the score but also wanted
> to write down which documents had passed through this part of the overall rebuilt query, so that
> I could further mess with those particular documents later.
> I was hoping to do it without loading up all PATHs from my index into a field cache,
> but maybe that is a false way to try to save memory.
> I thought I could write down the docId provided in the call to customScore
> public float customScore(int doc, float subQueryScore, float valSrcScore) throws IOException {
>     docIds.add(doc);  // doc is local to the current subreader, not index-wide
>     return ...;
> }
> private Set<Integer> docIds = new HashSet<Integer>();
> While I thought I had this working, apparently I had not taken into consideration the
> subreader and segment problem.
> The int called doc is not the docId for the entire index, just the local reader doc number.
> Is that right?
> So is there a standard way to convert back to the index wide DocID?
> If there is no standard way, I _might_ create a small subclass of IndexSearcher and provide
> a method to:
> (1)    Find the right reader by looping through all IndexSearcher.subReaders[] to find
> which reader called the CustomScoreQuery
> (2)    Add an offset of the proper value from IndexSearcher.docStarts[iReader]
> But I am thinking this is prone to the problem that a subreader can be made of more subreaders
> etc., so I really don't have a clue where to find the current reader and then how to map back
> to docStarts.
> I also think I'm doing this wrong, because ReaderUtil has nothing like this?
> Is there some way to note for later that a particular document came through this function
> query, or should I just accept the fact of using the field cache?
> -Paul
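
The offset arithmetic in step (2) of the quoted plan can be sketched as follows. This assumes a flat array of per-reader start offsets in ascending order with no empty readers (so the offsets are strictly increasing), and it assumes the reader index is already known; it does not address step (1), finding which reader is calling, nor nested subreaders. DocStartsMapper and its methods are hypothetical names, not part of Lucene.

```java
import java.util.Arrays;

// Hypothetical helper, NOT a Lucene class: maps between segment-local and
// index-wide doc numbers using a flat array of per-reader start offsets.
class DocStartsMapper {
    // docStarts[i] is the index-wide number of the first doc in reader i.
    // Assumed strictly increasing (no empty readers).
    private final int[] docStarts;

    DocStartsMapper(int[] docStarts) {
        this.docStarts = docStarts.clone();
    }

    // Step (2): add the reader's start offset to the local doc number.
    int toGlobal(int readerIndex, int localDoc) {
        return docStarts[readerIndex] + localDoc;
    }

    // Reverse direction: which reader holds a given index-wide doc number?
    int readerIndexOf(int globalDoc) {
        int idx = Arrays.binarySearch(docStarts, globalDoc);
        // On a miss, binarySearch returns -(insertionPoint) - 1; the owning
        // reader is the one just before the insertion point.
        return idx >= 0 ? idx : -idx - 2;
    }

    public static void main(String[] args) {
        DocStartsMapper mapper = new DocStartsMapper(new int[] {0, 100, 250});
        int global = mapper.toGlobal(1, 7);
        System.out.println(global + " lives in reader " + mapper.readerIndexOf(global));
    }
}
```

If the readers nest, the start offsets would first have to be flattened to the leaf level for this arithmetic to hold.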
