lucene-java-user mailing list archives

From Trejkaz <trej...@trypticon.org>
Subject Re: Recommendation for doing a search plus collecting extra information?
Date Mon, 12 Oct 2015 00:24:37 GMT
On Mon, Oct 12, 2015 at 6:32 AM, Alan Woodward <alan@flax.co.uk> wrote:
> Hi Trejkaz,
>
> You can still use a standard collector if you don’t need to worry about
> multi-threaded search.  It sounds as though what you want to do is implement
> your own Collector that will read and record docvalues hits, and use
> MultiCollector to wrap it and a standard TopDocsCollector together.

I guess the benefit of doing it directly at the Collector is that the
results will come in doc ID order, so any I/O I'm doing would be local
to the previous I/O? Which makes sense, and fetching the values seems
easy enough, but then the order I get the results is not the order
they will come back in the search, so I have to find a fairly
efficient way to map int->int so that I can look them up later.
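
One low-overhead way to do that int->int lookup, sketched under the assumption that doc IDs really do arrive in strictly increasing order (single-threaded search, as discussed above): append to two parallel int arrays during collection and binary-search them afterwards. SortedIntIntMap and its method names are made up for illustration and are not part of Lucene.

```java
import java.util.Arrays;

/** Append-only int->int map for keys that are added in strictly increasing order. */
final class SortedIntIntMap {
    private int[] keys = new int[16];
    private int[] values = new int[16];
    private int size = 0;

    /** Records a value for a doc ID; doc IDs must be appended in increasing order. */
    void append(int docId, int value) {
        if (size > 0 && docId <= keys[size - 1]) {
            throw new IllegalArgumentException("doc IDs must be strictly increasing");
        }
        if (size == keys.length) {
            // Grow both arrays together so they stay parallel.
            keys = Arrays.copyOf(keys, size * 2);
            values = Arrays.copyOf(values, size * 2);
        }
        keys[size] = docId;
        values[size] = value;
        size++;
    }

    /** Returns the value recorded for docId, or `missing` if none was recorded. */
    int get(int docId, int missing) {
        int idx = Arrays.binarySearch(keys, 0, size, docId);
        return idx >= 0 ? values[idx] : missing;
    }

    int size() {
        return size;
    }
}
```

This costs 8 bytes per collected hit with no boxing, and each lookup is O(log n). It does record a value for every hit rather than only the top N, which is the storage overhead mentioned in the next paragraph.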

What would seem ideal here is extending ScoreDoc to put my new int in
that, so that it's stored along with the same object that gets sorted
and ultimately ends up in the array (plus the extra storage
requirement would be as low as possible), but the ScoreDoc is
created by HitQueue#getSentinelObject() and there is no way to plug a
different subclass of HitQueue into TopScoreDocCollector. So I think
this route would require reimplementing pretty much all of
TopScoreDocCollector. I guess it isn't very large, but I worry about
future API changes when messing with internal stuff.
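
For a sense of scale, here is a rough sketch of what such a reimplementation boils down to: a bounded min-heap of hits that each carry the extra int. It deliberately uses only java.util and stand-in classes (HitWithValue, TopHitsWithValue are hypothetical names), so it is not the Lucene API and skips sentinel objects, per-segment doc bases and everything else the real collector handles.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Stand-in for a ScoreDoc-like hit that also carries one extra int. */
final class HitWithValue {
    final int doc;
    final float score;
    final int value;

    HitWithValue(int doc, float score, int value) {
        this.doc = doc;
        this.score = score;
        this.value = value;
    }
}

/** Keeps the top n hits by score; ties are broken in favour of the lower doc ID. */
final class TopHitsWithValue {
    // Orders the worst hit first: lower score is worse, and on equal
    // scores the higher doc ID is worse.
    private static final Comparator<HitWithValue> WORST_FIRST = (a, b) -> {
        int c = Float.compare(a.score, b.score);
        return c != 0 ? c : Integer.compare(b.doc, a.doc);
    };

    private final int n;
    private final PriorityQueue<HitWithValue> queue;

    TopHitsWithValue(int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        this.n = n;
        this.queue = new PriorityQueue<>(n, WORST_FIRST);
    }

    /** Offers one hit; it is kept only if it beats the current worst of the top n. */
    void collect(int doc, float score, int value) {
        HitWithValue hit = new HitWithValue(doc, score, value);
        if (queue.size() < n) {
            queue.add(hit);
        } else if (WORST_FIRST.compare(hit, queue.peek()) > 0) {
            queue.poll();
            queue.add(hit);
        }
    }

    /** Returns the collected hits, best first. */
    List<HitWithValue> topHits() {
        List<HitWithValue> hits = new ArrayList<>(queue);
        hits.sort(WORST_FIRST.reversed());
        return hits;
    }
}
```

The extra int travels with the hit through the heap, so the only additional storage is one int per retained hit, and no separate lookup is needed once the results are sorted.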

TX

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org