lucene-java-user mailing list archives

From Eugeny N Dzhurinsky
Subject Distinct search
Date Wed, 11 Oct 2006 14:46:51 GMT
Hi there!

I have an index structure like this:


When searching for some set of documents, several comments for the same
document may match the search criteria. In that case I need to get a single
hit for all of them; in other words, perform a "group by"-like operation
based on document_id. For example, if I have the records

1 : 10 : some text here
2 : 10 : some another text here

and the search string is "+some_text:some", I need to get only one hit for
both of these records (returning only the document_id).
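
To make the desired result concrete, here is a minimal sketch in plain Java
with no Lucene calls. The Hit class, its field names and the
distinctDocumentIds helper are hypothetical, and the first column of each
example record is assumed to be a per-record id. It only illustrates the
"one hit per document_id" behaviour on hits that have already been
collected; it is not a Lucene feature.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DistinctHits {

    /** Hypothetical stand-in for one matching index record. */
    static final class Hit {
        final int recordId;    // assumed meaning of the first column
        final int documentId;  // the field to group by
        final String text;

        Hit(int recordId, int documentId, String text) {
            this.recordId = recordId;
            this.documentId = documentId;
            this.text = text;
        }
    }

    /** Returns each document_id once, in the order it was first seen. */
    static List<Integer> distinctDocumentIds(List<Hit> hits) {
        // LinkedHashSet drops duplicates while keeping insertion order.
        Set<Integer> seen = new LinkedHashSet<>();
        for (Hit hit : hits) {
            seen.add(hit.documentId);
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit(1, 10, "some text here"));
        hits.add(new Hit(2, 10, "some another text here"));
        System.out.println(distinctDocumentIds(hits)); // prints [10]
    }
}
```

Both example records collapse into the single document_id 10, which is the
result I am after.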

I know I could collect all hits and then filter them, but I also need paging
functionality. So if 1,000,000 hits have to be collapsed into 50,000 records,
I need to traverse all 1,000,000 hits, put the 50,000 unique items into a
helper array, then get the last page with 10 results - and it will take a lot of

Eugene N Dzhurinsky
