lucene-java-user mailing list archives

From Eugeny N Dzhurinsky <b...@redwerk.com>
Subject Distinct search
Date Wed, 11 Oct 2006 14:46:51 GMT
Hi there!

I have an index structure like this:

document_id
some_text
.....

When searching for a set of documents, several comments for the same document
may match the search criteria. In that case I need to get a single hit for all
of them, in other words, to perform a "group by"-like operation based on
document_id. For example, if I have the records

1 : 10 : some text here
2 : 10 : some another text here

and the search string is "+some_text:some", I need to get only one hit for both
of these records (returning only the document_id).
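
To make the expected result concrete, here is a minimal sketch in plain Java
with no Lucene calls. The Record class and distinctDocumentIds method are
hypothetical names used only for illustration, the first column of each example
record is assumed to be a record id, and the whitespace split is a stand-in for
whatever analyzer the real index uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DistinctExample {
    // Hypothetical in-memory stand-in for one indexed record.
    static final class Record {
        final int recordId;    // assumed meaning of the first column
        final int documentId;
        final String someText;

        Record(int recordId, int documentId, String someText) {
            this.recordId = recordId;
            this.documentId = documentId;
            this.someText = someText;
        }
    }

    // Returns each matching document_id once, in first-match order.
    static List<Integer> distinctDocumentIds(List<Record> records, String term) {
        Set<Integer> seen = new LinkedHashSet<>();
        for (Record r : records) {
            // Naive whitespace tokenisation; a real analyzer may behave differently.
            List<String> tokens = Arrays.asList(r.someText.toLowerCase().split("\\s+"));
            if (tokens.contains(term)) {
                seen.add(r.documentId);
            }
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<Record> records = Arrays.asList(
            new Record(1, 10, "some text here"),
            new Record(2, 10, "some another text here"));
        // Both records match, but they share document_id 10.
        System.out.println(distinctDocumentIds(records, "some")); // prints [10]
    }
}
```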

I know I could collect all hits and then filter them, but I also need paging
functionality. If 1,000,000 hits collapse into 50,000 unique records, I have to
traverse all 1,000,000 hits and put the 50,000 unique items into a helper
array just to get the last page of 10 results, and that will take a lot of
time.
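
The collect-and-filter approach can be sketched roughly as follows, again in
plain Java without Lucene. The iterator of document_id values stands in for the
raw hit list, and DistinctPaging, PageResult and page are hypothetical names.
The sketch stops as soon as the requested page is full, which helps for early
pages, but the last page still forces a scan of every hit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

public class DistinctPaging {
    // One page of unique document ids plus the number of raw hits scanned for it.
    static final class PageResult {
        final List<Integer> documentIds;
        final int hitsTraversed;

        PageResult(List<Integer> documentIds, int hitsTraversed) {
            this.documentIds = documentIds;
            this.hitsTraversed = hitsTraversed;
        }
    }

    // pageIndex is zero-based; hits are consumed in the order the iterator yields them.
    static PageResult page(Iterator<Integer> hitDocumentIds, int pageIndex, int pageSize) {
        Set<Integer> seen = new HashSet<>();   // the "helper array" of unique ids
        List<Integer> result = new ArrayList<>();
        int skip = pageIndex * pageSize;       // unique ids belonging to earlier pages
        int traversed = 0;
        while (hitDocumentIds.hasNext() && result.size() < pageSize) {
            int documentId = hitDocumentIds.next();
            traversed++;
            // add() returns true only for the first hit of each document_id;
            // seen.size() is then the rank of that id among the unique ids.
            if (seen.add(documentId) && seen.size() > skip) {
                result.add(documentId);
            }
        }
        return new PageResult(result, traversed);
    }

    public static void main(String[] args) {
        List<Integer> hits = Arrays.asList(10, 10, 11, 12, 11, 13, 13, 14);
        PageResult first = page(hits.iterator(), 0, 2);
        System.out.println(first.documentIds + " after " + first.hitsTraversed + " hits");
        PageResult last = page(hits.iterator(), 2, 2);
        System.out.println(last.documentIds + " after " + last.hitsTraversed + " hits");
    }
}
```

With the sample hits in main, the first page is filled after 3 of the 8 hits,
while the last page needs all 8, which is the cost I would like to avoid.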

-- 
Eugene N Dzhurinsky


