lucene-java-user mailing list archives

From "Kevin A. Burton" <bur...@newsmonster.org>
Subject Re: Possible to remove duplicate documents in sort API?
Date Sun, 05 Sep 2004 21:13:51 GMT
Paul Elschot wrote:

>Kevin,
>
>On Sunday 05 September 2004 10:16, Kevin A. Burton wrote:
>
>>I want to sort a result set but perform a group by as well... IE remove
>>duplicate items.
>
>Could you be more precise?

My problem is that I have two machines: one for searching and one for
indexing.

The searcher has an existing index.

The indexer finds an UPDATED document, adds it to a new index, and
pushes that new index over to the searcher.

The searcher then reloads, and when someone performs a search BOTH
documents could show up (including the stale document).

I can't do a delete() on the searcher because the indexer doesn't have
the entire index that the searcher has.

Therefore I wanted to group by document ID, which would suppress the
stale document and prefer the newer doc, but this doesn't seem possible.
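
To make the desired behaviour concrete, here is a minimal sketch of that
grouping done as a post-processing step on the ranked results, outside
Lucene. Everything in it is hypothetical: StaleHitFilter, Hit, and the
docId / indexedAt / score fields are illustrative names, and it assumes
each document carries a shared ID plus some value (here a timestamp)
that says which copy is newer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical post-filter; not part of the Lucene API.
class StaleHitFilter {

    // Minimal stand-in for a search hit. All fields are assumptions:
    // the application would have to store them with each document.
    static final class Hit {
        final String docId;   // shared by the stale and the updated copy
        final long indexedAt; // larger value = newer copy
        final float score;    // relevance, carried along unchanged

        Hit(String docId, long indexedAt, float score) {
            this.docId = docId;
            this.indexedAt = indexedAt;
            this.score = score;
        }
    }

    // Keeps one hit per docId: the copy with the largest indexedAt.
    // The survivor takes the result position of the first copy seen,
    // so the overall ranking order is otherwise preserved.
    static List<Hit> dropStale(List<Hit> rankedHits) {
        Map<String, Integer> slotById = new HashMap<>();
        List<Hit> out = new ArrayList<>();
        for (Hit h : rankedHits) {
            Integer slot = slotById.get(h.docId);
            if (slot == null) {
                slotById.put(h.docId, out.size());
                out.add(h);
            } else if (h.indexedAt > out.get(slot).indexedAt) {
                out.set(slot, h); // newer copy replaces the stale one
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("doc-1", 100L, 0.9f)); // stale copy
        hits.add(new Hit("doc-2", 100L, 0.8f));
        hits.add(new Hit("doc-1", 200L, 0.7f)); // updated copy
        for (Hit h : dropStale(hits)) {
            System.out.println(h.docId + " @ " + h.indexedAt);
        }
    }
}
```

The catch with this approach is that the filter has to see every hit
before it can be sure no newer copy follows, so it costs memory
proportional to the number of distinct IDs in the result set.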

>>Is this possible with the new API?  Seems like a huge drawback to lucene
>>right now.
>
>In case you can define another field that defines what is a duplicate
>by having the same value for duplicates, you can use it as one of the
>SortField's for sorting.

I have this duplicate field...
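
Sorting on that field does not remove anything by itself, but it does
place duplicates next to each other, so a single pass can skip the
repeats. The sketch below illustrates that idea in plain Java rather
than with Lucene's classes; SortedDuplicateSkipper, Row, dupKey and
indexedAt are hypothetical names, and it assumes a second value
(indexedAt) is available to order the copies newest first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper; illustrates the sort-then-skip idea only.
class SortedDuplicateSkipper {

    // Stand-in for a result row; both fields are assumptions.
    static final class Row {
        final String dupKey;  // same value for all copies of a document
        final long indexedAt; // larger value = newer copy

        Row(String dupKey, long indexedAt) {
            this.dupKey = dupKey;
            this.indexedAt = indexedAt;
        }
    }

    // Sorts by dupKey, newest copy first within each key, then keeps
    // only the first row of every run of equal keys.
    static List<Row> newestPerKey(List<Row> rows) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparing((Row r) -> r.dupKey)
                .thenComparing(
                        Comparator.comparingLong((Row r) -> r.indexedAt).reversed()));

        List<Row> out = new ArrayList<>();
        String previousKey = null;
        for (Row r : sorted) {
            if (!r.dupKey.equals(previousKey)) {
                out.add(r); // first row of a run is the newest copy
                previousKey = r.dupKey;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("b", 1L));
        rows.add(new Row("a", 1L));
        rows.add(new Row("a", 2L));
        for (Row r : newestPerKey(rows)) {
            System.out.println(r.dupKey + " @ " + r.indexedAt);
        }
    }
}
```

The trade-off is that the duplicate field has to be the primary sort
key for duplicates to be adjacent, so the results are no longer in
relevance order.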

Kevin


-- 

Please reply using PGP.

    http://peerfear.org/pubkey.asc    
    
    NewsMonster - http://www.newsmonster.org/
    
Kevin A. Burton, Location - San Francisco, CA, Cell - 415.595.9965
       AIM/YIM - sfburtonator,  Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
  IRC - freenode.net #infoanarchy | #p2p-hackers | #newsmonster

