lucene-java-user mailing list archives

From Paul Taylor <paul_t...@fastmail.fm>
Subject Re: Can I use Lucene to retrieve a list of duplicates
Date Fri, 23 Feb 2007 16:19:47 GMT
Thanks, this might do it. But do I need to know the terms beforehand? I
just want to return any terms with a frequency greater than one.
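
A rough sketch of the idea in the question above: walk every term in the index and keep those that occur in more than one document, so no term has to be known in advance. In the Lucene API of this era, `IndexReader.terms()` returns a `TermEnum` over all terms, `TermEnum.docFreq()` reports how many documents contain the current term, and `IndexReader.termDocs(Term)` lists their doc ids. The code below does not call Lucene. It uses a plain in-memory map as a stand-in for the index so that it runs on its own. The class name `DuplicateFinder`, both method names, and the sample values are hypothetical, and it assumes each column value is indexed as a single untokenized term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DuplicateFinder {

    // Stand-in for indexing one table column: maps each distinct value
    // (the "term") to the ids of the rows (the "documents") containing it.
    static Map<String, List<Integer>> indexColumn(List<String> columnValues) {
        Map<String, List<Integer>> postings = new TreeMap<>();
        for (int rowId = 0; rowId < columnValues.size(); rowId++) {
            postings.computeIfAbsent(columnValues.get(rowId), k -> new ArrayList<>())
                    .add(rowId);
        }
        return postings;
    }

    // Enumerates every term and keeps those found in more than one row.
    // With Lucene this would roughly correspond to looping over a TermEnum
    // and testing docFreq() > 1, then reading the ids through TermDocs.
    static Map<String, List<Integer>> findDuplicates(Map<String, List<Integer>> postings) {
        Map<String, List<Integer>> duplicates = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : postings.entrySet()) {
            if (entry.getValue().size() > 1) {
                duplicates.put(entry.getKey(), entry.getValue());
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // Hypothetical values for "Column X".
        List<String> column = Arrays.asList("red", "blue", "red", "green", "blue", "red");
        System.out.println(findDuplicates(indexColumn(column)));
        // prints {blue=[1, 4], red=[0, 2, 5]}
    }
}
```

The row ids in the result are what a table filter would need in order to show only the duplicated records.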

Erick Erickson wrote:
> Sure, you can use the TermDocs/TermEnum classes. Basically, for a term 
> (probably column value in your app) these let you quickly answer the 
> question "which (and how many) documents does this term appear in". 
> What you get is the Lucene doc id, which lets you fetch all the
> information about the documents you want.
>
> Erick
>
> On 2/23/07, Paul Taylor <paul_t100@fastmail.fm> wrote:
>
>     Hi, I have a Java Swing application with a table, and I was considering
>     using Lucene to index the data in the table. One task I'd like to
>     support is for the user to select 'Find Duplicate records for Column X';
>     I would then filter the table to show only records where more than one
>     record has the same value for that column, i.e. duplicates. Is there a
>     way to return all the duplicates from a Lucene index?
>
>     Thanks, Paul Taylor
>
>     ---------------------------------------------------------------------
>     To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>     For additional commands, e-mail: java-user-help@lucene.apache.org



