lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Can I use Lucene to retrieve a list of duplicates
Date Fri, 23 Feb 2007 15:22:38 GMT
Sure, you can use the TermDocs/TermEnum classes. Basically, for a term
(probably column value in your app) these let you quickly answer the
question "which (and how many) documents does this term appear in". What you
get is the Lucene doc id, which let's you fetch all the information about
the documents you want.

Erick

On 2/23/07, Paul Taylor <paul_t100@fastmail.fm> wrote:
>
> Hi I have Java Swing application with a table, I was considering using
> Lucene to index the data in the table. One task Id like to do is for the
> user to select 'Find Duplicate records for Column X', then I would
> filter the table to show only records where there is more than one with
> the same value i.e duplicate for that column. Is there a way to return
> all the duplicates from a Lucene index.
>
> thanks paul Taylor
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message