lucene-solr-user mailing list archives

From Erik Hatcher <erik.hatc...@gmail.com>
Subject Re: Find duplicates
Date Tue, 02 Dec 2014 16:02:50 GMT
Sort of. If you index the full value of the field as a string field type (and you're
looking for truly exact matches), you can facet on that field with facet.mincount=2, and the
facets returned will be the values that occur in more than one document. You'd then have to
drill down on each returned facet value to find the actual docs.
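
A minimal sketch of that approach in Python, assuming the field from the question below is called `Name` and is indexed as a string field. The helper names are hypothetical, and the sample response is hand-written to illustrate Solr's default flat JSON facet layout rather than captured from a real server. The first helper builds the faceting request, the second reads the duplicated values out of the response, and the third builds the drill-down query for one value.

```python
from urllib.parse import urlencode


def duplicate_facet_query(field):
    """Build the query string for faceting on `field` with facet.mincount=2."""
    return urlencode({
        "q": "*:*",            # match every document
        "rows": 0,             # only the facet counts are needed here
        "wt": "json",
        "facet": "true",
        "facet.field": field,
        "facet.mincount": 2,   # keep only values shared by 2+ documents
        "facet.limit": -1,     # do not cap the number of facet values
    })


def duplicated_values(response, field):
    """Turn the flat [value, count, value, count, ...] facet list into a dict."""
    flat = response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


def drill_down_query(field, value):
    """Build the follow-up query string that fetches the docs for one value."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return urlencode({"q": "*:*", "fq": f'{field}:"{escaped}"', "wt": "json"})


# Hand-written sample shaped like a Solr JSON response for the five
# example documents in the question (assumption, not real server output).
sample_response = {
    "facet_counts": {"facet_fields": {"Name": ["Peter", 2, "Jack", 2]}}
}

for value, count in duplicated_values(sample_response, "Name").items():
    print(value, count, drill_down_query("Name", value))
```

Each query string would be appended to the `/select` URL of whatever core holds the documents. Note that `facet.limit=-1` can be expensive on a field with many distinct values, so paging through the facets with `facet.offset` may be a better fit for a large index.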

    Erik

> On Dec 2, 2014, at 10:57 AM, Peter Kirk <pk@alpha-solutions.dk> wrote:
> 
> Hi
> 
> Is it possible to formulate a Solr query which finds all documents which have the same
> value in a particular field?
> Note, I don't know what the value is, I just want to find all documents with duplicate
> values.
> 
> For example, I have 5 documents:
> 
> Doc1: field Name = Peter
> Doc2: field Name = Jack
> Doc3: field Name = Peter
> Doc4: field Name = Paul
> Doc5: field Name = Jack
> 
> 
> If I executed the query, it would find documents Doc1 and Doc3 (Peter is the same), and
> Doc2 and Doc5 (Jack is the same).
> 
> 
> 
> Thanks,
> Peter

