lucene-solr-user mailing list archives

From lee carroll <lee.a.carr...@googlemail.com>
Subject Re: solr equivalent of "select distinct"
Date Mon, 12 Sep 2011 11:45:53 GMT
If you have a limited set of searches that need this, and they act on a
known, limited set of fields, you can concatenate those fields at index
time and then facet on the combined field.

PK   FLD1  FLD2  FLD3  FLD4  FLD5  copy45
AB0  A     B     0     x     y     x y
AB1  A     B     1     x     y     x y
CD0  C     D     0     a     b     a b
CD1  C     D     1     e     f     e f

Faceting on the copy45 field would give you the correct "distinct" term
values (plus their counts).
It's pretty contrived, and limited to cases where you know in advance
which fields you need to concatenate.
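
A rough sketch of the idea in Python, in case it helps. It only simulates
the approach in memory: the combined field is built before "indexing" and
the facet is mimicked with a counter. The helper names are made up for
illustration, and it assumes copy45 would be indexed as an untokenized
(string) field, since faceting on a tokenized field would return the
individual tokens rather than the "x y" pairs. The last lines build the
query parameters for a matching Solr facet request; the field name comes
from the table above, the rest is an assumption about your setup.

```python
from collections import Counter
from urllib.parse import urlencode

# The four example documents from this thread.
docs = [
    {"PK": "AB0", "FLD1": "A", "FLD2": "B", "FLD3": "0", "FLD4": "x", "FLD5": "y"},
    {"PK": "AB1", "FLD1": "A", "FLD2": "B", "FLD3": "1", "FLD4": "x", "FLD5": "y"},
    {"PK": "CD0", "FLD1": "C", "FLD2": "D", "FLD3": "0", "FLD4": "a", "FLD5": "b"},
    {"PK": "CD1", "FLD1": "C", "FLD2": "D", "FLD3": "1", "FLD4": "e", "FLD5": "f"},
]


def add_concat_field(doc, sources=("FLD4", "FLD5"), target="copy45", sep=" "):
    """Return a copy of doc with the source fields joined into one field.

    Hypothetical helper: this would run client side, before the document
    is sent to the index.
    """
    out = dict(doc)
    out[target] = sep.join(doc[name] for name in sources)
    return out


def facet_counts(documents, field):
    """Mimic a facet: count each distinct value of `field`."""
    return Counter(d[field] for d in documents)


indexed = [add_concat_field(d) for d in docs]
counts = facet_counts(indexed, "copy45")

for value, count in counts.items():
    print(value, count)

# Query parameters for the equivalent facet request against Solr.
# rows=0 suppresses the documents so only the facet counts come back.
params = urlencode({
    "q": "*:*",
    "rows": 0,
    "facet": "true",
    "facet.field": "copy45",
    "facet.mincount": 1,
})
print(params)
```

The keys of counts are the "distinct" values and the values are how many
documents share each one, which is what the facet response would carry.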

What is the use case for this? It may be that another approach would fit better.

lee c

On 11 September 2011 22:26, Michael Sokolov <sokolov@ifactory.com> wrote:
> You can get what you want - unique lists of values from docs matching your
> query - for a single field (using facets), but not for the co-occurrence of
> two field values.  So you could combine the two fields together, if you know
> what they are going to be "in advance."  Facets also give you counts, so in
> some special cases, you could get what you want - e.g. you can tell when there
> is only a single pair of values since their counts will be the same and the
> same as the total.  But that's all I can think of.
>
> -Mike
>
> On 9/11/2011 12:39 PM, Mark juszczec wrote:
>>
>> Here's an example:
>>
>> PK   FLD1  FLD2  FLD3  FLD4  FLD5
>> AB0  A     B     0     x     y
>> AB1  A     B     1     x     y
>> CD0  C     D     0     a     b
>> CD1  C     D     1     e     f
>>
>> I want to write a query using only the terms FLD1 and FLD2 and ONLY get
>> back:
>>
>> A B x y
>> C D a b
>> C D e f
>>
>> Since FLD4 and FLD5 are the same for PK=AB0 and AB1, I only want one
>> occurrence of those records.
>>
>> Since FLD4 and FLD5 are different for PK=CD0 and CD1, I want BOTH
>> occurrences of those records.
>>
>
>
