lucene-solr-user mailing list archives

From Alessandro Benedetti <abenede...@apache.org>
Subject Re: Parent/Child (Nested Document) Faceting
Date Thu, 12 Nov 2015 12:06:26 GMT
One last addition, for the multi-level hierarchy case.
I think I have found what we cannot reproduce:

json.facet={
    top_reviewers: {
        type: terms,
        field: author_s,
        facet: {
            reviewCount: "unique(parent_s)",
            facet: {
                type: terms,
                field: id,
                domain: {
                    blockParent: "type_s:book"
                },
                facet: {
                    bookCount: "unique(id)"
                }
            }
        }
    }
}
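
For anyone who wants to replay this request from a script, here is a minimal sketch that writes the same facet as strict JSON (which the relaxed syntax also accepts) and URL-encodes it. The host, the "demo" collection, q=*:* and rows=0 are assumptions borrowed from the URL quoted further down, and build_facet_url is a hypothetical helper, not part of any Solr client.

```python
import json
from urllib.parse import parse_qs, urlencode, urlparse

# The multi-level facet from this message, written as strict JSON.
FACET = {
    "top_reviewers": {
        "type": "terms",
        "field": "author_s",
        "facet": {
            "reviewCount": "unique(parent_s)",
            # A sub-facet that happens to be named "facet", as in the message.
            "facet": {
                "type": "terms",
                "field": "id",
                "domain": {"blockParent": "type_s:book"},
                "facet": {"bookCount": "unique(id)"},
            },
        },
    }
}


def build_facet_url(base_url: str, facet: dict, query: str = "*:*") -> str:
    """Return a select URL carrying the facet in the json.facet parameter."""
    params = {
        "q": query,
        "rows": 0,  # assumption: only the facets are of interest
        "wt": "json",
        "json.facet": json.dumps(facet),
    }
    return base_url + "?" + urlencode(params)


# Assumed endpoint; adjust host and collection to your own setup.
url = build_facet_url("http://localhost:8983/solr/demo/select", FACET)

# Sanity check: decoding the parameter gives back the original facet.
decoded = json.loads(parse_qs(urlparse(url).query)["json.facet"][0])
```

Letting urlencode do the escaping avoids hand-quoting the braces and double quotes on the command line.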

Example response:

"facets":{
    "count":7,
    "top_reviewers":{
      "buckets":[{
          "val":"commenterYonik",
          "count":4,
          "reviewCount":2,
          "facet":{
            "buckets":[{
                "val":"book1",
                "count":1,
                "bookCount":1},
              {
                "val":"book2",
                "count":1,
                "bookCount":1}]}},
        {
          "val":"commenterAlex",
          "count":3,
          "reviewCount":2,
          "facet":{
            "buckets":[{
                "val":"book2",
                "count":1,
                "bookCount":1}]}}]}}}


Ideally, I would like to be able to move the bookCount stat up to the level of
reviewCount (using some sort of path through the ancestors).
Something like:

json.facet={
    top_reviewers: {
        type: terms,
        field: author_s,
        facet: {
            reviewCount: "unique(parent_s)",
            stat: {
                domain: {
                    blockParent: "type_s:review"
                },
                bookCount: "unique(parent_s)"
            }
        }
    }
}
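
Until something like this exists, the same number can be derived on the client from the example response earlier in this message: the number of inner book buckets per reviewer is the number of distinct books. A minimal sketch, assuming the inner terms facet returns every bucket (for instance with a large enough limit); roll_up_book_counts is a hypothetical helper name.

```python
# The "facets" section of the example response, as a Python dict.
response_facets = {
    "count": 7,
    "top_reviewers": {"buckets": [
        {"val": "commenterYonik", "count": 4, "reviewCount": 2,
         "facet": {"buckets": [
             {"val": "book1", "count": 1, "bookCount": 1},
             {"val": "book2", "count": 1, "bookCount": 1}]}},
        {"val": "commenterAlex", "count": 3, "reviewCount": 2,
         "facet": {"buckets": [
             {"val": "book2", "count": 1, "bookCount": 1}]}},
    ]},
}


def roll_up_book_counts(facets):
    """Lift the distinct-book count to the same level as reviewCount."""
    rolled = []
    for bucket in facets["top_reviewers"]["buckets"]:
        # Each inner bucket is one distinct book id for this reviewer.
        books = bucket.get("facet", {}).get("buckets", [])
        rolled.append({
            "val": bucket["val"],
            "reviewCount": bucket["reviewCount"],
            "bookCount": len(books),
        })
    return rolled


for row in roll_up_book_counts(response_facets):
    print(row)
```

This is only a workaround: the roll-up happens after the response arrives, so it cannot be used for sorting or limiting on the Solr side.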

The 3-level edge case could be solved using a parent_s field and the _root_
field.

But in an N-level scenario we would need a way to specify a path in order to
provide this kind of analytics.
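
To make the N-level requirement concrete, here is a small in-memory sketch of the semantics I have in mind: for each author, count the unique ancestors found a given number of levels up. The documents, ids and function names are all hypothetical, and this says nothing about how Solr would implement it.

```python
from collections import defaultdict

# Hypothetical three-level index: id -> (parent id, author or None).
DOCS = {
    "book1": (None, None),
    "book2": (None, None),
    "review1": ("book1", None),
    "review2": ("book2", None),
    "review3": ("book2", None),
    "comment1": ("review1", "dan"),
    "comment2": ("review2", "dan"),
    "comment3": ("review2", "mary"),
    "comment4": ("review3", "mary"),
}


def ancestor(doc_id, levels_up):
    """Follow parent references levels_up times; None if we run past the root."""
    current = doc_id
    for _ in range(levels_up):
        if current is None:
            return None
        current = DOCS[current][0]
    return current


def unique_ancestors_per_author(levels_up):
    """For each author, count the distinct ancestors levels_up above their docs."""
    seen = defaultdict(set)
    for doc_id, (_, author) in DOCS.items():
        if author is None:
            continue  # only the authored (comment) documents are faceted
        found = ancestor(doc_id, levels_up)
        if found is not None:
            seen[author].add(found)
    return {name: len(ids) for name, ids in seen.items()}


print(unique_ancestors_per_author(1))  # distinct reviews per author
print(unique_ancestors_per_author(2))  # distinct books per author
```

With levels_up=2 this is what unique(_root_) already gives in a 3-level model; the missing piece is every level in between once the hierarchy gets deeper.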

Sorry for the spam.


Cheers

On 12 November 2015 at 11:15, Alessandro Benedetti <abenedetti@apache.org>
wrote:

> I was experimenting with a multi-level hierarchy of nested objects.
>
> The _root_ field will always point to the root parent id.
> If I model Books - Reviews - Comments, where do I have the reference to
> the parent?
> I think we are getting closer to understanding the ES functionality.
>
> It should allow searching at level N (comments, for example) and then
> faceting over the unique values of:
> 1) parent - Given my comments on reviews of books, count all the different
> reviews I commented on (wondering how to access the parent of a child)
> 2) grandparent - Given my comments on reviews of books, count all the
> different books whose reviews I commented on (tried this and it is working)
> 3) Any ancestor through the path
>
> Cheers
>
> On 12 November 2015 at 10:25, Alessandro Benedetti <abenedetti@apache.org>
> wrote:
>
>> Hi Mikhail,
>> how about this :
>>
>> json.facet={
>>     top_reviewers: {
>>         type: terms,
>>         field: author_s,
>>         sort: "booksCount desc",
>>         facet: {
>>             booksCount: "unique(_root_)"
>>         }
>>     }
>> }
>>
>> We query on the children (comments) and calculate the facets there.
>> This should satisfy your test requirement:
>>
>>
>> http://localhost:8983/solr/demo/select?q=*:*&wt=json&indent=true&fl=id,comment_t&json.facet={top_reviewers:
>> {type: terms,field: author_s,sort: "booksCount desc",facet: {booksCount:
>> "unique(_root_)"}}}
>>
>> Example Response :
>>
>> "top_reviewers":{
>>       "buckets":[{
>>           "val":"dan",
>>           "count":2,
>>           "booksCount":2},
>>         {
>>           "val":"yonik",
>>           "count":2,
>>           "booksCount":2},
>>         {
>>           "val":"Brandon Sanderson",
>>           "count":1,
>>           "booksCount":1},
>>         {
>>           "val":"Mary",
>>           "count":2,
>>           "booksCount":1}
>>
>> ...
>>
>> I wonder what kinds of scenarios can arise if we consider a multi-level
>> hierarchy...
>>
>> Cheers
>>
>> On 11 November 2015 at 22:26, Mikhail Khludnev <
>> mkhludnev@griddynamics.com> wrote:
>>
>>> I found that the example does not have enough data to reproduce this
>>> functionality. What if Mary left the same comment on the same book
>>> (book2_c4), and then we search for th* across comments:
>>>
>>>
>>> http://localhost:8983/solr/techproducts/select?q=comment_t%3Ath*&wt=csv&indent=true&fl=author_s,comment_t,id
>>>
>>> and get
>>>
>>> author_s,comment_t,id
>>> dan,This book was too long.,book1_c2
>>> yonik,Ahead of its time... I wonder if it helped inspire The
>>> Matrix?,book2_c1
>>> dan,A pizza boy for the Mafia franchise? Really?,book2_c2
>>> mary,Neal is so creative and detailed! Loved the metaverse!,book2_c3
>>> mary,Neal is so creative and detailed! Loved the metaverse!,book2_c4
>>>
>>> Then I wish to calculate the author facet, but count the authors in books
>>> (roll up to parents)!
>>>
>>> dan(2) - commented on both books
>>> yonik(1) - only the second one
>>> mary(1) - only the second one, despite commenting twice
>>>
>>> So far I am only able to get this:
>>>
>>>
>>> localhost:8983/solr/techproducts/select?q=comment_t%3Ath*&wt=json&indent=true&fl=author_s,comment_t,id&json.facet={top_reviewers
>>> : { type: terms, field: author_s}}
>>>
>>> "top_reviewers":{
>>>       "buckets":[{
>>>           "val":"dan",
>>>           "count":2},
>>>         {
>>>           "val":"mary",
>>>           "count":2},
>>>         {
>>>           "val":"yonik",
>>>           "count":1}]}}}
>>>
>>> but that counts comments, mary(2), not books!
>>>
>>> Neither domain: { blockParent : "type_s:book" } nor domain: {
>>> blockChildren : "type_s:book" } helps.
>>>
>>> I tried to nest a facet that specifies a domain, but it is just ignored:
>>>
>>> localhost:8983/solr/techproducts/select?q=comment_t%3Ath*&wt=json&indent=true&fl=author_s,comment_t,id&json.facet={top_reviewers
>>> : { type: terms, field: author_s, in_books : { type: terms, field:
>>> author_s,  domain: { blockParent : \"type_s:book\" }}}}
>>>
>>>
>>>
>>>
>>> On Wed, Nov 11, 2015 at 6:31 PM, Yonik Seeley <yseeley@gmail.com> wrote:
>>>
>>> > On Mon, Nov 9, 2015 at 2:37 PM, Mikhail Khludnev
>>> > <mkhludnev@griddynamics.com> wrote:
>>> > > Yonik,
>>> > >
>>> > > I wonder whether there is a plan or a vision for something like
>>> > > https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-reverse-nested-aggregation.html
>>> > > under JSON facets?
>>> >
>>> > Hmmm, I couldn't quite grok that complicated command syntax... but the
>>> > description seems straight-forward enough:
>>> >
>>> > "The following aggregations will return the top commenters' username
>>> > that have commented and per top commenter the top tags of the issues
>>> > the user has commented on:"
>>> >
>>> > So if I translate that into "books" and "reviews" that I use here:
>>> > http://yonik.com/solr-nested-objects/
>>> >
>>> > it sounds like we start with a set of book objects, then map to the
>>> > child domain to facet on comments, then map back to the parent domain
>>> > to facet on books again.
>>> >
>>> > From that blog, this is the command that finds top review authors:
>>> >
>>> > json.facet={
>>> >   top_reviewers : {
>>> >     type: terms,
>>> >     field: author_s,
>>> >     domain: { blockChildren : "type_s:book" }
>>> >   }
>>> > }
>>> >
>>> > Now we just need to add a sub-facet that switches back to the parent
>>> > domain to facet on something there (like genre... equiv to "tags" in
>>> > the ES example):
>>> >
>>> > json.facet={
>>> >   top_reviewers : {
>>> >     type: terms,
>>> >     field: author_s,
>>> >     domain: { blockChildren : "type_s:book" },
>>> >
>>> >     facet : {
>>> >       type:terms,
>>> >       field:genre,
>>> >       domain:{blockParent:"type_s:book"}
>>> >     }
>>> >
>>> >   }
>>> > }
>>> >
>>> >
>>> >
>>> > While there is certainly more work to be done with joins /
>>> > block-joins, it seems like we can already do that specific example at
>>> > least.
>>> >
>>> > -Yonik
>>> >
>>>
>>>
>>>
>>> --
>>> Sincerely yours
>>> Mikhail Khludnev
>>> Principal Engineer,
>>> Grid Dynamics
>>>
>>> <http://www.griddynamics.com>
>>> <mkhludnev@griddynamics.com>
>>>
>>
>>
>>
>> --
>> --------------------------
>>
>> Benedetti Alessandro
>> Visiting card : http://about.me/alessandro_benedetti
>>
>> "Tyger, tyger burning bright
>> In the forests of the night,
>> What immortal hand or eye
>> Could frame thy fearful symmetry?"
>>
>> William Blake - Songs of Experience -1794 England
>>
>
>
>
>



