lucene-solr-user mailing list archives

From Umesh Prasad <umesh.i...@gmail.com>
Subject Re: Mixing ordinary and nested documents
Date Tue, 22 Jul 2014 14:34:38 GMT
    public static DocSet mapChildDocsToParentOnly(DocSet childDocSet) {

        DocSet mappedParentDocSet = new BitDocSet();
        DocIterator childIterator = childDocSet.iterator();
        while (childIterator.hasNext()) {
            int childDoc = childIterator.nextDoc();
            int parentDoc = childToParentDocMapping[childDoc];
            mappedParentDocSet.addUnique(parentDoc);
        }
        int[] matches = new int[mappedParentDocSet.size()];
        DocIterator parentIter = mappedParentDocSet.iterator();
        for (int i = 0; parentIter.hasNext(); i++) {
            matches[i] = parentIter.nextDoc();
        }
        // You will need the SortedIntDocSet impl here, else docset interaction
        // in some facet queries fails later.
        return new SortedIntDocSet(matches);
    }
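
For readers without a Solr checkout at hand, here is a minimal, Solr-free sketch of the same idea. It is an illustration under stated assumptions, not the code used above: java.util.BitSet stands in for Solr's DocSet, the class and method names are hypothetical, and the NO_PARENT sentinel is an addition that marks trailing documents with no parent after them. Like the mapping loop quoted below, it assumes block indexing where the children come right before their parent, so any document that sits before a parent and does not itself match the parent query is attributed to that parent.

```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical, Solr-free illustration of the child-to-parent mapping idea.
class ChildToParentSketch {

    // Assumption (not in the original code): marks docs with no parent after them.
    static final int NO_PARENT = -1;

    // Builds mapping[doc] = id of the first parent at or after doc.
    // A parent therefore maps to itself, as in the quoted loop.
    static int[] buildChildToParentMapping(BitSet parentDocs, int maxDoc) {
        int[] mapping = new int[maxDoc];
        Arrays.fill(mapping, NO_PARENT);
        int child = 0;
        for (int parent = parentDocs.nextSetBit(0);
             parent >= 0 && parent < maxDoc;
             parent = parentDocs.nextSetBit(parent + 1)) {
            while (child <= parent) {
                mapping[child++] = parent;
            }
        }
        return mapping;
    }

    // Maps a set of child hits to the sorted, de-duplicated ids of their parents.
    static int[] mapChildDocsToParents(BitSet childHits, int[] mapping) {
        BitSet parents = new BitSet(mapping.length);
        for (int doc = childHits.nextSetBit(0); doc >= 0; doc = childHits.nextSetBit(doc + 1)) {
            if (doc < mapping.length && mapping[doc] != NO_PARENT) {
                parents.set(mapping[doc]);
            }
        }
        // BitSet.stream() yields the set bits in ascending order.
        return parents.stream().toArray();
    }

    public static void main(String[] args) {
        // Docs 0-1 are children of parent 2, docs 3-5 are children of parent 6,
        // docs 7-8 have no parent after them.
        BitSet parentDocs = new BitSet();
        parentDocs.set(2);
        parentDocs.set(6);
        int[] mapping = buildChildToParentMapping(parentDocs, 9);

        BitSet childHits = new BitSet();
        childHits.set(0);
        childHits.set(4);
        childHits.set(5);
        System.out.println(Arrays.toString(mapChildDocsToParents(childHits, mapping))); // prints "[2, 6]"
    }
}
```

Collecting the parents in a bit set first gives de-duplication and ascending order for free, which is the property a sorted int doc set relies on.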



On 22 July 2014 19:59, Umesh Prasad <umesh.iitk@gmail.com> wrote:

>             Query parentFilterQuery =
>                 new TermQuery(new Term("document_type", "parent"));
>
>             int[] childToParentDocMapping = new int[searcher.maxDoc()];
>             DocSet allParentDocSet = searcher.getDocSet(parentFilterQuery);
>             DocIterator iter = allParentDocSet.iterator();
>             int child = 0;
>             while (iter.hasNext()) {
>                 int parent = iter.nextDoc();
>                 while (child <= parent) {
>                     childToParentDocMapping[child] = parent;
>                     child++;
>                 }
>             }
>
>
> On 22 July 2014 16:28, Bjørn Axelsen <bjorn.axelsen@fagkommunikation.dk>
> wrote:
>
>> Thanks, Umesh
>>
>> > You can get the parent bitset by running the parent doc type query on
>> > the solr indexsearcher.
>> > Then child bitset by running the child doc type query. Then use these
>> > together to create an int[] where int[i] = parent of i.
>> >
>>
>> Can you kindly add an example? I am not quite sure how to put this into a
>> query.
>>
>> I can easily make the join from child to parent, but what I want to achieve
>> is to get the parent document added to the result if it exists, while
>> maintaining the scoring from the child as well as the full child document.
>> Is this possible?
>>
>> Cheers,
>> Bjørn
>>
>> 2014-07-18 19:00 GMT+02:00 Umesh Prasad <umesh.iitk@gmail.com>:
>>
>> > Comments inline
>> >
>> >
>> > On 16 July 2014 20:31, Bjørn Axelsen <bjorn.axelsen@fagkommunikation.dk>
>> > wrote:
>> >
>> > > Hi Solr users
>> > >
>> > > I would appreciate your input on how to handle a *mix* of *simple* and
>> > > *nested* documents in the easiest and most flexible way.
>> > >
>> > > I need to handle:
>> > >
>> > >    - simple documents: webpages, short articles etc. (approx. 90% of the
>> > >    content)
>> > >    - nested documents: books containing chapters etc. (approx. 10% of the
>> > >    content)
>> > >
>> > > For simple documents I just want to present straightforward search
>> > > results without any grouping etc.
>> > >
>> > > For the nested documents I want to group by book and show book title,
>> > > book price etc. AND the individual results within the book. Let's say
>> > > there is a hit on "Chapter 1" and "Chapter 7" within "Book 1" and a hit
>> > > on "Article 1"; I would like to present this:
>> > >
>> > > *Book 1 title*
>> > > Book 1 published date
>> > > Book 1 description
>> > > - *Chapter 1 title*
>> > >   Chapter 1 snippet
>> > > - *Chapter 7 title*
>> > >   Chapter 7 snippet
>> > >
>> > > *Article 1 title*
>> > > Article 1 published date
>> > > Article 1 description
>> > > Article 1 snippet
>> > >
>> > > It looks like it is pretty straightforward to use the CollapsingQParser
>> > > to collapse the book results into one result and not to collapse the
>> > > other results. But how about showing the information about the book
>> > > (the parent document of the chapters)?
>> > >
>> >
>> > You can map the child documents to the parent doc id space and extract
>> > the information from the parent doc id.
>> >
>> > First you need to generate the child doc to parent doc id mapping one
>> > time. You can get the parent bitset by running the parent doc type query
>> > on the solr indexsearcher, then the child bitset by running the child doc
>> > type query. Then use these together to create an int[] where int[i] =
>> > parent of i. This result is cacheable till the next commit. I am doing
>> > that for computing facets from fields in parent docs and sorting on
>> > values from parent docs (while getting child docs as output).
>> >
>> >
>> >
>> >
>> > > 1) Is there a way to do an *optional block join* to a *parent* document
>> > > and return it together *with* the *child* document - but not to require
>> > > a parent document?
>> > >
>> > > - or -
>> > >
>> > > 2) Do I need to require parent-child documents for everything? This is
>> > > really not my preferred strategy as only a small part of the documents
>> > > is in a real parent-child relationship. This would mean a lot of dummy
>> > > child documents.
>> > >
>> > > - or -
>> > >
>> > > 3) Should I just denormalize data and include the book information
>> > > within each chapter document?
>> > >
>> > > - or -
>> > >
>> > > 4) ... or is there a smarter way?
>> > >
>> > > Your help is very much appreciated.
>> > >
>> > > Cheers,
>> > >
>> > > Bjørn Axelsen
>> > >
>> >
>> >
>> >
>> > --
>> > ---
>> > Thanks & Regards
>> > Umesh Prasad
>> >
>>
>
>
>
> --
> ---
> Thanks & Regards
> Umesh Prasad
>



-- 
---
Thanks & Regards
Umesh Prasad
