lucene-solr-user mailing list archives

From Umesh Prasad <umesh.i...@gmail.com>
Subject Re: Mixing ordinary and nested documents
Date Tue, 22 Jul 2014 14:29:05 GMT
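import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.solr.search.DocIterator;
import org.apache.solr.search.DocSet;

// "searcher" is the SolrIndexSearcher for the current index view.
// Select all parent docs; in a block-join index each parent follows its children.
Query parentFilterQuery = new TermQuery(new Term("document_type", "parent"));

// childToParentDocMapping[i] = doc id of the parent of doc i
int[] childToParentDocMapping = new int[searcher.maxDoc()];
DocSet allParentDocSet = searcher.getDocSet(parentFilterQuery);
DocIterator iter = allParentDocSet.iterator();
int child = 0;
while (iter.hasNext()) {
    int parent = iter.nextDoc();
    // Every doc after the previous parent, up to and including this one,
    // is mapped to this parent.
    while (child <= parent) {
        childToParentDocMapping[child] = parent;
        child++;
    }
}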
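
Note that in an index that mixes ordinary and nested documents, the loop
above also maps every ordinary document to the next parent that follows it,
and leaves any document after the last parent mapped to 0. A minimal,
self-contained sketch of the two-bitset variant described earlier in this
thread is below. It is only an illustration: it uses java.util.BitSet in
place of Solr's DocSet so that it runs without Solr, and the class name, the
NO_PARENT sentinel and the sample doc ids are made up. It assumes that the
children of a parent are indexed as one contiguous block directly before it.

```java
import java.util.Arrays;
import java.util.BitSet;

public class ChildToParentMapping {

    // Hypothetical sentinel for docs that belong to no parent block.
    static final int NO_PARENT = -1;

    // parents/children: one bit per doc id, set if the doc matches the
    // parent (or child) doc type query. Assumes children sit directly
    // before their parent.
    static int[] buildMapping(BitSet parents, BitSet children, int maxDoc) {
        int[] childToParent = new int[maxDoc];
        Arrays.fill(childToParent, NO_PARENT);

        for (int parent = parents.nextSetBit(0); parent >= 0;
                parent = parents.nextSetBit(parent + 1)) {
            // A parent maps to itself, as in the loop above.
            childToParent[parent] = parent;
            // Walk backwards over the contiguous run of child docs.
            for (int doc = parent - 1; doc >= 0 && children.get(doc); doc--) {
                childToParent[doc] = parent;
            }
        }
        return childToParent;
    }

    public static void main(String[] args) {
        // Sample layout: 0 = article, 1-2 = chapters, 3 = book,
        //                4 = article, 5 = chapter, 6 = book
        BitSet parents = new BitSet();
        parents.set(3);
        parents.set(6);
        BitSet children = new BitSet();
        children.set(1);
        children.set(2);
        children.set(5);

        int[] mapping = buildMapping(parents, children, 7);
        System.out.println(Arrays.toString(mapping));
        // prints [-1, 3, 3, 3, -1, 6, 6]
    }
}
```

With a mapping like this, a hit on an ordinary document (value NO_PARENT)
can be returned as-is, while a hit on a chapter can look up its book through
the array and still keep the chapter's own score.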


On 22 July 2014 16:28, Bjørn Axelsen <bjorn.axelsen@fagkommunikation.dk>
wrote:

> Thanks, Umesh
>
> > You can get the parent bitset by running the parent doc type query on
> > the solr indexsearcher.
> > Then get the child bitset by running the child doc type query. Then use
> > these together to create an int[] where int[i] = parent of i.
> >
>
> Can you kindly add an example? I am not quite sure how to put this into a
> query?
>
> I can easily make the join from child to parent, but what I want to achieve
> is to get the parent document added to the result if it exists but maintain
> the scoring from the child as well as the full child document. Is this
> possible?
>
> Cheers,
> Bjørn
>
> 2014-07-18 19:00 GMT+02:00 Umesh Prasad <umesh.iitk@gmail.com>:
>
> > Comments inline
> >
> >
> > On 16 July 2014 20:31, Bjørn Axelsen <bjorn.axelsen@fagkommunikation.dk>
> > wrote:
> >
> > > Hi Solr users
> > >
> > > I would appreciate your input on how to handle a *mix* of *simple* and
> > > *nested* documents in the easiest and most flexible way.
> > >
> > > I need to handle:
> > >
> > >    - simple documents: web pages, short articles etc. (approx. 90% of the
> > >    content)
> > >    - nested documents: books containing chapters etc. (approx. 10% of the
> > >    content)
> > >
> > >
> >
> >
> > > For simple documents I just want to present straightforward search
> > > results without any grouping etc.
> > >
> > > For the nested documents I want to group by book and show book title,
> > > book price etc. AND the individual results within the book. Let's say
> > > there is a hit on "Chapter 1" and "Chapter 7" within "Book 1" and a hit
> > > on "Article 1", I would like to present this:
> > >
> > > *Book 1 title*
> > > Book 1 published date
> > > Book 1 description
> > > - *Chapter 1 title*
> > >   Chapter 1 snippet
> > > - *Chapter 7 title*
> > >   Chapter 7 snippet
> > >
> > > *Article 1 title*
> > > Article 1 published date
> > > Article 1 description
> > > Article 1 snippet
> > >
> > > It looks like it is pretty straightforward to use the CollapsingQParser
> > > to collapse the book results into one result and not to collapse the
> > > other results. But how about showing the information about the book
> > > (the parent document of the chapters)?
> > >
> >
> > You can map the child document to the parent doc id space and extract the
> > information from the parent doc id.
> >
> > First you need to generate the child doc to parent doc id mapping one time.
> > You can get the parent bitset by running the parent doc type query on
> > the solr indexsearcher.
> > Then get the child bitset by running the child doc type query. Then use
> > these together to create an int[] where int[i] = parent of i. This result
> > is cacheable till the next commit. I am doing that for computing facets
> > from fields in parent docs and sorting on values from parent docs (while
> > getting child docs as output).
> >
> >
> >
> >
> > > 1) Is there a way to do an *optional block join* to a *parent* document
> > > and return it together *with* the *child* document - but not to require
> > > a parent document?
> > >
> > > - or -
> > >
> > > 2) Do I need to require parent-child documents for everything? This is
> > > really not my preferred strategy as only a small part of the documents
> > > is in a real parent-child relationship. This would mean a lot of dummy
> > > child documents.
> > >
> > >
> >
> > >
> > > - or -
> > >
> > > 3) Should I just denormalize data and include the book information
> > > within each chapter document?
> > >
> > > - or -
> > >
> > > 4) ... or is there a smarter way?
> > >
> > > Your help is very much appreciated.
> > >
> > > Cheers,
> > >
> > > Bjørn Axelsen
> > >
> >
> >
> >
> > --
> > ---
> > Thanks & Regards
> > Umesh Prasad
> >
>



-- 
---
Thanks & Regards
Umesh Prasad
