lucene-solr-user mailing list archives

From Bjørn Axelsen <bjorn.axel...@fagkommunikation.dk>
Subject Re: Mixing ordinary and nested documents
Date Thu, 24 Jul 2014 10:32:35 GMT
Thank you very much :-)

2014-07-22 16:34 GMT+02:00 Umesh Prasad <umesh.iitk@gmail.com>:

>     public static DocSet mapChildDocsToParentOnly(DocSet childDocSet) {
>
>         DocSet mappedParentDocSet = new BitDocSet();
>         DocIterator childIterator = childDocSet.iterator();
>         while (childIterator.hasNext()) {
>             int childDoc = childIterator.nextDoc();
>             int parentDoc = childToParentDocMapping[childDoc];
>             // Several children can share one parent, so use add(), not addUnique().
>             mappedParentDocSet.add(parentDoc);
>         }
>         int[] matches = new int[mappedParentDocSet.size()];
>         DocIterator parentIter = mappedParentDocSet.iterator();
>         for (int i = 0; parentIter.hasNext(); i++) {
>             matches[i] = parentIter.nextDoc();
>         }
>         // You will need the SortedIntDocSet impl, else docset interaction
>         // in some facet queries fails later.
>         return new SortedIntDocSet(matches);
>     }
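
To make the mapping step above easier to try outside Solr, here is a minimal sketch of the same idea in plain Java. It swaps Solr's DocSet/BitDocSet for java.util.BitSet and plain int arrays, so the class name, method name and sample doc ids are all hypothetical and only the logic (collect each parent once, return the parents in ascending order) mirrors the quoted method.

```java
import java.util.Arrays;
import java.util.BitSet;

// Plain-Java stand-in for the Solr DocSet version; names are illustrative only.
final class ChildHitsToParents {

    /** Returns the distinct parents of the given child docs, in ascending order. */
    static int[] mapToParents(int[] childDocs, int[] childToParent) {
        BitSet parents = new BitSet(childToParent.length);
        for (int child : childDocs) {
            // Setting the same bit twice is harmless, so duplicates collapse.
            parents.set(childToParent[child]);
        }
        // BitSet.stream() yields the set bit indices from lowest to highest.
        return parents.stream().toArray();
    }

    public static void main(String[] args) {
        // Assumed sample index: docs 1,2 belong to parent 3; doc 5 belongs to parent 6.
        int[] childToParent = {0, 3, 3, 3, 4, 6, 6};
        int[] childHits = {5, 1, 2};
        System.out.println(Arrays.toString(mapToParents(childHits, childToParent)));
        // prints [3, 6]
    }
}
```

The sorted int[] result is the shape that a sorted doc set expects, which is why the bit set is read back in ascending order rather than in hit order.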
>
>
>
> On 22 July 2014 19:59, Umesh Prasad <umesh.iitk@gmail.com> wrote:
>
> > Query parentFilterQuery = new TermQuery(new Term("document_type", "parent"));
> >
> > int[] childToParentDocMapping = new int[searcher.maxDoc()];
> > DocSet allParentDocSet = searcher.getDocSet(parentFilterQuery);
> > DocIterator iter = allParentDocSet.iterator();
> > int child = 0;
> > while (iter.hasNext()) {
> >     int parent = iter.nextDoc();
> >     // Assign every doc up to and including this parent to it
> >     // (in a block, children precede their parent).
> >     while (child <= parent) {
> >         childToParentDocMapping[child] = parent;
> >         child++;
> >     }
> > }
> >
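
A note on the loop above for a mixed index: if I read it correctly, it assigns every doc id up to a parent to that parent, which would also sweep up ordinary documents that are not part of any block. The sketch below is one possible variant, not the code from this thread. It uses both a parent and a child bit set, and it assumes (my assumption) that ordinary documents should simply map to themselves. It uses java.util.BitSet instead of Solr doc sets, and all names are hypothetical.

```java
import java.util.Arrays;
import java.util.BitSet;

// Plain-Java sketch; in Solr the two bit sets would come from the
// parent and child doc type queries. Names are illustrative only.
final class ChildToParentMapping {

    /**
     * Builds an array where mapping[doc] is the parent of doc.
     * Assumes block indexing, i.e. children directly precede their parent.
     * Docs that are not children (parents and ordinary docs) map to themselves.
     */
    static int[] build(BitSet parents, BitSet children, int maxDoc) {
        int[] mapping = new int[maxDoc];
        for (int doc = 0; doc < maxDoc; doc++) {
            if (children.get(doc)) {
                // The next parent at or after this child closes its block;
                // nextSetBit returns -1 if no such parent exists.
                mapping[doc] = parents.nextSetBit(doc);
            } else {
                mapping[doc] = doc;
            }
        }
        return mapping;
    }

    public static void main(String[] args) {
        // Assumed sample index: 0 = article, 1-2 = chapters of book 3,
        // 4 = article, 5 = chapter of book 6.
        BitSet parents = new BitSet();
        parents.set(3);
        parents.set(6);
        BitSet children = new BitSet();
        children.set(1);
        children.set(2);
        children.set(5);
        System.out.println(Arrays.toString(build(parents, children, 7)));
        // prints [0, 3, 3, 3, 4, 6, 6]
    }
}
```

As with the original array, this would have to be rebuilt after each commit, since doc ids can change.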
> >
> > On 22 July 2014 16:28, Bjørn Axelsen <bjorn.axelsen@fagkommunikation.dk>
> > wrote:
> >
> >> Thanks, Umesh
> >>
> >> > You can get the parent bitset by running the parent doc type query on
> >> > the solr indexsearcher.
> >> > Then get the child bitset by running the child doc type query. Then use
> >> > these together to create an int[] where int[i] = parent of i.
> >> >
> >>
> >> Can you kindly add an example? I am not quite sure how to put this into a
> >> query?
> >>
> >> I can easily make the join from child to parent, but what I want to achieve
> >> is to get the parent document added to the result if it exists but maintain
> >> the scoring from the child as well as the full child document. Is this
> >> possible?
> >>
> >> Cheers,
> >> Bjørn
> >>
> >> 2014-07-18 19:00 GMT+02:00 Umesh Prasad <umesh.iitk@gmail.com>:
> >>
> >> > Comments inline
> >> >
> >> >
> >> > On 16 July 2014 20:31, Bjørn Axelsen <bjorn.axelsen@fagkommunikation.dk>
> >> > wrote:
> >> >
> >> > > Hi Solr users
> >> > >
> >> > > I would appreciate your inputs on how to handle a *mix* of *simple* and
> >> > > *nested* documents in the easiest and most flexible way.
> >> > >
> >> > > I need to handle:
> >> > >
> >> > >    - simple documents: webpages, short articles etc. (approx. 90% of the
> >> > >    content)
> >> > >    - nested documents: books containing chapters etc. (approx. 10% of the
> >> > >    content)
> >> > >
> >> > > For simple documents I just want to present straightforward search
> >> > > results without any grouping etc.
> >> > >
> >> > > For the nested documents I want to group by book and show book title,
> >> > > book price etc. AND the individual results within the book. Let's say
> >> > > there is a hit on "Chapter 1" and "Chapter 7" within "Book 1" and a hit
> >> > > on "Article 1", I would like to present this:
> >> > >
> >> > > *Book 1 title*
> >> > > Book 1 published date
> >> > > Book 1 description
> >> > > - *Chapter 1 title*
> >> > >   Chapter 1 snippet
> >> > > - *Chapter 7 title*
> >> > >   Chapter 7 snippet
> >> > >
> >> > > *Article 1 title*
> >> > > Article 1 published date
> >> > > Article 1 description
> >> > > Article 1 snippet
> >> > >
> >> > > It looks like it is pretty straightforward to use the CollapsingQParser
> >> > > to collapse the book results into one result and not to collapse the
> >> > > other results. But how about showing the information about the book
> >> > > (the parent document of the chapters)?
> >> > >
> >> >
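
The result layout asked for above (a book header with its matching chapters underneath, and articles listed on their own) can be sketched as a grouping step over the ranked hits. This is only an illustration in plain Java, not Solr code. It assumes a child-to-parent array is already available and that ordinary documents map to themselves in it (my assumption), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: groups ranked hit doc ids under their parent doc id.
final class GroupHitsByParent {

    /**
     * Keys are group heads (a book, or an ordinary doc standing alone),
     * in order of their best-ranked hit. Values are the matching children
     * in rank order; the list is empty for an ordinary doc or a book that
     * matched only by itself.
     */
    static Map<Integer, List<Integer>> group(int[] rankedHits, int[] childToParent) {
        Map<Integer, List<Integer>> groups = new LinkedHashMap<>();
        for (int doc : rankedHits) {
            int head = childToParent[doc];
            List<Integer> members = groups.computeIfAbsent(head, k -> new ArrayList<>());
            if (doc != head) {
                members.add(doc); // a chapter hit listed under its book
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        // Assumed sample index: docs 1,2 are chapters of book 3; doc 4 is an article.
        int[] childToParent = {0, 3, 3, 3, 4, 6, 6};
        int[] rankedHits = {1, 4, 2}; // chapter, article, chapter
        System.out.println(group(rankedHits, childToParent));
        // prints {3=[1, 2], 4=[]}
    }
}
```

Each key can then be used to fetch the book or article fields (title, published date, description), and each value to fetch the chapter titles and snippets.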
> >> > You can map the child document to parent doc id space and extract the
> >> > information from parent doc id.
> >> >
> >> > First you need to generate the child doc to parent doc id mapping one
> >> > time. You can get the parent bitset by running the parent doc type query
> >> > on the solr indexsearcher. Then get the child bitset by running the child
> >> > doc type query. Then use these together to create an int[] where
> >> > int[i] = parent of i. This result is cacheable till the next commit. I am
> >> > doing that for computing facets from fields in parent docs and sorting on
> >> > values from parent docs (while getting child docs as output).
> >> >
> >> >
> >> >
> >> >
> >> > > 1) Is there a way to do an *optional block join* to a *parent* document
> >> > > and return it together *with* the *child* document - but not to require
> >> > > a parent document?
> >> > >
> >> > > - or -
> >> > >
> >> > > 2) Do I need to require parent-child documents for everything? This is
> >> > > really not my preferred strategy as only a small part of the documents
> >> > > is in a real parent-child relationship. This would mean a lot of dummy
> >> > > child documents.
> >> > >
> >> > > - or -
> >> > >
> >> > > 3) Should I just denormalize data and include the book information
> >> > > within each chapter document?
> >> > >
> >> > > - or -
> >> > >
> >> > > 4) ... or is there a smarter way?
> >> > >
> >> > > Your help is very much appreciated.
> >> > >
> >> > > Cheers,
> >> > >
> >> > > Bjørn Axelsen
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> > ---
> >> > Thanks & Regards
> >> > Umesh Prasad
> >> >
