lucene-solr-user mailing list archives

From Gerald Blanck <gerald.bla...@barometerit.com>
Subject Re: Nested Join Queries
Date Wed, 07 Nov 2012 16:40:00 GMT
Thank you, Erick, for your reply. I understand that search is not an RDBMS.
Yes, we do have a huge combinatorial explosion if we de-normalize and
duplicate data. In fact, I believe our use case is exactly what the Solr
developers were trying to solve with the addition of the Join query. And
while the example I gave illustrates the problem we are solving with the
Join functionality, it is simplistic compared to what we actually have.
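
To make the result we expect from query C (quoted below) concrete, here is a minimal in-memory sketch in Java. It does not call any Solr API. The class NestedJoinSketch, the Book and Author types, and the method names are hypothetical stand-ins for the two cores, and the field layout (book.authorid, author.bookid) is assumed from the join parameters in the original message.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class NestedJoinSketch {

    // Hypothetical stand-in for a document in the "book" core.
    static final class Book {
        final String id;
        final String authorId;
        final boolean bestseller;

        Book(String id, String authorId, boolean bestseller) {
            this.id = id;
            this.authorId = authorId;
            this.bestseller = bestseller;
        }
    }

    // Hypothetical stand-in for a document in the "author" core
    // (bookIds plays the role of a multi-valued "bookid" field).
    static final class Author {
        final String id;
        final List<String> bookIds;

        Author(String id, List<String> bookIds) {
            this.id = id;
            this.bookIds = bookIds;
        }
    }

    // Query A: bestseller:true against the book core.
    static Set<String> bestsellerBookIds(List<Book> books) {
        return books.stream()
                .filter(b -> b.bestseller)
                .map(b -> b.id)
                .collect(Collectors.toSet());
    }

    // Query B: authors whose bookid matches the id of a bestselling book.
    static Set<String> authorsOfBestsellers(List<Author> authors, Set<String> bestsellerIds) {
        return authors.stream()
                .filter(a -> a.bookIds.stream().anyMatch(bestsellerIds::contains))
                .map(a -> a.id)
                .collect(Collectors.toSet());
    }

    // Query C: books whose authorid matches the id of an author found by query B.
    static Set<String> booksByBestsellingAuthors(List<Book> books, List<Author> authors) {
        Set<String> authorIds = authorsOfBestsellers(authors, bestsellerBookIds(books));
        return books.stream()
                .filter(b -> authorIds.contains(b.authorId))
                .map(b -> b.id)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<Book> books = Arrays.asList(
                new Book("b1", "a1", true),
                new Book("b2", "a1", false),
                new Book("b3", "a2", false));
        List<Author> authors = Arrays.asList(
                new Author("a1", Arrays.asList("b1", "b2")),
                new Author("a2", Arrays.asList("b3")));

        // Author a1 wrote bestseller b1, so both of a1's books should match.
        System.out.println(booksByBestsellingAuthors(books, authors));
    }
}
```

With this sample data, the two-hop lookup yields books b1 and b2: the join from book to author and back to book is expected to return every book by an author of a bestseller, not only the bestsellers themselves.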

I am still looking for an answer here, if anyone can shed some light. Thanks.


On Sat, Nov 3, 2012 at 9:38 PM, Erick Erickson <erickerickson@gmail.com> wrote:

> I'm going to go a bit sideways on you, partly because I can't answer the
> question <G>...
>
> But every time I see someone doing what looks like substituting "core" for
> "table" and then trying to use Solr like a DB, I get on my soap-box and
> preach...
>
> In this case, consider de-normalizing your DB so you can ask the query in
> terms of search rather than joins, e.g.:
>
> Make each document a combination of the author and the book, with an
> additional field "author_has_written_a_bestseller". Now your query becomes
> a really simple search,
> "author:name AND author_has_written_a_bestseller:true".
> True, this kind of approach isn't as flexible as an RDBMS, but it's a
> _search_ rather than a query. Yes, it replicates data, but unless you have
> a huge combinatorial explosion, that's not a problem.
>
> And the join functionality isn't called "pseudo" for nothing. It was
> written for a specific use-case. It is often expensive, especially when the
> field being joined has many unique values.
>
> FWIW,
> Erick
>
>
> On Fri, Nov 2, 2012 at 11:32 AM, Gerald Blanck <
> gerald.blanck@barometerit.com> wrote:
>
> > At a high level, I have a need to be able to execute a query that joins
> > across cores, and that query during its joining may join back to the
> > originating core.
> >
> > Example:
> > Find all Books written by an Author who has written a best selling Book.
> >
> > In Solr query syntax:
> > A) against the book core - bestseller:true
> > B) against the author core - {!join fromIndex=book from=id to=bookid}bestseller:true
> > C) against the book core - {!join fromIndex=author from=id to=authorid}{!join fromIndex=book from=id to=bookid}bestseller:true
> >
> > A - returns results
> > B - returns results
> > C - does not return results
> >
> > Given that A and C use the same core, I started looking for join code
> > that compares the originating core to the fromIndex and found this
> > in JoinQParserPlugin (line #159).
> >
> >         if (info.getReq().getCore() == fromCore) {
> >           // if this is the same core, use the searcher passed in... otherwise we could be warming and
> >           // get an older searcher from the core.
> >           fromSearcher = searcher;
> >         } else {
> >           // This could block if there is a static warming query with a join in it, and if useColdSearcher is true.
> >           // Deadlock could result if two cores both had useColdSearcher and had joins that used eachother.
> >           // This would be very predictable though (should happen every time if misconfigured)
> >           fromRef = fromCore.getSearcher(false, true, null);
> >
> >           // be careful not to do anything with this searcher that requires the thread local
> >           // SolrRequestInfo in a manner that requires the core in the request to match
> >           fromSearcher = fromRef.get();
> >         }
> >
> > I found that if I were to modify the above code so that it always follows
> > the logic in the else block, I get the results I expect.
> >
> > Can someone explain to me why the code is written as it is?  And if we
> > were to run with only the else block being executed, what type of adverse
> > impacts we might have?
> >
> > Does anyone have other ideas on how to solve this issue?
> >
> > Thanks in advance.
> > -Gerald
> >
>



-- 

*Gerald Blanck*

baro*m*eter*IT*

1331 Tyler Street NE, Suite 100
Minneapolis, MN 55413


612.208.2802

gerald.blanck@barometerit.com
