lucene-solr-user mailing list archives

From Zheng Lin Edwin Yeo <edwinye...@gmail.com>
Subject Re: Joining more than 2 collections
Date Wed, 03 May 2017 15:26:08 GMT
Hi Joel,

Thanks for the clarification.

I would like to check: is this the correct way to do the join? Currently, I
do not get any results after adding the hashJoin for the 3rd,
smallerStream collection (collection3).

http://localhost:8983/solr/collection1/stream?expr=
hashJoin(
  parallel(collection2,
    innerJoin(
      search(collection2,
             q=*:*,
             fl="a_s,b_s,c_s,d_s,e_s",
             sort="a_s asc",
             partitionKeys="a_s",
             rows=200),
      search(collection1,
             q=*:*,
             fl="a_s,f_s,g_s,h_s,i_s,j_s",
             sort="a_s asc",
             partitionKeys="a_s",
             rows=200),
      on="a_s"),
    workers="2",
    sort="a_s asc"),
  hashed=search(collection3,
                q=*:*,
                fl="a_s,k_s,l_s",
                sort="a_s asc",
                rows=200),
  on="a_s")
&indent=true
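
Because the expression above contains quotes, spaces, and line breaks, it may be easier to send it URL-encoded than to paste it by hand. Below is a minimal Python sketch, assuming the host, port, and collection names from the URL above. The helper names build_stream_url and extract_expr are hypothetical and only illustrate the encoding step. The sketch does not contact Solr, and it does not by itself explain the empty result.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# The streaming expression from the message, collapsed to a single string.
EXPR = (
    'hashJoin('
    'parallel(collection2,'
    'innerJoin('
    'search(collection2,q=*:*,fl="a_s,b_s,c_s,d_s,e_s",'
    'sort="a_s asc",partitionKeys="a_s",rows=200),'
    'search(collection1,q=*:*,fl="a_s,f_s,g_s,h_s,i_s,j_s",'
    'sort="a_s asc",partitionKeys="a_s",rows=200),'
    'on="a_s"),'
    'workers="2",sort="a_s asc"),'
    'hashed=search(collection3,q=*:*,fl="a_s,k_s,l_s",'
    'sort="a_s asc",rows=200),'
    'on="a_s")'
)


def build_stream_url(base, collection, expr):
    """Return a /stream URL with the expression percent-encoded."""
    query = urlencode({"expr": expr, "indent": "true"})
    return f"{base}/{collection}/stream?{query}"


def extract_expr(url):
    """Decode the expr parameter again, to confirm nothing was lost."""
    return parse_qs(urlsplit(url).query)["expr"][0]


if __name__ == "__main__":
    url = build_stream_url("http://localhost:8983/solr", "collection1", EXPR)
    print(url)
    # Cheap sanity checks before sending the request.
    print("round trip ok:", extract_expr(url) == EXPR)
    print("parentheses balanced:", EXPR.count("(") == EXPR.count(")"))
```

The round-trip and parenthesis checks are only a quick way to rule out a mangled expression before looking at the join itself.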


Regards,
Edwin


On 3 May 2017 at 20:59, Joel Bernstein <joelsolr@gmail.com> wrote:

> Sorry, it's just called hashJoin
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Wed, May 3, 2017 at 2:45 AM, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com>
> wrote:
>
> > Hi Joel,
> >
> > I get this error when I use innerHashJoin.
> >
> >  "EXCEPTION":"Invalid stream expression innerHashJoin(parallel(innerJoin
> >
> > I also can't find the documentation on innerHashJoin for the Streaming
> > Expressions.
> >
> > Are you referring to hashJoin?
> >
> > Regards,
> > Edwin
> >
> >
> > On 3 May 2017 at 13:20, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com> wrote:
> >
> > > Hi Joel,
> > >
> > > Thanks for the info.
> > >
> > > Regards,
> > > Edwin
> > >
> > >
> > > On 3 May 2017 at 02:04, Joel Bernstein <joelsolr@gmail.com> wrote:
> > >
> > >> Also take a look at the documentation for the "fetch" streaming
> > >> expression.
> > >>
> > >> Joel Bernstein
> > >> http://joelsolr.blogspot.com/
> > >>
> > >> On Tue, May 2, 2017 at 2:03 PM, Joel Bernstein <joelsolr@gmail.com>
> > >> wrote:
> > >>
> > >> > Yes, you can join more than one collection with Streaming Expressions.
> > >> > Here are a few things to keep in mind.
> > >> >
> > >> > * You'll likely want to use the parallel function around the largest
> > >> >   join. You'll need to use the join keys as the partitionKeys.
> > >> > * innerJoin: requires that the streams be sorted on the join keys.
> > >> > * innerHashJoin: has no sorting requirement.
> > >> >
> > >> > So a strategy for a three collection join might look like this:
> > >> >
> > >> > innerHashJoin(parallel(innerJoin(bigStream, bigStream)), smallerStream)
> > >> >
> > >> > The largest join can be done in parallel using an innerJoin. You can
> > >> > then wrap the stream coming out of the parallel function in an
> > >> > innerHashJoin to join it to another stream.
> > >> >
> > >> > Joel Bernstein
> > >> > http://joelsolr.blogspot.com/
> > >> >
> > >> > On Mon, May 1, 2017 at 9:42 PM, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com>
> > >> > wrote:
> > >> >
> > >> >> Hi,
> > >> >>
> > >> >> Is it possible to join more than 2 collections using one of the
> > >> >> streaming expressions (e.g. innerJoin)? If not, are there other ways
> > >> >> we can do it?
> > >> >>
> > >> >> Currently, I may need to join 3 or 4 collections together and output
> > >> >> selected fields from all of these collections.
> > >> >>
> > >> >> I'm using Solr 6.4.2.
> > >> >>
> > >> >> Regards,
> > >> >> Edwin
> > >> >>
> > >> >
> > >> >
> > >>
> > >
> > >
> >
>
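
The distinction drawn earlier in the thread (innerJoin needs both streams sorted on the join key, hashJoin does not) can be illustrated with a small Python sketch. This models only the general merge-join and hash-join techniques on in-memory lists, under the assumption that each tuple is a dict. It is not Solr's implementation, and the function names merge_join and hash_join and the sample data are made up for illustration.

```python
from collections import defaultdict


def merge_join(left, right, key):
    """Inner join of two lists of dicts, BOTH sorted ascending on key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right-hand rows sharing this key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair every left row in the run with every right row in the run.
            while i < len(left) and left[i][key] == lk:
                for r in right[j:j_end]:
                    out.append({**left[i], **r})
                i += 1
            j = j_end
    return out


def hash_join(stream, hashed, key):
    """Inner join that loads `hashed` into memory; no sort order needed."""
    table = defaultdict(list)
    for row in hashed:
        table[row[key]].append(row)
    out = []
    for row in stream:
        for match in table.get(row[key], []):
            out.append({**row, **match})
    return out


# Hypothetical sample data using the field names from the thread.
big1 = [{"a_s": "1", "b_s": "x"}, {"a_s": "2", "b_s": "y"}, {"a_s": "3", "b_s": "z"}]
big2 = [{"a_s": "1", "f_s": "p"}, {"a_s": "3", "f_s": "q"}]
small = [{"a_s": "3", "k_s": "m"}, {"a_s": "1", "k_s": "n"}]  # deliberately unsorted

joined = hash_join(merge_join(big1, big2, "a_s"), small, "a_s")
```

The two large inputs are merged while sorted, and the smaller, unsorted input is held in a hash table, which mirrors the shape hashJoin(parallel(innerJoin(bigStream, bigStream)), smallerStream) suggested in the thread.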
