lucene-solr-user mailing list archives

From Zheng Lin Edwin Yeo <edwinye...@gmail.com>
Subject Re: Joining more than 2 collections
Date Thu, 04 May 2017 16:28:53 GMT
Hi Joel,

For the join queries, is it true that if we use q=*:* as the query for one
of the joins, there will not be any results returned?

Currently I find this is the case if I just put q=*:*.

Regards,
Edwin


On 4 May 2017 at 23:38, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com> wrote:

> Hi Joel,
>
> I think that might be one of the reasons.
> This is what I have for the /export handler in my solrconfig.xml
>
> <requestHandler name="/export" class="solr.SearchHandler">
>   <lst name="invariants">
>     <str name="rq">{!xport}</str>
>     <str name="wt">xsort</str>
>     <str name="distrib">false</str>
>   </lst>
>   <arr name="components">
>     <str>query</str>
>   </arr>
> </requestHandler>
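The {!xport}/xsort pair in the override above is the older (pre-6.x) style of the /export definition; Solr 6.x registers /export implicitly, and a leftover explicit override can shadow that. A rough sketch of a 6.x-style definition is below (an approximation for illustration, not the exact shipped config); the key difference is that the response is written as JSON rather than through the old xsort writer:

```xml
<!-- Approximate 6.x-era /export registration (illustrative sketch only). -->
<requestHandler name="/export" class="solr.SearchHandler">
  <lst name="invariants">
    <str name="rq">{!xport}</str>   <!-- streams the full sorted result set -->
    <str name="wt">json</str>       <!-- JSON, not the older xsort writer -->
    <str name="distrib">false</str>
  </lst>
  <arr name="components">
    <str>query</str>
  </arr>
</requestHandler>
```

An unrecognized wt falls back to the default XML writer, which would be consistent with the JSONParser error further down in this message (the client's JSON parser choking on `<?xml ...`).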
>
> This is the error message that I get when I use the /export handler.
>
> java.io.IOException: java.util.concurrent.ExecutionException: java.io.IOException: --> http://localhost:8983/solr/collection1_shard1_replica1/: An exception has occurred on the server, refer to server log for details.
> at org.apache.solr.client.solrj.io.stream.CloudSolrStream.openStreams(CloudSolrStream.java:451)
> at org.apache.solr.client.solrj.io.stream.CloudSolrStream.open(CloudSolrStream.java:308)
> at org.apache.solr.client.solrj.io.stream.PushBackStream.open(PushBackStream.java:70)
> at org.apache.solr.client.solrj.io.stream.JoinStream.open(JoinStream.java:147)
> at org.apache.solr.client.solrj.io.stream.ExceptionStream.open(ExceptionStream.java:51)
> at org.apache.solr.handler.StreamHandler$TimerStream.open(StreamHandler.java:457)
> at org.apache.solr.client.solrj.io.stream.TupleStream.writeMap(TupleStream.java:63)
> at org.apache.solr.response.JSONWriter.writeMap(JSONResponseWriter.java:547)
> at org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:193)
> at org.apache.solr.response.JSONWriter.writeNamedListAsMapWithDups(JSONResponseWriter.java:209)
> at org.apache.solr.response.JSONWriter.writeNamedList(JSONResponseWriter.java:325)
> at org.apache.solr.response.JSONWriter.writeResponse(JSONResponseWriter.java:120)
> at org.apache.solr.response.JSONResponseWriter.write(JSONResponseWriter.java:71)
> at org.apache.solr.response.QueryResponseWriterUtil.writeQueryResponse(QueryResponseWriterUtil.java:65)
> at org.apache.solr.servlet.HttpSolrCall.writeResponse(HttpSolrCall.java:732)
> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:473)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:296)
> at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
> at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
> at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
> at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
> at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> at org.eclipse.jetty.server.Server.handle(Server.java:534)
> at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
> at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
> at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
> at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
> at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
> at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
> at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
> at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
> at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.ExecutionException: java.io.IOException: --> http://localhost:8983/solr/collection1_shard1_replica1/: An exception has occurred on the server, refer to server log for details.
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at org.apache.solr.client.solrj.io.stream.CloudSolrStream.openStreams(CloudSolrStream.java:445)
> ... 42 more
> Caused by: java.io.IOException: --> http://localhost:8983/solr/collection1_shard1_replica1/: An exception has occurred on the server, refer to server log for details.
> at org.apache.solr.client.solrj.io.stream.SolrStream.read(SolrStream.java:238)
> at org.apache.solr.client.solrj.io.stream.CloudSolrStream$TupleWrapper.next(CloudSolrStream.java:541)
> at org.apache.solr.client.solrj.io.stream.CloudSolrStream$StreamOpener.call(CloudSolrStream.java:564)
> at org.apache.solr.client.solrj.io.stream.CloudSolrStream$StreamOpener.call(CloudSolrStream.java:551)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> ... 1 more
> Caused by: org.noggit.JSONParser$ParseException: JSON Parse Error: char=<,position=0 BEFORE='<' AFTER='?xml version="1.0" encoding="UTF-8"?> <'
> at org.noggit.JSONParser.err(JSONParser.java:356)
> at org.noggit.JSONParser.handleNonDoubleQuoteString(JSONParser.java:712)
> at org.noggit.JSONParser.next(JSONParser.java:886)
> at org.noggit.JSONParser.nextEvent(JSONParser.java:930)
> at org.apache.solr.client.solrj.io.stream.JSONTupleStream.expect(JSONTupleStream.java:97)
> at org.apache.solr.client.solrj.io.stream.JSONTupleStream.advanceToDocs(JSONTupleStream.java:179)
> at org.apache.solr.client.solrj.io.stream.JSONTupleStream.next(JSONTupleStream.java:77)
> at org.apache.solr.client.solrj.io.stream.SolrStream.read(SolrStream.java:207)
> ... 8 more
>
>
> Regards,
> Edwin
>
>
> On 4 May 2017 at 22:54, Joel Bernstein <joelsolr@gmail.com> wrote:
>
>> I suspect that there is something not quite right about how the /export
>> handler is configured. Straight out of the box in Solr 6.4.2, /export will
>> be configured automatically. Are you using a Solr instance that has been
>> upgraded in the past and doesn't have the standard 6.4.2 configs?
>>
>> To do joins properly you'll have to use the /export handler, because
>> /select will not stream entire result sets (unless they are pretty small),
>> so your results could be missing data.
>>
>> I would take a close look at the logs and see what all the exceptions are
>> when you run a search using qt=/export. If you can post all the stack
>> traces that get generated when you run the search, we'll probably be able
>> to spot the issue.
>>
>> About the field ordering: there is support for field ordering in the
>> Streaming classes, but only a few places actually enforce the order. The
>> 6.5 SQL interface does keep the fields in order, as does the new Tuple
>> expression in Solr 6.6, but the expressions you are working with currently
>> don't enforce field ordering.
>>
>>
>>
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>> On Thu, May 4, 2017 at 2:41 AM, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com> wrote:
>>
>> > Hi Joel,
>> >
>> > I have managed to get the join to work, but so far it is only working
>> > when I use qt="/select". It is not working when I use qt="/export".
>> >
>> > For the display of the fields, is there a way to list them in the order
>> > that I want? Currently the ordering is quite random: I can get a field
>> > from collection1, followed by a field from collection3, then collection1
>> > again, and then collection2.
>> >
>> > It would be good if we could arrange the fields to display in the order
>> > that we want.
>> >
>> > Regards,
>> > Edwin
>> >
>> >
>> >
>> > On 4 May 2017 at 09:56, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com> wrote:
>> >
>> > > Hi Joel,
>> > >
>> > > It works when I start off with just one expression.
>> > >
>> > > Could it be that the data size is too big to export after the join,
>> > > which causes the error?
>> > >
>> > > Regards,
>> > > Edwin
>> > >
>> > > On 4 May 2017 at 02:53, Joel Bernstein <joelsolr@gmail.com> wrote:
>> > >
>> > >> I was just testing with the query below and it worked for me. Some of
>> > >> the error messages I was getting with the wrong syntax were not what I
>> > >> was expecting, though, so I'll look into the error handling. But the
>> > >> joins do work when the syntax is correct. The query below joins the
>> > >> same collection three times, but the mechanics are exactly the same as
>> > >> joining three different tables. In this example each join narrows down
>> > >> the result set.
>> > >>
>> > >> hashJoin(parallel(collection2,
>> > >>                   workers=3,
>> > >>                   sort="id asc",
>> > >>                   innerJoin(search(collection2, q="*:*", fl="id",
>> > >>                                    sort="id asc", qt="/export",
>> > >>                                    partitionKeys="id"),
>> > >>                             search(collection2, q="year_i:42",
>> > >>                                    fl="id, year_i", sort="id asc",
>> > >>                                    qt="/export", partitionKeys="id"),
>> > >>                             on="id")),
>> > >>          hashed=search(collection2, q="day_i:7", fl="id, day_i",
>> > >>                        sort="id asc", qt="/export"),
>> > >>          on="id")
>> > >>
>> > >> Joel Bernstein
>> > >> http://joelsolr.blogspot.com/
>> > >>
>> > >> On Wed, May 3, 2017 at 1:29 PM, Joel Bernstein <joelsolr@gmail.com>
>> > >> wrote:
>> > >>
>> > >> > Start off with just this expression:
>> > >> >
>> > >> > search(collection2,
>> > >> >             q=*:*,
>> > >> >             fl="a_s,b_s,c_s,d_s,e_s",
>> > >> >             sort="a_s asc",
>> > >> >             qt="/export")
>> > >> >
>> > >> > And then check the logs for exceptions.
>> > >> >
>> > >> > Joel Bernstein
>> > >> > http://joelsolr.blogspot.com/
>> > >> >
>> > >> > On Wed, May 3, 2017 at 12:35 PM, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com> wrote:
>> > >> >
>> > >> >> Hi Joel,
>> > >> >>
>> > >> >> I am getting this error after I added qt=/export and removed the
>> > >> >> rows param. Do you know what could be the reason?
>> > >> >>
>> > >> >> {
>> > >> >>   "error":{
>> > >> >>     "metadata":[
>> > >> >>       "error-class","org.apache.solr.common.SolrException",
>> > >> >>       "root-error-class","org.apache.http.MalformedChunkCodingException"],
>> > >> >>     "msg":"org.apache.http.MalformedChunkCodingException: CRLF expected at end of chunk",
>> > >> >>     "trace":"org.apache.solr.common.SolrException: org.apache.http.MalformedChunkCodingException: CRLF expected at end of chunk\r\n\tat
>> > >> >> org.apache.solr.client.solrj.io.stream.TupleStream.lambda$writeMap$0(TupleStream.java:79)\r\n\tat
>> > >> >> org.apache.solr.response.JSONWriter.writeIterator(JSONResponseWriter.java:523)\r\n\tat
>> > >> >> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:175)\r\n\tat
>> > >> >> org.apache.solr.response.JSONWriter$2.put(JSONResponseWriter.java:559)\r\n\tat
>> > >> >> org.apache.solr.client.solrj.io.stream.TupleStream.writeMap(TupleStream.java:64)\r\n\tat
>> > >> >> org.apache.solr.response.JSONWriter.writeMap(JSONResponseWriter.java:547)\r\n\tat
>> > >> >> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:193)\r\n\tat
>> > >> >> org.apache.solr.response.JSONWriter.writeNamedListAsMapWithDups(JSONResponseWriter.java:209)\r\n\tat
>> > >> >> org.apache.solr.response.JSONWriter.writeNamedList(JSONResponseWriter.java:325)\r\n\tat
>> > >> >> org.apache.solr.response.JSONWriter.writeResponse(JSONResponseWriter.java:120)\r\n\tat
>> > >> >> org.apache.solr.response.JSONResponseWriter.write(JSONResponseWriter.java:71)\r\n\tat
>> > >> >> org.apache.solr.response.QueryResponseWriterUtil.writeQueryResponse(QueryResponseWriterUtil.java:65)\r\n\tat
>> > >> >> org.apache.solr.servlet.HttpSolrCall.writeResponse(HttpSolrCall.java:732)\r\n\tat
>> > >> >> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:473)\r\n\tat
>> > >> >> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)\r\n\tat
>> > >> >> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:296)\r\n\tat
>> > >> >> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)\r\n\tat
>> > >> >> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)\r\n\tat
>> > >> >> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\r\n\tat
>> > >> >> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\r\n\tat
>> > >> >> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)\r\n\tat
>> > >> >> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)\r\n\tat
>> > >> >> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)\r\n\tat
>> > >> >> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\r\n\tat
>> > >> >> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)\r\n\tat
>> > >> >> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\r\n\tat
>> > >> >> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)\r\n\tat
>> > >> >> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)\r\n\tat
>> > >> >> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\r\n\tat
>> > >> >> org.eclipse.jetty.server.Server.handle(Server.java:534)\r\n\tat
>> > >> >> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)\r\n\tat
>> > >> >> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)\r\n\tat
>> > >> >> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)\r\n\tat
>> > >> >> org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)\r\n\tat
>> > >> >> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)\r\n\tat
>> > >> >> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)\r\n\tat
>> > >> >> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)\r\n\tat
>> > >> >> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)\r\n\tat
>> > >> >> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)\r\n\tat
>> > >> >> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)\r\n\tat
>> > >> >> java.lang.Thread.run(Thread.java:745)\r\nCaused by:
>> > >> >> org.apache.http.MalformedChunkCodingException: CRLF expected at end of chunk\r\n\tat
>> > >> >> org.apache.http.impl.io.ChunkedInputStream.getChunkSize(ChunkedInputStream.java:255)\r\n\tat
>> > >> >> org.apache.http.impl.io.ChunkedInputStream.nextChunk(ChunkedInputStream.java:227)\r\n\tat
>> > >> >> org.apache.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:186)\r\n\tat
>> > >> >> org.apache.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:215)\r\n\tat
>> > >> >> org.apache.http.impl.io.ChunkedInputStream.close(ChunkedInputStream.java:316)\r\n\tat
>> > >> >> org.apache.http.conn.BasicManagedEntity.streamClosed(BasicManagedEntity.java:164)\r\n\tat
>> > >> >> org.apache.http.conn.EofSensorInputStream.checkClose(EofSensorInputStream.java:228)\r\n\tat
>> > >> >> org.apache.http.conn.EofSensorInputStream.close(EofSensorInputStream.java:174)\r\n\tat
>> > >> >> sun.nio.cs.StreamDecoder.implClose(StreamDecoder.java:378)\r\n\tat
>> > >> >> sun.nio.cs.StreamDecoder.close(StreamDecoder.java:193)\r\n\tat
>> > >> >> java.io.InputStreamReader.close(InputStreamReader.java:199)\r\n\tat
>> > >> >> org.apache.solr.client.solrj.io.stream.JSONTupleStream.close(JSONTupleStream.java:92)\r\n\tat
>> > >> >> org.apache.solr.client.solrj.io.stream.SolrStream.close(SolrStream.java:193)\r\n\tat
>> > >> >> org.apache.solr.client.solrj.io.stream.CloudSolrStream.close(CloudSolrStream.java:464)\r\n\tat
>> > >> >> org.apache.solr.client.solrj.io.stream.HashJoinStream.close(HashJoinStream.java:231)\r\n\tat
>> > >> >> org.apache.solr.client.solrj.io.stream.ExceptionStream.close(ExceptionStream.java:93)\r\n\tat
>> > >> >> org.apache.solr.handler.StreamHandler$TimerStream.close(StreamHandler.java:452)\r\n\tat
>> > >> >> org.apache.solr.client.solrj.io.stream.TupleStream.lambda$writeMap$0(TupleStream.java:71)\r\n\t... 40 more\r\n",
>> > >> >>     "code":500}}
>> > >> >>
>> > >> >>
>> > >> >> Regards,
>> > >> >> Edwin
>> > >> >>
>> > >> >>
>> > >> >> On 4 May 2017 at 00:00, Joel Bernstein <joelsolr@gmail.com> wrote:
>> > >> >>
>> > >> >> > I've reformatted the expression below and made a few changes. You
>> > >> >> > have put things together properly. But these are MapReduce joins
>> > >> >> > that require exporting the entire result sets, so you will need to
>> > >> >> > add qt=/export to all the searches and remove the rows param. In
>> > >> >> > Solr 6.6 there is a new "shuffle" expression that does this
>> > >> >> > automatically.
>> > >> >> >
>> > >> >> > To test things you'll want to break down each expression and make
>> > >> >> > sure it's behaving as expected.
>> > >> >> >
>> > >> >> > For example, first run each search. Then run the innerJoin, not in
>> > >> >> > parallel mode. Then run it in parallel mode. Then try the whole
>> > >> >> > thing.
>> > >> >> >
>> > >> >> > hashJoin(parallel(collection2,
>> > >> >> >                   innerJoin(search(collection2,
>> > >> >> >                                    q=*:*,
>> > >> >> >                                    fl="a_s,b_s,c_s,d_s,e_s",
>> > >> >> >                                    sort="a_s asc",
>> > >> >> >                                    partitionKeys="a_s",
>> > >> >> >                                    qt="/export"),
>> > >> >> >                             search(collection1,
>> > >> >> >                                    q=*:*,
>> > >> >> >                                    fl="a_s,f_s,g_s,h_s,i_s,j_s",
>> > >> >> >                                    sort="a_s asc",
>> > >> >> >                                    partitionKeys="a_s",
>> > >> >> >                                    qt="/export"),
>> > >> >> >                             on="a_s"),
>> > >> >> >                   workers="2",
>> > >> >> >                   sort="a_s asc"),
>> > >> >> >          hashed=search(collection3,
>> > >> >> >                        q=*:*,
>> > >> >> >                        fl="a_s,k_s,l_s",
>> > >> >> >                        sort="a_s asc",
>> > >> >> >                        qt="/export"),
>> > >> >> >          on="a_s")
>> > >> >> >
>> > >> >> > Joel Bernstein
>> > >> >> > http://joelsolr.blogspot.com/
>> > >> >> >
>> > >> >> > On Wed, May 3, 2017 at 11:26 AM, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com> wrote:
>> > >> >> >
>> > >> >> > > Hi Joel,
>> > >> >> > >
>> > >> >> > > Thanks for the clarification.
>> > >> >> > >
>> > >> >> > > Would like to check: is this the correct way to do the join?
>> > >> >> > > Currently, I could not get any results after putting in the
>> > >> >> > > hashJoin for the 3rd, smallerStream collection (collection3).
>> > >> >> > >
>> > >> >> > > http://localhost:8983/solr/collection1/stream?expr=
>> > >> >> > > hashJoin(parallel(collection2,
>> > >> >> > >                   innerJoin(search(collection2,
>> > >> >> > >                                    q=*:*,
>> > >> >> > >                                    fl="a_s,b_s,c_s,d_s,e_s",
>> > >> >> > >                                    sort="a_s asc",
>> > >> >> > >                                    partitionKeys="a_s",
>> > >> >> > >                                    rows=200),
>> > >> >> > >                             search(collection1,
>> > >> >> > >                                    q=*:*,
>> > >> >> > >                                    fl="a_s,f_s,g_s,h_s,i_s,j_s",
>> > >> >> > >                                    sort="a_s asc",
>> > >> >> > >                                    partitionKeys="a_s",
>> > >> >> > >                                    rows=200),
>> > >> >> > >                             on="a_s"),
>> > >> >> > >                   workers="2",
>> > >> >> > >                   sort="a_s asc"),
>> > >> >> > >          hashed=search(collection3,
>> > >> >> > >                        q=*:*,
>> > >> >> > >                        fl="a_s,k_s,l_s",
>> > >> >> > >                        sort="a_s asc",
>> > >> >> > >                        rows=200),
>> > >> >> > >          on="a_s")
>> > >> >> > > &indent=true
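Expressions this long are easier to send URL-encoded from a small script than typed into a browser address bar. A minimal sketch follows; the host, collection, and field names simply mirror the examples in this thread (with qt="/export" and no rows param, per the advice above), so treat them as assumptions about the poster's setup:

```python
# Sketch: build the /stream request for the three-collection join discussed here.
from urllib.parse import urlencode

expr = """hashJoin(parallel(collection2,
                  innerJoin(search(collection2, q=*:*, fl="a_s,b_s,c_s,d_s,e_s",
                                   sort="a_s asc", partitionKeys="a_s", qt="/export"),
                            search(collection1, q=*:*, fl="a_s,f_s,g_s,h_s,i_s,j_s",
                                   sort="a_s asc", partitionKeys="a_s", qt="/export"),
                            on="a_s"),
                  workers="2", sort="a_s asc"),
         hashed=search(collection3, q=*:*, fl="a_s,k_s,l_s",
                       sort="a_s asc", qt="/export"),
         on="a_s")"""

# urlencode percent-escapes the parentheses, quotes, and newlines for us.
url = "http://localhost:8983/solr/collection1/stream?" + urlencode(
    {"expr": expr, "indent": "true"})
print(url[:80])
```

Against a live cluster the resulting URL can then be fetched with `urllib.request.urlopen(url)`.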
>> > >> >> > >
>> > >> >> > >
>> > >> >> > > Regards,
>> > >> >> > > Edwin
>> > >> >> > >
>> > >> >> > >
>> > >> >> > > On 3 May 2017 at 20:59, Joel Bernstein <joelsolr@gmail.com> wrote:
>> > >> >> > >
>> > >> >> > > > Sorry, it's just called hashJoin
>> > >> >> > > >
>> > >> >> > > > Joel Bernstein
>> > >> >> > > > http://joelsolr.blogspot.com/
>> > >> >> > > >
>> > >> >> > > > On Wed, May 3, 2017 at 2:45 AM, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com> wrote:
>> > >> >> > > >
>> > >> >> > > > > Hi Joel,
>> > >> >> > > > >
>> > >> >> > > > > I am getting this error when I used the innerHashJoin.
>> > >> >> > > > >
>> > >> >> > > > >  "EXCEPTION":"Invalid stream expression innerHashJoin(parallel(innerJoin
>> > >> >> > > > >
>> > >> >> > > > > I also can't find the documentation on innerHashJoin for the
>> > >> >> > > > > Streaming Expressions.
>> > >> >> > > > >
>> > >> >> > > > > Are you referring to hashJoin?
>> > >> >> > > > >
>> > >> >> > > > > Regards,
>> > >> >> > > > > Edwin
>> > >> >> > > > >
>> > >> >> > > > >
>> > >> >> > > > > On 3 May 2017 at 13:20, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com> wrote:
>> > >> >> > > > >
>> > >> >> > > > > > Hi Joel,
>> > >> >> > > > > >
>> > >> >> > > > > > Thanks for the info.
>> > >> >> > > > > >
>> > >> >> > > > > > Regards,
>> > >> >> > > > > > Edwin
>> > >> >> > > > > >
>> > >> >> > > > > >
>> > >> >> > > > > > On 3 May 2017 at 02:04, Joel Bernstein <joelsolr@gmail.com> wrote:
>> > >> >> > > > > >
>> > >> >> > > > > >> Also take a look at the documentation for the "fetch"
>> > >> >> > > > > >> streaming expression.
>> > >> >> > > > > >>
>> > >> >> > > > > >> Joel Bernstein
>> > >> >> > > > > >> http://joelsolr.blogspot.com/
>> > >> >> > > > > >>
>> > >> >> > > > > >> On Tue, May 2, 2017 at 2:03 PM, Joel Bernstein <joelsolr@gmail.com> wrote:
>> > >> >> > > > > >>
>> > >> >> > > > > >> > Yes, you can join more than one collection with
>> > >> >> > > > > >> > Streaming Expressions. Here are a few things to keep in
>> > >> >> > > > > >> > mind.
>> > >> >> > > > > >> >
>> > >> >> > > > > >> > * You'll likely want to use the parallel function
>> > >> >> > > > > >> > around the largest join. You'll need to use the join
>> > >> >> > > > > >> > keys as the partitionKeys.
>> > >> >> > > > > >> > * innerJoin: requires that the streams be sorted on the
>> > >> >> > > > > >> > join keys.
>> > >> >> > > > > >> > * innerHashJoin: has no sorting requirement.
>> > >> >> > > > > >> >
>> > >> >> > > > > >> > So a strategy for a three-collection join might look
>> > >> >> > > > > >> > like this:
>> > >> >> > > > > >> >
>> > >> >> > > > > >> > innerHashJoin(parallel(innerJoin(bigStream, bigStream)), smallerStream)
>> > >> >> > > > > >> >
>> > >> >> > > > > >> > The largest join can be done in parallel using an
>> > >> >> > > > > >> > innerJoin. You can then wrap the stream coming out of
>> > >> >> > > > > >> > the parallel function in an innerHashJoin to join it to
>> > >> >> > > > > >> > another stream.
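The sorted-stream requirement described above is the classic difference between a merge join and a hash join. A small self-contained Python sketch of the two strategies (toy data and field names loosely following the thread, not Solr code):

```python
# Toy illustration of the two join strategies discussed above:
# innerJoin ~ sort-merge join (inputs must be sorted on the join key),
# innerHashJoin/hashJoin ~ hash join (no sort requirement).

def merge_join(left, right, key):
    """Sort-merge inner join: both inputs must already be sorted on `key`."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Emit the cross product of the matching runs, then skip past them.
            i2 = i
            while i2 < len(left) and left[i2][key] == lk:
                j2 = j
                while j2 < len(right) and right[j2][key] == lk:
                    out.append({**left[i2], **right[j2]})
                    j2 += 1
                i2 += 1
            i, j = i2, j2
    return out

def hash_join(left, hashed, key):
    """Hash inner join: build a lookup table from the (smaller) hashed side."""
    table = {}
    for t in hashed:
        table.setdefault(t[key], []).append(t)
    return [{**l, **r} for l in left for r in table.get(l[key], [])]

big = [{"a_s": "a", "b_s": 1}, {"a_s": "b", "b_s": 2}, {"a_s": "c", "b_s": 3}]
small = [{"a_s": "c", "k_s": "x"}, {"a_s": "a", "k_s": "y"}]  # unsorted: fine for hash_join

via_hash = hash_join(big, small, "a_s")
via_merge = merge_join(sorted(big, key=lambda t: t["a_s"]),
                       sorted(small, key=lambda t: t["a_s"]), "a_s")
print(via_hash)
```

Both strategies return the same matches; the merge version can stream without buffering either side fully, which is why innerJoin leans on the sorted /export streams, while the hash version only needs to hold the smaller side in memory.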
>> > >> >> > > > > >> >
>> > >> >> > > > > >> >
>> > >> >> > > > > >> > Joel Bernstein
>> > >> >> > > > > >> > http://joelsolr.blogspot.com/
>> > >> >> > > > > >> >
>> > >> >> > > > > >> > On Mon, May 1, 2017 at 9:42 PM, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com> wrote:
>> > >> >> > > > > >> >
>> > >> >> > > > > >> >> Hi,
>> > >> >> > > > > >> >>
>> > >> >> > > > > >> >> Is it possible to join more than 2 collections using
>> > >> >> > > > > >> >> one of the streaming expressions (e.g. innerJoin)? If
>> > >> >> > > > > >> >> not, are there other ways we can do it?
>> > >> >> > > > > >> >>
>> > >> >> > > > > >> >> Currently, I may need to join 3 or 4 collections
>> > >> >> > > > > >> >> together, and to output selected fields from all these
>> > >> >> > > > > >> >> collections together.
>> > >> >> > > > > >> >>
>> > >> >> > > > > >> >> I'm using Solr 6.4.2.
>> > >> >> > > > > >> >>
>> > >> >> > > > > >> >> Regards,
>> > >> >> > > > > >> >> Edwin
>> > >> >> > > > > >> >>
>> > >> >> > > > > >> >
>> > >> >> > > > > >> >
>> > >> >> > > > > >>
>> > >> >> > > > > >
>> > >> >> > > > > >
>> > >> >> > > > >
>> > >> >> > > >
>> > >> >> > >
>> > >> >> >
>> > >> >>
>> > >> >
>> > >> >
>> > >>
>> > >
>> > >
>> >
>>
>
>
