lucene-solr-user mailing list archives

From Joel Bernstein <joels...@gmail.com>
Subject Re: streaming with SolrJ
Date Fri, 29 Sep 2017 00:09:49 GMT
There isn't much documentation on how to use the Streaming API Java
classes directly. All of the effort has been going into Streaming
Expressions which you send to the /stream handler to execute. Over time
it's become more and more complicated to use the Java classes because there
are so many of them and because their initialization can be complex. All of
the test cases are now focused on exercising the underlying classes through
the expressions.
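
For illustration, here is a minimal sketch of sending an expression to the /stream handler over plain HTTP, without touching the SolrJ stream classes at all. It assumes Java 11 or later, a Solr node at http://localhost:8983/solr, and the "gettingstarted" collection and field names mentioned elsewhere in this thread. The class and method names (StreamHandlerExample, buildBody, streamUri, execute) are made up for this example and are not part of SolrJ. The handler reads the expression from the "expr" request parameter.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class StreamHandlerExample {

    // Form-encodes the streaming expression into the "expr" parameter.
    static String buildBody(String expression) {
        return "expr=" + URLEncoder.encode(expression, StandardCharsets.UTF_8);
    }

    // URL of the /stream handler for a collection (base URL is an assumption).
    static URI streamUri(String solrBaseUrl, String collection) {
        return URI.create(solrBaseUrl + "/" + collection + "/stream");
    }

    // POSTs the expression and returns the raw response body.
    // Needs a running Solr node, so main() below does not call it.
    static String execute(String solrBaseUrl, String collection, String expression)
            throws Exception {
        HttpRequest request = HttpRequest.newBuilder(streamUri(solrBaseUrl, collection))
            .header("Content-Type", "application/x-www-form-urlencoded")
            .POST(HttpRequest.BodyPublishers.ofString(buildBody(expression)))
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        String expr = "search(gettingstarted, q=\"*:*\", "
            + "fl=\"id,business_email_s\", sort=\"business_email_s asc\")";
        // Dry run: show what would be sent without contacting a server.
        System.out.println(streamUri("http://localhost:8983/solr", "gettingstarted"));
        System.out.println(buildBody(expr));
    }
}
```

With a Solr node running, calling execute("http://localhost:8983/solr", "gettingstarted", expr) should return the tuples as a response body that any JSON parser can read.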


Joel Bernstein
http://joelsolr.blogspot.com/

On Thu, Sep 28, 2017 at 4:47 PM, Hendrik Haddorp <hendrik.haddorp@gmx.net>
wrote:

> hm, thanks, but why are all those withFunctionName calls required and how
> did you get to this?
>
>
> On 28.09.2017 22:01, Susheel Kumar wrote:
>
>> I have this snippet with a couple of functions registered, if that helps
>>
>> ---
>>      TupleStream stream;
>>      List<Tuple> tuples;
>>      StreamContext streamContext = new StreamContext();
>>      SolrClientCache solrClientCache = new SolrClientCache();
>>      streamContext.setSolrClientCache(solrClientCache);
>>
>>      StreamFactory factory = new StreamFactory()
>>        .withCollectionZkHost("gettingstarted", "localhost:2181")
>>        .withFunctionName("search", CloudSolrStream.class)
>>        .withFunctionName("select", SelectStream.class)
>>        .withFunctionName("add", AddEvaluator.class)
>>        .withFunctionName("if", IfThenElseEvaluator.class)
>>        .withFunctionName("gt", GreaterThanEvaluator.class)
>>        .withFunctionName("let", LetStream.class)
>>        .withFunctionName("get", GetStream.class)
>>        .withFunctionName("echo", EchoStream.class)
>>        .withFunctionName("merge", MergeStream.class)
>>        .withFunctionName("sort", SortStream.class)
>>        .withFunctionName("tuple", TupStream.class)
>>        .withFunctionName("rollup", RollupStream.class)
>>        .withFunctionName("hashJoin", HashJoinStream.class)
>>        .withFunctionName("complement", ComplementStream.class)
>>        .withFunctionName("fetch", FetchStream.class)
>>        .withFunctionName("having", HavingStream.class)
>> //      .withFunctionName("eq", EqualsEvaluator.class)
>>        .withFunctionName("count", CountMetric.class)
>>        .withFunctionName("facet", FacetStream.class)
>>        .withFunctionName("sum", SumMetric.class)
>>        .withFunctionName("unique", UniqueStream.class)
>>        .withFunctionName("uniq", UniqueMetric.class)
>>        .withFunctionName("innerJoin", InnerJoinStream.class)
>>        .withFunctionName("intersect", IntersectStream.class)
>>        .withFunctionName("replace", ReplaceOperation.class);
>>      try {
>>        String clause = getClause();
>>        stream = factory.constructStream(clause);
>>        stream.setStreamContext(streamContext);
>>        // getTuples is a helper that is not shown in this snippet
>>        tuples = getTuples(stream);
>>
>>        for (Tuple tuple : tuples) {
>>          System.out.println(tuple.getString("id"));
>>          System.out.println(tuple.getString("business_email_s"));
>>          ....
>>        }
>>
>>        System.out.println("Total tuples returned " + tuples.size());
>>      } finally {
>>        // release the clients held by the cache
>>        solrClientCache.close();
>>      }
>>
>>
>> ---
>> private static String getClause() {
>> String clause = "select(search(gettingstarted,\n" +
>> "                        q=*:* NOT personal_email_s:*,\n" +
>> "                        fl=\"id,business_email_s\",\n" +
>> "                        sort=\"business_email_s asc\"),\n" +
>> "id,\n" +
>> "business_email_s,\n" +
>> "personal_email_s,\n" +
>> "replace(personal_email_s,null,withField=business_email_s)\n" +
>> ")";
>> return clause;
>> }
>>
>>
>> On Thu, Sep 28, 2017 at 3:35 PM, Hendrik Haddorp <hendrik.haddorp@gmx.net> wrote:
>>
>>> Hi,
>>>
>>> I'm trying to use the streaming API via SolrJ but have some trouble with
>>> the documentation and samples. In the reference guide I found the below
>>> example in
>>> http://lucene.apache.org/solr/guide/6_6/streaming-expressions.html.
>>> The problem is that "withStreamFunction" does not seem to exist. There is
>>> "withFunctionName", which would match the arguments, but there is no
>>> documentation in the JavaDoc, nor does the sample state why I would need
>>> all those "with" calls if pretty much everything is also in the last
>>> "constructStream" method call. I was planning to retrieve a few fields for
>>> all documents in a collection but have trouble figuring out the correct
>>> way to do so. The documentation also uses "/export" and "/search", with
>>> little explanation of the differences. I would really appreciate a pointer
>>> to some simple samples.
>>>
>>> The org.apache.solr.client.solrj.io package provides Java classes that
>>> compile streaming expressions into streaming API objects. These classes can
>>> be used to execute streaming expressions from inside a Java application.
>>> For example:
>>>
>>> StreamFactory streamFactory = new StreamFactory()
>>>      .withCollectionZkHost("collection1", zkServer.getZkAddress())
>>>      .withStreamFunction("search", CloudSolrStream.class)
>>>      .withStreamFunction("unique", UniqueStream.class)
>>>      .withStreamFunction("top", RankStream.class)
>>>      .withStreamFunction("group", ReducerStream.class)
>>>      .withStreamFunction("parallel", ParallelStream.class);
>>>
>>> ParallelStream pstream = (ParallelStream) streamFactory.constructStream(
>>>      "parallel(collection1, group(search(collection1, q=\"*:*\", "
>>>      + "fl=\"id,a_s,a_i,a_f\", sort=\"a_s asc,a_f asc\", partitionKeys=\"a_s\"), "
>>>      + "by=\"a_s asc\"), workers=\"2\", zkHost=\"" + zkHost + "\", sort=\"a_s asc\")");
>>>
>>> regards,
>>> Hendrik
>>>
>>>
>
