lucene-issues mailing list archives

From "Gus Heck (Jira)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-13836) Streaming Expression Query Parser
Date Fri, 01 Nov 2019 15:50:00 GMT

    [ https://issues.apache.org/jira/browse/SOLR-13836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964923#comment-16964923 ]

Gus Heck commented on SOLR-13836:
---------------------------------

It might be good to think about whether this circumvents the protections added in SOLR-12891.

> Streaming Expression Query Parser
> ---------------------------------
>
>                 Key: SOLR-13836
>                 URL: https://issues.apache.org/jira/browse/SOLR-13836
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: query parsers, streaming expressions
>            Reporter: Trey Grainger
>            Priority: Minor
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> It is currently possible to hit the search handler in a streaming expression ("search(...)"),
> but it is not currently possible to invoke a streaming expression from within a regular search
> in the search handler. In some cases, it would be useful to leverage the power of streaming
> expressions to generate a result set and then join that result set with a normal set of search
> results.
> This isn't expected to be particularly efficient for high-cardinality streaming expression
> results, but it would be a pretty powerful feature that could enable a bunch of use cases that
> aren't possible today within a normal search.
> h2. Example:
> *Docs:*
> {code:java}
> curl -X POST -H "Content-Type: application/json" "http://localhost:8983/solr/food/update?commit=true" --data-binary '
> [
> {"id": "1", "name_s":"donut","vector_fs":[5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0]},
> {"id": "2", "name_s":"apple juice","vector_fs":[1.0,5.0,0.0,0.0,0.0,4.0,4.0,3.0]},
> {"id": "3", "name_s":"cappuccino","vector_fs":[0.0,5.0,3.0,0.0,4.0,1.0,2.0,3.0]},
> {"id": "4", "name_s":"cheese pizza","vector_fs":[5.0,0.0,4.0,4.0,0.0,1.0,5.0,2.0]},
> {"id": "5", "name_s":"green tea","vector_fs":[0.0,5.0,0.0,0.0,2.0,1.0,1.0,5.0]},
> {"id": "6", "name_s":"latte","vector_fs":[0.0,5.0,4.0,0.0,4.0,1.0,3.0,3.0]},
> {"id": "7", "name_s":"soda","vector_fs":[0.0,5.0,0.0,0.0,3.0,5.0,5.0,0.0]},
> {"id": "8", "name_s":"cheese bread sticks","vector_fs":[5.0,0.0,4.0,5.0,0.0,1.0,4.0,2.0]},
> {"id": "9", "name_s":"water","vector_fs":[0.0,5.0,0.0,0.0,0.0,0.0,0.0,5.0]},
> {"id": "10", "name_s":"cinnamon bread sticks","vector_fs":[5.0,0.0,1.0,5.0,0.0,3.0,4.0,2.0]}
> ]'
> {code}
>  
> *Query:*
> {code:java}
> http://localhost:8983/solr/food/select?q=*:*&fq=\{!streaming_expression}top(select(search(food,%20q=%22*:*%22,%20fl=%22id,vector_fs%22,%20sort=%22id%20asc%22),%20cosineSimilarity(vector_fs,%20array(5.1,0.0,1.0,5.0,0.0,4.0,5.0,1.0))%20as%20cos,%20id),%20n=5,%20sort=%22cos%20desc%22)&fl=id,name_s
> {code}
>  
> *Response:*
> {code:java}
> {
>   "responseHeader":{
>     "zkConnected":true,
>     "status":0,
>     "QTime":7,
>     "params":{
>       "q":"*:*",
>       "fl":"id,name_s",
>       "fq":"{!streaming_expression}top(select(search(food, q=\"*:*\", fl=\"id,vector_fs\", sort=\"id asc\"), cosineSimilarity(vector_fs, array(5.1,0.0,1.0,5.0,0.0,4.0,5.0,1.0)) as cos, id), n=5, sort=\"cos desc\")"}},
>   "response":{"numFound":5,"start":0,"docs":[
>       {
>         "name_s":"donut",
>         "id":"1"},
>       {
>         "name_s":"apple juice",
>         "id":"2"},
>       {
>         "name_s":"cheese pizza",
>         "id":"4"},
>       {
>         "name_s":"cheese bread sticks",
>         "id":"8"},
>       {
>         "name_s":"cinnamon bread sticks",
>         "id":"10"}]
>   }}
> {code}
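
As a sanity check on the example, the ranking that the streaming expression is expected to produce can be reproduced outside Solr. The sketch below is plain Python, not Solr code: it recomputes cosine similarity between the query vector from the request URL and each document's vector_fs, then keeps the five best ids. The names DOCS, QUERY_VECTOR, cosine_similarity and top_ids are made up for this illustration.

```python
import math

# Documents copied from the example: id -> (name_s, vector_fs).
DOCS = {
    "1": ("donut", [5.0, 0.0, 1.0, 5.0, 0.0, 4.0, 5.0, 1.0]),
    "2": ("apple juice", [1.0, 5.0, 0.0, 0.0, 0.0, 4.0, 4.0, 3.0]),
    "3": ("cappuccino", [0.0, 5.0, 3.0, 0.0, 4.0, 1.0, 2.0, 3.0]),
    "4": ("cheese pizza", [5.0, 0.0, 4.0, 4.0, 0.0, 1.0, 5.0, 2.0]),
    "5": ("green tea", [0.0, 5.0, 0.0, 0.0, 2.0, 1.0, 1.0, 5.0]),
    "6": ("latte", [0.0, 5.0, 4.0, 0.0, 4.0, 1.0, 3.0, 3.0]),
    "7": ("soda", [0.0, 5.0, 0.0, 0.0, 3.0, 5.0, 5.0, 0.0]),
    "8": ("cheese bread sticks", [5.0, 0.0, 4.0, 5.0, 0.0, 1.0, 4.0, 2.0]),
    "9": ("water", [0.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0]),
    "10": ("cinnamon bread sticks", [5.0, 0.0, 1.0, 5.0, 0.0, 3.0, 4.0, 2.0]),
}

# Query vector as it appears in the request URL of the example.
QUERY_VECTOR = [5.1, 0.0, 1.0, 5.0, 0.0, 4.0, 5.0, 1.0]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_ids(docs, query, n=5):
    """Return the ids of the n documents most similar to the query vector."""
    ranked = sorted(
        docs,
        key=lambda doc_id: cosine_similarity(docs[doc_id][1], query),
        reverse=True,
    )
    return ranked[:n]


if __name__ == "__main__":
    for doc_id in top_ids(DOCS, QUERY_VECTOR):
        print(doc_id, DOCS[doc_id][0])
```

The five ids selected this way are the same five documents that appear in the response, which is what one would expect if the filter query keeps exactly the ids emitted by the top(...) expression.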
> The current implementation also supports the following additional parameters:
>  *f*: (optional) The field name from the streaming expression containing the document
> ids upon which to filter. Defaults to the uniqueKey field name of your documents.
>  *method*: (optional) Any of termsFilter (default), booleanQuery, automaton, docValuesTermsFilter.
> The method parameter may go away, especially if we find a more efficient way to join the
> stream to the main query doc set.
>  
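
For anyone wanting to try the optional parameters, here is a minimal sketch of how a request URL might be assembled. It assumes that f and method are passed as local params inside the {!streaming_expression ...} prefix, which the description does not spell out, and the helper build_select_url and the constant EXPRESSION are hypothetical names used only for this illustration. Only the Python standard library is used, and nothing is sent to Solr.

```python
from urllib.parse import urlencode

# Streaming expression from the example query, unencoded.
EXPRESSION = (
    'top(select(search(food, q="*:*", fl="id,vector_fs", sort="id asc"), '
    "cosineSimilarity(vector_fs, array(5.1,0.0,1.0,5.0,0.0,4.0,5.0,1.0)) as cos, id), "
    'n=5, sort="cos desc")'
)


def build_select_url(base_url, collection, expression, f=None, method=None, fl="id,name_s"):
    """Build a /select URL whose fq uses the proposed streaming_expression parser.

    Assumption: the optional f and method parameters are given as local params.
    """
    local_params = ["!streaming_expression"]
    if f is not None:
        local_params.append(f"f={f}")
    if method is not None:
        local_params.append(f"method={method}")
    fq = "{" + " ".join(local_params) + "}" + expression
    # urlencode percent-encodes quotes, braces and spaces so the URL is safe to paste.
    query = urlencode({"q": "*:*", "fq": fq, "fl": fl})
    return f"{base_url}/{collection}/select?{query}"


if __name__ == "__main__":
    print(build_select_url("http://localhost:8983/solr", "food", EXPRESSION,
                           f="id", method="termsFilter"))
```

Building the fq value first and encoding it in one step avoids the hand-escaped %20 and %22 sequences in the example URL, which are easy to get wrong when editing the expression.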



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org

