lucene-dev mailing list archives

From "Domenico Fabio Marino (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SOLR-10610) Add CanaryComponent, a search component to analyse requests
Date Fri, 25 Aug 2017 11:03:00 GMT

    [ https://issues.apache.org/jira/browse/SOLR-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141476#comment-16141476 ]

Domenico Fabio Marino edited comment on SOLR-10610 at 8/25/17 11:02 AM:
------------------------------------------------------------------------

h2. Status update:
Note: this proposal depends on the patches SOLR-10990, SOLR-10880 and SOLR-10881.
An implementation is ready, but because those dependencies are not yet merged or ready, it will not be uploaded yet.

The canary system is built around a new component, CanaryComponent.
CanaryComponent aims to run the query in a safer, checked manner and to report any errors that occur.
The CanaryComponent must be set up properly before it is used.
*Set-up:*
# Tag one or more replicas as "canary" replicas (depends on SOLR-10880 and SOLR-10881):
  Using replica properties, set a property to a value (the canary type). This property can be independent of any other shard-filtering property, but it does not have to be.
  A collection can have many canaries of many types, for example:
 ** {{shard1replica3, shard2replica1}} have the property {{canaryColour=yellow}}
 ** {{shard4replica2}} has the property {{canaryColour=red}}
  And so on.
  There can be multiple canary replicas per shard.
  A test file will be provided to show an example of such tagging.

# Add the CanaryComponent to the /select RequestHandler (also included in the test files).
  Optional but encouraged: set the {{canary.timeout}} flag to a sensible Long value (time in milliseconds). This ensures that every request has a timeout specified.
  Note: the timeout can also be specified on a per-request basis.
This concludes the initial set-up.

*Usage*
Every request that should run through the CanaryComponent must include the following parameters (depends on SOLR-10880):
{noformat}filterByReplicaProp=true{noformat}
and
{noformat}canary=CANARY_TYPE_PROPERTY:CANARY_TYPE{noformat}

A timeout must be specified; for convenience it can be set once as a default via {{canary.timeout}}, as described in point 2 of the set-up.
Running a query on the canary without a timeout is not permitted, and an exception will be thrown.

For example: {noformat}filterByReplicaProp=true&canary=birdColour:yellow&canary.timeout=5000{noformat}

This means that the request needs the replica filtering framework enabled (see SOLR-10880), that the canary request must be routed to replicas whose property {{birdColour}} is set to {{yellow}}, and that the request should time out after 5 seconds (5000 milliseconds).
If no replicas match the given filter, an exception will be thrown.
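
The parameter rules above can be sketched as a small validation helper. This is only an illustration, not code from the patch: the class {{CanaryParams}}, its field names, and the use of {{IllegalArgumentException}} are assumptions, since the comment only says that "an exception will be thrown".

```java
import java.util.Map;

/** Hypothetical helper (not part of the proposed patch) that validates canary request parameters. */
final class CanaryParams {
    final String property;   // e.g. "birdColour"
    final String type;       // e.g. "yellow"
    final long timeoutMs;    // timeout in milliseconds

    private CanaryParams(String property, String type, long timeoutMs) {
        this.property = property;
        this.type = type;
        this.timeoutMs = timeoutMs;
    }

    /** Parses filterByReplicaProp, canary=PROPERTY:TYPE and canary.timeout from plain string parameters. */
    static CanaryParams parse(Map<String, String> params) {
        // The replica filtering framework has to be enabled for canary requests.
        if (!"true".equals(params.get("filterByReplicaProp"))) {
            throw new IllegalArgumentException("filterByReplicaProp=true is required");
        }
        String canary = params.get("canary");
        if (canary == null) {
            throw new IllegalArgumentException("canary parameter is required");
        }
        int sep = canary.indexOf(':');
        if (sep <= 0 || sep == canary.length() - 1) {
            throw new IllegalArgumentException("canary must look like PROPERTY:TYPE");
        }
        // Running a canary query without a timeout is not permitted.
        String timeout = params.get("canary.timeout");
        if (timeout == null) {
            throw new IllegalArgumentException("canary.timeout is required");
        }
        long timeoutMs = Long.parseLong(timeout);
        if (timeoutMs <= 0) {
            throw new IllegalArgumentException("canary.timeout must be positive");
        }
        return new CanaryParams(canary.substring(0, sep), canary.substring(sep + 1), timeoutMs);
    }
}
```

In the real component a missing per-request timeout would presumably fall back to the configured {{canary.timeout}} before failing; the sketch leaves that fallback out.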

The request runs on only one canary replica. If multiple replicas match {{CANARY_TYPE_PROPERTY:CANARY_TYPE}}, one is picked at random; if it is unreachable, another is picked at random, and so on.
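
A minimal sketch of this selection strategy, assuming replicas are identified by name and that reachability can be tested with a caller-supplied predicate. The class {{CanaryReplicaPicker}} and its signature are hypothetical. Returning an empty result when every candidate is unreachable is also an assumption, as the comment does not say what happens in that case.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.Random;
import java.util.function.Predicate;

/** Hypothetical sketch of picking one reachable canary replica at random. */
final class CanaryReplicaPicker {

    static Optional<String> pick(List<String> matching, Predicate<String> reachable, Random random) {
        // No replica carries the requested canary property and value: fail fast.
        if (matching.isEmpty()) {
            throw new IllegalStateException("no replicas match the canary filter");
        }
        // Shuffling a copy and walking it in order tries every candidate once, in random order.
        List<String> candidates = new ArrayList<>(matching);
        Collections.shuffle(candidates, random);
        for (String replica : candidates) {
            if (reachable.test(replica)) {
                return Optional.of(replica);
            }
        }
        // Assumption: report "nothing reachable" as an empty result.
        return Optional.empty();
    }
}
```

Shuffling once avoids retrying a replica that has already been found unreachable, which a repeated independent random draw would not guarantee.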
The request is executed exactly as the QueryComponent would execute it (depends on SOLR-10990), so that the analysis is as realistic as possible. However, its execution takes place in a separate thread.
This allows any exceptions thrown by the query to be caught and its execution time to be monitored at a finer level. Execution of the query is halted as soon as an exception is detected or the timeout expires.
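
The idea of running the query in a separate thread with a timeout can be illustrated with the standard {{java.util.concurrent}} API. This is a sketch under assumptions, not the patch's implementation: {{CanaryRunner}} and {{runChecked}} are made-up names, and the query is modelled as a plain {{Callable}}.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch: run a task in its own thread and report whether it finished normally in time. */
final class CanaryRunner {

    /** Returns true if the task completed within timeoutMs, false if it threw or timed out. */
    static boolean runChecked(Callable<?> query, long timeoutMs) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<?> future = executor.submit(query);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (ExecutionException e) {
            // The query itself threw an exception.
            return false;
        } catch (TimeoutException e) {
            // Interrupt the worker; this only halts tasks that respond to interruption.
            future.cancel(true);
            return false;
        } catch (InterruptedException e) {
            // The calling thread was interrupted while waiting; restore the flag and report failure.
            Thread.currentThread().interrupt();
            future.cancel(true);
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Creating one executor per call keeps the sketch short; a real component would more likely reuse a shared thread pool.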
CanaryComponent reports the status of the analysis as a Boolean via the {{CanaryComponent.CANARY_SUCCESS}}
response parameter and through a field in {{ResponseBuilder}}.

The same return convention applies to both:
* {{null}} (or absent) when the CanaryComponent did not execute the query
* {{true}} if the CanaryComponent processed the query and did not find any problem
* {{false}} if the query execution did not terminate normally.
CanaryComponent cleans up its query results so that other components do not see partial results.
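
A downstream consumer could interpret this three-state convention roughly as follows. The sketch assumes the response is available as a simple map; the key {{"canarySuccess"}} is a placeholder, because the actual value of {{CanaryComponent.CANARY_SUCCESS}} is not given in this comment.

```java
import java.util.Map;

/** Hypothetical sketch of reading the three-state canary result from a response map. */
final class CanaryStatus {
    // Placeholder key; the real constant value is not stated in the proposal.
    static final String CANARY_SUCCESS = "canarySuccess";

    static String describe(Map<String, Object> response) {
        Object value = response.get(CANARY_SUCCESS);
        if (value == null) {
            // null or absent: the CanaryComponent did not execute the query.
            return "not executed";
        }
        // true: no problem found; false: execution did not terminate normally.
        return Boolean.TRUE.equals(value) ? "passed" : "failed";
    }
}
```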



> Add CanaryComponent, a search component to analyse requests
> -----------------------------------------------------------
>
>                 Key: SOLR-10610
>                 URL: https://issues.apache.org/jira/browse/SOLR-10610
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Domenico Fabio Marino
>            Priority: Minor
>         Attachments: SOLR-10610.patch
>
>
> This patch outlines a new component that analyses a request and reports whether it is too complex to continue processing.
> Running this component should be conditional and happen before other components start processing. The component will set a status flag so that other components can know the result of the Canary check, and it also adds some information to the response sent back to the client.
> The component runs the query on the set of replicas that are tagged with the "canary" tag.
> Please note this is only an outline so far and it therefore lacks test cases.
> Once it is more feature-complete, a test case will be added.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


