lucene-dev mailing list archives

From "Anshum Gupta (JIRA)" <>
Subject [jira] [Commented] (SOLR-5477) Async execution of OverseerCollectionProcessor tasks
Date Mon, 02 Dec 2013 16:29:38 GMT


Anshum Gupta commented on SOLR-5477:

Here's what I'd recommend. 

Have three queues in the first phase of implementation, one each for submitted, running, and
completed tasks. The completed queue keeps only the top X tasks (by recency of completion).
It is important because it lets people look up details about a completed task, e.g.
completion time, running time, etc.
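
A minimal in-memory sketch of the bounded completed queue described above. The class and field names (CompletedTask, CompletedTaskLog, capacity) are hypothetical and only illustrate the "keep the top X by recency" idea; the real queue would presumably live in ZooKeeper, not in a local deque.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical record of a finished task; field names are illustrative only. */
final class CompletedTask {
    final String id;
    final long startMillis;
    final long endMillis;

    CompletedTask(String id, long startMillis, long endMillis) {
        this.id = id;
        this.startMillis = startMillis;
        this.endMillis = endMillis;
    }

    /** Running time derived from the recorded start and completion times. */
    long runningTimeMillis() {
        return endMillis - startMillis;
    }
}

/** Keeps only the most recently completed tasks, up to a fixed capacity (the "X"). */
final class CompletedTaskLog {
    private final int capacity;
    private final Deque<CompletedTask> tasks = new ArrayDeque<>();

    CompletedTaskLog(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Adds a task as the most recent entry and evicts the oldest entries beyond capacity. */
    synchronized void add(CompletedTask task) {
        tasks.addFirst(task);
        while (tasks.size() > capacity) {
            tasks.removeLast();
        }
    }

    /** Returns a copy of the retained tasks, most recently completed first. */
    synchronized List<CompletedTask> snapshot() {
        return new ArrayList<>(tasks);
    }
}
```

This sketch assumes add() is called in order of completion, so insertion order stands in for recency.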

I've started working on it and would recommend that we have a ThreadPool for the running tasks.
The pool size can be capped by a config setting.
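
One way the capped pool could look, using the standard java.util.concurrent executors. RunningTaskPool and maxRunningTasks are made-up names; how the cap would actually be read from Solr configuration is not specified here.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical wrapper around a fixed-size pool for running tasks. */
final class RunningTaskPool {
    private final ExecutorService pool;

    /** maxRunningTasks stands in for the config setting mentioned above. */
    RunningTaskPool(int maxRunningTasks) {
        this.pool = Executors.newFixedThreadPool(maxRunningTasks);
    }

    /** Queues a task; at most maxRunningTasks tasks execute at the same time. */
    Future<?> submit(Runnable task) {
        return pool.submit(task);
    }

    /** Stops accepting new tasks and waits for queued ones; true if all finished in time. */
    boolean shutdownAndWait(long timeoutSeconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With a fixed pool, tasks beyond the cap simply wait in the executor's internal queue rather than being rejected.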

I am still debating when to accept tasks (or perhaps accept everything and fail them
when they run). Here's a sample case: firing a shard split for collection1/shard1
would lead to an inactive shard1. If we continue to accept tasks until the split completes,
we may accept actions that involve shard1. We may need to take a call on that.

For now, I am not looking at truly multi-threading my implementation (though I will certainly
do that before this JIRA is resolved). Once I get to it, I'd perhaps still run only one
request per collection at a time, until we have more complex decision-making capability.
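
The "one request per collection at a time" rule could be sketched as a simple gate keyed by collection name. CollectionGate is a hypothetical helper, and the set here is local to one JVM; it does not address the open question of whether to reject tasks at submission or fail them at run time.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical gate that allows at most one running task per collection. */
final class CollectionGate {
    // Names of collections that currently have a task running.
    private final Set<String> busy = ConcurrentHashMap.newKeySet();

    /** Returns true if the caller may run a task for this collection now. */
    boolean tryAcquire(String collection) {
        return busy.add(collection);
    }

    /** Marks the collection as free again once its task has finished. */
    void release(String collection) {
        busy.remove(collection);
    }
}
```

A caller would check tryAcquire before starting a task and call release in a finally block, so that a failed task does not leave its collection blocked.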

Once a task is submitted, the OverseerCollectionProcessor peeks at the submitted queue,
processes the tasks in it, and moves them to the running queue. We'll have to synchronize
this on the queue/collection.

Upon completion, the task is moved again, this time from the running queue to the completed
queue.
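
The submitted, running, and completed transitions can be sketched as a small state tracker. TaskState and TaskQueues are hypothetical names, and the state is held in memory here purely for illustration, whereas the proposal stores tasks in ZK.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** The three phases a task passes through in this sketch. */
enum TaskState { SUBMITTED, RUNNING, COMPLETED }

/** Hypothetical in-memory stand-in for the submitted/running/completed queues. */
final class TaskQueues {
    private final Deque<String> submitted = new ArrayDeque<>();
    private final Map<String, TaskState> states = new HashMap<>();

    /** Accepts a task id and places it at the tail of the submitted queue. */
    synchronized void submit(String taskId) {
        submitted.addLast(taskId);
        states.put(taskId, TaskState.SUBMITTED);
    }

    /** Takes the oldest submitted task and marks it running; returns null if none waits. */
    synchronized String startNext() {
        String taskId = submitted.pollFirst();
        if (taskId != null) {
            states.put(taskId, TaskState.RUNNING);
        }
        return taskId;
    }

    /** Moves a running task to the completed state. */
    synchronized void complete(String taskId) {
        if (states.get(taskId) != TaskState.RUNNING) {
            throw new IllegalStateException("task is not running: " + taskId);
        }
        states.put(taskId, TaskState.COMPLETED);
    }

    /** Returns the current state, or null for an unknown task id. */
    synchronized TaskState stateOf(String taskId) {
        return states.get(taskId);
    }
}
```

Every transition is synchronized on the tracker, which is the single-process analogue of the synchronization on the queue mentioned above.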

Cleaning up the completed queue could also be tricky, and we may need a failed-tasks queue
or a way to retain failed tasks in the completed queue longer.

> Async execution of OverseerCollectionProcessor tasks
> ----------------------------------------------------
>                 Key: SOLR-5477
>                 URL:
>             Project: Solr
>          Issue Type: Sub-task
>          Components: SolrCloud
>            Reporter: Noble Paul
> Typical collection admin commands are long running, and it is very common for the
> requests to time out. It is more of a problem if the cluster is very large. Add an option
> to run these commands asynchronously
> add an extra param async=true for all collection commands
> the task is written to ZK and the caller is returned a task id.
> a separate collection admin command will be added to poll the status of the task
> command=status&id=7657668909
> if id is not passed, all running async tasks should be listed
> A separate queue is created to store in-process tasks. After the tasks are completed
> the queue entry is removed. OverseerCollectionProcessor will perform these tasks in multiple
