lucene-dev mailing list archives

From "Cao Manh Dat (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-12338) Replay buffering tlog in parallel
Date Fri, 11 May 2018 02:24:00 GMT

    [ https://issues.apache.org/jira/browse/SOLR-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16471392#comment-16471392 ]

Cao Manh Dat commented on SOLR-12338:
-------------------------------------

{quote}I have doubts on the use of a new ArrayBlockingQueue<>(1) per doc ID hash bucket.
What if the client adds a Runnable for doc1, then immediately adds another Runnable for doc1.
You're intending for the second runnable to block until the first completes to achieve the
per-doc ID serialization. But this may not happen; a thread may start on the first runnable
(which frees up the second runnable to be submitted), then the thread doesn't get CPU time,
and then the other Runnable zooms ahead out-of-order. See what I mean?
{quote}
The queue is per thread (and the number of threads is small), not per bucket. If I understand
correctly, you are describing two threads waiting for a lock to be released, where the one
that arrives later wins the lock. It seems this can be solved by setting the fair flag of
{{ArrayBlockingQueue}} to true, right?
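
To make the fair flag concrete, here is a minimal sketch (not code from the patch; the class name {{FairSlotDemo}} and the producer names are made up for illustration). It only shows the two-argument {{ArrayBlockingQueue}} constructor and the FIFO servicing of threads that are already blocked on a full capacity-1 queue. It does not claim to settle the race described in the quote above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical demo class, not part of Solr or of the SOLR-12338 patch.
final class FairSlotDemo {

    /** Blocks three producers on a full capacity-1 queue, then drains it and returns what came out. */
    static List<String> drainOrder() throws InterruptedException {
        // The second constructor argument is the fairness flag.
        ArrayBlockingQueue<String> slot = new ArrayBlockingQueue<>(1, true);
        slot.put("first"); // the single slot is now full

        String[] names = {"A", "B", "C"};
        List<Thread> producers = new ArrayList<>();
        for (String name : names) {
            Thread t = new Thread(() -> {
                try {
                    slot.put(name); // blocks until the slot frees up
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            // Wait until this producer is parked inside put() before starting the next one,
            // so the producers block in the order A, B, C.
            while (t.getState() != Thread.State.WAITING) {
                Thread.sleep(1);
            }
            producers.add(t);
        }

        // Each take() frees the slot and lets exactly one blocked producer proceed.
        List<String> order = new ArrayList<>();
        for (int i = 0; i < names.length + 1; i++) {
            order.add(slot.take());
        }
        for (Thread t : producers) {
            t.join();
        }
        return order;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(drainOrder());
    }
}
```

In this setup the elements come out in the order the producers blocked. The fair flag additionally prevents a newly arriving thread from barging ahead of threads that are already waiting, at some cost in throughput.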

{quote}
Also if you submit without an ID, then it should probably proceed right to the delegate Executor.
 Why does it pick an ID at random?
{quote}
This helps us track how many tasks are running (or pending), so OrderedExecutor never
executes more than {{numThreads}} tasks in parallel. It also avoids the case where the
ExecutorService's queue is full and it throws RejectedExecutionException.
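
The idea described above can be sketched roughly as follows. This is an illustration under my own assumptions, not the code in the attached patch: the class name {{OrderedExecutorSketch}}, the token object, and the hash-to-slot mapping are all hypothetical. Each of the {{numThreads}} slots holds at most one token, so at most {{numThreads}} tasks are in flight at once, and a submission without an ID takes a random slot so that it still counts against that limit.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the OrderedExecutor from the SOLR-12338 patch.
final class OrderedExecutorSketch {
    private static final Object TOKEN = new Object();

    private final ExecutorService delegate;
    private final List<BlockingQueue<Object>> slots;

    OrderedExecutorSketch(int numThreads) {
        delegate = Executors.newFixedThreadPool(numThreads);
        slots = new ArrayList<>(numThreads);
        for (int i = 0; i < numThreads; i++) {
            slots.add(new ArrayBlockingQueue<>(1, true)); // capacity 1, fair
        }
    }

    /** Runs the task on the delegate; tasks that map to the same slot never overlap. */
    void execute(Object id, Runnable task) throws InterruptedException {
        // No ID: pick a random slot so the task still counts against the in-flight limit.
        int index = (id == null)
                ? ThreadLocalRandom.current().nextInt(slots.size())
                : Math.floorMod(id.hashCode(), slots.size());
        BlockingQueue<Object> slot = slots.get(index);

        slot.put(TOKEN); // blocks the caller while an earlier task on this slot is in flight
        try {
            delegate.execute(() -> {
                try {
                    task.run();
                } finally {
                    slot.poll(); // release the slot only after the task has finished
                }
            });
        } catch (RuntimeException e) {
            slot.poll(); // submission failed, so give the slot back
            throw e;
        }
    }

    void shutdownAndWait() throws InterruptedException {
        delegate.shutdown();
        delegate.awaitTermination(1, TimeUnit.MINUTES);
    }

    /** Submits 100 updates for the same ID from one thread and records the execution order. */
    static List<Integer> demo() throws InterruptedException {
        OrderedExecutorSketch executor = new OrderedExecutorSketch(4);
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        for (int i = 0; i < 100; i++) {
            int version = i;
            executor.execute("doc1", () -> seen.add(version));
        }
        executor.shutdownAndWait();
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```

Because the slot is released only after the task body returns, a single submitter's tasks for the same ID run in submission order in this sketch. IDs that hash to different slots can run in parallel, and two different IDs that collide on a slot are simply serialized.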

> Replay buffering tlog in parallel
> ---------------------------------
>
>                 Key: SOLR-12338
>                 URL: https://issues.apache.org/jira/browse/SOLR-12338
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Cao Manh Dat
>            Assignee: Cao Manh Dat
>            Priority: Major
>         Attachments: SOLR-12338.patch, SOLR-12338.patch
>
>
> Since updates with different ids are independent, it is safe to replay them in parallel.
> This will significantly reduce the recovery time of replicas in high-load indexing
> environments.


