cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Peter Schuller (Commented) (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-3494) Streaming is mono-threaded (the bulk loader too by extension)
Date Thu, 01 Dec 2011 19:06:40 GMT


Peter Schuller commented on CASSANDRA-3494:

We had it set very high. Throttling is not the problem here. The problem is that each node
was sending data to only one bootstrapping node at a time, and that process was bottlenecking
on the destination node's write capacity (see CASSANDRA-3549 too).

Another effect is that nodes weren't evenly bootstrapping across the cluster; so some nodes
were at 200 gigs whiles other were at 500, because of the pseudo-random ordering restriction
imposed by this limitation (multiple destinations dogpiling on a source -> some have to
just wait for a long time before they even begin receiving data). In addition to the aggregate
bandwidth limit, this caused even more delay to get to the state where all nodes were bootstrapped.
> Streaming is mono-threaded (the bulk loader too by extension)
> -------------------------------------------------------------
>                 Key: CASSANDRA-3494
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.8.0
>            Reporter: Sylvain Lebresne
>            Priority: Minor
> The streamExecutor is define as:
> {noformat}
> streamExecutor_ = new DebuggableThreadPoolExecutor("Streaming", Thread.MIN_PRIORITY);
> {noformat}
> In the meantime, in
> {noformat}
> public DebuggableThreadPoolExecutor(String threadPoolName, int priority)
> {
>    this(1, Integer.MAX_VALUE, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(),
new NamedThreadFactory(threadPoolName, priority));
> }
> {noformat}
> In other word, since the core pool size is 1 and the queue unbounded, tasks will always
queued and the executor is essentially mono-threaded.
> This is clearly not necessary since we already have stream throttling nowadays. And it
could be a limiting factor in the case of the bulk loader.
> Besides, I would venture that this maybe was not the intention, because putting the max
core size to MAX_VALUE would suggest that the intention was to spawn threads on demand. 

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:!default.jspa
For more information on JIRA, see:


View raw message