cassandra-commits mailing list archives

From "Daniel Cranford (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-12884) Batch logic can lead to unbalanced use of system.batches
Date Thu, 10 Aug 2017 20:42:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-12884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122288#comment-16122288 ]

Daniel Cranford commented on CASSANDRA-12884:
---------------------------------------------

1) BatchlogManager::shuffle is factored out into its own method so that the unit test can
supply a deterministic override. The unit test has been expanded with a case that catches
this regression. (The existing code already uses the same pattern for getRandomInt, which
the unit test overrides to be non-random.)

2) getRandomInt could return the same value twice (sampling with replacement), resulting in
the same replica being chosen twice. The existing code uses the shuffle + take-head pattern,
e.g. in BatchlogManager.java line 545 {{shuffle((List<String>) racks);}} and line 550
{{for (String rack : Iterables.limit(racks, 2))}}, to perform sampling without replacement.
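
To make the difference between the two points concrete, here is a minimal sketch, not the patch itself. ReplicaPicker, DeterministicPicker and the two pick methods are hypothetical names made up for illustration; only shuffle and getRandomInt mirror names mentioned above, and their signatures here are assumptions. It uses plain java.util (subList) in place of Guava's Iterables.limit.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in, not the real BatchlogManager: it only illustrates the two patterns.
class ReplicaPicker {
    private final Random random = new Random();

    // Overridable hook: random in production, replaceable in tests.
    protected void shuffle(List<?> list) {
        Collections.shuffle(list, random);
    }

    // Overridable hook returning a value in [0, bound); bound must be positive.
    protected int getRandomInt(int bound) {
        return random.nextInt(bound);
    }

    // Sampling WITH replacement: each draw is independent, so the same index can repeat.
    // Assumes candidates is non-empty.
    List<String> pickWithReplacement(List<String> candidates, int count) {
        List<String> picked = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            picked.add(candidates.get(getRandomInt(candidates.size())));
        }
        return picked;
    }

    // Sampling WITHOUT replacement: shuffle a copy, then take the head of the list.
    List<String> pickWithoutReplacement(List<String> candidates, int count) {
        List<String> copy = new ArrayList<>(candidates);
        shuffle(copy);
        return new ArrayList<>(copy.subList(0, Math.min(count, copy.size())));
    }
}

// Test double: both hooks become deterministic.
class DeterministicPicker extends ReplicaPicker {
    @Override
    protected void shuffle(List<?> list) {
        Collections.reverse(list); // fixed permutation instead of a random one
    }

    @Override
    protected int getRandomInt(int bound) {
        return 0; // always the first candidate
    }
}
```

With the deterministic overrides and candidates a, b, c, pickWithReplacement(candidates, 2) returns a twice, while pickWithoutReplacement(candidates, 2) returns c and b, which are necessarily two different list positions.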


> Batch logic can lead to unbalanced use of system.batches
> --------------------------------------------------------
>
>                 Key: CASSANDRA-12884
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12884
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Adam Hattrell
>            Assignee: Daniel Cranford
>             Fix For: 3.0.x, 3.11.x
>
>         Attachments: 0001-CASSANDRA-12884.patch
>
>
> It looks as though there are some odd edge cases in how we distribute the copies in system.batches.
> The main issue is in the filter method for org.apache.cassandra.batchlog.BatchlogManager
> {code:java}
> if (validated.size() - validated.get(localRack).size() >= 2)
> {
>     // we have enough endpoints in other racks
>     validated.removeAll(localRack);
> }
> if (validated.keySet().size() == 1)
> {
>     // we have only 1 `other` rack
>     Collection otherRack = Iterables.getOnlyElement(validated.asMap().values());
>
>     return Lists.newArrayList(Iterables.limit(otherRack, 2));
> }
> {code}
> So with one or two racks we just return the first 2 entries in the list.  There's no
> shuffle or randomisation here.




