cassandra-commits mailing list archives

From "Michael Shuler (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6488) Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters
Date Mon, 16 Dec 2013 16:09:10 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849261#comment-13849261 ]

Michael Shuler commented on CASSANDRA-6488:
-------------------------------------------

This introduced a failure in BootStrapperTest:

{code}
test:
     [echo] running unit tests
    [mkdir] Created dir: /home/mshuler/git/cassandra/build/test/cassandra
    [mkdir] Created dir: /home/mshuler/git/cassandra/build/test/output
    [junit] WARNING: multiple versions of ant detected in path for junit 
    [junit]          jar:file:/usr/share/ant/lib/ant.jar!/org/apache/tools/ant/Project.class
    [junit]      and jar:file:/home/mshuler/git/cassandra/build/lib/jars/ant-1.6.5.jar!/org/apache/tools/ant/Project.class
    [junit] Testsuite: org.apache.cassandra.dht.BootStrapperTest
    [junit] Tests run: 4, Failures: 1, Errors: 0, Time elapsed: 6.177 sec
    [junit] 
    [junit] ------------- Standard Error -----------------
    [junit]  WARN 09:47:46,135 No host ID found, created 9019bb70-4d6e-4cf6-b730-140ff5ae4be5 (Note: This should happen exactly once per node).
    [junit]  WARN 09:47:46,262 Generated random token [d9180feb2e806704effa4024e8f4c631]. Random tokens will result in an unbalanced ring; see http://wiki.apache.org/cassandra/Operations
    [junit] ------------- ---------------- ---------------
    [junit] Testcase: testSourceTargetComputation(org.apache.cassandra.dht.BootStrapperTest):  FAILED
    [junit] expected:<1> but was:<0>
    [junit] junit.framework.AssertionFailedError: expected:<1> but was:<0>
    [junit]     at org.apache.cassandra.dht.BootStrapperTest.testSourceTargetComputation(BootStrapperTest.java:212)
    [junit]     at org.apache.cassandra.dht.BootStrapperTest.testSourceTargetComputation(BootStrapperTest.java:173)
    [junit] 
    [junit] 
    [junit] Test org.apache.cassandra.dht.BootStrapperTest FAILED

BUILD FAILED
/home/mshuler/git/cassandra/build.xml:1113: The following error occurred while executing this line:
/home/mshuler/git/cassandra/build.xml:1078: Some unit test(s) failed.

Total time: 9 seconds
((4be9e67...)|BISECTING)mshuler@hana:~/git/cassandra$ git bisect bad
4be9e6720d9f94a83aa42153c3e71ae1e557d2d9 is the first bad commit
commit 4be9e6720d9f94a83aa42153c3e71ae1e557d2d9
Author: Aleksey Yeschenko <aleksey@apache.org>
Date:   Sun Dec 15 13:29:56 2013 +0300

    Improve batchlog write performance with vnodes
    
    patch by Jonathan Ellis and Rick Branson; reviewed by Aleksey Yeschenko
    for CASSANDRA-6488

:100644 100644 e5865925f160faabc2506c3a5aac9985c17c1658 b55393b2ed138011bab52f95f2e9b52107709938
M      CHANGES.txt
:040000 040000 dea10aa8044e10eb60002e75f2586a9c8e94b647 7030c09f9713bd3e342e4e012c59b09c86b79a42
M      src
{code}

> Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6488
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6488
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Rick Branson
>            Assignee: Rick Branson
>             Fix For: 1.2.13, 2.0.4
>
>         Attachments: 6488-rbranson-patch.txt, 6488-v2.txt, 6488-v3.txt, graph (21).png
>
>
> The cloneOnlyTokenMap() call in StorageProxy.getBatchlogEndpoints consumes enormous amounts
> of CPU on clusters with many vnodes. I created a patch to cache this data as a workaround
> and deployed it to a production cluster with 15,000 tokens. CPU consumption dropped to
> 1/5th. This highlights the overall issues with cloneOnlyTokenMap() calls on vnodes clusters.
> I'm including the maybe-not-the-best-quality workaround patch to use as a reference, but
> cloneOnlyTokenMap() is a systemic issue and every place it's called should probably be
> investigated.
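
As a rough illustration of the caching idea described in the report (this is not the attached patch and not Cassandra's actual TokenMetadata code), a minimal Java sketch might look like the following. The class CachedRingSnapshot, its methods, and the Long-token-to-String-endpoint map are hypothetical stand-ins chosen for the example. The assumption is that ring changes are rare compared with batchlog writes, so one immutable copy can be shared between readers and rebuilt only after the ring changes.

```java
import java.util.Collections;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a token ring; not Cassandra's TokenMetadata.
class CachedRingSnapshot {
    // Mutable ring state, guarded by the object's monitor.
    private final TreeMap<Long, String> tokenToEndpoint = new TreeMap<>();
    // Shared read-only copy; null means "stale, rebuild on next read".
    private final AtomicReference<SortedMap<Long, String>> cached = new AtomicReference<>();
    // Counts how many full copies were made (for demonstration only).
    private final AtomicInteger cloneCount = new AtomicInteger();

    // Any ring change invalidates the cached copy.
    synchronized void updateToken(long token, String endpoint) {
        tokenToEndpoint.put(token, endpoint);
        cached.set(null);
    }

    // Returns a read-only snapshot, copying the ring only if the cache is stale.
    SortedMap<Long, String> snapshot() {
        SortedMap<Long, String> current = cached.get();
        if (current != null) {
            return current; // fast path: no copy, no lock
        }
        synchronized (this) {
            current = cached.get();
            if (current == null) {
                current = Collections.unmodifiableSortedMap(new TreeMap<>(tokenToEndpoint));
                cloneCount.incrementAndGet();
                cached.set(current);
            }
            return current;
        }
    }

    int cloneCount() {
        return cloneCount.get();
    }

    public static void main(String[] args) {
        CachedRingSnapshot ring = new CachedRingSnapshot();
        for (long token = 0; token < 15_000; token++) {
            ring.updateToken(token, "10.0.0." + (token % 16));
        }
        // Many reads between ring changes share one copy.
        for (int i = 0; i < 1_000; i++) {
            ring.snapshot();
        }
        System.out.println("copies after 1000 reads: " + ring.cloneCount());
        // A ring change forces exactly one more copy on the next read.
        ring.updateToken(15_000L, "10.0.0.99");
        ring.snapshot();
        System.out.println("copies after one ring change: " + ring.cloneCount());
    }
}
```

The trade-off in this sketch is that callers receive a read-only view and must not expect to mutate it; a caller that needs a private mutable copy would still have to clone.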



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
