Date: Thu, 12 Nov 2015 23:24:12 +0000 (UTC)
From: "Ariel Weisberg (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Comment Edited] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

    [ https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15003017#comment-15003017 ]

Ariel Weisberg edited comment on CASSANDRA-7217 at 11/12/15 11:24 PM:
----------------------------------------------------------------------

Performance counters

2000 threads
{code}
Results:
op rate                   : 20576 [WRITE:20576]
partition rate            : 20576 [WRITE:20576]
row rate                  : 20576 [WRITE:20576]
latency mean              : 97.2 [WRITE:97.2]
latency median            : 91.0 [WRITE:91.0]
latency 95th percentile   : 179.1 [WRITE:179.1]
latency 99th percentile   : 268.3 [WRITE:268.3]
latency 99.9th percentile : 499.0 [WRITE:499.0]
latency max               : 1123.2 [WRITE:1123.2]
Total partitions          : 19000000 [WRITE:19000000]
Total errors              : 0 [WRITE:0]
total gc count            : 0
total gc mb               : 0
total gc time (s)         : 0
avg gc time(ms)           : NaN
stdev gc time(ms)         : 0
Total operation time      : 00:15:23
END

Performance counter stats for './cassandra-stress write n=19000000 -rate threads=2000 -mode native cql3 -node 192.168.1.9':

 3,236,123,141,155  cycles                     # 2.115 GHz  [16.14%]
 2,580,132,815,701  instructions               # 0.80 insns per cycle
                                               # 0.89 stalled cycles per insn  [21.45%]
    63,994,020,523  cache-references           # 41.828 M/sec  [26.72%]
    12,523,946,172  cache-misses               # 19.570 % of all cache refs  [32.00%]
 2,294,356,584,027  idle-cycles-frontend       # 70.90% frontend cycles idle  [37.28%]
 1,636,932,476,246  idle-cycles-backend        # 50.58% backend cycles idle  [42.54%]
    1529337.521837  cpu-clock (msec)
    1529938.883184  task-clock (msec)          # 1.635 CPUs utilized
           129,217  page-faults                # 0.084 K/sec
        87,687,956  cs                         # 0.057 M/sec
        36,591,482  migrations                 # 0.024 M/sec
           129,132  minor-faults               # 0.084 K/sec
   360,467,544,173  branch-instructions        # 235.609 M/sec  [47.81%]
     5,205,849,494  branch-misses              # 1.44% of all branches  [47.76%]
    67,636,847,959  L1-dcache-load-misses      # 44.209 M/sec  [47.83%]
    24,113,350,939  L1-dcache-store-misses     # 15.761 M/sec  [47.94%]
    18,928,905,359  L1-dcache-prefetch-misses  # 12.372 M/sec  [42.84%]
    56,721,903,854  L1-icache-load-misses      # 37.075 M/sec  [42.94%]
     3,977,754,938  dTLB-load-misses           # 2.600 M/sec  [42.96%]
       748,817,996  dTLB-store-misses          # 0.489 M/sec  [42.93%]
       791,352,271  iTLB-load-misses           # 0.517 M/sec  [42.86%]
     5,414,521,445  branch-load-misses         # 3.539 M/sec  [42.80%]
    37,275,666,810  LLC-loads                  # 24.364 M/sec  [42.83%]
    10,226,436,059  LLC-stores                 # 6.684 M/sec  [42.80%]
    16,548,689,552  LLC-prefetches             # 10.817 M/sec  [10.57%]

     935.835191719 seconds time elapsed
{code}

500 threads
{code}
Results:
op rate                   : 63563 [WRITE:63563]
partition rate            : 63563 [WRITE:63563]
row rate                  : 63563 [WRITE:63563]
latency mean              : 7.9 [WRITE:7.9]
latency median            : 5.8 [WRITE:5.8]
latency 95th percentile   : 16.2 [WRITE:16.2]
latency 99th percentile   : 36.3 [WRITE:36.3]
latency 99.9th percentile : 74.0 [WRITE:74.0]
latency max               : 422.0 [WRITE:422.0]
Total partitions          : 19000000 [WRITE:19000000]
Total errors              : 0 [WRITE:0]
total gc count            : 0
total gc mb               : 0
total gc time (s)         : 0
avg gc time(ms)           : NaN
stdev gc time(ms)         : 0
Total operation time      : 00:04:58
END

Performance counter stats for './cassandra-stress write n=19000000 -rate threads=500 -mode native cql3 -node 192.168.1.9':

 1,967,800,644,333  cycles                     # 2.424 GHz  [16.23%]
 1,939,192,725,937  instructions               # 0.99 insns per cycle
                                               # 0.67 stalled cycles per insn  [21.56%]
    29,961,702,909  cache-references           # 36.915 M/sec  [26.87%]
     7,138,097,546  cache-misses               # 23.824 % of all cache refs  [32.16%]
 1,290,923,581,701  idle-cycles-frontend       # 65.60% frontend cycles idle  [37.44%]
   827,710,334,443  idle-cycles-backend        # 42.06% backend cycles idle  [42.67%]
     811637.475308  cpu-clock (msec)
     811646.201981  task-clock (msec)          # 2.618 CPUs utilized
            79,867  page-faults                # 0.098 K/sec
        34,954,827  cs                         # 0.043 M/sec
         1,803,328  migrations                 # 0.002 M/sec
            79,531  minor-faults               # 0.098 K/sec
   216,302,396,604  branch-instructions        # 266.498 M/sec  [47.89%]
     2,293,191,606  branch-misses              # 1.06% of all branches  [47.75%]
    36,684,160,264  L1-dcache-load-misses      # 45.197 M/sec  [47.69%]
    15,585,249,129  L1-dcache-store-misses     # 19.202 M/sec  [47.62%]
    14,137,121,831  L1-dcache-prefetch-misses  # 17.418 M/sec  [42.28%]
    33,608,185,424  L1-icache-load-misses      # 41.407 M/sec  [42.28%]
     2,489,611,820  dTLB-load-misses           # 3.067 M/sec  [42.26%]
       371,870,411  dTLB-store-misses          # 0.458 M/sec  [42.27%]
       512,108,974  iTLB-load-misses           # 0.631 M/sec  [42.28%]
     2,280,308,348  branch-load-misses         # 2.809 M/sec  [42.31%]
    16,344,737,798  LLC-loads                  # 20.138 M/sec  [42.38%]
     3,477,812,875  LLC-stores                 # 4.285 M/sec  [42.43%]
     9,526,173,996  LLC-prefetches             # 11.737 M/sec  [10.69%]

     310.036724914 seconds time elapsed
{code}


was (Author: aweisberg):
Performance counters

2000 threads
{code}
Results:
op rate                   : 19419 [WRITE:19419]
partition rate            : 19419 [WRITE:19419]
row rate                  : 19419 [WRITE:19419]
latency mean              : 103.0 [WRITE:103.0]
latency median            : 91.3 [WRITE:91.3]
latency 95th percentile   : 179.4 [WRITE:179.4]
latency 99th percentile   : 252.3 [WRITE:252.3]
latency 99.9th percentile : 428.5 [WRITE:428.5]
latency max               : 57651.8 [WRITE:57651.8]
Total partitions          : 19000000 [WRITE:19000000]
Total errors              : 0 [WRITE:0]
total gc count            : 0
total gc mb               : 0
total gc time (s)         : 0
avg gc time(ms)           : NaN
stdev gc time(ms)         : 0
Total operation time      : 00:16:18
END

Performance counter stats for './cassandra-stress write n=19000000 -rate threads=2000 -mode native cql3 -node 192.168.1.9':

 3,320,451,421,007  cycles                     # 2.192 GHz  [15.41%]
 2,563,758,232,484  instructions               # 0.77 insns per cycle
                                               # 0.94 stalled cycles per insn  [20.47%]
    69,188,067,241  cache-references           # 45.664 M/sec  [25.56%]
    13,456,198,724  cache-misses               # 19.449 % of all cache refs  [30.60%]
   131,776,347,830  bus-cycles                 # 86.973 M/sec  [35.65%]
 2,415,412,133,089  idle-cycles-frontend       # 72.74% frontend cycles idle  [40.69%]
 1,750,197,198,741  idle-cycles-backend        # 52.71% backend cycles idle  [45.75%]
    1514363.238593  cpu-clock (msec)
    1515146.390785  task-clock (msec)          # 1.530 CPUs utilized
           154,815  page-faults                # 0.102 K/sec
        87,357,050  cs                         # 0.058 M/sec
        37,030,093  migrations                 # 0.024 M/sec
           154,691  minor-faults               # 0.102 K/sec
                 0  major-faults               # 0.000 K/sec
                 0  alignment-faults           # 0.000 K/sec
                 0  emulation-faults           # 0.000 K/sec
   358,579,878,595  branch-instructions        # 236.664 M/sec  [45.74%]
     5,088,330,722  branch-misses              # 1.42% of all branches  [45.80%]
    70,350,080,393  L1-dcache-load-misses      # 46.431 M/sec  [45.92%]
    24,626,765,787  L1-dcache-store-misses     # 16.254 M/sec  [40.88%]
    19,812,757,638  L1-dcache-prefetch-misses  # 13.076 M/sec  [40.97%]
    59,285,911,291  L1-icache-load-misses      # 39.129 M/sec  [40.92%]
     4,437,071,985  dTLB-load-misses           # 2.928 M/sec  [40.90%]
       821,151,709  dTLB-store-misses          # 0.542 M/sec  [40.80%]
     1,188,402,914  iTLB-load-misses           # 0.784 M/sec  [40.66%]
     5,274,857,779  branch-load-misses         # 3.481 M/sec  [40.58%]
    39,293,189,238  LLC-loads                  # 25.934 M/sec  [40.47%]
    10,625,403,856  LLC-stores                 # 7.013 M/sec  [40.45%]
    16,978,686,645  LLC-prefetches             # 11.206 M/sec  [10.08%]

     990.019887601 seconds time elapsed
{code}

500 threads
{code}
Results:
op rate                   : 63678 [WRITE:63678]
partition rate            : 63678 [WRITE:63678]
row rate                  : 63678 [WRITE:63678]
latency mean              : 7.8 [WRITE:7.8]
latency median            : 5.6 [WRITE:5.6]
latency 95th percentile   : 16.8 [WRITE:16.8]
latency 99th percentile   : 36.5 [WRITE:36.5]
latency 99.9th percentile : 77.5 [WRITE:77.5]
latency max               : 358.8 [WRITE:358.8]
Total partitions          : 19000000 [WRITE:19000000]
Total errors              : 0 [WRITE:0]
total gc count            : 0
total gc mb               : 0
total gc time (s)         : 0
avg gc time(ms)           : NaN
stdev gc time(ms)         : 0
Total operation time      : 00:04:58
END

Performance counter stats for './cassandra-stress write n=19000000 -rate threads=500 -mode native cql3 -node 192.168.1.9':

 2,055,138,822,781  cycles                     # 2.519 GHz  [15.25%]
 1,923,953,212,761  instructions               # 0.94 insns per cycle
                                               # 0.71 stalled cycles per insn  [20.30%]
    31,745,552,527  cache-references           # 38.904 M/sec  [25.33%]
     6,931,345,766  cache-misses               # 21.834 % of all cache refs  [30.35%]
    79,818,924,716  bus-cycles                 # 97.818 M/sec  [35.35%]
 1,374,763,901,585  idle-cycles-frontend       # 66.89% frontend cycles idle  [40.37%]
   891,429,827,525  idle-cycles-backend        # 43.38% backend cycles idle  [45.35%]
     815994.442406  cpu-clock (msec)
     815998.411396  task-clock (msec)          # 2.635 CPUs utilized
            84,202  page-faults                # 0.103 K/sec
        34,375,605  cs                         # 0.042 M/sec
         1,661,307  migrations                 # 0.002 M/sec
            83,803  minor-faults               # 0.103 K/sec
                 0  major-faults               # 0.000 K/sec
                 0  alignment-faults           # 0.000 K/sec
                 0  emulation-faults           # 0.000 K/sec
   219,082,315,466  branch-instructions        # 268.484 M/sec  [45.30%]
     2,321,109,537  branch-misses              # 1.06% of all branches  [45.35%]
    37,321,647,256  L1-dcache-load-misses      # 45.737 M/sec  [45.40%]
    15,702,399,931  L1-dcache-store-misses     # 19.243 M/sec  [40.39%]
    14,082,194,661  L1-dcache-prefetch-misses  # 17.258 M/sec  [40.47%]
    35,512,444,743  L1-icache-load-misses      # 43.520 M/sec  [40.47%]
     2,048,574,473  dTLB-load-misses           # 2.511 M/sec  [40.46%]
       338,040,710  dTLB-store-misses          # 0.414 M/sec  [40.47%]
       680,218,846  iTLB-load-misses           # 0.834 M/sec  [40.47%]
     2,316,842,085  branch-load-misses         # 2.839 M/sec  [40.44%]
    16,883,500,935  LLC-loads                  # 20.691 M/sec  [40.41%]
     3,542,330,824  LLC-stores                 # 4.341 M/sec  [40.37%]
     9,938,493,897  LLC-prefetches             # 12.180 M/sec  [10.04%]

     309.643226007 seconds time elapsed
{code}


> Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads
> -------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7217
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7217
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Benedict
>            Assignee: Ariel Weisberg
>              Labels: performance, stress, triaged
>             Fix For: 3.1
>
>
> This is obviously bad. Let's figure out why it's happening and put a stop to it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)