cassandra-commits mailing list archives

From "Constance Eustace (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8641) Repair causes a large number of tiny SSTables
Date Tue, 28 Apr 2015 14:55:07 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14517138#comment-14517138 ]

Constance Eustace commented on CASSANDRA-8641:
----------------------------------------------

We are encountering huge numbers of SSTables as well: we've seen 250,000+ files for a paltry 4 GB of data (RF=3, 3 nodes, vnodes enabled, incremental parallel repair).

Repairs do not complete, and compactions take a very long time.
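For anyone hitting the same symptom, a quick way to confirm the file explosion from the shell is to tally each table's *-Data.db components on disk. This is a sketch, not an official diagnostic; the data directory below is an assumption taken from the paths in the logs further down and may differ on your install:

```shell
# Count live SSTables per table by tallying their *-Data.db components.
# DATA_DIR is an assumption (taken from the log paths in this ticket);
# override it to match your cassandra.yaml data_file_directories.
DATA_DIR="${DATA_DIR:-/mnt/data/cassandra/data}"
find "$DATA_DIR" -name '*-Data.db' 2>/dev/null \
  | awk -F/ '{ print $(NF-1) }' \
  | sort | uniq -c | sort -rn
```

A table that `nodetool cfstats` reports with an SSTable count in the six figures should show a similar count here.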

> Repair causes a large number of tiny SSTables
> ---------------------------------------------
>
>                 Key: CASSANDRA-8641
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8641
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Ubuntu 14.04
>            Reporter: Flavien Charlon
>             Fix For: 2.1.3
>
>
> I have a 3-node cluster with RF = 3, quad-core and 32 GB of RAM. I am running 2.1.2 with all the default settings. I'm seeing some strange behavior during incremental repair (under write load).
> Taking the example of one particular column family: before running an incremental repair, I have about 13 SSTables. After finishing the incremental repair, I have over 114,000 SSTables.
> {noformat}
> Table: customers
> SSTable count: 114688
> Space used (live): 97203707290
> Space used (total): 99175455072
> Space used by snapshots (total): 0
> SSTable Compression Ratio: 0.28281112416526505
> Memtable cell count: 0
> Memtable data size: 0
> Memtable switch count: 1069
> Local read count: 0
> Local read latency: NaN ms
> Local write count: 11548705
> Local write latency: 0.030 ms
> Pending flushes: 0
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.00000
> Bloom filter space used: 144145152
> Compacted partition minimum bytes: 311
> Compacted partition maximum bytes: 1996099046
> Compacted partition mean bytes: 3419
> Average live cells per slice (last five minutes): 0.0
> Maximum live cells per slice (last five minutes): 0.0
> Average tombstones per slice (last five minutes): 0.0
> Maximum tombstones per slice (last five minutes): 0.0
> {noformat}
> Looking at the logs during the repair, it seems Cassandra is struggling through compactions of minuscule SSTables (often just a few kilobytes):
> {noformat}
> INFO  [CompactionExecutor:337] 2015-01-17 01:44:27,011 CompactionTask.java:251 - Compacted 32 sstables to [/mnt/data/cassandra/data/business/customers-d9d42d209ccc11e48ca54553c90a9d45/business-customers-ka-228341,].  8,332 bytes to 6,547 (~78% of original) in 80,476ms = 0.000078MB/s.  32 total partitions merged to 32.  Partition merge counts were {1:32, }
> INFO  [CompactionExecutor:337] 2015-01-17 01:45:35,519 CompactionTask.java:251 - Compacted 32 sstables to [/mnt/data/cassandra/data/business/customers-d9d42d209ccc11e48ca54553c90a9d45/business-customers-ka-229348,].  8,384 bytes to 6,563 (~78% of original) in 6,880ms = 0.000910MB/s.  32 total partitions merged to 32.  Partition merge counts were {1:32, }
> INFO  [CompactionExecutor:339] 2015-01-17 01:47:46,475 CompactionTask.java:251 - Compacted 32 sstables to [/mnt/data/cassandra/data/business/customers-d9d42d209ccc11e48ca54553c90a9d45/business-customers-ka-229351,].  8,423 bytes to 6,401 (~75% of original) in 10,416ms = 0.000586MB/s.  32 total partitions merged to 32.  Partition merge counts were {1:32, }
> {noformat}
>  
> Here is an excerpt of the system logs showing the abnormal flushing:
> {noformat}
> INFO  [AntiEntropyStage:1] 2015-01-17 15:28:43,807 ColumnFamilyStore.java:840 - Enqueuing flush of customers: 634484 (0%) on-heap, 2599489 (0%) off-heap
> INFO  [AntiEntropyStage:1] 2015-01-17 15:29:06,823 ColumnFamilyStore.java:840 - Enqueuing flush of levels: 129504 (0%) on-heap, 222168 (0%) off-heap
> INFO  [AntiEntropyStage:1] 2015-01-17 15:29:07,940 ColumnFamilyStore.java:840 - Enqueuing flush of chain: 4508 (0%) on-heap, 6880 (0%) off-heap
> INFO  [AntiEntropyStage:1] 2015-01-17 15:29:08,124 ColumnFamilyStore.java:840 - Enqueuing flush of invoices: 1469772 (0%) on-heap, 2542675 (0%) off-heap
> INFO  [AntiEntropyStage:1] 2015-01-17 15:29:09,471 ColumnFamilyStore.java:840 - Enqueuing flush of customers: 809844 (0%) on-heap, 3364728 (0%) off-heap
> INFO  [AntiEntropyStage:1] 2015-01-17 15:29:24,368 ColumnFamilyStore.java:840 - Enqueuing flush of levels: 28212 (0%) on-heap, 44220 (0%) off-heap
> INFO  [AntiEntropyStage:1] 2015-01-17 15:29:24,822 ColumnFamilyStore.java:840 - Enqueuing flush of chain: 860 (0%) on-heap, 1130 (0%) off-heap
> INFO  [AntiEntropyStage:1] 2015-01-17 15:29:24,985 ColumnFamilyStore.java:840 - Enqueuing flush of invoices: 334480 (0%) on-heap, 568959 (0%) off-heap
> INFO  [AntiEntropyStage:1] 2015-01-17 15:29:27,375 ColumnFamilyStore.java:840 - Enqueuing flush of customers: 221568 (0%) on-heap, 929962 (0%) off-heap
> INFO  [AntiEntropyStage:1] 2015-01-17 15:29:35,755 ColumnFamilyStore.java:840 - Enqueuing flush of invoices: 7916 (0%) on-heap, 11080 (0%) off-heap
> INFO  [AntiEntropyStage:1] 2015-01-17 15:29:36,239 ColumnFamilyStore.java:840 - Enqueuing flush of customers: 9968 (0%) on-heap, 33041 (0%) off-heap
> INFO  [AntiEntropyStage:1] 2015-01-17 15:29:37,935 ColumnFamilyStore.java:840 - Enqueuing flush of invoices: 42108 (0%) on-heap, 69494 (0%) off-heap
> INFO  [AntiEntropyStage:1] 2015-01-17 15:29:41,182 ColumnFamilyStore.java:840 - Enqueuing flush of customers: 40936 (0%) on-heap, 159099 (0%) off-heap
> INFO  [AntiEntropyStage:1] 2015-01-17 15:29:49,573 ColumnFamilyStore.java:840 - Enqueuing flush of levels: 17236 (0%) on-heap, 27048 (0%) off-heap
> INFO  [AntiEntropyStage:1] 2015-01-17 15:29:50,440 ColumnFamilyStore.java:840 - Enqueuing flush of chain: 548 (0%) on-heap, 630 (0%) off-heap
> {noformat}
> At the end of the repair, the cluster has become unusable.
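The throughput figures in the compaction log excerpt are easy to sanity-check; they appear to be post-compaction size divided by elapsed time, with 1 MB taken as 1024*1024 bytes (an assumption, but the arithmetic matches all three logged entries). A one-liner reproducing the first figure, using only numbers from the log:

```shell
# Recompute the first compaction's throughput: 6,547 output bytes in 80,476 ms.
# Matching the logged 0.000078MB/s confirms the MB = 1024*1024 assumption.
awk 'BEGIN { printf "%.6f MB/s\n", 6547 / (80476 / 1000) / (1024 * 1024) }'
```

Spending over a minute to merge a few kilobytes suggests the compaction executors were dominated by per-SSTable overhead rather than data volume, which is consistent with the 114,688 SSTable count above.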



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
