incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alex Li <ale...@gmail.com>
Subject Re: Periodically hiccups
Date Fri, 23 Apr 2010 02:36:14 GMT
Thank you for the tip about microseconds, the stats matches then. Thank you!

Two more things happened:

    - One compaction took 19s, is that normal?
    - The whole ring was having trouble for a while, any ideas?

Thanks.

 INFO [FLUSH-WRITER-POOL:1] 2010-04-22 19:13:13,247 Memtable.java (line 
162) Completed flushing /var/lib/cassandra/data/Pandora/Folder-1893-Data.db
 INFO [COMPACTION-POOL:1] 2010-04-22 19:13:13,248 CompactionManager.java 
(line 246) Compacting 
[org.apache.cassandra.io.SSTableReader(path='/var/lib/cassandra/data/Pandora/Folder-1890-Data.db'),org.apache.cassandra.io.SSTableReader(path='/var/lib/cassandra/data/Pandora/Folder-1891-Data.db'),org.apache.cassandra.io.SSTableReader(path='/var/lib/cassandra/data/Pandora/Folder-1892-Data.db'),org.apache.cassandra.io.SSTableReader(path='/var/lib/cassandra/data/Pandora/Folder-1893-Data.db')]
 INFO [COMPACTION-POOL:1] 2010-04-22 19:13:32,425 CompactionManager.java 
(line 326) Compacted to 
/var/lib/cassandra/data/Pandora/Folder-1894-Data.db.  71368197/53308707 
bytes for 45168 keys.  Time: *18994ms.*


INFO [Timer-1] 2010-04-22 19:30:50,239 Gossiper.java (line 179) 
InetAddress /10.10.104.16 is now dead.
 INFO [Timer-1] 2010-04-22 19:30:50,239 Gossiper.java (line 179) 
InetAddress /10.10.104.15 is now dead.
 INFO [Timer-1] 2010-04-22 19:30:50,240 Gossiper.java (line 179) 
InetAddress /10.10.104.13 is now dead.
 INFO [Timer-1] 2010-04-22 19:30:50,240 Gossiper.java (line 179) 
InetAddress /10.10.104.11 is now dead.
 INFO [GMFD:1] 2010-04-22 19:30:51,622 Gossiper.java (line 568) 
InetAddress /10.10.104.15 is now UP
 INFO [GMFD:1] 2010-04-22 19:30:51,622 Gossiper.java (line 568) 
InetAddress /10.10.104.11 is now UP
 INFO [GMFD:1] 2010-04-22 19:30:51,665 Gossiper.java (line 568) 
InetAddress /10.10.104.13 is now UP
 INFO [GMFD:1] 2010-04-22 19:30:55,377 Gossiper.java (line 568) 
InetAddress /10.10.104.16 is now UP
 INFO [Timer-1] 2010-04-22 19:31:15,310 Gossiper.java (line 179) 
InetAddress /10.10.104.16 is now dead.
 INFO [GMFD:1] 2010-04-22 19:31:15,866 Gossiper.java (line 568) 
InetAddress /10.10.104.16 is now UP


Alex Li wrote:
> Hello,
>
> We recently deployed a cluster of 5 Cassandra nodes into production, 
> and ran into big problems with periodically hiccups (individual node 
> goes down, high CPU, client connection timeout). It was terrible with 
> 0.5 (one hiccups every 5-10 minutes), today we upgraded to 0.6.1, it 
> happens less frequently now (likely once every 30 minutes or so). But 
> it is still quite frustrating.
>
> We used ReplicationFactor=3 for all column families. 5 nodes are 
> behind haproxy. Java client goes through haproxy. The most obvious 
> behavior is: as soon as one node goes down, the connections between 
> haproxy and Cassandra nodes just shoot up to 1000 (in normal case it 
> is stable at 40, which should be really trivial for Cassandra), and 
> the connections don't go down for quite a while. Meanwhile Java 
> clients just get all kind of TimeoutException, then kept on retrying. 
> Eventually we have to restart haproxy, then things go back to normal.
>
> Each node has 5GB max heap, powerful enough CPU (quad-core), software 
> RAID mirror. We are definitely NOT putting lots of load yet, mostly 
> 20-50 concurrent requests to Cassandra, but it is not holding up! 
> Please help, we are on the verge of giving up Cassandra after 5 days 
> of periodic "outage".
>
> Couple observations:
>
> - cfstats shows significant read latency on "system" keyspace, almost 
> 5s (see below)
>
> - RecentReadLatencyMicros and RecentWriteLatencyMicros are super high 
> for StorageProxy, as well as every column family in JMX: up to 
> 152676.92 and 6950 (they are in ms, right?). However, in cfstats, they 
> are quite small.
>
> - Every second we see 5-10 DigestMismatchException in the log:
>
> INFO [pool-1-thread-15857] 2010-04-22 00:37:37,887 StorageProxy.java 
> (line 499) DigestMismatchException: Mismatch for key 1068022523 
> (d41d8cd98f00b204e9800998ecf8427e vs 0dd4cdaeeb1a334ae133c6955e109629)
>
> Please advice. Thank you!
>
> storage-conf, cfstats, tpstats are listed below:
>
> root@cdb-006:/glass/sfw/cassandra# cat conf/storage-conf.xml
>
> <Storage>
>
>    <AutoBootstrap>false</AutoBootstrap>
>    <Keyspaces>
>        <Keyspace Name="Pandora">
>   
>            
> <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>

>
>            <ReplicationFactor>3</ReplicationFactor>
>            
> <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>

>
>        </Keyspace>
>        </Keyspaces>
>
>
>    <RpcTimeoutInMillis>10000</RpcTimeoutInMillis>
>    <CommitLogRotationThresholdInMB>128</CommitLogRotationThresholdInMB>
>
>    <DiskAccessMode>auto</DiskAccessMode>
>    <RowWarningThresholdInMB>512</RowWarningThresholdInMB>
>    <SlicedBufferSizeInKB>64</SlicedBufferSizeInKB>
>    <FlushDataBufferSizeInMB>32</FlushDataBufferSizeInMB>
>    <FlushIndexBufferSizeInMB>8</FlushIndexBufferSizeInMB>
>    <ColumnIndexSizeInKB>64</ColumnIndexSizeInKB>
>    <MemtableThroughputInMB>64</MemtableThroughputInMB>
>    <BinaryMemtableThroughputInMB>256</BinaryMemtableThroughputInMB>
>    <MemtableOperationsInMillions>0.3</MemtableOperationsInMillions>
>       <MemtableFlushAfterMinutes>60</MemtableFlushAfterMinutes>
>
>    <!--
>               ~ Unlike most systems, in Cassandra writes are faster 
> than reads, so
>               ~ you can afford more of those in parallel.  A good rule 
> of thumb is 2
>               ~ concurrent reads per processor core.  Increase 
> ConcurrentWrites to
>               ~ the number of clients writing at once if you enable 
> CommitLogSync +
>               ~ CommitLogSyncDelay. -->
>    <ConcurrentReads>8</ConcurrentReads>
>    <ConcurrentWrites>32</ConcurrentWrites>
>
>    <!--
>               ~ CommitLogSync may be either "periodic" or "batch."  
> When in batch
>               ~ mode, Cassandra won't ack writes until the commit log 
> has been
>               ~ fsynced to disk.  It will wait up to 
> CommitLogSyncBatchWindowInMS
>               ~ milliseconds for other writes, before performing the 
> sync.
>
>               ~ This is less necessary in Cassandra than in 
> traditional databases
>               ~ since replication reduces the odds of losing data from 
> a failure
>               ~ after writing the log entry but before it actually 
> reaches the disk.
>               ~ So the other option is "periodic," where writes may be 
> acked immediately
>               ~ and the CommitLog is simply synced every 
> CommitLogSyncPeriodInMS
>               ~ milliseconds.
>              -->
>    <CommitLogSync>periodic</CommitLogSync>
>    <!--
>               ~ Interval at which to perform syncs of the CommitLog in 
> periodic mode.
>               ~ Usually the default of 10000ms is fine; increase it if 
> your i/o
>               ~ load is such that syncs are taking excessively long 
> times.
>              -->
>    <CommitLogSyncPeriodInMS>10000</CommitLogSyncPeriodInMS>
>    <!--
>               ~ Delay (in milliseconds) during which additional commit 
> log entries
>               ~ may be written before fsync in batch mode.  This will 
> increase
>               ~ latency slightly, but can vastly improve throughput 
> where there are
>               ~ many writers.  Set to zero to disable (each entry will 
> be synced
>               ~ individually).  Reasonable values range from a minimal 
> 0.1 to 10 or
>               ~ even more if throughput matters more than latency.
>              -->
>    <!-- <CommitLogSyncBatchWindowInMS>1</CommitLogSyncBatchWindowInMS>

> -->
>
>    <!--
>               ~ Time to wait before garbage-collection deletion 
> markers.  Set this to
>               ~ a large enough value that you are confident that the 
> deletion marker
>               ~ will be propagated to all replicas by the time this 
> many seconds has
>               ~ elapsed, even in the face of hardware failures.  The 
> default value is
>               ~ ten days.
>              -->
>    <GCGraceSeconds>864000</GCGraceSeconds>
> </Storage>
> root@cdb-006:/glass/sfw/cassandra# bin/nodetool -h localhost cfstats
> Keyspace: system
>    Read Count: 878
>    Read Latency: 5752.042634396355 ms.
>    Write Count: 2260398
>    Write Latency: 0.014567047926957996 ms.
>    Pending Tasks: 0
>        Column Family: LocationInfo
>        SSTable count: 2
>        Space used (live): 3569
>        Space used (total): 3569
>        Memtable Columns Count: 0
>        Memtable Data Size: 0
>        Memtable Switch Count: 1
>        Read Count: 1
>        Read Latency: NaN ms.
>        Write Count: 6
>        Write Latency: NaN ms.
>        Pending Tasks: 0
>        Key cache capacity: 2
>        Key cache size: 1
>        Key cache hit rate: NaN
>        Row cache: disabled
>        Compacted row minimum size: 0
>        Compacted row maximum size: 0
>        Compacted row mean size: 0
>
>        Column Family: HintsColumnFamily
>        SSTable count: 2
>        Space used (live): 70272035
>        Space used (total): 70272035
>        Memtable Columns Count: 56264
>        Memtable Data Size: 486854
>        Memtable Switch Count: 21
>        Read Count: 877
>        Read Latency: 13614.412 ms.
>        Write Count: 2260392
>        Write Latency: 0.142 ms.
>        Pending Tasks: 0
>        Key cache capacity: 2
>        Key cache size: 2
>        Key cache hit rate: 0.25
>        Row cache: disabled
>        Compacted row minimum size: 78567
>        Compacted row maximum size: 39561901
>        Compacted row mean size: 27878603
>
> ----------------
> Keyspace: Titan
>    Read Count: 8948702
>    Read Latency: 7.949136100185256 ms.
>    Write Count: 3393490
>    Write Latency: 0.19255415398306758 ms.
>    Pending Tasks: 0
>        Column Family: FbUser
>        SSTable count: 6
>        Space used (live): 3675014807
>        Space used (total): 3675014807
>        Memtable Columns Count: 250055
>        Memtable Data Size: 9146339
>        Memtable Switch Count: 9
>        Read Count: 6591406
>        Read Latency: 8.361 ms.
>        Write Count: 343030
>        Write Latency: 0.078 ms.
>        Pending Tasks: 0
>        Key cache capacity: 200000
>        Key cache size: 200000
>        Key cache hit rate: 0.6341912478864628
>        Row cache: disabled
>        Compacted row minimum size: 320
>        Compacted row maximum size: 586
>        Compacted row mean size: 479
>
>        Column Family: Payment
>        SSTable count: 3
>        Space used (live): 2473
>        Space used (total): 2473
>        Memtable Columns Count: 0
>        Memtable Data Size: 0
>        Memtable Switch Count: 0
>        Read Count: 27728
>        Read Latency: 0.059 ms.
>        Write Count: 0
>        Write Latency: NaN ms.
>        Pending Tasks: 0
>        Key cache capacity: 200000
>        Key cache size: 0
>        Key cache hit rate: NaN
>        Row cache: disabled
>        Compacted row minimum size: 0
>        Compacted row maximum size: 0
>        Compacted row mean size: 0
>
>        Column Family: Club
>        SSTable count: 6
>        Space used (live): 363609256
>        Space used (total): 477968226
>        Memtable Columns Count: 124134
>        Memtable Data Size: 38144903
>        Memtable Switch Count: 140
>        Read Count: 63668
>        Read Latency: 29.187 ms.
>        Write Count: 1966098
>        Write Latency: 0.184 ms.
>        Pending Tasks: 0
>        Key cache capacity: 200000
>        Key cache size: 63751
>        Key cache hit rate: 0.6543073524771221
>        Row cache: disabled
>        Compacted row minimum size: 274
>        Compacted row maximum size: 49784
>        Compacted row mean size: 7711
>
>        Column Family: RandomUsers
>        SSTable count: 1
>        Space used (live): 1329464
>        Space used (total): 1329464
>        Memtable Columns Count: 170433
>        Memtable Data Size: 5525529
>        Memtable Switch Count: 8
>        Read Count: 58236
>        Read Latency: 6.215 ms.
>        Write Count: 143339
>        Write Latency: 0.575 ms.
>        Pending Tasks: 0
>        Key cache capacity: 200000
>        Key cache size: 21
>        Key cache hit rate: 1.0
>        Row cache: disabled
>        Compacted row minimum size: 1244196
>        Compacted row maximum size: 1327409
>        Compacted row mean size: 1287929
>
>        Column Family: User
>        SSTable count: 4
>        Space used (live): 583812728
>        Space used (total): 583812728
>        Memtable Columns Count: 248833
>        Memtable Data Size: 7909687
>        Memtable Switch Count: 12
>        Read Count: 2207667
>        Read Latency: 8.413 ms.
>        Write Count: 941029
>        Write Latency: 0.107 ms.
>        Pending Tasks: 0
>        Key cache capacity: 200000
>        Key cache size: 94940
>        Key cache hit rate: 0.9769078238427732
>        Row cache: disabled
>        Compacted row minimum size: 259
>        Compacted row maximum size: 174340
>        Compacted row mean size: 3916
>
> ----------------
> Keyspace: Pandora
>    Read Count: 1475530
>    Read Latency: 5.249669284257182 ms.
>    Write Count: 856550
>    Write Latency: 0.16070848053236822 ms.
>    Pending Tasks: 0
>        Column Family: Folder
>        SSTable count: 7
>        Space used (live): 942926793
>        Space used (total): 942926793
>        Memtable Columns Count: 150119
>        Memtable Data Size: 814194
>        Memtable Switch Count: 21
>        Read Count: 604605
>        Read Latency: 7.539 ms.
>        Write Count: 553015
>        Write Latency: 0.173 ms.
>        Pending Tasks: 0
>        Key cache capacity: 200000
>        Key cache size: 111884
>        Key cache hit rate: 0.9770064886992118
>        Row cache: disabled
>        Compacted row minimum size: 234
>        Compacted row maximum size: 1644541
>        Compacted row mean size: 1238
>
>        Column Family: Attachment
>        SSTable count: 5
>        Space used (live): 754692822
>        Space used (total): 754692822
>        Memtable Columns Count: 548
>        Memtable Data Size: 7110
>        Memtable Switch Count: 8
>        Read Count: 22708
>        Read Latency: 6.950 ms.
>        Write Count: 29835
>        Write Latency: 0.025 ms.
>        Pending Tasks: 0
>        Key cache capacity: 200000
>        Key cache size: 8299
>        Key cache hit rate: 0.017857142857142856
>        Row cache: disabled
>        Compacted row minimum size: 269
>        Compacted row maximum size: 274
>        Compacted row mean size: 272
>
>        Column Family: Message
>        SSTable count: 7
>        Space used (live): 3689251440
>        Space used (total): 3689251440
>        Memtable Columns Count: 50399
>        Memtable Data Size: 1361773
>        Memtable Switch Count: 10
>        Read Count: 163630
>        Read Latency: 16.981 ms.
>        Write Count: 257843
>        Write Latency: 0.113 ms.
>        Pending Tasks: 0
>        Key cache capacity: 200000
>        Key cache size: 52427
>        Key cache hit rate: 0.5041365046535677
>        Row cache: disabled
>        Compacted row minimum size: 272
>        Compacted row maximum size: 2754
>        Compacted row mean size: 752
>
>        Column Family: FolderInfo
>        SSTable count: 0
>        Space used (live): 0
>        Space used (total): 0
>        Memtable Columns Count: 0
>        Memtable Data Size: 0
>        Memtable Switch Count: 0
>        Read Count: 8704
>        Read Latency: 0.008 ms.
>        Write Count: 0
>        Write Latency: NaN ms.
>        Pending Tasks: 0
>        Key cache capacity: 200000
>        Key cache size: 0
>        Key cache hit rate: NaN
>        Row cache: disabled
>        Compacted row minimum size: 0
>        Compacted row maximum size: 0
>        Compacted row mean size: 0
>
>        Column Family: StandardByUUID1
>        SSTable count: 0
>        Space used (live): 0
>        Space used (total): 0
>        Memtable Columns Count: 0
>        Memtable Data Size: 0
>        Memtable Switch Count: 0
>        Read Count: 8704
>        Read Latency: 0.005 ms.
>        Write Count: 0
>        Write Latency: NaN ms.
>        Pending Tasks: 0
>        Key cache capacity: 200000
>        Key cache size: 0
>        Key cache hit rate: NaN
>        Row cache: disabled
>        Compacted row minimum size: 0
>        Compacted row maximum size: 0
>        Compacted row mean size: 0
>
>        Column Family: User
>        SSTable count: 4
>        Space used (live): 120121309
>        Space used (total): 120121309
>        Memtable Columns Count: 1786
>        Memtable Data Size: 50657
>        Memtable Switch Count: 8
>        Read Count: 641067
>        Read Latency: 2.728 ms.
>        Write Count: 15857
>        Write Latency: 0.148 ms.
>        Pending Tasks: 0
>        Key cache capacity: 200000
>        Key cache size: 45965
>        Key cache hit rate: 0.957460499034818
>        Row cache: disabled
>        Compacted row minimum size: 271
>        Compacted row maximum size: 506
>        Compacted row mean size: 499
>
>        Column Family: Standard1
>        SSTable count: 0
>        Space used (live): 0
>        Space used (total): 0
>        Memtable Columns Count: 0
>        Memtable Data Size: 0
>        Memtable Switch Count: 0
>        Read Count: 8704
>        Read Latency: 0.005 ms.
>        Write Count: 0
>        Write Latency: NaN ms.
>        Pending Tasks: 0
>        Key cache capacity: 200000
>        Key cache size: 0
>        Key cache hit rate: NaN
>        Row cache: disabled
>        Compacted row minimum size: 0
>        Compacted row maximum size: 0
>        Compacted row mean size: 0
>
>        Column Family: Standard2
>        SSTable count: 0
>        Space used (live): 0
>        Space used (total): 0
>        Memtable Columns Count: 0
>        Memtable Data Size: 0
>        Memtable Switch Count: 0
>        Read Count: 8704
>        Read Latency: 0.006 ms.
>        Write Count: 0
>        Write Latency: NaN ms.
>        Pending Tasks: 0
>        Key cache capacity: 200000
>        Key cache size: 0
>        Key cache hit rate: NaN
>        Row cache: disabled
>        Compacted row minimum size: 0
>        Compacted row maximum size: 0
>        Compacted row mean size: 0
>
>        Column Family: Folder1
>        SSTable count: 0
>        Space used (live): 0
>        Space used (total): 0
>        Memtable Columns Count: 0
>        Memtable Data Size: 0
>        Memtable Switch Count: 0
>        Read Count: 8704
>        Read Latency: 0.005 ms.
>        Write Count: 0
>        Write Latency: NaN ms.
>        Pending Tasks: 0
>        Key cache capacity: 200000
>        Key cache size: 0
>        Key cache hit rate: NaN
>        Row cache: disabled
>        Compacted row minimum size: 0
>        Compacted row maximum size: 0
>        Compacted row mean size: 0
>
> ----------------
> root@cdb-006:/glass/sfw/cassandra# bin/nodetool -h localhost tpstats
> Pool Name                    Active   Pending      Completed
> FILEUTILS-DELETE-POOL             0         0             70
> STREAM-STAGE                      0         0              0
> RESPONSE-STAGE                    0         0       18298968
> ROW-READ-STAGE                    1         0       10168985
> LB-OPERATIONS                     0         0              0
> MESSAGE-DESERIALIZER-POOL         0         0       24960681
> GMFD                              0         0          93647
> LB-TARGET                         0         0              0
> CONSISTENCY-MANAGER               0         0              0
> ROW-MUTATION-STAGE                0         0        4280703
> MESSAGE-STREAMING-POOL            0         0             12
> LOAD-BALANCER-STAGE               0         0              0
> FLUSH-SORTER-POOL                 0         0              0
> MEMTABLE-POST-FLUSHER             0         0            224
> FLUSH-WRITER-POOL                 0         0            224
> AE-SERVICE-STAGE                  0         0             22
> HINTED-HANDOFF-POOL               1         5             73
>>
> Here are some system load data I collected.
>
> top - 00:43:43 up 1 day, 20:15,  2 users,  load average: 2.96, 2.74, 2.37
> Tasks: 160 total,   1 running, 159 sleeping,   0 stopped,   0 zombie
> Cpu0  : 10.9%us,  2.6%sy,  0.0%ni, 83.2%id,  3.2%wa,  0.0%hi,  
> 0.0%si,  0.0%st
> Cpu1  :  9.6%us,  2.2%sy,  0.0%ni, 85.4%id,  2.8%wa,  0.0%hi,  
> 0.0%si,  0.0%st
> Cpu2  : 37.1%us,  4.2%sy,  0.0%ni, 23.3%id, 33.8%wa,  1.2%hi,  
> 0.3%si,  0.0%st
> Cpu3  : 18.5%us,  3.9%sy,  0.0%ni, 62.6%id,  9.5%wa,  4.1%hi,  
> 1.4%si,  0.0%st
> Cpu4  : 11.1%us,  2.6%sy,  0.0%ni, 80.1%id,  3.2%wa,  2.9%hi,  
> 0.2%si,  0.0%st
> Cpu5  : 10.1%us,  2.7%sy,  0.0%ni, 84.4%id,  2.7%wa,  0.0%hi,  
> 0.0%si,  0.0%st
> Cpu6  : 20.1%us,  2.6%sy,  0.0%ni, 62.6%id, 14.7%wa,  0.0%hi,  
> 0.0%si,  0.0%st
> Cpu7  : 18.9%us,  4.4%sy,  0.0%ni, 69.0%id,  7.7%wa,  0.0%hi,  
> 0.1%si,  0.0%st
> Mem:   8188108k total,  8135168k used,    52940k free,    12968k buffers
> Swap: 62500792k total,   277484k used, 62223308k free,  2381704k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  
> COMMAND                                                                  
> 18000 root      20   0 15.6g 6.7g 1.5g S   64 85.4 552:54.13 
> java                                                                     
>     1 root      20   0 19316  672  412 S    0  0.0   0:03.36 init       
>
> root@cdb-006:/glass/sfw/cassandra# vmstat 5
> procs -----------memory---------- ---swap-- -----io---- -system-- 
> ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us 
> sy id wa
>  5  1 277984  50180  12888 2382292   84   67   880   127   10    4 17  
> 4 69 10
>  3  1 278000  51536  12856 2381316    4   14  2556    37 8660 10714  
> 7  4 82  8
>  2  0 277968  52084  12768 2380156    7    2   891   618 6287 8865  6  
> 3 87  3
>  3  0 277872  52020  12764 2379412   24    7  1070    22 5492 10111  
> 8  3 84  4
>  0  1 277528  50764  12760 2379176    2   10  1530   696 8773 13661  
> 8  4 83  5


Mime
View raw message