cassandra-user mailing list archives

From: Dan Kinder <dkin...@turnitin.com>
Subject: Re:
Date: Fri, 29 Sep 2017 16:47:46 GMT
Attached is a jstack thread dump from a node that's having trouble (see the
tpstats paste below as well).

Pool Name                         Active   Pending      Completed   Blocked  All time blocked
ReadStage                              0         0         409111         0                 0
MiscStage                              0         0              0         0                 0
CompactionExecutor                    10        10           2481         0                 0
MutationStage                         32   1572402       31515477         0                 0
GossipStage                            1      1620         180091         0                 0
RequestResponseStage                   0         0       17509006         0                 0
ReadRepairStage                        0         0           4833         0                 0
CounterMutationStage                   0         0              0         0                 0
MemtableFlushWriter                    1        44             74         0                 0
MemtablePostFlush                      1       281            158         0                 0
ValidationExecutor                     0         0              0         0                 0
ViewMutationStage                      0         0              0         0                 0
CacheCleanupExecutor                   0         0              0         0                 0
PerDiskMemtableFlushWriter_10          0         0             67         0                 0
PerDiskMemtableFlushWriter_11          0         0             67         0                 0
MemtableReclaimMemory                  0         0             79         0                 0
PendingRangeCalculator                 0         0             19         0                 0
SecondaryIndexManagement               0         0              0         0                 0
HintsDispatcher                        1         1           1456         0                 0
PerDiskMemtableFlushWriter_1           0         0             67         0                 0
Native-Transport-Requests             69         0       17951984         0             26956
PerDiskMemtableFlushWriter_2           0         0             67         0                 0
MigrationStage                         1       201           1048         0                 0
PerDiskMemtableFlushWriter_0           0         0             78         0                 0
Sampler                                0         0              0         0                 0
PerDiskMemtableFlushWriter_5           0         0             67         0                 0
InternalResponseStage                  0         0        2754163         0                 0
PerDiskMemtableFlushWriter_6           0         0             67         0                 0
PerDiskMemtableFlushWriter_3           0         0             67         0                 0
PerDiskMemtableFlushWriter_4           0         0             67         0                 0
PerDiskMemtableFlushWriter_9           0         0             67         0                 0
AntiEntropyStage                       0         0              0         0                 0
PerDiskMemtableFlushWriter_7           0         0             67         0                 0
PerDiskMemtableFlushWriter_8           0         0             67         0                 0


Message type           Dropped
READ                         0
RANGE_SLICE                  0
_TRACE                       0
HINT                         0
MUTATION                     0
COUNTER_MUTATION             0
BATCH_STORE                  0
BATCH_REMOVE                 0
REQUEST_RESPONSE             0
PAGED_RANGE                  0
READ_REPAIR                  0
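
(For reference, the dump and the tpstats output above were captured with roughly the following; the pgrep pattern and output path are just what I'd use, and jstack needs to run as the same user as the Cassandra process:)

$ jstack $(pgrep -f CassandraDaemon) > /tmp/cassandra-threads.txt
$ nodetool tpstats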

On Thu, Sep 28, 2017 at 2:27 PM, Dan Kinder <dkinder@turnitin.com> wrote:

> Unless I'm reading the logs wrong, it doesn't seem like the pauses are very
> long... I've also attached the full log since the node restarted.
>
> $ grep -i "pause\|times" /var/log/cassandra/gc.log.0.current | tail -n 40
> 2017-09-28T14:18:48.324-0700: 12866.695: [GC pause (G1 Evacuation Pause) (young)
>  [Times: user=0.33 sys=0.00, real=0.02 secs]
> 2017-09-28T14:18:58.198-0700: 12876.569: [GC pause (G1 Evacuation Pause) (young)
>  [Times: user=0.34 sys=0.00, real=0.03 secs]
> 2017-09-28T14:19:08.039-0700: 12886.411: [GC pause (G1 Evacuation Pause) (young)
>  [Times: user=0.30 sys=0.00, real=0.03 secs]
> 2017-09-28T14:19:17.900-0700: 12896.271: [GC pause (G1 Evacuation Pause) (young)
>  [Times: user=0.33 sys=0.00, real=0.03 secs]
> 2017-09-28T14:19:28.995-0700: 12907.366: [GC pause (G1 Evacuation Pause) (young)
>  [Times: user=0.31 sys=0.00, real=0.02 secs]
> 2017-09-28T14:19:40.576-0700: 12918.947: [GC pause (G1 Evacuation Pause) (young)
>  [Times: user=0.31 sys=0.01, real=0.02 secs]
> 2017-09-28T14:19:52.164-0700: 12930.535: [GC pause (G1 Evacuation Pause) (young)
>  [Times: user=0.32 sys=0.01, real=0.02 secs]
> 2017-09-28T14:20:03.813-0700: 12942.184: [GC pause (G1 Evacuation Pause) (young)
>  [Times: user=0.33 sys=0.00, real=0.02 secs]
> 2017-09-28T14:20:15.584-0700: 12953.955: [GC pause (G1 Evacuation Pause) (young)
>  [Times: user=0.35 sys=0.00, real=0.02 secs]
> 2017-09-28T14:20:27.261-0700: 12965.632: [GC pause (G1 Evacuation Pause) (young)
>  [Times: user=0.34 sys=0.00, real=0.03 secs]
> 2017-09-28T14:20:38.894-0700: 12977.265: [GC pause (G1 Evacuation Pause) (young)
>  [Times: user=0.34 sys=0.01, real=0.03 secs]
> 2017-09-28T14:20:50.685-0700: 12989.056: [GC pause (G1 Evacuation Pause) (young)
>  [Times: user=0.33 sys=0.00, real=0.03 secs]
> 2017-09-28T14:21:02.281-0700: 13000.652: [GC pause (G1 Evacuation Pause) (young)
>  [Times: user=0.31 sys=0.00, real=0.02 secs]
> 2017-09-28T14:21:13.975-0700: 13012.346: [GC pause (G1 Evacuation Pause) (young)
>  [Times: user=0.32 sys=0.00, real=0.02 secs]
> 2017-09-28T14:21:25.797-0700: 13024.168: [GC pause (G1 Evacuation Pause) (young)
>  [Times: user=0.32 sys=0.01, real=0.02 secs]
> 2017-09-28T14:21:37.395-0700: 13035.766: [GC pause (G1 Evacuation Pause) (young)
>  [Times: user=0.33 sys=0.00, real=0.02 secs]
> 2017-09-28T14:21:49.166-0700: 13047.537: [GC pause (G1 Evacuation Pause) (young)
>  [Times: user=0.32 sys=0.00, real=0.02 secs]
> 2017-09-28T14:22:00.887-0700: 13059.258: [GC pause (G1 Evacuation Pause) (young)
>  [Times: user=0.33 sys=0.00, real=0.03 secs]
> 2017-09-28T14:22:12.453-0700: 13070.824: [GC pause (G1 Evacuation Pause) (young)
>  [Times: user=0.34 sys=0.00, real=0.02 secs]
> 2017-09-28T14:22:24.023-0700: 13082.394: [GC pause (G1 Evacuation Pause) (young)
>  [Times: user=0.35 sys=0.00, real=0.02 secs]
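>
> (In case it's useful, a rough one-liner to pull the worst real pause out of that same log:)
>
> $ awk -F'real=' '/Times:/ { v = $2 + 0; if (v > max) max = v } END { print "max real pause:", max, "secs" }' /var/log/cassandra/gc.log.0.current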
>
>
> On Thu, Sep 28, 2017 at 2:13 PM, Jeff Jirsa <jjirsa@gmail.com> wrote:
>
>> The digest mismatch exception is not a problem; that's why it's only
>> logged at debug.
>>
>> As Thomas noted, there's a pretty good chance this is
>> https://issues.apache.org/jira/browse/CASSANDRA-13754 - if you see a lot
>> of GCInspector logs indicating GC pauses, that would add confidence to
>> that diagnosis.
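>>
>> Something like this should surface them quickly, assuming the default log location:
>>
>> $ grep GCInspector /var/log/cassandra/system.log | tail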
>>
>>
>> On Thu, Sep 28, 2017 at 2:08 PM, Dan Kinder <dkinder@turnitin.com> wrote:
>>
>>> Thanks for the responses.
>>>
>>> @Prem yes, this is after the entire cluster is on 3.11, but no, I have not
>>> run upgradesstables yet.
>>>
>>> @Thomas no I don't see any major GC going on.
>>>
>>> @Jeff yeah it's fully upgraded. I decided to shut the whole thing down
>>> and bring it back up (thankfully this cluster is not serving live traffic).
>>> The nodes seemed okay for an hour or two, but now I'm seeing the issue again,
>>> without having bounced any nodes. This time it's ReadStage that's building up,
>>> and the exception I'm seeing in the logs is:
>>>
>>> DEBUG [ReadRepairStage:106] 2017-09-28 13:01:37,206 ReadCallback.java:242 - Digest mismatch:
>>>
>>> org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey(6150926370328526396, 696a6374652e6f7267) (2f0fffe2d743cdc4c69c3eb351a3c9ca vs 00ee661ae190c2cbf0eb2fb8a51f6025)
>>>         at org.apache.cassandra.service.DigestResolver.compareResponses(DigestResolver.java:92) ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>         at org.apache.cassandra.service.ReadCallback$AsyncRepairRunner.run(ReadCallback.java:233) ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_71]
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_71]
>>>         at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [apache-cassandra-3.11.0.jar:3.11.0]
>>>         at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_71]
>>>
>>>
>>> Do you think running upgradesstables would help? Or relocatesstables? I
>>> had presumed they aren't necessary for Cassandra to function, just an
>>> optimization.
>>>
>>> On Thu, Sep 28, 2017 at 12:49 PM, Steinmaurer, Thomas
>>> <thomas.steinmaurer@dynatrace.com> wrote:
>>>
>>>> Dan,
>>>>
>>>>
>>>>
>>>> Do you see any major GC? We have been hit by the following memory leak
>>>> in our loadtest environment with 3.11.0:
>>>>
>>>> https://issues.apache.org/jira/browse/CASSANDRA-13754
>>>>
>>>>
>>>>
>>>> So, depending on the heap size and uptime, you might run into heap
>>>> trouble.
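>>>>
>>>> For a quick look at whether the heap is already under pressure, something like:
>>>>
>>>> $ nodetool info | grep -i heap
>>>> $ nodetool gcstats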
>>>>
>>>>
>>>>
>>>> Thomas
>>>>
>>>>
>>>>
>>>> From: Dan Kinder [mailto:dkinder@turnitin.com]
>>>> Sent: Thursday, 28 September 2017 18:20
>>>> To: user@cassandra.apache.org
>>>> Subject:
>>>>
>>>>
>>>>
>>>> Hi,
>>>>
>>>> I recently upgraded our 16-node cluster from 2.2.6 to 3.11 and see the
>>>> following. The cluster does function for a while, but then some stages
>>>> begin to back up; the node does not recover and does not drain the
>>>> tasks, even under no load. This happens to both MutationStage and
>>>> GossipStage.
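>>>>
>>>> (A simple way to watch the backlog build, in case anyone wants to reproduce; the interval is arbitrary:)
>>>>
>>>> $ watch -n 10 nodetool tpstats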
>>>>
>>>> I do see the following exception in the logs:
>>>>
>>>>
>>>>
>>>> ERROR [ReadRepairStage:2328] 2017-09-26 23:07:55,440 CassandraDaemon.java:228 - Exception in thread Thread[ReadRepairStage:2328,5,main]
>>>>
>>>> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 1 responses.
>>>>         at org.apache.cassandra.service.DataResolver$RepairMergeListener.close(DataResolver.java:171) ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>>         at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$2.close(UnfilteredPartitionIterators.java:182) ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>>         at org.apache.cassandra.db.transform.BaseIterator.close(BaseIterator.java:82) ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>>         at org.apache.cassandra.service.DataResolver.compareResponses(DataResolver.java:89) ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>>         at org.apache.cassandra.service.AsyncRepairCallback$1.runMayThrow(AsyncRepairCallback.java:50) ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_91]
>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_91]
>>>>         at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>>         at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_91]
>>>>
>>>>
>>>>
>>>> But it's hard to correlate it precisely with things going bad. It is also
>>>> very strange to me since I have both read_repair_chance and
>>>> dclocal_read_repair_chance set to 0.0 for ALL of my tables, so it is
>>>> confusing why ReadRepairStage would throw at all.
>>>>
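>>>> For what it's worth, a query along these lines confirms the settings (keyspace name is a placeholder):
>>>>
>>>> $ cqlsh -e "SELECT table_name, read_repair_chance, dclocal_read_repair_chance FROM system_schema.tables WHERE keyspace_name = 'my_keyspace';"
>>>>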
>>>> Anyone have thoughts on this? It's pretty puzzling, and it causes nodes to
>>>> lock up. Once it happens, Cassandra can't even shut down; I have to kill -9
>>>> it. If I can't find a resolution I'm going to need to downgrade and restore
>>>> from backup...
>>>>
>>>> The only issue I found that looked similar is
>>>> https://issues.apache.org/jira/browse/CASSANDRA-12689, but that appears
>>>> to have been fixed in 3.10.
>>>>
>>>>
>>>>
>>>> $ nodetool tpstats
>>>>
>>>> Pool Name                         Active   Pending      Completed   Blocked  All time blocked
>>>> ReadStage                              0         0         582103         0                 0
>>>> MiscStage                              0         0              0         0                 0
>>>> CompactionExecutor                    11        11           2868         0                 0
>>>> MutationStage                         32   4593678       55057393         0                 0
>>>> GossipStage                            1      2818         371487         0                 0
>>>> RequestResponseStage                   0         0        4345522         0                 0
>>>> ReadRepairStage                        0         0         151473         0                 0
>>>> CounterMutationStage                   0         0              0         0                 0
>>>> MemtableFlushWriter                    1        81             76         0                 0
>>>> MemtablePostFlush                      1       382            139         0                 0
>>>> ValidationExecutor                     0         0              0         0                 0
>>>> ViewMutationStage                      0         0              0         0                 0
>>>> CacheCleanupExecutor                   0         0              0         0                 0
>>>> PerDiskMemtableFlushWriter_10          0         0             69         0                 0
>>>> PerDiskMemtableFlushWriter_11          0         0             69         0                 0
>>>> MemtableReclaimMemory                  0         0             81         0                 0
>>>> PendingRangeCalculator                 0         0             32         0                 0
>>>> SecondaryIndexManagement               0         0              0         0                 0
>>>> HintsDispatcher                        0         0            596         0                 0
>>>> PerDiskMemtableFlushWriter_1           0         0             69         0                 0
>>>> Native-Transport-Requests             11         0        4547746         0                67
>>>> PerDiskMemtableFlushWriter_2           0         0             69         0                 0
>>>> MigrationStage                         1      1545            586         0                 0
>>>> PerDiskMemtableFlushWriter_0           0         0             80         0                 0
>>>> Sampler                                0         0              0         0                 0
>>>> PerDiskMemtableFlushWriter_5           0         0             69         0                 0
>>>> InternalResponseStage                  0         0          45432         0                 0
>>>> PerDiskMemtableFlushWriter_6           0         0             69         0                 0
>>>> PerDiskMemtableFlushWriter_3           0         0             69         0                 0
>>>> PerDiskMemtableFlushWriter_4           0         0             69         0                 0
>>>> PerDiskMemtableFlushWriter_9           0         0             69         0                 0
>>>> AntiEntropyStage                       0         0              0         0                 0
>>>> PerDiskMemtableFlushWriter_7           0         0             69         0                 0
>>>> PerDiskMemtableFlushWriter_8           0         0             69         0                 0
>>>>
>>>>
>>>>
>>>> Message type           Dropped
>>>> READ                         0
>>>> RANGE_SLICE                  0
>>>> _TRACE                       0
>>>> HINT                         0
>>>> MUTATION                     0
>>>> COUNTER_MUTATION             0
>>>> BATCH_STORE                  0
>>>> BATCH_REMOVE                 0
>>>> REQUEST_RESPONSE             0
>>>> PAGED_RANGE                  0
>>>> READ_REPAIR                  0
>>>>
>>>>
>>>>
>>>> -dan
>>>>
>>>
>>>
>>>
>>> --
>>> Dan Kinder
>>> Principal Software Engineer
>>> Turnitin – www.turnitin.com
>>> dkinder@turnitin.com
>>>
>>
>>
>
>
> --
> Dan Kinder
> Principal Software Engineer
> Turnitin – www.turnitin.com
> dkinder@turnitin.com
>



-- 
Dan Kinder
Principal Software Engineer
Turnitin – www.turnitin.com
dkinder@turnitin.com
