cassandra-user mailing list archives

From Thunder Stumpges <thunder.stump...@gmail.com>
Subject Re: delay/stall processing reads
Date Tue, 07 Jan 2014 19:20:48 GMT
Apologies, I sent that prior email too soon. I see exceptions in the log
during nodetool repair like these:
ERROR [AntiEntropySessions:1] 2014-01-06 21:46:45,339 RepairSession.java
(line 278) [repair #179e84f0-775f-11e3-9c6a-538c9374b226] session completed
with the following error
org.apache.cassandra.exceptions.RepairException: [repair
#179e84f0-775f-11e3-9c6a-538c9374b226 on test_video/videosearchresult,
(9124197757189535229,9176696932386753491]] Validation failed in /10.20.2.11
    at org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:152)
    at org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:188)
    at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:59)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)

and on the server mentioned above:

ERROR [ValidationExecutor:1] 2014-01-06 21:08:35,261 CassandraDaemon.java
(line 187) Exception in thread Thread[ValidationExecutor:1,1,main]
java.lang.AssertionError
    at org.apache.cassandra.db.compaction.PrecompactedRow.update(PrecompactedRow.java:171)
    at org.apache.cassandra.repair.Validator.rowHash(Validator.java:198)
    at org.apache.cassandra.repair.Validator.add(Validator.java:151)
    at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:799)
    at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:62)
    at org.apache.cassandra.db.compaction.CompactionManager$8.call(CompactionManager.java:397)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)

I searched on these exceptions, and it appears someone else was having a
similar problem that upgrading to 2.0.3 fixed (we are currently on 2.0.2):

http://www.mail-archive.com/user@cassandra.apache.org/msg33647.html

We will be upgrading today. After a restart, memory usage is normal, GC
shows primarily new-space collections, and I no longer see any CMS
collections. My read query delays are gone now as well. I'm not sure
exactly what happened, but I will continue to monitor memory usage and GC
stats.
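
In case it helps anyone else watching this, here is a small sketch of how
the heap occupancy could be tracked over time from GCInspector lines like
the ones quoted below. The awk field positions are an assumption based on
the system.log format in this thread, as is the log path in the example:

```shell
# A sketch: print heap occupancy after each CMS collection from GCInspector
# lines fed on stdin. Field positions ($3/$4 date and time, $16 bytes used,
# $20 max heap) are assumed from the system.log format quoted in this thread.
cms_usage() {
  awk '/GC for ConcurrentMarkSweep/ {
    printf "%s %s  %.1f%% of heap in use after CMS\n", $3, $4, 100 * $16 / $20
  }'
}
# Example usage (log path is an assumption):
#   grep 'GC for ConcurrentMarkSweep' /var/log/cassandra/system.log | cms_usage
```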

Thanks again everyone.
Thunder




On Tue, Jan 7, 2014 at 11:11 AM, Thunder Stumpges <
thunder.stumpges@gmail.com> wrote:

> Thanks Lee,
> I put these in place on one of the three servers, and while doing so I
> noticed in the logs on the other servers that there were many errors from
> our weekly nodetool repair of this form:
>
>
>
> On Thu, Jan 2, 2014 at 9:34 PM, Lee Mighdoll <lee@underneath.ca> wrote:
>
>> Well from what I see in system.log it does not appear that GC aligns with
>>> this delay.
>>>
>> Though it does seem like quite a few GCs take place. Here is my
>>> system.log around the time of the delay:
>>>
>>
>> It does sound like a lot of CMS runs - you'd want most of your garbage to
>> be collected in new space (ParNew), not in old space (CMS).  It also
>> looks like your heap is staying fairly full - either CMS isn't recovering
>> much (most likely) or you're generating a lot of garbage.  I'm not familiar
>> with the GCInspector logs, though, so I may be guessing too far.
>>
>> You can get gold standard gc logs by passing these parameters to the java
>> command when you launch cassandra:
>>
>> -Xloggc:/path/to/where/to/put/the/gc.log
>> -XX:+PrintGCDetails
>> -XX:+PrintGCDateStamps
>> -XX:+PrintHeapAtGC
>> -XX:+PrintTenuringDistribution
>> -XX:+PrintGCApplicationStoppedTime
>> -XX:+PrintPromotionFailure
>>
>> Once you have good logs, more people (and tools) can help interpret them.
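>>
>> As a sketch of where these might go: in the stock startup scripts,
>> conf/cassandra-env.sh accumulates flags into JVM_OPTS, so something like
>> the following could work (the gc.log path is an assumption; pick a
>> directory the cassandra user can write to):

```shell
# Hypothetical additions to conf/cassandra-env.sh; restart the node after
# editing. JVM_OPTS is the variable the stock Cassandra startup script
# hands to the JVM. The gc.log path is an assumption.
JVM_OPTS="${JVM_OPTS:-}"   # already defined in the real cassandra-env.sh
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
JVM_OPTS="$JVM_OPTS -XX:+PrintHeapAtGC"
JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
JVM_OPTS="$JVM_OPTS -XX:+PrintPromotionFailure"
```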
>>
>>
>> Cheers,
>> Lee
>>
>>
>>>
>>>  INFO [ScheduledTasks:1] 2014-01-02 12:30:22,164 GCInspector.java (line
>>> 116) GC for ConcurrentMarkSweep: 479 ms for 1 collections, 7417499584 used;
>>> max is 8375238656
>>>  INFO [ScheduledTasks:1] 2014-01-02 12:30:37,579 GCInspector.java (line
>>> 116) GC for ConcurrentMarkSweep: 557 ms for 1 collections, 7663015624 used;
>>> max is 8375238656
>>>  INFO [ScheduledTasks:1] 2014-01-02 12:31:50,416 GCInspector.java (line
>>> 116) GC for ConcurrentMarkSweep: 633 ms for 1 collections, 6053238384 used;
>>> max is 8375238656
>>>  INFO [ScheduledTasks:1] 2014-01-02 12:32:06,525 GCInspector.java (line
>>> 116) GC for ConcurrentMarkSweep: 219 ms for 1 collections, 6430575008 used;
>>> max is 8375238656
>>>  INFO [ScheduledTasks:1] 2014-01-02 12:32:22,548 GCInspector.java (line
>>> 116) GC for ConcurrentMarkSweep: 296 ms for 1 collections, 6732761584 used;
>>> max is 8375238656
>>>  INFO [ScheduledTasks:1] 2014-01-02 12:32:37,870 GCInspector.java (line
>>> 116) GC for ConcurrentMarkSweep: 343 ms for 1 collections, 7020263776 used;
>>> max is 8375238656
>>>  INFO [ScheduledTasks:1] 2014-01-02 12:32:53,116 GCInspector.java (line
>>> 116) GC for ConcurrentMarkSweep: 443 ms for 1 collections, 7276089816 used;
>>> max is 8375238656
>>>  INFO [ScheduledTasks:1] 2014-01-02 12:33:08,557 GCInspector.java (line
>>> 116) GC for ConcurrentMarkSweep: 554 ms for 1 collections, 7662839184 used;
>>> max is 8375238656
>>>
>>> The earliest start time on my three delayed requests is 12:32:00.083 and
>>> they all clear out at 12:32:00.517. GC log events occur at 12:31:50 and at
>>> 12:32:06. The above entries are the only log entries in system.log at this
>>> time. Any other ideas?
>>>
>>> Is this type of GC frequency and delay "normal" ?
>>>
>>> thanks,
>>> Thunder
>>>
>>>
>>>
>>>
>>> On Thu, Jan 2, 2014 at 2:27 PM, Robert Coli <rcoli@eventbrite.com> wrote:
>>>
>>>> On Thu, Jan 2, 2014 at 2:24 PM, Thunder Stumpges <
>>>> thunder.stumpges@gmail.com> wrote:
>>>>
>>>>> Excuse my ignorance, but where would I look for the GC info? What logs
>>>>> contain this? I will start looking for log files and more clues in them.
>>>>>
>>>>
>>>> system.log contains some basic info; you can enable extended GC info
>>>> via options to the JVM.
>>>>
>>>> =Rob
>>>>
>>>>
>>>
>>>
>>
>
