cassandra-commits mailing list archives

From "Branimir Lambov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7926) Stress can OOM on merging of timing samples
Date Wed, 24 Sep 2014 14:49:33 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14146387#comment-14146387
] 

Branimir Lambov commented on CASSANDRA-7926:
--------------------------------------------

Looks good, if we're OK with having somewhat fewer than maxSamples samples a lot of the time.
Otherwise you could use a variation of the [reservoir sampling method|http://en.wikipedia.org/wiki/Reservoir_sampling].
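
For illustration, a minimal sketch of the basic reservoir sampling scheme (Algorithm R) follows. The class name ReservoirOfLongs and its methods are hypothetical and not part of the patch; the sketch only shows how a fixed-size buffer can stay full while remaining a uniform sample of everything seen so far.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper, not taken from the patch: keeps a uniform random
// sample of at most maxSamples values from a stream of unknown length.
final class ReservoirOfLongs
{
    private final long[] samples;
    private final Random random;
    private long seen; // total number of values offered so far

    ReservoirOfLongs(int maxSamples, long seed)
    {
        this.samples = new long[maxSamples];
        this.random = new Random(seed);
    }

    void add(long value)
    {
        if (seen < samples.length)
        {
            // the reservoir is not full yet: keep every value
            samples[(int) seen] = value;
        }
        else
        {
            // pick j uniformly in [0, seen]; the new value replaces slot j
            // only when j falls inside the reservoir
            long j = (long) (random.nextDouble() * (seen + 1));
            if (j < samples.length)
                samples[(int) j] = value;
        }
        seen++;
    }

    // number of samples currently held
    int size()
    {
        return (int) Math.min(seen, samples.length);
    }

    // copy of the samples currently held
    long[] snapshot()
    {
        return Arrays.copyOf(samples, size());
    }
}
```

Once maxSamples values have been offered, the buffer always holds exactly maxSamples samples, at the cost of one random number per additional value.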

In the Timer constructor, to calculate an upper bound on the power of 2 for a given sample
size, you can use
{panel}ceil(log2\(x\)) = 32 - numberOfLeadingZeros(x - 1){panel}
(from the [java documentation|http://docs.oracle.com/javase/7/docs/api/java/lang/Integer.html#numberOfLeadingZeros(int)]).
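
As a quick sketch of how that identity could be used to round a sample size up to a power of two (the class and method names here are hypothetical, and the range limit is an assumption stated in the comments):

```java
// Hypothetical helper, not part of the patch.
final class PowerOfTwo
{
    // ceil(log2(x)) for x >= 1, using the identity from the
    // Integer.numberOfLeadingZeros javadoc
    static int ceilLog2(int x)
    {
        return 32 - Integer.numberOfLeadingZeros(x - 1);
    }

    // smallest power of two that is >= x; only valid for 1 <= x <= 2^30,
    // since larger results do not fit in a positive int
    static int nextPowerOfTwo(int x)
    {
        return 1 << ceilLog2(x);
    }
}
```

Exact powers of two map to themselves, so a sample size of 1024 stays 1024 while 1000 is rounded up to 1024.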

The changes to FasterRandom probably weren't intended to be uploaded? I don't see them used
anywhere in the patch. (Note: nextDouble returns a value in [-1, 1); it should use an
unsigned shift, >>> 11, instead of >> 12.)
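
To illustrate the difference between the two shifts, here is a small sketch. It does not reproduce the FasterRandom code: the class name, method names, and in particular the scale factor used with the signed shift are assumptions chosen only to show how a signed shift lets negative values through, while an unsigned shift by 11 leaves 53 non-negative bits that scale to [0, 1).

```java
// Hypothetical illustration, not the code from the patch.
final class UnitDoubles
{
    // Signed shift keeps the sign bit, so negative inputs give negative doubles.
    // The 2^-51 scale is an assumption that makes the range [-1, 1).
    static double signedShift(long bits)
    {
        return (bits >> 12) * 0x1.0p-51;
    }

    // Unsigned shift leaves 53 non-negative bits, as many as a double's
    // significand can hold exactly; scaling by 2^-53 gives a value in [0, 1).
    static double unsignedShift(long bits)
    {
        return (bits >>> 11) * 0x1.0p-53;
    }
}
```

This is the same conversion that java.util.Random documents for nextDouble, which builds a 53-bit non-negative integer and divides it by 2^53.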

> Stress can OOM on merging of timing samples
> -------------------------------------------
>
>                 Key: CASSANDRA-7926
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7926
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Minor
>              Labels: tools
>             Fix For: 2.1.1
>
>
> {noformat}
> Exception in thread "StressMetrics:2" java.lang.OutOfMemoryError: Java heap space
>         at java.util.Arrays.copyOf(Arrays.java:2343)
>         at org.apache.cassandra.stress.util.SampleOfLongs.merge(SampleOfLongs.java:76)
>         at org.apache.cassandra.stress.util.TimingInterval.merge(TimingInterval.java:95)
>         at org.apache.cassandra.stress.util.Timing.snapInterval(Timing.java:95)
>         at org.apache.cassandra.stress.StressMetrics.update(StressMetrics.java:124)
>         at org.apache.cassandra.stress.StressMetrics.access$200(StressMetrics.java:36)
>         at org.apache.cassandra.stress.StressMetrics$1.run(StressMetrics.java:72)
>         at java.lang.Thread.run(Thread.java:744)
> {noformat}
> This is partially down to recently increasing the per-thread sample size, but also because
> we allocate temporary space linear in size to total sample size in all threads during merge.
> This can easily be avoided. We should also scale per-thread sample size based on total number
> of threads, so we limit total memory use.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
