cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7926) Stress can OOM on merging of timing samples
Date Fri, 12 Dec 2014 13:47:13 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14244153#comment-14244153 ]

Benedict commented on CASSANDRA-7926:
-------------------------------------

For the moment we're comfortable with having fewer than maxSamples: because the space required
is pretty low, we can easily oversize it for our desired accuracy. The main reason for this
decision in the first place was to keep merging samples simple (zero thought needed to deliver
it), so that the merged result is a truly uniform sample. When we have the time to revisit
this we probably want to construct an explicitly biased sample that tracks outliers with
greater probability (while still recording their actual incidence), at which point we could also
consider introducing reservoir sampling. In fairness, though, we could very easily switch to
reservoir sampling for the individual/source sample accumulation and keep the current (lossier)
method for merging samples.
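
For reference, reservoir sampling for the per-source accumulation could look roughly like the
sketch below. It is illustration only, not the cassandra-stress code: the class ReservoirSample
and its methods offer, size, seen and snapshot are hypothetical names, and only the idea of a
fixed maxSamples bound comes from the discussion above. It uses the classic "Algorithm R", which
keeps memory fixed at maxSamples regardless of how many values are offered.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical illustration; NOT the SampleOfLongs class from cassandra-stress. */
final class ReservoirSample {
    private final long[] reservoir;   // fixed-size storage, never grows
    private final Random random;
    private long seen;                // total number of values offered so far

    ReservoirSample(int maxSamples, long seed) {
        if (maxSamples <= 0) throw new IllegalArgumentException("maxSamples must be positive");
        this.reservoir = new long[maxSamples];
        this.random = new Random(seed);
    }

    /** Algorithm R: after n offers, each value is retained with probability maxSamples / n. */
    void offer(long value) {
        seen++;
        if (seen <= reservoir.length) {
            // Still filling the reservoir: keep every value.
            reservoir[(int) (seen - 1)] = value;
            return;
        }
        // Pick a slot in [0, seen); replace only if it lands inside the reservoir.
        // (nextDouble is used for portability; its rounding bias is negligible here.)
        long slot = (long) (random.nextDouble() * seen);
        if (slot < reservoir.length) {
            reservoir[(int) slot] = value;
        }
    }

    /** Number of values currently held (at most maxSamples). */
    int size() { return (int) Math.min(seen, reservoir.length); }

    /** Total number of values offered, retained or not. */
    long seen() { return seen; }

    /** Copy of the retained values; allocation is bounded by maxSamples. */
    long[] snapshot() { return Arrays.copyOf(reservoir, size()); }
}
```

Each worker thread would own one such instance and call offer(latency) per operation. How two
reservoirs are then merged is a separate question and is not covered by this sketch.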

I've committed with your nits.

> Stress can OOM on merging of timing samples
> -------------------------------------------
>
>                 Key: CASSANDRA-7926
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7926
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Minor
>              Labels: tools
>             Fix For: 2.1.3
>
>
> {noformat}
> Exception in thread "StressMetrics:2" java.lang.OutOfMemoryError: Java heap space
>         at java.util.Arrays.copyOf(Arrays.java:2343)
>         at org.apache.cassandra.stress.util.SampleOfLongs.merge(SampleOfLongs.java:76)
>         at org.apache.cassandra.stress.util.TimingInterval.merge(TimingInterval.java:95)
>         at org.apache.cassandra.stress.util.Timing.snapInterval(Timing.java:95)
>         at org.apache.cassandra.stress.StressMetrics.update(StressMetrics.java:124)
>         at org.apache.cassandra.stress.StressMetrics.access$200(StressMetrics.java:36)
>         at org.apache.cassandra.stress.StressMetrics$1.run(StressMetrics.java:72)
>         at java.lang.Thread.run(Thread.java:744)
> {noformat}
> This is partially down to recently increasing the per-thread sample size, but also because
> we allocate temporary space linear in the total sample size across all threads during merge.
> This can easily be avoided. We should also scale per-thread sample size based on total number
> of threads, so we limit total memory use.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
