cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Vladimir Bukhtoyarov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-13223) Unable to compute when histogram overflowed
Date Thu, 02 Mar 2017 16:36:45 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892556#comment-15892556
] 

Vladimir Bukhtoyarov commented on CASSANDRA-13223:
--------------------------------------------------

{quote}
But if we just ignore the overflow-counter, that means we would possibly loose a lot information.
{quote}
I do not ignore big latency record, I write the maximum value of resolution, so at least this
anomaly can be observed on monitoring screens. But, you raised a good point, I think I must
to add an error message to log(including stacktrace), because histogram for which overflown
can not be monitored by end user. Log message should be with stacktrace because just logging
fact that histogram overflown is useless because histogram has no an ID, so without stacktrace
it would be imposiible to determine which concrete histogram is overflown. Are you agree?

{quote}
I don't think we want to drop the testDecayingMean test case and the corresponding lines in
the EstimatedHistogramReservoirSnapshot ctor. See CASSANDRA-12876 for details
{quote}
Ok, I will restore this test tomorrow.




> Unable to compute when histogram overflowed
> -------------------------------------------
>
>                 Key: CASSANDRA-13223
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13223
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Vladimir Bukhtoyarov
>            Priority: Minor
>
> DecayingEstimatedHistogramReservoir throws exception when value upper max recorded to
reservoir. It is very undesired behavior, because functionality like logging or monitoring
should never fail with exception. Current behavior of DecayingEstimatedHistogramReservoir
violates contract for [Reservoir|https://github.com/dropwizard/metrics/blob/3.2-development/metrics-core/src/main/java/com/codahale/metrics/Reservoir.java],
as you can see javadocs for Reservoir says nothing that implementation can throw exception
in getSnapshot method. As result all Dropwizzard/Metrics reporters are broken, because nobody
expect that metric will throw exception on get, for example our monitoring pipeline is broken
with exception:
> {noformat}
> com.fasterxml.jackson.databind.JsonMappingException: Unable to compute when histogram
overflowed (through reference chain: java.util.UnmodifiableSortedMap["org.apache.cassandra.metrics.Table
> .ColUpdateTimeDeltaHistogram.all"])
>         at com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:339)
>         at com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:299)
>         at com.fasterxml.jackson.databind.ser.std.StdSerializer.wrapAndThrow(StdSerializer.java:342)
>         at com.fasterxml.jackson.databind.ser.std.MapSerializer.serializeFields(MapSerializer.java:620)
>         at com.fasterxml.jackson.databind.ser.std.MapSerializer.serialize(MapSerializer.java:519)
>         at com.fasterxml.jackson.databind.ser.std.MapSerializer.serialize(MapSerializer.java:31)
>         at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:130)
>         at com.fasterxml.jackson.databind.ObjectMapper.writeValue(ObjectMapper.java:2436)
>         at com.fasterxml.jackson.core.base.GeneratorBase.writeObject(GeneratorBase.java:355)
>         at com.fasterxml.jackson.core.JsonGenerator.writeObjectField(JsonGenerator.java:1442)
>         at com.codahale.metrics.json.MetricsModule$MetricRegistrySerializer.serialize(MetricsModule.java:188)
>         at com.codahale.metrics.json.MetricsModule$MetricRegistrySerializer.serialize(MetricsModule.java:171)
>         at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:130)
>         at com.fasterxml.jackson.databind.ObjectWriter$Prefetch.serialize(ObjectWriter.java:1428)
>         at com.fasterxml.jackson.databind.ObjectWriter._configAndWriteValue(ObjectWriter.java:1129)
>         at com.fasterxml.jackson.databind.ObjectWriter.writeValue(ObjectWriter.java:967)
>         at com.codahale.metrics.servlets.MetricsServlet.doGet(MetricsServlet.java:176)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>         at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:845)
>         at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1689)
>         at com.ringcentral.slf4j.CleanMDCFilter.doFilter(CleanMDCFilter.java:18)
>         at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1676)
>         at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
>         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>         at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
>         at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>         at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>         at org.eclipse.jetty.server.Server.handle(Server.java:524)
>         at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:319)
>         at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:253)
>         at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
>         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>         at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>         at com.ringcentral.concurrent.executors.MonitoredExecutor$MonitoredRunnable.run(MonitoredExecutor.java:220)
>         at com.ringcentral.concurrent.executors.ContextAwareExecutor$ContextAwareRunnable.run(ContextAwareExecutor.java:34)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalStateException: Unable to compute when histogram overflowed
>         at org.apache.cassandra.metrics.DecayingEstimatedHistogramReservoir$EstimatedHistogramReservoirSnapshot.getMean(DecayingEstimatedHistogramReservoir.java:472)
>         at com.codahale.metrics.json.MetricsModule$HistogramSerializer.serialize(MetricsModule.java:72)
>         at com.codahale.metrics.json.MetricsModule$HistogramSerializer.serialize(MetricsModule.java:56)
>         at com.fasterxml.jackson.databind.ser.std.MapSerializer.serializeFields(MapSerializer.java:616)
>         ... 40 common frames omitted
> {noformat}
> The most oblivious solution to resolve overflow, will be replacing the exceeded value
by highest trackable value, [similar to this|https://github.com/vladimir-bukhtoyarov/rolling-metrics/blob/master/src/main/java/com/github/rollingmetrics/histogram/OverflowResolver.java#L34]
> I have implemented the fix in the [branch on github|https://github.com/vladimir-bukhtoyarov/cassandra/tree/fix-reservoir-overflow],
see [particular commit | https://github.com/vladimir-bukhtoyarov/cassandra/commit/4dca54c1000576a892a77bc716f87adc4bc05ecc]
for details.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message