cassandra-commits mailing list archives

From Per Otterström (JIRA) <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-13223) Unable to compute when histogram overflowed
Date Fri, 03 Mar 2017 12:25:45 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15894242#comment-15894242 ]

Per Otterström commented on CASSANDRA-13223:
--------------------------------------------

bq. I do not ignore big latency record, I write the maximum value of resolution

Right, I missed that. Basically you are merging the count of the highest counting bucket with
the overflow bucket?

It would be nice to have some logging for this. The downside I see with that is that this code
is part of many critical code paths, including the write path and the read path, so it could
potentially end up flooding the log files.

An option to consider would be to count the overflow bucket as Long.MAX_VALUE (or some other
approximation between the highest value the reservoir can register and Long.MAX_VALUE). Just a
thought. Then overflowed values would really stand out in graphs, and you would know where to look.
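
Something along these lines is what I mean. It is a sketch only; {{bucketOffsets}} and {{buckets}} are placeholder names for the reservoir internals, with the last slot of {{buckets}} standing in for the overflow bucket:

{noformat}
// Sketch only, not the actual DecayingEstimatedHistogramReservoir code.
// bucketOffsets[i] is the upper bound of bucket i; buckets has one extra
// slot at the end counting values above the highest trackable offset.
static long meanWithOverflowSentinel(long[] bucketOffsets, long[] buckets)
{
    long count = 0;
    double sum = 0;
    for (int i = 0; i < bucketOffsets.length; i++)
    {
        count += buckets[i];
        sum += (double) buckets[i] * bucketOffsets[i];
    }
    // instead of throwing, fold overflowed samples in at a sentinel value
    long overflowCount = buckets[buckets.length - 1];
    count += overflowCount;
    sum += (double) overflowCount * Long.MAX_VALUE;
    return count == 0 ? 0 : (long) (sum / count);
}
{noformat}

With Long.MAX_VALUE dominating the sum, an overflowed histogram would report an obviously out-of-range mean instead of failing the whole reporter.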

The existing solution with IllegalStateException is actually derived from [EstimatedHistogram|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/EstimatedHistogram.java#L231].
I don't know the full background story on the original behavior. [~brandon.williams], do you
know if there is a particular reason for throwing the IllegalStateException, or would you
be OK with the changes proposed here?
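
For reference, the behavior over there is roughly the following (paraphrased, not a verbatim copy of the linked code):

{noformat}
// Paraphrase of the existing overflow handling, not the actual source:
// when the overflow bucket is non-empty the statistic cannot be
// approximated, so the method throws instead of returning a value.
if (buckets[buckets.length - 1] > 0)
    throw new IllegalStateException("Unable to compute when histogram overflowed");
{noformat}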


> Unable to compute when histogram overflowed
> -------------------------------------------
>
>                 Key: CASSANDRA-13223
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13223
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Vladimir Bukhtoyarov
>            Priority: Minor
>
> DecayingEstimatedHistogramReservoir throws an exception when a value above the maximum trackable
> value is recorded to the reservoir. This is very undesirable behavior, because functionality like
> logging or monitoring should never fail with an exception. The current behavior of
> DecayingEstimatedHistogramReservoir violates the contract of [Reservoir|https://github.com/dropwizard/metrics/blob/3.2-development/metrics-core/src/main/java/com/codahale/metrics/Reservoir.java]:
> the javadoc for Reservoir says nothing about implementations being allowed to throw an exception
> from the getSnapshot method. As a result, all Dropwizard/Metrics reporters are broken, because
> nobody expects a metric to throw an exception on get. For example, our monitoring pipeline is
> broken with this exception:
> {noformat}
> com.fasterxml.jackson.databind.JsonMappingException: Unable to compute when histogram overflowed (through reference chain: java.util.UnmodifiableSortedMap["org.apache.cassandra.metrics.Table.ColUpdateTimeDeltaHistogram.all"])
>         at com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:339)
>         at com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:299)
>         at com.fasterxml.jackson.databind.ser.std.StdSerializer.wrapAndThrow(StdSerializer.java:342)
>         at com.fasterxml.jackson.databind.ser.std.MapSerializer.serializeFields(MapSerializer.java:620)
>         at com.fasterxml.jackson.databind.ser.std.MapSerializer.serialize(MapSerializer.java:519)
>         at com.fasterxml.jackson.databind.ser.std.MapSerializer.serialize(MapSerializer.java:31)
>         at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:130)
>         at com.fasterxml.jackson.databind.ObjectMapper.writeValue(ObjectMapper.java:2436)
>         at com.fasterxml.jackson.core.base.GeneratorBase.writeObject(GeneratorBase.java:355)
>         at com.fasterxml.jackson.core.JsonGenerator.writeObjectField(JsonGenerator.java:1442)
>         at com.codahale.metrics.json.MetricsModule$MetricRegistrySerializer.serialize(MetricsModule.java:188)
>         at com.codahale.metrics.json.MetricsModule$MetricRegistrySerializer.serialize(MetricsModule.java:171)
>         at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:130)
>         at com.fasterxml.jackson.databind.ObjectWriter$Prefetch.serialize(ObjectWriter.java:1428)
>         at com.fasterxml.jackson.databind.ObjectWriter._configAndWriteValue(ObjectWriter.java:1129)
>         at com.fasterxml.jackson.databind.ObjectWriter.writeValue(ObjectWriter.java:967)
>         at com.codahale.metrics.servlets.MetricsServlet.doGet(MetricsServlet.java:176)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>         at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:845)
>         at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1689)
>         at com.ringcentral.slf4j.CleanMDCFilter.doFilter(CleanMDCFilter.java:18)
>         at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1676)
>         at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
>         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>         at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
>         at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>         at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>         at org.eclipse.jetty.server.Server.handle(Server.java:524)
>         at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:319)
>         at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:253)
>         at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
>         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>         at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>         at com.ringcentral.concurrent.executors.MonitoredExecutor$MonitoredRunnable.run(MonitoredExecutor.java:220)
>         at com.ringcentral.concurrent.executors.ContextAwareExecutor$ContextAwareRunnable.run(ContextAwareExecutor.java:34)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalStateException: Unable to compute when histogram overflowed
>         at org.apache.cassandra.metrics.DecayingEstimatedHistogramReservoir$EstimatedHistogramReservoirSnapshot.getMean(DecayingEstimatedHistogramReservoir.java:472)
>         at com.codahale.metrics.json.MetricsModule$HistogramSerializer.serialize(MetricsModule.java:72)
>         at com.codahale.metrics.json.MetricsModule$HistogramSerializer.serialize(MetricsModule.java:56)
>         at com.fasterxml.jackson.databind.ser.std.MapSerializer.serializeFields(MapSerializer.java:616)
>         ... 40 common frames omitted
> {noformat}
> The most obvious solution to resolve the overflow is to replace the exceeded value with the
> highest trackable value, [similar to this|https://github.com/vladimir-bukhtoyarov/rolling-metrics/blob/master/src/main/java/com/github/rollingmetrics/histogram/OverflowResolver.java#L34].
> I have implemented the fix in a [branch on github|https://github.com/vladimir-bukhtoyarov/cassandra/tree/fix-reservoir-overflow],
> see [this particular commit|https://github.com/vladimir-bukhtoyarov/cassandra/commit/4dca54c1000576a892a77bc716f87adc4bc05ecc]
> for details.
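> A rough illustration of that approach (a sketch only, not the linked commit; {{bucketOffsets}} and {{buckets}} are placeholder names for the reservoir internals):
> {noformat}
> // Sketch of the "reduce to highest trackable value" idea, not the actual patch.
> // Values above the last bucket offset are clamped into the highest regular
> // bucket instead of the overflow bucket, so getSnapshot() never has to throw.
> // Uses java.util.Arrays.
> void update(long value)
> {
>     long highestTrackable = bucketOffsets[bucketOffsets.length - 1];
>     if (value > highestTrackable)
>         value = highestTrackable;       // clamp instead of overflowing
>     int index = Arrays.binarySearch(bucketOffsets, value);
>     if (index < 0)
>         index = -index - 1;             // smallest offset >= value
>     buckets[index]++;
> }
> {noformat}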



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
