cassandra-dev mailing list archives

From Nate McCall <zznat...@gmail.com>
Subject Re: Re[3]: Histogram error "Unable to compute ceiling for max when histogram overflowed"
Date Thu, 20 Oct 2016 20:56:09 GMT
Open a new issue and link to CASSANDRA-11063. Including a test case
addressing your issue that fails after the 11063 change would be ideal
as well.

Either way, thanks for the continued attention on this.

On Fri, Oct 21, 2016 at 4:06 AM, Владимир Бухтояров
<jsecoder@mail.ru.invalid> wrote:
> I have investigated the problem and found that monitoring has changed
> significantly since 3.7 (the version where I got the exception in
> com.codahale.metrics.servlets.MetricsServlet). Since version 3.9 it is
> enough to change the behavior of DecayingEstimatedHistogramReservoir;
> EstimatedHistogram can stay unchanged. Modifying
> DecayingEstimatedHistogramReservoir will be safe because, unlike
> EstimatedHistogram, DecayingEstimatedHistogramReservoir is not used for
> Cassandra's internal needs.
>
> I also found the resolution of issue CASSANDRA-12185 very strange: nothing
> was done to prevent the IllegalStateException, yet the issue is closed.
> Should I reopen #12185 or submit a pull request under a new issue?
>
>
> Best regards,
> Bukhtoyarov Vladimir
> email jsecoder@mail.ru
> skype live:fanat-tdd
> Github: https://github.com/vladimir-bukhtoyarov
> mobile +79618096798
>
>>Wednesday, 19 October 2016, 21:12 +03:00 from Владимир Бухтояров
>><jsecoder@mail.ru.INVALID>:
>>
>>Null (zero) snapshot values are useless for problem analysis, because it is
>>impossible to distinguish the case where there were no events from the case
>>where events were dispatched too slowly. I see nothing wrong with returning
>>the 99.9th percentile as 3h when the histogram is configured with a 3h max
>>and some latency is 4h.
>>
>>
>>Best regards,
>>Bukhtoyarov Vladimir
>>email  jsecoder@mail.ru
>>skype live:fanat-tdd
>>Github:  https://github.com/vladimir-bukhtoyarov
>>mobile  +79618096798
>>
>>>Wednesday, 19 October 2016, 20:17 +03:00 from Ken Hancock
>>><ken.hancock@schange.com>:
>>>
>>>I would suggest metrics should return null values instead of false values.
>>>
>>>On Wed, Oct 19, 2016 at 12:21 PM, Владимир Бухтояров <
>>> jsecoder@mail.ru.invalid > wrote:
>>>
>>>>
>>>> Hi to all,
>>>>
>>>> I want to fix  https://issues.apache.org/jira/browse/CASSANDRA-11063
>>>> This issue is very painful for me, because when something runs slowly
>>>> it is impossible to capture metrics and save them to a monitoring
>>>> database for future investigation. Moreover, when one histogram throws
>>>> an exception, many metrics exporters (for example MetricsServlet) are
>>>> unable to export metrics for the whole MetricRegistry, so when an
>>>> overflow happens in one histogram I have no history data at all.
>>>>
>>>> I propose to implement the following changes:
>>>> 1. DecayingEstimatedHistogramReservoir and EstimatedHistogram will
>>>> return the maximum trackable value instead of Long.MAX_VALUE.
>>>> 2. DecayingEstimatedHistogramReservoir and EstimatedHistogram will
>>>> never throw IllegalStateException; instead, they will use the maximum
>>>> trackable value as a regular value in percentile and average
>>>> calculations.
>>>> 3. If anybody wants to keep the old behavior (preferring a crash over
>>>> inaccurate reporting), I can add a configuration parameter for it.
>>>> Moreover, I can leave the old behavior as the default; for my needs it
>>>> is enough to have some option to avoid crashes.
>>>>
>>>>
>>>> Best regards,
>>>> Bukhtoyarov Vladimir
>>>> email  jsecoder@mail.ru
>>>> skype live:fanat-tdd
>>>> Github:  https://github.com/vladimir-bukhtoyarov
>>>>
>>
>
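
The three changes proposed in the original message can be illustrated with a
small sketch. This is not Cassandra's EstimatedHistogram or
DecayingEstimatedHistogramReservoir; the class ClampedHistogramSketch, its
method names, and the failOnOverflow flag are all hypothetical and only show
the clamping idea on a plain bucketed histogram. The exception message is the
one quoted in the subject line.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the Cassandra implementation.
class ClampedHistogramSketch {
    private final long[] bucketUpperBounds; // ascending; last entry is the maximum trackable value
    private final long[] counts;            // one extra trailing slot counts overflowed values
    private final boolean failOnOverflow;   // true = old behavior (throw), false = clamp

    ClampedHistogramSketch(long[] bucketUpperBounds, boolean failOnOverflow) {
        this.bucketUpperBounds = bucketUpperBounds.clone();
        this.counts = new long[bucketUpperBounds.length + 1];
        this.failOnOverflow = failOnOverflow;
    }

    void update(long value) {
        int index = Arrays.binarySearch(bucketUpperBounds, value);
        if (index < 0) {
            index = -index - 1; // insertion point: first bound greater than value
        }
        counts[index]++; // index == bucketUpperBounds.length means overflow
    }

    boolean isOverflowed() {
        return counts[counts.length - 1] > 0;
    }

    long maxTrackableValue() {
        return bucketUpperBounds[bucketUpperBounds.length - 1];
    }

    // Proposal item 1: report the maximum trackable value, not Long.MAX_VALUE.
    long max() {
        for (int i = counts.length - 1; i >= 0; i--) {
            if (counts[i] > 0) {
                return bucketUpperBounds[Math.min(i, bucketUpperBounds.length - 1)];
            }
        }
        return 0; // empty histogram
    }

    // Proposal items 2 and 3: clamp overflowed values unless the old behavior is requested.
    long percentile(double quantile) {
        if (quantile < 0.0 || quantile > 1.0) {
            throw new IllegalArgumentException("quantile must be in [0, 1]");
        }
        if (failOnOverflow && isOverflowed()) {
            throw new IllegalStateException(
                "Unable to compute ceiling for max when histogram overflowed");
        }
        long total = 0;
        for (long c : counts) {
            total += c;
        }
        if (total == 0) {
            return 0;
        }
        long rank = Math.max(1, (long) Math.ceil(quantile * total));
        long seen = 0;
        for (int i = 0; i < counts.length; i++) {
            seen += counts[i];
            if (seen >= rank) {
                // The overflow slot is reported as the maximum trackable value.
                return bucketUpperBounds[Math.min(i, bucketUpperBounds.length - 1)];
            }
        }
        return maxTrackableValue();
    }
}
```

With bucket bounds of 60, 120 and 180 minutes and a recorded latency of 240
minutes, the clamping mode reports the 99.9th percentile as 180 minutes (the
"3h max, 4h latency" case from the thread), while failOnOverflow = true
reproduces the exception.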
