phoenix-dev mailing list archives

From "Enis Soztutar (JIRA)" <>
Subject [jira] [Commented] (PHOENIX-3062) JMXCacheBuster restarting the metrics system causes PhoenixTracingEndToEndIT to hang
Date Sat, 04 Mar 2017 00:07:45 GMT


Enis Soztutar commented on PHOENIX-3062:

The {{TraceMetricSource}} javadoc explains some of this, but from what I remember, HTrace works by
sending all traces to the configured {{SpanReceiver}}, so the HDFS, HBase and Phoenix traces
all go to the same {{SpanReceiver}}. {{TraceMetricSource}} implements {{SpanReceiver}} and
forwards the spans to the metrics system. The {{PhoenixMetricsSink}} runs periodically
via the metrics subsystem, gets the buffered traces via the getMetrics() call, and then
issues the Phoenix writes.
As long as we still implement {{SpanReceiver}}, the traces will be collected from all sources
(HDFS, HBase, Phoenix). We just need to remove the metrics dependency by running the
{{PhoenixMetricsSink}} on its own scheduled thread, and put a bounded buffer queue (or something
similar) in front of it, so that traces are dropped if we cannot keep up. Should be an easy patch.
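
A rough sketch of the bounded-queue-plus-scheduled-thread idea, using only java.util.concurrent.
The class name BoundedTraceWriter, its methods, and the batchWriter callback are hypothetical and
are not part of Phoenix or HTrace; the callback merely stands in for whatever issues the Phoenix
writes. This is an illustration under those assumptions, not the actual patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

/** Hypothetical helper: buffers spans in a bounded queue and flushes them on its own thread. */
class BoundedTraceWriter<T> implements AutoCloseable {
    private final BlockingQueue<T> queue;
    private final Consumer<List<T>> batchWriter; // assumed stand-in for the Phoenix writes
    private final AtomicLong dropped = new AtomicLong();
    private final ScheduledExecutorService scheduler;

    BoundedTraceWriter(int capacity, Consumer<List<T>> batchWriter) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.batchWriter = batchWriter;
        this.scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "trace-writer");
            t.setDaemon(true); // do not keep the JVM alive just for tracing
            return t;
        });
    }

    /** Called on the span-receiving side; never blocks. Returns false if the span was dropped. */
    boolean receive(T span) {
        boolean accepted = queue.offer(span); // offer() fails fast when the queue is full
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    /** Starts the periodic flush, independent of the metrics system. */
    void start(long period, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                flush();
            } catch (RuntimeException e) {
                // Swallow so that one failed write does not cancel future runs.
            }
        }, period, period, unit);
    }

    /** Drains everything currently buffered and hands it to the writer; returns the batch size. */
    int flush() {
        List<T> batch = new ArrayList<>();
        queue.drainTo(batch);
        if (!batch.isEmpty()) {
            batchWriter.accept(batch);
        }
        return batch.size();
    }

    long droppedCount() {
        return dropped.get();
    }

    @Override
    public void close() {
        scheduler.shutdown();
        flush(); // best-effort final flush of whatever is still buffered
    }
}
```

In this sketch the {{SpanReceiver}} implementation would call receive() for each span and start()
would be invoked once at setup. Since nothing is registered with the metrics system, a restart of
the metrics system should not affect it.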

> JMXCacheBuster restarting the metrics system causes PhoenixTracingEndToEndIT to hang
> ------------------------------------------------------------------------------------
>                 Key: PHOENIX-3062
>                 URL:
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 4.10.0
>         Attachments: phoenix-3062_v1.patch
> With some recent fixes in the HBase metrics system, we are now effectively restarting
> the metrics system (in HBase-1.3.0, probably not affecting 1.2.0). Since we use a custom sink
> in the PhoenixTracingEndToEndIT, restarting the metrics system loses the registered sink, thus
> causing a hang.
> We need a fix in both HBase and Phoenix so that we do not restart the metrics system during tests.

> Thanks to [~sergey.soldatov] for analyzing the initial root cause of the hang. 
> See HBASE-14166 and others. 

This message was sent by Atlassian JIRA