accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-3421) DistributedTrace.enable will eat exceptions about failing to connect to ZK
Date Mon, 15 Dec 2014 17:07:13 GMT


Josh Elser commented on ACCUMULO-3421:

[~billie.rinaldi] brought up a good point to me in chat: we might not want the exception
to propagate back up to the client. If there is a spurious ZK exception, or ZK is just unavailable
at the moment, it's probably undesirable to tank the application. However, tracing currently
never recovers from such a failure, because we rely on a watcher to update ZooTraceClient, and
that watcher isn't set if we fail to talk to ZK during initialization.

Perhaps we need to watch for the exception and start some timer thread to re-attempt the connection
to ZK. After we connect successfully (get the hosts, if any, the first time) the Watcher should
be sufficient.
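
A rough sketch of that retry idea, under stated assumptions: `InitialConnectRetrier`, `connectAndWatch`, and `retryMillis` are hypothetical names for illustration, not existing Accumulo code. The `Runnable` stands in for whatever reads the tracer hosts from ZK and sets the Watcher, and is assumed to throw a RuntimeException when ZK is unreachable (as in the stack trace on this issue).

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical helper (not part of Accumulo): retries the initial ZK setup until it succeeds. */
class InitialConnectRetrier {
  // Single daemon thread so a pending retry never keeps the client JVM alive.
  private final ScheduledExecutorService timer =
      Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "trace-zk-retry");
        t.setDaemon(true);
        return t;
      });
  private final AtomicBoolean connected = new AtomicBoolean(false);
  private final Runnable connectAndWatch; // assumed: reads hosts and registers the Watcher
  private final long retryMillis;

  InitialConnectRetrier(Runnable connectAndWatch, long retryMillis) {
    this.connectAndWatch = connectAndWatch;
    this.retryMillis = retryMillis;
  }

  /** Tries once on the calling thread; on failure, schedules retries instead of throwing. */
  void start() {
    attempt();
  }

  private void attempt() {
    try {
      connectAndWatch.run();
      connected.set(true);
      timer.shutdown(); // from here on the Watcher is expected to keep things up to date
    } catch (RuntimeException e) {
      // ZK is unavailable right now: report it and try again later rather than failing the client.
      System.err.println("Tracing setup failed, will retry: " + e.getMessage());
      timer.schedule(this::attempt, retryMillis, TimeUnit.MILLISECONDS);
    }
  }

  boolean isConnected() {
    return connected.get();
  }
}
```

A caller would construct this with the real setup step and call `start()` once. `isConnected()` lets a test wait until tracing is actually available instead of assuming that `enable` succeeded.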

> DistributedTrace.enable will eat exceptions about failing to connect to ZK
> --------------------------------------------------------------------------
>                 Key: ACCUMULO-3421
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>          Components: test
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>             Fix For: 1.7.0
> From a failed TracerRecoversAfterOfflineTableIT
> {noformat}
> java.lang.RuntimeException: Failed to connect to zookeeper (localhost:2181) within 2x zookeeper timeout period 30000
> 	at org.apache.accumulo.fate.zookeeper.ZooSession.connect(
> 	at org.apache.accumulo.fate.zookeeper.ZooSession.getSession(
> 	at org.apache.accumulo.fate.zookeeper.ZooReader.getSession(
> 	at org.apache.accumulo.fate.zookeeper.ZooReader.getZooKeeper(
> 	at org.apache.accumulo.fate.zookeeper.ZooReader.exists(
> 	at org.apache.accumulo.tracer.ZooTraceClient.process(
> 	at org.apache.accumulo.tracer.ZooTraceClient.configure(
> 	at org.apache.accumulo.core.trace.DistributedTrace.loadInstance(
> 	at org.apache.accumulo.core.trace.DistributedTrace.loadSpanReceivers(
> 	at org.apache.accumulo.core.trace.DistributedTrace.enableTracing(
> 	at org.apache.accumulo.core.trace.DistributedTrace.enable(
> 	at org.apache.accumulo.core.trace.DistributedTrace.enable(
> 	at org.apache.accumulo.test.TracerRecoversAfterOfflineTableIT.test(
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> 	at java.lang.reflect.Method.invoke(
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(
> 	at
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(
> 	at org.junit.internal.runners.statements.RunBefores.evaluate(
> 	at org.junit.internal.runners.statements.RunAfters.evaluate(
> 	at org.junit.internal.runners.statements.FailOnTimeout$
> {noformat}
> The problem is that {{org.apache.accumulo.tracer.ZooTraceClient.process(}} eats the Exception, so it doesn't propagate back up the stack through {{org.apache.accumulo.tracer.ZooTraceClient.configure(}}.
> Thus, the test just saw a "successful" call to DistributedTrace.enable and went on to run, only to fail later because tracing wasn't actually enabled.
> I think we need to make sure that such an exception propagates back to the caller.
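
To illustrate the difference described in the issue, here is a minimal sketch contrasting the two behaviors. The class and method names are made up for illustration and do not mirror the actual ZooTraceClient code; the `Runnable` stands in for the ZK lookup that can throw a RuntimeException.

```java
/** Hypothetical illustration only; these are not Accumulo classes or methods. */
class TraceSetupSketch {
  /** Swallows the failure: the caller cannot tell that setup never happened. */
  static void configureSwallowing(Runnable zkLookup) {
    try {
      zkLookup.run();
    } catch (RuntimeException e) {
      System.err.println("Tracing setup failed: " + e.getMessage());
    }
  }

  /** Propagates the failure with context, so the caller sees that tracing is not enabled. */
  static void configurePropagating(Runnable zkLookup) {
    try {
      zkLookup.run();
    } catch (RuntimeException e) {
      throw new IllegalStateException("Tracing was not enabled", e);
    }
  }
}
```

With the first variant a test proceeds as if tracing were on. With the second, the failure surfaces at the call site, where the caller can decide whether to fail or carry on.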

