accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-3421) DistributedTrace.enable will eat exceptions about failing to connect to ZK
Date Tue, 16 Dec 2014 20:07:14 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248818#comment-14248818
] 

Josh Elser commented on ACCUMULO-3421:
--------------------------------------

The final resolution was to change what DistributedTrace.enable does when it fails to connect.
The first connection attempt is still synchronous, as before, but on failure it now retries
the ZooKeeper connection asynchronously. This ensures that tracing is eventually enabled even
when ZooKeeper is unavailable or unreachable for some time.
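
For anyone reading this later, here is a rough sketch of the "one synchronous attempt, then
asynchronous retries" pattern described above. It is not the attached patch: the class name
RetryingTraceInit, the Connector interface, and the fixed retry delay are all hypothetical
and stand in for whatever the real trace client does.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch; not the actual ACCUMULO-3421 change. */
public class RetryingTraceInit {

  /** Stand-in for whatever connects the trace client to ZooKeeper. */
  public interface Connector {
    void connect() throws Exception;
  }

  private final Connector connector;
  private final long retryDelayMillis;
  private final AtomicBoolean enabled = new AtomicBoolean(false);
  private final ScheduledExecutorService retryPool =
      Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "trace-init-retry");
        t.setDaemon(true); // never keep the JVM alive just to retry
        return t;
      });

  public RetryingTraceInit(Connector connector, long retryDelayMillis) {
    this.connector = connector;
    this.retryDelayMillis = retryDelayMillis;
  }

  /** The first attempt runs on the caller's thread; failures are retried in the background. */
  public void enable() {
    if (enabled.get()) {
      return; // already connected, nothing to do
    }
    attempt();
  }

  private void attempt() {
    try {
      connector.connect();
      enabled.set(true);
      retryPool.shutdown(); // no further retries needed
    } catch (Exception e) {
      // Report the failure instead of silently dropping it, then schedule another try.
      System.err.println("Trace init failed, retrying in " + retryDelayMillis + " ms: " + e);
      retryPool.schedule(this::attempt, retryDelayMillis, TimeUnit.MILLISECONDS);
    }
  }

  public boolean isEnabled() {
    return enabled.get();
  }
}
```

With this shape, enable() returns promptly even when the first attempt fails, so a caller that
needs tracing to be active right away would still have to check isEnabled() (or an equivalent)
before relying on it.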

> DistributedTrace.enable will eat exceptions about failing to connect to ZK
> --------------------------------------------------------------------------
>
>                 Key: ACCUMULO-3421
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3421
>             Project: Accumulo
>          Issue Type: Bug
>          Components: test
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>             Fix For: 1.7.0
>
>         Attachments: 0001-ACCUMULO-3421-Retry-initialization-of-trace-client.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> From a failed TracerRecoversAfterOfflineTableIT
> {noformat}
> java.lang.RuntimeException: Failed to connect to zookeeper (localhost:2181) within 2x zookeeper timeout period 30000
> 	at org.apache.accumulo.fate.zookeeper.ZooSession.connect(ZooSession.java:118)
> 	at org.apache.accumulo.fate.zookeeper.ZooSession.getSession(ZooSession.java:163)
> 	at org.apache.accumulo.fate.zookeeper.ZooReader.getSession(ZooReader.java:39)
> 	at org.apache.accumulo.fate.zookeeper.ZooReader.getZooKeeper(ZooReader.java:43)
> 	at org.apache.accumulo.fate.zookeeper.ZooReader.exists(ZooReader.java:166)
> 	at org.apache.accumulo.tracer.ZooTraceClient.process(ZooTraceClient.java:82)
> 	at org.apache.accumulo.tracer.ZooTraceClient.configure(ZooTraceClient.java:75)
> 	at org.apache.accumulo.core.trace.DistributedTrace.loadInstance(DistributedTrace.java:184)
> 	at org.apache.accumulo.core.trace.DistributedTrace.loadSpanReceivers(DistributedTrace.java:166)
> 	at org.apache.accumulo.core.trace.DistributedTrace.enableTracing(DistributedTrace.java:143)
> 	at org.apache.accumulo.core.trace.DistributedTrace.enable(DistributedTrace.java:101)
> 	at org.apache.accumulo.core.trace.DistributedTrace.enable(DistributedTrace.java:86)
> 	at org.apache.accumulo.test.TracerRecoversAfterOfflineTableIT.test(TracerRecoversAfterOfflineTableIT.java:77)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 	at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}
> The problem is that {{org.apache.accumulo.tracer.ZooTraceClient.process(ZooTraceClient.java:82)}} eats the Exception, so it doesn't propagate back up the stack through {{org.apache.accumulo.tracer.ZooTraceClient.configure(ZooTraceClient.java:75)}}.
> Thus, the test just saw a "successful" call to DistributedTrace.enable and went on to run, then ultimately failed because tracing wasn't actually enabled.
> I think we need to make sure that such an exception propagates back to the caller.
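
To illustrate the difference the description is pointing at, here is a minimal sketch that
contrasts swallowing a failure with propagating it. The class PropagationSketch, the Lookup
interface, and both method names are hypothetical; this is not the ZooTraceClient code.

```java
/** Hypothetical sketch of swallowing versus propagating a failure; names are illustrative. */
public class PropagationSketch {

  /** Stand-in for the ZooKeeper lookup that can fail. */
  public interface Lookup {
    void run() throws Exception;
  }

  /** Swallows the failure: the caller cannot tell that setup did not happen. */
  public static void configureSwallowing(Lookup lookup) {
    try {
      lookup.run();
    } catch (Exception e) {
      System.err.println("ignored: " + e); // logged at best, never rethrown
    }
  }

  /** Propagates the failure, keeping the original exception as the cause. */
  public static void configurePropagating(Lookup lookup) {
    try {
      lookup.run();
    } catch (Exception e) {
      throw new IllegalStateException("trace client setup failed", e);
    }
  }
}
```

A caller of the first method sees a normal return either way, which matches the "successful"
call the test observed. A caller of the second gets an exception it can fail fast on.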



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
