accumulo-notifications mailing list archives

From "Vikram Srivastava (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-2154) NoNodeException error in master
Date Sat, 18 Jan 2014 06:43:19 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13875568#comment-13875568 ]

Vikram Srivastava commented on ACCUMULO-2154:
---------------------------------------------

getList() being called by multiple processes isn't the issue here. This exception happens
when MasterClientServiceHandler is calling getList() and, at the same time, one of the other
two call sites calls delete(). So we do need to ensure getList() exits before delete() is
called. I've also made post() synchronized so that getList() doesn't miss any dead servers
that get added after zoo.getChildren() is called.
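
To illustrate the locking described above, here is a minimal sketch. It is not the attached
patch: the class name DeadServerListSketch is hypothetical, and a ConcurrentHashMap stands in
for the ZooKeeper subtree under .../dead/tservers so the example runs without a ZooKeeper
client. Only the method names getList(), delete() and post() come from the discussion.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for DeadServerList. The map plays the role of the
// ZooKeeper nodes: key = server address, value = reason the server is dead.
class DeadServerListSketch {
    private final Map<String, String> nodes = new ConcurrentHashMap<>();

    // Two-step read: list the children, then fetch each child's data.
    // Without the lock, delete() could remove a child between the two steps,
    // which is the analogue of the NoNodeException in the stack trace.
    public synchronized List<String> getList() {
        List<String> result = new ArrayList<>();
        List<String> children = new ArrayList<>(nodes.keySet()); // like zoo.getChildren()
        for (String child : children) {
            String reason = nodes.get(child);                    // like zoo.getData()
            if (reason == null) {
                throw new IllegalStateException("NoNode for " + child);
            }
            result.add(child + ": " + reason);
        }
        return result;
    }

    // Holds the same monitor as getList(), so it cannot run mid-read.
    public synchronized void delete(String server) {
        nodes.remove(server);
    }

    // Also synchronized, so an entry is added either before or after a
    // complete getList() pass, never in the middle of one.
    public synchronized void post(String server, String cause) {
        nodes.put(server, cause);
    }
}
```

Because all three methods lock the same object, a getList() pass always sees a consistent set
of children. Note that this only serializes callers inside one JVM; it does not guard against
another process modifying the same ZooKeeper nodes.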


> NoNodeException error in master
> -------------------------------
>
>                 Key: ACCUMULO-2154
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2154
>             Project: Accumulo
>          Issue Type: Bug
>          Components: master
>         Environment: 1.6.0 sha 417902e218c566333b6ea5ac492186ae305e5e16
>            Reporter: John Vines
>            Assignee: Vikram Srivastava
>              Labels: PatchAvailable
>             Fix For: 1.6.0
>
>         Attachments: ACCUMULO-2154.v1.patch.txt
>
>
> I have a test that brings accumulo down hard after a minute and then brings it back up again. I was running it overnight and I saw this stack trace once. Not sure if it's a problem or not though.
> {code}org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /accumulo/617ee3a7-98b9-4f5f-af13-8894afe7c33c/dead/tservers/10.10.1.148:9997
> 	org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /accumulo/617ee3a7-98b9-4f5f-af13-8894afe7c33c/dead/tservers/10.10.1.148:9997
> 		at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
> 		at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> 		at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
> 		at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1180)
> 		at org.apache.accumulo.fate.zookeeper.ZooReader.getData(ZooReader.java:45)
> 		at org.apache.accumulo.server.master.state.DeadServerList.getList(DeadServerList.java:52)
> 		at org.apache.accumulo.master.MasterClientServiceHandler.getMasterStats(MasterClientServiceHandler.java:268)
> 		at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 		at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> 		at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 		at java.lang.reflect.Method.invoke(Method.java:597)
> 		at org.apache.accumulo.trace.instrument.thrift.TraceWrap$1.invoke(TraceWrap.java:63)
> 		at com.sun.proxy.$Proxy11.getMasterStats(Unknown Source)
> 		at org.apache.accumulo.core.master.thrift.MasterClientService$Processor$getMasterStats.getResult(MasterClientService.java:1414)
> 		at org.apache.accumulo.core.master.thrift.MasterClientService$Processor$getMasterStats.getResult(MasterClientService.java:1398)
> 		at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> 		at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> 		at org.apache.accumulo.server.util.TServerUtils$TimedProcessor.process(TServerUtils.java:171)
> 		at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
> 		at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> 		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> 		at java.lang.Thread.run(Thread.java:662){code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
