accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Created] (ACCUMULO-3148) TabletServer didn't get Session expired in HalfDeadTServerIT
Date Thu, 18 Sep 2014 23:43:34 GMT
Josh Elser created ACCUMULO-3148:
------------------------------------

             Summary: TabletServer didn't get Session expired in HalfDeadTServerIT
                 Key: ACCUMULO-3148
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3148
             Project: Accumulo
          Issue Type: Bug
          Components: test
            Reporter: Josh Elser
            Assignee: Josh Elser
             Fix For: 1.6.1, 1.7.0


Been seeing spurious failures with HalfDeadTServerIT where the tserver doesn't get the ZK session
expiration:

{noformat}
2014-09-15 09:39:59,201 [tserver.TabletServer] DEBUG: ScanSess tid 172.31.33.94:35957 !0 0 entries in 0.07 secs, nbTimes = [63 63 63.00 1]
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
2014-09-15 09:40:20,088 [tserver.TabletServer] FATAL: Lost tablet server lock (reason = LOCK_DELETED), exiting.
2014-09-15 09:40:20,088 [zookeeper.ZooCache] WARN : Zookeeper error, will retry
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /accumulo/d0b9b8e7-9869-4b00-9ae7-317f5231f2c1/tables/1/conf/table.iterator.minc.vers.opt.maxVersions
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
	at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
	at org.apache.accumulo.fate.zookeeper.ZooCache$2.run(ZooCache.java:261)
	at org.apache.accumulo.fate.zookeeper.ZooCache.retry(ZooCache.java:153)
	at org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:277)
	at org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:224)
	at org.apache.accumulo.server.conf.ZooCachePropertyAccessor.get(ZooCachePropertyAccessor.java:114)
	at org.apache.accumulo.server.conf.ZooCachePropertyAccessor.getProperties(ZooCachePropertyAccessor.java:144)
	at org.apache.accumulo.server.conf.TableConfiguration.getProperties(TableConfiguration.java:108)
	at org.apache.accumulo.core.conf.AccumuloConfiguration.iterator(AccumuloConfiguration.java:69)
	at org.apache.accumulo.core.conf.ConfigSanityCheck.validate(ConfigSanityCheck.java:40)
	at org.apache.accumulo.server.conf.ServerConfigurationFactory.getTableConfiguration(ServerConfigurationFactory.java:155)
	at org.apache.accumulo.server.conf.ServerConfiguration.getTableConfiguration(ServerConfiguration.java:69)
	at org.apache.accumulo.tserver.TabletServer.getTableConfiguration(TabletServer.java:3983)
	at org.apache.accumulo.tserver.Tablet.<init>(Tablet.java:1277)
	at org.apache.accumulo.tserver.Tablet.<init>(Tablet.java:1256)
	at org.apache.accumulo.tserver.Tablet.<init>(Tablet.java:1112)
	at org.apache.accumulo.tserver.Tablet.<init>(Tablet.java:1089)
	at org.apache.accumulo.tserver.TabletServer$AssignmentHandler.run(TabletServer.java:2935)
	at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
	at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
	at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
	at java.lang.Thread.run(Thread.java:745)
2014-09-15 09:40:20,090 [tserver.TabletServer] WARN : Check for long GC pauses not called in a timely fashion. Expected every 5.0 seconds but was 16.3 seconds since last check
2014-09-15 09:40:20,477 [datanode.DataNode] ERROR: 127.0.0.1:57185:DataXceiver error processing WRITE_BLOCK operation  src: /127.0.0.1:42146 dst: /127.0.0.1:57185
java.io.IOException: Premature EOF from inputStream
	at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:194)
	at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
	at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
	at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:467)
	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:771)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:718)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:126)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:72)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:225)
	at java.lang.Thread.run(Thread.java:745)
{noformat}

It looks like the tserver killed itself (reason = LOCK_DELETED) after the connection loss, but before
it retried the connection and could receive the session expiration.
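
To make the suspected ordering concrete, here is a minimal, self-contained sketch of the race. It is only a model: the class name, the `Event` enum, and `firstFatalEvent` are hypothetical and do not correspond to Accumulo or ZooKeeper APIs. It assumes the tserver exits on whichever fatal notification (lock deleted or session expired) it observes first.

```java
import java.util.Arrays;
import java.util.List;

public class ExitReasonSketch {
    // Hypothetical event names, loosely modeled on what the log above shows.
    enum Event { CONNECTION_LOSS, LOCK_DELETED, SESSION_EXPIRED }

    // Returns the first event that would make the modeled server exit,
    // or null if no fatal event was observed.
    static Event firstFatalEvent(List<Event> observed) {
        for (Event e : observed) {
            if (e == Event.LOCK_DELETED || e == Event.SESSION_EXPIRED) {
                return e;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Ordering the test appears to expect: reconnect happens first,
        // so the session expiration is seen before anything else fatal.
        List<Event> expectedByTest = Arrays.asList(
            Event.CONNECTION_LOSS, Event.SESSION_EXPIRED, Event.LOCK_DELETED);

        // Ordering suggested by the failing run: the lock deletion is
        // noticed before the reconnect, so the expiration is never seen.
        List<Event> seenInFailure = Arrays.asList(
            Event.CONNECTION_LOSS, Event.LOCK_DELETED);

        System.out.println(firstFatalEvent(expectedByTest)); // SESSION_EXPIRED
        System.out.println(firstFatalEvent(seenInFailure));  // LOCK_DELETED
    }
}
```

If this model is right, a test that waits only for the session-expiration message will fail spuriously whenever the lock deletion wins the race.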



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
