accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (ACCUMULO-1916) Hung TServer during CI with agitation
Date Sun, 24 Nov 2013 05:14:35 GMT

     [ https://issues.apache.org/jira/browse/ACCUMULO-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Josh Elser resolved ACCUMULO-1916.
----------------------------------

    Resolution: Not A Problem

> Hung TServer during CI with agitation
> -------------------------------------
>
>                 Key: ACCUMULO-1916
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1916
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>         Environment: hdp-2.0 (apache hadoop 2.2.0), Accumulo 1.5.1-SNAPSHOT (11/14/2013
timeframe)
>            Reporter: Josh Elser
>         Attachments: jstack.1, jstack.2, jstack.3
>
>
> Ran continuous ingest on a 6-node system for ~18 hours with full agitation (datanode, tserver,
and master/gc).
> Checked on the system and noticed that queries were still running but ingest was hung.
A single tabletserver believed that every new WAL file it tried to create could not
be replicated.
> {noformat}
> 2013-11-21 10:45:45,998 [tabletserver.LargestFirstMemoryManager] DEBUG: COMPACTING 9;7c905f;7c387d
 total = 1,787,421,387 ingestMemory = 1,787,421,387
> 2013-11-21 10:45:45,998 [tabletserver.LargestFirstMemoryManager] DEBUG: chosenMem = 57,180,830
chosenIT = 0.15 load 57,187,348
> 2013-11-21 10:45:46,000 [tabletserver.NativeMap] DEBUG: Allocated native map 0x000000000151c0e0
> 2013-11-21 10:45:46,000 [tabletserver.Tablet] DEBUG: MinC initiate lock 0.00 secs
> 2013-11-21 10:45:46,001 [tabletserver.MinorCompactor] DEBUG: Begin minor compaction /accumulo/tables/9/t-0000skc/F0001bmp.rf_tmp
9;7c905f;7c387d
> 2013-11-21 10:45:46,038 [tabletserver.TabletServer] DEBUG: UpSess 192.168.56.172:50599
23,482 in 0.765s, at=[0 3 0.05 63] ft=0.619s(pt=0.013s lt=0.413s ct=0.193s)
> 2013-11-21 10:45:46,252 [tabletserver.LargestFirstMemoryManager] DEBUG: BEFORE compactionThreshold
= 0.834 maxObserved = 1,815,829,128
> 2013-11-21 10:45:46,253 [tabletserver.LargestFirstMemoryManager] DEBUG: AFTER compactionThreshold
= 0.834
> 2013-11-21 10:45:46,900 [tabletserver.TabletServer] DEBUG: gc ParNew=287.26(+0.07) secs
ConcurrentMarkSweep=19.96(+0.00) secs freemem=622,040,112(+393,782,976) totalmem=1,021,313,024
> 2013-11-21 10:45:47,965 [hdfs.DFSClient] WARN : DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /accumulo/wal/192.168.56.172+9997/087cf8a0-c1c1-41e2-a2cb-343afb9dd9e8
could only be replicated to 0 nodes instead of minReplication (=1).  There are 5 datanode(s)
running and no node(s) are excluded in this operation.
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2503)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
> 2013-11-21 10:45:47,968 [hdfs.DFSClient] WARN : Error while syncing
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /accumulo/wal/192.168.56.172+9997/087cf8a0-c1c1-41e2-a2cb-343afb9dd9e8
could only be replicated to 0 nodes instead of minReplication (=1).  There are 5 datanode(s)
running and no node(s) are excluded in this operation.
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2503)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
> 2013-11-21 10:45:47,999 [log.DfsLogger] WARN : Exception syncing java.lang.reflect.InvocationTargetException
> 2013-11-21 10:45:48,002 [log.TabletServerLogger] ERROR: Unexpected error writing to log,
retrying attempt 1
> java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
>         at org.apache.accumulo.server.tabletserver.log.DfsLogger$LoggerOperation.await(DfsLogger.java:178)
>         at org.apache.accumulo.server.tabletserver.log.TabletServerLogger.write(TabletServerLogger.java:279)
>         at org.apache.accumulo.server.tabletserver.log.TabletServerLogger.logManyTablets(TabletServerLogger.java:362)
>         at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler.flush(TabletServer.java:1552)
>         at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler.applyUpdates(TabletServer.java:1461)
>         at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.accumulo.trace.instrument.thrift.TraceWrap$1.invoke(TraceWrap.java:63)
>         at com.sun.proxy.$Proxy10.applyUpdates(Unknown Source)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$applyUpdates.getResult(TabletClientService.java:2080)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$applyUpdates.getResult(TabletClientService.java:2066)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>         at org.apache.accumulo.server.util.TServerUtils$TimedProcessor.process(TServerUtils.java:156)
>         at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:478)
>         at org.apache.accumulo.server.util.TServerUtils$THsHaServer$Invocation.run(TServerUtils.java:208)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
>         at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.reflect.InvocationTargetException
>         at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.accumulo.server.tabletserver.log.DfsLogger$LogSyncingTask.run(DfsLogger.java:116)
>         ... 1 more
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /accumulo/wal/192.168.56.172+9997/087cf8a0-c1c1-41e2-a2cb-343afb9dd9e8
could only be replicated to 0 nodes instead of minReplication (=1).  There are 5 datanode(s)
running and no node(s) are excluded in this operation.
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2503)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
> 2013-11-21 10:45:48,004 [log.DfsLogger] WARN : Exception syncing java.lang.reflect.InvocationTargetException
> 2013-11-21 10:45:49,511 [log.TabletServerLogger] ERROR: Unexpected error writing to log,
retrying attempt 1
> java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
>         at org.apache.accumulo.server.tabletserver.log.DfsLogger$LoggerOperation.await(DfsLogger.java:178)
>         at org.apache.accumulo.server.tabletserver.log.TabletServerLogger.write(TabletServerLogger.java:279)
>         at org.apache.accumulo.server.tabletserver.log.TabletServerLogger.logManyTablets(TabletServerLogger.java:362)
>         at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler.flush(TabletServer.java:1552)
>         at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler.applyUpdates(TabletServer.java:1461)
>         at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.accumulo.trace.instrument.thrift.TraceWrap$1.invoke(TraceWrap.java:63)
>         at com.sun.proxy.$Proxy10.applyUpdates(Unknown Source)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$applyUpdates.getResult(TabletClientService.java:2080)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$applyUpdates.getResult(TabletClientService.java:2066)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>         at org.apache.accumulo.server.util.TServerUtils$TimedProcessor.process(TServerUtils.java:156)
>         at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:478)
>         at org.apache.accumulo.server.util.TServerUtils$THsHaServer$Invocation.run(TServerUtils.java:208)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
>         at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.reflect.InvocationTargetException
>         at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.accumulo.server.tabletserver.log.DfsLogger$LogSyncingTask.run(DfsLogger.java:116)
>         ... 1 more
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /accumulo/wal/192.168.56.172+9997/087cf8a0-c1c1-41e2-a2cb-343afb9dd9e8
could only be replicated to 0 nodes instead of minReplication (=1).  There are 5 datanode(s)
running and no node(s) are excluded in this operation.
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2503)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
> 2013-11-21 10:45:49,609 [log.DfsLogger] ERROR: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
File /accumulo/wal/192.168.56.172+9997/087cf8a0-c1c1-41e2-a2cb-343afb9dd9e8 could only be
replicated to 0 nodes instead of minReplication (=1).  There are 5 datanode(s) running and
no node(s) are excluded in this operation.
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2503)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)
> {noformat}
> After this, the tabletserver appears to give up on that file name and retry with a new
file:
> {noformat}
> 2013-11-21 10:45:49,610 [log.DfsLogger] DEBUG: DfsLogger.open() begin
> 2013-11-21 10:45:49,610 [log.DfsLogger] DEBUG: Found CREATE enum CREATE
> 2013-11-21 10:45:49,611 [log.DfsLogger] DEBUG: Found synch enum SYNC_BLOCK
> 2013-11-21 10:45:49,611 [log.DfsLogger] DEBUG: CreateFlag set: [CREATE, SYNC_BLOCK]
> 2013-11-21 10:45:49,611 [log.DfsLogger] DEBUG: creating /accumulo/wal/192.168.56.172+9997/c44258a0-2f42-42e7-a43b-8b1553722ad3
with SYNCH_BLOCK flag
> 2013-11-21 10:45:49,617 [crypto.CryptoModuleFactory] DEBUG: About to instantiate crypto
module NullCryptoModule
> 2013-11-21 10:45:49,625 [hdfs.DFSClient] WARN : DataStreamer Exception
> ..... same exceptions as before but with the new file name.....
> {noformat}
> I tried to get a heap dump from the tabletserver, but it ended up throwing an OutOfMemoryError.
I did capture some stack traces, which are attached.
> Restarting this tabletserver appears to have resolved the issue.
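> As a general troubleshooting note (not part of the original report): when HDFS reports "could
only be replicated to 0 nodes" despite datanodes running, a common first step is to confirm how
many datanodes the namenode actually considers live. A minimal sketch, assuming an {{hdfs}}
client on the PATH; the sample-output parsing shown is illustrative only:

```shell
# Hypothetical diagnostic: count live datanodes as seen by the namenode.
# On a real cluster you would run:  hdfs dfsadmin -report
# Below we parse a sample line in the format that command prints.
report_line="Live datanodes (5):"
live=$(printf '%s\n' "$report_line" | sed -n 's/^Live datanodes (\([0-9][0-9]*\)).*/\1/p')
echo "live datanodes: $live"
```

> If the live count matches expectations (as it did here, with 5 datanodes up), the error
usually points at a client-side or transient placement problem rather than missing datanodes.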



--
This message was sent by Atlassian JIRA
(v6.1#6144)
