accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-3775) Root tablet had 6,974 walogs
Date Thu, 07 May 2015 16:59:00 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14533002#comment-14533002 ]

Josh Elser commented on ACCUMULO-3775:
--------------------------------------

Thanks for the reviews, Keith and Eric!

> Root tablet had 6,974 walogs
> ----------------------------
>
>                 Key: ACCUMULO-3775
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3775
>             Project: Accumulo
>          Issue Type: Bug
>         Environment: Same as ACCUMULO-3774
>            Reporter: Keith Turner
>            Assignee: Eric Newton
>            Priority: Blocker
>              Labels: 1.7.0_QA
>             Fix For: 1.7.0
>
>         Attachments: ACCUMULO-3775-3.patch, ACCUMULO_3775-01.patch, ACCUMULO_3775_02.patch
>
>
> Before the deadlock described in ACCUMULO-3774, the root tablet recovered 6,974 walogs.
> Almost all of these were empty. Before the tserver was killed, there were thousands of
> messages like the following (I think this was caused by datanode agitation).
> {noformat}
> 2015-05-05 18:02:43,236 [log.TabletServerLogger] INFO : Using next log hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/a13aee79-c313-4298-b55a-8ec58ffb977c
> 2015-05-05 18:02:43,236 [log.TabletServerLogger] DEBUG: Creating next WAL
> 2015-05-05 18:02:43,236 [tserver.TabletServer] INFO : Writing log marker for level ROOT hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/a13aee79-c313-4298-b55a-8ec58ffb977c
> 2015-05-05 18:02:43,236 [log.DfsLogger] DEBUG: Address is worker10:9997
> 2015-05-05 18:02:43,236 [log.DfsLogger] DEBUG: DfsLogger.open() begin
> 2015-05-05 18:02:43,236 [util.MetadataTableUtil] DEBUG: Adding log entry hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/a13aee79-c313-4298-b55a-8ec58ffb977c
> 2015-05-05 18:02:43,237 [fs.VolumeManagerImpl] DEBUG: creating hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/295244ee-c9e3-404f-a3d8-9569e41ba8e1 with CreateFlag set: [CREATE, SYNC_BLOCK]
> 2015-05-05 18:02:43,246 [tserver.TabletServer] INFO : Writing log marker for level NORMAL hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/a13aee79-c313-4298-b55a-8ec58ffb977c
> 2015-05-05 18:02:43,247 [util.MetadataTableUtil] DEBUG: Adding log entry hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/a13aee79-c313-4298-b55a-8ec58ffb977c
> 2015-05-05 18:02:43,247 [log.DfsLogger] DEBUG: No enciphering, using raw output stream
> 2015-05-05 18:02:43,247 [log.DfsLogger] DEBUG: Got new write-ahead log: worker10:9997/hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/295244ee-c9e3-404f-a3d8-9569e41ba8e1
> 2015-05-05 18:02:43,250 [hdfs.DFSClient] WARN : DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /accumulo/wal/worker10+9997/a13aee79-c313-4298-b55a-8ec58ffb977c could only be replicated to 2 nodes instead of minReplication (=3).  There are 16 datanode(s) running and no node(s) are excluded in this operation.
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1550)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3067)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:722)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1476)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1407)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>         at com.sun.proxy.$Proxy15.addBlock(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>         at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy16.addBlock(Unknown Source)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1430)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1226)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
> {noformat}
> {noformat}
> 2015-05-05 18:02:43,352 [log.TabletServerLogger] INFO : Using next log hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/295244ee-c9e3-404f-a3d8-9569e41ba8e1
> 2015-05-05 18:02:43,352 [log.TabletServerLogger] DEBUG: Creating next WAL
> 2015-05-05 18:02:43,352 [tserver.TabletServer] INFO : Writing log marker for level ROOT hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/295244ee-c9e3-404f-a3d8-9569e41ba8e1
> 2015-05-05 18:02:43,352 [log.DfsLogger] DEBUG: Address is worker10:9997
> 2015-05-05 18:02:43,352 [log.DfsLogger] DEBUG: DfsLogger.open() begin
> 2015-05-05 18:02:43,353 [util.MetadataTableUtil] DEBUG: Adding log entry hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/295244ee-c9e3-404f-a3d8-9569e41ba8e1
> 2015-05-05 18:02:43,353 [fs.VolumeManagerImpl] DEBUG: creating hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/1810b018-26e3-4728-bbab-e3d901e3edd3 with CreateFlag set: [CREATE, SYNC_BLOCK]
> 2015-05-05 18:02:43,362 [log.DfsLogger] DEBUG: No enciphering, using raw output stream
> 2015-05-05 18:02:43,362 [log.DfsLogger] DEBUG: Got new write-ahead log: worker10:9997/hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/1810b018-26e3-4728-bbab-e3d901e3edd3
> 2015-05-05 18:02:43,366 [log.TabletServerLogger] DEBUG: Created next WAL hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/1810b018-26e3-4728-bbab-e3d901e3edd3
> 2015-05-05 18:02:43,366 [hdfs.DFSClient] WARN : DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /accumulo/wal/worker10+9997/295244ee-c9e3-404f-a3d8-9569e41ba8e1 could only be replicated to 2 nodes instead of minReplication (=3).  There are 16 datanode(s) running and no node(s) are excluded in this operation.
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1550)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3067)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:722)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1476)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1407)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>         at com.sun.proxy.$Proxy15.addBlock(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>         at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy16.addBlock(Unknown Source)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1430)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1226)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
