accumulo-notifications mailing list archives

From "Adam Fuchs (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-3272) tserver breaks in a bad way when it can't write to hdfs trash
Date Wed, 29 Oct 2014 14:16:35 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188372#comment-14188372 ]

Adam Fuchs commented on ACCUMULO-3272:
--------------------------------------

In this case I believe it's more than unreferenced cruft hanging around. The second stack trace indicates that the tserver's view of the current file set includes a file that no longer exists, i.e. there is a consistency problem after this failure. The only way to recover is to manually kill the tserver process. I would like to see either a roll-back after the failure or a roll-forward to a consistent state. Roll-back would put the tserver and tablet metadata in a state as though the compaction never happened; roll-forward would put them in a state as though the compaction completed successfully. Either would be acceptable behavior in this case, although roll-forward would necessarily violate the trash collection policy. A roll-forward fallback is sketched below.
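
A minimal sketch of the roll-forward option, assuming only the stock Hadoop client API; the class and helper names are hypothetical and this is not Accumulo's actual VolumeManager code. If the move to trash fails, the file is deleted outright so the metadata and the file system stay consistent:

{code}
// Hypothetical roll-forward helper (not Accumulo API): when the HDFS
// trash is unusable, delete the file directly instead of failing the
// compaction and leaving the tserver referencing a missing file.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.Trash;

public class TrashOrDelete {
  /** Returns true once the file is gone, whether trashed or deleted outright. */
  public static boolean trashOrDelete(FileSystem fs, Configuration conf, Path path)
      throws IOException {
    try {
      if (new Trash(fs, conf).moveToTrash(path)) {
        return true; // trash policy honored
      }
    } catch (IOException e) {
      // Trash directory could not be created (e.g. /user/accumulo is
      // missing or unwritable); fall through and roll forward.
    }
    // Violates the trash collection policy, but leaves the tserver in a
    // state as though the compaction completed successfully.
    return fs.delete(path, false);
  }
}
{code}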

> tserver breaks in a bad way when it can't write to hdfs trash
> -------------------------------------------------------------
>
>                 Key: ACCUMULO-3272
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3272
>             Project: Accumulo
>          Issue Type: Bug
>    Affects Versions: 1.5.0, 1.6.0
>            Reporter: Adam Fuchs
>
> On a vanilla install of HDP 2.1, the HDFS setting fs.trash.interval defaults to 360 (minutes). Accumulo takes this to mean that it should move deleted files to the .Trash directory under Accumulo's HDFS home directory. In this instance the home directory did not exist, which caused a major compaction to fail. The failure happened in such a way that the internal state of the tserver became inconsistent, preventing automatic recovery after an admin solved the trash problem. The short probe below illustrates both ingredients of the failure.
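> A minimal probe, assuming only the stock Hadoop client API (class and variable names here are illustrative, not from the ticket): a non-zero fs.trash.interval turns the trash path on, while a missing or unwritable HDFS home directory makes the actual move fail.
> {code}
> // Illustrative probe: report whether the HDFS trash is enabled and
> // where this client's trash directory would live.
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Trash;
>
> public class TrashProbe {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = new Configuration();
>     FileSystem fs = FileSystem.get(conf);
>     // True whenever fs.trash.interval > 0 (360 on a stock HDP 2.1),
>     // regardless of whether the trash directory can be created.
>     System.out.println("trash enabled: " + new Trash(fs, conf).isEnabled());
>     // The trash root lives under the user's home directory; if
>     // /user/accumulo is missing or unwritable, moveToTrash() throws.
>     System.out.println("home directory: " + fs.getHomeDirectory());
>   }
> }
> {code}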
> The first stack trace below shows the initial problem; the second shows a secondary problem caused by the poor failure mode.
> {code}
> 2014-10-28 15:05:19,353 [tserver.Tablet] DEBUG: Major compaction plan: [hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/00000_00000.rf, hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf] propogate deletes : false
> 2014-10-28 15:05:19,353 [tserver.Tablet] DEBUG: MajC initiate lock 0.00 secs, wait 0.00 secs
> 2014-10-28 15:05:19,356 [tserver.Tablet] DEBUG: Starting MajC +r<< (NORMAL) [hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf, hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/00000_00000.rf] --> hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/A000003r.rf_tmp  []
> 2014-10-28 15:05:19,376 [tserver.TabletServer] DEBUG: Got getScans message from user: !SYSTEM
> 2014-10-28 15:05:19,432 [tserver.Compactor] DEBUG: Compaction +r<< 28 read | 18 written |    608 entries/sec |  0.046 secs
> 2014-10-28 15:05:19,482 [fs.TrashPolicyDefault] INFO : Namenode trash configuration: Deletion interval = 360 minutes, Emptier interval = 0 minutes.
> 2014-10-28 15:05:19,491 [fs.TrashPolicyDefault] WARN : Can't create trash directory: hdfs://n1.sqrrl-lab.net:8020/user/accumulo/.Trash/Current/accumulo_1.6_perf_test/tables/+r/root_tablet
> 2014-10-28 15:05:19,491 [tserver.Tablet] ERROR: MajC Failed, extent = +r<<
> 2014-10-28 15:05:19,491 [tserver.Tablet] ERROR: MajC Failed, message = Failed to move to trash: hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/delete+A000003r.rf+F000003q.rf
> java.io.IOException: Failed to move to trash: hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/delete+A000003r.rf+F000003q.rf
>         at org.apache.hadoop.fs.TrashPolicyDefault.moveToTrash(TrashPolicyDefault.java:160)
>         at org.apache.hadoop.fs.Trash.moveToTrash(Trash.java:109)
>         at org.apache.accumulo.server.fs.VolumeManagerImpl.moveToTrash(VolumeManagerImpl.java:364)
>         at org.apache.accumulo.tserver.RootFiles.finishReplacement(RootFiles.java:64)
>         at org.apache.accumulo.tserver.RootFiles.replaceFiles(RootFiles.java:75)
>         at org.apache.accumulo.tserver.Tablet$DatafileManager.bringMajorCompactionOnline(Tablet.java:1001)
>         at org.apache.accumulo.tserver.Tablet._majorCompact(Tablet.java:3239)
>         at org.apache.accumulo.tserver.Tablet.majorCompact(Tablet.java:3340)
>         at org.apache.accumulo.tserver.Tablet.access$4800(Tablet.java:172)
>         at org.apache.accumulo.tserver.Tablet$CompactionRunner.run(Tablet.java:2804)
>         at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:42)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:42)
>         at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: org.apache.hadoop.security.AccessControlException: Permission denied: user=accumulo, access=WRITE, inode="/user":hdfs:hdfs:drwxr-xr-x
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5509)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5491)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5465)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3608)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3578)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3552)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:760)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>         at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>         at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>         at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2555)
>         at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2524)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:827)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:823)
>         at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:823)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:816)
>         at org.apache.hadoop.fs.TrashPolicyDefault.moveToTrash(TrashPolicyDefault.java:136)
>         ... 15 more
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=accumulo, access=WRITE, inode="/user":hdfs:hdfs:drwxr-xr-x
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5509)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5491)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5465)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3608)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3578)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3552)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:760)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1410)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1363)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at com.sun.proxy.$Proxy20.mkdirs(Unknown Source)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
>         at com.sun.proxy.$Proxy20.mkdirs(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:500)
>         at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2553)
>         ... 22 more
> {code}
> {code}
> 2014-10-28 15:05:20,558 [tserver.FileManager] ERROR: Failed to open file hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf File does not exist: /accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1728)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1671)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1651)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1625)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:503)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:322)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> 2014-10-28 15:05:20,558 [problems.ProblemReports] DEBUG: Filing problem report +r FILE_READ hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf
> 2014-10-28 15:05:20,559 [tserver.TabletServer] WARN : exception while scanning tablet +r<<
> java.io.IOException: Failed to open hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf
>         at org.apache.accumulo.tserver.FileManager.reserveReaders(FileManager.java:334)
>         at org.apache.accumulo.tserver.FileManager.access$500(FileManager.java:59)
>         at org.apache.accumulo.tserver.FileManager$ScanFileManager.openFiles(FileManager.java:491)
>         at org.apache.accumulo.tserver.FileManager$ScanFileManager.openFileRefs(FileManager.java:479)
>         at org.apache.accumulo.tserver.FileManager$ScanFileManager.openFiles(FileManager.java:499)
>         at org.apache.accumulo.tserver.Tablet$ScanDataSource.createIterator(Tablet.java:1980)
>         at org.apache.accumulo.tserver.Tablet$ScanDataSource.iterator(Tablet.java:1942)
>         at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.seek(SourceSwitchingIterator.java:165)
>         at org.apache.accumulo.tserver.Tablet.nextBatch(Tablet.java:1659)
>         at org.apache.accumulo.tserver.Tablet.access$3200(Tablet.java:172)
>         at org.apache.accumulo.tserver.Tablet$Scanner.read(Tablet.java:1799)
>         at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler$NextBatchTask.run(TabletServer.java:1041)
>         at org.apache.accumulo.tserver.TabletServerResourceManager.executeReadAhead(TabletServerResourceManager.java:642)
>         at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.continueScan(TabletServer.java:1278)
>         at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.startScan(TabletServer.java:1247)
>         at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.accumulo.trace.instrument.thrift.RpcServerInvocationHandler.invoke(RpcServerInvocationHandler.java:46)
>         at org.apache.accumulo.server.util.RpcWrapper$1.invoke(RpcWrapper.java:44)
>         at com.sun.proxy.$Proxy21.startScan(Unknown Source)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$startScan.getResult(TabletClientService.java:2179)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$startScan.getResult(TabletClientService.java:2163)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>         at org.apache.accumulo.server.util.TServerUtils$TimedProcessor.process(TServerUtils.java:168)
>         at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:516)
>         at org.apache.accumulo.server.util.CustomNonBlockingServer$1.run(CustomNonBlockingServer.java:77)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
>         at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.FileNotFoundException: File does not exist: /accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1728)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1671)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1651)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1625)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:503)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:322)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>         at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>         at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>         at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1144)
>         at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1132)
>         at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1122)
>         at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:264)
>         at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:231)
>         at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:224)
>         at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1295)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:300)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:296)
>         at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:296)
>         at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:764)
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBCFile(CachableBlockFile.java:261)
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.access$100(CachableBlockFile.java:144)
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader$MetaBlockLoader.get(CachableBlockFile.java:216)
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBlock(CachableBlockFile.java:318)
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:372)
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:144)
>         at org.apache.accumulo.core.file.rfile.RFile$Reader.<init>(RFile.java:825)
>         at org.apache.accumulo.core.file.rfile.RFileOperations.openReader(RFileOperations.java:79)
>         at org.apache.accumulo.core.file.DispatchingFileFactory.openReader(FileOperations.java:119)
>         at org.apache.accumulo.tserver.FileManager.reserveReaders(FileManager.java:315)
>         ... 32 more
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1728)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1671)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1651)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1625)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:503)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:322)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1410)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1363)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at com.sun.proxy.$Proxy20.getBlockLocations(Unknown Source)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
>         at com.sun.proxy.$Proxy20.getBlockLocations(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:219)
>         at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1142)
>         ... 53 more
> {code}
> I'm not sure whether this could cause inconsistencies visible to the end user, but it seems possible. The goal of this ticket is to improve the failure mode rather than to fully handle the case where the trash policy isn't supported by the underlying infrastructure.



