accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-3272) tserver breaks in a bad way when it can't write to hdfs trash
Date Wed, 29 Oct 2014 15:27:35 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188450#comment-14188450 ]

Josh Elser commented on ACCUMULO-3272:
--------------------------------------

bq. the tserver's view of the current file set includes a file that doesn't exist, indicating there is a consistency problem after this failure

Interesting. I missed that subtlety -- thanks for pointing it out. I found the old ticket in which I saw something similar to what you describe here (will link it here). My comments there say that I couldn't find anything we were doing wrong at the time, but your stack traces are different, so this would be worth digging into. I would think it should be simple enough to fail the compaction if we can't move files to the trash, and then the compaction would implicitly be retried (I thought we already did this). That would keep us from having to try to roll back the compaction, I believe.
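
Roughly what I have in mind, as a sketch only (hypothetical {{removeOldDatafile}} helper written against the plain Hadoop APIs, not the actual RootFiles/VolumeManager code): try the trash first, fall back to a direct delete, and only fail the compaction if neither works.

{code}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.Trash;

// Hypothetical sketch, not the existing Accumulo code.
class SafeRemove {
  // Try the trash first; if the trash is unusable (missing /user/accumulo home
  // directory, permission denied, ...), fall back to a direct delete. If neither
  // works, throw so the major compaction fails cleanly and is retried, rather
  // than leaving the tablet's file set referencing a file that no longer exists.
  static void removeOldDatafile(FileSystem fs, Configuration conf, Path path) throws IOException {
    boolean trashed = false;
    try {
      trashed = new Trash(fs, conf).moveToTrash(path);
    } catch (IOException e) {
      // fs.trash.interval > 0 but the trash directory can't be created
    }
    if (!trashed && !fs.delete(path, true))
      throw new IOException("Unable to remove old datafile " + path);
  }
}
{code}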

Have you tried using MiniDFS to reproduce this? If you can run MiniDFS with permissions enabled, it might be possible to "change" the user via the {{HADOOP_USER_NAME}} env variable and reproduce the failure.
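
For reference, a minimal sketch of that reproduction idea (assumed setup, untested; the class name, paths, and the use of {{UserGroupInformation.createRemoteUser}} in place of the {{HADOOP_USER_NAME}} env variable are mine): run MiniDFS with permissions on and a trash interval set, then attempt the trash move as a user that has no HDFS home directory.

{code}
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.Trash;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.security.UserGroupInformation;

public class TrashPermissionRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.setBoolean("dfs.permissions.enabled", true);
    conf.set("fs.trash.interval", "360"); // minutes, mirrors the HDP 2.1 default
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).build();
    try {
      // Create a file as the superuser; the "accumulo" user has no /user/accumulo home dir.
      FileSystem superFs = cluster.getFileSystem();
      Path file = new Path("/accumulo/tables/+r/root_tablet/F000003q.rf");
      superFs.create(file).close();

      UserGroupInformation accumulo = UserGroupInformation.createRemoteUser("accumulo");
      accumulo.doAs((PrivilegedExceptionAction<Void>) () -> {
        FileSystem fs = FileSystem.get(conf);
        // Expected to fail like the ticket: "Failed to move to trash" caused by an
        // AccessControlException while creating /user/accumulo/.Trash/Current/...
        new Trash(fs, conf).moveToTrash(file);
        return null;
      });
    } finally {
      cluster.shutdown();
    }
  }
}
{code}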



> tserver breaks in a bad way when it can't write to hdfs trash
> -------------------------------------------------------------
>
>                 Key: ACCUMULO-3272
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3272
>             Project: Accumulo
>          Issue Type: Bug
>    Affects Versions: 1.5.0, 1.6.0
>            Reporter: Adam Fuchs
>
> When installing and testing a vanilla install of HDP 2.1, the HDFS setting fs.trash.interval is set to 360 by default. Accumulo takes this to mean that it should move deleted files to the .Trash directory in the accumulo user's HDFS home directory. In this instance, the home directory did not exist, which caused a major compaction to fail. The failure happened in such a way that the internal state of the tserver became inconsistent, preventing automatic recovery after an admin solved the trash problem.
> The first stack trace below shows the initial problem. The second shows a secondary problem caused by the poor failure mode.
> {code}
> 2014-10-28 15:05:19,353 [tserver.Tablet] DEBUG: Major compaction plan: [hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/00000_00000.rf, hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf] propogate deletes : false
> 2014-10-28 15:05:19,353 [tserver.Tablet] DEBUG: MajC initiate lock 0.00 secs, wait 0.00 secs
> 2014-10-28 15:05:19,356 [tserver.Tablet] DEBUG: Starting MajC +r<< (NORMAL) [hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf, hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/00000_00000.rf] --> hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/A000003r.rf_tmp  []
> 2014-10-28 15:05:19,376 [tserver.TabletServer] DEBUG: Got getScans message from user: !SYSTEM
> 2014-10-28 15:05:19,432 [tserver.Compactor] DEBUG: Compaction +r<< 28 read | 18 written |    608 entries/sec |  0.046 secs
> 2014-10-28 15:05:19,482 [fs.TrashPolicyDefault] INFO : Namenode trash configuration: Deletion interval = 360 minutes, Emptier interval = 0 minutes.
> 2014-10-28 15:05:19,491 [fs.TrashPolicyDefault] WARN : Can't create trash directory: hdfs://n1.sqrrl-lab.net:8020/user/accumulo/.Trash/Current/accumulo_1.6_perf_test/tables/+r/root_tablet
> 2014-10-28 15:05:19,491 [tserver.Tablet] ERROR: MajC Failed, extent = +r<<
> 2014-10-28 15:05:19,491 [tserver.Tablet] ERROR: MajC Failed, message = Failed to move to trash: hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/delete+A000003r.rf+F000003q.rf
> java.io.IOException: Failed to move to trash: hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/delete+A000003r.rf+F000003q.rf
>         at org.apache.hadoop.fs.TrashPolicyDefault.moveToTrash(TrashPolicyDefault.java:160)
>         at org.apache.hadoop.fs.Trash.moveToTrash(Trash.java:109)
>         at org.apache.accumulo.server.fs.VolumeManagerImpl.moveToTrash(VolumeManagerImpl.java:364)
>         at org.apache.accumulo.tserver.RootFiles.finishReplacement(RootFiles.java:64)
>         at org.apache.accumulo.tserver.RootFiles.replaceFiles(RootFiles.java:75)
>         at org.apache.accumulo.tserver.Tablet$DatafileManager.bringMajorCompactionOnline(Tablet.java:1001)
>         at org.apache.accumulo.tserver.Tablet._majorCompact(Tablet.java:3239)
>         at org.apache.accumulo.tserver.Tablet.majorCompact(Tablet.java:3340)
>         at org.apache.accumulo.tserver.Tablet.access$4800(Tablet.java:172)
>         at org.apache.accumulo.tserver.Tablet$CompactionRunner.run(Tablet.java:2804)
>         at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:42)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:42)
>         at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: org.apache.hadoop.security.AccessControlException: Permission denied: user=accumulo, access=WRITE, inode="/user":hdfs:hdfs:drwxr-xr-x
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5509)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5491)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5465)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3608)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3578)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3552)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:760)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>         at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>         at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>         at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2555)
>         at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2524)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:827)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:823)
>         at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:823)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:816)
>         at org.apache.hadoop.fs.TrashPolicyDefault.moveToTrash(TrashPolicyDefault.java:136)
>         ... 15 more
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=accumulo, access=WRITE, inode="/user":hdfs:hdfs:drwxr-xr-x
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5509)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5491)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5465)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3608)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3578)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3552)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:760)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1410)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1363)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at com.sun.proxy.$Proxy20.mkdirs(Unknown Source)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
>         at com.sun.proxy.$Proxy20.mkdirs(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:500)
>         at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2553)
>         ... 22 more
> {code}
> {code}
> 2014-10-28 15:05:20,558 [tserver.FileManager] ERROR: Failed to open file hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf File does not exist: /accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1728)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1671)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1651)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1625)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:503)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:322)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> 2014-10-28 15:05:20,558 [problems.ProblemReports] DEBUG: Filing problem report +r FILE_READ hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf
> 2014-10-28 15:05:20,559 [tserver.TabletServer] WARN : exception while scanning tablet +r<<
> java.io.IOException: Failed to open hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf
>         at org.apache.accumulo.tserver.FileManager.reserveReaders(FileManager.java:334)
>         at org.apache.accumulo.tserver.FileManager.access$500(FileManager.java:59)
>         at org.apache.accumulo.tserver.FileManager$ScanFileManager.openFiles(FileManager.java:491)
>         at org.apache.accumulo.tserver.FileManager$ScanFileManager.openFileRefs(FileManager.java:479)
>         at org.apache.accumulo.tserver.FileManager$ScanFileManager.openFiles(FileManager.java:499)
>         at org.apache.accumulo.tserver.Tablet$ScanDataSource.createIterator(Tablet.java:1980)
>         at org.apache.accumulo.tserver.Tablet$ScanDataSource.iterator(Tablet.java:1942)
>         at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.seek(SourceSwitchingIterator.java:165)
>         at org.apache.accumulo.tserver.Tablet.nextBatch(Tablet.java:1659)
>         at org.apache.accumulo.tserver.Tablet.access$3200(Tablet.java:172)
>         at org.apache.accumulo.tserver.Tablet$Scanner.read(Tablet.java:1799)
>         at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler$NextBatchTask.run(TabletServer.java:1041)
>         at org.apache.accumulo.tserver.TabletServerResourceManager.executeReadAhead(TabletServerResourceManager.java:642)
>         at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.continueScan(TabletServer.java:1278)
>         at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.startScan(TabletServer.java:1247)
>         at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.accumulo.trace.instrument.thrift.RpcServerInvocationHandler.invoke(RpcServerInvocationHandler.java:46)
>         at org.apache.accumulo.server.util.RpcWrapper$1.invoke(RpcWrapper.java:44)
>         at com.sun.proxy.$Proxy21.startScan(Unknown Source)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$startScan.getResult(TabletClientService.java:2179)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$startScan.getResult(TabletClientService.java:2163)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>         at org.apache.accumulo.server.util.TServerUtils$TimedProcessor.process(TServerUtils.java:168)
>         at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:516)
>         at org.apache.accumulo.server.util.CustomNonBlockingServer$1.run(CustomNonBlockingServer.java:77)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
>         at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.FileNotFoundException: File does not exist: /accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1728)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1671)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1651)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1625)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:503)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:322)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>         at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>         at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>         at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1144)
>         at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1132)
>         at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1122)
>         at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:264)
>         at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:231)
>         at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:224)
>         at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1295)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:300)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:296)
>         at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:296)
>         at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:764)
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBCFile(CachableBlockFile.java:261)
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.access$100(CachableBlockFile.java:144)
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader$MetaBlockLoader.get(CachableBlockFile.java:216)
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBlock(CachableBlockFile.java:318)
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:372)
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:144)
>         at org.apache.accumulo.core.file.rfile.RFile$Reader.<init>(RFile.java:825)
>         at org.apache.accumulo.core.file.rfile.RFileOperations.openReader(RFileOperations.java:79)
>         at org.apache.accumulo.core.file.DispatchingFileFactory.openReader(FileOperations.java:119)
>         at org.apache.accumulo.tserver.FileManager.reserveReaders(FileManager.java:315)
>         ... 32 more
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1728)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1671)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1651)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1625)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:503)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:322)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1410)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1363)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at com.sun.proxy.$Proxy20.getBlockLocations(Unknown Source)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
>         at com.sun.proxy.$Proxy20.getBlockLocations(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:219)
>         at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1142)
>         ... 53 more
> {code}
> I'm not sure whether this could cause inconsistencies that are visible to the end user, but it seems possible. The goal of this ticket is to improve the failure mode rather than to fully handle the case where the trash policy isn't supported by the underlying infrastructure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
