accumulo-notifications mailing list archives

From "Sean Busbey (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-3272) tserver breaks in a bad way when it can't write to hdfs trash
Date Wed, 29 Oct 2014 14:45:35 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188406#comment-14188406 ]

Sean Busbey commented on ACCUMULO-3272:
---------------------------------------

For things other than the root tablet, where the deletion happens in the GC, we do roll forward. We should be consistent.

It sounds like the VolumeManager implementation just isn't conservative enough in handling the Trash object from Hadoop. It already behaves properly when HDFS trash is disabled; in your case the trash is enabled but misconfigured.

We could handle all of this consistently if we:

* On init, create the trash directory (TrashPolicy.currentTrashDir for Hadoop 2.x, "~/.Trash" otherwise) on each volume.
  * WARN if it fails, but continue.
* Since there's nothing protecting the trash directory, on server start up get the current trash directory on each volume and attempt to write to it. If that fails, WARN and then keep internal state that treats the trash as disabled on that volume (thus avoiding exception handling in, e.g., the code paths that handle the root tablet).
* Given the way we use VolumeManager.moveToTrash, javadoc that it returns false on an underlying IOException and remove the throws clause. A sketch of the last two points follows this list.
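
A minimal sketch of that behavior, assuming a hypothetical SafeTrash helper: the class name, the probe() method, the trashUsable map, and the /tmp probe path are all invented for illustration; only the Hadoop Trash, FileSystem, Configuration, and Path calls are real APIs.

{code}
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.Trash;

public class SafeTrash {
  // Per-volume flag: once a probe or move fails, treat trash as disabled there.
  private final Map<FileSystem,Boolean> trashUsable = new ConcurrentHashMap<FileSystem,Boolean>();
  private final Configuration conf;

  public SafeTrash(Configuration conf) {
    this.conf = conf;
  }

  /** On server start, probe the volume by round-tripping a file through trash. */
  public void probe(FileSystem volume) {
    Path probeFile = new Path("/tmp/accumulo-trash-probe-" + System.nanoTime()); // hypothetical probe location
    try {
      volume.create(probeFile).close();
      Trash trash = new Trash(volume, conf);
      boolean usable = trash.isEnabled() && trash.moveToTrash(probeFile);
      if (!usable) {
        volume.delete(probeFile, false); // clean up when the move didn't happen
      }
      trashUsable.put(volume, usable);
    } catch (IOException e) {
      // WARN in real code, then treat trash as disabled on this volume.
      trashUsable.put(volume, Boolean.FALSE);
    }
  }

  /**
   * Returns false, rather than throwing, when the path could not be moved to
   * trash (including when trash is disabled or misconfigured on the volume).
   */
  public boolean moveToTrash(FileSystem volume, Path path) {
    Boolean usable = trashUsable.get(volume);
    if (usable == null || !usable) {
      return false;
    }
    try {
      return new Trash(volume, conf).moveToTrash(path);
    } catch (IOException e) {
      trashUsable.put(volume, Boolean.FALSE); // demote: trash broke at runtime
      return false;
    }
  }
}
{code}

Callers such as the root tablet code paths could then treat a false return as "delete directly" without any exception handling.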

> tserver breaks in a bad way when it can't write to hdfs trash
> -------------------------------------------------------------
>
>                 Key: ACCUMULO-3272
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3272
>             Project: Accumulo
>          Issue Type: Bug
>    Affects Versions: 1.5.0, 1.6.0
>            Reporter: Adam Fuchs
>
> When installing and testing a vanilla HDP 2.1 install, the HDFS setting fs.trash.interval defaults to 360. Accumulo takes this to mean that it should move deleted files to the .Trash directory under Accumulo's HDFS home directory. In this instance, the home directory did not exist, which caused a major compaction to fail. The failure happened in such a way that the internal state of the tserver became inconsistent, preventing automatic recovery after an admin solved the trash problem.
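A minimal sketch of the trigger, with a hypothetical class name and file path (fs.trash.interval and the Hadoop Trash API are as described above):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.Trash;

public class TrashRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.setLong("fs.trash.interval", 360); // the HDP 2.1 default described above
    FileSystem fs = FileSystem.get(conf);
    Trash trash = new Trash(fs, conf);
    // The default policy tries to mkdir <home>/.Trash/Current/<path>; when the
    // user's home directory is missing or unwritable, moveToTrash throws an
    // IOException instead of returning false.
    trash.moveToTrash(new Path("/some/file/to/delete")); // hypothetical path
  }
}
{code}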
> The first stack trace below shows the initial problem. The second shows a secondary problem caused by the poor failure mode.
> {code}
> 2014-10-28 15:05:19,353 [tserver.Tablet] DEBUG: Major compaction plan: [hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/00000_00000.rf,
hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf] propogate
deletes : false
> 2014-10-28 15:05:19,353 [tserver.Tablet] DEBUG: MajC initiate lock 0.00 secs, wait 0.00
secs
> 2014-10-28 15:05:19,356 [tserver.Tablet] DEBUG: Starting MajC +r<< (NORMAL) [hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf,
hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/00000_00000.rf]
--> hdfs://n1.sqrrl-lab.
> net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/A000003r.rf_tmp  []
> 2014-10-28 15:05:19,376 [tserver.TabletServer] DEBUG: Got getScans message from user:
!SYSTEM
> 2014-10-28 15:05:19,432 [tserver.Compactor] DEBUG: Compaction +r<< 28 read | 18
written |    608 entries/sec |  0.046 secs
> 2014-10-28 15:05:19,482 [fs.TrashPolicyDefault] INFO : Namenode trash configuration:
Deletion interval = 360 minutes, Emptier interval = 0 minutes.
> 2014-10-28 15:05:19,491 [fs.TrashPolicyDefault] WARN : Can't create trash directory:
hdfs://n1.sqrrl-lab.net:8020/user/accumulo/.Trash/Current/accumulo_1.6_perf_test/tables/+r/root_tablet
> 2014-10-28 15:05:19,491 [tserver.Tablet] ERROR: MajC Failed, extent = +r<<
> 2014-10-28 15:05:19,491 [tserver.Tablet] ERROR: MajC Failed, message = Failed to move
to trash: hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/delete+A000003r.rf+F000003q.rf
> java.io.IOException: Failed to move to trash: hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/delete+A000003r.rf+F000003q.rf
>         at org.apache.hadoop.fs.TrashPolicyDefault.moveToTrash(TrashPolicyDefault.java:160)
>         at org.apache.hadoop.fs.Trash.moveToTrash(Trash.java:109)
>         at org.apache.accumulo.server.fs.VolumeManagerImpl.moveToTrash(VolumeManagerImpl.java:364)
>         at org.apache.accumulo.tserver.RootFiles.finishReplacement(RootFiles.java:64)
>         at org.apache.accumulo.tserver.RootFiles.replaceFiles(RootFiles.java:75)
>         at org.apache.accumulo.tserver.Tablet$DatafileManager.bringMajorCompactionOnline(Tablet.java:1001)
>         at org.apache.accumulo.tserver.Tablet._majorCompact(Tablet.java:3239)
>         at org.apache.accumulo.tserver.Tablet.majorCompact(Tablet.java:3340)
>         at org.apache.accumulo.tserver.Tablet.access$4800(Tablet.java:172)
>         at org.apache.accumulo.tserver.Tablet$CompactionRunner.run(Tablet.java:2804)
>         at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:42)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:42)
>         at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: org.apache.hadoop.security.AccessControlException: Permission denied: user=accumulo,
access=WRITE, inode="/user":hdfs:hdfs:drwxr-xr-x
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5509)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5491)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5465)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3608)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3578)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3552)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:760)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>         at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>         at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>         at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2555)
>         at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2524)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:827)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:823)
>         at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:823)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:816)
>         at org.apache.hadoop.fs.TrashPolicyDefault.moveToTrash(TrashPolicyDefault.java:136)
>         ... 15 more
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
Permission denied: user=accumulo, access=WRITE, inode="/user":hdfs:hdfs:drwxr-xr-x
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5509)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5491)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5465)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3608)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3578)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3552)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:760)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1410)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1363)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at com.sun.proxy.$Proxy20.mkdirs(Unknown Source)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
>         at com.sun.proxy.$Proxy20.mkdirs(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:500)
>         at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2553)
>         ... 22 more
> {code}
> {code}
> 2014-10-28 15:05:20,558 [tserver.FileManager] ERROR: Failed to open file hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf
File does not exist: /accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1728)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1671)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1651)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1625)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:503)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:322)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> 2014-10-28 15:05:20,558 [problems.ProblemReports] DEBUG: Filing problem report +r FILE_READ
hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf
> 2014-10-28 15:05:20,559 [tserver.TabletServer] WARN : exception while scanning tablet
+r<<
> java.io.IOException: Failed to open hdfs://n1.sqrrl-lab.net:8020/accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf
>         at org.apache.accumulo.tserver.FileManager.reserveReaders(FileManager.java:334)
>         at org.apache.accumulo.tserver.FileManager.access$500(FileManager.java:59)
>         at org.apache.accumulo.tserver.FileManager$ScanFileManager.openFiles(FileManager.java:491)
>         at org.apache.accumulo.tserver.FileManager$ScanFileManager.openFileRefs(FileManager.java:479)
>         at org.apache.accumulo.tserver.FileManager$ScanFileManager.openFiles(FileManager.java:499)
>         at org.apache.accumulo.tserver.Tablet$ScanDataSource.createIterator(Tablet.java:1980)
>         at org.apache.accumulo.tserver.Tablet$ScanDataSource.iterator(Tablet.java:1942)
>         at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.seek(SourceSwitchingIterator.java:165)
>         at org.apache.accumulo.tserver.Tablet.nextBatch(Tablet.java:1659)
>         at org.apache.accumulo.tserver.Tablet.access$3200(Tablet.java:172)
>         at org.apache.accumulo.tserver.Tablet$Scanner.read(Tablet.java:1799)
>         at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler$NextBatchTask.run(TabletServer.java:1041)
>         at org.apache.accumulo.tserver.TabletServerResourceManager.executeReadAhead(TabletServerResourceManager.java:642)
>         at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.continueScan(TabletServer.java:1278)
>         at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.startScan(TabletServer.java:1247)
>         at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.accumulo.trace.instrument.thrift.RpcServerInvocationHandler.invoke(RpcServerInvocationHandler.java:46)
>         at org.apache.accumulo.server.util.RpcWrapper$1.invoke(RpcWrapper.java:44)
>         at com.sun.proxy.$Proxy21.startScan(Unknown Source)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$startScan.getResult(TabletClientService.java:2179)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$startScan.getResult(TabletClientService.java:2163)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>         at org.apache.accumulo.server.util.TServerUtils$TimedProcessor.process(TServerUtils.java:168)
>         at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:516)
>         at org.apache.accumulo.server.util.CustomNonBlockingServer$1.run(CustomNonBlockingServer.java:77)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
>         at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.FileNotFoundException: File does not exist: /accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1728)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1671)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1651)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1625)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:503)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:322)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>         at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>         at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>         at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1144)
>         at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1132)
>         at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1122)
>         at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:264)
>         at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:231)
>         at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:224)
>         at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1295)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:300)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:296)
>         at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:296)
>         at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:764)
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBCFile(CachableBlockFile.java:261)
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.access$100(CachableBlockFile.java:144)
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader$MetaBlockLoader.get(CachableBlockFile.java:216)
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBlock(CachableBlockFile.java:318)
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:372)
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:144)
>         at org.apache.accumulo.core.file.rfile.RFile$Reader.<init>(RFile.java:825)
>         at org.apache.accumulo.core.file.rfile.RFileOperations.openReader(RFileOperations.java:79)
>         at org.apache.accumulo.core.file.DispatchingFileFactory.openReader(FileOperations.java:119)
>         at org.apache.accumulo.tserver.FileManager.reserveReaders(FileManager.java:315)
>         ... 32 more
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File
does not exist: /accumulo_1.6_perf_test/tables/+r/root_tablet/F000003q.rf
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1728)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1671)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1651)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1625)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:503)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:322)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1410)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1363)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at com.sun.proxy.$Proxy20.getBlockLocations(Unknown Source)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
>         at com.sun.proxy.$Proxy20.getBlockLocations(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:219)
>         at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1142)
>         ... 53 more
> {code}
> I'm not sure whether this could cause inconsistencies that are visible to the end user, but it seems possible. The goal of this ticket is to improve the failure mode rather than to fully handle the case where the trash policy isn't supported by the underlying infrastructure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
