hbase-issues mailing list archives

From "Jingcheng Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12201) Close the writers in the MOB sweep tool
Date Thu, 09 Oct 2014 01:54:34 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14164573#comment-14164573 ]

Jingcheng Du commented on HBASE-12201:
--------------------------------------

Thanks Jon, [~jmhsieh].
bq. changes the compaction working dir so that multiple jobs or subsequent jobs don't use the same path.
Previously the working dir was mobCompactionDirOfJobName/working/... (the job name is mapperclass-reducerclass-table-cf). When we removed the working dir, only the working subdirectory was deleted and mobCompactionDirOfJobName was left behind. That doesn't affect the logic, but it's better to remove it, as sketched below.
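
Roughly, the cleanup change looks like this (a minimal sketch; the class and variable names are illustrative, not the actual SweepJob code):

{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SweepJobCleanup {
  static void cleanup(FileSystem fs, Path jobDir) throws IOException {
    // Before: only the "working" subdirectory was removed, e.g.
    // fs.delete(new Path(jobDir, "working"), true), which left
    // mobCompactionDirOfJobName behind.
    // After: delete the whole per-job directory recursively so that
    // subsequent jobs with the same name start from a clean path.
    fs.delete(jobDir, true);
  }
}
{code}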

bq. Can you give a quick explanation of why or how we would potentially open a file for append? (I believe there is an obscure race possible there: two get to !exists and then both try to create, but let's ignore that for now.)
We have a lock that surrounds the logic, right? Only one sweeper runs at a time for the same table and cf, so there's no race condition here. Actually, we don't need to check for existence here at all (1. the whole working dir is deleted and re-created at the beginning of the sweeper; 2. the file name is a UUID), so we can create the file directly, as in the sketch below. Will provide a new patch (V3) to fix this.
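
For illustration, a minimal sketch of creating the file directly (the class and method names are hypothetical, not the actual patch):

{code:java}
import java.io.IOException;
import java.util.UUID;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SweepFileCreator {
  // The working dir is wiped and re-created when the sweeper starts, and
  // the file name is a fresh UUID, so no exists()/append fallback is needed.
  static FSDataOutputStream createWorkingFile(FileSystem fs, Path workingDir)
      throws IOException {
    Path file = new Path(workingDir, UUID.randomUUID().toString().replaceAll("-", ""));
    return fs.create(file, true); // create directly with overwrite=true
  }
}
{code}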

> Close the writers in the MOB sweep tool
> ---------------------------------------
>
>                 Key: HBASE-12201
>                 URL: https://issues.apache.org/jira/browse/HBASE-12201
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: hbase-11339
>            Reporter: Jingcheng Du
>            Assignee: Jingcheng Du
>            Priority: Minor
>             Fix For: hbase-11339
>
>         Attachments: HBASE-12201-V2.diff, HBASE-12201.diff
>
>
> When running the sweep tool, we encountered the following exception:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /hbase/mobdir/.tmp/mobcompaction/SweepJob-SweepMapper-SweepReducer-testSweepToolExpiredNoMinVersion-data/working/names/all (inode 46500): File does not exist. Holder DFSClient_NONMAPREDUCE_-1863270027_1 does not have any open files.
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3319)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:3407)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:3377)
>   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:673)
>   at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.complete(AuthorizationProviderProxyClientProtocol.java:219)
>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:520)
>   at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1411)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1364)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at com.sun.proxy.$Proxy15.complete(Unknown Source)
>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:435)
>   at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>   at com.sun.proxy.$Proxy16.complete(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:2180)
>   at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2164)
>   at org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:908)
>   at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:925)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:861)
>   at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2687)
>   at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2704)
>   at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
> This is because several writers opened by fs.create(path, true) are not closed properly.
> Meanwhile, in the current implementation, we save the temp files under sweepJobDir/working/..., and when we remove the directory of the sweep job, only the working subdirectory is deleted. We should remove the whole sweepJobDir instead.
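
For context, the fix amounts to closing each writer deterministically rather than relying on JVM shutdown; a minimal sketch of the pattern (names are illustrative, not the actual SweepJob code):

{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriterCloseExample {
  static void writeAndClose(FileSystem fs, Path path, byte[] data)
      throws IOException {
    FSDataOutputStream out = fs.create(path, true);
    try {
      out.write(data);
    } finally {
      // Close explicitly so the file is completed while the lease is still
      // valid. Leaving it to the FileSystem cache's shutdown hook can fail
      // with LeaseExpiredException, as in the stack trace above.
      out.close();
    }
  }
}
{code}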



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
