X-Original-To: apmail-hbase-issues-archive@www.apache.org
Delivered-To: apmail-hbase-issues-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 29C409E63 for ; Tue, 17 Apr 2012 18:55:46 +0000 (UTC)
Received: (qmail 85948 invoked by uid 500); 17 Apr 2012 18:55:46 -0000
Delivered-To: apmail-hbase-issues-archive@hbase.apache.org
Received: (qmail 85912 invoked by uid 500); 17 Apr 2012 18:55:46 -0000
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Delivered-To: mailing list issues@hbase.apache.org
Received: (qmail 85904 invoked by uid 99); 17 Apr 2012 18:55:45 -0000
Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 17 Apr 2012 18:55:45 +0000
X-ASF-Spam-Status: No, hits=-2000.0 required=5.0 tests=ALL_TRUSTED,T_RP_MATCHES_RCVD
X-Spam-Check-By: apache.org
Received: from [140.211.11.116] (HELO hel.zones.apache.org) (140.211.11.116) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 17 Apr 2012 18:55:39 +0000
Received: from hel.zones.apache.org (hel.zones.apache.org [140.211.11.116]) by hel.zones.apache.org (Postfix) with ESMTP id 4488B39E5BA for ; Tue, 17 Apr 2012 18:55:18 +0000 (UTC)
Date: Tue, 17 Apr 2012 18:55:18 +0000 (UTC)
From: "Hadoop QA (Commented) (JIRA)"
To: issues@hbase.apache.org
Message-ID: <2096159086.34146.1334688918301.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1054145780.42025.1331259477068.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255824#comment-13255824 ]

Hadoop QA commented on HBASE-5545:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12522989/HBASE-5545.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1553//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1553//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1553//console

This message is automatically generated.

> region can't be opened for a long time. Because the creating File failed.
> -------------------------------------------------------------------------
>
>                 Key: HBASE-5545
>                 URL: https://issues.apache.org/jira/browse/HBASE-5545
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.90.6
>            Reporter: gaojinchao
>            Assignee: gaojinchao
>             Fix For: 0.90.7, 0.92.2, 0.94.0
>
>         Attachments: HBASE-5545.patch
>
>
> Scenario:
> ------------
> 1. The file is created.
> 2. While data is being written, all datanodes might crash, so writing the data fails.
> 3. Even if close is called in the finally block, close also fails and throws an exception because writing the data failed.
> 4. After this, if the RS tries to create the same file again, an AlreadyBeingCreatedException is thrown.
>
> Suggestion to handle this scenario:
> ---------------------------
> 1. Check for the existence of the file; if it exists, delete the file and create a new file.
> Here the delete call does not check whether the file is open or closed.
>
> Overwrite option:
> ----------------
> 1. The overwrite option applies only if you are trying to overwrite a closed file.
> 2. If the file is not closed, then even with the overwrite option the same AlreadyBeingCreatedException is thrown.
> This is the expected behaviour, to avoid multiple clients writing to the same file.
>
> Region server logs:
> org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo for DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 on client 158.1.132.19 because current leaseholder is trying to recreate file.
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658)
>     at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131)
>     at org.apache.hadoop.ipc.Client.call(Client.java:961)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245)
>     at $Proxy6.create(Unknown Source)
>     at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at $Proxy6.create(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:3643)
>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518)
>     at org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424)
>     at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658)
>     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330)
>     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116)
>     at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
> [2012-03-07 20:51:45,858] [WARN ] [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-23] [com.huawei.isap.ump.ha.client.RPCRetryAndSwitchInvoker 131] Retrying the method call: public abstract void org.apache.hadoop.hdfs.protocol.ClientProtocol.create(java.lang.String,org.apache.hadoop.fs.permission.FsPermission,java.lang.String,boolean,boolean,short,long) throws java.io.IOException with arguments of length: 7.
The exisiting ActiveServerConnection is:
> ActiveServerConnectionInfo:
> Metadata:158-1-131-48/158.1.132.19:9000
> Version:145720623220907
> [2012-03-07 20:51:45,872] [DEBUG] [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-20] [org.apache.hadoop.hbase.zookeeper.ZKAssign 849] regionserver:20020-0x135ec32b39e0002-0x135ec32b39e0002 Successfully transitioned node 91bf3e6f8adb2e4b335f061036353126 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
> [2012-03-07 20:51:45,873] [DEBUG] [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-20] [org.apache.hadoop.hbase.regionserver.HRegion 2649] Opening region: REGION => {NAME => 'test1,00088613810,1331112369360.91bf3e6f8adb2e4b335f061036353126.', STARTKEY => '00088613810', ENDKEY => '00088613815', ENCODED => 91bf3e6f8adb2e4b335f061036353126, TABLE => {{NAME => 'test1', FAMILIES => [{NAME => 'value', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'GZ', TTL => '86400', BLOCKSIZE => '655360', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
> [2012-03-07 20:51:45,873] [DEBUG] [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-20] [org.apache.hadoop.hbase.regionserver.HRegion 316] Instantiated test1,00088613810,1331112369360.91bf3e6f8adb2e4b335f061036353126.
> [2012-03-07 20:51:45,874] [ERROR] [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-20] [

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
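
The suggestion in the quoted issue (check whether the file exists, delete it, then create it again) can be sketched with a toy in-memory model. Everything named here is an assumption made for illustration: `ToyFs`, `createFresh`, the toy `AlreadyBeingCreatedException`, and the path are hypothetical stand-ins, not the HDFS or HBase API and not the contents of HBASE-5545.patch. The model encodes only the two behaviours the report describes: create fails on a file that was never closed, even with overwrite, and delete does not check whether the file is open.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Toy model of the behaviour described in the issue; NOT the HDFS API. */
class RegionInfoWriteSketch {

    /** Hypothetical stand-in for the HDFS exception of the same name. */
    static class AlreadyBeingCreatedException extends IOException {
        AlreadyBeingCreatedException(String path) {
            super("failed to create file " + path + ": still open for writing");
        }
    }

    /** Hypothetical in-memory namespace: path -> "is the file still open?". */
    static class ToyFs {
        private final Map<String, Boolean> open = new HashMap<>();

        boolean exists(String path) {
            return open.containsKey(path);
        }

        /** Fails while an earlier writer never closed the file, even with overwrite. */
        void create(String path, boolean overwrite) throws IOException {
            Boolean stillOpen = open.get(path);
            if (Boolean.TRUE.equals(stillOpen)) {
                throw new AlreadyBeingCreatedException(path);
            }
            if (stillOpen != null && !overwrite) {
                throw new IOException("file already exists: " + path);
            }
            open.put(path, true);
        }

        /** Marks an existing file as closed; does nothing for unknown paths. */
        void close(String path) {
            open.replace(path, false);
        }

        /** Removes the entry without checking whether the file is open or closed. */
        boolean delete(String path) {
            return open.remove(path) != null;
        }
    }

    /** The suggested handling: delete any leftover file, then create a new one. */
    static void createFresh(ToyFs fs, String path) throws IOException {
        if (fs.exists(path)) {
            fs.delete(path);
        }
        fs.create(path, true);
    }

    public static void main(String[] args) throws IOException {
        ToyFs fs = new ToyFs();
        String path = "/toy/.tmp/.regioninfo"; // hypothetical path
        fs.create(path, true); // file is created; the write fails and close never succeeds
        try {
            fs.create(path, true); // retry hits the file that is still open
        } catch (AlreadyBeingCreatedException e) {
            System.out.println("retry failed: " + e.getMessage());
        }
        createFresh(fs, path); // delete first, then create
        System.out.println("recreated: " + fs.exists(path));
    }
}
```

In this sketch the plain retry keeps failing because the stale entry is still marked open, while `createFresh` succeeds because the delete drops the entry regardless of its state. Whether deleting a file that is still under lease is acceptable in a real cluster is a separate question that the toy model does not answer.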