Return-Path:
X-Original-To: apmail-hbase-issues-archive@www.apache.org
Delivered-To: apmail-hbase-issues-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 9895095A5 for ; Tue, 1 Nov 2011 02:20:58 +0000 (UTC)
Received: (qmail 99608 invoked by uid 500); 1 Nov 2011 02:20:58 -0000
Delivered-To: apmail-hbase-issues-archive@hbase.apache.org
Received: (qmail 99557 invoked by uid 500); 1 Nov 2011 02:20:58 -0000
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Delivered-To: mailing list issues@hbase.apache.org
Received: (qmail 99504 invoked by uid 99); 1 Nov 2011 02:20:58 -0000
Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 01 Nov 2011 02:20:58 +0000
X-ASF-Spam-Status: No, hits=-2001.2 required=5.0 tests=ALL_TRUSTED,RP_MATCHES_RCVD
X-Spam-Check-By: apache.org
Received: from [140.211.11.116] (HELO hel.zones.apache.org) (140.211.11.116) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 01 Nov 2011 02:20:55 +0000
Received: from hel.zones.apache.org (hel.zones.apache.org [140.211.11.116]) by hel.zones.apache.org (Postfix) with ESMTP id F367932B466 for ; Tue, 1 Nov 2011 02:20:34 +0000 (UTC)
Date: Tue, 1 Nov 2011 02:20:34 +0000 (UTC)
From: "Hadoop QA (Commented) (JIRA)"
To: issues@hbase.apache.org
Message-ID: <1917875311.44165.1320114034998.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <176360948.33304.1319833772560.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-4695) WAL logs get deleted before region server can fully flush
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394
X-Virus-Checked: Checked by ClamAV on apache.org

[ 
https://issues.apache.org/jira/browse/HBASE-4695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140852#comment-13140852 ]

Hadoop QA commented on HBASE-4695:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12501720/HBASE-4695_Branch90_V2.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/116//console

This message is automatically generated.

> WAL logs get deleted before region server can fully flush
> ---------------------------------------------------------
>
>                 Key: HBASE-4695
>                 URL: https://issues.apache.org/jira/browse/HBASE-4695
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>    Affects Versions: 0.90.4
>            Reporter: jack levin
>            Assignee: gaojinchao
>            Priority: Blocker
>             Fix For: 0.92.0, 0.90.5
>
>         Attachments: HBASE-4695_Branch90_V2.patch, HBASE-4695_Trunk_V2.patch, HBASE-4695_branch90_trial.patch, hbase-4695-0.92.txt
>
>
> To replicate the problem, do the following:
> 1. Check the /hbase/.logs/XXXX directory to see if you have WAL logs for the region server you are shutting down.
> 2. Execute kill <pid> (where pid is a regionserver pid).
> 3. Watch the regionserver log as it starts flushing; you will see how many regions are left to flush:
> 09:36:54,665 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Waiting on 489 regions to close
> 09:56:35,779 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Waiting on 116 regions to close
> 4. Check /hbase/.logs/XXXX -- you will notice that it has disappeared.
> 5. Check the namenode logs:
> 09:26:41,607 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=root ip=/10.101.1.5 cmd=delete src=/hbase/.logs/rdaa5.prod.imageshack.com,60020,1319749
> Note that if you kill -9 the RS now, or it crashes during the flush, you won't have any WAL logs to replay. We need to make sure that logs are deleted or moved out only when the RS has fully flushed. Otherwise it's possible to lose data.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira