hbase-user mailing list archives

From Jack Levin <magn...@gmail.com>
Subject Re: [jira] [Updated] (HBASE-4695) WAL logs get deleted before region server can fully flush
Date Tue, 01 Nov 2011 00:08:38 GMT
It would be nice to have a patch for 0.90.4 also.


Thanks,

-Jack

On Mon, Oct 31, 2011 at 9:49 AM, stack (Updated) (JIRA) <jira@apache.org> wrote:
>
>     [ https://issues.apache.org/jira/browse/HBASE-4695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> stack updated HBASE-4695:
> -------------------------
>
>    Fix Version/s: 0.92.0
>
>> WAL logs get deleted before region server can fully flush
>> ---------------------------------------------------------
>>
>>                 Key: HBASE-4695
>>                 URL: https://issues.apache.org/jira/browse/HBASE-4695
>>             Project: HBase
>>          Issue Type: Bug
>>          Components: wal
>>    Affects Versions: 0.90.4
>>            Reporter: jack levin
>>            Assignee: gaojinchao
>>            Priority: Blocker
>>             Fix For: 0.92.0, 0.90.5
>>
>>         Attachments: HBASE-4695_Trunk_V2.patch, HBASE-4695_branch90_trial.patch, hbase-4695-0.92.txt
>>
>>
>> To replicate the problem, do the following:
>> 1. Check the /hbase/.logs/XXXX directory to confirm that it holds WAL logs for the region server you are shutting down.
>> 2. Run kill <pid> (where pid is the regionserver pid).
>> 3. Watch the regionserver log as it starts flushing; it reports how many regions are left to close:
>> 09:36:54,665 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Waiting on 489 regions to close
>> 09:56:35,779 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Waiting on 116 regions to close
>> 4. Check /hbase/.logs/XXXX again: the directory has disappeared.
>> 5. Check the namenode logs:
>> 09:26:41,607 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=root ip=/10.101.1.5 cmd=delete src=/hbase/.logs/rdaa5.prod.imageshack.com,60020,1319749
>> Note that if you kill -9 the RS now, and it crashes during the flush, you won't have any WAL logs to replay. We need to make sure that logs are deleted or moved out only after the RS has fully flushed. Otherwise it's possible to lose data.
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
>
>
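
The check in steps 3 to 5 of the quoted report (compare the time of the namenode delete against the regionserver's "Waiting on N regions to close" lines) could be scripted roughly as below. This is only a sketch under assumptions: the function name premature_wal_deletes and the regular expressions are made up for illustration and are based solely on the three log lines quoted above, the timestamps are assumed to fall on the same day, and both hosts are assumed to have synchronized clocks.

```python
import re
from datetime import datetime

# Patterns modeled only on the log lines quoted in the report (assumption:
# other HBase/HDFS versions may format these lines differently).
WAITING_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}),\d+ INFO \S*HRegionServer: "
    r"Waiting on (\d+) regions to close"
)
DELETE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}),\d+ INFO \S*FSNamesystem\.audit: "
    r".*\bcmd=delete\s+src=(/hbase/\.logs/\S+)"
)


def _clock(text):
    # The quoted lines carry a time of day only; same-day logs are assumed.
    return datetime.strptime(text, "%H:%M:%S")


def premature_wal_deletes(regionserver_lines, namenode_lines):
    """Report WAL directory deletes that happened while regions were still open.

    Hypothetical helper: a delete counts as premature when the regionserver
    still logs "Waiting on N regions to close" (N > 0) at or after that time.
    """
    waits = []
    for line in regionserver_lines:
        m = WAITING_RE.match(line)
        if m:
            waits.append((_clock(m.group(1)), int(m.group(2))))

    findings = []
    for line in namenode_lines:
        m = DELETE_RE.match(line)
        if not m:
            continue
        deleted_at = _clock(m.group(1))
        still_open = [n for t, n in waits if t >= deleted_at and n > 0]
        if still_open:
            findings.append({
                "src": m.group(2),
                "deleted_at": m.group(1),
                "regions_remaining": max(still_open),
            })
    return findings
```

Fed the regionserver and namenode lines quoted in the report, this would flag the delete of the /hbase/.logs directory, because the regionserver was still waiting on regions to close after the delete was logged.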
