hbase-issues mailing list archives

From "Victor Xu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-9645) Regionserver halt because of HLog's "Logic Error Snapshot seq id from earlier flush still present!"
Date Tue, 24 Sep 2013 14:09:03 GMT

     [ https://issues.apache.org/jira/browse/HBASE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Victor Xu updated HBASE-9645:
-----------------------------

    Description: 
I upgraded my HBase cluster to 0.94.10 three weeks ago, and this problem first appeared several days
after that. I changed the bug's priority to 'Critical' because every time it happens, a regionserver
halts. All of the affected servers show the same log:
{noformat} 
ERROR org.apache.hadoop.hbase.regionserver.wal.HLog: Logic Error Snapshot seq id from earlier
flush still present! for region c0d88db4ce3606842fbec9d34c38f707 overwritten oldseq=80114270537with
new seq=80115066829
{noformat} 
I checked the code and found that the error is raised in the HLog.startCacheFlush method. 'lastSeqWritten'
has been locked at that point. Maybe something outside HLog changed it by mistake.

  was:
I upgraded my HBase cluster to 0.94.10 three weeks ago, and this problem first appeared several days
after that. I changed the bug's priority to 'Critical' because every time it happens, a regionserver
halts. All of the affected servers show the same log:

ERROR org.apache.hadoop.hbase.regionserver.wal.HLog: Logic Error Snapshot seq id from earlier
flush still present! for region c0d88db4ce3606842fbec9d34c38f707 overwritten oldseq=80114270537with
new seq=80115066829

I checked the code and found that the error is raised in the HLog.startCacheFlush method. 'lastSeqWritten'
has been locked at that point. Maybe something outside HLog changed it by mistake.

    
> Regionserver halt because of HLog's "Logic Error Snapshot seq id from earlier flush still present!"
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-9645
>                 URL: https://issues.apache.org/jira/browse/HBASE-9645
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver, wal
>    Affects Versions: 0.94.10
>         Environment: Linux 2.6.32-el5.x86_64
>            Reporter: Victor Xu
>            Priority: Critical
>
> I upgraded my HBase cluster to 0.94.10 three weeks ago, and this problem first appeared several
days after that. I changed the bug's priority to 'Critical' because every time it happens,
a regionserver halts. All of the affected servers show the same log:
> {noformat} 
> ERROR org.apache.hadoop.hbase.regionserver.wal.HLog: Logic Error Snapshot seq id from
earlier flush still present! for region c0d88db4ce3606842fbec9d34c38f707 overwritten oldseq=80114270537with
new seq=80115066829
> {noformat} 
> I checked the code and found that the error is raised in the HLog.startCacheFlush method. 'lastSeqWritten'
has been locked at that point. Maybe something outside HLog changed it by mistake.
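
For readers unfamiliar with the check mentioned in the description, the following is a minimal, simplified model of it, not the actual HBase source. The class name SnapshotSeqModel, the "snp:" key prefix, the String-keyed map, and the returned message (instead of logging and halting the JVM) are all assumptions made for illustration. The sketch only shows one way the check can fire: a second flush starting for a region while the snapshot entry from an earlier flush is still in the map. It does not claim that this is the root cause of the reported failures.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Simplified, hypothetical model of the bookkeeping behind the error message.
class SnapshotSeqModel {
    // Hypothetical prefix used to build the snapshot key for a region.
    static final String SNAPSHOT_PREFIX = "snp:";

    // Region name -> sequence id of the oldest edit not yet flushed.
    private final ConcurrentMap<String, Long> lastSeqWritten = new ConcurrentHashMap<>();

    // Record an edit; only the oldest unflushed sequence id per region is kept.
    void append(String region, long seq) {
        lastSeqWritten.putIfAbsent(region, seq);
    }

    // Move the region's entry to its snapshot key. Returns the error text if a
    // snapshot entry from an earlier flush was still present, otherwise null.
    String startCacheFlush(String region) {
        Long seq = lastSeqWritten.remove(region);
        if (seq == null) {
            return null; // nothing to flush for this region
        }
        Long oldseq = lastSeqWritten.put(SNAPSHOT_PREFIX + region, seq);
        if (oldseq == null) {
            return null; // normal case: no leftover snapshot entry
        }
        return "Logic Error Snapshot seq id from earlier flush still present! for region "
            + region + " overwritten oldseq=" + oldseq + "with new seq=" + seq;
    }

    // A finished flush clears the snapshot entry.
    void completeCacheFlush(String region) {
        lastSeqWritten.remove(SNAPSHOT_PREFIX + region);
    }

    public static void main(String[] args) {
        SnapshotSeqModel wal = new SnapshotSeqModel();
        String region = "c0d88db4ce3606842fbec9d34c38f707";
        wal.append(region, 100L);
        System.out.println(wal.startCacheFlush(region)); // first flush: no error
        wal.append(region, 200L);
        // Second flush starts before the first one was completed or aborted.
        System.out.println(wal.startCacheFlush(region));
    }
}
```

In this model the error cannot occur as long as every startCacheFlush is followed by completeCacheFlush before the next flush of the same region begins, which is why an unexpected change to the map from outside the flush path is one plausible thing to look for.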

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
