hbase-dev mailing list archives

From "Lars George (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1994) Master will lose hlog entries while splitting if region has empty oldlogfile.log
Date Thu, 19 Nov 2009 16:11:41 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780128#action_12780128 ]

Lars George commented on HBASE-1994:

From IRC:

<larsgeorge> the write path is in the thread
<clehene> it's wap
<larsgeorge> yeah
<larsgeorge> urgh
<larsgeorge> it also only catches IOException
<larsgeorge> I only know from experience that uncaught exceptions in threads rarely get logged
<larsgeorge> and if they are, then not properly, as the stack is different
<clehene> uhm... however an exception in that big try could leave it empty
<larsgeorge> yes
<larsgeorge> fragile
<clehene> and next time it would split 
<clehene> it would just throw away all edits because it fails with EOF
<larsgeorge> could be
<larsgeorge> the read deletes the input
<St^Ack> anything in the .out files?
<larsgeorge> so when the write fails
<larsgeorge> tough luck?
<larsgeorge> the read part has the delete in the finally
<larsgeorge> so the input log is deleted for sure
<St^Ack> cosmin you think the master went down while it was splitting a log?
<larsgeorge> shouldn't that be done at the very end? Or in some sort of Atomic commit
<larsgeorge> as in have a big try/catch/finally and either rollback the split or apply
<St^Ack> I think that general split state needs to be hoisted up into zk
<St^Ack> master takes out a 'lock'
<St^Ack> one that will evaporate if it dies mid-split
<larsgeorge> hmm
<larsgeorge> for a master crash?
<larsgeorge> as this is all done in master start anyways
<St^Ack> yeah
<St^Ack> an empty oldlogfile.log -- i can't stand typing the name even -- would seem to indicate an exit w/o a call to close
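
The ordering concern raised in the chat above (the reader deletes the input log in a finally block even when the write fails, and the writer only catches IOException) can be illustrated with a minimal sketch. This is not the HLog code: it uses java.nio.file on a local filesystem instead of HDFS, and the names SplitCommitSketch, EditWriter and splitLog are hypothetical. The idea is simply to write to a temporary file, rename it into place as the commit point, and delete the input only after that succeeds.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class SplitCommitSketch {

    /** Hypothetical callback standing in for the per-region writer thread. */
    public interface EditWriter {
        void write(Path target, List<String> edits) throws Exception;
    }

    /**
     * Returns true if the split was committed. The input log is deleted only
     * after the output has been fully written and renamed into place.
     * Assumption: regionLog does not exist yet; merging with an existing
     * file is out of scope for this sketch.
     */
    public static boolean splitLog(Path inputLog, Path regionLog, EditWriter writer)
            throws IOException {
        List<String> edits = Files.readAllLines(inputLog, StandardCharsets.UTF_8);
        Path tmp = regionLog.resolveSibling(regionLog.getFileName() + ".tmp");
        try {
            writer.write(tmp, edits);
            // Commit point: the output becomes visible under its final name.
            Files.move(tmp, regionLog);
        } catch (Exception e) {
            // Catch more than IOException so a runtime failure in the writer
            // is logged and cannot silently leave a half-written output.
            System.err.println("Split of " + inputLog + " failed, keeping input: " + e);
            Files.deleteIfExists(tmp);
            return false;
        }
        // Only now is it safe to drop the input log.
        Files.delete(inputLog);
        return true;
    }
}
```

With this ordering, a failed write leaves the input log in place so the split can be retried, rather than losing the edits.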

> Master will lose hlog entries while splitting if region has empty oldlogfile.log
> --------------------------------------------------------------------------------
>                 Key: HBASE-1994
>                 URL: https://issues.apache.org/jira/browse/HBASE-1994
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.21.0
>            Reporter: Cosmin Lehene
>            Priority: Blocker
>             Fix For: 0.21.0
>   Original Estimate: 48h
>  Remaining Estimate: 48h
> I don't know yet how an empty oldlogfile.log can exist; however, it happened.
> Master will fail to put the splits in the region oldlogfile.log if an empty oldlogfile.log already exists there.
> This is the master log after I artificially reproduced it by placing an empty oldlogfile.log in /hbase/.META./1028785192/oldlogfile.log and then killing the regionserver that was holding the .META. table:
> 2009-11-19 09:08:36,012 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Splitting 1 hlog(s) in hdfs://b0:9000/hbase/.logs/b4,60020,1258637492773
> 2009-11-19 09:08:36,012 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Splitting hlog 1 of 1: hdfs://b0:9000/hbase/.logs/b4,60020,1258637492773/hlog.dat.1258637493128, length=0
> 2009-11-19 09:08:36,019 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Adding queue for .META.,,1
> 2009-11-19 09:08:36,037 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Pushed=795 entries from hdfs://b0:9000/hbase/.logs/b4,60020,1258637492773/hlog.dat.1258637493128
> 2009-11-19 09:08:36,038 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Thread got 795 to process
> 2009-11-19 09:08:36,043 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: Old hlog file hdfs://b0:9000/hbase/.META./1028785192/oldlogfile.log already exists. Copying existing file to new file
> 2009-11-19 09:08:36,079 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: Got while writing region .META.,,1 log java.io.EOFException
> 2009-11-19 09:08:36,081 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: hlog file splitting completed in 70 millis for hdfs://b0:9000/hbase/.logs/b4,60020,1258637492773
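
The log above shows the copy of the existing (empty) oldlogfile.log ending in java.io.EOFException, after which the 795 pushed entries are not written. One possible guard, sketched here under loud assumptions, is to treat a zero-length existing file as "no previous edits" instead of trying to read it. The class ExistingLogSketch, the method readExistingEdits and the record layout (an int count followed by that many UTF strings) are hypothetical stand-ins and are not the real HLog file format or API.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class ExistingLogSketch {

    /**
     * Reads the edits already present in an old log file.
     * Hypothetical format: an int record count, then that many UTF strings.
     * A missing or zero-length file yields an empty list rather than an
     * EOFException, so the caller can still write the new edits.
     */
    public static List<String> readExistingEdits(Path oldLog) throws IOException {
        List<String> edits = new ArrayList<>();
        if (!Files.exists(oldLog) || Files.size(oldLog) == 0) {
            // Nothing to carry over; an empty file has no header to parse.
            return edits;
        }
        try (DataInputStream in = new DataInputStream(Files.newInputStream(oldLog))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                edits.add(in.readUTF());
            }
        }
        // A truncated non-empty file still raises EOFException to the caller,
        // since that case indicates real corruption rather than an empty log.
        return edits;
    }
}
```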
