hbase-dev mailing list archives

From Karthik Ranganathan <kranganat...@facebook.com>
Subject HBASE-2312 discussion
Date Tue, 16 Mar 2010 18:13:46 GMT
Hey guys,

Just wanted to close on which solution we wanted to pick for this issue - I was thinking about
working on this one. There are 3 possibilities here. I have briefly written up the issue and
the three solutions below.

There is a very rare corner case where bad things could happen (i.e. data loss):
1) RS #1 is going to roll its HLog - not yet created the new one, old one will get no more writes
2) RS #1 enters GC Pause of Death
3) Master lists the HLog files of RS #1 that it has to split as RS #1 is dead, starts splitting
4) RS #1 wakes up, creates the new HLog (previous one was rolled) and appends an edit - which
is lost
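
To make the ordering concrete, here is a toy in-memory model of the four steps above. It is only a sketch: the dictionary, the file names and the edit names are made up for illustration and are not HBase APIs.

```python
# Toy model of the race. `logs` stands in for RS #1's log directory;
# all names here are hypothetical.
logs = {"hlog.1": ["edit-a", "edit-b"]}

# Steps 1-2: RS #1 decides to roll, then stalls before creating hlog.2.
# Step 3: the master declares RS #1 dead and snapshots the file listing.
files_to_split = sorted(logs)

# Step 4: RS #1 wakes up, creates the new log and appends an edit.
logs["hlog.2"] = ["edit-c"]

# The master only replays the files it listed in step 3.
recovered = [edit for name in files_to_split for edit in logs[name]]
lost = [edit for edits in logs.values() for edit in edits
        if edit not in recovered]
```

In this model `lost` ends up holding the single edit written after the master took its listing, which is the data loss described above.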

Solution 1:
1) Master detects RS#1 is dead
2) The master renames the /hbase/.logs/<regionserver name> directory to something else
(say /hbase/.logs/<regionserver name>-dead)
3) Add mkdir support (as opposed to mkdirs) to HDFS - so that a file create fails if the directory
doesn't exist. Dhruba tells me this is very doable.
4) RS#1 comes back up and is not able to create the new hlog. It restarts itself.
NOTE: Need another HDFS API to be supported, Todd wants to avoid this. This API exists in
Hadoop 0.21, but is not back-ported to 0.20.
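
The fencing idea in Solution 1 can be tried out on a local filesystem, which already behaves the way the proposed HDFS call would: creating a file does not create missing parent directories. This is a rough sketch using Python's standard library only, not HDFS code; the directory name "rs1" and the helper `create_log_no_mkdirs` are hypothetical.

```python
import os
import tempfile

def create_log_no_mkdirs(log_dir, name):
    """Create a log file only if its parent directory already exists."""
    path = os.path.join(log_dir, name)
    # Mode "x" creates the file but never creates missing parent directories.
    with open(path, "x"):
        pass
    return path

root = tempfile.mkdtemp()
rs_dir = os.path.join(root, "rs1")   # stands in for /hbase/.logs/<regionserver name>
os.mkdir(rs_dir)
create_log_no_mkdirs(rs_dir, "hlog.1")

# Master: RS #1 is considered dead, so rename its log directory.
os.rename(rs_dir, rs_dir + "-dead")

# RS #1 wakes up and tries to roll to a new log in the old directory.
try:
    create_log_no_mkdirs(rs_dir, "hlog.2")
    rs_fenced = False
except FileNotFoundError:
    rs_fenced = True   # the region server would restart itself here
```

With a mkdirs-style create the last call would silently recreate the directory and succeed, which is why the non-recursive variant is needed.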

Solution 2:
1) RS #1 has written log.1, log.2, log.3
2) RS #1 is just about to write log.4 and enters gc pause before doing so
3) Master detects RS #1 dead
4) Master sees log.1, log.2, log.3. It then opens log.3 for append and also creates log.4
as a lock
5) RS #1 wakes up and isn't allowed to write to either log.3 or log.4 since HMaster holds
NOTE: This changes the log file names and changes the create mode of the log files from overwrite
= true to false. Master needs to create the last log file and open it in append mode to prevent
RS from proceeding. RS will fail if it cannot create the next log file. The number of log
files the RS can create will be bounded.
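
The "create as a lock" part of Solution 2 can also be sketched on a local filesystem, where an exclusive create plays the role of overwrite = false. The helpers `roll_log` and `master_fence` are hypothetical, and the retry loop in `master_fence` is my assumption about how the master would cope with the RS rolling between the listing and the create. Opening log.3 for append relies on HDFS behaviour and is not modelled here.

```python
import os
import tempfile

def log_number(name):
    """Extract N from a file name of the form log.N."""
    return int(name.split(".")[1])

def roll_log(log_dir, n):
    """Create log.<n>; fails if it already exists (overwrite = false)."""
    path = os.path.join(log_dir, f"log.{n}")
    with open(path, "x"):   # raises FileExistsError if the file is present
        pass
    return path

def master_fence(log_dir):
    """Create the next log name as a lock and return the logs to split."""
    while True:
        names = sorted(os.listdir(log_dir), key=log_number)
        try:
            roll_log(log_dir, log_number(names[-1]) + 1)
            return names
        except FileExistsError:
            continue   # assumed handling: the RS rolled in between, list again

log_dir = tempfile.mkdtemp()
for n in (1, 2, 3):
    roll_log(log_dir, n)            # RS #1 has written log.1 .. log.3

to_split = master_fence(log_dir)    # master creates log.4 as the lock

try:
    roll_log(log_dir, 4)            # RS #1 wakes up and tries to roll
    rs_blocked = False
except FileExistsError:
    rs_blocked = True               # RS cannot create the next log and fails
```

The sequential file names matter here: the master can only take the lock if it knows which name the RS will try next.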

Solution 3:
1) Write "intend to roll HLog to new file hlog.N+1" to hlog.N
2) Open hlog.N+1 for append
3) Write "finished rolling" to hlog.N
4) continue writing to hlog.N+1
NOTE: This requires new types of edits to go into the log file - "intent to roll" and "finished
roll". Master has to open the last log file for append first. Also, master has to "chase"
log files created by the region server (please see the issue for details) as there is an outside
chance of log files rolling when the GC pause happens.
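
As a rough illustration of the marker edits and the "chasing" in Solution 3, here is an in-memory sketch. The record format, the marker names and the functions `rs_roll` and `master_chase` are all hypothetical, and the chase logic is only my reading of the steps above (the issue has the real details). The master opening the last log for append is not modelled.

```python
INTENT = "intent-to-roll"
FINISHED = "finished-roll"

def rs_roll(logs, current, nxt):
    """RS side: steps 1-3 of the roll, with marker records in the old log."""
    logs[current].append((INTENT, nxt))    # 1) intend to roll to nxt
    logs[nxt] = []                         # 2) open the new log
    logs[current].append((FINISHED, nxt))  # 3) finished rolling
    return nxt                             # 4) caller continues writing here

def master_chase(logs, listed):
    """Master side: follow intent markers to logs missing from the listing."""
    to_split = list(listed)
    name = to_split[-1]
    while True:
        targets = [target for kind, target in logs[name] if kind == INTENT]
        if not targets or targets[-1] not in logs:
            return to_split                # no roll, or new log never created
        name = targets[-1]
        if name not in to_split:
            to_split.append(name)

logs = {"hlog.1": [("edit", "a")]}
listed = sorted(logs)                      # master lists the logs first
current = rs_roll(logs, "hlog.1", "hlog.2")
logs[current].append(("edit", "b"))        # RS keeps writing after the roll

to_split = master_chase(logs, listed)
```

Here the master's initial listing misses hlog.2, but the intent marker in hlog.1 lets it find the new file before splitting.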

In my opinion, from the perspective of code simplicity, I would rank the solutions as 1 being
simplest, then 2, then 3. Since 1 needs another HDFS API, I was thinking that 2 seemed simpler
to do and easier to verify correctness.

What are your thoughts?

