hbase-dev mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: HBASE-2312 discussion
Date Wed, 17 Mar 2010 05:08:54 GMT
On Tue, Mar 16, 2010 at 8:59 PM, Stack <stack@duboce.net> wrote:

> On Tue, Mar 16, 2010 at 5:08 PM, Todd Lipcon <todd@cloudera.com> wrote:
> >
> > What do you think about the trick of making the RS do a ZK sync before
> > any meta op? This forces it to take at most one action after it's been
> > terminated.
> >
>
> ... where meta op is open of new WAL log?
>
> How would this work?  RS would note in ZK the name of the WAL it's
> about to open before it did it?  If the RS then does a "Juliet" --
>
[haha, love this terminology!]

> i.e. goes into a GC pause death-like coma -- on revival, it'll go to
> open the WAL but master will have already done so, and so it'll fail?
>
>
I was actually referring to the explicit sync call in ZK:
http://hadoop.apache.org/zookeeper/docs/r3.2.1/api/org/apache/zookeeper/ZooKeeper.html#sync%28java.lang.String,%20org.apache.zookeeper.AsyncCallback.VoidCallback,%20java.lang.Object%29

The javadoc isn't that clear, but the way I understand this call is that it
makes sure the client's view of the world is up-to-date with respect to the
ZK leader at the beginning of the sync call.

The "note" box at the bottom of this section also explains it pretty well:
http://hadoop.apache.org/zookeeper/docs/r3.2.2/zookeeperProgrammers.html#ch_zkGuarantees

If we insert a sync before each transition, I think we can ensure that the
region server will do at most one operation after losing its lease.
This means that the whole "chasing the log" thing is unnecessary.
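
To make the "at most one operation" argument concrete, here is a small
in-memory sketch. It does not use the real ZooKeeper client: Leader,
RegionServer, meta_op, and ops_after_expiry are hypothetical names, and
sync() here only models the assumed effect of the ZK sync call (refreshing
the client's view from the leader). The before_apply hook stands in for a
GC pause that lands between the sync and the operation itself.

```python
from dataclasses import dataclass, field


@dataclass
class Leader:
    """Stand-in for the ZK leader: the authoritative set of live sessions."""
    live_sessions: set = field(default_factory=set)

    def expire(self, session_id):
        # Models the lease (session) being expired on the leader side.
        self.live_sessions.discard(session_id)


class RegionServer:
    """Toy region server holding a possibly stale local view of its lease."""

    def __init__(self, session_id, leader):
        self.session_id = session_id
        self.leader = leader
        leader.live_sessions.add(session_id)
        self.believes_alive = True  # local view, only refreshed by sync()
        self.ops_done = []

    def sync(self):
        # Assumed effect of a ZK sync: catch the local view up to the leader.
        self.believes_alive = self.session_id in self.leader.live_sessions

    def meta_op(self, name, sync_first=True, before_apply=None):
        if sync_first:
            self.sync()
        if not self.believes_alive:
            return False  # lease known to be lost: refuse the transition
        if before_apply is not None:
            before_apply()  # simulated pause between the check and the op
        self.ops_done.append(name)
        return True


def ops_after_expiry(sync_first, attempts=3):
    """Expire the lease during the first op, then keep attempting ops."""
    leader = Leader()
    rs = RegionServer("rs-1", leader)
    rs.meta_op("op-0", sync_first, before_apply=lambda: leader.expire("rs-1"))
    for i in range(1, attempts):
        rs.meta_op(f"op-{i}", sync_first)
    return rs.ops_done


print(ops_after_expiry(sync_first=True))   # ['op-0']
print(ops_after_expiry(sync_first=False))  # ['op-0', 'op-1', 'op-2']
```

With the sync in place, only the operation that was already past its check
when the lease expired goes through; every later one is refused. Without it,
the stale server keeps going until it learns of the expiry some other way,
which this sketch deliberately leaves out.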



> @Karthik "I am a little nervous about the master backing off on
> detecting the RS's progress - because the RS has already lost its zk
> lease."
>
> Yes.  The RS will have had its 'shut-yourself-down' flag set on
> loss-of-lease so is on its way out.  It's not going to revive, so its
> logs need recovering.
>
> @Kannan "Option #1 seems easy to reason about and simple to implement.
> Can we go ahead with that if there is no major objection?"
>
> Fine by me.
>

Fine by me as well. I think we'll need solutions like #2 or #3 in other
places, but for this one #1 seems to work (I'll keep thinking about whether
there are any holes in our logic).

-Todd


-- 
Todd Lipcon
Software Engineer, Cloudera
