hbase-dev mailing list archives

From 杨苏立 Yang Su Li <yangs...@gmail.com>
Subject Re: How threads interact with each other in HBase
Date Sun, 02 Apr 2017 18:35:20 GMT
I understand why HBase does not use hsync by default -- it comes with a
big performance cost (though for FSYNC_WAL, which is not the default option,
you should probably do it because the documentation explicitly promises
it).


I just want to make sure my description of HBase is accurate, including
the durability aspect.
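
To make the hflush/hsync distinction concrete, here is a minimal local-file sketch. It is an analogy under stated assumptions, not HBase or HDFS code: writing without a force stands in for hflush (data has left the writer but may still sit in memory), and FileChannel.force(true) stands in for hsync (the OS is asked to push the data to the storage device). The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class FlushVsSync {
    /**
     * Appends one record to a local log file and returns the file size.
     * Hypothetical helper: the file is only a stand-in for a WAL.
     */
    static long append(Path wal, String record, boolean fsync) throws IOException {
        try (FileChannel ch = FileChannel.open(wal,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap((record + "\n").getBytes(StandardCharsets.UTF_8));
            while (buf.hasRemaining()) {
                // After write() the bytes are typically in the OS page cache only.
                // This is the rough analogue of hflush: visible, but not on disk.
                ch.write(buf);
            }
            if (fsync) {
                // Ask the OS to push data and metadata to the storage device.
                // This is the rough analogue of hsync, and it is the slow part.
                ch.force(true);
            }
            return ch.size();
        }
    }

    public static void main(String[] args) throws IOException {
        Path wal = Files.createTempFile("wal-sketch", ".log");
        append(wal, "put row1", false); // acknowledged without reaching the device
        long size = append(wal, "put row2", true); // acknowledged after force()
        System.out.println("bytes in log: " + size);
        Files.delete(wal);
    }
}
```

Both calls return successfully, which is the point of the question: a successful return alone does not say which of the two guarantees the caller received.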

On Sun, Apr 2, 2017 at 12:19 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> Suli:
> Have you looked at HBASE-5954 ?
>
> It gives some background on why hbase code is formulated the way it
> currently is.
>
> Cheers
>
> On Sun, Apr 2, 2017 at 9:36 AM, 杨苏立 Yang Su Li <yangsuli@gmail.com> wrote:
>
> > Doesn't your second paragraph just prove my point? -- If data is not
> > persisted to disk, then it is not durable. That is the definition of
> > durability.
> >
> > If you want the data to be durable, then you need to call hsync() instead
> > of hflush(), and that would be the correct behavior if you use FSYNC_WAL
> > flag (per HBase documentation).
> >
> > However, HBase does not do that.
> >
> > Suli
> >
> > On Sun, Apr 2, 2017 at 11:26 AM, Josh Elser <josh.elser@gmail.com> wrote:
> >
> > > No, that's not correct. HBase would, by definition, not be a
> > > consistent database if a write was not durable when a client sees a
> > > successful write.
> > >
> > > The point that I will concede to you is that the hflush call may, in
> > > extenuating circumstances, not be completely durable. For example,
> > > hflush does not actually force the data to disk. If an abrupt power
> > > failure happens before this data is pushed to disk, HBase may think
> > > that data was durable when it actually wasn't (at the HDFS level).
> > >
> > > On Thu, Mar 30, 2017 at 4:26 PM, 杨苏立 Yang Su Li <yangsuli@gmail.com> wrote:
> > > > Also, please correct me if I am wrong, but I don't think a put is durable
> > > > when an RPC returns to the client. Its corresponding WAL entry has only
> > > > been pushed to the memory of all three data nodes, so it has a low
> > > > probability of being lost. But nothing is persisted at this point.
> > > >
> > > > And this is true no matter whether you use the SYNC_WAL or FSYNC_WAL flag.
> > > >
> > > > On Tue, Mar 28, 2017 at 12:11 PM, Josh Elser <elserj@apache.org> wrote:
> > > >
> > > >> 1.1 -> 2: don't forget about the block cache which can invalidate the need
> > > >> for any HDFS read.
> > > >>
> > > >> I think you're over-simplifying the write-path quite a bit. I'm not sure
> > > >> what you mean by an 'asynchronous write', but that doesn't exist at the
> > > >> HBase RPC layer as that would invalidate the consistency guarantees (if an
> > > >> RPC returns to the client that data was "put", then it is durable).
> > > >>
> > > >> Going off of memory (sorry in advance if I misstate something): the
> > > >> general way that data is written to the WAL is a "group commit". You have
> > > >> many threads all trying to append data to the WAL -- performance would be
> > > >> terrible if you serially applied all of these writes. Instead, many writes
> > > >> can be accepted and the caller receives a Future. The caller must wait
> > > >> for the Future to complete. What's happening behind the scenes is that the
> > > >> writes are being bundled together to reduce the number of syncs to the WAL
> > > >> ("grouping" the writes together). When one caller's future completes,
> > > >> what really happened is that the write/sync which included the caller's
> > > >> update was committed (along with others). All of this is happening inside
> > > >> the RS's implementation of accepting an update.
> > > >>
> > > >> https://github.com/apache/hbase/blob/55d6dcaf877cc5223e679736eb613173229c18be/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java#L74-L106
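
The group commit described above can be sketched in plain Java. This is an illustration under stated assumptions, not the FSHLog implementation: the class GroupCommitSketch, its in-memory list standing in for the WAL file, and the single syncer thread are all hypothetical. It shows only the shape of the idea: many callers append, each gets a future, and one "sync" completes every future in the batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class GroupCommitSketch {
    /** One pending append plus the future its caller waits on. */
    static final class Pending {
        final String edit;
        final CompletableFuture<Long> done = new CompletableFuture<>();
        Pending(String edit) { this.edit = edit; }
    }

    private final BlockingQueue<Pending> queue = new LinkedBlockingQueue<>();
    private final List<String> log = new ArrayList<>();   // stand-in for the WAL file
    final AtomicInteger syncCount = new AtomicInteger();  // number of "syncs" issued

    GroupCommitSketch() {
        Thread syncer = new Thread(this::syncLoop, "wal-syncer");
        syncer.setDaemon(true); // let the JVM exit when main finishes
        syncer.start();
    }

    /** Called by many handler threads; returns at once with a future. */
    CompletableFuture<Long> append(String edit) {
        Pending p = new Pending(edit);
        queue.add(p);
        return p.done;
    }

    private void syncLoop() {
        List<Pending> batch = new ArrayList<>();
        try {
            while (true) {
                batch.add(queue.take());  // block until at least one edit arrives
                queue.drainTo(batch);     // then take everything already queued
                for (Pending p : batch) {
                    log.add(p.edit);      // write the whole group
                }
                // A real WAL would sync here; one sync covers the whole batch.
                long syncId = syncCount.incrementAndGet();
                for (Pending p : batch) {
                    p.done.complete(syncId); // wake every caller in the group
                }
                batch.clear();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws Exception {
        GroupCommitSketch wal = new GroupCommitSketch();
        List<CompletableFuture<Long>> futures = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            futures.add(wal.append("edit-" + i));
        }
        for (CompletableFuture<Long> f : futures) {
            f.get(); // the caller does not return to the client until its sync is done
        }
        System.out.println("edits=100 syncs=" + wal.syncCount.get());
    }
}
```

The number of syncs printed depends on timing, but it is never more than the number of edits, which is the saving group commit is after. Whether each sync is an hflush or an hsync is a separate question from how the writes are grouped.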
> > > >>
> > > >>
> > > >> 杨苏立 Yang Su Li wrote:
> > > >>
> > > >>> The attachment can be found in the following URL:
> > > >>> http://pages.cs.wisc.edu/~suli/hbase.pdf
> > > >>>
> > > >>> Sorry for the inconvenience...
> > > >>>
> > > >>>
> > > >>> On Mon, Mar 27, 2017 at 8:25 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > >>>
> > > >>>> Again, attachment didn't come thru.
> > > >>>>
> > > >>>> Is it possible to formulate as google doc ?
> > > >>>>
> > > >>>> Thanks
> > > >>>>
> > > >>>> On Mon, Mar 27, 2017 at 6:19 PM, 杨苏立 Yang Su Li <yangsuli@gmail.com> wrote:
> > > >>>>
> > > >>>>> Hi,
> > > >>>>>
> > > >>>>> I am a graduate student working on scheduling on storage systems, and
> > > >>>>> we are interested in how different threads in HBase interact with each
> > > >>>>> other and how it might affect scheduling.
> > > >>>>>
> > > >>>>> I have written down my understanding on how HBase/HDFS works based on
> > > >>>>> its current thread architecture (attached). I am wondering if the
> > > >>>>> developers of HBase could take a look at it and let me know if anything
> > > >>>>> is incorrect or inaccurate, or if I have missed anything.
> > > >>>>>
> > > >>>>> Thanks a lot for your help!
> > > >>>>>
> > > >>>>> On Wed, Mar 22, 2017 at 3:39 PM, 杨苏立 Yang Su Li <yangsuli@gmail.com> wrote:
> > > >>>>>
> > > >>>>>> Hi,
> > > >>>>>>
> > > >>>>>> I am a graduate student working on scheduling on storage systems, and
> > > >>>>>> we are interested in how different threads in HBase interact with each
> > > >>>>>> other and how it might affect scheduling.
> > > >>>>>>
> > > >>>>>> I have written down my understanding on how HBase/HDFS works based on
> > > >>>>>> its current thread architecture (attached). I am wondering if the
> > > >>>>>> developers of HBase could take a look at it and let me know if anything
> > > >>>>>> is incorrect or inaccurate, or if I have missed anything.
> > > >>>>>>
> > > >>>>>> Thanks a lot for your help!
> > > >>>>>>
> > > >>>>>> --
> > > >>>>>> Suli Yang
> > > >>>>>>
> > > >>>>>> Department of Physics
> > > >>>>>> University of Wisconsin Madison
> > > >>>>>>
> > > >>>>>> 4257 Chamberlin Hall
> > > >>>>>> Madison WI 53703
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>
> > > >>>
> > > >>>
> > > >
> > > >
> > >
> >
> >
> >
> >
>



-- 
Suli Yang

Department of Physics
University of Wisconsin Madison

4257 Chamberlin Hall
Madison WI 53703
