hbase-dev mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: How threads interact with each other in HBase
Date Sun, 02 Apr 2017 17:19:17 GMT
Suli:
Have you looked at HBASE-5954 ?

It gives some background on why the HBase code is structured the way it
currently is.

Cheers

On Sun, Apr 2, 2017 at 9:36 AM, 杨苏立 Yang Su Li <yangsuli@gmail.com> wrote:

> Doesn't your second paragraph just prove my point? -- If data is not
> persisted to disk, then it is not durable. That is the definition of
> durability.
>
> If you want the data to be durable, then you need to call hsync() instead
> of hflush(), and that would be the correct behavior if you use FSYNC_WAL
> flag (per HBase documentation).
>
> However, HBase does not do that.
>
> Suli
>
> On Sun, Apr 2, 2017 at 11:26 AM, Josh Elser <josh.elser@gmail.com> wrote:
>
> > No, that's not correct. HBase would, by definition, not be a
> > consistent database if a write was not durable when a client sees a
> > successful write.
> >
> > The point that I will concede to you is that the hflush call may, in
> > extenuating circumstances, not be completely durable. For example,
> > hflush() does not actually force the data to disk. If an abrupt power
> > failure happens before this data is pushed to disk, HBase may think
> > that data was durable when it actually wasn't (at the HDFS level).
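
A rough local-file analogy for the hflush/hsync distinction described above, using only the JDK rather than the HDFS client API. The class name `FlushVsSync` and the `append` helper are hypothetical. A plain `write` hands the bytes to the operating system, much as hflush hands them to the DataNodes without forcing them to disk. `FileChannel.force(true)` asks for them to be forced to the storage device, which is the role hsync plays. This is a sketch of the idea, not of how HBase or HDFS implement it.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class FlushVsSync {
    /** Appends one record; forces it to the storage device only if fsync is true. */
    static void append(FileChannel wal, String record, boolean fsync) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap((record + "\n").getBytes(StandardCharsets.UTF_8));
        while (buf.hasRemaining()) {
            wal.write(buf);   // bytes reach the OS; readable, but may still sit in a cache
        }
        if (fsync) {
            wal.force(true);  // ask the OS to push data and metadata to the device
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("wal-sketch", ".log");
        try (FileChannel wal = FileChannel.open(path,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            append(wal, "put row1", false);  // analogous to a flush-only append
            append(wal, "put row2", true);   // analogous to a synced append
        }
        // Both records are readable either way; the difference only shows
        // after an abrupt power loss, which this sketch cannot demonstrate.
        System.out.println(Files.readAllLines(path));
        Files.delete(path);
    }
}
```

Both appends look identical to a reader while the machine stays up, which is why the difference is easy to overlook.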
> >
> > On Thu, Mar 30, 2017 at 4:26 PM, 杨苏立 Yang Su Li <yangsuli@gmail.com>
> > wrote:
> > > Also, please correct me if I am wrong, but I don't think a put is durable
> > > when an RPC returns to the client. Just its corresponding WAL entry is
> > > pushed to the memory of all three data nodes, so it has a low probability
> > > of being lost. But nothing is persisted at this point.
> > >
> > > And this is true no matter whether you use the SYNC_WAL or FSYNC_WAL flag.
> > >
> > > On Tue, Mar 28, 2017 at 12:11 PM, Josh Elser <elserj@apache.org> wrote:
> > >
> > >> 1.1 -> 2: don't forget about the block cache, which can eliminate the need
> > >> for any HDFS read.
> > >>
> > >> I think you're over-simplifying the write-path quite a bit. I'm not sure
> > >> what you mean by an 'asynchronous write', but that doesn't exist at the
> > >> HBase RPC layer as that would invalidate the consistency guarantees (if an
> > >> RPC returns to the client that data was "put", then it is durable).
> > >>
> > >> Going off of memory (sorry in advance if I misstate something): the
> > >> general way that data is written to the WAL is a "group commit". You have
> > >> many threads all trying to append data to the WAL -- performance would be
> > >> terrible if you serially applied all of these writes. Instead, many writes
> > >> can be accepted and the caller receives a Future. The caller must wait
> > >> for the Future to complete. What's happening behind the scenes is that the
> > >> writes are being bundled together to reduce the number of syncs to the WAL
> > >> ("grouping" the writes together). When one caller's future completes,
> > >> what really happened is that the write/sync which included the caller's
> > >> update was committed (along with others). All of this is happening inside
> > >> the RS's implementation of accepting an update.
> > >>
> > >> https://github.com/apache/hbase/blob/55d6dcaf877cc5223e679736eb613173229c18be/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java#L74-L106
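
The "group commit" idea described above can be sketched in a few lines of plain Java. This is an illustration under assumptions, not the FSHLog code linked above: the class `GroupCommitSketch`, its `Pending` holder, and the single syncer thread are all hypothetical, and an in-memory list stands in for the WAL. Callers append and get a future back; one background thread drains whatever has queued up, does a single simulated sync for the batch, and then completes every future in that batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class GroupCommitSketch {
    /** One pending append plus the future its caller waits on. */
    static final class Pending {
        final String edit;
        final CompletableFuture<Long> done = new CompletableFuture<>();
        Pending(String edit) { this.edit = edit; }
    }

    private final BlockingQueue<Pending> queue = new LinkedBlockingQueue<>();
    private final List<String> log = new ArrayList<>();  // stands in for the WAL file
    final AtomicInteger syncCount = new AtomicInteger(); // number of simulated syncs

    GroupCommitSketch() {
        Thread syncer = new Thread(this::syncLoop, "wal-syncer");
        syncer.setDaemon(true);
        syncer.start();
    }

    /** Called by many handler threads; returns at once with a future. */
    CompletableFuture<Long> append(String edit) {
        Pending p = new Pending(edit);
        queue.add(p);
        return p.done;
    }

    private void syncLoop() {
        List<Pending> batch = new ArrayList<>();
        try {
            while (true) {
                batch.add(queue.take());  // block until at least one edit arrives
                queue.drainTo(batch);     // then take everything else already queued
                for (Pending p : batch) {
                    log.add(p.edit);
                }
                // One simulated sync covers the whole batch.
                long syncId = syncCount.incrementAndGet();
                for (Pending p : batch) {
                    p.done.complete(syncId);
                }
                batch.clear();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws Exception {
        GroupCommitSketch wal = new GroupCommitSketch();
        List<CompletableFuture<Long>> futures = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            futures.add(wal.append("edit-" + i));
        }
        for (CompletableFuture<Long> f : futures) {
            f.get();  // each caller waits for the sync that covered its edit
        }
        System.out.println("edits=1000 syncs=" + wal.syncCount.get());
    }
}
```

The sync count printed at the end varies from run to run, but it is typically far below the number of edits, which is the point of batching.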
> > >>
> > >>
> > >> 杨苏立 Yang Su Li wrote:
> > >>
> > >>> The attachment can be found in the following URL:
> > >>> http://pages.cs.wisc.edu/~suli/hbase.pdf
> > >>>
> > >>> Sorry for the inconvenience...
> > >>>
> > >>>
> > >>> On Mon, Mar 27, 2017 at 8:25 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > >>>
> > >>>> Again, attachment didn't come thru.
> > >>>>
> > >>>> Is it possible to formulate as google doc ?
> > >>>>
> > >>>> Thanks
> > >>>>
> > >>>> On Mon, Mar 27, 2017 at 6:19 PM, 杨苏立 Yang Su Li <yangsuli@gmail.com> wrote:
> > >>>>
> > >>>>> Hi,
> > >>>>>
> > >>>>> I am a graduate student working on scheduling on storage systems, and we
> > >>>>> are interested in how different threads in HBase interact with each other
> > >>>>> and how it might affect scheduling.
> > >>>>>
> > >>>>> I have written down my understanding on how HBase/HDFS works based on its
> > >>>>> current thread architecture (attached). I am wondering if the developers of
> > >>>>> HBase could take a look at it and let me know if anything is incorrect or
> > >>>>> inaccurate, or if I have missed anything.
> > >>>>>
> > >>>>> Thanks a lot for your help!
> > >>>>>
> > >>>>> On Wed, Mar 22, 2017 at 3:39 PM, 杨苏立 Yang Su Li <yangsuli@gmail.com> wrote:
> > >>>>>
> > >>>>>> Hi,
> > >>>>>>
> > >>>>>> I am a graduate student working on scheduling on storage systems, and we
> > >>>>>> are interested in how different threads in HBase interact with each other
> > >>>>>> and how it might affect scheduling.
> > >>>>>>
> > >>>>>> I have written down my understanding on how HBase/HDFS works based on its
> > >>>>>> current thread architecture (attached). I am wondering if the developers of
> > >>>>>> HBase could take a look at it and let me know if anything is incorrect or
> > >>>>>> inaccurate, or if I have missed anything.
> > >>>>>>
> > >>>>>> Thanks a lot for your help!
> > >>>>>>
> > >>>>>> --
> > >>>>>> Suli Yang
> > >>>>>>
> > >>>>>> Department of Physics
> > >>>>>> University of Wisconsin Madison
> > >>>>>>
> > >>>>>> 4257 Chamberlin Hall
> > >>>>>> Madison WI 53703
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>> --
> > >>>>> Suli Yang
> > >>>>>
> > >>>>> Department of Physics
> > >>>>> University of Wisconsin Madison
> > >>>>>
> > >>>>> 4257 Chamberlin Hall
> > >>>>> Madison WI 53703
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>
> > >>>
> > >>>
> > >
> > >
> > > --
> > > Suli Yang
> > >
> > > Department of Physics
> > > University of Wisconsin Madison
> > >
> > > 4257 Chamberlin Hall
> > > Madison WI 53703
> >
>
>
>
> --
> Suli Yang
>
> Department of Physics
> University of Wisconsin Madison
>
> 4257 Chamberlin Hall
> Madison WI 53703
>
