hbase-dev mailing list archives

From 张铎 <palomino...@gmail.com>
Subject Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0
Date Fri, 29 Apr 2016 05:12:12 GMT
2016-04-29 11:47 GMT+08:00 Ted Yu <yuzhihong@gmail.com>:

> Last comment on HDFS-916 was from 2010.
>
> Suggest making a new issue or reviving discussion on HDFS-916 (currently
> assigned to Todd).
>
> bq. The fallback implementation is not aimed at good performance
>
> For more than two weeks, I have been working with Azure Data Lake
> developers so that all hbase system tests pass on ADLS - there were subtle
> differences between ADLS and hdfs.
>
> If switching to AsyncWAL gives either WASB or ADLS subpar performance, it
> would make upgrading to hbase 2.x unacceptable for their users.
>
You can still use FSHLog; it has not been removed...
But yes, this is a good point about how we choose default configs in HBase:
a config that performs acceptably in every case, or a config that performs
much better in the main scenario but worse in other scenarios...

>
> On Thu, Apr 28, 2016 at 8:39 PM, 张铎 <palomino219@gmail.com> wrote:
>
> > 2016-04-29 11:35 GMT+08:00 Ted Yu <yuzhihong@gmail.com>:
> >
> > > bq. AsyncFSOutput will be in HDFS-3.0
> > >
> > > Is there HDFS JIRA for the above ? Can you share the number ?
> > >
> > I have not filed a new one, but there are a bunch of related issues already,
> > such as this one: https://issues.apache.org/jira/browse/HDFS-916
> >
> > >
> > > > bq. Just wrap FSDataOutputStream to make it act like an asynchronous output
> > >
> > > Can you be a bit more specific ?
> > > > HBase currently works with WASB and Azure Data Lake. Does the above mean
> > > > their performance would suffer?
> > >
> > > Yes, the performance will suffer...
> > > The fallback implementation is not aimed at good performance, just at
> > > compatibility with any FileSystem implementation.
> >
> > >
> > > On Thu, Apr 28, 2016 at 8:30 PM, 张铎 <palomino219@gmail.com> wrote:
> > >
> > > > Inline comments.
> > > > Thanks,
> > > >
> > > > 2016-04-29 10:57 GMT+08:00 Sean Busbey <busbey@cloudera.com>:
> > > >
> > > > > > I am nervous about having default out-of-the-box new HBase users
> > > > > > reliant on a bespoke HDFS client, especially given Hadoop's
> > > > > > compatibility promises and history. Answers for these questions
> > > > > > would make me more confident:
> > > > >
> > > > > > 1) Where are we on getting the client-side changes to HDFS pushed
> > > > > > back upstream?
> > > > >
> > > > > No progress yet... I would like to be able to tell a good story here:
> > > > > that HBase already uses it as the default :)
> > > >
> > > > >
> > > > > 2) How well do we detect when our FS is not HDFS and what does
> > > > > fallback look like?
> > > > >
> > > > > Just wrap FSDataOutputStream to make it act like an asynchronous
> > > > > output (call hflush in a separate thread). I don't think the
> > > > > performance is good.
> > > >
> > > > >
> > > > > 3) Will this mean altering the versions of Hadoop we label as
> > > > > supported for HBase 2.y+?
> > > > >
> > > > > I have tested with Hadoop versions from 2.4.x to 2.7.x, so I don't
> > > > > think we need to change the supported versions?
> > > >
> > > > >
> > > > > > 4) How are we going to ensure our client remains compatible with
> > > > > > newer Hadoop releases?
> > > > >
> > > > > We cannot ensure that; HDFS always breaks HBase at a new release...
> > > > > I need to test AsyncFSWAL on every new 2.x release and make it
> > > > > compatible with that version. And back to #1, I think we should make
> > > > > sure that the AsyncFSOutput will be in HDFS-3.0. Then in HBase-3.0 we
> > > > > can introduce a new 'AsyncFSWAL' that uses the AsyncFSOutput in HDFS.
> > > >
> > > > >
> > > > > > On Thu, Apr 28, 2016 at 9:42 PM, Duo Zhang <zhangduo@apache.org> wrote:
> > > > > > > Six months after I filed HBASE-14790...
> > > > > > >
> > > > > > > Now the AsyncFSWAL is ready. The WALPE result shows that it is
> > > > > > > *1.4x~3.7x* faster than FSHLog. The ITBLL result shows that it is
> > > > > > > *no worse* than FSHLog (the master branch is not that stable
> > > > > > > itself...).
> > > > > >
> > > > > > More details can be found on HBASE-15536.
> > > > > >
> > > > > > > So here we propose to change the default WAL from FSHLog to
> > > > > > > AsyncFSWAL. Suggestions are welcome.
> > > > > >
> > > > > > Thanks.
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > busbey
> > > > >
> > > >
> > >
> >
>
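
For readers following the fallback question above, here is a rough sketch of what "wrap the stream and call hflush in a separate thread" could look like. It is only an illustration under stated assumptions, not the actual HBase code: `WrappedAsyncOutput` is a hypothetical name, and a plain `java.io.OutputStream` with `flush()` stands in for `FSDataOutputStream` and `hflush()` so that the example runs without Hadoop on the classpath.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: makes a blocking output stream look asynchronous. */
class WrappedAsyncOutput implements AutoCloseable {
  // Stand-in for FSDataOutputStream (assumption, to keep the sketch self-contained).
  private final OutputStream out;
  // Writes are buffered in memory until the caller asks for a flush.
  private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
  // A single daemon thread performs the blocking write + flush, in submission order.
  private final ExecutorService flusher = Executors.newSingleThreadExecutor(r -> {
    Thread t = new Thread(r, "wrapped-output-flusher");
    t.setDaemon(true);
    return t;
  });
  // Only touched from the flusher thread.
  private long flushedLength = 0;

  WrappedAsyncOutput(OutputStream out) {
    this.out = out;
  }

  /** Buffers data; nothing reaches the underlying stream yet. */
  synchronized void write(byte[] data) {
    buffer.write(data, 0, data.length);
  }

  /** Hands the buffered bytes to the flusher thread and returns immediately. */
  synchronized CompletableFuture<Long> flush() {
    byte[] pending = buffer.toByteArray();
    buffer.reset();
    CompletableFuture<Long> future = new CompletableFuture<>();
    flusher.execute(() -> {
      try {
        out.write(pending);
        out.flush(); // a real wrapper would call hflush() here
        flushedLength += pending.length;
        future.complete(flushedLength);
      } catch (IOException e) {
        future.completeExceptionally(e);
      }
    });
    return future;
  }

  /** Waits for queued flushes to finish, then closes the underlying stream. */
  @Override
  public void close() {
    flusher.shutdown();
    try {
      flusher.awaitTermination(10, TimeUnit.SECONDS);
      out.close();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Every flush in this sketch still pays for a full blocking write on one thread, which is consistent with the point made in the thread that such a wrapper buys compatibility rather than performance.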
