hbase-dev mailing list archives

From Jesse Yates <jesse.k.ya...@gmail.com>
Subject Re: Multiple WALs
Date Sat, 01 Oct 2011 22:05:25 GMT
I think adding the abstraction layer and making it not only pluggable, but
configurable would be great.

 It would be nice to be able to tie into a service that logs directly to
disk rather than going through HDFS, giving a potentially awesome speedup at
the cost of having to write a logging service that handles replication, etc.
Side note: Accumulo is using its own service to store the WAL, rather than
HDFS, and I suspect that plays a big role in people's claims of its ability
to 'outperform' HBase.
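[A minimal sketch of what such a pluggable WAL abstraction might look like. The interface, class, and method names below are illustrative only, not actual HBase APIs; the in-memory class just shows how an alternative backend, such as a local-disk or custom logging service, would plug in behind the interface.]

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical pluggable WAL interface (illustrative, not real HBase API).
interface WriteAheadLog {
    // Append an edit for a region and return its assigned sequence id.
    long append(byte[] regionName, byte[] edit) throws IOException;

    // Block until all appends up to the given sequence id are durable.
    void sync(long sequenceId) throws IOException;

    void close() throws IOException;
}

// Trivial in-memory stand-in demonstrating the plug-in point; an
// HDFS-backed implementation would be the configurable default.
class InMemoryWal implements WriteAheadLog {
    private final List<byte[]> edits = new ArrayList<>();

    public synchronized long append(byte[] regionName, byte[] edit) {
        edits.add(edit);
        return edits.size(); // sequence ids start at 1 in this sketch
    }

    public void sync(long sequenceId) { /* already "durable" in memory */ }

    public void close() { }

    public synchronized int size() { return edits.size(); }
}
```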

-Jesse Yates

On Sat, Oct 1, 2011 at 2:04 PM, Stack <stack@duboce.net> wrote:

> Yes.  For sure.  Would need to check that the split can deal w/
> multiple logs written by the one server concurrently (sort by sequence
> edit id after sorting on all the rest that makes up a wal log key).
>
> St.Ack
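[Stack's point about sorting recovered edits could be sketched roughly as below. This is not HBase's actual split code; the key is simplified here to (region, sequence id), with the sequence edit id as the final tie-breaker so edits from several concurrently written logs replay in a deterministic order.]

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Simplified WAL entry: real log keys carry more fields; here we sort
// on region first, then on the sequence edit id.
class WalEntry implements Comparable<WalEntry> {
    final String region;
    final long sequenceId;
    final String edit;

    WalEntry(String region, long sequenceId, String edit) {
        this.region = region;
        this.sequenceId = sequenceId;
        this.edit = edit;
    }

    public int compareTo(WalEntry other) {
        int c = region.compareTo(other.region);
        return c != 0 ? c : Long.compare(sequenceId, other.sequenceId);
    }
}

class WalMerge {
    // Merge entries recovered from any number of WAL files written by one
    // server into a single deterministic replay order.
    static List<WalEntry> mergeForReplay(List<List<WalEntry>> wals) {
        List<WalEntry> all = new ArrayList<>();
        for (List<WalEntry> wal : wals) {
            all.addAll(wal);
        }
        Collections.sort(all);
        return all;
    }
}
```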
>
> On Sat, Oct 1, 2011 at 1:36 PM, karthik tunga <karthik.tunga@gmail.com>
> wrote:
> > Hey,
> >
> > Don't multiple WALs need some kind of merging when recovering from a
> > crash?
> >
> > Cheers,
> > Karthik
> >
> >
> > On 1 October 2011 15:17, Stack <stack@duboce.net> wrote:
> >
> >> +1 on making WAL pluggable so we can experiment.  Being able to write
> >> multiple WALs at once should be easy enough to do (the WAL split code
> >> should be able to handle it). Also, a suggestion made a while back was
> >> making it so hbase could be configured to write to two filesystems --
> >> there'd be hbase.rootdir as now -- and then we'd allow specifying
> >> another fs to use for writing WALs (if not specified, we'd just use
> >> hbase.rootdir for all filesystem interactions as now).
> >>
> >> St.Ack
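[Stack's two-filesystem idea might look like the following configuration sketch. The hbase.wal.dir property name here is hypothetical -- it did not exist in HBase at the time of this thread -- and the values are placeholders.]

```xml
<!-- Sketch only: property names and values are illustrative. -->
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://namenode-a:8020/hbase</value>
</property>
<property>
  <!-- Hypothetical: point WALs at a second filesystem (e.g. SSD-backed).
       When unset, everything would fall back to hbase.rootdir as now. -->
  <name>hbase.wal.dir</name>
  <value>hdfs://namenode-b:8020/hbase-wals</value>
</property>
```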
> >>
> >> On Sat, Oct 1, 2011 at 10:56 AM, Dhruba Borthakur <dhruba@gmail.com>
> >> wrote:
> >> > I have been experimenting with the WAL settings too. It is obvious
> >> > that turning off the WAL makes your transactions go faster; HDFS
> >> > write/sync are not yet very optimized for high-throughput small
> >> > writes.
> >> >
> >> > However, irrespective of whether I have one WAL or two, I am seeing
> >> > the same throughput. I have experimented with an HDFS setting that
> >> > allows writing/syncing to multiple replicas in parallel, and that
> >> > has increased performance for my test workload; see
> >> > https://issues.apache.org/jira/browse/HDFS-1783.
> >> >
> >> > About using one WAL or two, it will be nice if we can separate out
> >> > the WAL API elegantly and make it pluggable. In that case, we can
> >> > experiment with HBase on multiple systems. Once we have it
> >> > pluggable, we can make the HBase WAL go to a separate HDFS (pure
> >> > SSD based, maybe?).
> >> >
> >> > -dhruba
> >> >
> >> >
> >> > On Sat, Oct 1, 2011 at 8:09 AM, Akash Ashok <thehellmaker@gmail.com>
> >> wrote:
> >> >
> >> >> Hey,
> >> >> I've seen that setting writeToWAL(false) boosts up the writes like
> >> >> crazy. I was just thinking about having multiple WALs in HBase. I
> >> >> understand that the BigTable paper considers and rejects a WAL per
> >> >> region because it might result in a lot of disk seeks when there
> >> >> are a large number of regions. But how about having as many WALs as
> >> >> the number of hard drives in the system? I see that the recommended
> >> >> configs for HBase are 4 - 12 hard drives per node. This might kick
> >> >> the writes up a notch.
> >> >>
> >> >> I would like to know the general opinion on this one.
> >> >>
> >> >> Cheers,
> >> >> Akash A
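[Akash's one-WAL-per-drive idea above could be sketched as below. This is illustrative, not HBase code: with N WALs, one per data drive, each region's edits go to a fixed WAL chosen by hashing the region name, spreading sequential WAL writes across disks.]

```java
// Sketch: route each region's edits to one of N WALs (e.g. one per
// hard drive, 4-12 per node in the recommended configs).
class WalPicker {
    private final int numWals;

    WalPicker(int numWals) {
        this.numWals = numWals;
    }

    // Math.floorMod keeps the index non-negative for any hash value,
    // so the same region always maps to the same WAL.
    int walIndexFor(String regionName) {
        return Math.floorMod(regionName.hashCode(), numWals);
    }
}
```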
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > Connect to me at http://www.facebook.com/dhruba
> >> >
> >>
> >
>
