hbase-dev mailing list archives

From Akash Ashok <thehellma...@gmail.com>
Subject Re: Multiple WALs
Date Sun, 02 Oct 2011 05:08:38 GMT
I've opened up a JIRA for this
https://issues.apache.org/jira/browse/HBASE-4529

Cheers,
Akash A

On Sun, Oct 2, 2011 at 6:04 AM, karthik tunga <karthik.tunga@gmail.com> wrote:

> Hey Stack,
>
> Along with the log replaying part, logic is also needed for log roll-over.
> This, I think, is easier than merging the logs: any edit with a sequence
> number lower than the last sequence number persisted to the file system
> can be removed from all the WALs.
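>
> Something like this rough sketch is what I have in mind (walFilesForServer,
> getMaxSeqId and lastFlushedSeqId are made-up names, not real HBase code):
>
>     // lastFlushedSeqId = smallest "last flushed" sequence id across all
>     // regions this server hosts; everything below it is already durable
>     for (Path walFile : walFilesForServer) {
>         long maxSeqIdInFile = getMaxSeqId(walFile); // highest edit seq id in the file
>         if (maxSeqIdInFile < lastFlushedSeqId) {
>             fs.delete(walFile, false); // safe: every edit in it has been flushed
>         }
>     }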
>
> Cheers,
> Karthik
>
> On 1 October 2011 18:05, Jesse Yates <jesse.k.yates@gmail.com> wrote:
>
> > I think adding the abstraction layer and making it not only pluggable
> > but also configurable would be great.
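> >
> > To make it concrete, the abstraction could be as small as something like
> > this (a sketch only; the interface and method names are made up):
> >
> >     public interface WriteAheadLog {
> >         long append(HRegionInfo info, WALEdit edits) throws IOException;
> >         void sync() throws IOException;       // durably persist appended edits
> >         void rollWriter() throws IOException; // start a new log segment
> >         void close() throws IOException;
> >     }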
> >
> > It would be nice to be able to tie into a service that logs directly to
> > disk rather than going through HDFS, giving a potentially big speedup at
> > the cost of having to write a logging service that handles replication,
> > etc. Side note: Accumulo uses its own service to store the WAL rather
> > than HDFS, and I suspect that plays a big role in people's claims that
> > it can 'outperform' HBase.
> >
> > -Jesse Yates
> >
> > On Sat, Oct 1, 2011 at 2:04 PM, Stack <stack@duboce.net> wrote:
> >
> > > Yes.  For sure.  We would need to check that the split can deal with
> > > multiple logs written by the one server concurrently (sort by sequence
> > > edit id after sorting on all the rest that makes up a WAL log key).
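> > >
> > > Roughly, the split merge would sort with a comparator along these lines
> > > (a sketch; assumes HLogKey exposes getters for the region name and the
> > > sequence number):
> > >
> > >     Comparator<HLogKey> cmp = new Comparator<HLogKey>() {
> > >         public int compare(HLogKey a, HLogKey b) {
> > >             int c = Bytes.compareTo(a.getRegionName(), b.getRegionName());
> > >             if (c != 0) return c;
> > >             // same region: order by sequence id so replay sees edits in order
> > >             long d = a.getLogSeqNum() - b.getLogSeqNum();
> > >             return d < 0 ? -1 : (d > 0 ? 1 : 0);
> > >         }
> > >     };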
> > >
> > > St.Ack
> > >
> > > On Sat, Oct 1, 2011 at 1:36 PM, karthik tunga <karthik.tunga@gmail.com>
> > > wrote:
> > > > Hey,
> > > >
> > > > Don't multiple WALs need some kind of merging when recovering from a
> > > > crash?
> > > >
> > > > Cheers,
> > > > Karthik
> > > >
> > > >
> > > > On 1 October 2011 15:17, Stack <stack@duboce.net> wrote:
> > > >
> > > >> +1 on making the WAL pluggable so we can experiment.  Being able to
> > > >> write multiple WALs at once should be easy enough to do (the WAL
> > > >> split code should be able to handle it). Also, a suggestion made a
> > > >> while back was making it so HBase could be configured to write to
> > > >> two filesystems -- there'd be hbase.rootdir as now -- and then we'd
> > > >> allow specifying another fs to use for writing WALs (if not
> > > >> specified, we'd just use hbase.rootdir for all filesystem
> > > >> interactions as now).
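> > > >>
> > > >> E.g. something like this (hbase.wal.dir is a made-up property name,
> > > >> just to sketch the idea):
> > > >>
> > > >>     Configuration conf = HBaseConfiguration.create();
> > > >>     conf.set("hbase.rootdir", "hdfs://namenode:8020/hbase");
> > > >>     // hypothetical property: a separate filesystem just for WALs
> > > >>     conf.set("hbase.wal.dir", "hdfs://walfs:8020/hbase-wal");
> > > >>     // fall back to hbase.rootdir when no WAL filesystem is configured
> > > >>     String walDir = conf.get("hbase.wal.dir", conf.get("hbase.rootdir"));
> > > >>     FileSystem walFs = new Path(walDir).getFileSystem(conf);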
> > > >>
> > > >> St.Ack
> > > >>
> > > >> On Sat, Oct 1, 2011 at 10:56 AM, Dhruba Borthakur <dhruba@gmail.com>
> > > >> wrote:
> > > >> > I have been experimenting with the WAL settings too. It is obvious
> > > >> > that turning off the WAL makes your transactions go faster; HDFS
> > > >> > write/sync are not yet very optimized for high-throughput small
> > > >> > writes.
> > > >> >
> > > >> > However, irrespective of whether I have one WAL or two, I see the
> > > >> > same throughput. I have experimented with an HDFS setting that
> > > >> > allows writing/syncing to multiple replicas in parallel, and that
> > > >> > has increased performance for my test workload; see
> > > >> > https://issues.apache.org/jira/browse/HDFS-1783.
> > > >> >
> > > >> > About using one WAL or two: it would be nice if we could separate
> > > >> > out the WAL API elegantly and make it pluggable. In that case, we
> > > >> > could experiment with running HBase on top of multiple systems.
> > > >> > Once we have it pluggable, we could make the HBase WAL go to a
> > > >> > separate HDFS (pure SSD-based, maybe?).
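> > > >> >
> > > >> > Picking the implementation could then follow the usual Hadoop
> > > >> > reflection pattern (the class and property names here are
> > > >> > hypothetical, building on the sketched interface above):
> > > >> >
> > > >> >     Class<? extends WriteAheadLog> walClass = conf.getClass(
> > > >> >         "hbase.regionserver.wal.impl", // hypothetical config key
> > > >> >         HdfsWal.class,                 // default: the HDFS-backed WAL
> > > >> >         WriteAheadLog.class);
> > > >> >     WriteAheadLog wal = ReflectionUtils.newInstance(walClass, conf);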
> > > >> >
> > > >> > -dhruba
> > > >> >
> > > >> >
> > > >> > On Sat, Oct 1, 2011 at 8:09 AM, Akash Ashok <thehellmaker@gmail.com>
> > > >> > wrote:
> > > >> >
> > > >> >> Hey,
> > > >> >> I've seen that setting writeToWAL(false) boosts the writes like
> > > >> >> crazy.
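> > > >> >>
> > > >> >> That is just the per-Put client flag, e.g.:
> > > >> >>
> > > >> >>     HTable table = new HTable(conf, "t1");
> > > >> >>     Put put = new Put(Bytes.toBytes("row1"));
> > > >> >>     put.add(Bytes.toBytes("f1"), Bytes.toBytes("q1"), Bytes.toBytes("v1"));
> > > >> >>     put.setWriteToWAL(false); // big write speedup, but edits are lost on a crash
> > > >> >>     table.put(put);
> > > >> >>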
> > > >> >> I was just thinking about having multiple WALs in HBase. I
> > > >> >> understand that this is a consideration in the BigTable paper: a
> > > >> >> WAL per region is not used because it might result in a lot of
> > > >> >> disk seeks when there are a large number of regions. But how about
> > > >> >> having as many WALs as the number of hard drives in the system?
> > > >> >> I see that the recommended configs for HBase are 4 - 12 hard
> > > >> >> drives per node. This might kick the writes up a notch.
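> > > >> >>
> > > >> >> The routing could be as simple as hashing a region onto one of N
> > > >> >> WALs, one per disk (a sketch, reusing the hypothetical interface
> > > >> >> sketched earlier in the thread):
> > > >> >>
> > > >> >>     // one WAL per data disk; pin a region to a WAL so its edits stay ordered
> > > >> >>     WriteAheadLog[] wals = new WriteAheadLog[numDisks];
> > > >> >>     int i = (regionInfo.hashCode() & Integer.MAX_VALUE) % wals.length;
> > > >> >>     wals[i].append(regionInfo, edits);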
> > > >> >>
> > > >> >> Would like to know the general opinion on this one?
> > > >> >>
> > > >> >> Cheers,
> > > >> >> Akash A
> > > >> >>
> > > >> >
> > > >> >
> > > >> >
> > > >> > --
> > > >> > Connect to me at http://www.facebook.com/dhruba
> > > >> >
> > > >>
> > > >
> > >
> >
>
