hbase-dev mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: HBase region server failure issues
Date Tue, 15 Apr 2014 16:29:37 GMT
You'd probably know better than I, Kevin, but I'd worry about the
1000*1000*32 case, where HDFS is as (over)committed as the HBase tier.
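
(For reference, a rough back-of-the-envelope sketch of the file counts under
discussion. One WAL file per region per namespace is just the assumption from
Kevin's example below, not anything measured.)

  // Illustrative arithmetic only; the numbers come from the scenarios in this thread.
  public class WalFileCountSketch {
      static long walFiles(long nodes, long regionsPerNode, long walsPerRegion) {
          return nodes * regionsPerNode * walsPerRegion;
      }

      public static void main(String[] args) {
          System.out.println(walFiles(1000, 100, 32));   // Kevin's case: 3,200,000 WAL files
          System.out.println(walFiles(1000, 1000, 32));  // the 1000*1000*32 case: 32,000,000 WAL files
      }
  }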


On Tue, Apr 15, 2014 at 9:26 AM, Kevin O'dell <kevin.odell@cloudera.com> wrote:

> In general I have never seen nor heard of Federated Namespaces in the wild,
> so I would be hesitant to go down that path.  But you know, for "Science", I
> would be interested in seeing how that worked out.  Would we be looking at
> 32 WALs per region?  On a large cluster with 1000 nodes, 100 regions per
> node, and a WAL per region (I like easy math):
>
> 1000*100*32 = 3.2 million files for WALs.  This is not ideal, but it is not
> horrible if we are using 128MB block sizes etc.
>
> I feel like I am missing something above though.  Thoughts?
>
>
> On Tue, Apr 15, 2014 at 12:20 PM, Andrew Purtell <apurtell@apache.org> wrote:
>
> > # of WALs as roughly spindles / replication factor seems intuitive. Would
> > be interesting to benchmark.
> >
> > As for one WAL per region, the BigTable paper IIRC says they didn't
> > because of concerns about the number of seeks in the filesystems
> > underlying GFS and because it would reduce the effectiveness of the group
> > commit throughput optimization. If WALs are backed by SSD, certainly the
> > first consideration no longer holds. We also had a global HDFS file limit
> > to contend with. I know HDFS is incrementally improving the scalability
> > of a namespace, but this is still an active consideration. (Or we could
> > try partitioning a deploy over a federated namespace? Could be
> > "interesting". Has anyone tried that? I haven't heard.)
> >
> >
> >
> > On Tue, Apr 15, 2014 at 7:11 AM, Jonathan Hsieh <jon@cloudera.com> wrote:
> >
> > > It makes sense to have as many WALs as # of spindles / replication
> > > factor per machine.  This should be decoupled from the number of
> > > regions on a region server.  So for a cluster with 12 spindles we
> > > should likely have at least 4 WALs (12 spindles / 3 replication
> > > factor), and need to do experiments to see if going to 8 or some
> > > higher number makes sense (the new WAL uses a disruptor pattern which
> > > avoids much contention on individual writes).  So with your example,
> > > your 1000 regions would get sharded into the 4 WALs, which would
> > > maximize I/O throughput and disk utilization, and reduce time for
> > > recovery in the face of failure.
> > >
> > > In the case of an SSD world, it makes more sense to have one WAL per
> > > node once we have decent HSM support in HDFS.  The key win here will
> > > be in recovery time -- if any RS goes down we only have to replay a
> > > region's edits and not have to split or demux different regions' edits.
> > >
> > > Jon.
> > >
> > >
> > > On Mon, Apr 14, 2014 at 10:37 PM, Vladimir Rodionov <vladrodionov@gmail.com> wrote:
> > >
> > > > Todd, how about 300 regions with 3x replication?  Or 1000 regions?
> > > > This is going to be 3000 files on HDFS, per RS.  When I said that it
> > > > does not scale, I meant exactly that.
> > > >
> > >
> > >
> > >
> > > --
> > > // Jonathan Hsieh (shay)
> > > // HBase Tech Lead, Software Engineer, Cloudera
> > > // jon@cloudera.com // @jmhsieh
> > >
> >
> >
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
> >
>
>
>
> --
> Kevin O'Dell
> Systems Engineer, Cloudera
>
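
(For the curious, below is a minimal sketch of the kind of region-to-WAL
grouping Jon describes above: roughly spindles / replication factor WALs per
region server, with regions hashed across them so each region's edits always
land in the same WAL. The class and method names are hypothetical, made up
for illustration; this is not HBase code.)

  // Hypothetical sketch: shard regions across a small fixed pool of WALs,
  // e.g. 12 spindles / 3x replication = 4 WALs per region server.
  public class WalGroupingSketch {
      private final int numWals;

      WalGroupingSketch(int spindles, int replicationFactor) {
          this.numWals = Math.max(1, spindles / replicationFactor);
      }

      // A region always maps to the same WAL, so group commit still batches
      // writes within each WAL while spreading load across spindles.
      int walIndexFor(String encodedRegionName) {
          return Math.floorMod(encodedRegionName.hashCode(), numWals);
      }

      public static void main(String[] args) {
          WalGroupingSketch grouping = new WalGroupingSketch(12, 3);
          System.out.println(grouping.walIndexFor("0a1b2c3d4e"));  // an index in 0..3
      }
  }

A bounded pool like this keeps the file count per RS small, and on failure
each WAL still has to be split by region, whereas one WAL per region would
avoid the split entirely at the cost of many more open files.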



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
