hbase-dev mailing list archives

From Vladimir Rodionov <vladrodio...@gmail.com>
Subject Re: [Shadow Regions / Read Replicas ]
Date Tue, 03 Dec 2013 20:31:39 GMT
The downside:

- Double/Triple memstore usage
- Increased block cache usage (effectively, the block cache will have 50%
of its capacity, maybe less)

These downsides are pretty serious ones. This will result in:

1. Decreased overall performance, due to a smaller effective block cache
size.
2. More frequent memstore flushes, which will affect compaction and
write throughput.
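To make downside 1 concrete, here is a back-of-the-envelope sketch (my own
illustration, not from the design doc) of why caching each region's blocks
on N replica region servers shrinks the effective block cache roughly by a
factor of N:

```java
public class EffectiveCache {
    // With N copies of each region open, the same hot blocks can end up
    // cached on N region servers, so the amount of *distinct* data the
    // aggregate cache can hold shrinks by roughly a factor of N.
    static long effectiveCapacity(long totalCacheBytes, int replicaCount) {
        return totalCacheBytes / replicaCount;
    }

    public static void main(String[] args) {
        long total = 100L * 1024 * 1024 * 1024; // 100 GB aggregate block cache
        System.out.println(effectiveCapacity(total, 2)); // 2 replicas: ~50% left
        System.out.println(effectiveCapacity(total, 3)); // 3 replicas: ~33% left
    }
}
```

In practice the loss is not exactly 1/N (replicas may cache disjoint working
sets), but for a uniformly hot table this is the right order of magnitude.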

I do not believe that HBase's 'large' MTTR prevents meeting a 99% SLA
of 10-20ms, unless your RSs go down 2-3 times a day for several minutes each
time. You have to first analyze why you are having such frequent failures,
then fix the root cause of the problem. It is possible to reduce the
'detection' phase of MTTR to a couple of seconds, either by using an external
beacon process (as I suggested already) or by rewriting some code inside
HBase and the NameNode to move all data out of the Java heap to off-heap
storage, reducing GC-induced timeouts from 30 sec to 1-2 sec max. It's tough,
but doable. The result: you will decrease MTTR by at least 50% without
sacrificing overall cluster performance.
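As a sketch of what I mean by an external beacon process (all names here are
hypothetical; this is not existing HBase code): a lightweight out-of-band
monitor tracks heartbeats from each RS and suspects a failure after a short
timeout of a couple of seconds, instead of waiting on a GC-inflated 30 sec
ZooKeeper session timeout:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of an out-of-band failure detector that a beacon
// process could run to cut the 'detection' phase of MTTR to a couple
// of seconds.
public class BeaconMonitor {
    private final long timeoutMillis;
    private final Map<String, Long> lastBeat = new ConcurrentHashMap<>();

    public BeaconMonitor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called each time a heartbeat arrives from a region server.
    public void recordHeartbeat(String regionServer, long nowMillis) {
        lastBeat.put(regionServer, nowMillis);
    }

    // A server is suspected dead once its last beat is older than the
    // (short, e.g. 2s) timeout, or it has never reported at all;
    // recovery can then be kicked off immediately.
    public boolean isSuspectedDead(String regionServer, long nowMillis) {
        Long last = lastBeat.get(regionServer);
        return last == null || nowMillis - last > timeoutMillis;
    }
}
```

The point is only that detection moves out of the GC-sensitive JVM path; the
actual recovery would still be HBase's normal region reassignment.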

I think it is the large RS and NN heaps and frequent stop-the-world GC
activity that prevent meeting a strict SLA - not occasional server failures.



On Tue, Dec 3, 2013 at 11:51 AM, Jonathan Hsieh <jon@cloudera.com> wrote:

> To keep the discussion focused on the design goals, I'm going to start
> referring to enis and deveraj's eventually consistent read replicas as the
> *read replica* design, and the consistent fast read recovery mechanism
> based on shadowing/tailing the WALs as *shadow regions* or *shadow
> memstores*.  Can we agree on nomenclature?
>
>
> On Tue, Dec 3, 2013 at 11:07 AM, Enis Söztutar <enis@apache.org> wrote:
>
> > Thanks Jon for bringing this to dev@.
> >
> >
> > On Mon, Dec 2, 2013 at 10:01 PM, Jonathan Hsieh <jon@cloudera.com>
> wrote:
> >
> > > Fundamentally, I'd prefer focusing on making HBase "HBasier" instead of
> > > tackling a feature that other systems can architecturally do better
> > > (inconsistent reads). I consider consistent reads/writes to be one of
> > > HBase's defining features. That said, I think read replicas make sense
> > > and are a nice feature to have.
> > >
> >
> > Our design proposal has a specific use case goal, and hopefully we can
> > demonstrate the benefits of having this in HBase so that even more
> > pieces can be built on top of it. Plus, I imagine this will be a widely
> > used feature for read-only tables or bulk loaded tables. We are not
> > proposing to rework strong consistency semantics or make major
> > architectural changes. I think having tables defined with a replication
> > count, together with the proposed client API changes (the Consistency
> > definition), plugs into the HBase model rather well.
> >
> >
> I do think that without any recent-updates mechanism, we are limiting the
> usefulness of this feature to essentially *only* read-only or
> bulk-load-only tables. Recency, if there were any edits/updates, would be
> severely lagging (by default potentially an hour), especially in cases
> where there are only a few edits to a primarily bulk loaded table. This
> limitation is not mentioned in the tradeoffs or requirements; it (or a
> non-requirements section) definitely should be listed there.
>
> With the current design, it might be best to have a flag on the table
> that marks it read-only or bulk-load-only, so that the feature only gets
> used when the table is in that mode (and maybe an "escape hatch" for
> power users).
>
> [snip]
> >
> > > - I think the two goals are both worthy on their own, each with its
> > > own optimal point. We should make sure the design can support both
> > > goals.
> > >
> >
> > I think our proposal is consistent with your doc, and we have
> > considered secondary region promotion in the future section. It would
> > be good if you could review and comment on whether you see any points
> > missing.
> >
> >
> I definitely will. At the moment, I think the hybrid for the WALs/HLogs I
> suggested in the other thread seems to be an optimal solution considering
> locality, though it is obviously more complex than just one approach
> alone.
>
>
> > > - I want to make sure the proposed design has a path for optimal
> > > fast, consistent read recovery.
> > >
> >
> > We think that it is, but it is a secondary goal for the initial work. I
> > don't see any reason why secondary promotion cannot be built on top of
> > this, once the branch is in a better state.
> >
>
> Based on the detail in the design doc and this statement it sounds like you
> have a prototype branch already?  Is this the case?
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // jon@cloudera.com
>
