aurora-dev mailing list archives

From Joshua Cohen <jco...@apache.org>
Subject Re: Future of storage in Aurora
Date Tue, 03 Oct 2017 14:45:36 GMT
What does this mean in terms of the original goals behind the storage
system refactor? Are we confident that Jordan's work for hot-followers will
alleviate the problems w/ long failovers? I'm definitely in favor of
killing the H2 code if its goals can never be realized and it's just a
maintenance burden, but I'd also like to know what our plans are for
storage in the future.

Also, what does this mean for stores that have never existed as non-H2
(i.e. the job update store)? Will converting it have an impact on, e.g.,
storage write-lock contention?

On Sun, Oct 1, 2017 at 5:59 PM, Bill Farner <wfarner@apache.org> wrote:

> I would like to revive this discussion in light of some work I have been
> doing around the storage system.  The DB storage system will require a lot
> of additional effort to deliver the beneficial outcomes I laid out above,
> and I agree that we should cut our losses.
>
> I plan to submit patches soon that introduce non-H2 in-memory store
> implementations.  *If anyone disagrees with removing the H2 implementations
> as well, please chime in here.*
>
> Disclaimer - I may propose an alternative for the persistent storage in the
> near future.
>
> On Mon, Apr 3, 2017 at 9:40 AM, Stephan Erb <serb@apache.org> wrote:
>
> > H2 could give us fine-grained data access. However, most of our code
> > performs massive joins to reconstruct fully hydrated thrift objects.
> > Most of the time we are then only interested in very few properties of
> > those thrift structs. This applies to internal usage, but also to how we
> > use the API.
> >
> > I therefore believe we have to improve and refine our domain model in
> > order to significantly improve the storage situation.
> >
> > I really liked Maxim's proposal from last year, and I think it is worth
> > reconsidering:
> > https://docs.google.com/document/d/1myYX3yuofGr8JIzud98xXd5mqgpZ8q_RqKBpSff4-WE/edit
> >
> > Best regards,
> > Stephan
> >
> > On Thu, 2017-03-30 at 15:53 -0700, David McLaughlin wrote:
> > > So it sounds like before we make any decisions around removing the work
> > > done in H2 so far, we should figure out what is remaining to move to
> > > external storage (or if it's even still a goal).
> > >
> > > I may still play around with reviving the in-memory stores, but will
> > > separate that work from any goal to remove the H2 layer. Since it's
> > > motivated by performance, I'd verify there is a benefit before submitting
> > > any review.
> > >
> > > Thanks all for the feedback.
> > >
> > >
> > > On Thu, Mar 30, 2017 at 12:08 PM, Bill Farner <wfarnerapache@gmail.com>
> > > wrote:
> > >
> > > > Adding some background - there were several motivators to using SQL that
> > > > come to mind:
> > > > a) well-understood transaction isolation guarantees leading to a simpler
> > > > programming model w.r.t. concurrency
> > > > b) ability to offload storage to a separate system (e.g. Postgres) and
> > > > scale it separately
> > > > c) relief of computational burden of performing snapshots and backups due
> > > > to (b)
> > > > d) simpler code and operations model due to (b)
> > > > e) schema backwards compatibility guarantees due to persistence-friendly
> > > > migration-scripts
> > > > f) straightforward normalization to facilitate sharing of
> > > > otherwise-redundant state (i.e. TaskConfig)
> > > >
> > > > The storage overhaul comes with a huge caveat requiring the approach to
> > > > scheduling rounds to change. I concur that the current model is hostile to
> > > > offloaded storage, as ~all state must be read every scheduling round. If
> > > > that cannot be worked around with lazy state or best-effort concurrency
> > > > (i.e. in-memory caching), the approach is indeed flawed.
> > > >
> > > > On Mar 30, 2017, 10:29 AM -0700, Joshua Cohen <jcohen@apache.org>,
> > > > wrote:
> > > > > My understanding of the H2-backed stores is that at least part of the
> > > > > original rationale behind them was that they were meant to be an interim
> > > > > point on the way to external SQL-backed stores which should theoretically
> > > > > provide significant benefits w.r.t. GC (obviously unproven, especially
> > > > > at scale).
> > > > >
> > > > > I don't disagree that the H2 stores themselves are problematic (to say
> > > > > the least); do we have evidence that returning to memory-based stores
> > > > > will be an improvement on that?
> > > > >
> > > > > On Thu, Mar 30, 2017 at 12:16 PM, David McLaughlin
> > > > > <dmclaughlin@apache.org> wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I'd like to start a discussion around storage in Aurora.
> > > > > >
> > > > > > I think one of the biggest mistakes we made in migrating our storage
> > > > > > to H2 was deleting the memory stores as we moved. We made a pretty
> > > > > > big bet that we could eventually make H2/relational databases work.
> > > > > > I don't think that bet has paid off, and I think we need to revisit
> > > > > > the direction we're taking.
> > > > > >
> > > > > > My belief is that the current H2/MyBatis approach is untenable for
> > > > > > large production clusters, at least without changing our current
> > > > > > single-master architecture. At Twitter we are already having to fight
> > > > > > to keep GC manageable even without DbTaskStore enabled, so I don't see
> > > > > > a path forward where we could eventually enable that. So far
> > > > > > experiments with H2 off-heap storage have provided marginal (if any)
> > > > > > gains.
> > > > > >
> > > > > > Would anyone object to restoring the in-memory stores and creating new
> > > > > > implementations for the missing ones (UpdateStore)? I'd even go
> > > > > > further and propose that we consider in-memory H2 and MyBatis a failed
> > > > > > experiment and we drop that storage layer completely.
> > > > > >
> > > > > > Cheers,
> > > > > > David
> > > > > >
> >
>
