cassandra-dev mailing list archives

From DuyHai Doan <doanduy...@gmail.com>
Subject Re: State of triggers
Date Sun, 05 Mar 2017 08:25:03 GMT
No problem. Distributed systems are hard to reason about; I got caught many
times in the past.

On Sun, Mar 5, 2017 at 9:23 AM, benjamin roth <brstgt@gmail.com> wrote:

> Sorry. Answer was too fast. Maybe you are right.
>
> On 05.03.2017 09:21, "benjamin roth" <brstgt@gmail.com> wrote:
>
> > No. You just change the partitioner. That's all
> >
> > On 05.03.2017 09:15, "DuyHai Doan" <doanduyhai@gmail.com> wrote:
> >
> >> "How can that be achieved? I haven't done "scientific researches" yet but
> >> I guess a "MV partitioner" could do the trick. Instead of applying the
> >> regular partitioner, an MV partitioner would calculate the PK of the base
> >> table (which is always possible) and then apply the regular partitioner."
> >>
> >> The main purpose of MV is to avoid the drawbacks of 2nd index
> >> architecture, e.g. to scan a lot of nodes to fetch the results.
> >>
> >> With MV, since you give the partition key, the guarantee is that you'll
> >> hit a single node.
> >>
> >> Now if you put MV data on the same node as base table data, you're doing
> >> more-or-less the same thing as 2nd index.
> >>
> >> Let's take a dead simple example
> >>
> >> CREATE TABLE user (user_id uuid PRIMARY KEY, email text);
> >> CREATE MATERIALIZED VIEW user_by_email AS SELECT * FROM user
> >> WHERE user_id IS NOT NULL AND email IS NOT NULL
> >> PRIMARY KEY ((email), user_id);
> >>
> >> SELECT * FROM user_by_email WHERE email = 'xxx';
> >>
> >> With this query, how can you find the user_id that corresponds to email
> >> 'xxx' so that your MV partitioner idea can work ?
> >>
> >>
> >>
> >> On Sun, Mar 5, 2017 at 9:05 AM, benjamin roth <brstgt@gmail.com> wrote:
> >>
> >> > While I was reading the MV paragraph in your post, an idea popped up:
> >> >
> >> > The problem with MV inconsistencies and inconsistent range movement is
> >> > that the "MV contract" is broken. This only happens because base data and
> >> > replica data reside on different hosts. If base data + replicas stayed
> >> > on the same host, then a rebuild/remove would always stream both matching
> >> > parts of a base table + MV.
> >> >
> >> > So my idea:
> >> > Why not make a replica ALWAYS stay local, regardless of where the token
> >> > of an MV would point? That would solve these problems:
> >> > 1. Rebuild / remove node would not break the MV contract
> >> > 2. A write always stays local:
> >> >
> >> > a) That means replication happens synchronously, so a quorum write to the
> >> > base table guarantees instant data availability with a quorum read on a
> >> > view.
> >> >
> >> > b) It saves network roundtrips + request/response handling and helps to
> >> > keep a cluster healthier in case of bulk operations (like repair streams
> >> > or rebuild streams). Write load stays local and is not spread across the
> >> > whole cluster. I think it makes the load in these situations more
> >> > predictable.
> >> >
> >> > How can that be achieved? I haven't done "scientific researches" yet but
> >> > I guess a "MV partitioner" could do the trick. Instead of applying the
> >> > regular partitioner, an MV partitioner would calculate the PK of the base
> >> > table (which is always possible) and then apply the regular partitioner.
> >> >
> >> > I'll create a proper Jira for it on Monday. Currently it's Sunday here
> >> > and my family wants me back, so just a few thoughts on this right now.
> >> >
> >> > Any feedback is appreciated!
> >> >
> >> > 2017-03-05 6:34 GMT+01:00 Edward Capriolo <edlinuxguru@gmail.com>:
> >> >
> >> > > On Sat, Mar 4, 2017 at 10:26 AM, Jeff Jirsa <jjirsa@gmail.com> wrote:
> >> > >
> >> > > >
> >> > > > > On Mar 4, 2017, at 7:06 AM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
> >> > > > >
> >> > > > >> On Fri, Mar 3, 2017 at 12:04 PM, Jeff Jirsa <jjirsa@gmail.com> wrote:
> >> > > > >>
> >> > > > >> On Fri, Mar 3, 2017 at 5:40 AM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
> >> > > > >>
> >> > > > >>>
> >> > > > >>> I used them. I built do-it-yourself secondary indexes with them.
> >> > > > >>> They have their gotchas, but so do all the secondary index
> >> > > > >>> implementations. Just because datastax does not write about something.
> >> > > > >>> Let's see, like 5 years ago there was this:
> >> > > > >>> https://github.com/hmsonline/cassandra-triggers
> >> > > > >>>
> >> > > > >>>
> >> > > > >> Still in use? How'd it work? Production ready? Would you still do it
> >> > > > >> that way in 2017?
> >> > > > >>
> >> > > > >>
> >> > > > >>> There is a fairly large divergence between what actual users do and
> >> > > > >>> what other groups 'say' actual users do in some cases.
> >> > > > >>>
> >> > > > >>
> >> > > > >> A lot of people don't share what they're doing (for business reasons,
> >> > > > >> or because they don't think it's important, or because they don't
> >> > > > >> know how/where), and that's fine but it makes it hard for anyone to
> >> > > > >> know what features are used, or how well they're really working in
> >> > > > >> production.
> >> > > > >>
> >> > > > >> I've seen a handful of "how do we use triggers" questions in IRC, and
> >> > > > >> they weren't unreasonable questions, but seemed like a lot of pain,
> >> > > > >> and more than one of those people ultimately came back and said they
> >> > > > >> used some other mechanism (and of course, some of them silently
> >> > > > >> disappear, so we have no idea if it worked or not).
> >> > > > >>
> >> > > > >> If anyone's actively using triggers, please don't keep it a secret.
> >> > > > >> Knowing that they're being used would be a great way to justify
> >> > > > >> continuing to maintain them.
> >> > > > >>
> >> > > > >> - Jeff
> >> > > > >>
> >> > > > >
> >> > > > > "Still in use? How'd it work? Production ready? Would you still do it
> >> > > > > that way in 2017?"
> >> > > > >
> >> > > > > I mean that is a loaded question. How long has cassandra had Secondary
> >> > > > > Indexes? Did they work well? Would you use them? How many times were
> >> > > > > they re-written?
> >> > > >
> >> > > > It wasn't really meant to be a loaded question; I was being sincere
> >> > > >
> >> > > > But I'll answer: secondary indexes suck for many use cases, but they're
> >> > > > invaluable for their actual intended purpose, and I have no idea how
> >> > > > many times they've been rewritten but they're production ready for
> >> > > > their narrow use case (defined by cardinality).
> >> > > >
> >> > > > Is there a real triggers use case still? Alternative to MVs? Alternative
> >> > > > to CDC? I've never implemented triggers - since you have, what's the
> >> > > > level of surprise for the developer?
> >> > >
> >> > >
> >> > > :) You mention alternatives: Let's break them down.
> >> > >
> >> > > MV:
> >> > > They seem to have a lot of promise. I.e. you can use them for things
> >> > > other than equality searches, and I do think the CQL example with the top
> >> > > N high scores is pretty useful. Then again our buddy Mr Roth has a thread
> >> > > named "Rebuild / remove node with MV is inconsistent". I actually think a
> >> > > lot of the use case for MV falls into the category of "something you
> >> > > should actually be doing with storm". I can vibe with the concept of not
> >> > > needing a streaming platform, but I KNOW storm would do this correctly. I
> >> > > don't want to land on something like 2x index v1 v2 where there were
> >> > > fundamental flaws at scale. (Not saying this is the case, but the rebuild
> >> > > thing seems a bit scary.)
> >> > >
> >> > > CDC:
> >> > > I am slightly afraid of this. Rationale: an extensible piece designed
> >> > > specifically for a closed-source implementation of hub and spoke
> >> > > replication. I have some experience trying to "play along" with
> >> > > extensible things:
> >> > > https://issues.apache.org/jira/browse/CASSANDRA-12627
> >> > > "Thus, I'm -1 on {[PropertyOrEnvironmentSeedProvider}}."
> >> > >
> >> > > Not a rub, but I can't even get something committed using an existing
> >> > > extensible interface. Heaven forbid a use case I have would want to
> >> > > *change* the interface, I would probably get a -12. So I have no desire to
> >> > > try and maintain a CDC implementation. I see myself falling into the same
> >> > > old "why you want to do this? -1" trap.
> >> > >
> >> > > Coordinator Triggers:
> >> > > To bring things back really old-school: coordinator triggers everyone
> >> > > always wanted. In a nutshell, I DO believe they are easier to reason about
> >> > > than MV. It is pretty basic, it happens on the coordinator, there are no
> >> > > batchlogs or whatever, best effort possibly requiring more nodes then as
> >> > > the keys might be on different services. Actually I tend to like features
> >> > > like this. Once something comes on the downswing of the "software hype
> >> > > cycle" you know it is pretty stable as everyone's all excited about other
> >> > > things.
> >> > >
> >> > > As I said, I know I can use storm for top-n, so what is this feature?
> >> > > Well, I want to optimize my network transfer generally by building my
> >> > > batch mutations on the server. Seems reasonable. Maybe I want to have my
> >> > > own little "read before write" thing like CQL lists.
> >> > >
> >> > > The warts, having tried it. First time I tried it I found it did not work
> >> > > with non batches, patched in 3 hours. Took weeks before some CQL user had
> >> > > the same problem and it got fixed :) There was no dynamic stuff at the
> >> > > time so it was BYO class loader. Going against the grain and saying.
> >> > >
> >> > > The thing you have to realize with the best effort coordinator triggers
> >> > > is that a "transaction" could be incomplete and well that sucks maybe for
> >> > > some cases. But I actually felt the 2x index implementations force all
> >> > > problems into a type of "foreign key transactional integrity" that does
> >> > > not make sense for cassandra.
> >> > >
> >> > > Have you ever used elastic search? Their version of consistency is: write
> >> > > something, keep reading and eventually you see it. Wildly popular :) It
> >> > > is a crazy world.
> >> > >
> >> >
> >>
> >
>
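
The objection to the "MV partitioner" idea quoted above can be illustrated with a small simulation. This is only a sketch under loud assumptions: the four-node cluster, the toy hash partitioner `token`, and the helpers `build_view`, `read_regular` and `read_colocated` are hypothetical names made up for illustration, not Cassandra APIs, and replication is not modeled at all. It places the `user_by_email` view rows either by the view's own partition key (email) or by the base table's key (user_id), then counts how many nodes a lookup by email has to contact.

```python
import hashlib
import uuid

NODES = 4  # hypothetical cluster size, no replication modeled


def token(key: str) -> int:
    """Toy partitioner (assumption): hash a partition key onto one node."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NODES


def build_view(users, place_by):
    """Store view rows (email -> user_ids) on the node chosen by place_by."""
    nodes = [dict() for _ in range(NODES)]
    for user_id, email in users.items():
        nodes[place_by(user_id, email)].setdefault(email, []).append(user_id)
    return nodes


def read_regular(nodes, email):
    """Regular MV: the email in the query is enough to pick the node."""
    return nodes[token(email)].get(email, []), 1


def read_colocated(nodes, email):
    """Co-located MV: user_id is unknown at read time, so ask every node."""
    found = []
    for node in nodes:
        found.extend(node.get(email, []))
    return found, len(nodes)


# 100 made-up users with unique emails.
users = {str(uuid.UUID(int=i)): f"user{i}@example.com" for i in range(100)}

# View rows placed by the view's partition key (email).
regular = build_view(users, lambda user_id, email: token(email))
# View rows placed by the base table's partition key (user_id).
colocated = build_view(users, lambda user_id, email: token(user_id))

target_id, target_email = next(iter(users.items()))
ids_regular, contacted_regular = read_regular(regular, target_email)
ids_colocated, contacted_colocated = read_colocated(colocated, target_email)

print("regular MV:   ", ids_regular, "nodes contacted:", contacted_regular)
print("co-located MV:", ids_colocated, "nodes contacted:", contacted_colocated)
```

Both layouts return the same user_id, but the co-located layout only finds it by asking every node, which is the secondary-index read pattern the MV was meant to avoid.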
