incubator-cassandra-dev mailing list archives

From Zhu Han <schumi....@gmail.com>
Subject Re: [DISCUSSION] High-volume counters in Cassandra
Date Mon, 27 Sep 2010 03:32:51 GMT
I have proposed a new way to solve the counter problem in CASSANDRA-1502[1].
Since I do not follow the JIRA updates very closely, I am pasting it here so
that more people can comment on it and we can see whether it is feasible.

"It seems we have not found a solution acceptable to everybody. I will try to
propose a new approach. Let's see whether anybody can shed some light on it
and make it a reality.

1) We add a basic data structure, called a counter, which is a special type
of super column.

2) The name of each column in the counter super column is the host name of
a Cassandra node, and the value is the result calculated by that node.

3) WRITE PATH: Once a node receives an add/dec request for a counter, it
de-serializes its local counter super column and atomically updates the
column named after itself. After that, it propagates the updated column value
to the other replicas, just as the mutation of a normal column is propagated.
Different consistency levels can be supported as before.

4) READ PATH: Depending on the consistency level, contact several replicas,
read back the counter super column as a whole, and get the latest counter
value by summing up all columns in the counter. Read-repair logic can work
as before.

IMHO, the biggest advantage of this approach is that it re-uses as many
mechanisms already in the code as possible, so it might not be so disruptive.
But adding a new Thrift API is inevitable."
NB: If it's feasible, I might not be the right person to work on it, as I have
not touched the internals of Cassandra for more than a year. I want to
contribute something to help us reach consensus.
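
To make the four steps above more concrete, here is a minimal Python sketch of
the idea. It is only an illustration under my own assumptions, not Cassandra
code: the names Column, CounterSuperColumn, add, apply and read are
hypothetical, replicas are reconciled per column by keeping the higher
timestamp (as for a normal column), and the atomicity of the local update is
not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable


@dataclass(frozen=True)
class Column:
    """One sub-column of the counter: a node's running total plus a timestamp."""
    value: int
    timestamp: int


@dataclass
class CounterSuperColumn:
    """Hypothetical counter: maps a node's host name to that node's column."""
    columns: Dict[str, Column] = field(default_factory=dict)

    def add(self, host: str, delta: int, timestamp: int) -> Column:
        """Write path: the receiving node updates only the column named after itself."""
        current = self.columns.get(host, Column(0, 0))
        updated = Column(current.value + delta, timestamp)
        self.columns[host] = updated
        return updated  # this column is what would be propagated to replicas

    def apply(self, host: str, column: Column) -> None:
        """Replica side: keep the newer column, as with a normal mutation."""
        current = self.columns.get(host)
        if current is None or column.timestamp > current.timestamp:
            self.columns[host] = column

    def total(self) -> int:
        """The counter value is the sum of all per-node columns."""
        return sum(c.value for c in self.columns.values())


def read(replicas: Iterable[CounterSuperColumn]) -> int:
    """Read path: reconcile per-host columns across replicas, then sum them."""
    merged = CounterSuperColumn()
    for replica in replicas:
        for host, column in replica.columns.items():
            merged.apply(host, column)
    return merged.total()
```

Because each column is only ever written by the node it is named after, two
replicas never hold conflicting increments for the same column; the newer
column simply replaces the older one, which is why the existing propagation
and read-repair logic might be reusable.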

[1]
https://issues.apache.org/jira/browse/CASSANDRA-1502?focusedCommentId=12915103&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12915103

best regards,
hanzhu


On Sun, Sep 26, 2010 at 9:49 PM, Jonathan Ellis <jbellis@gmail.com> wrote:

> you have misunderstood.  if we continue the 1072 approach of writing
> counter data to the clock field, this is necessarily incompatible with
> the right way of writing counter data to the value field.  it's no
> longer simply a matter of reversing 1070.
>
> On Sat, Sep 25, 2010 at 11:50 PM, Zhu Han <schumi.han@gmail.com> wrote:
> > Jonathan,
> >
> > This is a personal email.
> >
> > On Sun, Sep 26, 2010 at 1:27 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
> >>
> >> On Sat, Sep 25, 2010 at 8:57 PM, Zhu Han <schumi.han@gmail.com> wrote:
> >> > Can we just let the patch be committed but mark it as "alpha" or
> >> > "experimental"?
> >>
> >> I explained exactly why that is not a good approach here:
> >> http://www.mail-archive.com/dev@cassandra.apache.org/msg00917.html
> >>
> > Yes, I see. But the clock structure has been in trunk since Cassandra-1070.
> > We still need to clean it out regardless. We need somebody to volunteer
> > for this work. Considering the complexity of Cassandra-1070, a programmer
> > who has in-depth knowledge of this patch is preferable. And it will take
> > some time to do.
> >
> > Fortunately, Johan Oskarsson has promised to take it on in a comment on
> > Cassandra-1072[1]:
> >
> > "The clock changes would get into trunk quicker if we didn't, avoiding the
> > extra overhead of a big patch during reviews, merge with trunk, code updates
> > and publication of a new patch.
> > If the concern is that we won't attend to the clocks once this patch is in I
> > can promise that we'll look at it straight away."
> >
> > And if twitter/digg/simplegeo fork their own trees of Cassandra, this will
> > give a big marketing opportunity to supporters of other NoSQL systems. As
> > you know, the competition is quite fierce currently.
> >
> > So, instead of staying stuck in this awkward situation, why not switch to
> > another strategy:
> >
> >> "Fork another experimental tree from 0.7 beta 1 and accept
> >> Cassandra-1072. At the same time, start the clean-up work on this tree.
> >> Once it's finalized, merge it back into 0.7, whether that is 0.7.1 or
> >> 0.7.2.
> >>
> >> Hence, the guys from Twitter do not need to maintain a huge
> >> out-of-tree patch, while the quality impact of Cassandra-1072 is still
> >> limited.
> >
> > I do know the pain of maintaining a large patch outside the official tree.
> > Once it gets in, everybody will feel much better.
> >
> > If you give this patch a chance, Johan and others can be highly motivated
> > because the whole community is working together. It's a compromise, but
> > it's worth it.
> >
> > [1]
> >
> https://issues.apache.org/jira/browse/CASSANDRA-1072?focusedCommentId=12909234&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12909234
> >
> >
> >>
> >> --
> >> Jonathan Ellis
> >> Project Chair, Apache Cassandra
> >> co-founder of Riptano, the source for professional Cassandra support
> >> http://riptano.com
> >
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>
