cassandra-dev mailing list archives

From Pedro Gordo <pedro.gordo1...@gmail.com>
Subject Re: New contribution - Burst Hour Compaction Strategy
Date Fri, 09 Jun 2017 09:44:53 GMT
Hi Stefan

Thanks for pointing these out. So far, I've only worked collaboratively
with SVN, so I wasn't sure how best to address this part of the development
with Git. I'll create a document explaining what I've done, hopefully by
the end of this week, so that people can at least discuss the strategy
while I work out how to address Git history. Would it be best to post
updates to this thread, or should I continue with comments in the JIRA
ticket?

I initially thought about developing in v3.0, but then I read here
<http://cassandra.apache.org/doc/latest/development/patches.html?highlight=tick%20tock>
that only bug fixes would be accepted for 3.0, and that 3.11 was just a
stability release. I guess I'll need to redo a lot of commits, so what would
be the best branch to do them on?

Best regards

Pedro Gordo

On 9 June 2017 at 09:10, Stefan Podkowinski <spod@apache.org> wrote:

> Hello Pedro
>
> Thanks for being interested in contributing to Apache Cassandra.
> Creating a new compaction strategy is not an easy task and there are
> several things you can do to make it more obvious for other developers
> to understand what you're up to.
>
> First of all, if using GitHub, changes to the code base should be done
> by having a separate branch in your own fork of the Apache repository.
> This will make it possible for others to quickly compare your changes to
> the current code base using the web interface. Technically using a new
> repo works as well, but isn't as convenient for others, e.g. it starts
> by not communicating which Cassandra branch was used as the basis for
> your changes.
>
> Talking about Git, I'd also suggest learning more about creating a Git
> history for your code that is easy to review. E.g. you may want to
> squash some of the "code clean up" style commits.
>
> As mentioned, implementing a new compaction strategy is quite an effort
> and the theories and motivations behind it are at least as interesting
> as the actual implementation. Therefore it could be a good idea to have a
> design document describing your work on a different abstraction level.
> It will also make it more likely to get other people involved in the
> discussion, as not everyone will have to check the source code for the
> details.
>
> -Stefan
>
>
> On 08.06.2017 09:31, Pedro Gordo wrote:
> > Hi all
> >
> > As part of my MSc project, I've done a new compaction strategy for
> > Cassandra, called Burst Hour Compaction Strategy. You can find the JIRA
> > ticket here: https://issues.apache.org/jira/browse/CASSANDRA-12201
> >
> > In a nutshell, the background compaction for this strategy is only
> > triggered during a predefined interval, freeing the resources during
> > other times of the day. It also tries to make keys unique across all the
> > SSTables when these keys are present in more than a configurable
> > number of tables. Please check the JIRA ticket for a full description.
> >
> > The code can be found here: https://github.com/sedulam/CASSANDRA-12201
> >
> > Please let me know what you think, or improvements that can be done (some
> > ideas are in the ticket description). Since I'm new to Cassandra, I
> > imagine that a lot of assumptions might not be the best, e.g. 100MB for
> > the maximum table size.
> >
> > I'm looking forward to working with this community!
> >
> > All the best
> > Pedro Gordo
> >
>
