cassandra-user mailing list archives

From Vasileios Vlachos <vasileiosvlac...@gmail.com>
Subject Re: Streaming Process: How can we speed it up?
Date Thu, 15 Sep 2016 09:47:25 GMT
Hello and thanks for your responses,

OK, so increasing stream_throughput_outbound_megabits_per_sec makes no
difference. Any ideas why streaming is limited to only two of the three
nodes available?

As an alternative to slow streaming I tried this:

 - install C* on a new node, stop the service and delete
   /var/lib/cassandra/*
 - rsync /etc/cassandra from old node to new node
 - rsync /var/lib/cassandra from old node to new node
 - stop C* on the old node
 - rsync /var/lib/cassandra from old node to new node again (to pick up
   whatever changed since the first pass)
 - move the old node to a different IP
 - move the new node to the old node's original IP
 - start C* on the new node (no need for the replace_node option in
   cassandra-env.sh)
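
For reference, the steps above can be written down as an ordered plan. This
is only a sketch: it builds the command strings and runs nothing. The host
name "old-node", the "service cassandra stop/start" commands and the rsync
flags (-a, --delete) are assumptions for illustration, not something taken
from the procedure as it was actually run.

```python
def rsync_migration_plan(old_host,
                         data_dir="/var/lib/cassandra",
                         conf_dir="/etc/cassandra"):
    """Return the ordered (where, command) steps. Nothing is executed."""

    def pull(path):
        # Run on the new node. Trailing slashes copy the directory contents;
        # --delete (an assumption) drops files that no longer exist on the
        # old node, e.g. SSTables compacted away between the two passes.
        return f"rsync -a --delete {old_host}:{path}/ {path}/"

    return [
        ("new", "service cassandra stop"),      # assumed service command
        ("new", f"rm -rf {data_dir}/*"),
        ("new", pull(conf_dir)),
        ("new", pull(data_dir)),                # first pass, old node still up
        ("old", "service cassandra stop"),      # assumed service command
        ("new", pull(data_dir)),                # second pass, old node stopped
        ("manual", "move old node to a spare IP, give new node the old IP"),
        ("new", "service cassandra start"),     # assumed service command
    ]


if __name__ == "__main__":
    for where, command in rsync_migration_plan("old-node"):
        print(f"[{where}] {command}")
```

The second data pass is the one that matters for consistency, since it runs
after the old node has stopped writing.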

This technique has been successful so far for a demo cluster with less
data. The only disadvantage for us is that we were hoping that by streaming
the SSTables to the new node, tombstones would be discarded (freeing a lot
of disk space on our live cluster). This is exactly what happened for the
one node we have streamed so far; unfortunately, the slow streaming
generates a lot of hints, which makes recovery a very long process.

Do you guys see any other problems with the rsync method that I've skipped?

Regarding the tombstones issue (if we finally do what I described above),
I'm thinking of sstablesplit. Then compaction should deal with it (I think). I
have not used sstablesplit in the past, so another thing I'd like to ask is
if you guys find this a good/bad idea for what I'm trying to do.

Many thanks,
Vasilis

On Mon, Sep 12, 2016 at 6:42 PM, Jeff Jirsa <jjirsa@apache.org> wrote:

>
>
> On 2016-09-12 09:38 (-0700), daemeon reiydelle <daemeonr@gmail.com> wrote:
> > Re. throughput. That looks slow for jumbo with 10g. Check your networks.
> >
> >
>
> It's extremely unlikely you'll be able to saturate a 10g link with a
> single instance cassandra.
>
> Faster Cassandra streaming is a work in progress - being able to send more
> than one file at a time is probably the most obvious area for improvement,
> and being able to better deal with the CPU / garbage generated on the
> receiving side is just behind that. You'll likely be able to stream 10-15
> MB/s per sending server or cpu core, whichever is less (in a vnode setup,
> you'll be cpu bound - in a single-token setup, you'll be stream bound).
>
>
>
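
A rough back-of-the-envelope sketch of what the 10-15 MB/s figure quoted
above would mean in practice. The 500 GB data size, the two senders and the
200 megabit throttle value are made-up inputs for illustration, and the model
(senders stream in parallel at a constant rate) is a simplifying assumption.

```python
def megabits_to_megabytes_per_sec(megabits_per_sec):
    """Convert a throttle given in megabits/s to megabytes/s (8 bits per byte)."""
    return megabits_per_sec / 8


def estimated_stream_hours(data_gb, senders, mb_per_sec_per_sender):
    """Hours to move data_gb, assuming every sender streams in parallel at a
    constant rate (a simplification; 1 GB is taken as 1000 MB)."""
    total_mb = data_gb * 1000
    seconds = total_mb / (senders * mb_per_sec_per_sender)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical inputs: 500 GB to move, two nodes sending.
    for rate in (10, 15):
        hours = estimated_stream_hours(500, 2, rate)
        print(f"{rate} MB/s per sender: about {hours:.1f} hours")
    # A throttle of 200 megabits/s is 25 MB/s per node.
    print(megabits_to_megabytes_per_sec(200), "MB/s")
```

If each sender really tops out at 10-15 MB/s, a throttle that already allows
more than that is not the limiting factor, which may be why raising
stream_throughput_outbound_megabits_per_sec made no difference.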
