incubator-cassandra-user mailing list archives

From Rustam Aliyev <>
Subject Problems with shuffle
Date Sun, 07 Apr 2013 12:43:34 GMT

After upgrading to vnodes, I created and enabled the shuffle operation as
suggested. After it had run for a couple of hours I had to disable it
because the nodes were not keeping up with compactions. I repeated this
enable/disable cycle 3 times.

I have 5 nodes, and each of them held ~35GB. After the shuffle operations
described above, some nodes are now reaching ~170GB. In the log files I
can see the same files transferred 2-4 times to the same host within the
same shuffle session. Worst of all, after all of this only 20 of 1280
vnodes have been transferred. If it continues at the same speed, the
shuffle will take about a month or two to complete.
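
For reference, the estimate above is a simple linear extrapolation. A
minimal sketch in Python: the 1280 total and 20 relocated vnodes are the
figures from this message, while the 12 hours of elapsed shuffle time and
the function name are assumptions for illustration only (the shuffle was
enabled for "a couple of hours", three times, so the real figure is not
known exactly).

```python
def estimate_remaining_hours(total_vnodes, moved_vnodes, elapsed_hours):
    """Linearly extrapolate the remaining shuffle time from progress so far."""
    if moved_vnodes <= 0 or elapsed_hours <= 0:
        raise ValueError("need positive progress and positive elapsed time")
    remaining = total_vnodes - moved_vnodes
    # hours per relocated vnode so far, times the vnodes still to move
    return remaining * elapsed_hours / moved_vnodes


# 1280 and 20 come from the message; 12.0 hours is an assumed placeholder.
hours = estimate_remaining_hours(1280, 20, 12.0)
print(f"{hours / 24:.1f} days remaining at the current rate")
```

This assumes the relocation rate stays constant, which is exactly what
question 2 below asks about.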

I have a few questions to better understand shuffle:

 1. Does disabling and re-enabling shuffle restart the shuffle process from
    scratch, or does it resume from the last point?

 2. Will vnode reallocations speed up as the shuffle proceeds, or will the
    rate remain the same?

 3. Why do I see multiple transfers of the same file to the same host? e.g.:

    INFO [Streaming to /] 2013-04-07 14:27:10,038 (line 44) Successfully sent
    to /
    INFO [Streaming to /] 2013-04-07 16:27:07,427 (line 44) Successfully sent
    to /

 4. When I enable/disable shuffle I receive warning messages such as the
    ones below. Do I need to worry about them?

    cassandra-shuffle -h localhost disable
    Failed to enable shuffling on!
    Failed to enable shuffling on!

I couldn't find many docs on shuffle; I have only read through the JIRA
and the original proposal by Eric.

