On 5 August 2013 12:30, Christopher Wirt <chris.wirt@struq.com> wrote:

I’m thinking about reducing the number of vnodes per server.


We have a 3 DC setup – one with 9 nodes, two with 3 nodes each.


Each node has 256 vnodes. We’ve found that repair operations are beginning to take too long.


Is reducing the number of vnodes to 64/32 likely to help our situation?

Unlikely.  The number of vnodes only affects how long repair takes if you have a tiny amount of data.  If you have 256 vnodes and not much more than 256 * num_nodes rows, the overhead of splitting the repair into a separate range for each vnode is significant.  However, once your dataset grows beyond this trivial size, the per-vnode overhead of repair becomes insignificant.
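To put rough numbers on that threshold, here is a back-of-the-envelope sketch in Python.  The function name and the 1 billion row figure are made up for illustration, and it assumes rows are spread evenly over the token ranges and ignores replication, so treat it as an order-of-magnitude estimate only.

```python
def rows_per_vnode_range(total_rows, num_nodes, vnodes_per_node=256):
    """Average number of rows per vnode token range.

    Assumes rows are spread evenly across all ranges and ignores
    replication, so this is only a rough estimate.
    """
    total_ranges = num_nodes * vnodes_per_node
    return total_rows / total_ranges


# Cluster from the question: one DC with 9 nodes, two DCs with 3 nodes each.
NUM_NODES = 9 + 3 + 3

# The "trivial" case: about 256 * num_nodes rows, i.e. roughly one row per range.
tiny = rows_per_vnode_range(256 * NUM_NODES, NUM_NODES)

# A hypothetical larger dataset (the row count is invented for illustration).
large = rows_per_vnode_range(1_000_000_000, NUM_NODES)

print(f"trivial dataset: {tiny:.1f} rows per range")
print(f"1e9 rows:        {large:.0f} rows per range")
```

With one row per range, the fixed cost of handling each range dominates.  With hundreds of thousands of rows per range, the time goes into reading and comparing the data itself, whatever the number of ranges.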

The main reasons for slow repair are a large amount of data or a large amount of data out of sync.  You can tell how much is out of sync by looking in the logs - they will say how many ranges within each vnode range need repairing.
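If you want to total those numbers up rather than read through the log by eye, something like the Python sketch below could work.  The regular expression is an assumption about the wording of the repair log line ("Endpoints A and B have N range(s) out of sync for <table>"), which may differ between Cassandra versions, so check it against your own system.log first.  The sample lines and the function name are invented for illustration.

```python
import re
from collections import Counter

# ASSUMPTION: the repair log line looks like
#   "... Endpoints /a and /b have N range(s) out of sync for <table>"
# Verify against your own logs; the wording may vary by version.
OUT_OF_SYNC = re.compile(
    r"Endpoints (\S+) and (\S+) have (\d+) range\(s\) out of sync for (\S+)"
)


def out_of_sync_by_table(lines):
    """Sum the reported out-of-sync range counts per table."""
    totals = Counter()
    for line in lines:
        match = OUT_OF_SYNC.search(line)
        if match:
            totals[match.group(4)] += int(match.group(3))
    return totals


# Made-up sample lines, only to show the shape of the input.
sample = [
    "[repair #example] Endpoints /10.0.0.1 and /10.0.0.2 have 12 range(s) out of sync for users",
    "[repair #example] Endpoints /10.0.0.1 and /10.0.0.3 are consistent for users",
    "[repair #example] Endpoints /10.0.0.2 and /10.0.0.3 have 3 range(s) out of sync for users",
    "[repair #example] Endpoints /10.0.0.1 and /10.0.0.2 have 1 range(s) out of sync for events",
]

for table, count in out_of_sync_by_table(sample).items():
    print(table, count)
```

To run it on a real log, pass an open file object instead of the sample list, e.g. `out_of_sync_by_table(open(path))` with the path to your own log file.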