Thanks for the quick reply, Mohit. Can we measure/monitor the size of the hinted handoffs? Would that be a good enough indicator of my backlog?
As far as I know, Cassandra doesn't use an internal queueing mechanism specific to replication. Cassandra sends the write to the remote DC, and after that it's up to the TCP/IP stack to deal with buffering. If requests start to time out, Cassandra uses hinted handoff (HH) for up to a certain amount of time. For a longer outage you would have to run repair.

Also look at the TCP/IP tuning parameters that are helpful in your scenario.

Run iperf and test the latency.

On Wed, Sep 5, 2012 at 8:22 PM, Venkat Rama <email@example.com> wrote:
Hi,

We have a multi-DC Cassandra ring set up with 2 DCs. We use LOCAL_QUORUM for writes and reads. The network between the DCs is sometimes flaky, with disruptions lasting from a few minutes to a few tens of minutes.

I wanted to know the best way to measure/monitor either the lag or the replication latency between the data centers. Are there any metrics I can monitor to find the backlog of data that needs to be transferred?

Thanks in advance.
VR
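
For reference, the rule of thumb from Mohit's reply (hints cover short outages, repair is needed for longer ones) could be sketched roughly as below. This is only an illustration, not anything Cassandra ships: `recovery_path` is a hypothetical helper, and `hint_window_ms` is assumed to be whatever `max_hint_window_in_ms` is set to in your cassandra.yaml. It also assumes hints are actually stored and replayed successfully within the window, which is not guaranteed.

```python
def recovery_path(outage_ms, hint_window_ms):
    """Hypothetical helper: guess how a remote DC catches up after an outage.

    outage_ms      -- how long the inter-DC link was down, in milliseconds
    hint_window_ms -- assumed to mirror max_hint_window_in_ms in cassandra.yaml
    """
    if outage_ms <= 0:
        # No outage, so nothing was missed.
        return "none"
    if outage_ms <= hint_window_ms:
        # Coordinators were still storing hints for the whole outage.
        return "hinted handoff"
    # Hints stopped being stored partway through; missed writes need repair.
    return "repair"


# Example values only: a 3-hour hint window and two outage lengths.
three_hours_ms = 3 * 60 * 60 * 1000
print(recovery_path(10 * 60 * 1000, three_hours_ms))
print(recovery_path(4 * 60 * 60 * 1000, three_hours_ms))
```

With a flaky link lasting a few minutes to a few tens of minutes, the outage would normally fall inside the hint window, so under these assumptions the hint backlog is the thing to watch.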