cassandra-commits mailing list archives

From Apache Wiki <>
Subject [Cassandra Wiki] Update of "RepairAsyncAPI" by yukim
Date Tue, 29 Jul 2014 16:08:22 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "RepairAsyncAPI" page has been changed by yukim:

New page:
= Repair Async API =

Repair used to be invoked through a synchronous JMX interface, but because repair takes a
long time to finish, the JMX connection sometimes timed out.
[[|CASSANDRA-4767]] therefore added an asynchronous repair API: once repair is invoked
through it, users can track repair progress through JMX notifications.

== Repair JMX Notification ==

Repair JMX notifications are sent from the StorageService MBean (org.apache.cassandra.db:type=StorageService).

Before you run repair, subscribe to JMX notifications on that MBean; otherwise you may
miss some of the messages.
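
The subscription step can be sketched with the standard javax.management API. This is a minimal sketch, not code from Cassandra: the class name RepairSubscription and its methods are hypothetical, and the host and port passed to connect are assumptions that depend on how JMX is exposed on your node. Only the MBean name and the "repair" notification type come from this page.

```java
import java.io.IOException;
import javax.management.MBeanServerConnection;
import javax.management.NotificationFilterSupport;
import javax.management.NotificationListener;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper; the name and structure are illustrative only.
final class RepairSubscription {
    // MBean name as given on this page.
    static final String STORAGE_SERVICE = "org.apache.cassandra.db:type=StorageService";

    // Filter that lets through notifications whose type starts with "repair"
    // (NotificationFilterSupport matches enabled types as prefixes).
    static NotificationFilterSupport repairFilter() {
        NotificationFilterSupport filter = new NotificationFilterSupport();
        filter.enableType("repair");
        return filter;
    }

    // Register the listener BEFORE invoking repair so no notification is missed.
    static void subscribe(MBeanServerConnection connection, NotificationListener listener)
            throws Exception {
        connection.addNotificationListener(
                new ObjectName(STORAGE_SERVICE), listener, repairFilter(), null);
    }

    // Assumption: the node exposes JMX over RMI at host:port.
    static JMXConnector connect(String host, int port) throws IOException {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
        return JMXConnectorFactory.connect(url);
    }
}
```

A caller would obtain a connection with connect(host, port).getMBeanServerConnection(), call subscribe, and only then invoke repair.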

A repair JMX notification contains the following:

|| type      || "repair" ||
|| message   || repair status message ||
|| user data || int array containing command number and repair status ||

'''message''' is a repair status message such as "Starting repair ..." or an error message.

'''user data''' is an int array of 2 elements. The first element is the ''command number'',
which is assigned uniquely each time repair is invoked through the async API. The async APIs
return the command number as their return value. The second element is the repair status
number, as shown below.

|| 0 || STARTED         || repair command started ||
|| 1 || SESSION_SUCCESS || repair session (repair for one range in a keyspace) succeeded ||
|| 2 || SESSION_FAILED  || repair session failed ||
|| 3 || FINISHED        || repair command finished ||

(In the code, these are defined as the ActiveRepairService.Status enum.)
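
Decoding a notification according to the tables above might look like the following. This is a sketch under stated assumptions: the local Status enum only mirrors the table on this page and is not ActiveRepairService.Status itself, and RepairNotifications and describe are hypothetical names.

```java
import javax.management.Notification;

// Hypothetical helper; names are illustrative only.
final class RepairNotifications {
    // Local mirror of the status table; ordinal order matches status numbers 0..3.
    enum Status { STARTED, SESSION_SUCCESS, SESSION_FAILED, FINISHED }

    // Returns a one-line description, or null if this is not a well-formed
    // repair notification.
    static String describe(Notification n) {
        if (!"repair".equals(n.getType()) || !(n.getUserData() instanceof int[])) {
            return null;
        }
        int[] data = (int[]) n.getUserData();
        if (data.length < 2 || data[1] < 0 || data[1] >= Status.values().length) {
            return null;
        }
        int commandNumber = data[0];                // first element: command number
        Status status = Status.values()[data[1]];   // second element: status number
        return "repair #" + commandNumber + " " + status + ": " + n.getMessage();
    }
}
```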

The ''nodetool repair'' command also uses these statuses to track repair progress.
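
A client can track progress in a similar way by blocking until FINISHED arrives for its command number. The listener below is a minimal sketch and not nodetool's actual implementation: RepairTracker and awaitSuccess are hypothetical names, and treating any SESSION_FAILED as an overall failure is an assumption made for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import javax.management.Notification;
import javax.management.NotificationListener;

// Hypothetical listener; register it before invoking repair.
final class RepairTracker implements NotificationListener {
    private static final int SESSION_FAILED = 2; // status numbers from the table
    private static final int FINISHED = 3;

    private final ConcurrentHashMap<Integer, CountDownLatch> finished = new ConcurrentHashMap<>();
    private final Set<Integer> failed = ConcurrentHashMap.newKeySet();

    // One latch per command number, created on first use by either side, so it
    // does not matter whether FINISHED arrives before or after awaitSuccess is called.
    private CountDownLatch latchFor(int commandNumber) {
        return finished.computeIfAbsent(commandNumber, c -> new CountDownLatch(1));
    }

    @Override
    public void handleNotification(Notification n, Object handback) {
        if (!"repair".equals(n.getType()) || !(n.getUserData() instanceof int[])) {
            return;
        }
        int[] data = (int[]) n.getUserData();
        if (data.length < 2) {
            return;
        }
        if (data[1] == SESSION_FAILED) {
            failed.add(data[0]);
        } else if (data[1] == FINISHED) {
            latchFor(data[0]).countDown();
        }
    }

    // True if the command finished within the timeout and no session failed.
    boolean awaitSuccess(int commandNumber, long timeout, TimeUnit unit)
            throws InterruptedException {
        return latchFor(commandNumber).await(timeout, unit) && !failed.contains(commandNumber);
    }
}
```

Keying the latches by command number lets one listener serve several concurrent repair commands.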

== Further improvement ==

Still, the granularity of repair status tracking is coarse. Repair involves several nodes that
perform validation compaction and file streaming. Each of those is monitored separately,
through ''nodetool compactionstats'' and ''nodetool netstats'' on each node.

Possible solution to track the whole repair process is to [[|Repair
