cassandra-commits mailing list archives

From "Wei Deng (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-11965) Duplicated effort in repair streaming
Date Tue, 07 Jun 2016 15:32:21 GMT
Wei Deng created CASSANDRA-11965:
------------------------------------

             Summary: Duplicated effort in repair streaming
                 Key: CASSANDRA-11965
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11965
             Project: Cassandra
          Issue Type: Improvement
          Components: Streaming and Messaging
            Reporter: Wei Deng


[~jbellis] mentioned this as a potential improvement in his 2013 committer meeting notes (http://grokbase.com/t/cassandra/dev/132s6sh415/notes-from-committers-meeting-streaming-and-repair):
"making the repair coordinator smarter to know when to avoid duplicate streaming. E.g., if
replicas A and B have row X, but C does not, currently both A and B will stream to C."
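
To make the quoted scenario concrete, here is a minimal Python sketch of pairwise synchronization. It is a simplified model and not Cassandra's actual repair code: replicas are modeled as plain sets of row keys, and `pairwise_sync_plan` is a hypothetical helper name introduced only for this illustration.

```python
from itertools import combinations


def pairwise_sync_plan(replicas):
    """Return (sender, receiver, rows) transfers for every out-of-sync pair.

    Simplified model (an assumption, not Cassandra's implementation): each
    replica is a set of row keys, and every pair of replicas that differs
    exchanges the rows the other side lacks, independently of other pairs.
    """
    transfers = []
    for a, b in combinations(sorted(replicas), 2):
        only_a = replicas[a] - replicas[b]
        only_b = replicas[b] - replicas[a]
        if only_a:
            transfers.append((a, b, frozenset(only_a)))
        if only_b:
            transfers.append((b, a, frozenset(only_b)))
    return transfers


# Replicas A and B have row X, replica C does not.
replicas = {"A": {"X"}, "B": {"X"}, "C": set()}
plan = pairwise_sync_plan(replicas)
for sender, receiver, rows in plan:
    print(f"{sender} -> {receiver}: {sorted(rows)}")
```

Because each pair is compared in isolation, the plan in this model contains one transfer from A to C and another from B to C for the same row, which is the duplication described in the quote.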

I tested this on C* 3.0.6 and it looks like this is still happening. On a 3-node cluster,
I inserted data into a trivial table in a keyspace with RF=3 and forced two flushes on all nodes,
so that each node had two SSTables. I then shut down the 1st node, removed one SSTable
from its data directory, and restarted the node. I connected cqlsh to this node and verified
that with CL.ONE the data was indeed missing. I then moved on to the 2nd node and ran "nodetool
repair <keyspace> <table>". Here is what I observed in system.log on the 2nd
node (the repair coordinator):

{noformat}
INFO  [Thread-47] 2016-06-06 23:19:54,173  RepairRunnable.java:125 - Starting repair command
#1, repairing keyspace weitest with repair options (parallelism: parallel, primary range:
false, incremental: true, job threads: 1, ColumnFamilies: [songs], dataCenters: [], hosts:
[], # of ranges: 3)
INFO  [Thread-47] 2016-06-06 23:19:54,253  RepairSession.java:237 - [repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57]
new session: will sync /172.31.44.75, /172.31.40.215, /172.31.36.148 on range [(3074457345618258602,-9223372036854775808],
(-9223372036854775808,-3074457345618258603], (-3074457345618258603,3074457345618258602]] for
weitest.[songs]
INFO  [Repair#1:1] 2016-06-06 23:19:54,268  RepairJob.java:172 - [repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57]
Requesting merkle trees for songs (to [/172.31.40.215, /172.31.36.148, /172.31.44.75])
INFO  [AntiEntropyStage:1] 2016-06-06 23:19:54,335  RepairSession.java:181 - [repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57]
Received merkle tree for songs from /172.31.40.215
INFO  [AntiEntropyStage:1] 2016-06-06 23:19:54,427  RepairSession.java:181 - [repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57]
Received merkle tree for songs from /172.31.44.75
INFO  [AntiEntropyStage:1] 2016-06-06 23:19:54,460  RepairSession.java:181 - [repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57]
Received merkle tree for songs from /172.31.36.148
INFO  [RepairJobTask:1] 2016-06-06 23:19:54,466  SyncTask.java:73 - [repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57]
Endpoints /172.31.40.215 and /172.31.36.148 have 3 range(s) out of sync for songs
INFO  [RepairJobTask:1] 2016-06-06 23:19:54,467  RemoteSyncTask.java:54 - [repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57]
Forwarding streaming repair of 3 ranges to /172.31.40.215 (to be streamed with /172.31.36.148)
INFO  [RepairJobTask:1] 2016-06-06 23:19:54,472  SyncTask.java:66 - [repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57]
Endpoints /172.31.36.148 and /172.31.44.75 are consistent for songs
INFO  [RepairJobTask:3] 2016-06-06 23:19:54,474  SyncTask.java:73 - [repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57]
Endpoints /172.31.40.215 and /172.31.44.75 have 3 range(s) out of sync for songs
INFO  [RepairJobTask:3] 2016-06-06 23:19:54,529  LocalSyncTask.java:68 - [repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57]
Performing streaming repair of 3 ranges with /172.31.40.215
INFO  [RepairJobTask:3] 2016-06-06 23:19:54,574  StreamResultFuture.java:86 - [Stream #2d423640-2c3d-11e6-94d2-b35b6c93de57]
Executing streaming plan for Repair
INFO  [StreamConnectionEstablisher:1] 2016-06-06 23:19:54,576  StreamSession.java:238 - [Stream
#2d423640-2c3d-11e6-94d2-b35b6c93de57] Starting streaming to /172.31.40.215
INFO  [StreamConnectionEstablisher:1] 2016-06-06 23:19:54,580  StreamCoordinator.java:213
- [Stream #2d423640-2c3d-11e6-94d2-b35b6c93de57, ID#0] Beginning stream session with /172.31.40.215
INFO  [STREAM-IN-/172.31.40.215] 2016-06-06 23:19:54,588  StreamResultFuture.java:168 - [Stream
#2d423640-2c3d-11e6-94d2-b35b6c93de57 ID#0] Prepare completed. Receiving 0 files(0 bytes),
sending 1 files(174 bytes)
INFO  [STREAM-IN-/172.31.40.215] 2016-06-06 23:19:55,117  StreamResultFuture.java:182 - [Stream
#2d423640-2c3d-11e6-94d2-b35b6c93de57] Session with /172.31.40.215 is complete
INFO  [STREAM-IN-/172.31.40.215] 2016-06-06 23:19:55,120  StreamResultFuture.java:214 - [Stream
#2d423640-2c3d-11e6-94d2-b35b6c93de57] All sessions completed
INFO  [STREAM-IN-/172.31.40.215] 2016-06-06 23:19:55,123  LocalSyncTask.java:114 - [repair
#2d177cc0-2c3d-11e6-94d2-b35b6c93de57] Sync complete using session 2d177cc0-2c3d-11e6-94d2-b35b6c93de57
between /172.31.40.215 and /172.31.44.75 on songs
INFO  [RepairJobTask:3] 2016-06-06 23:19:55,123  RepairJob.java:143 - [repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57]
songs is fully synced
INFO  [RepairJobTask:3] 2016-06-06 23:19:55,125  RepairSession.java:279 - [repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57]
Session completed successfully
INFO  [RepairJobTask:3] 2016-06-06 23:19:55,126  RepairRunnable.java:240 - Repair session
2d177cc0-2c3d-11e6-94d2-b35b6c93de57 for range [(3074457345618258602,-9223372036854775808],
(-9223372036854775808,-3074457345618258603], (-3074457345618258603,3074457345618258602]] finished
INFO  [CompactionExecutor:991] 2016-06-06 23:19:55,131  CompactionManager.java:511 - Starting
anticompaction for weitest.songs on 2/[BigTableReader(path='/mnt/ephemeral/cassandra/data/weitest/songs-b254f711134611e692c45f08f496518a/ma-2-big-Data.db'),
BigTableReader(path='/mnt/ephemeral/cassandra/data/weitest/songs-b254f711134611e692c45f08f496518a/ma-1-big-Data.db')]
sstables
INFO  [CompactionExecutor:991] 2016-06-06 23:19:55,131  CompactionManager.java:540 - SSTable
BigTableReader(path='/mnt/ephemeral/cassandra/data/weitest/songs-b254f711134611e692c45f08f496518a/ma-2-big-Data.db')
fully contained in range (-9223372036854775808,-9223372036854775808], mutating repairedAt
instead of anticompacting
INFO  [CompactionExecutor:991] 2016-06-06 23:19:55,135  CompactionManager.java:540 - SSTable
BigTableReader(path='/mnt/ephemeral/cassandra/data/weitest/songs-b254f711134611e692c45f08f496518a/ma-1-big-Data.db')
fully contained in range (-9223372036854775808,-9223372036854775808], mutating repairedAt
instead of anticompacting
INFO  [CompactionExecutor:991] 2016-06-06 23:19:55,137  CompactionManager.java:578 - Completed
anticompaction successfully
INFO  [InternalResponseStage:8] 2016-06-06 23:19:55,145  RepairRunnable.java:322 - Repair
command #1 finished in 0 seconds
{noformat}

This is the log from the 1st node, where one SSTable was missing and needed to be repaired.
It confirms that two equivalent streams arrived from the two other replicas (each session
reports "Receiving 1 files(174 bytes)"):

{noformat}
INFO  [AntiEntropyStage:1] 2016-06-06 23:19:54,307  Validator.java:274 - [repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57]
Sending completed merkle tree to /172.31.44.75 for weitest.songs
INFO  [AntiEntropyStage:1] 2016-06-06 23:19:54,470  StreamingRepairTask.java:58 - [streaming
task #2d177cc0-2c3d-11e6-94d2-b35b6c93de57] Performing streaming repair of 3 ranges with /172.31.36.148
INFO  [AntiEntropyStage:1] 2016-06-06 23:19:54,497  StreamResultFuture.java:86 - [Stream #2d38e770-2c3d-11e6-80ed-e382fc580483]
Executing streaming plan for Repair
INFO  [StreamConnectionEstablisher:1] 2016-06-06 23:19:54,498  StreamSession.java:238 - [Stream
#2d38e770-2c3d-11e6-80ed-e382fc580483] Starting streaming to /172.31.36.148
INFO  [StreamConnectionEstablisher:1] 2016-06-06 23:19:54,512  StreamCoordinator.java:213
- [Stream #2d38e770-2c3d-11e6-80ed-e382fc580483, ID#0] Beginning stream session with /172.31.36.148
INFO  [STREAM-IN-/172.31.36.148] 2016-06-06 23:19:54,562  StreamResultFuture.java:168 - [Stream
#2d38e770-2c3d-11e6-80ed-e382fc580483 ID#0] Prepare completed. Receiving 1 files(174 bytes),
sending 0 files(0 bytes)
INFO  [STREAM-INIT-/172.31.44.75:57066] 2016-06-06 23:19:54,579  StreamResultFuture.java:111
- [Stream #2d423640-2c3d-11e6-94d2-b35b6c93de57 ID#0] Creating new streaming plan for Repair
INFO  [STREAM-INIT-/172.31.44.75:57066] 2016-06-06 23:19:54,580  StreamResultFuture.java:118
- [Stream #2d423640-2c3d-11e6-94d2-b35b6c93de57, ID#0] Received streaming plan for Repair
INFO  [STREAM-INIT-/172.31.44.75:47984] 2016-06-06 23:19:54,581  StreamResultFuture.java:118
- [Stream #2d423640-2c3d-11e6-94d2-b35b6c93de57, ID#0] Received streaming plan for Repair
INFO  [STREAM-IN-/172.31.44.75] 2016-06-06 23:19:54,584  StreamResultFuture.java:168 - [Stream
#2d423640-2c3d-11e6-94d2-b35b6c93de57 ID#0] Prepare completed. Receiving 1 files(174 bytes),
sending 0 files(0 bytes)
INFO  [StreamReceiveTask:1] 2016-06-06 23:19:55,034  StreamResultFuture.java:182 - [Stream
#2d38e770-2c3d-11e6-80ed-e382fc580483] Session with /172.31.36.148 is complete
INFO  [StreamReceiveTask:1] 2016-06-06 23:19:55,037  StreamResultFuture.java:214 - [Stream
#2d38e770-2c3d-11e6-80ed-e382fc580483] All sessions completed
INFO  [StreamReceiveTask:1] 2016-06-06 23:19:55,040  StreamingRepairTask.java:85 - [repair
#2d177cc0-2c3d-11e6-94d2-b35b6c93de57] streaming task succeed, returning response to /172.31.44.75
INFO  [StreamReceiveTask:2] 2016-06-06 23:19:55,114  StreamResultFuture.java:182 - [Stream
#2d423640-2c3d-11e6-94d2-b35b6c93de57] Session with /172.31.44.75 is complete
INFO  [StreamReceiveTask:2] 2016-06-06 23:19:55,115  StreamResultFuture.java:214 - [Stream
#2d423640-2c3d-11e6-94d2-b35b6c93de57] All sessions completed
INFO  [CompactionExecutor:3] 2016-06-06 23:19:55,130  CompactionManager.java:511 - Starting
anticompaction for weitest.songs on 1/[BigTableReader(path='/mnt/ephemeral/cassandra/data/weitest/songs-b254f711134611e692c45f08f496518a/ma-4-big-Data.db'),
BigTableReader(path='/mnt/ephemeral/cassandra/data/weitest/songs-b254f711134611e692c45f08f496518a/ma-3-big-Data.db'),
BigTableReader(path='/mnt/ephemeral/cassandra/data/weitest/songs-b254f711134611e692c45f08f496518a/ma-2-big-Data.db')]
sstables
INFO  [CompactionExecutor:3] 2016-06-06 23:19:55,131  CompactionManager.java:540 - SSTable
BigTableReader(path='/mnt/ephemeral/cassandra/data/weitest/songs-b254f711134611e692c45f08f496518a/ma-2-big-Data.db')
fully contained in range (-9223372036854775808,-9223372036854775808], mutating repairedAt
instead of anticompacting
INFO  [CompactionExecutor:3] 2016-06-06 23:19:55,135  CompactionManager.java:578 - Completed
anticompaction successfully
{noformat}
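
The duplicate transfer can also be read mechanically from the receiving node's log. The following Python sketch assumes the log text is available as a string with wrapped lines like the excerpt above, and that "Prepare completed" entries keep the format seen there. The names `log_entries` and `incoming_streams` are hypothetical helpers, not part of any Cassandra tool.

```python
import re

# A new log entry starts with a level name followed by a bracketed thread name.
LEVEL = re.compile(r"(?:TRACE|DEBUG|INFO|WARN|ERROR)\s+\[")
# Matches the "Prepare completed" entry format shown in the excerpt above.
PREPARE = re.compile(
    r"\[STREAM-IN-/(?P<peer>[\d.]+)\].*"
    r"\[Stream\s+#(?P<plan>[0-9a-f-]+)[^\]]*\]\s+Prepare completed\.\s+"
    r"Receiving\s+(?P<files>\d+)\s+files\((?P<bytes>\d+)\s+bytes\)"
)

# Two entries copied from the 1st node's log above.
SAMPLE = """
INFO  [STREAM-IN-/172.31.36.148] 2016-06-06 23:19:54,562  StreamResultFuture.java:168 - [Stream
#2d38e770-2c3d-11e6-80ed-e382fc580483 ID#0] Prepare completed. Receiving 1 files(174 bytes),
sending 0 files(0 bytes)
INFO  [STREAM-IN-/172.31.44.75] 2016-06-06 23:19:54,584  StreamResultFuture.java:168 - [Stream
#2d423640-2c3d-11e6-94d2-b35b6c93de57 ID#0] Prepare completed. Receiving 1 files(174 bytes),
sending 0 files(0 bytes)
"""


def log_entries(text):
    """Rejoin wrapped lines so that each log entry is a single string."""
    entry = []
    for line in text.splitlines():
        if LEVEL.match(line) and entry:
            yield " ".join(entry)
            entry = []
        if line.strip():
            entry.append(line.strip())
    if entry:
        yield " ".join(entry)


def incoming_streams(text):
    """Return (peer, plan_id, files, bytes) for each stream that delivers files."""
    found = []
    for entry in log_entries(text):
        m = PREPARE.search(entry)
        if m and int(m["files"]) > 0:
            found.append((m["peer"], m["plan"], int(m["files"]), int(m["bytes"])))
    return found


streams = incoming_streams(SAMPLE)
for peer, plan_id, files, size in streams:
    print(f"from {peer}: {files} file(s), {size} bytes (stream {plan_id})")
```

Two incoming streams of identical size from different peers within the same repair session are a hint, not proof, that the same data was sent twice; confirming it would require comparing the streamed SSTable contents.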



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
