cassandra-commits mailing list archives

From "Ben Chan (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-5483) Repair tracing
Date Sun, 30 Mar 2014 17:10:18 GMT


Ben Chan commented on CASSANDRA-5483:

Ouch. After running a before-and-after test, I'm 99% sure this was the problem. There was
some obviously wrong code in {{waitActivity}}: an older version used 0 instead of -1 to signify
"done", and I apparently forgot to update everything when I changed this.

Sorry about removing the previous patch. It didn't have the correct {{git diff -p}} parameters.

For convenience:

for url in \
do [ -e $(basename $url) ] || curl -sO $url; done &&
git apply 5483-v09-*.patch &&
ant clean && ant

Here's what I used to test with. With the "before" code, repairs get slower and slower and hang
on the 5th repair; with the "after" code, repairs consistently take about 10 seconds.

cat > ccm-nodetool <<"EOF"
#!/bin/sh

# ccm doesn't let us call nodetool with options, but we still need to get the
# host and port config from it.
read -r JMXGET <<E
/jmx_port/{p=\$2;} \
/binary/{split(\$2,a,/\047/);h=a[2];} \
END{printf("bin/nodetool -h %s -p %s\n",h,p,cmd);}
E

NODETOOL=$(ccm $1 show | awk -F= "$JMXGET")
# Assumed: drop the node name and pass the remaining arguments to nodetool.
shift
$NODETOOL "$@"
EOF
chmod +x ccm-nodetool
for x in $(seq 3); do
  for y in $(seq 2); do
    echo repair node$x \#$y
    ./ccm-nodetool node$x repair -tr
  done
done
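
To make a slowdown between successive repairs easier to see, each run can be timed. This is a minimal sketch, assuming a hypothetical helper {{time_runs}} that is not part of the patch or of ccm; it only reports whole seconds and does not detect a hang.

```shell
#!/bin/sh
# Hypothetical helper (not part of the 5483 patches): run a command a given
# number of times and print the wall-clock seconds and exit status of each run.
time_runs() {
    count=$1
    shift
    i=1
    while [ "$i" -le "$count" ]; do
        start=$(date +%s)
        "$@"
        status=$?
        end=$(date +%s)
        echo "run $i: $((end - start))s (exit $status)"
        i=$((i + 1))
    done
}
```

For example, {{time_runs 5 ./ccm-nodetool node1 repair -tr}} would print one line per repair, so growing durations stand out.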

> Repair tracing
> --------------
>                 Key: CASSANDRA-5483
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Yuki Morishita
>            Assignee: Ben Chan
>            Priority: Minor
>              Labels: repair
>         Attachments: 5483-full-trunk.txt, 5483-v06-04-Allow-tracing-ttl-to-be-configured.patch,, 5483-v06-06-Fix-interruption-in-tracestate-propagation.patch,
> 5483-v07-07-Better-constructor-parameters-for-DebuggableThreadPoolExecutor.patch, 5483-v07-08-Fix-brace-style.patch,
> 5483-v07-09-Add-trace-option-to-a-more-complete-set-of-repair-functions.patch, 5483-v07-10-Correct-name-of-boolean-repairedAt-to-fullRepair.patch,
> 5483-v08-11-Shorten-trace-messages.-Use-Tracing-begin.patch, 5483-v08-12-Trace-streaming-in-Differencer-StreamingRepairTask.patch,
> 5483-v08-15-Limit-trace-notifications.-Add-exponential-backoff.patch, 5483-v09-16-Fix-hang-caused-by-incorrect-exit-code.patch,
> ccm-repair-test, cqlsh-left-justify-text-columns.patch, prerepair-vs-postbuggedrepair.diff,
> test-5483-system_traces-events.txt, trunk@4620823-5483-v02-0001-Trace-filtering-and-tracestate-propagation.patch,
> trunk@4620823-5483-v02-0002-Put-a-few-traces-parallel-to-the-repair-logging.patch, trunk@8ebeee1-5483-v01-001-trace-filtering-and-tracestate-propagation.txt,
> trunk@8ebeee1-5483-v01-002-simple-repair-tracing.txt, v02p02-5483-v03-0003-Make-repair-tracing-controllable-via-nodetool.patch,
> v02p02-5483-v04-0003-This-time-use-an-EnumSet-to-pass-boolean-repair-options.patch, v02p02-5483-v05-0003-Use-long-instead-of-EnumSet-to-work-with-JMX.patch
> I think it would be nice to log repair stats and results the way query tracing stores traces
> in the system keyspace. With it, you don't have to look up each log file to see the status
> of the repair you invoked and how it performed. Instead, you can query the repair log by session
> ID to see the state and stats of all nodes involved in that repair session.

This message was sent by Atlassian JIRA
