cassandra-user mailing list archives

From Sylvain Lebresne <sylv...@datastax.com>
Subject Re: no additional log output after running repair
Date Tue, 31 May 2011 14:10:25 GMT
There are roughly two steps in a repair:
  1) Each node involved (that is, the node the repair was started on plus the
      ones listed in the log) constructs a merkle tree. This amounts to a
      specific compaction, so you can see the progress in
      JMX->CompactionManager. If no compaction is running (and none is
      pending) on the nodes involved, then either you're done with that
      phase or the process failed before this.
  2) The data to repair is streamed between nodes. This will show up in
      netstats.
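
To know which nodes to check for step 1), you can pull the endpoints out of
the "Waiting for repair requests" log line. Here is a minimal sketch in
Python. The helper name nodes_in_repair and the regular expression are only
an illustration based on the log line quoted below; they are not part of
Cassandra or nodetool.

```python
import re

# Assumption: each request is printed as
#   #<TreeRequest <session-id>, /<ip-address>, (<keyspace>,<column-family>)>
# as in the quoted log line. Only the IP address is captured.
TREE_REQUEST = re.compile(r"#<TreeRequest\s+[\w-]+\s*,\s*/([\d.]+)")


def nodes_in_repair(log_line):
    """Return the sorted, de-duplicated endpoint addresses found in a
    'Waiting for repair requests' log line (hypothetical helper)."""
    return sorted({m.group(1) for m in TREE_REQUEST.finditer(log_line)})
```

The node the repair was started on is not necessarily in that list, so add
it by hand before checking CompactionManager on each node.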

Now, repair is not very good at handling failure. Basically, if anything
goes wrong it will hang. By anything, I mean in particular a node involved
in the repair dying at any point in the process. If that happens, the
repair for this node will never finish and, as it turns out, nothing
specific will show up in the log (well, you should see that the node has
been restarted, but nothing specific to the repair).

Anyway, if you see nothing in the log, it means that either it is still in
1) (that step can take quite some time if you have lots of data, and if you
are a bit behind on compactions, this will delay the start of that
process), or it has failed somehow.

If it has failed (nothing in CompactionManager and no streams), then you can
stop the repair (Ctrl-C the nodetool repair) and start a new one.
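
Put together, the reasoning above is a small decision rule. The sketch below
is only an illustration: the function repair_status, its parameters and the
returned labels are made up for this example, and it assumes you have
already collected the numbers by hand (from JMX->CompactionManager and
netstats) across all the nodes involved, while nodetool repair is still
blocking.

```python
def repair_status(compactions_running, compactions_pending, streams_active):
    """Classify a repair that has not returned yet (hypothetical helper).

    compactions_running / compactions_pending: totals over the nodes
    involved, as read from CompactionManager.
    streams_active: True if netstats shows streaming on any of them.
    """
    if compactions_running or compactions_pending:
        # Step 1 may still be running, or queued behind other compactions.
        # Note that an unrelated compaction gives the same answer.
        return "building-merkle-trees"
    if streams_active:
        # Step 2: data is being streamed between nodes.
        return "streaming"
    # No compaction and no stream, yet the repair has not returned:
    # it most likely failed, so stop it and start a new one.
    return "probably-hung"
```

For instance, repair_status(0, 0, False) returns "probably-hung", which is
the case where restarting the repair makes sense.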

For info, CASSANDRA-2433 is open to make repair hopefully report
failure correctly.

--
Sylvain

On Tue, May 31, 2011 at 3:20 PM, Jonathan Colby
<jonathan.colby@gmail.com> wrote:
> I'm trying to run a repair on a 7.6-2 node. After running the repair command, this
> line shows up in the cassandra.log, but nothing else. It's been hours. Nothing is
> seen in the logs from other servers or with nodetool commands like netstats or tpstats.
>
> How do I know if the repair is actually going on or not? This is incredibly frustrating.
>
> INFO [manual-repair-9629edfc-7ae9-4626-b90a-2aa6eb1e8224] 2011-05-31 14:05:25,625 AntiEntropyService.java
> (line 786) Waiting for repair requests:
> [#<TreeRequest manual-repair-9629edfc-7ae9-4626-b90a-2aa6eb1e8224, /10.47.108.100, (DFS,main)>,
> #<TreeRequest manual-repair-9629edfc-7ae9-4626-b90a-2aa6eb1e8224, /10.47.108.103, (DFS,main)>,
> #<TreeRequest manual-repair-9629edfc-7ae9-4626-b90a-2aa6eb1e8224, /10.46.108.103, (DFS,main)>,
> #<TreeRequest manual-repair-9629edfc-7ae9-4626-b90a-2aa6eb1e8224, /10.46.108.101, (DFS,main)>]
>
>
> Jon
