cassandra-user mailing list archives

From Anand Somani <meatfor...@gmail.com>
Subject Re: 0.7.4: Replication assertion error after removetoken, removetoken force and a restart
Date Sat, 20 Aug 2011 17:38:16 GMT
0.7.4 / 3-node cluster / RF 3 / QUORUM read/write

I re-introduced a corrupted node, following the process listed on the
operations wiki for handling failures (thanks to the folks on the mailing
list for helping me).
I am still running a cleanup on one node at this point. But I noticed that I
am seeing this same exception appear 10 to 12 times a minute on an existing
node (not the new one). I think it started around the removetoken.

How do I solve this? Should I just restart this node? Are there any other
cleanups/resets I need to do?

Thanks


On Thu, Apr 28, 2011 at 2:26 AM, aaron morton <aaron@thelastpickle.com> wrote:

> I *think* that code is used when one node tells others via gossip it is
> removing a token that is not its own. The node that receives information in
> gossip does some work and then replies to the first node with a
> REPLICATION_FINISHED message, which is the node I assume the error is
> happening on.
>
> Have you been doing any moves / removes or additions of tokens/nodes?
>
> Thanks
> Aaron
>
> On 28 Apr 2011, at 08:39, Alexis Lê-Quôc wrote:
>
> > Hi,
> >
> > I've been getting the following lately, every few seconds.
> >
> > 2011-04-27T20:21:18.299885+00:00 10.202.61.193 [MiscStage: 97] Error in ThreadPoolExecutor
> > 2011-04-27T20:21:18.299885+00:00 10.202.61.193 java.lang.AssertionError
> > 2011-04-27T20:21:18.300038+00:00 10.202.61.193 10.202.61.193   at org.apache.cassandra.service.StorageService.confirmReplication(StorageService.java:1872)
> > 2011-04-27T20:21:18.300038+00:00 10.202.61.193 10.202.61.193   at org.apache.cassandra.streaming.ReplicationFinishedVerbHandler.doVerb(ReplicationFinishedVerbHandler.java:38)
> > 2011-04-27T20:21:18.300047+00:00 10.202.61.193 10.202.61.193   at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
> > 2011-04-27T20:21:18.300047+00:00 10.202.61.193 10.202.61.193   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> > 2011-04-27T20:21:18.300055+00:00 10.202.61.193 10.202.61.193   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> > 2011-04-27T20:21:18.300055+00:00 10.202.61.193 10.202.61.193   at java.lang.Thread.run(Thread.java:636)
> > 2011-04-27T20:21:18.300555+00:00 10.202.61.193 [MiscStage: 97] Fatal exception in thread Thread[MiscStage:97,5,main]
> >
> > I see it coming from
> > 32 public class ReplicationFinishedVerbHandler implements IVerbHandler
> > 33 {
> > 34     private static Logger logger = LoggerFactory.getLogger(ReplicationFinishedVerbHandler.class);
> > 35
> > 36     public void doVerb(Message msg, String id)
> > 37     {
> > 38         StorageService.instance.confirmReplication(msg.getFrom());
> > 39         Message response = msg.getInternalReply(ArrayUtils.EMPTY_BYTE_ARRAY);
> > 40         if (logger.isDebugEnabled())
> > 41             logger.debug("Replying to " + id + "@" + msg.getFrom());
> > 42         MessagingService.instance().sendReply(response, id, msg.getFrom());
> > 43     }
> > 44 }
> >
> > Before I dig deeper in the code, has anybody dealt with this before?
> >
> > Thanks,
> >
> > --
> > Alexis Lê-Quôc
>
>
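
To make Aaron's description above concrete, here is a minimal toy model of the
handshake he describes. It is not the Cassandra source. It assumes that the
node driving the removal keeps a list of nodes it is waiting on, and that it
asserts when a REPLICATION_FINISHED-style confirmation arrives while that list
is empty. The class and method names (RemovalCoordinatorSketch, startRemoval,
forgetRemoval, pendingCount) and the 10.0.0.x addresses are hypothetical; only
confirmReplication borrows its name from the stack trace.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model, not Cassandra source: a coordinator that waits for
// replication confirmations from other nodes during a token removal.
class RemovalCoordinatorSketch {
    // Nodes we still expect a REPLICATION_FINISHED-style confirmation from.
    private final Set<String> pending = new HashSet<>();

    // Begin a removal: remember which nodes must confirm.
    void startRemoval(Set<String> replicas) {
        pending.addAll(replicas);
    }

    // Assumed effect of a restart or a forced removal: the wait list is dropped.
    void forgetRemoval() {
        pending.clear();
    }

    // Handle one confirmation. The error is thrown explicitly (rather than via
    // the `assert` keyword) so the behaviour does not depend on the -ea flag.
    void confirmReplication(String from) {
        if (pending.isEmpty()) {
            throw new AssertionError(
                "confirmation from " + from + " but no removal in progress");
        }
        pending.remove(from);
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        RemovalCoordinatorSketch c = new RemovalCoordinatorSketch();
        c.startRemoval(new HashSet<>(Arrays.asList("10.0.0.2", "10.0.0.3")));
        c.confirmReplication("10.0.0.2");     // expected confirmation, accepted
        c.forgetRemoval();                    // coordinator loses its wait list
        try {
            c.confirmReplication("10.0.0.3"); // late confirmation now fails
        } catch (AssertionError e) {
            System.out.println("AssertionError: " + e.getMessage());
        }
    }
}
```

Under these assumptions, the error would repeat for as long as other nodes
keep sending confirmations that the receiving node no longer expects. Whether
that is what happens in 0.7.4 would need to be checked against
StorageService.java around line 1872.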
