cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Russell Bradberry <rbradbe...@gmail.com>
Subject C* 1.2.15 Decommission issues
Date Thu, 10 Apr 2014 18:08:42 GMT
We have about a 30 node cluster running the latest C* 1.2 series DSE.  One datacenter uses
VNodes and the other datacenter has VNodes Disabled (because it is running DSE-Seearch)

We have been replacing nodes in the VNode datacenter with faster ones and we have yet to have
a successful decommission.  Every time we attempt to decommission a node we get an “Operation
Timed Out” error and the decommission fails.  We keep retrying it and sometimes it will
work and other times we will just give up and force the node removal.  It seems though, that
all the data has streamed out of the node before the decommission fails.

What exactly does it need to read before leaving that would cause this?  We also have noticed
that in several nodes after the removal that there are ghost entries for the removed node
in the system.peers table and this doesn’t get removed until we restart Cassandra on that
node.

Also, we have noticed that running repairs with VNodes is considerably slower. Is this a misconfiguration?
Or is it expected that VNodes repairs will be slow?


Here is the stack trace from the decommission failure:

Exception in thread "main" java.lang.RuntimeException: org.apache.cassandra.exceptions.ReadTimeoutException:
Operation timed out - received only 0 responses.
        at org.apache.cassandra.db.HintedHandOffManager.getHintsSlice(HintedHandOffManager.java:578)
        at org.apache.cassandra.db.HintedHandOffManager.listEndpointsPendingHints(HintedHandOffManager.java:528)
        at org.apache.cassandra.service.StorageService.streamHints(StorageService.java:2925)
        at org.apache.cassandra.service.StorageService.unbootstrap(StorageService.java:2905)
        at org.apache.cassandra.service.StorageService.decommission(StorageService.java:2866)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
        at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
        at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
        at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
        at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
        at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
        at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
        at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
        at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
        at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
        at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
        at sun.reflect.GeneratedMethodAccessor26.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
        at sun.rmi.transport.Transport$1.run(Transport.java:177)
        at sun.rmi.transport.Transport$1.run(Transport.java:174)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
        at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received
only 0 responses.
        at org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:105)
        at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1213)
        at org.apache.cassandra.db.HintedHandOffManager.getHintsSlice(HintedHandOffManager.java:573)
        ... 39 more





Mime
View raw message