cassandra-commits mailing list archives

From "Eyal Sorek (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10687) When adding new node to cluster getting Cassandra timeout during write query
Date Wed, 11 Nov 2015 12:43:10 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15000320#comment-15000320 ]

Eyal Sorek commented on CASSANDRA-10687:
----------------------------------------

BTW, during the joining process, while the node is still joining, nodetool info fails with an
AssertionError (a hypothetical sketch of why this assertion fires follows the stack trace):
nodetool info
xss =  -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42
-XX:+CMSClassUnloadingEnabled -Xms8192M -Xmx8192M -Xmn2048M -Xss256k
Exception in thread "main" java.lang.AssertionError
	at org.apache.cassandra.locator.TokenMetadata.getTokens(TokenMetadata.java:502)
	at org.apache.cassandra.service.StorageService.getTokens(StorageService.java:2165)
	at org.apache.cassandra.service.StorageService.getTokens(StorageService.java:2154)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
	at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
	at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
	at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
	at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
	at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83)
	at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206)
	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647)
	at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
	at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1464)
	at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
	at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
	at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
	at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:657)
	at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
	at sun.rmi.transport.Transport$2.run(Transport.java:202)
	at sun.rmi.transport.Transport$2.run(Transport.java:199)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.rmi.transport.Transport.serviceCall(Transport.java:198)
	at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:567)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:828)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.access$400(TCPTransport.java:619)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$1.run(TCPTransport.java:684)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$1.run(TCPTransport.java:681)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:681)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
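
For context on the AssertionError above: nodetool runs the JVM with assertions enabled (the -ea flag is visible in the xss line), and the failure is raised inside TokenMetadata.getTokens, which presumably asserts that the queried endpoint already owns tokens as a normal ring member. A still-joining node's tokens are pending rather than normal, so the check fires. Below is a minimal, hypothetical Java sketch of that failure mode; the class, fields and method names are illustrative and are not the actual Cassandra source.

import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the real org.apache.cassandra.locator.TokenMetadata;
// it only illustrates why asking for a JOINING node's tokens can trip an assertion.
class TokenMetadataSketch
{
    // Tokens owned by nodes that have finished bootstrapping ("normal" state).
    private final Map<String, Collection<Long>> normalTokens = new HashMap<>();
    // Tokens a bootstrapping (joining) node will own once streaming completes.
    private final Map<String, Collection<Long>> pendingTokens = new HashMap<>();

    Collection<Long> getTokens(String endpoint)
    {
        // A joining node only appears in pendingTokens, so with -ea this assertion
        // throws java.lang.AssertionError, which nodetool info surfaces over JMX.
        assert normalTokens.containsKey(endpoint) : "endpoint is not a normal member yet";
        return normalTokens.get(endpoint);
    }

    public static void main(String[] args)
    {
        TokenMetadataSketch tm = new TokenMetadataSketch();
        tm.pendingTokens.put("10.14.0.155", List.of(42L)); // the node that is still joining
        tm.getTokens("10.14.0.155");                       // AssertionError when run with -ea
    }
}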

> When adding new node to cluster getting Cassandra timeout during write query
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10687
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10687
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Configuration, Coordination, Streaming and Messaging
>         Environment: Cassandra 2.0.9 using vnodes, on Debian 7.9, on two data centers (AUS & TAM)
>            Reporter: Eyal Sorek
>
> When adding one new node to an 8-node cluster, we get many of the errors below (the same behaviour recurred after the 9th node in the AUS data center finished joining, and again when adding the 10th node in the TAM data center).
> First, why do we get this while the node is joining:
> LOCAL_ONE (2 replica were required but only 1 acknowledged the write)
> Since when does LOCAL_ONE require 2 replicas? (A rough sketch of the likely accounting follows this description.)
> Second, why is there so much overhead across the whole cluster while a node is joining?
> com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write
query at consistency LOCAL_ONE (2 replica were required but only 1 acknowledged the write)
> Sample stack trace
> …stax.driver.core.exceptions.WriteTimeoutException.copy (WriteTimeoutException.java:73)
> …m.datastax.driver.core.DriverThrowables.propagateCause (DriverThrowables.java:37)
> ….driver.core.DefaultResultSetFuture.getUninterruptibly (DefaultResultSetFuture.java:214)
>        com.datastax.driver.core.AbstractSession.execute (AbstractSession.java:52)
> com.wixpress.publichtml.renderer.data.access.dao.page.CassandraPagesReadWriteDao$$anonfun$insertCompressed$1.apply(CassandraPagesReadWriteDao.scala:29)
> com.wixpress.publichtml.renderer.data.access.dao.page.CassandraPagesReadWriteDao$$anonfun$insertCompressed$1.apply(CassandraPagesReadWriteDao.scala:25)
> com.wixpress.framework.monitoring.metering.SyncMetering$class.tracking(Metering.scala:58)
> com.wixpress.publichtml.renderer.data.access.dao.page.CassandraPagesReadOnlyDao.tracking(CassandraPagesReadOnlyDao.scala:19)
> com.wixpress.publichtml.renderer.data.access.dao.page.CassandraPagesReadWriteDao.insertCompressed(CassandraPagesReadWriteDao.scala:25)
> com.wixpress.html.data.distributor.core.DaoPageDistributor.com$wixpress$html$data$distributor$core$DaoPageDistributor$$distributePage(DaoPageDistributor.scala:36)
> com.wixpress.html.data.distributor.core.DaoPageDistributor$$anonfun$process$1.apply$mcV$sp(DaoPageDistributor.scala:26)
> com.wixpress.html.data.distributor.core.DaoPageDistributor$$anonfun$process$1.apply(DaoPageDistributor.scala:26)
> com.wixpress.html.data.distributor.core.DaoPageDistributor$$anonfun$process$1.apply(DaoPageDistributor.scala:26)
> com.wixpress.framework.monitoring.metering.SyncMetering$class.tracking(Metering.scala:58)
> com.wixpress.html.data.distributor.core.DaoPageDistributor.tracking(DaoPageDistributor.scala:17)
> com.wixpress.html.data.distributor.core.DaoPageDistributor.process(DaoPageDistributor.scala:25)
> com.wixpress.html.data.distributor.core.greyhound.DistributionRequestHandler.handleMessage(DistributionRequestHandler.scala:19)
> com.wixpress.greyhound.KafkaUserHandlers.handleMessage(UserHandlers.scala:11)
> com.wixpress.greyhound.EventsConsumer.com$wixpress$greyhound$EventsConsumer$$handleMessage(EventsConsumer.scala:51)
> com.wixpress.greyhound.EventsConsumer$$anonfun$com$wixpress$greyhound$EventsConsumer$$dispatch$1.apply$mcV$sp(EventsConsumer.scala:43)
> com.wixpress.greyhound.EventsConsumer$$anonfun$com$wixpress$greyhound$EventsConsumer$$dispatch$1.apply(EventsConsumer.scala:40)
> com.wixpress.greyhound.EventsConsumer$$anonfun$com$wixpress$greyhound$EventsConsumer$$dispatch$1.apply(EventsConsumer.scala:40)
> scala.util.Try$.apply(Try.scala:192)
> com.wixpress.greyhound.EventsConsumer.com$wixpress$greyhound$EventsConsumer$$dispatch(EventsConsumer.scala:40)
> com.wixpress.greyhound.EventsConsumer$$anonfun$consumeEvents$1.apply(EventsConsumer.scala:26)
> com.wixpress.greyhound.EventsConsumer$$anonfun$consumeEvents$1.apply(EventsConsumer.scala:25)
> scala.collection.Iterator$class.foreach(Iterator.scala:742)
> scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
> com.wixpress.greyhound.EventsConsumer.consumeEvents(EventsConsumer.scala:25)
> com.wixpress.greyhound.EventsConsumer.run(EventsConsumer.scala:20)
>       java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1142)
>      java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
>                                    java.lang.Thread.run (Thread.java:745)
> caused by com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout
during write query at consistency LOCAL_ONE (2 replica were required but only 1 acknowledged
the write)
> …stax.driver.core.exceptions.WriteTimeoutException.copy (WriteTimeoutException.java:100)
>    com.datastax.driver.core.Responses$Error.asException (Responses.java:98)
>   com.datastax.driver.core.DefaultResultSetFuture.onSet (DefaultResultSetFuture.java:149)
>  com.datastax.driver.core.RequestHandler.setFinalResult (RequestHandler.java:183)
>     com.datastax.driver.core.RequestHandler.access$2300 (RequestHandler.java:44)
> …ore.RequestHandler$SpeculativeExecution.setFinalResult (RequestHandler.java:748)
> ….driver.core.RequestHandler$SpeculativeExecution.onSet (RequestHandler.java:587)
> …atastax.driver.core.Connection$Dispatcher.channelRead0 (Connection.java:1013)
> …atastax.driver.core.Connection$Dispatcher.channelRead0 (Connection.java:936)
> ….netty.channel.SimpleChannelInboundHandler.channelRead (SimpleChannelInboundHandler.java:105)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
>   io.netty.handler.timeout.IdleStateHandler.channelRead (IdleStateHandler.java:254)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> …etty.handler.codec.MessageToMessageDecoder.channelRead (MessageToMessageDecoder.java:103)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> …etty.handler.codec.MessageToMessageDecoder.channelRead (MessageToMessageDecoder.java:103)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> io.netty.handler.codec.ByteToMessageDecoder.channelRead (ByteToMessageDecoder.java:242)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> io.netty.channel.DefaultChannelPipeline.fireChannelRead (DefaultChannelPipeline.java:847)
> ….channel.nio.AbstractNioByteChannel$NioByteUnsafe.read (AbstractNioByteChannel.java:131)
>    io.netty.channel.nio.NioEventLoop.processSelectedKey (NioEventLoop.java:511)
> ….channel.nio.NioEventLoop.processSelectedKeysOptimized (NioEventLoop.java:468)
>   io.netty.channel.nio.NioEventLoop.processSelectedKeys (NioEventLoop.java:382)
>                   io.netty.channel.nio.NioEventLoop.run (NioEventLoop.java:354)
> ….netty.util.concurrent.SingleThreadEventExecutor$2.run (SingleThreadEventExecutor.java:111)
>                                    java.lang.Thread.run (Thread.java:745)
> caused by com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout
during write query at consistency LOCAL_ONE (2 replica were required but only 1 acknowledged
the write)
>       com.datastax.driver.core.Responses$Error$1.decode (Responses.java:57)
>       com.datastax.driver.core.Responses$Error$1.decode (Responses.java:37)
> com.datastax.driver.core.Message$ProtocolDecoder.decode (Message.java:213)
> com.datastax.driver.core.Message$ProtocolDecoder.decode (Message.java:204)
> …etty.handler.codec.MessageToMessageDecoder.channelRead (MessageToMessageDecoder.java:89)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> …etty.handler.codec.MessageToMessageDecoder.channelRead (MessageToMessageDecoder.java:103)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> io.netty.handler.codec.ByteToMessageDecoder.channelRead (ByteToMessageDecoder.java:242)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> io.netty.channel.DefaultChannelPipeline.fireChannelRead (DefaultChannelPipeline.java:847)
> ….channel.nio.AbstractNioByteChannel$NioByteUnsafe.read (AbstractNioByteChannel.java:131)
>    io.netty.channel.nio.NioEventLoop.processSelectedKey (NioEventLoop.java:511)
> ….channel.nio.NioEventLoop.processSelectedKeysOptimized (NioEventLoop.java:468)
>   io.netty.channel.nio.NioEventLoop.processSelectedKeys (NioEventLoop.java:382)
>                   io.netty.channel.nio.NioEventLoop.run (NioEventLoop.java:354)
> ….netty.util.concurrent.SingleThreadEventExecutor$2.run (SingleThreadEventExecutor.java:111)
>                                    java.lang.Thread.run (Thread.java:745)
> # nodetool status
> xss =  -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar -XX:+UseThreadPriorities
-XX:ThreadPriorityPolicy=42 -XX:+CMSClassUnloadingEnabled -Xms8192M -Xmx8192M -Xmn2048M -Xss256k
> Note: Ownership information does not include topology; for complete information, specify
a keyspace
> Datacenter: AUS
> ===============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load       Tokens  Owns   Host ID                               Rack
> UN  172.16.213.62  85.52 GB   256     11.7%  27f2fd1d-5f3c-4691-a1f6-e28c1343e212  R1
> UN  172.16.213.63  83.11 GB   256     12.2%  4869f14b-e858-46c7-967c-60bd8260a149  R1
> UN  172.16.213.64  80.91 GB   256     11.7%  d4ad2495-cb24-4964-94d2-9e3f557054a4  R1
> UN  172.16.213.66  84.11 GB   256     10.3%  2a16c0dc-c36a-4196-89df-2de4f6b6cae5  R1
> UN  172.16.144.75  95.2 GB    256     11.4%  f87d6518-6c8e-49d9-a013-018bbedb8414  R1
> Datacenter: TAM
> ===============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load       Tokens  Owns   Host ID                               Rack
> UJ  10.14.0.155    4.38 GB    256     ?      c88bebae-737b-4ade-8f79-64f655036eee  R1
> UN  10.14.0.106    81.57 GB   256     10.0%  3b539927-b53a-4f50-9acd-d92fefbd84b9  R1
> UN  10.14.0.107    80.23 GB   256     10.4%  b70f674d-892f-42ff-a261-5356bee79e99  R1
> UN  10.14.0.108    83.64 GB   256     11.2%  6e24b17a-0b48-46b4-8edb-b0a9206314a3  R1
> UN  10.14.0.109    91.02 GB   256     11.2%  11f02dbd-257f-4623-81f4-b94db7365775  R1
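
Regarding the question above about LOCAL_ONE requiring 2 replicas: while a node is bootstrapping, the coordinator also sends affected writes to the pending (joining) replica and appears to count it toward the acknowledgements it waits for, so LOCAL_ONE can legitimately report "2 replica were required but only 1 acknowledged the write". The sketch below is a rough, hypothetical illustration of that accounting; it is not the actual Cassandra or driver code.

import java.util.List;

// Hypothetical sketch of coordinator-side write accounting: the base requirement of
// the consistency level plus one acknowledgement per pending (bootstrapping) replica.
class WriteBlockForSketch
{
    enum ConsistencyLevel { LOCAL_ONE, LOCAL_QUORUM }

    static int baseBlockFor(ConsistencyLevel cl, int rfInLocalDc)
    {
        switch (cl)
        {
            case LOCAL_ONE:    return 1;
            case LOCAL_QUORUM: return rfInLocalDc / 2 + 1;
            default:           throw new AssertionError(cl);
        }
    }

    // Required acknowledgements for a write while replicas are still joining.
    static int totalBlockFor(ConsistencyLevel cl, int rfInLocalDc, List<String> pendingEndpoints)
    {
        return baseBlockFor(cl, rfInLocalDc) + pendingEndpoints.size();
    }

    public static void main(String[] args)
    {
        // One bootstrapping replica (e.g. 10.14.0.155) in the write's replica set:
        // LOCAL_ONE now waits for 2 acks, matching "2 replica were required but only
        // 1 acknowledged the write" in the WriteTimeoutException above.
        System.out.println(totalBlockFor(ConsistencyLevel.LOCAL_ONE, 3, List.of("10.14.0.155")));
    }
}

Under that reading, the timeouts during the join would come from the joining replica (or the extra streaming load it creates) being slow to acknowledge, rather than from LOCAL_ONE itself changing meaning.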



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
