cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jim Ancona <...@anconafamily.com>
Subject Re: Trying to find the problem with a broken pipe
Date Tue, 02 Aug 2011 17:37:48 GMT
On Tue, Aug 2, 2011 at 4:36 PM, Anthony Ikeda
<anthony.ikeda.dev@gmail.com> wrote:
> I'm not sure if this is a problem with Hector or with Cassandra.
> We seem to be seeing broken pipe issues with our connections on the client
> side (Exception below). A bit of googling finds possibly a problem with the
> amount of data we are trying to store, although I'm certain our datasets are
> not all that large.

I'm not sure what you're referring to here. Large requests could lead
to timeouts, but that's not what you're seeing here. Could you link to
the page you're referencing?

> A nodetool ring command doesn't seem to present any downed nodes:
> Address         DC          Rack        Status State   Load        
   Owns
>    Token
>
>   153951716904446304929228999025275230571
> 10.130.202.34   datacenter1 rack1       Up     Normal  470.74 KB
> 79.19%  118538200848404459763384037192174096102
> 10.130.202.35   datacenter1 rack1       Up     Normal  483.63 KB
> 20.81%  153951716904446304929228999025275230571
>
> There are no errors in the cassandra server logs.
>
> Are there any particular timeouts on connections that we need to be aware
> of? Or perhaps configure on the Cassandra nodes? Is this purely and issue
> with the Hector API configuration?

There is a server side timeout (rpc_timeout_in_ms in cassandra.yaml)
and a Hector client-side timeout
(CassandraHostConfigurator.cassandraThriftSocketTimeout). But again,
the "Broken pipe" error is not a timeout, it indicates that something
happened to the underlying network socket. For example you will see
those when a server node is restarted.

Some questions that might help troubleshoot this:
How often are these occurring?
Does this affect both nodes in the cluster or just one?
Are the exceptions related to any external events (e.g. node restarts,
network issues...)?
What versions of Hector and Cassandra are you running?

Keep in mind that failures like this will normally be retried by
Hector, resulting in no loss of data. For that reason, I think that
exception is logged as a warning in the newest Hector versions.

We've seen something similar, but more catastrophic because it affects
connectivity to the entire cluster, not just a single node. See this
post for more details: http://goo.gl/hrgkw So far we haven't
identified the cause.

Jim

> Anthony
>
> 2011-08-02 08:43:06,541 ERROR
> [me.prettyprint.cassandra.connection.HThriftClient] - Could not flush
> transport (to be expected if the pool is shutting down) in close for client:
> CassandraClient<cassandradevrk1:9393-33>
> org.apache.thrift.transport.TTransportException: java.net.SocketException:
> Broken pipe
> 	at
> org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
> 	at
> org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:156)
> 	at
> me.prettyprint.cassandra.connection.HThriftClient.close(HThriftClient.java:85)
> 	at
> me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:232)
> 	at
> me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
> 	at
> me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:289)
> 	at
> me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53)
> 	at
> me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49)
> 	at
> me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
> 	at
> me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
> 	at
> me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48)
> 	at
> com.wsgc.services.registry.persistenceservice.impl.cassandra.strategy.read.StandardFindRegistryPersistenceStrategy.findRegistryByProfileId(StandardFindRegistryPersistenceStrategy.java:237)
> 	at
> com.wsgc.services.registry.persistenceservice.impl.cassandra.strategy.read.StandardFindRegistryPersistenceStrategy.execute(StandardFindRegistryPersistenceStrategy.java:277)
> 	at
> com.wsgc.services.registry.registryservice.impl.service.StandardRegistryService.getRegistriesByProfileId(StandardRegistryService.java:327)
> 	at
> com.wsgc.services.registry.webapp.impl.RegistryServicesController.getRegistriesByProfileId(RegistryServicesController.java:247)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> 	at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at
> org.springframework.web.bind.annotation.support.HandlerMethodInvoker.invokeHandlerMethod(HandlerMethodInvoker.java:175)
> 	at
> org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.invokeHandlerMethod(AnnotationMethodHandlerAdapter.java:421)
> 	at
> org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.handle(AnnotationMethodHandlerAdapter.java:409)
> 	at
> org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:774)
> 	at
> org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:719)
> 	at
> org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:644)
> 	at
> org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:549)
> 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
> 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
> 	at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
> 	at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> 	at
> org.springframework.web.filter.HiddenHttpMethodFilter.doFilterInternal(HiddenHttpMethodFilter.java:77)
> 	at
> org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:76)
> 	at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> 	at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> 	at
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> 	at
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> 	at
> org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:563)
> 	at
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
> 	at
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> 	at
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> 	at
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
> 	at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:190)
> 	at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:291)
> 	at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:774)
> 	at
> org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:703)
> 	at
> org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:896)
> 	at
> org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:690)
> 	at java.lang.Thread.run(Thread.java:662)
> Caused by: java.net.SocketException: Broken pipe
> 	at java.net.SocketOutputStream.socketWrite0(Native Method)
> 	at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
> 	at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> 	at
> org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
> 	... 47 more
> 2011-08-02 08:43:06,543 ERROR
> [me.prettyprint.cassandra.connection.HConnectionManager] - MARK HOST AS DOWN
> TRIGGERED for host cassandradevrk1(10.130.202.34):9393
> 2011-08-02 08:43:06,543 ERROR
> [me.prettyprint.cassandra.connection.HConnectionManager] - Pool state on
> shutdown:
> <ConcurrentCassandraClientPoolByHost>:{cassandradevrk1(10.130.202.34):9393};
> IsActive?: true; Active: 1; Blocked: 0; Idle: 15; NumBeforeExhausted: 49
> 2011-08-02 08:43:06,543 ERROR
> [me.prettyprint.cassandra.connection.ConcurrentHClientPool] - Shutdown
> triggered on
> <ConcurrentCassandraClientPoolByHost>:{cassandradevrk1(10.130.202.34):9393}
> 2011-08-02 08:43:06,544 ERROR
> [me.prettyprint.cassandra.connection.ConcurrentHClientPool] - Shutdown
> complete on
> <ConcurrentCassandraClientPoolByHost>:{cassandradevrk1(10.130.202.34):9393}
> 2011-08-02 08:43:06,544 INFO
> [me.prettyprint.cassandra.connection.CassandraHostRetryService] - Host
> detected as down was added to retry queue:
> cassandradevrk1(10.130.202.34):9393
> 2011-08-02 08:43:06,544 WARN
> [me.prettyprint.cassandra.connection.HConnectionManager] - Could not
> fullfill request on this host CassandraClient<cassandradevrk1:9393-33>
>

Mime
View raw message