Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: domain of anthony.ikeda.dev@gmail.com
 designates 209.85.215.170 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <20110804231331.050558854cd8e3604bbb7dcb1d74d208.4412c733c4.wbe@email13.secureserver.net>
References: 
 <20110804231331.050558854cd8e3604bbb7dcb1d74d208.4412c733c4.wbe@email13.secureserver.net>
Date: Mon, 8 Aug 2011 18:11:59 -0700
Message-ID: 
 <CAFk=5qBokwd+OEL=gaXL_P5c9tK=Ob3OQ6s2xYQcrZFRaKB1PA@mail.gmail.com>
Subject: Re: Trying to find the problem with a broken pipe
From: Anthony Ikeda <anthony.ikeda.dev@gmail.com>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=000e0cd3a07a2ece6f04aa084060

--000e0cd3a07a2ece6f04aa084060
Content-Type: text/plain; charset=ISO-8859-1

Tim, do you know if this is the actual cause of the broken pipe? I'm having
a hard time convincing my team that modifying this value will fix the issue.

Jonathan, do you know if there is a valid explanation for why Tim no longer
has the problem after this change?
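
To make the size hypothesis concrete, here is a minimal Java sketch of the check implied by thrift_max_message_length_in_mb. It is not Cassandra or Hector code. The class name, the method name, and the 16 MB figure are assumptions for illustration only; the real limit is whatever cassandra.yaml says, and the sketch ignores Thrift's own per-message overhead (row key, column name, framing).

```java
public class ThriftMessageSizeCheck {
    // Assumed value for illustration; read the real one from
    // thrift_max_message_length_in_mb in cassandra.yaml.
    static final int ASSUMED_MAX_MESSAGE_MB = 16;

    /**
     * Returns true if a serialized payload of the given size is no larger
     * than the configured limit. Thrift overhead is deliberately ignored,
     * so a payload close to the limit may still be rejected in practice.
     */
    static boolean fitsInMessage(long payloadBytes, int maxMessageMb) {
        long limitBytes = (long) maxMessageMb * 1024L * 1024L;
        return payloadBytes <= limitBytes;
    }

    public static void main(String[] args) {
        long smallObject = 512L * 1024L;          // 512 KB serialized object
        long largeObject = 20L * 1024L * 1024L;   // 20 MB serialized object
        System.out.println("512 KB fits: "
                + fitsInMessage(smallObject, ASSUMED_MAX_MESSAGE_MB));
        System.out.println("20 MB fits: "
                + fitsInMessage(largeObject, ASSUMED_MAX_MESSAGE_MB));
    }
}
```

If the loader logged the serialized size of each object before the write, a check like this would show whether the failing object is the first one to cross the configured limit, which is the evidence needed to tie the broken pipe to this setting rather than to the network.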

Anthony Ikeda


On Thu, Aug 4, 2011 at 11:13 PM, Tim Snyder <tim@proinnovations.com> wrote:

> I no longer get the error on the loader program. The steps I took to fix
> it were increasing thrift_max_message_length_in_mb, stopping cassandra,
> blowing away the prior data store, and then restarting cassandra.
>
> Tim
>
>
> -------- Original Message --------
> Subject: Re: Trying to find the problem with a broken pipe
> From: aaron morton <aaron@thelastpickle.com>
> Date: Fri, August 05, 2011 12:58 am
> To: user@cassandra.apache.org
>
> It's probably a network thing.
>
> The only thing I can think of in cassandra is
> thrift_max_message_length_in_mb in the config. Exceeding that setting will
> result in a TException thrown on the server side (I think); I'm not sure if
> that makes the server kill the socket. I would hope the error returns to
> the client.
>
> Perhaps check the server log.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 4 Aug 2011, at 23:05, Tim Snyder wrote:
>
> > I am getting the same problem (Broken Pipe) on a loader program, after
> > about 8 million read/write pairs. The program pushes serialized objects
> > into a column, and the object it seems to fail on is much larger than the
> > prior objects, so I am wondering if it is possibly a column size
> > streaming issue through the thrift api? I am using Cassandra 0.8.0 and
> > Hector 0.8.0-1
> >
> > Tim
> >
> > -------- Original Message --------
> > Subject: Re: Trying to find the problem with a broken pipe
> > From: Anthony Ikeda <anthony.ikeda.dev@gmail.com>
> > Date: Tue, August 02, 2011 10:43 pm
> > To: user@cassandra.apache.org
> >
> >> Very interesting. After the second host goes down do you see
> >> "me.prettyprint.hector.api.exceptions.HectorException: All host pools
> >> marked down. Retry burden pushed out to client"?
> >
> > No, the last message is:
> > 2011-08-02 08:43:06,561 INFO [me.prettyprint.cassandra.connection.HConnectionManager] - Client CassandraClient<cassandradevrk2:9393-49> released to inactive or dead pool. Closing.
> >
> >> Does your client recover after a period of time?
> >
> >
> >
> > The application seems to be fine for now but my concern is the
> > connection pooling as well - I mean do we have one pool or multiple?
> > I'll post to the Hector user group about the pooling because the
> > incident seems so isolated. We also have our infrastructure team looking
> > into the communication between the application server and the cassandra
> > nodes.
> >
> >
> > So far it's still a mystery.
> >
> >
> >
> >
> >
> > On Tue, Aug 2, 2011 at 1:25 PM, Jim Ancona <jim@anconafamily.com> wrote:
> > On Tue, Aug 2, 2011 at 6:13 PM, Anthony Ikeda
> > <anthony.ikeda.dev@gmail.com> wrote:
> >
> >> The link (which I may be misreading) is
> >> http://groups.google.com/group/hector-users/browse_thread/thread/8d7004b6f85a0f2e
> >
> >
> > I hadn't found that one, but I doubt that our issue is related to that.
> >
> >
> >> It only started happening today and happened on 2 occasions (8:43 and
> >> 10:21) performing the same function (querying a column family).
> >> It seems to be trying to access a connection on one of the servers.
> >> The client accesses the first node:
> >>
> >> 2011-08-02 08:43:06,541 ERROR [me.prettyprint.cassandra.connection.HThriftClient] - Could not flush transport (to be expected if the pool is shutting down) in close for client: CassandraClient<cassandradevrk1:9393-33>
> >> org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
> >>
> >> ...
> >> 2011-08-02 08:43:06,544 WARN [me.prettyprint.cassandra.connection.HConnectionManager] - Could not fullfill request on this host CassandraClient<cassandradevrk1:9393-33>
> >>
> >> ...
> >>
> >> 2011-08-02 08:43:06,543 ERROR [me.prettyprint.cassandra.connection.HConnectionManager] - MARK HOST AS DOWN TRIGGERED for host cassandradevrk1(10.130.202.34):9393
> >> 2011-08-02 08:43:06,543 ERROR [me.prettyprint.cassandra.connection.HConnectionManager] - Pool state on shutdown: <ConcurrentCassandraClientPoolByHost>:{cassandradevrk1(10.130.202.34):9393}; IsActive?: true; Active: 1; Blocked: 0; Idle: 15; NumBeforeExhausted: 49
> >> 2011-08-02 08:43:06,543 ERROR [me.prettyprint.cassandra.connection.ConcurrentHClientPool] - Shutdown triggered on <ConcurrentCassandraClientPoolByHost>:{cassandradevrk1(10.130.202.34):9393}
> >> 2011-08-02 08:43:06,544 ERROR [me.prettyprint.cassandra.connection.ConcurrentHClientPool] - Shutdown complete on <ConcurrentCassandraClientPoolByHost>:{cassandradevrk1(10.130.202.34):9393}
> >> 2011-08-02 08:43:06,544 INFO [me.prettyprint.cassandra.connection.CassandraHostRetryService] - Host detected as down was added to retry queue: cassandradevrk1(10.130.202.34):9393
> >> 2011-08-02 08:43:06,544 WARN [me.prettyprint.cassandra.connection.HConnectionManager] - Could not fullfill request on this host CassandraClient<cassandradevrk1:9393-33>
> >> 2011-08-02 08:43:06,544 WARN [me.prettyprint.cassandra.connection.HConnectionManager] - Exception:
> >> me.prettyprint.hector.api.exceptions.HectorTransportException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
> >>
> >>
> >> Then it appears to try the second node and fails:
> >>
> >> 2011-08-02 08:43:06,556 INFO [me.prettyprint.cassandra.connection.HConnectionManager] - Client CassandraClient<cassandradevrk1:9393-33> released to inactive or dead pool. Closing.
> >> 2011-08-02 08:43:06,557 ERROR [me.prettyprint.cassandra.connection.HThriftClient] - Could not flush transport (to be expected if the pool is shutting down) in close for client: CassandraClient<cassandradevrk2:9393-49>
> >>
> >> org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
> >>
> >> 2011-08-02 08:43:06,558 ERROR [me.prettyprint.cassandra.connection.HConnectionManager] - MARK HOST AS DOWN TRIGGERED for host cassandradevrk2(10.130.202.35):9393
> >> 2011-08-02 08:43:06,559 ERROR [me.prettyprint.cassandra.connection.HConnectionManager] - Pool state on shutdown: <ConcurrentCassandraClientPoolByHost>:{cassandradevrk2(10.130.202.35):9393}; IsActive?: true; Active: 1; Blocked: 0; Idle: 15; NumBeforeExhausted: 49
> >> 2011-08-02 08:43:06,559 ERROR [me.prettyprint.cassandra.connection.ConcurrentHClientPool] - Shutdown triggered on <ConcurrentCassandraClientPoolByHost>:{cassandradevrk2(10.130.202.35):9393}
> >> 2011-08-02 08:43:06,559 ERROR [me.prettyprint.cassandra.connection.ConcurrentHClientPool] - Shutdown complete on <ConcurrentCassandraClientPoolByHost>:{cassandradevrk2(10.130.202.35):9393}
> >> 2011-08-02 08:43:06,559 INFO [me.prettyprint.cassandra.connection.CassandraHostRetryService] - Host detected as down was added to retry queue: cassandradevrk2(10.130.202.35):9393
> >> 2011-08-02 08:43:06,560 WARN [me.prettyprint.cassandra.connection.HConnectionManager] - Could not fullfill request on this host CassandraClient<cassandradevrk2:9393-49>
> >> 2011-08-02 08:43:06,560 WARN [me.prettyprint.cassandra.connection.HConnectionManager] - Exception:
> >> me.prettyprint.hector.api.exceptions.HectorTransportException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
> >
> >
> >
> > Very interesting. After the second host goes down do you see
> > "me.prettyprint.hector.api.exceptions.HectorException: All host pools
> > marked down. Retry burden pushed out to client"?
> >
> > Does your client recover after a period of time?
> >
> >>
> >> The process is the same at 10:21.
> >> Are the exceptions related to any external events (e.g. node
> >> restarts, network issues...)?
> >> Not that I'm aware of, unless there are firewall timeouts between the
> >> application and the node servers. Let me find out. The cassandra log
> >> files have no errors reported.
> >> What versions of Hector and Cassandra are you running?
> >> Cassandra 0.8.1, Hector 0.8.0-1
> >
> >
> > Our issue is occurring with Cassandra 0.7.8 and Hector 0.7-30. We plan
> > to deploy Hector 0.7-31 this week and to turn on useSocketKeepalive.
> > Are you using that? We're also using tcpdump to capture packets when
> > failures occur to see if there are anomalies in the network traffic.
> >
> > Jim
> >
> >
> >
> >>
> >>
> >>
> >> On Tue, Aug 2, 2011 at 10:37 AM, Jim Ancona <jim@anconafamily.com> wrote:
> >>>
> >>> On Tue, Aug 2, 2011 at 4:36 PM, Anthony Ikeda
> >>> <anthony.ikeda.dev@gmail.com> wrote:
> >>>> I'm not sure if this is a problem with Hector or with Cassandra.
> >>>> We seem to be seeing broken pipe issues with our connections on the
> >>>> client side (Exception below). A bit of googling finds possibly a
> >>>> problem with the amount of data we are trying to store, although I'm
> >>>> certain our datasets are not all that large.
> >>>
> >>> I'm not sure what you're referring to here. Large requests could lead
> >>> to timeouts, but that's not what you're seeing here. Could you link to
> >>> the page you're referencing?
> >>>
> >>>> A nodetool ring command doesn't seem to present any downed nodes:
> >>>> Address        DC           Rack   Status  State   Load       Owns    Token
> >>>>                                                                       153951716904446304929228999025275230571
> >>>> 10.130.202.34  datacenter1  rack1  Up      Normal  470.74 KB  79.19%  118538200848404459763384037192174096102
> >>>> 10.130.202.35  datacenter1  rack1  Up      Normal  483.63 KB  20.81%  153951716904446304929228999025275230571
> >>>>
> >>>> There are no errors in the cassandra server logs.
> >>>>
> >>>> Are there any particular timeouts on connections that we need to be
> >>>> aware of? Or perhaps configure on the Cassandra nodes? Is this purely
> >>>> an issue with the Hector API configuration?
> >>>
> >>> There is a server side timeout (rpc_timeout_in_ms in cassandra.yaml)
> >>> and a Hector client-side timeout
> >>> (CassandraHostConfigurator.cassandraThriftSocketTimeout). But again,
> >>> the "Broken pipe" error is not a timeout; it indicates that something
> >>> happened to the underlying network socket. For example you will see
> >>> those when a server node is restarted.
> >>>
> >>> Some questions that might help troubleshoot this:
> >>> How often are these occurring?
> >>> Does this affect both nodes in the cluster or just one?
> >>> Are the exceptions related to any external events (e.g. node restarts,
> >>> network issues...)?
> >>> What versions of Hector and Cassandra are you running?
> >>>
> >>> Keep in mind that failures like this will normally be retried by
> >>> Hector, resulting in no loss of data. For that reason, I think that
> >>> exception is logged as a warning in the newest Hector versions.
> >>>
> >>> We've seen something similar, but more catastrophic because it affects
> >>> connectivity to the entire cluster, not just a single node. See this
> >>> post for more details: http://goo.gl/hrgkw So far we haven't
> >>> identified the cause.
> >>>
> >>> Jim
> >>>
> >>>> Anthony
> >>>>
> >>>> 2011-08-02 08:43:06,541 ERROR [me.prettyprint.cassandra.connection.HThriftClient] - Could not flush transport (to be expected if the pool is shutting down) in close for client: CassandraClient<cassandradevrk1:9393-33>
> >>>> org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
> >>>>     at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
> >>>>     at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:156)
> >>>>     at me.prettyprint.cassandra.connection.HThriftClient.close(HThriftClient.java:85)
> >>>>     at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:232)
> >>>>     at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
> >>>>     at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:289)
> >>>>     at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53)
> >>>>     at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49)
> >>>>     at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
> >>>>     at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
> >>>>     at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48)
> >>>>     at com.wsgc.services.registry.persistenceservice.impl.cassandra.strategy.read.StandardFindRegistryPersistenceStrategy.findRegistryByProfileId(StandardFindRegistryPersistenceStrategy.java:237)
> >>>>     at com.wsgc.services.registry.persistenceservice.impl.cassandra.strategy.read.StandardFindRegistryPersistenceStrategy.execute(StandardFindRegistryPersistenceStrategy.java:277)
> >>>>     at com.wsgc.services.registry.registryservice.impl.service.StandardRegistryService.getRegistriesByProfileId(StandardRegistryService.java:327)
> >>>>     at com.wsgc.services.registry.webapp.impl.RegistryServicesController.getRegistriesByProfileId(RegistryServicesController.java:247)
> >>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>>>     at java.lang.reflect.Method.invoke(Method.java:597)
> >>>>     at org.springframework.web.bind.annotation.support.HandlerMethodInvoker.invokeHandlerMethod(HandlerMethodInvoker.java:175)
> >>>>     at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.invokeHandlerMethod(AnnotationMethodHandlerAdapter.java:421)
> >>>>     at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.handle(AnnotationMethodHandlerAdapter.java:409)
> >>>>     at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:774)
> >>>>     at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:719)
> >>>>     at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:644)
> >>>>     at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:549)
> >>>>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
> >>>>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
> >>>>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
> >>>>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> >>>>     at org.springframework.web.filter.HiddenHttpMethodFilter.doFilterInternal(HiddenHttpMethodFilter.java:77)
> >>>>     at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:76)
> >>>>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> >>>>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> >>>>     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> >>>>     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> >>>>     at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:563)
> >>>>     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
> >>>>     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> >>>>     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> >>>>     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
> >>>>     at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:190)
> >>>>     at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:291)
> >>>>     at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:774)
> >>>>     at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:703)
> >>>>     at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:896)
> >>>>     at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:690)
> >>>>     at java.lang.Thread.run(Thread.java:662)
> >>>> Caused by: java.net.SocketException: Broken pipe
> >>>>     at java.net.SocketOutputStream.socketWrite0(Native Method)
> >>>>     at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
> >>>>     at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> >>>>     at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
> >>>>     ... 47 more
> >>>> 2011-08-02 08:43:06,543 ERROR [me.prettyprint.cassandra.connection.HConnectionManager] - MARK HOST AS DOWN TRIGGERED for host cassandradevrk1(10.130.202.34):9393
> >>>> 2011-08-02 08:43:06,543 ERROR [me.prettyprint.cassandra.connection.HConnectionManager] - Pool state on shutdown: <ConcurrentCassandraClientPoolByHost>:{cassandradevrk1(10.130.202.34):9393}; IsActive?: true; Active: 1; Blocked: 0; Idle: 15; NumBeforeExhausted: 49
> >>>> 2011-08-02 08:43:06,543 ERROR [me.prettyprint.cassandra.connection.ConcurrentHClientPool] - Shutdown triggered on <ConcurrentCassandraClientPoolByHost>:{cassandradevrk1(10.130.202.34):9393}
> >>>> 2011-08-02 08:43:06,544 ERROR [me.prettyprint.cassandra.connection.ConcurrentHClientPool] - Shutdown complete on <ConcurrentCassandraClientPoolByHost>:{cassandradevrk1(10.130.202.34):9393}
> >>>> 2011-08-02 08:43:06,544 INFO [me.prettyprint.cassandra.connection.CassandraHostRetryService] - Host detected as down was added to retry queue: cassandradevrk1(10.130.202.34):9393
> >>>> 2011-08-02 08:43:06,544 WARN [me.prettyprint.cassandra.connection.HConnectionManager] - Could not fullfill request on this host CassandraClient<cassandradevrk1:9393-33>
> >>>>
> >>
> >>
> >
>
>

--000e0cd3a07a2ece6f04aa084060
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Tim do you know if this is the actual reason that is causing the broken pip=
e? I&#39;m having a hard time convincing my team that modifying this value =
will fix the issue.<div><br></div><div>Jonathan, do you know if there is a =
valid explanation on why Tim no longer has the problem based on this change=
?</div>
<div><br></div><div>Anthony Ikeda</div><div><br></div><div><br><br><div cla=
ss=3D"gmail_quote">On Thu, Aug 4, 2011 at 11:13 PM, Tim Snyder <span dir=3D=
"ltr">&lt;<a href=3D"mailto:tim@proinnovations.com">tim@proinnovations.com<=
/a>&gt;</span> wrote:<br>
<blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1p=
x #ccc solid;padding-left:1ex;">I no longer get the error on the loader pro=
gram. The steps I took to fix<br>
it are increasing the thrift_max_message_length_in_mb msg length,<br>
stopping cassandra, blowing away the prior data store, and then<br>
restarting cassandra.<br>
<div class=3D"im"><br>
Tim<br>
<br>
<br>
-------- Original Message --------<br>
Subject: Re: Trying to find the problem with a broken pipe<br>
</div><div><div></div><div class=3D"h5">From: aaron morton &lt;<a href=3D"m=
ailto:aaron@thelastpickle.com">aaron@thelastpickle.com</a>&gt;<br>
Date: Fri, August 05, 2011 12:58 am<br>
To: <a href=3D"mailto:user@cassandra.apache.org">user@cassandra.apache.org<=
/a><br>
<br>
It&#39;s probably a network thing.<br>
<br>
The only thing I can think of in cassandra is<br>
thrift_max_message_length_in_mb in the config. That config setting will<br>
result in a TException thrown on the server side (i think), not sure if<br>
that makes the server kill the socket. I would hope the error returns to<br=
>
the client.<br>
<br>
Perhaps check the server log.<br>
<br>
Cheers<br>
<br>
-----------------<br>
Aaron Morton<br>
Freelance Cassandra Developer<br>
@aaronmorton<br>
<a href=3D"http://www.thelastpickle.com" target=3D"_blank">http://www.thela=
stpickle.com</a><br>
<br>
On 4 Aug 2011, at 23:05, Tim Snyder wrote:<br>
<br>
&gt; I am getting the same problem (Broken Pipe) on a loader program, after=
<br>
&gt; about 8 million read, write pairs. I am pushing serialized objects int=
o<br>
&gt; a column with the program, the object it seems to be doing it on is mu=
ch<br>
&gt; larger than the prior objects, so I am wondering if it is possibly a<b=
r>
&gt; column size streaming issue through the thrift api? I am using Cassand=
ra<br>
&gt; 0.8.0 and Hector 0.8.0-1<br>
&gt;<br>
&gt; Tim<br>
&gt;<br>
&gt; -------- Original Message --------<br>
&gt; Subject: Re: Trying to find the problem with a broken pipe<br>
&gt; From: Anthony Ikeda &lt;<a href=3D"mailto:anthony.ikeda.dev@gmail.com"=
>anthony.ikeda.dev@gmail.com</a>&gt;<br>
&gt; Date: Tue, August 02, 2011 10:43 pm<br>
&gt; To: <a href=3D"mailto:user@cassandra.apache.org">user@cassandra.apache=
.org</a><br>
&gt;<br>
&gt;&gt; Very interesting. After the second host goes down do you see<br>
&gt;&gt; &quot;me.prettyprint.hector.api.exceptions.HectorException: All ho=
st pools<br>
&gt;&gt; marked down. Retry burden pushed out to client&quot;?<br>
&gt;<br>
&gt; No, the last message is:<br>
&gt; 2011-08-02 08:43:06,561 INFO<br>
&gt; [me.prettyprint.cassandra.connection.HConnectionManager] - Client<br>
&gt; CassandraClient&lt;cassandradevrk2:9393-49&gt; released to inactive or=
 dead<br>
&gt; pool. Closing.&gt; Does your client recover after a period of time?<br=
>
&gt;<br>
&gt;<br>
&gt;<br>
&gt; The application seems to be fine for now but my concern is the<br>
&gt; connection pooling as well - I mean do we have one pool or multiple?<b=
r>
&gt; I&#39;ll post to the Hector user group about the pooling because the<b=
r>
&gt; incident seems so isolated. We also have our infrastructure team looki=
ng<br>
&gt; into the communication between the application server and the cassandr=
a<br>
&gt; nodes.<br>
&gt;<br>
&gt;<br>
&gt; So far it&#39;s still a mystery.<br>
&gt;<br>
&gt;<br>
&gt;<br>
&gt;<br>
&gt;<br>
&gt; On Tue, Aug 2, 2011 at 1:25 PM, Jim Ancona &lt;<a href=3D"mailto:jim@a=
nconafamily.com">jim@anconafamily.com</a>&gt; wrote:<br>
&gt; On Tue, Aug 2, 2011 at 6:13 PM, Anthony Ikeda<br>
&gt; &lt;<a href=3D"mailto:anthony.ikeda.dev@gmail.com">anthony.ikeda.dev@g=
mail.com</a>&gt; wrote:<br>
&gt;<br>
&gt;&gt; The link (which I may be misreading)<br>
&gt;&gt; is<br>
&gt; <a href=3D"http://groups.google.com/group/hector-users/browse_thread/t=
hread/8d7004b6f85a0f2e" target=3D"_blank">http://groups.google.com/group/he=
ctor-users/browse_thread/thread/8d7004b6f85a0f2e</a><br>
&gt;<br>
&gt;<br>
&gt; I hadn&#39;t found that one, but I doubt that our issue is related to =
that.<br>
&gt;<br>
&gt;<br>
&gt;&gt; It&#39;s only started happening today and happened on 2 occassions=
 (8:43<br>
&gt; and<br>
&gt;&gt; 10:21) performing the same function (querying a column family).<br=
>
&gt;&gt; It seems to be trying to access a connection on one of the servers=
<br>
&gt;&gt; The client accesses the first node:<br>
&gt;&gt;<br>
&gt;&gt; 2011-08-02 08:43:06,541 ERROR<br>
&gt;&gt; [me.prettyprint.cassandra.connection.HThriftClient] - Could not fl=
ush<br>
&gt;&gt; transport (to be expected if the pool is shutting down) in close f=
or<br>
&gt; client:<br>
&gt;&gt; CassandraClient&lt;cassandradevrk1:9393-33&gt;<br>
&gt;&gt; org.apache.thrift.transport.TTransportException:<br>
&gt; java.net.SocketException:<br>
&gt;&gt; Broken pipe<br>
&gt;&gt;<br>
&gt;&gt; ...<br>
&gt;&gt; 2011-08-02 08:43:06,544 WARN<br>
&gt;&gt; [me.prettyprint.cassandra.connection.HConnectionManager] - Could n=
ot<br>
&gt;&gt; fullfill request on this host<br>
&gt; CassandraClient&lt;cassandradevrk1:9393-33&gt;<br>
&gt;&gt;<br>
&gt;&gt; ...<br>
&gt;&gt;<br>
&gt;&gt; 2011-08-02 08:43:06,543 ERROR<br>
&gt;&gt; [me.prettyprint.cassandra.connection.HConnectionManager] - MARK HO=
ST<br>
&gt; AS DOWN<br>
&gt;&gt; TRIGGERED for host cassandradevrk1(10.130.202.34):9393<br>
&gt;&gt; 2011-08-02 08:43:06,543 ERROR<br>
&gt;&gt; [me.prettyprint.cassandra.connection.HConnectionManager] - Pool st=
ate<br>
&gt; on<br>
&gt;&gt; shutdown:<br>
&gt;&gt;<br>
&gt; &lt;ConcurrentCassandraClientPoolByHost&gt;:{cassandradevrk1(10.130.20=
2.34):9393};<br>
&gt;&gt; IsActive?: true; Active: 1; Blocked: 0; Idle: 15; NumBeforeExhaust=
ed:<br>
&gt; 49<br>
&gt;&gt; 2011-08-02 08:43:06,543 ERROR<br>
&gt;&gt; [me.prettyprint.cassandra.connection.ConcurrentHClientPool] -<br>
&gt; Shutdown<br>
&gt;&gt; triggered on<br>
&gt;&gt;<br>
&gt; &lt;ConcurrentCassandraClientPoolByHost&gt;:{cassandradevrk1(10.130.20=
2.34):9393}<br>
&gt;&gt; 2011-08-02 08:43:06,544 ERROR<br>
&gt;&gt; [me.prettyprint.cassandra.connection.ConcurrentHClientPool] -<br>
&gt; Shutdown<br>
&gt;&gt; complete on<br>
&gt;&gt;<br>
&gt; &lt;ConcurrentCassandraClientPoolByHost&gt;:{cassandradevrk1(10.130.20=
2.34):9393}<br>
&gt;&gt; 2011-08-02 08:43:06,544 INFO<br>
&gt;&gt; [me.prettyprint.cassandra.connection.CassandraHostRetryService] -<=
br>
&gt; Host<br>
&gt;&gt; detected as down was added to retry queue:<br>
&gt;&gt; cassandradevrk1(10.130.202.34):9393<br>
&gt;&gt; 2011-08-02 08:43:06,544 WARN<br>
&gt;&gt; [me.prettyprint.cassandra.connection.HConnectionManager] - Could n=
ot<br>
&gt;&gt; fullfill request on this host CassandraClient&lt;<br>
&gt;&gt; cassandradevrk1:9393-33&gt;<br>
&gt;&gt; 2011-08-02 08:43:06,544 WARN<br>
&gt;&gt; [me.prettyprint.cassandra.connection.HConnectionManager] - Excepti=
on:<br>
&gt;&gt; me.prettyprint.hector.api.exceptions.HectorTransportException:<br>
&gt;&gt; org.apache.thrift.transport.TTransportException:<br>
&gt; java.net.SocketException:<br>
&gt;&gt; Connection reset<br>
&gt;&gt;<br>
&gt;&gt;<br>
&gt;&gt; Then it appears to try the second node and fails:<br>
&gt;&gt;<br>
&gt;&gt; 2011-08-02 08:43:06,556 INFO<br>
&gt;&gt; [me.prettyprint.cassandra.connection.HConnectionManager] - Client<=
br>
&gt;&gt; CassandraClient&lt;cassandradevrk1:9393-33&gt; released to inactiv=
e or dead<br>
&gt; pool.<br>
&gt;&gt; Closing.<br>
&gt;&gt; 2011-08-02 08:43:06,557 ERROR<br>
&gt;&gt; [me.prettyprint.cassandra.connection.HThriftClient] - Could not fl=
ush<br>
&gt;&gt; transport (to be expected if the pool is shutting down) in close f=
or<br>
&gt; client:<br>
&gt;&gt; CassandraClient&lt;cassandradevrk2:9393-49&gt;<br>
&gt;&gt;<br>
&gt;&gt; org.apache.thrift.transport.TTransportException:<br>
&gt; java.net.SocketException:<br>
&gt;&gt; Broken pipe<br>
&gt;&gt;<br>
&gt;&gt; 2011-08-02 08:43:06,558 ERROR<br>
&gt;&gt; [me.prettyprint.cassandra.connection.HConnectionManager] - MARK HO=
ST<br>
&gt; AS DOWN<br>
&gt;&gt; TRIGGERED for host cassandradevrk2(10.130.202.35):9393<br>
&gt;&gt; 2011-08-02 08:43:06,559 ERROR<br>
&gt;&gt; [me.prettyprint.cassandra.connection.HConnectionManager] - Pool st=
ate<br>
&gt; on<br>
&gt;&gt; shutdown:<br>
&gt;&gt;<br>
&gt; &lt;ConcurrentCassandraClientPoolByHost&gt;:{cassandradevrk2(10.130.20=
2.35):9393};<br>
&gt;&gt; IsActive?: true; Active: 1; Blocked: 0; Idle: 15; NumBeforeExhaust=
ed:<br>
&gt; 49<br>
&gt;&gt; 2011-08-02 08:43:06,559 ERROR<br>
&gt;&gt; [me.prettyprint.cassandra.connection.ConcurrentHClientPool] -<br>
&gt; Shutdown<br>
&gt;&gt; triggered on<br>
&gt;&gt;<br>
&gt; &lt;ConcurrentCassandraClientPoolByHost&gt;:{cassandradevrk2(10.130.20=
2.35):9393}<br>
&gt;&gt; 2011-08-02 08:43:06,559 ERROR<br>
&gt;&gt; [me.prettyprint.cassandra.connection.ConcurrentHClientPool] -<br>
&gt; Shutdown<br>
&gt;&gt; complete on<br>
&gt;&gt;<br>
&gt; &lt;ConcurrentCassandraClientPoolByHost&gt;:{cassandradevrk2(10.130.20=
2.35):9393}<br>
&gt;&gt; 2011-08-02 08:43:06,559 INFO<br>
&gt;&gt; [me.prettyprint.cassandra.connection.CassandraHostRetryService] -<=
br>
&gt; Host<br>
&gt;&gt; detected as down was added to retry queue:<br>
&gt;&gt; cassandradevrk2(10.130.202.35):9393<br>
&gt;&gt; 2011-08-02 08:43:06,560 WARN<br>
&gt;&gt; [me.prettyprint.cassandra.connection.HConnectionManager] - Could n=
ot<br>
&gt;&gt; fullfill request on this host<br>
&gt; CassandraClient&lt;cassandradevrk2:9393-49&gt;<br>
&gt;&gt; 2011-08-02 08:43:06,560 WARN<br>
&gt;&gt; [me.prettyprint.cassandra.connection.HConnectionManager] - Excepti=
on:<br>
&gt;&gt; me.prettyprint.hector.api.exceptions.HectorTransportException:<br>
&gt;&gt; org.apache.thrift.transport.TTransportException:<br>
&gt; java.net.SocketException:<br>
&gt;&gt; Connection reset<br>
&gt;<br>
&gt;<br>
&gt;<br>
&gt; Very interesting. After the second host goes down do you see<br>
&gt; &quot;me.prettyprint.hector.api.exceptions.HectorException: All host p=
ools<br>
&gt; marked down. Retry burden pushed out to client&quot;?<br>
&gt;<br>
&gt; Does your client recover after a period of time?<br>
&gt;<br>
&gt;&gt;<br>
&gt;&gt; The process is the same at 10:21.<br>
&gt;&gt; Are the exceptions related to any external events (e.g. node<br>
&gt;&gt; restarts, network issues...)?<br>
&gt;&gt; Not that I&#39;m aware, unless there are firewall timeouts between=
 the<br>
&gt;&gt; application and the node servers. Let me find out. The cassandra l=
og<br>
&gt; files<br>
&gt;&gt; have no errors reported.<br>
&gt;&gt; What versions of Hector and Cassandra are you running?<br>
&gt;&gt; Cassandra 0.8.1, Hector 0.8.0-1<br>
&gt;<br>
&gt;<br>
&gt; Our issue is occurring with Cassandra 0.7.8 and Hector 0.7-30. We plan=
<br>
&gt; to deploy Hector 0.7-31 this week and to turn on useSocketKeepalive.<b=
r>
&gt; Are you using that? We&#39;re also using tcpdump to capture packets wh=
en<br>
&gt; failures occur to see if there are anomalies in the network traffic.<b=
r>
&gt;<br>
&gt; Jim<br>
&gt;<br>
&gt;<br>
&gt;<br>
&gt;&gt;<br>
&gt;&gt;<br>
&gt;&gt;<br>
&gt;&gt; On Tue, Aug 2, 2011 at 10:37 AM, Jim Ancona &lt;<a href=3D"mailto:=
jim@anconafamily.com">jim@anconafamily.com</a>&gt;<br>
&gt; wrote:<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; On Tue, Aug 2, 2011 at 4:36 PM, Anthony Ikeda<br>
&gt;&gt;&gt; &lt;<a href=3D"mailto:anthony.ikeda.dev@gmail.com">anthony.ike=
da.dev@gmail.com</a>&gt; wrote:<br>
&gt;&gt;&gt;&gt; I&#39;m not sure if this is a problem with Hector or with =
Cassandra.<br>
&gt;&gt;&gt;&gt; We seem to be seeing broken pipe issues with our connectio=
ns on the client<br>
&gt;&gt;&gt;&gt; side (Exception below). A bit of googling finds possibly a=
 problem with the<br>
&gt;&gt;&gt;&gt; amount of data we are trying to store, although I&#39;m ce=
rtain our datasets are<br>
&gt;&gt;&gt;&gt; not all that large.<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; I&#39;m not sure what you&#39;re referring to here. Large requ=
ests could lead<br>
&gt;&gt;&gt; to timeouts, but that&#39;s not what you&#39;re seeing here. C=
ould you link to<br>
&gt;&gt;&gt; the page you&#39;re referencing?<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt;&gt; A nodetool ring command doesn&#39;t seem to present any do=
wned nodes:<br>
&gt;&gt;&gt;&gt; Address DC Rack Status State Load Owns Token<br>
&gt;&gt;&gt;&gt; 153951716904446304929228999025275230571<br>
&gt;&gt;&gt;&gt; 10.130.202.34 datacenter1 rack1 Up Normal 470.74 KB 79.19%=
 118538200848404459763384037192174096102<br>
&gt;&gt;&gt;&gt; 10.130.202.35 datacenter1 rack1 Up Normal 483.63 KB 20.81%=
 153951716904446304929228999025275230571<br>
&gt;&gt;&gt;&gt;<br>
&gt;&gt;&gt;&gt; There are no errors in the cassandra server logs.<br>
&gt;&gt;&gt;&gt;<br>
&gt;&gt;&gt;&gt; Are there any particular timeouts on connections that we n=
eed to be aware<br>
&gt;&gt;&gt;&gt; of? Or perhaps configure on the Cassandra nodes? Is this p=
urely an issue<br>
&gt;&gt;&gt;&gt; with the Hector API configuration?<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; There is a server side timeout (rpc_timeout_in_ms in cassandra=
.yaml)<br>
&gt;&gt;&gt; and a Hector client-side timeout<br>
&gt;&gt;&gt; (CassandraHostConfigurator.cassandraThriftSocketTimeout). But =
again,<br>
&gt;&gt;&gt; the &quot;Broken pipe&quot; error is not a timeout; it indicat=
es that something<br>
&gt;&gt;&gt; happened to the underlying network socket. For example, you wi=
ll see<br>
&gt;&gt;&gt; those when a server node is restarted.<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; Some questions that might help troubleshoot this:<br>
&gt;&gt;&gt; How often are these occurring?<br>
&gt;&gt;&gt; Does this affect both nodes in the cluster or just one?<br>
&gt;&gt;&gt; Are the exceptions related to any external events (e.g. node r=
estarts,<br>
&gt;&gt;&gt; network issues...)?<br>
&gt;&gt;&gt; What versions of Hector and Cassandra are you running?<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; Keep in mind that failures like this will normally be retried =
by<br>
&gt;&gt;&gt; Hector, resulting in no loss of data. For that reason, I think=
 that<br>
&gt;&gt;&gt; exception is logged as a warning in the newest Hector versions=
.<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; We&#39;ve seen something similar, but more catastrophic becaus=
e it affects<br>
&gt;&gt;&gt; connectivity to the entire cluster, not just a single node. Se=
e this<br>
&gt;&gt;&gt; post for more details: <a href=3D"http://goo.gl/hrgkw" target=
=3D"_blank">http://goo.gl/hrgkw</a>. So far we haven&#39;t<br>
&gt;&gt;&gt; identified the cause.<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; Jim<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt;&gt; Anthony<br>
&gt;&gt;&gt;&gt;<br>
&gt;&gt;&gt;&gt; 2011-08-02 08:43:06,541 ERROR<br>
&gt;&gt;&gt;&gt; [me.prettyprint.cassandra.connection.HThriftClient] - Coul=
d not flush transport (to be expected if the pool is shutting down) in=
 close for client: CassandraClient&lt;cassandradevrk1:9393-33&gt;<br>
&gt;&gt;&gt;&gt; org.apache.thrift.transport.TTransportException:=
 java.net.SocketException: Broken pipe<br>
&gt;&gt;&gt;&gt; at =
org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTranspor=
t.java:147)<br>
&gt;&gt;&gt;&gt; at =
org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.ja=
va:156)<br>
&gt;&gt;&gt;&gt; at =
me.prettyprint.cassandra.connection.HThriftClient.close(HThriftClient.=
java:85)<br>
&gt;&gt;&gt;&gt; at =
me.prettyprint.cassandra.connection.HConnectionManager.operateWithFail=
over(HConnectionManager.java:232)<br>
&gt;&gt;&gt;&gt; at =
me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailov=
er(KeyspaceServiceImpl.java:131)<br>
&gt;&gt;&gt;&gt; at =
me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(Keyspace=
ServiceImpl.java:289)<br>
&gt;&gt;&gt;&gt; at =
me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(=
ThriftSliceQuery.java:53)<br>
&gt;&gt;&gt;&gt; at =
me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(=
ThriftSliceQuery.java:49)<br>
&gt;&gt;&gt;&gt; at =
me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceA=
ndMeasure(KeyspaceOperationCallback.java:20)<br>
&gt;&gt;&gt;&gt; at =
me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKe=
yspace.java:85)<br>
&gt;&gt;&gt;&gt; at =
me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftS=
liceQuery.java:48)<br>
&gt;&gt;&gt;&gt; at =
com.wsgc.services.registry.persistenceservice.impl.cassandra.strategy.=
read.StandardFindRegistryPersistenceStrategy.findRegistryByProfileId(Standa=
rdFindRegistryPersistenceStrategy.java:237)<br>
&gt;&gt;&gt;&gt; at =
com.wsgc.services.registry.persistenceservice.impl.cassandra.strategy.=
read.StandardFindRegistryPersistenceStrategy.execute(StandardFindRegistryPe=
rsistenceStrategy.java:277)<br>
&gt;&gt;&gt;&gt; at =
com.wsgc.services.registry.registryservice.impl.service.StandardRegist=
ryService.getRegistriesByProfileId(StandardRegistryService.java:327)<br>
&gt;&gt;&gt;&gt; at =
com.wsgc.services.registry.webapp.impl.RegistryServicesController.getR=
egistriesByProfileId(RegistryServicesController.java:247)<br>
&gt;&gt;&gt;&gt; at sun.reflect.NativeMethodAccessorImpl.invoke0(Native=
 Method)<br>
&gt;&gt;&gt;&gt; at =
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j=
ava:39)<br>
&gt;&gt;&gt;&gt; at =
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess=
orImpl.java:25)<br>
&gt;&gt;&gt;&gt; at java.lang.reflect.Method.invoke(Method.java:597)<br>
&gt;&gt;&gt;&gt; at =
org.springframework.web.bind.annotation.support.HandlerMethodInvoker.i=
nvokeHandlerMethod(HandlerMethodInvoker.java:175)<br>
&gt;&gt;&gt;&gt; at =
org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandler=
Adapter.invokeHandlerMethod(AnnotationMethodHandlerAdapter.java:421)<br>
&gt;&gt;&gt;&gt; at =
org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandler=
Adapter.handle(AnnotationMethodHandlerAdapter.java:409)<br>
&gt;&gt;&gt;&gt; at =
org.springframework.web.servlet.DispatcherServlet.doDispatch(Dispatche=
rServlet.java:774)<br>
&gt;&gt;&gt;&gt; at =
org.springframework.web.servlet.DispatcherServlet.doService(Dispatcher=
Servlet.java:719)<br>
&gt;&gt;&gt;&gt; at =
org.springframework.web.servlet.FrameworkServlet.processRequest(Framew=
orkServlet.java:644)<br>
&gt;&gt;&gt;&gt; at =
org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServle=
t.java:549)<br>
&gt;&gt;&gt;&gt; at =
javax.servlet.http.HttpServlet.service(HttpServlet.java:617)<br>
&gt;&gt;&gt;&gt; at =
javax.servlet.http.HttpServlet.service(HttpServlet.java:717)<br>
&gt;&gt;&gt;&gt; at =
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Appli=
cationFilterChain.java:290)<br>
&gt;&gt;&gt;&gt; at =
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFi=
lterChain.java:206)<br>
&gt;&gt;&gt;&gt; at =
org.springframework.web.filter.HiddenHttpMethodFilter.doFilterInternal=
(HiddenHttpMethodFilter.java:77)<br>
&gt;&gt;&gt;&gt; at =
org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRe=
questFilter.java:76)<br>
&gt;&gt;&gt;&gt; at =
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Appli=
cationFilterChain.java:235)<br>
&gt;&gt;&gt;&gt; at =
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFi=
lterChain.java:206)<br>
&gt;&gt;&gt;&gt; at =
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperVa=
lve.java:233)<br>
&gt;&gt;&gt;&gt; at =
org.apache.catalina.core.StandardContextValve.invoke(StandardContextVa=
lve.java:191)<br>
&gt;&gt;&gt;&gt; at =
org.apache.catalina.authenticator.AuthenticatorBase.invoke(Authenticat=
orBase.java:563)<br>
&gt;&gt;&gt;&gt; at =
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.ja=
va:127)<br>
&gt;&gt;&gt;&gt; at =
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.ja=
va:102)<br>
&gt;&gt;&gt;&gt; at =
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValv=
e.java:109)<br>
&gt;&gt;&gt;&gt; at =
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java=
:298)<br>
&gt;&gt;&gt;&gt; at =
org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:190)<br>
&gt;&gt;&gt;&gt; at =
org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:291)<br>
&gt;&gt;&gt;&gt; at =
org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.ja=
va:774)<br>
&gt;&gt;&gt;&gt; at =
org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.jav=
a:703)<br>
&gt;&gt;&gt;&gt; at =
org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocke=
t.java:896)<br>
&gt;&gt;&gt;&gt; at =
org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPo=
ol.java:690)<br>
&gt;&gt;&gt;&gt; at java.lang.Thread.run(Thread.java:662)<br>
&gt;&gt;&gt;&gt; Caused by: java.net.SocketException: Broken pipe<br>
&gt;&gt;&gt;&gt; at java.net.SocketOutputStream.socketWrite0(Native Method)=
<br>
&gt;&gt;&gt;&gt; at =
java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)<br>
&gt;&gt;&gt;&gt; at =
java.net.SocketOutputStream.write(SocketOutputStream.java:136)<br>
&gt;&gt;&gt;&gt; at =
org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTranspor=
t.java:145)<br>
&gt;&gt;&gt;&gt; ... 47 more<br>
&gt;&gt;&gt;&gt; 2011-08-02 08:43:06,543 ERROR<br>
&gt;&gt;&gt;&gt; [me.prettyprint.cassandra.connection.HConnectionManager] -=
 MARK HOST AS DOWN TRIGGERED for host=
 cassandradevrk1(10.130.202.34):9393<br>
&gt;&gt;&gt;&gt; 2011-08-02 08:43:06,543 ERROR<br>
&gt;&gt;&gt;&gt; [me.prettyprint.cassandra.connection.HConnectionManager] -=
 Pool state on shutdown:=
 &lt;ConcurrentCassandraClientPoolByHost&gt;:{cassandradevrk1(10.130.20=
2.34):9393};=
 IsActive?: true; Active: 1; Blocked: 0; Idle: 15;=
 NumBeforeExhausted: 49<br>
&gt;&gt;&gt;&gt; 2011-08-02 08:43:06,543 ERROR<br>
&gt;&gt;&gt;&gt; [me.prettyprint.cassandra.connection.ConcurrentHClientPool=
] - Shutdown triggered on=
 &lt;ConcurrentCassandraClientPoolByHost&gt;:{cassandradevrk1(10.130.20=
2.34):9393}<br>
&gt;&gt;&gt;&gt; 2011-08-02 08:43:06,544 ERROR<br>
&gt;&gt;&gt;&gt; [me.prettyprint.cassandra.connection.ConcurrentHClientPool=
] - Shutdown complete on=
 &lt;ConcurrentCassandraClientPoolByHost&gt;:{cassandradevrk1(10.130.20=
2.34):9393}<br>
&gt;&gt;&gt;&gt; 2011-08-02 08:43:06,544 INFO<br>
&gt;&gt;&gt;&gt; [me.prettyprint.cassandra.connection.CassandraHostRetrySer=
vice] - Host detected as down was added to retry queue:=
 cassandradevrk1(10.130.202.34):9393<br>
&gt;&gt;&gt;&gt; 2011-08-02 08:43:06,544 WARN<br>
&gt;&gt;&gt;&gt; [me.prettyprint.cassandra.connection.HConnectionManager] -=
 Could not fullfill request on this host=
 CassandraClient&lt;cassandradevrk1:9393-33&gt;<br>
&gt;&gt;&gt;&gt;<br>
&gt;&gt;<br>
&gt;&gt;<br>
&gt;<br>
<br>
</div></div></blockquote></div><br></div>

--000e0cd3a07a2ece6f04aa084060--