incubator-cassandra-user mailing list archives

From: Nate McCall <zznat...@gmail.com>
Subject: Re: System hints compaction stuck
Date: Wed, 07 Aug 2013 18:58:15 GMT
Is there anything else on the network that could be attempting to
connect to 9160?

That is the exact error you would get when something opens a connection
and sends a stray byte (or anything short of a complete Thrift frame).
You can reproduce it like so:
echo -n 'm' | nc localhost 9160
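
If it's not obvious what that might be, something like the following will
show who currently has connections open to the Thrift port on a node
(assuming Linux; lsof may need root, and the second command just counts
connections per remote host):

sudo lsof -nP -iTCP:9160 -sTCP:ESTABLISHED
netstat -tn | awk '$4 ~ /:9160$/ {print $5}' | cut -d: -f1 | sort | uniq -c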


On Wed, Aug 7, 2013 at 11:11 AM, David McNelis <dmcnelis@gmail.com> wrote:
> Nate,
>
> We had a node that was flaking on us last week and had a lot of handoffs
> fail to that node.  We ended up decommissioning that node entirely.  I can't
> find the actual error we were getting at the time (logs have been rotated
> out), but currently we're not seeing any errors there.
>
> We haven't had any schema updates recently and we are using the sync rpc
> server.  We had hsha turned on for a while, but we were getting a bunch of
> transport frame size errors.
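>
> As far as I know the frame limit behind those errors comes from
> thrift_framed_transport_size_in_mb in cassandra.yaml; to double-check the
> current value (path assumes a package install, adjust for yours):
>
> grep thrift_framed_transport_size_in_mb /etc/cassandra/cassandra.yaml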
>
>
> On Wed, Aug 7, 2013 at 1:55 PM, Nate McCall <zznate.m@gmail.com> wrote:
>>
>> Thrift and ClientState are both unrelated to hints.
>>
>> What do you see in the logs after "Started hinted handoff for
>> host:..." from HintedHandoffManager?
>>
>> It should either have an error message or something along the lines of
>> "Finished hinted handoff of:..."
>>
>> Were there any schema updates that preceded this happening?
>>
>> As for the thrift stuff, which rpc_server_type are you using?
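>>
>> You can confirm that straight from the yaml, e.g. assuming a package
>> install:
>>
>> grep rpc_server_type /etc/cassandra/cassandra.yaml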
>>
>>
>>
>> On Wed, Aug 7, 2013 at 6:14 AM, David McNelis <dmcnelis@gmail.com> wrote:
>> > Morning folks,
>> >
>> > For the last couple of days all of my nodes (17, all running 1.2.8) have
>> > been stuck at various percentages of completion for compacting
>> > system.hints.
>> > I've tried restarting the nodes (including a full rolling restart of the
>> > cluster) to no avail.
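>> >
>> > (If anyone wants to check the same thing, something like this per node
>> > shows it; compactionstats lists the stuck system.hints task and tpstats
>> > the HintedHandoff stage:
>> >
>> > nodetool -h <hostname> compactionstats
>> > nodetool -h <hostname> tpstats
>> > )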
>> >
>> > When I turn on debug logging I see this error constantly on all of the
>> > nodes:
>> >
>> > DEBUG 09:03:21,999 Thrift transport error occurred during processing of message.
>> > org.apache.thrift.transport.TTransportException
>> >         at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
>> >         at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>> >         at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
>> >         at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
>> >         at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>> >         at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
>> >         at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
>> >         at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
>> >         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:22)
>> >         at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:199)
>> >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> >         at java.lang.Thread.run(Thread.java:724)
>> >
>> >
>> > When I turn on tracing, I see that shortly after this error there is a
>> > message similar to:
>> >
>> > TRACE 09:03:22,000 ClientState removed for socket addr /10.55.56.211:35431
>> >
>> > The IP in this message is sometimes a client machine, sometimes another
>> > cassandra node with no processes other than C* running on it (which I
>> > think rules out an issue with a particular client library doing something
>> > funny with Thrift).
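>> >
>> > (Where one of those addresses is still connected, mapping the ephemeral
>> > port back to a process on that machine narrows it down further, e.g. on
>> > 10.55.56.211, assuming Linux and root:
>> >
>> > netstat -tnp | grep :35431
>> > )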
>> >
>> > While I wouldn't expect a Thrift issue to cause problems with compaction,
>> > I'm out of other ideas at the moment.  Anyone have any thoughts they could
>> > share?
>> >
>> > Thanks,
>> > David
>
>
