cassandra-user mailing list archives

From Alain RODRIGUEZ <arodr...@gmail.com>
Subject Re: Error after 1.2.0 upgrade
Date Fri, 04 Jan 2013 09:57:02 GMT
I use the same process as Aaron, and I think that after disabling gossip
and thrift and draining the node, nothing more is written to this node,
which is considered down. But if you are using counters, then while drain
is still broken I would also consider emptying the commit logs after
stopping Cassandra, to avoid recounting.
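
A cautious variant of "emptying the commit logs" is to move the segments
aside rather than delete them, so they can still be inspected later. The
sketch below is only an illustration: the function name and both directory
arguments are assumptions, not something taken from the package scripts,
and it should only be called once the Cassandra process has really exited.

```shell
# Hypothetical helper: move every commit log segment from $1 to $2.
# Moving (not deleting) is a deliberate, more conservative choice.
archive_commitlogs() {
    src="$1"
    dest="$2"

    # Refuse to continue if the source directory does not exist.
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    mkdir -p "$dest" || return 1

    for f in "$src"/*; do
        # Skip the unexpanded pattern when the directory is empty.
        [ -e "$f" ] || continue
        mv "$f" "$dest"/ || return 1
    done
}
```

With a default package layout this might be called as
archive_commitlogs /var/lib/cassandra/commitlog /var/lib/cassandra/commitlog.saved
but both paths are assumptions; check commitlog_directory in cassandra.yaml.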

I use an Ubuntu package install. Why doesn't "service cassandra stop" use
one of these "best practice" shutdowns? The script could perform all the
steps Aaron describes automatically, couldn't it?

Here is the ticket: https://issues.apache.org/jira/browse/CASSANDRA-5111
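
As a rough sketch of what such a stop wrapper might look like, the function
below chains the same nodetool steps as the script quoted further down. The
function names and the DRY_RUN and PAUSE variables are assumptions made for
illustration, and the final step uses "service cassandra stop" where the
quoted script uses monit. It does nothing about the fat-client concern or
the drain bug discussed in this thread.

```shell
# Run a command, or only print it when DRY_RUN=1 (hypothetical switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical wrapper: disable thrift and gossip, drain, then stop.
graceful_stop() {
    host="${1:-localhost}"
    pause="${PAUSE:-10}"   # seconds to wait between steps

    # Stop incoming client requests and leave gossip.
    run nodetool -h "$host" disablethrift || return 1
    run nodetool -h "$host" disablegossip || return 1
    run sleep "$pause"

    # Ask the node to flush its memtables.
    run nodetool -h "$host" drain || return 1
    run sleep "$pause"

    # Finally stop the service itself.
    run service cassandra stop
}
```

Running it first with DRY_RUN=1 prints the commands without executing
them, which is a cheap way to check the sequence before using it on a node.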
On 3 Jan 2013 at 21:31, "Edward Capriolo" <edlinuxguru@gmail.com> wrote:

> There is a danger here: disablethrift and disablegossip do not stop the fat
> client.
>
> On Thu, Jan 3, 2013 at 3:07 PM, aaron morton <aaron@thelastpickle.com> wrote:
>
>> This is what I do to shut down. Disabling thrift and gossip will stop
>> incoming requests, but it won't stop existing streams. However, these do
>> not go through the commit log.
>>
>> echo "Disabling thrift and gossip..."
>> nodetool -h localhost disablethrift;
>> nodetool -h localhost disablegossip;
>>
>> echo "Sleeping for 10..."
>> sleep 10;
>>
>> echo "Drain..."
>> nodetool -h localhost drain;
>>
>> echo "Sleeping for 10..."
>> sleep 10;
>>
>> echo "Stopping..."
>> sudo monit stop cassandra;
>>
>> A
>>
>>    -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> New Zealand
>>
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 4/01/2013, at 9:02 AM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
>>
>> The only true drain is
>> 1) turn on ip tables to stop all incoming traffic
>> 2) flush
>> 3) wait
>> 4) delete files
>> 5) upgrade
>> 6) restart
>>
>>
>> On Thu, Jan 3, 2013 at 2:59 PM, Michael Kjellman
>> <mkjellman@barracuda.com> wrote:
>>
>>> That's why I didn't create a ticket: I knew there was one. But I
>>> thought this had been fixed in 1.1.7?
>>>
>>> From: Edward Capriolo <edlinuxguru@gmail.com>
>>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>> Date: Thursday, January 3, 2013 11:57 AM
>>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>> Subject: Re: Error after 1.2.0 upgrade
>>>
>>> There is a bug filed for this; drain has been in a weird state for a long
>>> time. In 1.0 it did not work, and that was labeled as a known limitation.
>>>
>>> https://issues.apache.org/jira/browse/CASSANDRA-4446
>>>
>>>
>>>
>>> On Thu, Jan 3, 2013 at 2:49 PM, Michael Kjellman <
>>> mkjellman@barracuda.com> wrote:
>>>
>>>>  Another thing: for those that use counters this might be a problem.
>>>>
>>>> I always do a nodetool drain before upgrading a node (as is good
>>>> practice btw). However, in every case on every one of my nodes, the commit
>>>> log was replayed on each node and mutations were created. Could lead to
>>>> double counting of counters…
>>>>
>>>> No bug for that yet
>>>>
>>>> Best,
>>>> Michael
>>>>
>>>> From: Michael Kjellman <mkjellman@barracuda.com>
>>>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>> Date: Thursday, January 3, 2013 11:42 AM
>>>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>> Subject: Re: Error after 1.2.0 upgrade
>>>>
>>>> Tracking Issues:
>>>>
>>>> https://issues.apache.org/jira/browse/CASSANDRA-5101
>>>> https://issues.apache.org/jira/browse/CASSANDRA-5104 which was created
>>>> because of https://issues.apache.org/jira/browse/CASSANDRA-5103
>>>> https://issues.apache.org/jira/browse/CASSANDRA-5102
>>>>
>>>> Also friendly reminder to all that cql2 created indexes will not work
>>>> with cql3. You need to drop them and recreate in cql3, otherwise you'll see
>>>> rpc_timeout issues.
>>>>
>>>> I'll update with more issues as I see them. The fun bugs never happen
>>>> in your dev environment do they :)
>>>>
>>>> From: aaron morton <aaron@thelastpickle.com>
>>>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>> Date: Thursday, January 3, 2013 11:38 AM
>>>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>> Subject: Re: Error after 1.2.0 upgrade
>>>>
>>>> Michael,
>>>> Could you share some of your problems? It may be of help to others.
>>>>
>>>> Cheers
>>>>
>>>>   -----------------
>>>> Aaron Morton
>>>> Freelance Cassandra Developer
>>>> New Zealand
>>>>
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>>
>>>> On 4/01/2013, at 5:45 AM, Michael Kjellman <mkjellman@barracuda.com>
>>>> wrote:
>>>>
>>>> I'm having huge upgrade issues from 1.1.7 -> 1.2.0 atm but in a 12 node
>>>> cluster which I am slowly massaging into a good state I haven't seen this
>>>> in 15+ hours of operation…
>>>>
>>>> This looks related to JNA?
>>>>
>>>> From: Alain RODRIGUEZ <arodrime@gmail.com>
>>>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>> Date: Thursday, January 3, 2013 8:42 AM
>>>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>> Subject: Error after 1.2.0 upgrade
>>>>
>>>> In a dev env, C* 1.1.7 -> 1.2.0, 1 node.
>>>>
>>>> I run Cassandra in a 8GB memory environment.
>>>>
>>>> The upgrade went well, but I sometimes have the following error:
>>>>
>>>> INFO 17:31:04,143 Node /192.168.100.201 state jump to normal
>>>>  INFO 17:31:04,149 Enqueuing flush of Memtable-local@1654799672(32/32
>>>> serialized/live bytes, 2 ops)
>>>>  INFO 17:31:04,149 Writing Memtable-local@1654799672(32/32
>>>> serialized/live bytes, 2 ops)
>>>>  INFO 17:31:04,371 Completed flushing
>>>> /home/stockage/cassandra/data/system/local/system-local-ia-12-Data.db (91
>>>> bytes) for commitlog position ReplayPosition(segmentId=1357230649515,
>>>> position=49584)
>>>>  INFO 17:31:04,376 Startup completed! Now serving reads.
>>>>  INFO 17:31:04,798 Compacted to
>>>> [/var/lib/cassandra/data/system/local/system-local-ia-13-Data.db,].  950 to
>>>> 471 (~49% of original) bytes for 1 keys at 0,000507MB/s.  Time: 886ms.
>>>>  INFO 17:31:04,889 mx4j successfuly loaded
>>>> HttpAdaptor version 3.0.2 started on port 8081
>>>>  INFO 17:31:04,967 Not starting native transport as requested. Use JMX
>>>> (StorageService->startNativeTransport()) to start it
>>>>  INFO 17:31:04,980 Binding thrift service to /0.0.0.0:9160
>>>>  INFO 17:31:05,007 Using TFramedTransport with a max frame size of
>>>> 15728640 bytes.
>>>>  INFO 17:31:09,964 Using synchronous/threadpool thrift server on
>>>> 0.0.0.0 : 9160
>>>>  INFO 17:31:09,965 Listening for thrift clients...
>>>> *** java.lang.instrument ASSERTION FAILED ***: "!errorOutstanding" with
>>>> message transform method call failed at
>>>> ../../../src/share/instrument/JPLISAgent.c line: 806
>>>> ERROR 17:33:56,002 Exception in thread Thread[Thrift:1702,5,main]
>>>> java.lang.StackOverflowError
>>>>         at java.net.SocketInputStream.socketRead0(Native Method)
>>>>         at java.net.SocketInputStream.read(Unknown Source)
>>>>         at java.io.BufferedInputStream.fill(Unknown Source)
>>>>         at java.io.BufferedInputStream.read1(Unknown Source)
>>>>         at java.io.BufferedInputStream.read(Unknown Source)
>>>>         at
>>>> org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
>>>>         at
>>>> org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>>>>         at
>>>> org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
>>>>         at
>>>> org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
>>>>         at
>>>> org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>>>>         at
>>>> org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
>>>>         at
>>>> org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
>>>>         at
>>>> org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
>>>>         at
>>>> org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:22)
>>>>         at
>>>> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:199)
>>>>         at
>>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown
>>>> Source)
>>>>         at java.lang.Thread.run(Unknown Source)
>>>>
>>>> ----------------------------------
>>>> Join Barracuda Networks in the fight against hunger.
>>>> To learn how you can help in your community, please visit:
>>>> http://on.fb.me/UAdL4f
>>>>
>>>
>>>
>>>
>>
>>
>>
>
