From: aaron morton
To: user@cassandra.apache.org
Subject: Re: Node forgets about most of its column families
Date: Wed, 29 Aug 2012 21:44:39 +1200

> But the following nodetool repair crashes. It has to be stopped and then re-started.
How did it crash?

> Are there any suggestions for logging or similar so that we can get a clue next time this happens.
Can you make the logs from #5 available?

If you feel you can describe the situation please create a ticket on https://issues.apache.org/jira/browse/CASSANDRA
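
If it helps when gathering the logs, a minimal Python sketch for pulling ERROR entries (with any stack trace lines that follow) out of a node's system.log could look like this. It assumes the stock log4j layout, where each entry starts with the level followed by the thread name in square brackets; the function name `extract_errors` is made up for illustration, and the pattern will need adjusting if the layout was changed.

```python
import re

# Assumption: each log entry starts with "LEVEL [thread] ..." as in the
# default log4j pattern; anything else is treated as a continuation line.
ENTRY_START = re.compile(r"^\s*(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)\s+\[")


def extract_errors(log_text, levels=("ERROR", "FATAL")):
    """Return log entries at the given levels, each with its continuation lines."""
    entries = []
    current = None  # lines of the matching entry being collected, or None
    for line in log_text.splitlines():
        match = ENTRY_START.match(line)
        if match:
            # A new entry begins: flush the previous matching entry, if any.
            if current is not None:
                entries.append("\n".join(current))
            current = [line] if match.group(1) in levels else None
        elif current is not None:
            # Continuation (e.g. a stack trace frame) of a matching entry.
            current.append(line)
    if current is not None:
        entries.append("\n".join(current))
    return entries
```

Reading the log file into a string and passing it to `extract_errors` gives a list that can be attached to a ticket; keeping the stack trace with each entry is the main point, since a plain grep for ERROR would drop it.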

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 29/08/2012, at 8:38 AM, Edward Sargisson <edward.sargisson@globalrelay.net> wrote:

> For the record, we just had a recurrence of this.
> This time, when the node (#5) came back it didn't properly rejoin the ring.
> We stopped every node and brought them back one by one to get the ring to link up correctly.
> Then, all the even nodes (#2, #4, #6) had out-of-date schemas.
>
> nodetool resetlocalschema works.
> But the following nodetool repair crashes. It has to be stopped and then re-started.
>
> Are there any suggestions for logging or similar so that we can get a clue next time this happens.
>
> Cheers,
> Edward
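
For detecting the schema mismatch described above, one option is to compare the schema versions each node reports. The sketch below is an illustration under stated assumptions, not part of the original exchange: it parses text shaped like the "Schema versions:" section printed by `describe cluster;` in cassandra-cli (one `version: [hosts]` line per version, with an `UNREACHABLE` entry for nodes that did not answer) and flags more than one live version. The function names are hypothetical, and the exact output format should be checked against the Cassandra version in use.

```python
import re

# Assumed line shape inside the "Schema versions:" section:
#   <version-id>: [host1, host2, ...]
VERSION_LINE = re.compile(r"^\s*([0-9A-Za-z-]+):\s*\[(.*)\]\s*$")


def parse_schema_versions(describe_output):
    """Map each reported schema version id to the list of hosts on it."""
    versions = {}
    in_section = False
    for line in describe_output.splitlines():
        if line.strip().startswith("Schema versions"):
            in_section = True
            continue
        if not in_section:
            continue  # skip snitch/partitioner lines before the section
        match = VERSION_LINE.match(line)
        if match:
            hosts = [h.strip() for h in match.group(2).split(",") if h.strip()]
            versions[match.group(1)] = hosts
    return versions


def has_schema_disagreement(versions):
    """True when reachable nodes report more than one schema version."""
    # Assumption: unreachable nodes are grouped under an "UNREACHABLE" key,
    # which is not a real schema version and is ignored here.
    live = [v for v in versions if v != "UNREACHABLE"]
    return len(live) > 1
```

A monitoring job could run this periodically and alert when `has_schema_disagreement` returns True, which would at least give a timestamp to line up against the logs.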


> On 12-08-24 11:18 AM, Edward Sargisson wrote:
>> Sadly, I don't think we can get much.
>>
>> All I know about the repro is that it was around a node restart. I've just tried that and everything's fine. I see no ERROR level messages in the logs.
>>
>> Clearly, some other conditions are required but we don't know them as yet.
>>
>> Many thanks,
>> Edward


>> On 12-08-24 03:29 AM, aaron morton wrote:
>>> If this is still a test environment can you try to reproduce the fault? Or provide some more details on the sequence of events?
>>>
>>> If you still have the logs around can you see if any ERROR level messages were logged?
>>>
>>> Cheers
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton

>>> On 24/08/2012, at 8:33 AM, Edward Sargisson <edward.sargisson@globalrelay.net> wrote:
>>>
>>>> Ah, yes, I forgot that bit, thanks!
>>>>
>>>> 1.1.2 running on CentOS.
>>>>
>>>> Running nodetool resetlocalschema then nodetool repair fixed the problem, but not understanding what happened is a concern.
>>>>
>>>> Cheers,
>>>> Edward


>>>> On 12-08-23 12:40 PM, Rob Coli wrote:
>>>>> On Thu, Aug 23, 2012 at 11:47 AM, Edward Sargisson
>>>>> <edward.sargisson@globalrelay.net> wrote:
>>>>>> I was wondering if anybody had seen the following behaviour before and how
>>>>>> we might detect it and keep the application running.
>>>>> I don't know the answer to your problem, but anyone who does will want
>>>>> to know in what version of Cassandra you are encountering this issue.
>>>>> :)
>>>>>
>>>>> =Rob


>>>> --
>>>> Edward Sargisson
>>>> senior java developer
>>>> Global Relay
>>>>
>>>> edward.sargisson@globalrelay.net
>>>>
>>>> 866.484.6630
>>>> New York | Chicago | Vancouver | London (+44.0800.032.9829) | Singapore (+65.3158.1301)
>>>>
>>>> Global Relay Archive supports email, instant messaging, BlackBerry, Bloomberg, Thomson Reuters, Pivot, YellowJacket, LinkedIn, Twitter, Facebook and more.
>>>>
>>>> Ask about Global Relay Message - The Future of Collaboration in the Financial Services World
>>>>
>>>> All email sent to or from this address will be retained by Global Relay's email archiving system. This message is intended only for the use of the individual or entity to which it is addressed, and may contain information that is privileged, confidential, and exempt from disclosure under applicable law. Global Relay will not be liable for any compliance or technical information provided herein. All trademarks are the property of their respective owners.






