incubator-cassandra-user mailing list archives

From Yan Chunlu <springri...@gmail.com>
Subject Re: with proof Re: cassandra goes infinite loop and data lost.....
Date Thu, 21 Jul 2011 03:42:39 GMT
Thanks for the reply.

Now the problem is how to get rid of the "N of 2147483647" messages. They
never seem to end, and the node never goes UP.
The last time this happened I ran "nodetool cleanup", and it turned out some
data was lost (I am not sure whether the cleanup caused it).
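
To check the point made further down the thread (that these lines show different columns, not a loop), here is a minimal Python sketch that parses the "collecting" lines. The field layout name:deleted:value size@timestamp is my reading of the log output, not something confirmed in this thread, and the `parse` helper is a made-up name for illustration.

```python
import re

# Assumed layout of the payload logged by SliceQueryFilter (an assumption):
#   collecting <count> of <limit>: <name>:<deleted>:<value size>@<timestamp>
LINE = re.compile(
    r"collecting (\d+) of (\d+): ([^:\s]+):(true|false):(\d+)@(\d+)"
)


def parse(text):
    """Return (count, limit, name, size, timestamp) for each match in text."""
    return [
        (int(count), int(limit), name, int(size), int(ts))
        for count, limit, name, _deleted, size, ts in LINE.findall(text)
    ]


# Three payloads copied from the DEBUG output quoted in this thread.
sample = """
collecting 0 of 2147483647: 76616c7565:false:6@1311207851757243
collecting 0 of 2147483647: 76616c7565:false:98@1306722716288857
collecting 0 of 2147483647: 76616c7565:false:95@1311089980134545
"""

rows = parse(sample)
limit = rows[0][1]

# If the node were re-reading one column forever, these triples would repeat.
distinct = {(name, size, ts) for _, _, name, size, ts in rows}

print("limit is the largest signed 32-bit int:", limit == 2**31 - 1)
print("every line is a different column:", len(distinct) == len(rows))
print("column name decoded from hex:", bytes.fromhex(rows[0][2]).decode("ascii"))
```

Run against a full debug log, a `distinct` count that keeps growing would suggest the node is still working through data rather than spinning on one column.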

On Thu, Jul 21, 2011 at 11:37 AM, aaron morton <aaron@thelastpickle.com> wrote:

> Personally I would do a repair first if you need to do one, just so you are
> confident everything is where it should be.
>
> Then do the move as described in the wiki.
>
> Cheers
>
>  -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 21 Jul 2011, at 15:14, Yan Chunlu wrote:
>
> Sorry for the misunderstanding. I saw many "N of 2147483647" lines where N=0
> and thought it was not doing anything.
>
> My node was very unbalanced and I intended to rebalance it with "nodetool
> move" after a "nodetool repair". Does that make the slices much larger?
>
> Address         Status State   Load            Owns    Token
>                                                        84944475733633104818662955375549269696
> 10.28.53.2      Down   Normal  71.41 GB        81.09%  52773518586096316348543097376923124102
> 10.28.53.3      Up     Normal  14.72 GB        10.48%  70597222385644499881390884416714081360
> 10.28.53.4      Up     Normal  13.5 GB         8.43%   84944475733633104818662955375549269696
>
>
> should I do "nodetool move" according to
> http://wiki.apache.org/cassandra/Operations#Load_balancing  before doing
> repair?
>
> thank you for your help!
>
>
>
> On Thu, Jul 21, 2011 at 10:47 AM, Jonathan Ellis <jbellis@gmail.com> wrote:
>
>> This is not an infinite loop, you can see the column objects being
>> iterated over are different.
>>
>> Like I said last time, "I do see that it's saying "N of 2147483647"
>> which looks like you're
>> doing slices with a much larger limit than is advisable."
>>
>> On Wed, Jul 20, 2011 at 9:00 PM, Yan Chunlu <springrider@gmail.com>
>> wrote:
>> > This time it is another node. The node went down during repair and came
>> > back, but never goes up. I changed the log level to "DEBUG" and found that
>> > it prints the following message endlessly:
>> > DEBUG [main] 2011-07-20 20:58:16,286 SliceQueryFilter.java (line 123)
>> > collecting 0 of 2147483647: 76616c7565:false:6@1311207851757243
>> > DEBUG [main] 2011-07-20 20:58:16,319 SliceQueryFilter.java (line 123)
>> > collecting 0 of 2147483647: 76616c7565:false:98@1306722716288857
>> > DEBUG [main] 2011-07-20 20:58:16,424 SliceQueryFilter.java (line 123)
>> > collecting 0 of 2147483647: 76616c7565:false:95@1311089980134545
>> > DEBUG [main] 2011-07-20 20:58:16,611 SliceQueryFilter.java (line 123)
>> > collecting 0 of 2147483647: 76616c7565:false:85@1311154048866767
>> > DEBUG [main] 2011-07-20 20:58:16,754 SliceQueryFilter.java (line 123)
>> > collecting 0 of 2147483647: 76616c7565:false:366@1311207176880564
>> > DEBUG [main] 2011-07-20 20:58:16,770 SliceQueryFilter.java (line 123)
>> > collecting 0 of 2147483647: 76616c7565:false:80@1310443605930900
>> > DEBUG [main] 2011-07-20 20:58:16,816 SliceQueryFilter.java (line 123)
>> > collecting 0 of 2147483647: 76616c7565:false:486@1311173929610402
>> > DEBUG [main] 2011-07-20 20:58:16,870 SliceQueryFilter.java (line 123)
>> > collecting 0 of 2147483647: 76616c7565:false:101@1310818289021118
>> > DEBUG [main] 2011-07-20 20:58:17,041 SliceQueryFilter.java (line 123)
>> > collecting 0 of 2147483647: 76616c7565:false:677@1311202595772170
>> > DEBUG [main] 2011-07-20 20:58:17,047 SliceQueryFilter.java (line 123)
>> > collecting 0 of 2147483647: 76616c7565:false:374@1311147641237918
>> >
>> >
>> >
>> > On Thu, Jul 14, 2011 at 1:36 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>> >>
>> >> That says "I'm collecting data to answer requests."
>> >>
>> >> I don't see anything here that indicates an infinite loop.
>> >>
>> >> I do see that it's saying "N of 2147483647" which looks like you're
>> >> doing slices with a much larger limit than is advisable (good way to
>> >> OOM the way you already did).
>> >>
>> >> On Wed, Jul 13, 2011 at 8:27 PM, Yan Chunlu <springrider@gmail.com> wrote:
>> >> > I gave Cassandra an 8 GB heap and it somehow ran out of memory and
>> >> > crashed.
>> >> > After I restarted it, it just runs into the following infinite loop.
>> >> > The last line:
>> >> > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
>> >> > collecting 0 of 2147483647: 100zs:false:14@1310168625866434
>> >> > goes on forever.
>> >> > I have 3 nodes and RF=2, so I am losing data. Does that mean I am
>> >> > screwed and can't get it back?
>> >> > DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123)
>> >> > collecting 20 of 2147483647: q74k:false:14@1308886095008943
>> >> > DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123)
>> >> > collecting 0 of 2147483647: 10fbu:false:1@1310223075340297
>> >> > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
>> >> > collecting 0 of 2147483647: apbg:false:13@1305641597957086
>> >> > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
>> >> > collecting 1 of 2147483647: auje:false:13@1305641597957075
>> >> > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
>> >> > collecting 2 of 2147483647: ayj8:false:13@1305641597957060
>> >> > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
>> >> > collecting 3 of 2147483647: b4fz:false:13@1305641597957096
>> >> > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
>> >> > collecting 0 of 2147483647: 100zs:false:14@1310168625866434
>> >> > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
>> >> > collecting 1 of 2147483647: 1017f:false:14@1310168680375612
>> >> > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
>> >> > collecting 2 of 2147483647: 1018e:false:14@1310168759614715
>> >> > DEBUG [main] 2011-07-13 22:19:00,587 SliceQueryFilter.java (line 123)
>> >> > collecting 3 of 2147483647: 101dd:false:14@1310169260225339
>> >> >
>> >> > On Thu, Jul 14, 2011 at 11:27 AM, Yan Chunlu <springrider@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
>> >> >> collecting 0 of 2147483647: 100zs:false:14@1310168625866434
>> >> >
>> >> >
>> >> > --
>> >> > 闫春路
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Jonathan Ellis
>> >> Project Chair, Apache Cassandra
>> >> co-founder of DataStax, the source for professional Cassandra support
>> >> http://www.datastax.com
>> >
>> >
>> >
>> > --
>> > 闫春路
>> >
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>>
>
>
>
> --
> 闫春路
>
>
>
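
For the "nodetool move" step discussed in the quoted messages, here is a rough Python sketch of how evenly spaced tokens are usually computed. It assumes RandomPartitioner with its 0 to 2**127 token range, which the token sizes in the ring output suggest but nobody in the thread confirms. `balanced_tokens` is a name made up for illustration. Treat the Operations wiki page linked above as the authority on the actual procedure.

```python
def balanced_tokens(node_count):
    """Evenly spaced tokens over the assumed RandomPartitioner range 0..2**127."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Token i sits at i/node_count of the ring; integer division keeps it exact.
    return [i * (2**127) // node_count for i in range(node_count)]


# The cluster in this thread has three nodes.
for index, token in enumerate(balanced_tokens(3)):
    print(f"node {index}: {token}")
```

Each printed value would be the target for one node's "nodetool move". Which node gets which token, and in what order to move them, is left to the wiki instructions.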


-- 
闫春路
