Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: domain of sylvain@datastax.com
 designates 209.85.213.171 as permitted sender)
MIME-Version: 1.0
In-Reply-To: <4ebc2e5a.a5b4ec0a.3445.0f3d@mx.google.com>
References: <4ebc2e5a.a5b4ec0a.3445.0f3d@mx.google.com>
From: Sylvain Lebresne <sylvain@datastax.com>
Date: Thu, 10 Nov 2011 21:22:44 +0100
Message-ID: 
 <CAKkz8Q3Jzd=RRphcUVwKwEdpv9M2MJoJNVjDsC5pQYcb6-yKrg@mail.gmail.com>
Subject: Re: 1.0.2 Assertion Error
To: user@cassandra.apache.org
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable

That would be a bug (as any assertion error would be), likely some
race condition.
Could you open a ticket?
The fact that this block the MemtablePostFlusher is unfortunately
related. Restarting the
node would fix but we need to make that more solid too.

--
Sylvain

On Thu, Nov 10, 2011 at 9:04 PM, Dan Hendry <dan.hendry.junk@gmail.com> wro=
te:
> Just happened again, seems to be with the same column family (at least on=
 a
> flusher thread for which the last activity was flushing a memtable for th=
at
> CF).
>
>
>
> It also looks like MemtablePostFlusher tasks blocked (and not getting
> cleared) as evidenced by tpstats:
>
>
>
> Pool Name=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 Active=
=A0=A0 Pending=A0=A0=A0=A0=A0 Completed=A0=A0 Blocked=A0 All
> time blocked
>
> MemtablePostFlusher=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 1=A0=A0=A0=
=A0=A0=A0=A0 18=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 =A0=A016
> 0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 0
>
>
>
> Dan
>
>
>
> From: Dan Hendry [mailto:dan.hendry.junk@gmail.com]
> Sent: November-10-11 14:49
> To: 'user@cassandra.apache.org'
> Subject: 1.0.2 Assertion Error
>
>
>
> Just saw this weird assertion after upgrading one of my nodes from 0.8.6 =
to
> 1.0.2 (its been running fine for a few hours now):
>
>
>
> INFO [FlushWriter:9] 2011-11-10 13:08:58,882 Memtable.java (line 237)
> Writing Memtable-Data@1388955390(25676955/430716097 serialized/live bytes=
,
> 478913 ops)
>
> ERROR [FlushWriter:9] 2011-11-10 13:08:59,513 AbstractCassandraDaemon.jav=
a
> (line 133) Fatal exception in thread Thread[FlushWriter:9,5,main]
>
> java.lang.AssertionError: CF size changed during serialization: was 4
> initially but 3 written
>
> =A0=A0=A0=A0=A0=A0=A0 at
> org.apache.cassandra.db.ColumnFamilySerializer.serializeForSSTable(Column=
FamilySerializer.java:94)
>
> =A0=A0=A0=A0 =A0=A0=A0at
> org.apache.cassandra.db.ColumnFamilySerializer.serializeWithIndexes(Colum=
nFamilySerializer.java:112)
>
> =A0=A0=A0=A0=A0=A0=A0 at
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:1=
77)
>
> =A0=A0=A0=A0=A0=A0=A0 at
> org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:264)
>
> =A0=A0=A0=A0=A0=A0=A0 at org.apache.cassandra.db.Memtable.access$400(Memt=
able.java:47)
>
> =A0=A0=A0=A0=A0=A0=A0 at org.apache.cassandra.db.Memtable$4.runMayThrow(M=
emtable.java:289)
>
> =A0=A0=A0=A0=A0=A0=A0 at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>
> =A0=A0=A0=A0=A0=A0=A0at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor=
.java:886)
>
> =A0=A0=A0=A0=A0=A0=A0 at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.jav=
a:908)
>
> =A0=A0=A0=A0=A0=A0=A0 at java.lang.Thread.run(Thread.java:662)
>
>
>
>
>
> After taking a quick peek at the code, it looks like the numbers (=934
> initially but 3 written=94) refer to the number of columns (in the memtab=
le?).
> Given the byte size of the memtable being flushed, there are *certainly*
> more than 4 columns. Besides this error, there are no other unexpected or
> unusual log entries and the node seems to be behaving normally.
>
>
>
> Thoughts?
>
>
>
> Dan Hendry