Return-Path:
Delivered-To: apmail-incubator-cassandra-commits-archive@minotaur.apache.org
Received: (qmail 14751 invoked from network); 1 Mar 2010 08:49:08 -0000
Received: from unknown (HELO mail.apache.org) (140.211.11.3) by 140.211.11.9 with SMTP; 1 Mar 2010 08:49:08 -0000
Received: (qmail 11342 invoked by uid 500); 28 Feb 2010 19:22:28 -0000
Delivered-To: apmail-incubator-cassandra-commits-archive@incubator.apache.org
Received: (qmail 11322 invoked by uid 500); 28 Feb 2010 19:22:28 -0000
Mailing-List: contact cassandra-commits-help@incubator.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: cassandra-dev@incubator.apache.org
Delivered-To: mailing list cassandra-commits@incubator.apache.org
Received: (qmail 11313 invoked by uid 99); 28 Feb 2010 19:22:28 -0000
Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 28 Feb 2010 19:22:28 +0000
X-ASF-Spam-Status: No, hits=-2000.0 required=10.0 tests=ALL_TRUSTED
X-Spam-Check-By: apache.org
Received: from [140.211.11.140] (HELO brutus.apache.org) (140.211.11.140) by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 28 Feb 2010 19:22:26 +0000
Received: from brutus.apache.org (localhost [127.0.0.1]) by brutus.apache.org (Postfix) with ESMTP id B3E73234C1F2 for ; Sun, 28 Feb 2010 19:22:05 +0000 (UTC)
Message-ID: <933031493.22501267384925721.JavaMail.jira@brutus.apache.org>
Date: Sun, 28 Feb 2010 19:22:05 +0000 (UTC)
From: "Jonathan Ellis (JIRA)"
To: cassandra-commits@incubator.apache.org
Subject: [jira] Issue Comment Edited: (CASSANDRA-836) CommitLogSegment::seekAndWriteCommitLogHeader assumes header size doesn't change.
In-Reply-To: <1546121191.17691267341425823.JavaMail.jira@brutus.apache.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394
X-Virus-Checked: Checked by ClamAV on apache.org

    [ https://issues.apache.org/jira/browse/CASSANDRA-836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839479#action_12839479 ]

Jonathan Ellis edited comment on CASSANDRA-836 at 2/28/10 7:20 PM:
-------------------------------------------------------------------

it's not a bug, because we never change the size of the bitset.

if you want to add an assertion to that effect, fine, but making the serialization handle a situation that would itself be a horrible bug is bad design.

      was (Author: jbellis):
    it's not a bug, because we never change the size of the bitset

> CommitLogSegment::seekAndWriteCommitLogHeader assumes header size doesn't change.
> ---------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-836
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-836
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: n/a - all
>            Reporter: Ross M
>            Priority: Minor
>         Attachments: BitSetSerializer.java
>
>
> CommitLogSegment::seekAndWriteCommitLogHeader assumes header size doesn't grow. there are pieces of the header (BitSet) that are serialized with java serialization, which makes no such promises.
> the following code:
>     /** writes header at the beginning of the file, then seeks back to current position */
>     void seekAndWriteCommitLogHeader(byte[] bytes) throws IOException
>     {
>         long currentPos = logWriter.getFilePointer();
>         logWriter.seek(0);
>         writeCommitLogHeader(bytes);
>         logWriter.seek(currentPos);
>     }
> works fine as long as the header size doesn't change, but if it grows the new header will overwrite the beginning of the data segment.
> the bit-set being written in the header happens to serialize to the same size, but there is no guarantee of this.
> i found this when looking at optimizing the serialization of data to disk (thus improving write throughput/performance.) i removed the ObjectOutputStream serialization in BitSetSerializer and replaced it with a custom serialization that omits the generic java serialization/ObjectOutputStream stuff and writes only the "true" bits. the custom serialization worked fine, but broke other parts of the code: when the header bitset had new bits turned on, the header grew and data segment bytes were overwritten.
> the java-serialized version of a BitSet can grow in a similar manner. no promises of size/consistency are made, but with current use it luckily doesn't seem to happen.
> a good fix is unclear. without forcing the header to be a fixed/constant size in some manner this problem could pop up at any point. it's generally not safe to rewrite headers like this without custom code that ensures the size doesn't change. one fix would be to manually write all of the header data out (rather than relying on java serialization and serialization code in other parts of cassandra not to change.) another might be to pad the size of the header so that the data inside can grow, but that seems fraught with (potential) problems. (i've played around with padding the header length, but that seems to cause other things to break, which i haven't been able to track down yet.)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
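
The overwrite described in the issue, and the kind of size check suggested in the comment, can be reproduced outside Cassandra with a plain RandomAccessFile. The following is a minimal sketch under stated assumptions, not Cassandra's code: the class name HeaderRewriteSketch, the helper dataAfterRewrite, the 4-byte header and the "DATA" payload are all hypothetical, and the real writeCommitLogHeader may write more than the raw bytes. It uses an explicit exception rather than the assert keyword only so that the check also fires when JVM assertions are disabled.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

public class HeaderRewriteSketch {

    /** Same shape as the quoted method: rewrite the header at offset 0, then seek back. */
    static void rewriteHeader(RandomAccessFile file, byte[] header) throws IOException {
        long currentPos = file.getFilePointer();
        file.seek(0);
        file.write(header);
        file.seek(currentPos);
    }

    /** Guarded variant (hypothetical): refuse a header whose size differs from the original. */
    static void rewriteHeaderChecked(RandomAccessFile file, byte[] header, int expectedLength)
            throws IOException {
        if (header.length != expectedLength) {
            throw new IllegalStateException("header size changed: expected "
                    + expectedLength + " bytes, got " + header.length);
        }
        rewriteHeader(file, header);
    }

    /**
     * Writes a 4-byte header followed by "DATA" to a temp file, rewrites the
     * header, and returns the 4 bytes that now sit where the data started.
     */
    static String dataAfterRewrite(byte[] newHeader, boolean checked) throws IOException {
        File tmp = File.createTempFile("segment", ".log");
        try (RandomAccessFile file = new RandomAccessFile(tmp, "rw")) {
            byte[] initialHeader = {1, 2, 3, 4};
            file.write(initialHeader);
            file.write("DATA".getBytes(StandardCharsets.US_ASCII));
            if (checked) {
                rewriteHeaderChecked(file, newHeader, initialHeader.length);
            } else {
                rewriteHeader(file, newHeader);
            }
            byte[] data = new byte[4];
            file.seek(initialHeader.length);
            file.readFully(data);
            return new String(data, StandardCharsets.US_ASCII);
        } finally {
            tmp.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        // Same-size header: the data segment is untouched.
        System.out.println(dataAfterRewrite(new byte[] {9, 9, 9, 9}, false)); // prints DATA
        // Header grown by two bytes: the first two data bytes are clobbered.
        System.out.println(dataAfterRewrite(new byte[] {1, 2, 3, 4, 'X', 'Y'}, false)); // prints XYTA
        // Guarded variant: the grown header is rejected before anything is written.
        try {
            dataAfterRewrite(new byte[] {1, 2, 3, 4, 'X', 'Y'}, true);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The check only detects a size change; it does not make a variable-size header safe to rewrite, which is the separate design question raised in the issue.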