From: aaron morton
To: user@cassandra.apache.org
Subject: Re: Attempt to assign id to existing column family.
Date: Fri, 1 Apr 2011 10:12:51 +1100
Message-Id: <0CBA168A-BB12-4CBB-AF89-194188D15907@thelastpickle.com>
In-Reply-To: <4D93BB2B.8020105@nicira.com>
References: <4D93A696.1090408@nicira.com> <4D93BB2B.8020105@nicira.com>

There is no reason to change the RF on the system keyspace; it should probably not be allowed.

The system keyspace uses a LocalPartitioner and its data is not replicated through the same mechanism as a user keyspace.

Aaron

On 31 Mar 2011, at 10:22, Jeremy Stribling wrote:

> On 03/30/2011 02:54 PM, Jeremy Stribling wrote:
>> After restarting a Cassandra 0.7.2 node, the node catches an exception during initialization and refuses to start:
>>
>> Caused by: org.apache.cassandra.config.ConfigurationException: Attempt to assign id to existing column family.
>>     at org.apache.cassandra.config.CFMetaData.map(CFMetaData.java:222)
>>     at org.apache.cassandra.config.DatabaseDescriptor.loadSchemas(DatabaseDescriptor.java:477)
>>     ... 2 more
>>
>> Unlike a previous thread about this topic (http://www.mail-archive.com/user@cassandra.apache.org/msg09024.html), we are not trying to preserve the JVM across restarts. The restart comes up in an entirely fresh JVM. We are, however, embedding Cassandra in our application, but we're using the same steps used by AbstractCassandraDaemon to bring it up.
>>
>> Looking briefly through the code, the only way I see that this can happen is if loadSchemas tries to load information about the system table from storage (because the system table can be created in CFMetaData from the earlier DatabaseDescriptor.getTableMetaData(Table.SYSTEM_TABLE).values() call). Or I guess the data on disk could have multiple entries under the same key, but the system table issue seems more likely to me. Unfortunately the logging is not specific enough for me to tell which key it is failing with, and I haven't been able to reproduce this yet.
>>
>> One relevant piece of information might be that, before the restart, our application changed the replication factor of all the tables, including the system table:
>>
>> 2011-03-29 23:09:39,194 291146 [MigrationStage:1] INFO org.apache.cassandra.db.migration.Migration - Applying migration 9f371026-5a59-11e0-b23f-65ed1eced995 Update keyspace systemrep factor:1rep strategy:LocalStrategy{...} to systemrep factor:3rep strategy:LocalStrategy{...}
>>
>> We're doing this in order to dynamically change the replication factor as new nodes are being added to the cluster (e.g., it starts off with one node and a repfactor of 1, and once there are three nodes, it increases the repfactor on all tables to 3). Is it possible that migrations over the system table get written to disk in a way that would cause loadSchemas() during a restart to hit this exception? Are we even allowed to change the replication factor of the system table?
>>
>
> I've confirmed that this happens when loading column family "IndexInfo" from the table "system" during the loadSchemas() call. Does anyone know if there's a way to get around this? Perhaps, like I theorized, it's not legit to change the replication factor on the system table.
>
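
For illustration only, here is a minimal sketch of the scheme described in this thread: raise the replication factor as nodes join, but leave the system keyspace alone. It is plain Python and does not call any Cassandra API. The names plan_rf_updates, target_replication_factor and max_rf are hypothetical, and the cap of 3 is an assumption taken from the example in the quoted message.

```python
# Hypothetical planning helper; nothing here is a Cassandra API.
SYSTEM_KEYSPACE = "system"  # keyspace name as it appears in the thread


def target_replication_factor(node_count, max_rf=3):
    """Return the desired RF: one replica per node, capped at max_rf (assumed cap)."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return min(node_count, max_rf)


def plan_rf_updates(current_rf, node_count, max_rf=3):
    """Return {keyspace: new_rf} for user keyspaces whose RF needs to change.

    current_rf maps keyspace name -> current replication factor.
    The system keyspace is always skipped, since its data is not
    replicated the way a user keyspace's data is.
    """
    wanted = target_replication_factor(node_count, max_rf)
    return {
        name: wanted
        for name, rf in current_rf.items()
        if name != SYSTEM_KEYSPACE and rf != wanted
    }
```

The caller would then apply each planned change through whatever schema-update mechanism the application already uses. The point of the sketch is only that the system keyspace never appears in the plan.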