Date: Fri, 12 Oct 2012 23:43:03 +0000 (UTC)
From: "Florent Clairambault (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Message-ID: <1675457678.40223.1350085383165.JavaMail.jiratomcat@arcas>
In-Reply-To: <1624491622.4291.1343895182925.JavaMail.jiratomcat@issues-vm>
Subject: [jira] [Updated] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

[ https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Florent Clairambault updated CASSANDRA-4481:
--------------------------------------------
    Comment: was deleted

(was: It doesn't work; it failed again a week ago on a 1.1.5 that had been running for a little while.

First of all, it's a commitLog writing and/or reading issue, so if you flush your data frequently (every hour, and in the stop command of the rc.d script) you reduce your risk of big data losses.
You can lose days of data if you don't do that. Restarting cassandra and going 2 days into the past is a very unpleasant situation.

So here is the new process I applied to fix my data (which is in fact restarting from scratch [except we keep the data]):
- Export the keyspace's schema
{code} cassandra-cli -k ks >schema.txt < backup)
- Start cassandra (not having a keyspace folder is like not having any data, so it's not a problem).
- Delete the keyspace (I know deletion creates snapshots and moving is unnecessary, but it's easier to use sstableloader that way)
- Recreate the keyspace with the schema exported and simplified
- Use sstableloader to import data:
{code}
cd backup; find -type d -exec sstableloader -d localhost {} \;
{code}

NOTE: Don't think about replaying your commitLogs with your new schema; the column families won't have the same id.

Any empty cassandra instance startup does at least 1 mutation replay because of the "system" keyspace. So I still think 0 replayed mutations should never occur and, if they do, we should have some warning with them. And if it's indeed "a CF that doesn't fully exist", it should be reported at startup.

I hope we can find a way to reproduce it.)
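The warning asked for above can be approximated from outside Cassandra by scanning the startup log. The following is only a minimal sketch, not anything Cassandra ships: the function name `find_suspicious_replays` is hypothetical, and the two patterns assume the log wording is exactly as quoted in this issue ("Replaying <path>" and "Log replay complete, N replayed mutations"), which may differ between versions.

```python
import re
from typing import Iterable, List

# Assumed log wording, taken from the lines quoted in this issue.
REPLAYING = re.compile(r"Replaying (\S+)")
REPLAY_SUMMARY = re.compile(r"Log replay complete, (\d+) replayed mutations")


def find_suspicious_replays(lines: Iterable[str]) -> List[str]:
    """Return one warning per replay that read segments but applied 0 mutations."""
    warnings = []
    segments = []  # commitlog segments seen since the last summary line
    for line in lines:
        match = REPLAYING.search(line)
        if match:
            segments.append(match.group(1))
            continue
        match = REPLAY_SUMMARY.search(line)
        if match:
            # Segments were read, yet nothing was applied: flag it.
            if int(match.group(1)) == 0 and segments:
                warnings.append("0 mutations replayed from: " + ", ".join(segments))
            segments = []
    return warnings


if __name__ == "__main__":
    sample = [
        "INFO 09:59:45,362 Replaying /var/myproject/cassandra/commitlog/CommitLog-83203377067.log",
        "INFO 09:59:45,476 Finished reading /var/myproject/cassandra/commitlog/CommitLog-83203377067.log",
        "INFO 09:59:45,476 Log replay complete, 0 replayed mutations",
    ]
    for warning in find_suspicious_replays(sample):
        print("WARNING:", warning)
```

Run against the real startup log after each restart, a non-empty result would be the signal to stop and investigate before accepting writes; it detects the symptom only and says nothing about the cause.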

> Commitlog not replayed after restart - data lost
> ------------------------------------------------
>
>                 Key: CASSANDRA-4481
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4481
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.1.2
>         Environment: Single node cluster on 64Bit CentOS
>            Reporter: Ivo Meißner
>            Priority: Critical
>
> When data is written to the commitlog and I restart the machine, all committed data that has not been flushed to disk is lost.
> In the startup logs it says that it replays the commitlog successfully, but the data is not available afterwards.
> When I open the commitlog file in an editor I can see the added data, but after the restart it cannot be fetched from cassandra.
> {code}
>  INFO 09:59:45,362 Replaying /var/myproject/cassandra/commitlog/CommitLog-83203377067.log
>  INFO 09:59:45,476 Finished reading /var/myproject/cassandra/commitlog/CommitLog-83203377067.log
>  INFO 09:59:45,476 Log replay complete, 0 replayed mutations
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira