cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Cassandra won't start after node crash
Date Tue, 08 Jun 2010 22:15:36 GMT
Sounds like you had some bad hardware take down your index files.
(Cassandra fsyncs them after writing them and before renaming them to
make them live, so whenever I have seen one with missing pieces, the
hardware has been at fault.)

You could try rebuilding your index files from the data files, but
they may be toast, too.

So: step 1, run bin/sstable2json to make sure your data files are actually okay.
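
If you want a quick structural check in addition to sstable2json, here is a minimal sketch that only walks the row framing of a data file. It assumes the same on-disk layout that the index rebuilder later in this message relies on (2-byte key length, key, 4-byte row size, row data). The function name scan_data_file is made up for illustration, and this does not validate the column data inside each row, so it is not a replacement for sstable2json.

```python
import os
import struct


def scan_data_file(path):
    """Walk the row framing of a -Data.db file and return the row keys.

    Assumed layout (same as the index rebuilder in this message):
    2-byte big-endian key length, key bytes, 4-byte big-endian row size,
    then that many bytes of row data. Raises ValueError on a bad frame.
    """
    fsize = os.path.getsize(path)
    keys = []
    with open(path, 'rb') as inf:
        while inf.tell() < fsize:
            rowstart = inf.tell()
            header = inf.read(2)
            if len(header) < 2:
                raise ValueError('truncated key length at offset %d' % rowstart)
            keysize, = struct.unpack('>H', header)
            key = inf.read(keysize)
            sizebytes = inf.read(4)
            if len(key) < keysize or len(sizebytes) < 4:
                raise ValueError('truncated row header at offset %d' % rowstart)
            datasize, = struct.unpack('>i', sizebytes)
            rowend = inf.tell() + datasize
            # a negative size or a row running past end of file means damage
            if datasize < 0 or rowend > fsize:
                raise ValueError('bad row size %d at offset %d'
                                 % (datasize, rowstart))
            keys.append(key)
            inf.seek(rowend)
    return keys
```

A ValueError here suggests the data file itself is damaged, in which case rebuilding the index from it is unlikely to help.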

Step 2, rebuild your index files from your data files.

I can never muster up the energy to make an index rebuilder in Java.
So here's one in Python.

(I recommend testing this on an sstable + index pair that is known to
be good, before trusting it to rebuild a damaged index.  In particular
I think it might be broken with a 32-bit Python instead of 64-bit.
Works On My Machine!)

# usage: buildindex <sstable Data filename> <filename to write new index>

import sys, struct, stat, os

infname, outfname = sys.argv[1:3]
if '-Data' not in infname:
    raise Exception('%s does not look like a Cassandra data filename' % infname)

# binary mode: both files hold packed binary data, not text
inf = open(infname, 'rb')
outf = open(outfname, 'wb')
fsize = os.stat(infname)[stat.ST_SIZE]

while inf.tell() < fsize:
    # read current row key and write index entry
    dataposition = inf.tell()
    keysize, = struct.unpack('>H', inf.read(2))
    key = inf.read(keysize)
    outf.write(struct.pack('>H', keysize))
    outf.write(key)
    outf.write(struct.pack('>q', dataposition))

    # skip to the next row
    datasize, = struct.unpack('>i', inf.read(4))
    inf.seek(inf.tell() + datasize)

# make sure the new index is fully written out before anything reads it
outf.close()
inf.close()
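
One way to act on the advice to test the rebuilder first: a minimal sketch that reads an index file back and checks that every entry points at a row in the data file that starts with the same key. It assumes the entry layout the script above writes (2-byte key length, key, 8-byte offset). The names read_index and index_matches_data are made up for illustration. It only checks the entries that are present, so it will not notice rows missing from the index.

```python
import struct


def read_index(path):
    """Return [(key, position), ...] from an index file.

    Assumed entry layout: 2-byte big-endian key length, key bytes,
    8-byte big-endian offset into the data file. A truncated file
    raises struct.error.
    """
    entries = []
    with open(path, 'rb') as f:
        while True:
            header = f.read(2)
            if not header:
                break  # clean end of file
            keysize, = struct.unpack('>H', header)
            key = f.read(keysize)
            position, = struct.unpack('>q', f.read(8))
            entries.append((key, position))
    return entries


def index_matches_data(indexpath, datapath):
    """True if every index entry points at a row that starts with its key."""
    with open(datapath, 'rb') as data:
        for key, position in read_index(indexpath):
            data.seek(position)
            header = data.read(2)
            if len(header) < 2:
                return False  # offset points past the end of the data file
            keysize, = struct.unpack('>H', header)
            if data.read(keysize) != key:
                return False
    return True
```

On a known-good pair, a stricter test is to compare the rebuilt index with the original byte for byte, for example with filecmp.cmp(rebuilt, original, shallow=False). That the two should be identical is my assumption about the index format, not something I have verified.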


On Tue, Jun 8, 2010 at 8:56 AM, Lucas Di Pentima
<lucas@di-pentima.com.ar> wrote:
> Hello,
>
> I've had a server crash, and after rebooting I cannot start the Cassandra instance. It's
> a one-node cluster. I'm running Cassandra 0.6.1 on Debian Linux and JRE 1.6.0_12.
>
> Is my data lost, should I recreate the DB?
>
> The error message is:
>
> ====================================================================================
>  INFO 12:46:30,823 Auto DiskAccessMode determined to be standard
>  INFO 12:46:31,084 Sampling index for /usr/local/cassandra/data/system/LocationInfo-9-Data.db
>  INFO 12:46:31,084 Sampling index for /usr/local/cassandra/data/system/LocationInfo-10-Data.db
>  INFO 12:46:31,084 Sampling index for /usr/local/cassandra/data/system/LocationInfo-11-Data.db
>  INFO 12:46:31,135 Sampling index for /usr/local/cassandra/data/Empire/CampaignCampaignRuns-469-Data.db
>  INFO 12:46:31,135 Sampling index for /usr/local/cassandra/data/Empire/CampaignCampaignRuns-470-Data.db
>  INFO 12:46:31,135 Sampling index for /usr/local/cassandra/data/Empire/Open-85-Data.db
>  INFO 12:46:35,772 Sampling index for /usr/local/cassandra/data/Empire/Open-106-Data.db
>  INFO 12:46:36,864 Sampling index for /usr/local/cassandra/data/Empire/Open-283-Data.db
>  INFO 12:46:37,228 Sampling index for /usr/local/cassandra/data/Empire/Open-372-Data.db
>  INFO 12:46:37,436 Sampling index for /usr/local/cassandra/data/Empire/Open-526-Data.db
>  INFO 12:46:37,644 Sampling index for /usr/local/cassandra/data/Empire/Open-535-Data.db
>  INFO 12:46:37,644 Sampling index for /usr/local/cassandra/data/Empire/Open-536-Data.db
>  INFO 12:46:37,644 Sampling index for /usr/local/cassandra/data/Empire/Open-537-Data.db
> ERROR 12:46:37,644 Corrupt file /usr/local/cassandra/data/Empire/Open-537-Data.db; skipped
> java.io.UTFDataFormatException: malformed input around byte 0
>        at java.io.DataInputStream.readUTF(DataInputStream.java:639)
>        at java.io.RandomAccessFile.readUTF(RandomAccessFile.java:887)
>        at org.apache.cassandra.io.SSTableReader.loadIndexFile(SSTableReader.java:261)
>        at org.apache.cassandra.io.SSTableReader.open(SSTableReader.java:125)
>        at org.apache.cassandra.io.SSTableReader.open(SSTableReader.java:114)
>        at org.apache.cassandra.db.ColumnFamilyStore.<init>(ColumnFamilyStore.java:178)
>        at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:248)
>        at org.apache.cassandra.db.Table.<init>(Table.java:338)
>        at org.apache.cassandra.db.Table.open(Table.java:199)
>        at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:91)
>        at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:177)
>  INFO 12:46:37,644 Sampling index for /usr/local/cassandra/data/Empire/CampaignRunClickStream-9-Data.db
>  INFO 12:46:37,644 Sampling index for /usr/local/cassandra/data/Empire/CampaignRunClickStream-454-Data.db
>  INFO 12:46:37,696 Sampling index for /usr/local/cassandra/data/Empire/CampaignRunOpenStream-9-Data.db
>  INFO 12:46:37,696 Sampling index for /usr/local/cassandra/data/Empire/CampaignRunOpenStream-14-Data.db
>  INFO 12:46:37,696 Sampling index for /usr/local/cassandra/data/Empire/CampaignRunOpenStream-27-Data.db
>  INFO 12:46:37,748 Sampling index for /usr/local/cassandra/data/Empire/CampaignRunOpenStream-456-Data.db
> ERROR 12:46:37,748 Corrupt file /usr/local/cassandra/data/Empire/CampaignRunOpenStream-456-Data.db; skipped
> java.io.UTFDataFormatException: malformed input around byte 48
>        at java.io.DataInputStream.readUTF(DataInputStream.java:617)
>        at java.io.RandomAccessFile.readUTF(RandomAccessFile.java:887)
>        at org.apache.cassandra.io.SSTableReader.loadIndexFile(SSTableReader.java:261)
>        at org.apache.cassandra.io.SSTableReader.open(SSTableReader.java:125)
>        at org.apache.cassandra.io.SSTableReader.open(SSTableReader.java:114)
>        at org.apache.cassandra.db.ColumnFamilyStore.<init>(ColumnFamilyStore.java:178)
>        at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:248)
>        at org.apache.cassandra.db.Table.<init>(Table.java:338)
>        at org.apache.cassandra.db.Table.open(Table.java:199)
>        at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:91)
>        at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:177)
>  INFO 12:46:37,748 Sampling index for /usr/local/cassandra/data/Empire/Click-21-Data.db
>  INFO 12:46:38,788 Sampling index for /usr/local/cassandra/data/Empire/Click-26-Data.db
>  INFO 12:46:39,048 Sampling index for /usr/local/cassandra/data/Empire/Click-259-Data.db
>  INFO 12:46:39,412 Sampling index for /usr/local/cassandra/data/Empire/Click-476-Data.db
>  INFO 12:46:39,464 Sampling index for /usr/local/cassandra/data/Empire/Click-477-Data.db
>  INFO 12:46:39,464 Sampling index for /usr/local/cassandra/data/Empire/Click-478-Data.db
>  INFO 12:46:39,464 Sampling index for /usr/local/cassandra/data/Empire/CampaignRunUniqueOpen-9-Data.db
>  INFO 12:46:39,464 Sampling index for /usr/local/cassandra/data/Empire/CampaignRunUniqueOpen-14-Data.db
>  INFO 12:46:39,464 Sampling index for /usr/local/cassandra/data/Empire/CampaignRunUniqueOpen-27-Data.db
>  INFO 12:46:39,516 Sampling index for /usr/local/cassandra/data/Empire/CampaignRunUniqueOpen-456-Data.db
> ERROR 12:46:39,516 Exception encountered during startup.
> java.lang.StringIndexOutOfBoundsException: String index out of range: -1
>        at java.lang.String.substring(String.java:1938)
>        at org.apache.cassandra.dht.RandomPartitioner.convertFromDiskFormat(RandomPartitioner.java:50)
>        at org.apache.cassandra.io.SSTableReader.loadIndexFile(SSTableReader.java:261)
>        at org.apache.cassandra.io.SSTableReader.open(SSTableReader.java:125)
>        at org.apache.cassandra.io.SSTableReader.open(SSTableReader.java:114)
>        at org.apache.cassandra.db.ColumnFamilyStore.<init>(ColumnFamilyStore.java:178)
>        at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:248)
>        at org.apache.cassandra.db.Table.<init>(Table.java:338)
>        at org.apache.cassandra.db.Table.open(Table.java:199)
>        at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:91)
>        at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:177)
> Exception encountered during startup.
> java.lang.StringIndexOutOfBoundsException: String index out of range: -1
>        at java.lang.String.substring(String.java:1938)
>        at org.apache.cassandra.dht.RandomPartitioner.convertFromDiskFormat(RandomPartitioner.java:50)
>        at org.apache.cassandra.io.SSTableReader.loadIndexFile(SSTableReader.java:261)
>        at org.apache.cassandra.io.SSTableReader.open(SSTableReader.java:125)
>        at org.apache.cassandra.io.SSTableReader.open(SSTableReader.java:114)
>        at org.apache.cassandra.db.ColumnFamilyStore.<init>(ColumnFamilyStore.java:178)
>        at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:248)
>        at org.apache.cassandra.db.Table.<init>(Table.java:338)
>        at org.apache.cassandra.db.Table.open(Table.java:199)
>        at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:91)
>        at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:177)
> ====================================================================================
>
> Thanks in advance
> --
> Lucas Di Pentima - Santa Fe, Argentina
> Jabber: lucas@di-pentima.com.ar
> MSN: ldipenti75@hotmail.com
>
>
>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
