cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-11416) No longer able to load backups into new cluster if there was a dropped column
Date Mon, 18 Apr 2016 09:22:25 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15245353#comment-15245353 ]

Sylvain Lebresne commented on CASSANDRA-11416:
----------------------------------------------

bq. Aka assume they are there because someone dropped them in a previous life.

We could, though it's always slightly scary to throw data away based on an assumption
we can't be 100% sure of. That said, if we log a clear warning, that's probably good enough in
practice: if the data is not from a previously dropped column, it means the schema was re-created
incorrectly, and the warning should be enough to get that fixed and the backup re-loaded.
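
To make the "warn and ignore" idea concrete, here is a minimal sketch in Python. It is not
Cassandra's code: the function name, the logger name and the `strict` flag are all hypothetical,
and it only models the decision discussed above (fail on an unknown column, as 3.0 does in the
report below, versus skip it and log a warning).

```python
import logging

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("sstable_restore_sketch")


def resolve_header_columns(sstable_columns, schema_columns, strict=False):
    """Split the columns named in an sstable header into known and unknown ones.

    sstable_columns: column names found in the sstable being loaded.
    schema_columns:  column names present in the current table schema.
    strict:          if True, fail on the first unknown column (the behaviour
                     reported for 3.0); if False, warn and ignore it.
    """
    known = [c for c in sstable_columns if c in schema_columns]
    unknown = [c for c in sstable_columns if c not in schema_columns]

    if unknown:
        if strict:
            raise RuntimeError(
                "Unknown column %s during deserialization" % unknown[0]
            )
        # Assume the columns were dropped earlier, but say so loudly so that a
        # badly re-created schema is noticed and can be fixed before re-loading.
        logger.warning(
            "Ignoring data for unknown column(s) %s; assuming they were "
            "previously dropped. If not, fix the schema and re-load the backup.",
            ", ".join(unknown),
        )
    return known, unknown
```

With the schema from the report (no `c4`), the non-strict call returns `c4` as unknown and
logs a warning, while `strict=True` raises instead.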

Overall, I feel that the real fix for this is that backups should come with their schema,
and by this I mean in the "internal" format that includes column drop information, and restoring
a backup should make sure the nodes are up to date on that information.
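
As a rough illustration of what "backups come with their schema" could look like, the Python
sketch below bundles drop records with the table definition and checks, at restore time, which
of them the target cluster is missing. Every name here (`DroppedColumn`, `BackupManifest`,
`missing_drop_records`, the field names) is hypothetical and not an existing Cassandra format
or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DroppedColumn:
    """One drop record: which column was dropped, its type, and when."""
    name: str
    cql_type: str
    dropped_time_micros: int


@dataclass
class BackupManifest:
    """Hypothetical schema metadata shipped alongside a snapshot."""
    keyspace: str
    table: str
    live_columns: dict                      # column name -> CQL type
    dropped_columns: list = field(default_factory=list)


def missing_drop_records(manifest, target_dropped_names):
    """Return the drop records the target cluster does not know about yet.

    A restore procedure would need to apply these to the target schema
    before loading the sstables, so that old data for those columns is
    recognised as dropped rather than rejected or kept forever.
    """
    return [
        d for d in manifest.dropped_columns
        if d.name not in target_dropped_names
    ]
```

For the `test_drop` table in the report, the manifest would list `c4` as dropped, and a fresh
cluster (which knows of no dropped columns) would get that record back from
`missing_drop_records`.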

I'll also note for the record that the previous behavior (pre-3.0) wasn't perfect either:
in that case we were basically keeping the previously-dropped column data, but we didn't
have the drop information anymore, so said data would sit there forever (in that sense, I would
argue that warning about the data but otherwise ignoring it is a better behavior overall).

> No longer able to load backups into new cluster if there was a dropped column
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-11416
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11416
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jeremiah Jordan
>            Assignee: Aleksey Yeschenko
>             Fix For: 3.0.x, 3.x
>
>
> The following change to the sstableloader test works in 2.1/2.2 but fails in 3.0+
> https://github.com/JeremiahDJordan/cassandra-dtest/commit/7dc66efb8d24239f0a488ec5a613240531aeb7db
> {code}
> CREATE TABLE test_drop (key text PRIMARY KEY, c1 text, c2 text, c3 text, c4 text)
> ...insert data...
> ALTER TABLE test_drop DROP c4
> ...insert more data...
> {code}
> Make a snapshot and save off a describe to backup table test_drop.
> Decide to restore the snapshot to a new cluster.   First restore the schema from describe. (column c4 isn't there)
> {code}
> CREATE TABLE test_drop (key text PRIMARY KEY, c1 text, c2 text, c3 text)
> {code}
> sstableload the snapshot data.
> Works in 2.1/2.2.  Fails in 3.0+ with:
> {code}
> java.lang.RuntimeException: Unknown column c4 during deserialization
> java.lang.RuntimeException: Failed to list files in /var/folders/t4/rlc2b6450qbg92762l9l4mt80000gn/T/dtest-3eKv_g/test/node1/data1_copy/ks/drop_one-bcef5280f11b11e5825a43f0253f18b5
> 	at org.apache.cassandra.db.lifecycle.LogAwareFileLister.list(LogAwareFileLister.java:53)
> 	at org.apache.cassandra.db.lifecycle.LifecycleTransaction.getFiles(LifecycleTransaction.java:544)
> 	at org.apache.cassandra.io.sstable.SSTableLoader.openSSTables(SSTableLoader.java:76)
> 	at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:165)
> 	at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:104)
> Caused by: java.lang.RuntimeException: Unknown column c4 during deserialization
> 	at org.apache.cassandra.db.SerializationHeader$Component.toHeader(SerializationHeader.java:331)
> 	at org.apache.cassandra.io.sstable.format.SSTableReader.openForBatch(SSTableReader.java:430)
> 	at org.apache.cassandra.io.sstable.SSTableLoader.lambda$openSSTables$193(SSTableLoader.java:121)
> 	at org.apache.cassandra.db.lifecycle.LogAwareFileLister.lambda$innerList$184(LogAwareFileLister.java:75)
> 	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174)
> 	at java.util.TreeMap$EntrySpliterator.forEachRemaining(TreeMap.java:2965)
> 	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
> 	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
> 	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
> 	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> 	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
> 	at org.apache.cassandra.db.lifecycle.LogAwareFileLister.innerList(LogAwareFileLister.java:77)
> 	at org.apache.cassandra.db.lifecycle.LogAwareFileLister.list(LogAwareFileLister.java:49)
> 	... 4 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
