cassandra-commits mailing list archives

From "Marcus Eriksson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-13752) Corrupted SSTables created in 3.11
Date Tue, 29 Aug 2017 12:10:03 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16145183#comment-16145183 ]

Marcus Eriksson commented on CASSANDRA-13752:
---------------------------------------------

This happens because we reuse the same StreamingHistogram when we [open sstables early|https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java#L292]:
the early-opened sstable uses the streaming histogram that compaction is still
building. Since CASSANDRA-13038, calling {{sum()}} can modify the contents of the
StreamingHistogram ({{spool}} might be compacted into {{bin}}). So if someone calls
{{ColumnFamilyStoreMBean#getDroppableTombstoneRatio}} at the wrong time, we could get either
the ConcurrentModificationException (CME) from CASSANDRA-13756 or this corruption.

Making StreamingHistogram thread safe is one way of fixing this, but I would argue that we
should use a "builder" for StreamingHistogram instead: we should never access the SH while
it is being built, and for early-opened sstables we should call {{.build()}} on the
StreamingHistogramBuilder to get a copy of the internal state.
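
To illustrate the builder idea, here is a minimal, self-contained sketch. It is not the Cassandra implementation: the class names ({{HistogramBuilderSketch}}, {{Builder}}, {{Histogram}}) are hypothetical, bin merging and interpolation are left out, and {{sum()}} here simply adds up counts. The point it shows is that {{build()}} hands out an immutable copy, so a reader of an early-opened sstable can never touch the state that compaction is still updating.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the builder approach; not Cassandra's actual classes.
public class HistogramBuilderSketch {

    /** Immutable snapshot that is safe to hand to readers. */
    static final class Histogram {
        private final Map<Long, Long> bins;

        Histogram(Map<Long, Long> source) {
            // Defensive copy: later builder updates cannot leak into this snapshot.
            this.bins = Collections.unmodifiableMap(new TreeMap<>(source));
        }

        /** Simplified sum: total count of points <= bound. Never mutates state. */
        long sum(long bound) {
            long total = 0;
            for (Map.Entry<Long, Long> e : bins.entrySet()) {
                if (e.getKey() > bound)
                    break;
                total += e.getValue();
            }
            return total;
        }
    }

    /** Mutable side, owned by the single thread that writes the sstable. */
    static final class Builder {
        private final TreeMap<Long, Long> bins = new TreeMap<>();
        private final TreeMap<Long, Long> spool = new TreeMap<>();

        /** Record one point; new points are buffered in the spool first. */
        void update(long point) {
            spool.merge(point, 1L, Long::sum);
        }

        /** Merge bins and spool into a fresh copy and return it as a snapshot. */
        Histogram build() {
            TreeMap<Long, Long> merged = new TreeMap<>(bins);
            for (Map.Entry<Long, Long> e : spool.entrySet())
                merged.merge(e.getKey(), e.getValue(), Long::sum);
            return new Histogram(merged);
        }
    }

    public static void main(String[] args) {
        Builder builder = new Builder();
        builder.update(10);
        builder.update(20);

        // "Early open": take a snapshot while building continues.
        Histogram early = builder.build();
        builder.update(30);

        System.out.println(early.sum(100));           // prints 2
        System.out.println(builder.build().sum(100)); // prints 3
    }
}
```

In this sketch the early snapshot keeps reporting 2 after the builder has seen a third point, which is the isolation the proposal is after.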

Also, we should not query LIVE sstables [here|https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/db/ColumnFamilyStore.java#L2589];
it should use {{SSTableSet.CANONICAL}}. This is probably enough to fix this for now,
since it is the only way I can see that we access the sstablemetadata in early-opened sstables.

> Corrupted SSTables created in 3.11
> ----------------------------------
>
>                 Key: CASSANDRA-13752
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13752
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Hannu Kröger
>            Assignee: Hannu Kröger
>            Priority: Blocker
>             Fix For: 3.11.1
>
>
> We have discovered issues with corrupted SSTables. 
> {code}
> ERROR [SSTableBatchOpen:22] 2017-08-03 20:19:53,195 SSTableReader.java:577 - Cannot read sstable /cassandra/data/mykeyspace/mytable-7a4992800d5611e7b782cb90016f2d17/mc-35556-big=[Data.db, Statistics.db, Summary.db, Digest.crc32, CompressionInfo.db, TOC.txt, Index.db, Filter.db]; other IO error, skipping table
> java.io.EOFException: EOF after 1898 bytes out of 21093
>         at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:377) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.io.sstable.metadata.StatsMetadata$StatsMetadataSerializer.deserialize(StatsMetadata.java:325) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.io.sstable.metadata.StatsMetadata$StatsMetadataSerializer.deserialize(StatsMetadata.java:231) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:122) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:93) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:488) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:396) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.io.sstable.format.SSTableReader$5.run(SSTableReader.java:561) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_111]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_111]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
>         at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [apache-cassandra-3.11.0.jar:3.11.0]
> {code}
> Files look like this:
> {code}
> -rw-r--r--. 1 cassandra cassandra     3899251 Aug  7 08:37 mc-6166-big-CompressionInfo.db
> -rw-r--r--. 1 cassandra cassandra 16874421686 Aug  7 08:37 mc-6166-big-Data.db
> -rw-r--r--. 1 cassandra cassandra          10 Aug  7 08:37 mc-6166-big-Digest.crc32
> -rw-r--r--. 1 cassandra cassandra     2930904 Aug  7 08:37 mc-6166-big-Filter.db
> -rw-r--r--. 1 cassandra cassandra   111175880 Aug  7 08:37 mc-6166-big-Index.db
> -rw-r--r--. 1 cassandra cassandra       13762 Aug  7 08:37 mc-6166-big-Statistics.db
> -rw-r--r--. 1 cassandra cassandra      882008 Aug  7 08:37 mc-6166-big-Summary.db
> -rw-r--r--. 1 cassandra cassandra          92 Aug  7 08:37 mc-6166-big-TOC.txt
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
