Date: Wed, 1 Apr 2015 18:50:57 +0000 (UTC)
From: "Ariel Weisberg (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-8052) OOMs from allocating large arrays when deserializing (e.g. probably corrupted EstimatedHistogram data)

    [ https://issues.apache.org/jira/browse/CASSANDRA-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391220#comment-14391220 ]

Ariel Weisberg commented on CASSANDRA-8052:
-------------------------------------------

Agree we should do a pass and add sanity checks for bad length prefixes that slip through due to a lack of checksum or a bug. The check is usually in the noise compared to the cost of constructing a variable-size, frequently encoded value. I am not sure it's appropriate for 2.1, though, because it's possible to add buggy sanity checks that fire when things would otherwise work.

C* should never emit files (or regions of files) that aren't checksummed. Even if the checksum is not checked at runtime, it's useful for debugging to be able to validate that the bytes of a file are correct, because the checksum tells you what they should have been. If we agree on that, then CASSANDRA-6897 might not be broad enough.
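As a rough illustration of the kind of check meant here (the helper and its bound are hypothetical, not an existing Cassandra API):

{code}
import java.io.DataInput;
import java.io.IOException;

// Hypothetical helper, not an existing Cassandra API: read a length prefix and
// reject anything outside a caller-supplied bound, so corruption surfaces as an
// IOException rather than an OutOfMemoryError at allocation time.
public final class BoundedLengths
{
    private BoundedLengths() {}

    public static int checkedReadLength(DataInput in, int maxExpected) throws IOException
    {
        int length = in.readInt();
        if (length < 0 || length > maxExpected)
            throw new IOException("Suspect length prefix " + length + " (expected 0.." + maxExpected + ")");
        return length;
    }
}
{code}

The EstimatedHistogram deserializer could then allocate with something like new long[BoundedLengths.checkedReadLength(in, MAX_BUCKETS)], where MAX_BUCKETS is a hypothetical cap; the two integer comparisons are in the noise next to the allocation they guard, as noted above.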
> OOMs from allocating large arrays when deserializing (e.g. probably corrupted EstimatedHistogram data)
> -------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8052
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8052
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: linux
>            Reporter: Matt Byrd
>              Labels: OOM, checksum, corruption, oom, serialization
>
> We've seen nodes with what are presumably corrupted sstables repeatedly OOM on attempted startup with a message like this:
> {code}
> java.lang.OutOfMemoryError: Java heap space
>     at org.apache.cassandra.utils.EstimatedHistogram$EstimatedHistogramSerializer.deserialize(EstimatedHistogram.java:266)
>     at org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:292)
>     at org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:282)
>     at org.apache.cassandra.io.sstable.SSTableReader.openMetadata(SSTableReader.java:234)
>     at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:194)
>     at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:157)
>     at org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:273)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>     at java.lang.Thread.run(Thread.java:722)
> {code}
> It's probably not a coincidence that the exception is thrown here, since this seems to be the first byte of the file read.
> Presumably the correct operational response is just to replace the node, but I was wondering whether, more generally, we should validate lengths when we deserialise.
> That would avoid allocating large byte buffers that cause unpredictable OOMs, and instead throw an exception that can be handled as appropriate.
> In this particular instance there is no need for an unduly large size for the estimated histogram.
> Admittedly things are slightly different in 2.1, though I suspect a similar thing could happen with:
> {code}
> int numComponents = in.readInt();
> // read toc
> Map toc = new HashMap<>(numComponents);
> {code}
> A find-usages on DataInputStream.readInt() reveals quite a few places where an int is read in and an ArrayList, array, or map of that size is then created.
> In some cases the size might validly range over a full Java int, or the read might sit in a performance-critical or delicate piece of code where one doesn't want such checks.
> There are also other checksums and mechanisms at play which make some input less likely to be corrupted.
> Still, is it worth a pass over instances of this type of input, to try and avoid such cases where it makes sense?
> Perhaps there are less likely but worse failure modes present and hidden, e.g. if the deserialisation happens to be for a message sent to some or all nodes.
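For the 2.1 snippet quoted above, the component count has a natural upper bound (the number of metadata component types), so it can be validated before sizing the map. A minimal sketch of that idea; MetadataType here is a stand-in enum, and the (type ordinal, value) int-pair layout is an assumption for the example, not the real serializer:

{code}
import java.io.DataInput;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only, not the actual Cassandra code. A count read from
// disk that has a known ceiling is validated before it is used to size the map.
enum MetadataType { VALIDATION, COMPACTION, STATS } // stand-in for the real enum

final class TocReader
{
    static Map<MetadataType, Integer> readToc(DataInput in) throws IOException
    {
        MetadataType[] types = MetadataType.values();
        int numComponents = in.readInt();
        // Fail fast on a corrupt prefix instead of pre-sizing an enormous map.
        if (numComponents < 0 || numComponents > types.length)
            throw new IOException("Corrupt component count: " + numComponents);
        Map<MetadataType, Integer> toc = new HashMap<>(numComponents);
        for (int i = 0; i < numComponents; i++)
        {
            int ordinal = in.readInt();
            // The type ordinal also comes from disk, so bound-check it as well.
            if (ordinal < 0 || ordinal >= types.length)
                throw new IOException("Corrupt component type ordinal: " + ordinal);
            toc.put(types[ordinal], in.readInt());
        }
        return toc;
    }
}
{code}

Where the key is an enum, an EnumMap sized by the type itself would sidestep the corruption-controlled initial capacity entirely; either way the check is a couple of integer comparisons, well in the noise next to the allocation it guards.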