Date: Mon, 27 Aug 2012 14:37:07 +1100 (NCT)
From: "Vijay (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Comment Edited] (CASSANDRA-4573) HSHA doesn't handle large messages gracefully

    [ https://issues.apache.org/jira/browse/CASSANDRA-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13442248#comment-13442248 ]

Vijay edited comment on CASSANDRA-4573 at 8/27/12 2:35 PM:
-----------------------------------------------------------

Hi Tyler,

I think the issue is related to GC pauses. Can you confirm that you see the same?

Run: tail -f /var/log/cassandra/system.log | grep GCInspector
Enable GC logging.
Run: python repro.py

You should see the timeout whenever the GC log reports a "threads were stopped:" time of more than about 1.5 seconds.
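The check described above can also be scripted instead of watched by eye. The following is only a rough sketch: it assumes GC logging was enabled with the HotSpot option -XX:+PrintGCApplicationStoppedTime, which writes lines of the form "Total time for which application threads were stopped: N seconds". The function name long_pauses and the 1.5 second default are illustrative and are not part of Cassandra or of repro.py.

```python
import re

# Line format written by HotSpot when -XX:+PrintGCApplicationStoppedTime is on
# (assumed here; adjust the pattern if your JVM logs pauses differently).
STOPPED_RE = re.compile(
    r"Total time for which application threads were stopped: "
    r"([0-9]+(?:\.[0-9]+)?) seconds"
)


def long_pauses(lines, threshold_s=1.5):
    """Return every stopped time, in seconds, that exceeds threshold_s."""
    pauses = []
    for line in lines:
        match = STOPPED_RE.search(line)
        if match is None:
            continue  # not a "threads were stopped" line
        seconds = float(match.group(1))
        if seconds > threshold_s:
            pauses.append(seconds)
    return pauses
```

Passing an open GC log file to long_pauses (for instance the file named by -Xloggc, wherever that points on your node) returns the pauses worth comparing against the client-side timeouts from repro.py.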

      was (Author: vijay2win@yahoo.com):
    Hi Tyler,

I think the issue is related to GC pauses. Can you confirm that you see the same?

Run: tail -f /var/log/cassandra/system.log | grep GCInspector
Enable GC logging.
Run: python repro.py

You should see the timeout whenever the GC log reports a "threads were stopped:" time of more than about 1.5 seconds.

HSHA copies the data into memory (the frame buffer), which might cause additional GC pressure if we are used to running with a small heap. We can try reducing the buffer size in the yaml too.

> HSHA doesn't handle large messages gracefully
> ---------------------------------------------
>
>                 Key: CASSANDRA-4573
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4573
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Tyler Hobbs
>            Assignee: Vijay
>         Attachments: repro.py
>
>
> HSHA doesn't seem to enforce any kind of max message length, and when messages are too large, it doesn't fail gracefully.
> With debug logs enabled, you'll see this:
> {{DEBUG 13:13:31,805 Unexpected state 16}}
> Which seems to mean that there's a SelectionKey that's valid, but isn't ready for reading, writing, or accepting.
> Client-side, you'll get this thrift error (while trying to read a frame as part of {{recv_batch_mutate}}):
> {{TTransportException: TSocket read 0 bytes}}