From: "Jonathan Ellis (JIRA)"
To: cassandra-commits@incubator.apache.org
Reply-To: cassandra-dev@incubator.apache.org
Date: Mon, 13 Apr 2009 22:04:15 -0700 (PDT)
Message-ID: <924469593.1239685455111.JavaMail.jira@brutus>
In-Reply-To: <2047973725.1237721750463.JavaMail.jira@brutus>
Subject: [jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

[
https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698653#action_12698653 ]

Jonathan Ellis commented on CASSANDRA-9:
----------------------------------------

I will make the style changes; thanks for the review, Todd. Any further discussion needed before commit?

> Cassandra silently loses data when a single row gets large (under "heavy load")
> -------------------------------------------------------------------------------
>
> Key: CASSANDRA-9
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9
> Project: Cassandra
> Issue Type: Bug
> Environment: code in trunk, linux-2.6.27-gentoo-r1, java version "1.7.0-nio2", 4GB, Intel Core 2 Duo
> Reporter: Neophytos Demetriou
> Assignee: Jonathan Ellis
> Fix For: 0.3
>
> Attachments: 0001-better-fix-for-9-v2.patch, 0001-better-fix-for-9.patch, executor.patch, shutdown-before-flush-against-trunk.patch, shutdown-before-flush-v2.patch, shutdown-before-flush-v3-trunk.patch, shutdown-before-flush.patch
>
> When you insert a large number of columns in a single row, Cassandra silently loses some or all of these inserts while flushing the memtable to disk (potentially leaving you with zero-sized data files). This happens when the memtable threshold is violated, i.e. when currentSize_ >= threshold_ (MemtableSizeInMB) OR currentObjectCount_ >= thresholdCount_ (MemtableObjectCountInMillions). This was a problem with the old code on code.google and also with the code that has the jdk7 dependencies. No OutOfMemory errors are thrown, and there is nothing relevant in the logs. It is not clear why this happens under heavy load (when no throttle is used), as it works fine when you pace requests. I have confirmed this with another member of the community.
>
> In storage-conf.xml:
>
> <HashingStrategy>RANDOM</HashingStrategy>
> <MemtableSizeInMB>32</MemtableSizeInMB>
> <MemtableObjectCountInMillions>1</MemtableObjectCountInMillions>
>
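[Editor's note: the failure mode described above can be illustrated with a deterministic, self-contained sketch. This is not Cassandra code; the queue-backed "memtable" and all names here are illustrative assumptions. It shows how mutations still queued when the memtable is switched for flushing would simply vanish, with no exception and nothing in the logs.]

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hedged sketch of the reported failure mode: writes queued behind the
// memtable are discarded when the memtable is switched for flushing
// without draining the queue first. Names are illustrative, not Cassandra's.
class SwitchWithoutDrain {
    public static void main(String[] args) {
        Queue<String> pending = new ArrayDeque<>();
        for (int i = 0; i < 10; i++) pending.add("mutation-" + i);

        int applied = 0;
        // Apply a few writes, then the size/count threshold trips mid-stream.
        for (int i = 0; i < 4; i++) { pending.poll(); applied++; }

        // Buggy switch: the old memtable is handed to the flusher while its
        // queue still holds mutations; those writes never reach disk, and no
        // error is ever raised -- matching the silent loss described above.
        int lost = pending.size();
        pending.clear();

        System.out.println("applied=" + applied + " lost=" + lost);
    }
}
```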
> You can also test it with different values for thresholdCount_ in db/Memtable.java, say:
>
> private int thresholdCount_ = 512*1024;
>
> Here is a small program that will help you reproduce this (hopefully):
>
> private static void doWrite() throws Throwable
> {
>     int numRequests = 0;
>     int numRequestsPerSecond = 3;
>     Table table = Table.open("MyTable");
>     Random random = new Random();
>     byte[] bytes = new byte[8];
>     String key = "MyKey";
>     int totalUsed = 0;
>     int total = 0;
>     for (int i = 0; i < 1500; ++i) {
>         RowMutation rm = new RowMutation("MyTable", key);
>         random.nextBytes(bytes);
>         int[] used = new int[500*1024];
>         for (int z = 0; z < 500*1024; z++) {
>             used[z] = 0;
>         }
>         int n = random.nextInt(16*1024);
>         for (int k = 0; k < n; ++k) {
>             int j = random.nextInt(500*1024);
>             if (used[j] == 0) {
>                 used[j] = 1;
>                 ++totalUsed;
>                 //int w = random.nextInt(4);
>                 int w = 0;
>                 rm.add("MySuper:SuperColumn-" + j + ":Column-" + i, bytes, w);
>             }
>         }
>         rm.apply();
>         total += n;
>         System.out.println("n=" + n + " total=" + total + " totalUsed=" + totalUsed);
>         //Thread.sleep(1000*numRequests/numRequestsPerSecond);
>         numRequests++;
>     }
>     System.out.println("Write done");
> }
>
> PS. Please note that (a) I'm no Java guru and (b) I initially tried this with a C++ Thrift client. The outcome is always the same: zero-sized data files under heavy load --- it works fine when you pace requests.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
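[Editor's note: the attached patch names ("shutdown-before-flush") suggest the fix waits for in-flight mutations to complete before flushing. The sketch below is a hedged illustration of that general idea using java.util.concurrent, not the actual patch: stop accepting new work on the frozen memtable's executor, drain the backlog, and only then flush. All class and variable names are illustrative.]

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hedged sketch of the "shutdown before flush" idea: quiesce the executor
// feeding the frozen memtable so every accepted write is applied before
// the flush begins, instead of being silently dropped.
class ShutdownBeforeFlush {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService memtableExecutor = Executors.newSingleThreadExecutor();
        AtomicInteger applied = new AtomicInteger();

        // Simulated mutations queued against the current memtable.
        for (int i = 0; i < 1000; i++) {
            memtableExecutor.submit(applied::incrementAndGet);
        }

        // Freeze: accept no new tasks, then block until the backlog drains.
        memtableExecutor.shutdown();
        memtableExecutor.awaitTermination(1, TimeUnit.MINUTES);

        // Only now is it safe to flush: nothing in flight can be lost.
        System.out.println("applied=" + applied.get());
    }
}
```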