From: Rustam Aliyev
Date: Sat, 02 Jun 2012 00:53:00 +0500
To: user@cassandra.apache.org
Subject: Can't delete from SCF wide row

Hi all,

I have an SCF (super column family) with ~250K rows. One of these rows is relatively large: it's a wide row (according to the compaction logs) containing ~100,000 super columns, with an overall size of 1GB. Each super column has an average size of 10K and ~10 sub columns.

When I try to delete ~90% of the columns in this particular row, the Cassandra nodes that own this wide row (3 of 5, RF=3) quickly run out of heap space. See the logs from one of the hosts here:

http://pastebin.com/raw.php?i=kwn7b3rP

After that, all 3 nodes start flapping up/down, and GC messages (like the one at the bottom of the pastebin above) start appearing in the logs. Cassandra never recovers from this state, and the only way out is to "kill -9" it and start again. On IRC it was suggested that it enters a GC death spiral.

I tried to throttle delete requests on the client side by sending a batch of 100 delete requests every 500ms, so no more than 200 deletes/sec. But it didn't help. I can reduce it further to 100/sec, but I don't think it will help much.
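
For reference, the client-side throttling described above could look roughly like the following. This is only a minimal sketch, not the actual client code: `throttled_delete`, `chunked`, `max_rate` and the `send_batch` callable are hypothetical names, and `send_batch` stands in for whatever call the client library uses to submit a batch of deletes.

```python
import time
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def max_rate(batch_size: int, interval_s: float) -> float:
    """Upper bound on deletes per second for a given batch size and pause."""
    return batch_size / interval_s


def throttled_delete(
    column_names: Sequence[str],
    send_batch: Callable[[List[str]], None],  # hypothetical: submits one batch of deletes
    batch_size: int = 100,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send deletes in fixed-size batches, pausing after each batch.

    Returns the number of batches sent.
    """
    batches = 0
    for batch in chunked(column_names, batch_size):
        send_batch(batch)
        batches += 1
        sleep(interval_s)  # pause so the rate stays at or below batch_size / interval_s
    return batches
```

With the defaults above, `max_rate(100, 0.5)` gives the 200 deletes/sec ceiling mentioned; halving `batch_size` or doubling `interval_s` gives 100/sec.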

I delete millions of columns from other rows in this SCF at the same rate and have never hit this problem. It only happens when I try to delete from this particular wide row.

So right now I don't know how I can delete these columns. Any ideas?


Many thanks,
Rustam.