Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: neutral (athena.apache.org: local policy)
MIME-Version: 1.0
Date: Sat, 2 Jun 2012 12:43:48 +0500
Message-ID: 
 <CAJdH=ExEW-EEjmyEh-TJYMgXfwWX_1OSxKoKD+1aJwXxp0Zn=w@mail.gmail.com>
Subject: Deleting from SCF wide row makes node unresponsive
From: Rustam Aliyev <rustam.lists@code.az>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=14dae9cfc89420121204c178768a

--14dae9cfc89420121204c178768a
Content-Type: text/plain; charset=ISO-8859-1

Hi all,

I have an SCF (super column family) with ~250K rows. One of these rows is
relatively large - it's a wide row (according to compaction logs) containing
~100,000 super columns with an overall size of 1GB. Each super column has an
average size of 10K and ~10 sub columns.

When I try to delete ~90% of the columns in this particular row, the
Cassandra nodes which own this wide row (3 of 5, RF=3) quickly run out of
heap space. See logs from one of the hosts here:

http://pastebin.com/raw.php?i=kwn7b3rP

After that, all 3 nodes start flapping up/down and GC messages (like the
one at the bottom of the pastebin above) appear in the logs. Cassandra
never recovers from this state and the only way out is to "kill -9" and
start again. On IRC it was suggested that it enters a GC death spiral.

I tried to throttle delete requests on the client side - sending a batch of
100 delete requests every 500ms, so no more than 200 deletes/sec. But it
didn't help. I can reduce it further to 100/sec, but I don't think it will
help much.
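
A minimal sketch of this kind of client-side throttling, to make the rate
concrete. It is not the actual client code: `send_batch` is a hypothetical
placeholder for whatever client call issues the deletes, and the function and
parameter names are assumptions made for illustration only.

```python
import time
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def throttled_delete(
    column_names: Sequence[str],
    send_batch: Callable[[List[str]], None],  # hypothetical: issues one batch of deletes
    batch_size: int = 100,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send deletes in fixed-size batches, pausing `interval_s` after each batch.

    With the defaults the rate is capped at 100 / 0.5 = 200 deletes/sec.
    Returns the number of batches sent.
    """
    batches = 0
    for batch in chunked(column_names, batch_size):
        send_batch(batch)
        batches += 1
        sleep(interval_s)  # fixed pause, so the cap holds regardless of send time
    return batches
```

Dropping to 100 deletes/sec would mean either batch_size=50 or
interval_s=1.0 in this sketch.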

I delete millions of columns from other rows in this SCF at the same rate
and have never hit this problem. It only happens when I try to delete from
this particular wide row.

So right now I don't know how I can delete these columns. Any ideas?


Many thanks,
Rustam.

--14dae9cfc89420121204c178768a--