From: Jonathan Ellis
Reply-To: cassandra-user@incubator.apache.org
To: cassandra-user@incubator.apache.org
Date: Thu, 4 Mar 2010 10:31:16 -0600
Subject: Re: Anti-compaction Diskspace issue even when latest patch applied

https://issues.apache.org/jira/browse/CASSANDRA-579 should make a big
difference in speed.  If you want to take a stab at it I can point you in
the right direction. :)

On Thu, Mar 4, 2010 at 10:24 AM, shiv shivaji wrote:
> Yes.
>
> The IP change trick seems to work. Load balancing seems a little slow, but I
> will open a new thread on that if needed.
>
> Thanks, Shiv
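In concrete terms, the "IP change trick" discussed in the quoted messages
below boils down to copying the data files to the bigger disk and bringing
the node back up otherwise unchanged. A minimal sketch, assuming rsync is
available; the paths are illustrative, and the real locations come from the
node's storage-conf.xml:

    # Stop Cassandra on the overloaded node, then copy its files to the new disk.
    # /old_disk and /new_disk stand in for the 5 TB and 12 TB volumes.
    rsync -a /old_disk/cassandra/data/       /new_disk/cassandra/data/
    rsync -a /old_disk/cassandra/commitlog/  /new_disk/cassandra/commitlog/
    # Point DataFileDirectories and CommitLogDirectory at the new locations (or
    # mount the new disk at the old path), keep the same ListenAddress and token,
    # and restart. To the rest of the ring the node just looks like it was down
    # temporarily and is now back up.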
>
> ________________________________
> From: Jonathan Ellis
> To: cassandra-user@incubator.apache.org
> Sent: Wed, March 3, 2010 9:21:28 AM
> Subject: Re: Anti-compaction Diskspace issue even when latest patch applied
>
> You are proposing manually moving your data from a 5TB disk to a 12TB
> disk, and that is the only change you want to make?  Then just keep
> the IP the same when you restart it after moving, and you won't have
> to do anything else; it will just look like the node was down
> temporarily and is now back up.
>
> On Tue, Mar 2, 2010 at 7:26 PM, shiv shivaji wrote:
>> Thanks, just realized this after looking at the source code.
>>
>> Seems like decommission will not work for me due to disk space issues. I
>> am currently moving all the data on the heavy node (5 TB full) to a 12 TB
>> disk drive. I am planning to remove the old token and reassign it to this
>> node.
>>
>> According to the docs, I should use decommission; however, lack of disk
>> space does not allow me to do this. If I manually move all the data files
>> and then do a removetoken and start the node with a new token, would that
>> work (as was implied in a JIRA)?
>>
>> Shiv
>>
>> ________________________________
>> From: Stu Hood
>> To: cassandra-user@incubator.apache.org
>> Sent: Sun, February 28, 2010 1:53:29 PM
>> Subject: Re: Anti-compaction Diskspace issue even when latest patch applied
>>
>> `nodetool cleanup` is a very expensive process: it performs a major
>> compaction, and should not be done that frequently.
>>
>> -----Original Message-----
>> From: "shiv shivaji"
>> Sent: Sunday, February 28, 2010 3:34pm
>> To: cassandra-user@incubator.apache.org
>> Subject: Re: Anti-compaction Diskspace issue even when latest patch applied
>>
>> Seems like the temporary solution was to run a cron job which calls
>> nodetool cleanup every 5 mins or so. This stopped the disk space from
>> going too low.
>>
>> The manual solution you mentioned is likely worth considering, as the
>> load balancing is taking a while.
>>
>> I will track the JIRA issue on anti-compaction and disk space. Thanks
>> for the pointer.
>>
>> Thanks, Shiv
>>
>> ________________________________
>> From: Jonathan Ellis
>> To: cassandra-user@incubator.apache.org
>> Sent: Wed, February 24, 2010 11:34:59 AM
>> Subject: Re: Anti-compaction Diskspace issue even when latest patch applied
>>
>> As you noticed, "nodeprobe move" first unloads the data, then moves to
>> the new position, so that won't help you here.
>>
>> If you are using replicationfactor=1, scp the data to the previous
>> node on the ring, then reduce the original node's token so it isn't
>> responsible for so much, and run cleanup.  (You can do this with higher
>> RF too, you just have to scp the data to more places.)
>>
>> Finally, you could work on
>> https://issues.apache.org/jira/browse/CASSANDRA-579 so it doesn't need
>> to anticompact to disk before moving data.
>>
>> -Jonathan
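For the RF=1 path Jonathan describes above, the mechanics are roughly the
sketch below. The keyspace name, paths and hostnames are placeholders, and
nodetool host/port options are omitted; note also that, as he says, a plain
nodeprobe/nodetool move anticompacts to disk first, so the token change
itself may have to be done another way on a node this full.

    # 1. Copy the overloaded node's SSTables to the previous node on the ring.
    scp /var/lib/cassandra/data/Keyspace1/* prev-node:/var/lib/cassandra/data/Keyspace1/
    # 2. Reduce the overloaded node's token so it is responsible for a smaller
    #    range (the new, smaller token is something you pick for your ring).
    # 3. Drop the data that no longer belongs on each node:
    nodetool cleanup     # run against the overloaded node and the previous node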
>> On Wed, Feb 24, 2010 at 12:06 PM, shiv shivaji wrote:
>>> According to the stack trace I get in the log, it looks like the patch
>>> was for anti-compaction, but I have not looked at the source code in
>>> detail yet.
>>>
>>> java.util.concurrent.ExecutionException: java.lang.UnsupportedOperationException: disk full
>>>         at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>>>         at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>>>         at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.afterExecute(DebuggableThreadPoolExecutor.java:86)
>>>         at org.apache.cassandra.db.CompactionManager$CompactionExecutor.afterExecute(CompactionManager.java:570)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:888)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>         at java.lang.Thread.run(Thread.java:619)
>>> Caused by: java.lang.UnsupportedOperationException: disk full
>>>         at org.apache.cassandra.db.CompactionManager.doAntiCompaction(CompactionManager.java:344)
>>>         at org.apache.cassandra.db.CompactionManager.doCleanupCompaction(CompactionManager.java:405)
>>>         at org.apache.cassandra.db.CompactionManager.access$400(CompactionManager.java:49)
>>>         at org.apache.cassandra.db.CompactionManager$2.call(CompactionManager.java:130)
>>>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>         ... 2 more
>>>
>>> I tried "nodetool cleanup" before and it did not really stop the disk
>>> from filling; is there a way to force-move the data or some other way
>>> to solve the issue?
>>>
>>> Thanks, Shiv
>>>
>>> ________________________________
>>> From: Jonathan Ellis
>>> To: cassandra-user@incubator.apache.org
>>> Sent: Wed, February 24, 2010 7:16:32 AM
>>> Subject: Re: Anti-compaction Diskspace issue even when latest patch applied
>>>
>>> The patch you refer to was to help *compaction*, not *anticompaction*.
>>>
>>> If the space is mostly hints for other machines (is that what you
>>> meant by "due to past problems with others?") you should run nodeprobe
>>> cleanup on it to remove data that doesn't actually belong on that
>>> node.
>>>
>>> -Jonathan
>>>
>>> On Wed, Feb 24, 2010 at 3:09 AM, shiv shivaji wrote:
>>>> For about 6 TB of total data size with a replication factor of 2 (6 TB x 2)
>>>> on a five-node cluster, I see about 4.6 TB on one machine (due to potential
>>>> past problems with other machines). The machine has a 6 TB disk.
>>>>
>>>> The data folder on this machine has 59,289 files totalling 4.6 TB. The
>>>> files are the data, filter and index files. I see that anti-compaction is
>>>> running. I applied a recent patch which does not do anti-compaction if
>>>> disk space is limited, but I still see it happening. I have also called
>>>> nodetool loadbalance on this machine. It seems like it will run out of
>>>> disk space anyway.
>>>>
>>>> The disk space consumed per machine is (each machine has a 6 TB hard
>>>> drive on RAID):
>>>>
>>>> Machine    Space Consumed
>>>> M1         4.47 TB
>>>> M2         2.93 TB
>>>> M3         1.83 GB
>>>> M4         56.19 GB
>>>> M5         398.01 GB
>>>>
>>>> How can I force M1 to immediately move its load to, say, M3 and M4 (or
>>>> any other machines)? The nodetool move command moves all the data; is
>>>> there a way instead to force-move 50% of the data to M3 and the remaining
>>>> 50% to M4, and resume anti-compaction after the move?
>>>>
>>>> Thanks, Shiv
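There is no single command in this version of Cassandra that splits one
node's data 50/50 between two specific targets; the replies above approach
the problem by changing tokens and then running cleanup. A very rough sketch
of that idea, with entirely hypothetical token values and nodetool host/port
options omitted. Keep in mind Jonathan's caveat above that move anticompacts
to disk first, which an almost-full node cannot afford, so the manual copy
described earlier in the thread may be the only practical route here.

    # A node owns the token range between its predecessor's token and its own.
    # Moving M3 and M4 onto tokens that fall inside M1's current range hands
    # parts of that range to them.
    nodetool move TOKEN_A      # run against M3; TOKEN_A is a placeholder
    nodetool move TOKEN_B      # run against M4; TOKEN_B is a placeholder
    # Once the moves finish, remove the data M1 no longer owns:
    nodetool cleanup           # run against M1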