From: aaron morton <aaron@thelastpickle.com>
To: user@cassandra.apache.org
Subject: Re: nodetool repair uses insane amount of disk space
Date: Fri, 17 Aug 2012 10:57:57 +1200

What version are you using? There were issues with repair using lots-o-space in 0.8.X; it's fixed in 1.X.
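If you're not sure what each node is actually running, newer builds have a nodetool version command, and every node also logs its version at startup, so one of these should tell you (host and log path are just the usual defaults, adjust for your install):

    nodetool -h <host> version
    grep "Cassandra version" /var/log/cassandra/system.log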
Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/08/2012, at 2:56 AM, Michael Morris <michael.m.morris@gmail.com> wrote:

> Occasionally, as I'm doing my regular anti-entropy repair, I end up with a node that uses an exceptional amount of disk space (the node should have about 5-6 GB of data on it, but ends up with 25+ GB and consumes the limited amount of disk space I have available).
>
> How come a node would consume 5x its normal data size during the repair process?
>
> My setup is kind of strange in that it's only about 80-100 GB of data on a 35-node cluster, with 2 data centers and 3 racks, but the rack assignments are unbalanced. One data center has 8 nodes, and the other data center is split into 2 racks, one with 9 nodes and the other with 18. However, within each rack the tokens are distributed equally. It's a long sad story about how we ended up this way, but it basically boils down to having to use existing resources to resolve a production issue.
>
> Additionally, the repair process takes (what I feel is) an extremely long time to complete (36+ hours), and it always seems that nodes are streaming data to each other, even on back-to-back executions of the repair.
>
> Any help on these issues is appreciated.
>
> - Mike
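Re the 36+ hour runs and the constant streaming: if you are on 1.0 or later, repairing only each node's primary range and rotating through the cluster (rather than running a full repair from every node) usually cuts down on the re-streaming; roughly:

    nodetool -h <host> repair -pr <keyspace>

nodetool netstats on a node will show what it is actually streaming while the repair runs.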