Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
MIME-Version: 1.0
In-Reply-To: <7C10EF71-435C-477E-BCE5-8D3720A4088F@crowdstrike.com>
References: 
 <CA+4UHyPc94NKQdfmXrA+=o2eqC6CxwYqC=ydnEG9FvGPsvJ4Kg@mail.gmail.com>
	<7C10EF71-435C-477E-BCE5-8D3720A4088F@crowdstrike.com>
Date: Mon, 7 Dec 2015 19:14:25 -0500
Message-ID: 
 <CA+4UHyO+qw3fPDO=UW_KV1xKojVjJLU-UVOGWD5Uif2Yf=0-2A@mail.gmail.com>
Subject: Re: lots of tombstone after compaction
From: Kai Wang <depend@gmail.com>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=001a113ad8463861b8052657db06

--001a113ad8463861b8052657db06
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Rob and Jeff,

Thank you. It makes sense. I am on 2.1.10 and will upgrade to 2.1.12.
On Dec 7, 2015 7:05 PM, "Jeff Jirsa" <jeff.jirsa@crowdstrike.com> wrote:

> https://issues.apache.org/jira/browse/CASSANDRA-7953
>
> https://issues.apache.org/jira/browse/CASSANDRA-10505
>
> There are buggy versions of Cassandra that will multiply tombstones
> during compaction. 2.1.12 SHOULD correct that, if you=E2=80=99re on 2.1.
>
>
>
> From: Kai Wang
> Reply-To: "user@cassandra.apache.org"
> Date: Monday, December 7, 2015 at 3:46 PM
> To: "user@cassandra.apache.org"
> Subject: lots of tombstone after compaction
>
> I bulk-loaded a few tables using CQLSSTableWriter/sstableloader. The
> data is a large number of wide rows with lots of nulls. Compaction takes
> a day or two to complete. The SSTable count is in the single digits.
> Maximum partition size is ~50M and mean size is ~5M. However, I am seeing
> frequent read query timeouts caused by tombstone_failure_threshold
> (100000). These tables are basically read-only; there are no writes.
>
> I just kicked off compaction on those tables using nodetool. Hopefully it
> can remove those tombstones. But is it normal to have this many
> tombstones after the initial compactions? Is this related to the fact
> that the original data has lots of nulls?
>
> Thanks.
>


--001a113ad8463861b8052657db06--