From: Yuji Ito <yuji@imagine-orb.com>
Date: Fri, 25 Nov 2016 15:41:54 +0900
Subject: Re: Does recovery continue after truncating a table?
To: user@cassandra.apache.org

Hi all,

I revised the script to reproduce the issue.
I think the issue happens more frequently than before.
Compared to the previous script, it also kills another node (node2).

==== [script] ====
#!/bin/sh

node1_ip=<node1 IP address>
node2_ip=<node2 IP address>
node3_ip=<node3 IP address>
node2_user=<user name>
node3_user=<user name>
rows=10000

echo "consistency quorum;" > init_data.cql
for key in $(seq 0 $(expr $rows - 1))
do
    echo "insert into testdb.testtbl (key, val) values($key, 1111) IF NOT EXISTS;" >> init_data.cql
done

while true
do
    echo "truncate the table"
    cqlsh $node1_ip -e "truncate table testdb.testtbl" > /dev/null 2>&1
    if [ $? -ne 0 ]; then
        echo "truncating failed"
        continue
    else
        break
    fi
done

echo "kill C* process on node3"
pdsh -l $node3_user -R ssh -w $node3_ip "ps auxww | grep CassandraDaemon | awk '{if (\$13 ~ /cassand/) print \$2}' | xargs sudo kill -9"

echo "insert $rows rows"
cqlsh $node1_ip -f init_data.cql > insert_log 2>&1

echo "restart C* process on node3"
pdsh -l $node3_user -R ssh -w $node3_ip "sudo /etc/init.d/cassandra start"

while true
do
    echo "truncate the table again"
    cqlsh $node1_ip -e "truncate table testdb.testtbl"
    if [ $? -ne 0 ]; then
        echo "truncating failed"
        continue
    else
        echo "truncation succeeded!"
        break
    fi
done

echo "kill C* process on node2"
pdsh -l $node2_user -R ssh -w $node2_ip "ps auxww | grep CassandraDaemon | awk '{if (\$13 ~ /cassand/) print \$2}' | xargs sudo kill -9"

cqlsh $node1_ip --request-timeout 3600 -e "consistency serial; select count(*) from testdb.testtbl;"
sleep 10
cqlsh $node1_ip --request-timeout 3600 -e "consistency serial; select count(*) from testdb.testtbl;"

echo "restart C* process on node2"
pdsh -l $node2_user -R ssh -w $node2_ip "sudo /etc/init.d/cassandra start"

Thanks,
yuji


On Fri, Nov 18, 2016 at 7:52 PM, Yuji Ito <yuji@imagine-orb.com> wrote:
> I investigated the source code and the logs of the killed node.
> I guess that unexpected writes are executed while truncation is being executed.
>
> Some writes were executed after the flush (the first flush) in truncation, and these writes could be read.
> These writes were requested as MUTATION by another node for hinted handoff.
> Their data was stored in a new memtable and flushed (the second flush) to a new SSTable before the snapshot in truncation.
> So, the truncation discarded only the old SSTables, not the new SSTable.
> That's because the ReplayPosition used for discarding SSTables was that of the first flush.
>
> I copied some parts of the log below.
> "##" lines are my comments.
> The point is that the ReplayPosition is moved forward by the second flush.
> It means some writes are executed after the first flush.
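To make the sequence described above concrete, here is a minimal toy model (Python, not Cassandra code; every name in it is invented for the sketch). It only shows why discarding SSTables by the commit-log position recorded at the first flush leaves behind an SSTable produced by a later flush. The quoted log excerpt follows below.

==== [toy model (sketch, not Cassandra code)] ====
#!/usr/bin/env python3
# Toy model: discarding SSTables by the position recorded at the *first*
# flush of a truncation misses an SSTable that is flushed *after* that
# position was recorded (e.g. hinted-handoff mutations).

class Table:
    def __init__(self):
        self.position = 0    # monotonically increasing commit-log position
        self.memtable = []   # pending rows
        self.sstables = []   # list of (flush_position, rows)

    def write(self, row):
        self.position += 1
        self.memtable.append(row)

    def flush(self):
        if self.memtable:
            self.sstables.append((self.position, self.memtable))
            self.memtable = []
        return self.position

    def truncate(self):
        replay_position = self.flush()    # first flush; position recorded here
        self.write("hint-delivered row")  # MUTATION arrives mid-truncation
        self.flush()                      # second flush -> new SSTable
        # discard only SSTables at or before the recorded position
        self.sstables = [s for s in self.sstables if s[0] > replay_position]

t = Table()
t.write("row inserted before truncate")
t.truncate()
print(t.sstables)  # the hint-delivered row survives: stale data after truncate
====

Running it prints the SSTable holding the hint-delivered row, which is the kind of stale data read back after the truncate.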
>
> == log ==
> ## started truncation
> TRACE [SharedPool-Worker-16] 2016-11-17 08:36:04,612 ColumnFamilyStore.java:2790 - truncating testtbl
> ## the first flush started before truncation
> DEBUG [SharedPool-Worker-16] 2016-11-17 08:36:04,612 ColumnFamilyStore.java:952 - Enqueuing flush of testtbl: 591360 (0%) on-heap, 0 (0%) off-heap
> INFO  [MemtableFlushWriter:1] 2016-11-17 08:36:04,613 Memtable.java:352 - Writing Memtable-testtbl@1863835308(42.625KiB serialized bytes, 2816 ops, 0%/0% of on/off-heap limit)
> ...
> DEBUG [MemtableFlushWriter:1] 2016-11-17 08:36:04,973 Memtable.java:386 - Completed flushing /var/lib/cassandra/data/testdb/testtbl-562848f0a55611e68b1451065d58fdfb/tmp-lb-1-big-Data.db (17.651KiB) for commitlog position ReplayPosition(segmentId=1479371760395, position=315867)
> ## this ReplayPosition was used for discarding SSTables
> ...
> TRACE [MemtablePostFlush:1] 2016-11-17 08:36:05,022 CommitLog.java:298 - discard completed log segments for ReplayPosition(segmentId=1479371760395, position=315867), table 562848f0-a556-11e6-8b14-51065d58fdfb
> ## end of the first flush
> DEBUG [SharedPool-Worker-16] 2016-11-17 08:36:05,028 ColumnFamilyStore.java:2823 - Discarding sstable data for truncated CF + indexes
> ## the second flush before the snapshot
> DEBUG [SharedPool-Worker-16] 2016-11-17 08:36:05,028 ColumnFamilyStore.java:952 - Enqueuing flush of testtbl: 698880 (0%) on-heap, 0 (0%) off-heap
> INFO  [MemtableFlushWriter:2] 2016-11-17 08:36:05,029 Memtable.java:352 - Writing Memtable-testtbl@1186728207(50.375KiB serialized bytes, 3328 ops, 0%/0% of on/off-heap limit)
> ...
> DEBUG [MemtableFlushWriter:2] 2016-11-17 08:36:05,258 Memtable.java:386 - Completed flushing /var/lib/cassandra/data/testdb/testtbl-562848f0a55611e68b1451065d58fdfb/tmp-lb-2-big-Data.db (17.696KiB) for commitlog position ReplayPosition(segmentId=1479371760395, position=486627)
> ...
> TRACE [MemtablePostFlush:1] 2016-11-17 08:36:05,289 CommitLog.java:298 - discard completed log segments for ReplayPosition(segmentId=1479371760395, position=486627), table 562848f0-a556-11e6-8b14-51065d58fdfb
> ## end of the second flush: the position was moved forward
> ...
> ## only the old SSTable was deleted because it was older than ReplayPosition(segmentId=1479371760395, position=315867)
> TRACE [NonPeriodicTasks:1] 2016-11-17 08:36:05,303 SSTable.java:118 - Deleted /var/lib/cassandra/data/testdb/testtbl-562848f0a55611e68b1451065d58fdfb/lb-1-big
> ...
> TRACE [SharedPool-Worker-16] 2016-11-17 08:36:05,320 ColumnFamilyStore.java:2841 - truncate complete
> TRACE [SharedPool-Worker-16] 2016-11-17 08:36:05,320 TruncateVerbHandler.java:53 - Truncation(keyspace='testdb', cf='testtbl') applied.  Enqueuing response to 36512@/10.91.145.7
> TRACE [SharedPool-Worker-16] 2016-11-17 08:36:05,320 MessagingService.java:728 - /10.91.145.27 sending REQUEST_RESPONSE to 36512@/10.91.145.7
> ## end of truncation
> ====
>
> Actually, "truncated_at" for the table in system.local after running the script was 0x00000158716da30b0004d1db00000158716db524.
> It means segmentId=1479371760395, position=315867, truncated_at=1479371765028 (2016-11-17 08:36:05,028); a decode sketch follows at the end of this mail.
>
> thanks,
> yuji
>
>
> On Wed, Nov 16, 2016 at 5:25 PM, Yuji Ito <yuji@imagine-orb.com> wrote:
>
>> Hi,
>>
>> I found stale data after truncating a table.
>> It seems that truncation starts while recovery is still being executed just after a node restarts.
>> Does recovery still continue after the truncation finishes?
>> Is that expected?
>>
>> I use C* 2.2.8 and can reproduce it as below.
>>
>> ==== [create table] ====
>> cqlsh $ip -e "drop keyspace testdb;"
>> cqlsh $ip -e "CREATE KEYSPACE testdb WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'};"
>> cqlsh $ip -e "CREATE TABLE testdb.testtbl (key int PRIMARY KEY, val int);"
>>
>> ==== [script] ====
>> #!/bin/sh
>>
>> node1_ip=<node1 IP address>
>> node2_ip=<node2 IP address>
>> node3_ip=<node3 IP address>
>> node3_user=<user name>
>> rows=10000
>>
>> echo "consistency quorum;" > init_data.cql
>> for key in $(seq 0 $(expr $rows - 1))
>> do
>>     echo "insert into testdb.testtbl (key, val) values($key, 1111) IF NOT EXISTS;" >> init_data.cql
>> done
>>
>> while true
>> do
>>     echo "truncate the table"
>>     cqlsh $node1_ip -e "truncate table testdb.testtbl"
>>     if [ $? -ne 0 ]; then
>>         echo "truncating failed"
>>         continue
>>     else
>>         break
>>     fi
>> done
>>
>> echo "kill C* process on node3"
>> pdsh -l $node3_user -R ssh -w $node3_ip "ps auxww | grep CassandraDaemon | awk '{if (\$13 ~ /cassand/) print \$2}' | xargs sudo kill -9"
>>
>> echo "insert $rows rows"
>> cqlsh $node1_ip -f init_data.cql > insert_log 2>&1
>>
>> echo "restart C* process on node3"
>> pdsh -l $node3_user -R ssh -w $node3_ip "sudo /etc/init.d/cassandra start"
>>
>> while true
>> do
>>     echo "truncate the table again"
>>     cqlsh $node1_ip -e "truncate table testdb.testtbl"
>>     if [ $? -ne 0 ]; then
>>         echo "truncating failed"
>>         continue
>>     else
>>         break
>>     fi
>> done
>>
>> cqlsh $node1_ip --request-timeout 3600 -e "consistency serial; select count(*) from testdb.testtbl;"
>> sleep 10
>> cqlsh $node1_ip --request-timeout 3600 -e "consistency serial; select count(*) from testdb.testtbl;"
>>
>> ==== [result] ====
>> truncate the table
>> kill C* process on node3
>> insert 10000 rows
>> restart C* process on node3
>> 10.91.145.27: Starting Cassandra: OK
>> truncate the table again
>> <stdin>:1:TruncateError: Error during truncate: Cannot achieve consistency level ALL
>> truncating failed
>> truncate the table again
>> <stdin>:1:TruncateError: Error during truncate: Cannot achieve consistency level ALL
>> truncating failed
>> truncate the table again
>> <stdin>:1:TruncateError: Error during truncate: Cannot achieve consistency level ALL
>> truncating failed
>> truncate the table again
>> <stdin>:1:TruncateError: Error during truncate: Cannot achieve consistency level ALL
>> truncating failed
>> truncate the table again
>> <stdin>:1:TruncateError: Error during truncate: Cannot achieve consistency level ALL
>> truncating failed
>> truncate the table again
>> <stdin>:1:TruncateError: Error during truncate: Cannot achieve consistency level ALL
>> truncating failed
>> truncate the table again
>> Consistency level set to SERIAL.
>>
>>  count
>> -------
>>    300
>>
>> (1 rows)
>>
>> Warnings :
>> Aggregation query used without partition key
>>
>> Consistency level set to SERIAL.
>>
>>  count
>> -------
>>   2304
>>
>> (1 rows)
>>
>> Warnings :
>> Aggregation query used without partition key
>> ====
>>
>> I found this while I was investigating the data loss problem (ref. the "failure node rejoin" thread).
>> I'm not sure whether this problem is related to that data loss.
>>
>> Thanks,
>> yuji
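
For reference, the "truncated_at" blob quoted above can be decoded with a quick sketch like the one below. It assumes the serialized layout is an 8-byte segmentId, a 4-byte position, and an 8-byte millisecond timestamp, all big-endian, which matches the values quoted in the mail.

==== [decoding truncated_at (sketch)] ====
#!/usr/bin/env python3
import struct
from datetime import datetime, timezone

# Assumed layout: 8-byte segmentId, 4-byte position, 8-byte timestamp (ms),
# all big-endian; this matches the values quoted in the mail above.
blob = bytes.fromhex("00000158716da30b0004d1db00000158716db524")
segment_id, position, truncated_at_ms = struct.unpack(">QIQ", blob)

print("segmentId    =", segment_id)       # 1479371760395
print("position     =", position)         # 315867
print("truncated_at =", truncated_at_ms,  # 1479371765028
      datetime.fromtimestamp(truncated_at_ms / 1000, tz=timezone.utc))
====

The decoded position (315867) is the ReplayPosition of the first flush, not of the second flush (486627), which is consistent with the log excerpt above.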