From: aaron morton <aaron@thelastpickle.com>
To: user@cassandra.apache.org
Subject: Re: Commit log periodic sync?
Date: Wed, 29 Aug 2012 20:48:15 +1200

> - if, during the streaming session, the sstable that was about to be streamed out was being compacted, would we see an error in the log?
No.
Remember, we never modify files on disk. And the "truth" contained in one file generated by compaction is the same as the "truth" contained in the files before compaction.

> - could this lead to data not found?
No.

> - is it safe to let a node serve read/write requests while repair is running?
Yes.
All maintenance operations are online operations.

> Data created before the last flush was still missing, according to the client that talked to DC1 (the disaster DC).
Losing data from before the flush sounds very strange.
That sort of thing would get noticed, especially if you can pretty much reproduce it.

I would wind the test back to a single node and see if it works as expected. If it does, walk the test scenario forward to multiple nodes and then multiple DCs until it fails. If the single-node test fails, check your test and then grab someone on IRC (I'm aaron_morton and there are plenty of other smart, helpful people there).

Sorry, it's sometimes hard to diagnose problems over email.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 28/08/2012, at 11:28 PM, rubbish me <rubbish.me@googlemail.com> wrote:

> Thanks again Aaron.
>
>> In that case I would not expect to see data loss. If you are still in a test scenario, can you try to reproduce the problem? If possible, can you reproduce it with a single node?
>
> We will try that later this week.
> ----
>
> We did the same exercise this week; this time we did a flush and snapshot before the DR actually happened, as an attempt to identify whether the commit log fsync was the problem.
>
> We can clearly see sstables were created for the flush command.
> And those sstables were loaded in when the nodes started up again after the DR exercise.
>
> At this point we believed all nodes had all the data, so we let them serve client requests while we ran repair on the nodes.
>
> Data created before the last flush was still missing, according to the client that talked to DC1 (the disaster DC).
>
> We had a look at the log of one of the DC1 nodes. The suspicious thing was that the latest sstable was being compacted during the streaming sessions of the repair. But no error was reported.
>
> Here come my questions:
> - if, during the streaming session, the sstable that was about to be streamed out was being compacted, would we see an error in the log?
> - could this lead to data not found?
> - is it safe to let a node serve read/write requests while repair is running?
>
> Many thanks again.
>
> - A
>
> On 27 Aug 2012, at 09:08, aaron morton <aaron@thelastpickle.com> wrote:
>
>>> Brutally. kill -9.
>> That's fine. I was thinking about reboot -f -n.
>>
>>> We are wondering if the fsync of the commit log was working.
>> I would say yes, only because there are no other reported problems.
>>
>> In that case I would not expect to see data loss. If you are still in a test scenario, can you try to reproduce the problem? If possible, can you reproduce it with a single node?
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 25/08/2012, at 11:00 AM, rubbish me <rubbish.me@googlemail.com> wrote:
>>
>>> Thanks, Aaron, for your reply - please see inline.
>>>
>>> On 24 Aug 2012, at 11:04, aaron morton wrote:
>>>
>>>>> - we are running on production linux VMs (not ideal but this is out of our hands)
>>>> Is the VM doing anything wacky with the IO?
>>>
>>> Could be. But I thought we would ask here first. This is a bit difficult to prove because we don't have control over these VMs.
>>>
>>>>> As part of a DR exercise, we killed all 6 nodes in DC1,
>>>> Nice disaster. Out of interest, what was the shutdown process?
>>>
>>> Brutally. kill -9.
>>>
>>>>
>>>>> We noticed that data that was written an hour before the exercise, around the last memtables being flushed, was not found in DC1.
>>>> To confirm, data was written to DC1 at CL LOCAL_QUORUM before the DR exercise.
>>>>
>>>> Was the missing data written before or after the memtable flush? I'm trying to understand whether the data should have been in the commit log or the memtables.
>>>
>>> The missing data was written after the last flush. This data was retrievable before the DR exercise.
>>>
>>>> Can you provide some more info on how you are detecting that it is not found in DC1?
>>>
>>> We tried Hector, consistency level = LOCAL_QUORUM. We had a missing column or the whole row missing.
>>>
>>> We tried cassandra-cli on DC1 nodes, same.
>>>
>>> However, once we ran the same query on DC2, C* must have then done a read repair. That particular piece of result data would appear in DC1 again.
>>>
>>>>> If we understand correctly, commit logs are written first and then synced to disk every 10s.
>>>> Writes are put into a bounded queue and processed as fast as the IO can keep up. Every 10s a sync message is added to the queue. Note that the commit log segment may rotate at any time, which requires a sync.
>>>>
>>>> A loss of data across all nodes in a DC seems odd. If you can provide some more information we may be able to help.
>>>
>>> We are wondering if the fsync of the commit log was working. But we saw no errors / warnings in the logs. Wondering if there is a way to verify....
>>>
>>>> Cheers
>>>>
>>>> -----------------
>>>> Aaron Morton
>>>> Freelance Developer
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>>
>>>> On 24/08/2012, at 6:01 AM, rubbish me <rubbish.me@googlemail.com> wrote:
>>>>
>>>>> Hi all
>>>>>
>>>>> First off, let's introduce the setup.
>>>>>
>>>>> - 6 x C* 1.1.2 in the active DC (DC1), another 6 in another (DC2)
>>>>> - keyspace's RF=3 in each DC
>>>>> - Hector as client
>>>>> - client talks only to DC1 unless DC1 can't serve the request, in which case it talks only to DC2
>>>>> - commit log synced periodically with the default setting of 10s
>>>>> - consistency policy = LOCAL_QUORUM for both read and write
>>>>> - we are running on production linux VMs (not ideal but this is out of our hands)
>>>>> -----
>>>>> As part of a DR exercise, we killed all 6 nodes in DC1, Hector started talking to DC2, all the data was still there, and everything continued to work perfectly.
>>>>>
>>>>> Then we brought all the nodes in DC1 up, one by one. We saw a message saying all the commit logs were replayed. No errors reported. We didn't run repair at this time.
>>>>>
>>>>> We noticed that data that was written an hour before the exercise, around the last memtables being flushed, was not found in DC1.
>>>>>
>>>>> If we understand correctly, commit logs are written first and then synced to disk every 10s. At worst we should have lost the last 10s of data. What could be the cause of this behaviour?
>>>>>
>>>>> With the blessing of C* we could recover all this data from DC2. But we would like to understand why.
>>>>>
>>>>> Many thanks in advance.
>>>>>
>>>>> Amy
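For readers of the archive: the periodic commit log mode discussed above corresponds to commitlog_sync: periodic with commitlog_sync_period_in_ms: 10000 (the default) in cassandra.yaml. A toy sketch of the mechanism Aaron describes (writes appended through a bounded queue, with a sync marker enqueued every period) might look like the following. This is an illustration of the idea only, not Cassandra's actual implementation, and all class and method names here are made up.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Toy model of "periodic" commit log syncing: mutations are appended as fast
// as the log thread can keep up, and every syncPeriodMs a SYNC marker is
// queued. Anything appended after the last completed sync can be lost on a
// kill -9, which is why at most ~10s of writes should be at risk.
public class PeriodicCommitLogModel {

    private static final Object SYNC = new Object();   // sync marker task
    private final BlockingQueue<Object> queue = new ArrayBlockingQueue<Object>(1024); // bounded queue

    public PeriodicCommitLogModel(final long syncPeriodMs) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Enqueue a sync marker every syncPeriodMs (10,000 ms by default).
        scheduler.scheduleAtFixedRate(new Runnable() {
            public void run() { queue.offer(SYNC); }
        }, syncPeriodMs, syncPeriodMs, TimeUnit.MILLISECONDS);

        Thread logWriter = new Thread(new Runnable() {
            public void run() {
                try {
                    while (true) {
                        Object task = queue.take();
                        if (task == SYNC) {
                            fsyncSegment();                 // force the segment to disk
                        } else {
                            appendToSegment((byte[]) task); // buffered append, not yet durable
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        logWriter.setDaemon(true);
        logWriter.start();
    }

    /** Called from the write path; blocks if the queue is full (back-pressure). */
    public void add(byte[] mutation) throws InterruptedException {
        queue.put(mutation);
    }

    private void appendToSegment(byte[] mutation) { /* write to the current segment's buffer */ }
    private void fsyncSegment()                   { /* e.g. FileChannel.force(true)          */ }
}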
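And a minimal sketch of the client-side setup described in the thread: Hector pointed at the local DC, with LOCAL_QUORUM as the default consistency level for both reads and writes. This assumes a Hector 1.x API; the cluster name, host list, and keyspace name are placeholders.

import me.prettyprint.cassandra.model.ConfigurableConsistencyLevel;
import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.HConsistencyLevel;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.factory.HFactory;

public class LocalQuorumSetup {
    public static void main(String[] args) {
        // Point the client at the DC1 nodes only; failing over to DC2 is handled separately.
        Cluster cluster = HFactory.getOrCreateCluster(
                "TestCluster",
                new CassandraHostConfigurator("dc1-node1:9160,dc1-node2:9160,dc1-node3:9160"));

        // LOCAL_QUORUM for both reads and writes, as in the setup above.
        ConfigurableConsistencyLevel consistency = new ConfigurableConsistencyLevel();
        consistency.setDefaultReadConsistencyLevel(HConsistencyLevel.LOCAL_QUORUM);
        consistency.setDefaultWriteConsistencyLevel(HConsistencyLevel.LOCAL_QUORUM);

        // Mutators and queries built from this Keyspace use LOCAL_QUORUM by default.
        Keyspace keyspace = HFactory.createKeyspace("MyKeyspace", cluster, consistency);
    }
}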