Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: domain of ailinykh@gmail.com designates
 209.85.217.172 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CAAam9sumSEsjzmMgW_8ZOjj2LmFLs4_aTaTSNEuUoxJLWojPfA@mail.gmail.com>
References: 
 <CAK0tFt6eog7NXLkQ4DMXXD2Y6P=ZycUwG3sNZrH8r1KQ4Uyf-w@mail.gmail.com>
	<D40540EB-8C75-4E85-8412-82A9D3EF024D@thelastpickle.com>
	<CAK0tFt5Akr4hkyEk2pbMW_xKf5TVV4+q-Hi+qR84YY7X0xfngQ@mail.gmail.com>
	<CAAam9sumSEsjzmMgW_8ZOjj2LmFLs4_aTaTSNEuUoxJLWojPfA@mail.gmail.com>
Date: Fri, 7 Dec 2012 09:37:33 -0800
Message-ID: 
 <CAK0tFt7fW-DBqvD4PUhHwv9=SmfPPhybk9oUnYDgro18bFifxQ@mail.gmail.com>
Subject: Re: how to take consistant snapshot?
From: Andrey Ilinykh <ailinykh@gmail.com>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=e89a8f22bb53ae8b1a04d046ab68

--e89a8f22bb53ae8b1a04d046ab68
Content-Type: text/plain; charset=ISO-8859-1

That's right. But with incremental backups, each CF is flushed
independently. I have a "hot" CF that is flushed every few minutes and a
regular CF that is flushed every hour or so. They hold references to each
other, so the data in the backed-up sstables is definitely inconsistent.
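
To make the problem concrete, here is a toy model in Python. It is not
Cassandra code: the ColumnFamily class, the CF names and the flush intervals
are all made up for illustration, and it assumes the backup only ever sees
flushed sstables (no commit log).

```python
# Toy model, not Cassandra code: every name here is hypothetical.
from dataclasses import dataclass, field


@dataclass
class ColumnFamily:
    name: str
    flush_interval: int                            # minutes between flushes
    memtable: dict = field(default_factory=dict)   # unflushed writes
    sstables: dict = field(default_factory=dict)   # what the backup captures

    def write(self, key, value):
        self.memtable[key] = value

    def tick(self, minute):
        # Each CF flushes on its own schedule, independently of the others.
        if minute % self.flush_interval == 0:
            self.sstables.update(self.memtable)
            self.memtable.clear()


def dangling_references(index_cf, data_cf):
    """Keys in index_cf whose referenced row is missing from data_cf's sstables."""
    return sorted(key for key, ref in index_cf.sstables.items()
                  if ref not in data_cf.sstables)


hot = ColumnFamily("hot_index", flush_interval=5)
regular = ColumnFamily("regular_data", flush_interval=60)

# At minute 1 the application writes a row and a reference to it.
regular.write("row-1", "payload")
hot.write("idx-1", "row-1")

for minute in range(2, 11):   # the node is lost at minute 10
    hot.tick(minute)
    regular.tick(minute)

print(dangling_references(hot, regular))   # ['idx-1']
```

The hot CF flushed at minute 5, so its reference is in the backup, while the
row it points to was still sitting in the regular CF's memtable. Restoring
every node from such sstables gives exactly the inconsistency I mean.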


On Fri, Dec 7, 2012 at 9:28 AM, Tyler Hobbs <tyler@datastax.com> wrote:

> Snapshots trigger a flush first, so data that's currently in the commit
> log will be covered by the snapshot.
>
>
> On Thu, Dec 6, 2012 at 11:52 PM, Andrey Ilinykh <ailinykh@gmail.com> wrote:
>
>>
>>
>>
>> On Thu, Dec 6, 2012 at 7:34 PM, aaron morton <aaron@thelastpickle.com> wrote:
>>
>>> For background
>>>
>>>
>>> http://wiki.apache.org/cassandra/Operations?highlight=%28snapshot%29#Consistent_backups
>>>
>>> If you do it for a single node then yes, there is a chance of
>>> inconsistency across CFs.
>>>
>>> If you have multiple nodes, the snapshots you take on the later nodes
>>> will help. If you use CL QUORUM for reads you *may* be ok (cannot work it
>>> out quickly). If you use CL ALL for reads you will be ok. Or you can use
>>> nodetool repair to ensure the data is consistent.
>>>
>> I'm talking about restoring the whole cluster, so all nodes are restored
>> from backup and all of them are inconsistent because they lost the data
>> that was only in the commit logs. It doesn't matter what CL I use, some
>> data may be lost.
>> Cassandra 1.1 supports commit log archiving:
>> http://www.datastax.com/docs/1.1/configuration/commitlog_archiving
>> I think that if I store both the flushed sstables and the commit logs, it
>> should solve my problem. Has anyone had any experience with this feature?
>>
>> Thank you,
>>   Andrey
>>
>
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>
>

--e89a8f22bb53ae8b1a04d046ab68--