Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: domain of ailinykh@gmail.com designates
 209.85.215.44 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CAAam9st__fc_r5v_AqVEpxeMa2mde+JW16mVAJazySbOoNrf9w@mail.gmail.com>
References: 
 <CAK0tFt6eog7NXLkQ4DMXXD2Y6P=ZycUwG3sNZrH8r1KQ4Uyf-w@mail.gmail.com>
	<D40540EB-8C75-4E85-8412-82A9D3EF024D@thelastpickle.com>
	<CAK0tFt5Akr4hkyEk2pbMW_xKf5TVV4+q-Hi+qR84YY7X0xfngQ@mail.gmail.com>
	<CAAam9sumSEsjzmMgW_8ZOjj2LmFLs4_aTaTSNEuUoxJLWojPfA@mail.gmail.com>
	<CAK0tFt7fW-DBqvD4PUhHwv9=SmfPPhybk9oUnYDgro18bFifxQ@mail.gmail.com>
	<CAAam9st__fc_r5v_AqVEpxeMa2mde+JW16mVAJazySbOoNrf9w@mail.gmail.com>
Date: Fri, 7 Dec 2012 13:04:15 -0800
Message-ID: 
 <CAK0tFt43Y_sWr6BeV9eAnUfKySyzCW-utLG2YeCa0O4PM2HrgA@mail.gmail.com>
Subject: Re: how to take consistant snapshot?
From: Andrey Ilinykh <ailinykh@gmail.com>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=bcaec5540890dfeb3c04d0498ed1

--bcaec5540890dfeb3c04d0498ed1
Content-Type: text/plain; charset=ISO-8859-1

Agreed.


On Fri, Dec 7, 2012 at 12:38 PM, Tyler Hobbs <tyler@datastax.com> wrote:

> Right.  I don't personally think incremental backup is useful beyond
> restoring individual nodes unless none of your data happens to reference
> any other rows.
>
>
> On Fri, Dec 7, 2012 at 11:37 AM, Andrey Ilinykh <ailinykh@gmail.com> wrote:
>
>> That's right. But with incremental backup enabled, each CF gets flushed
>> independently. I have a "hot" CF which gets flushed every few minutes and
>> a regular CF which gets flushed every hour or so. They have references to
>> each other, and the data in the sstables is definitely inconsistent.
>>
>>
>>
>> On Fri, Dec 7, 2012 at 9:28 AM, Tyler Hobbs <tyler@datastax.com> wrote:
>>
>>> Snapshots trigger a flush first, so data that's currently in the commit
>>> log will be covered by the snapshot.
>>>
>>>
>>> On Thu, Dec 6, 2012 at 11:52 PM, Andrey Ilinykh <ailinykh@gmail.com> wrote:
>>>
>>>>
>>>>
>>>>
>>>> On Thu, Dec 6, 2012 at 7:34 PM, aaron morton <aaron@thelastpickle.com> wrote:
>>>>
>>>>> For background
>>>>>
>>>>>
>>>>> http://wiki.apache.org/cassandra/Operations?highlight=%28snapshot%29#Consistent_backups
>>>>>
>>>>> If you do it for a single node then yes, there is a chance of
>>>>> inconsistency across CFs.
>>>>>
>>>>> If you have multiple nodes, the snapshots you take on the later nodes
>>>>> will help. If you use CL QUORUM for reads you *may* be ok (I cannot work
>>>>> it out quickly). If you use CL ALL for reads you will be ok. Or you can
>>>>> use nodetool repair to ensure the data is consistent.
>>>>>
>>>> I'm talking about restoring the whole cluster, so all nodes are restored
>>>> from backup and all of them are inconsistent because they lost data from
>>>> the commit logs. It doesn't matter what CL I use, some data may be lost.
>>>> Cassandra 1.1 supports commit log archiving:
>>>> http://www.datastax.com/docs/1.1/configuration/commitlog_archiving
>>>> I think if I store both the flushed sstables and the commit logs it should
>>>> solve my problem. I'm wondering if anyone has any experience with this
>>>> feature?
>>>>
>>>> Thank you,
>>>>   Andrey
>>>>
>>>
>>>
>>>
>>> --
>>> Tyler Hobbs
>>> DataStax <http://datastax.com/>
>>>
>>>
>>
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>
>

--bcaec5540890dfeb3c04d0498ed1--