From: Francois Richard <frichard@xobni.com>
To: user@cassandra.apache.org
Subject: Re: Many to one type of replication.
Date: Mon, 25 Mar 2013 09:40:14 -0700
In-Reply-To: <0A0F17A1-1D13-4A6C-814F-B6FDD0BB3F84@thelastpickle.com>

Thanks much, I wanted to confirm. We will do this at the application level.

FR

On Sun, Mar 24, 2013 at 10:03 AM, aaron morton wrote:
> From this mailing list I found this Github project that is doing something
> similar by looking at the commit logs:
> https://github.com/carloscm/cassandra-commitlog-extract
>
> IMHO tailing the logs is fragile, and you may be better off handling it at
> the application level.
>
> But are there other options around using a custom replication strategy?
>
> There is no such thing as "one-directional" replication — for example,
> replicating everything from DC 1 to DC 2 but not replicating from DC 2 to
> DC 1.
> You may be better off reducing the number of clusters and then running one
> transactional and one analytical DC.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 24/03/2013, at 3:42 AM, Francois Richard wrote:
>
> Hi,
>
> We currently run our Cassandra deployment as multiple independent
> clusters. The clusters are totally self-contained in terms of redundancy
> and independent from each other. We have a "sharding" layer higher in our
> stack that dispatches requests to the right application stack, and each
> application stack connects to its associated Cassandra cluster. All the
> Cassandra clusters are identical in terms of hosted keyspaces, column
> families, replication factor, and so on.
>
> At this point I am investigating ways to build a central Cassandra cluster
> that could contain all the data from all the other Cassandra clusters, and
> I am wondering how best to do it. The goal is to have a global view of our
> data and to be able to do some massive crunching on it.
>
> For sure we could build some ETL-type job that would figure out the data
> that was updated, extract it, and load it into the central Cassandra
> cluster. From this mailing list I found this Github project that is doing
> something similar by looking at the commit logs:
> https://github.com/carloscm/cassandra-commitlog-extract
>
> But are there other options around using a custom replication strategy?
> Any other general suggestions?
>
> Thanks,
>
> FR

--
_____________________________________________
Francois Richard
VP Server Engineering and Operations
Xobni Engineering
Xobni, Inc.
539 Bryant St
San Francisco, CA 94107
415-987-5305 Mobile (For emergencies please leave a voice-mail to mobile)
www.xobni.com
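Aaron's suggestion to handle the many-to-one propagation "at the application level" could be sketched as a dual-write wrapper: write to the local shard cluster first, then mirror the write to the central cluster on a best-effort basis. This is a minimal illustration, not code from the thread — `DualWriter`, the session objects, and the retry queue are all assumed names, and the stub sessions stand in for real driver sessions so the sketch runs without a live cluster:

```python
class FailedWrite(Exception):
    """Stand-in for a driver write error (illustrative only)."""


class DualWriter:
    """Writes to the local (shard) cluster first, then mirrors the write
    to the central cluster. A failed central write is queued for retry
    instead of failing the user-facing request."""

    def __init__(self, local_session, central_session):
        self.local = local_session
        self.central = central_session
        # In practice this would be a durable queue, not an in-memory list.
        self.retry_queue = []

    def write(self, statement, params):
        # The local write must succeed; let its exception propagate.
        self.local.execute(statement, params)
        try:
            # The central write is best effort.
            self.central.execute(statement, params)
        except FailedWrite:
            self.retry_queue.append((statement, params))


class StubSession:
    """Minimal stand-in for a cluster session, for demonstration."""

    def __init__(self, fail=False):
        self.rows = []
        self.fail = fail

    def execute(self, statement, params):
        if self.fail:
            raise FailedWrite(statement)
        self.rows.append((statement, params))


# Simulate the central cluster being unreachable: the local write lands,
# and the central write is captured for retry.
local, central = StubSession(), StubSession(fail=True)
writer = DualWriter(local, central)
writer.write("INSERT INTO events (id, payload) VALUES (%s, %s)", (1, "x"))
```

The key property is that the central cluster can lag or be down without affecting the shard clusters, at the cost of having to drain the retry queue yourself.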
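Aaron's alternative — one cluster with a transactional DC and an analytical DC — relies on Cassandra's built-in multi-datacenter replication rather than any one-directional scheme. A hedged sketch of what that keyspace definition might look like (the keyspace and datacenter names here are made up for illustration):

```
CREATE KEYSPACE app_data
  WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'dc_transactional': 3,
    'dc_analytics': 2
  };
```

Replication is then symmetric and automatic between the two datacenters; isolating the analytical workload is done by pointing the crunching jobs only at `dc_analytics` and using datacenter-local consistency levels (e.g. LOCAL_QUORUM) for the transactional traffic.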