Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
References: 
 <CALdd-zjvWsd4r7fS8xL0D2JDzP9Vok2+w7u_4wCyPkfGMxKbPg@mail.gmail.com>
Message-ID: <1320324393.2047.YahooMailNeo@web132107.mail.ird.yahoo.com>
Date: Thu, 3 Nov 2011 12:46:33 +0000 (GMT)
From: Peter Tillotson <slatemine@yahoo.co.uk>
Reply-To: Peter Tillotson <slatemine@yahoo.co.uk>
Subject: Re: Second Cassandra users survey
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
In-Reply-To: 
 <CALdd-zjvWsd4r7fS8xL0D2JDzP9Vok2+w7u_4wCyPkfGMxKbPg@mail.gmail.com>
MIME-Version: 1.0
Content-Type: multipart/alternative;
 boundary="-1806482184-754292353-1320324393=:2047"

---1806482184-754292353-1320324393=:2047
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable

I'm using Cassandra as a big graph database, loading large volumes of =
data live and linking on the fly.=0AThe number of edges grows =
geometrically as data is added, and the edges need to be read to =
continue linking the graph on the fly.=0A=0AConsequently, my problem =
is constrained by:=0A * Predominantly read - especially when data =
gets large and reads are quasi-random=0A * I have lots of data to =
plow in, to be read=0A * Although the problem could scale out and =
possibly all be in RAM, that requires too much kit to be viable=0A=
=0ASo, my findings with Cassandra are:=0A * Compaction is expensive. =
I need it, but it:=0A   1) Takes away disk IO from my reads=0A   =
2) Destroys the file cache=0A   I've not had a chance to do =
extensive tests with the LevelDB-style compaction=0A * Compaction =
has historically been too hard to configure=0A * Memory hungry=0A=
=0ASo for me the biggest features would be:=0A * Cheaper compaction=
=0A * Lower memory usage=0A * Indexing dynamic colnames (e.g. Lucene =
TermEnum against rowkey:colkey)=0A   I do a lot of checking against =
dynamic colnames=0A=0AThe great features are that redundancy and =
live addition of shards are available out of the box.=0A=0AI've also =
experimented with GoldenOrb and triggered updates; I think there is =
a fair bit that can be achieved in my problem with local data =
access. Through GoldenOrb and Hadoop writables I managed to get both =
a BigTable and a Pregel access model onto my Cassandra data. It was =
schema specific, but provided a local compute model.=0A=0Ap=0A=0A=0A=
________________________________=0AFrom: Jonathan Ellis <jb=
ellis@gmail.com>=0ATo: user <user@cassandra.apache.org>=0ASent: Tuesday, 1 =
November 2011, 22:59=0ASubject: Second Cassandra users survey=0A=0AHi all,=
=0A=0ATwo years ago I asked for Cassandra use cases and feature requests.=
=0A[1]=A0 The results [2] have been extremely useful in setting and=0Aprior=
itizing goals for Cassandra development.=A0 But with the release of=0A1.0 w=
e've accomplished basically everything from our original wish=0Alist. [3]=
=0A=0AI'd love to hear from modern Cassandra users again, especially if=0Ay=
ou're usually a quiet lurker.=A0 What does Cassandra do well?=A0 What are=
=0Ayour pain points?=A0 What's your feature wish list?=0A=0AAs before, if y=
ou're in stealth mode or don't want to say anything in=0Apublic, feel free =
to reply to me privately and I will keep it off the=0Arecord.=0A=0A[1] http=
://www.mail-archive.com/cassandra-dev@incubator.apache.org/msg01148.html=0A=
[2] http://www.mail-archive.com/cassandra-user@incubator.apache.org/msg0144=
6.html=0A[3] http://www.mail-archive.com/dev@cassandra.apache.org/msg01524.=
html=0A=0A-- =0AJonathan Ellis=0AProject Chair, Apache Cassandra=0Aco-found=
er of DataStax, the source for professional Cassandra support=0Ahttp://www.=
datastax.com
---1806482184-754292353-1320324393=:2047--