From: Peter Tillotson
To: "user@cassandra.apache.org"
Subject: Re: Second Cassandra users survey
Date: Thu, 3 Nov 2011 15:50:35 +0000 (GMT)

>>  * Indexing dynamic colnames (eg Lucene TermEnum against rowkey:colkey)
>>    I do a lot of checking against dynamic colnames
>
>I agree, some kind of integration with search engine is required to
>support adhoc queries as well and searching on column names. This will
>be really helpful.
>
>Currently, one of the options is to write in 2 places. Cassandra +
>search engine.
>
I was thinking of a disk-backed skiplist, with every nth rowkey:colkey pulled into memory per sstable, as per Lucene TermEnum.
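To make the idea concrete, here is a minimal sketch in Python. It is not code from Cassandra or Lucene: the class name `SparseKeyIndex`, the `interval` parameter and its value, and the use of an in-memory sorted list to stand in for one sstable's on-disk keys are all assumptions made for illustration. It also swaps the skiplist for a plain sorted sequence, and only shows the "every nth key in memory" part: a binary search over the sampled keys picks one block, and only that block is scanned.

```python
import bisect


class SparseKeyIndex:
    """Hypothetical sketch: keep every nth key of a sorted key sequence in memory."""

    def __init__(self, sorted_keys, interval=128):
        if interval < 1:
            raise ValueError("interval must be >= 1")
        # Stands in for the sorted "rowkey:colkey" entries of one sstable on disk.
        self._keys = sorted_keys
        self._interval = interval
        # The only part assumed to live in memory: every nth key.
        self._sample = sorted_keys[::interval]

    def contains(self, key):
        # Locate the last sampled key <= key; a match can only be in that block.
        slot = bisect.bisect_right(self._sample, key) - 1
        if slot < 0:
            return False  # key sorts before the first stored key
        start = slot * self._interval
        end = min(start + self._interval, len(self._keys))
        # In a real store this slice would be a single sequential disk read.
        for candidate in self._keys[start:end]:
            if candidate == key:
                return True
            if candidate > key:
                break
        return False


if __name__ == "__main__":
    keys = sorted(f"row{r:03d}:col{c:03d}" for r in range(50) for c in range(20))
    index = SparseKeyIndex(keys, interval=16)
    print(index.contains("row007:col013"))  # True
    print(index.contains("row007:col999"))  # False
```

The trade-off is the usual one: a larger interval means less memory per sstable but a longer scan within the selected block.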


From: Mohit Anchlia <mohitanchlia@gmail.com>
To: user@cassandra.apache.org; Peter Tillotson <slatemine@yahoo.co.uk>
Sent: Thursday, 3 November 2011, 14:15
Subject: Re: Second Cassandra users survey

On Thu, Nov 3, 2011 at 5:46 AM, Peter Tillotson <slatemine@yahoo.co.uk> wrote:
> I'm using Cassandra as a big graph database, loading large volumes of data
> live and linking on the fly.

Not sure if Cassandra is the right fit to model complex vertices and edges.

> The number of edges grows geometrically with data added, and they need to be read
> to continue linking the graph on the fly.
>
> Consequently, my problem is constrained by:
>  * Predominantly read - especially when data gets large and reads are quasi
> random
>  * I have lots of data to plow in, to be read
>  * Although the problem could scale out and possibly all be in RAM, it requires
> too much kit for that to be viable
> So, my findings with Cassandra are:
>  * Compaction is expensive, I need it but
>    1) It takes away disk IO from my reads
>    2) Destroys the file cache
>    I've not had a chance to do extensive tests with the Level db compaction
>  * Compaction has been too hard to configure historically
>  * Memory hungry
> So for me the biggest features would be
>  * Cheaper compaction -
>  * Lower memory usage
>  * Indexing dynamic colnames (eg Lucene TermEnum against rowkey:colkey)
>    I do a lot of checking against dynamic colnames

I agree, some kind of integration with search engine is required to
support adhoc queries as well and searching on column names. This will
be really helpful.

Currently, one of the options is to write in 2 places. Cassandra +
search engine.
>
> The great features are that redundancy and live addition of shards are
> available out of the box.
>
> I've also experimented with GoldenOrb and triggered updates; I think there
> is a fair bit that can be achieved in my problem with local data access.
> Through GoldenOrb and Hadoop writables I managed to get both a BigTable and
> Pregel access model onto my Cassandra data. It was schema specific, but
> provided a local compute model.
> p
> ________________________________
> From: Jonathan Ellis <jbellis@gmail.com>
> To: user <user@cassandra.apache.org>
> Sent: Tuesday, 1 November 2011, 22:59
> Subject: Second Cassandra users survey
>
> Hi all,
>
> Two years ago I asked for Cassandra use cases and feature requests.
> [1]  The results [2] have been extremely useful in setting and
> prioritizing goals for Cassandra development.  But with the release of
> 1.0 we've accomplished basically everything from our original wish
> list. [3]
>
> I'd love to hear from modern Cassandra users again, especially if
> you're usually a quiet lurker.  What does Cassandra do well?  What are
> your pain points?  What's your feature wish list?
>
> As before, if you're in stealth mode or don't want to say anything in
> public, feel free to reply to me privately and I will keep it off the
> record.
>
> [1]
> http://www.mail-archive.com/cassandra-dev@incubator.apache.org/msg01148.html
> [2]
> http://www.mail-archive.com/cassandra-user@incubator.apache.org/msg01446.html
> [3] http://www.mail-archive.com/dev@cassandra.apache.org/msg01524.html
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>
>
>