From: Edward Capriolo
Date: Mon, 3 Oct 2016 09:25:51 -0400
Subject: Re: Cassandra data model right definition
To: user@cassandra.apache.org

The phrase is defensible, but that is the root of the problem. Take for
example a skateboard.

"A skateboard is like a bike because it has wheels and you ride on it."

That is true, and defensibly true. :) However, with not much more text you
can accurately describe what it is, as opposed to something it is almost
like.

"A skateboard is a thin piece of wood on top of four small wheels that you
stand on and ride."

The old Cassandra statement was something to the effect of "with the
storage model of Bigtable and the consistency model of Dynamo". This
accurately described the system and gave reference to specific known
quantities (Bigtable/Dynamo) for which white papers exist for further
reading.

On Mon, Oct 3, 2016 at 6:24 AM, Benedict Elliott Smith wrote:

> While that sentence leaves a lot to be desired (for me because it confers
> a different meaning on row store), it doesn't say "Cassandra is like a
> RDBMS" - it says "like an RDBMS, it organises data by rows and columns" -
> i.e., in this regard only it is like an RDBMS, not more generally.
>
> I believe it was meant to help people, especially those afraid of the
> NoSQL thrift world, understand that it still uses the basic concept of
> rows and columns they are used to. I agree it could be improved to
> minimise the chance of misreading it, and I'm certain contributions would
> be welcome here.
>
> I don't personally want to get bogged down in analysing every piece of
> text anyone has ever written, so I'll bow out of further discussion on
> this. These phrases may all be suboptimal, but they are certainly
> defensible.
> Column store is not, that's all I wanted to contribute here.
>
> On 1 October 2016 at 19:35, Peter Lin wrote:
>
>> I'll second Ed's comment.
>>
>> The documentation should be more careful when using phrases "like
>> relational databases". When we look at the history of relational
>> databases, people expect certain things like ACID transactions,
>> primary/foreign key constraints, query planners, joins and relational
>> algebra. Clearly Cassandra's storage engine does not follow most of
>> those principles, for good reason.
>>
>> The term row oriented storage would be more descriptive and appropriate.
>> It avoids conflating Cassandra's storage engine with "traditional"
>> relational storage engines. Those of us that have spent over a decade
>> using IBM DB2, Oracle, Sql Server and Sybase tend to think of relational
>> databases in a certain way. If we go back to 1998, most RDBMS storage
>> engines had a max row size limit. Databases like Sybase before version 9
>> preferred RAW disk for optimal performance. I can go on and on, but
>> there's no point really.
>>
>> Cassandra's storage engine is "row oriented", but it's not relational in
>> the RDBMS sense. We do everyone a huge disservice by using confusing
>> terminology and then making fun of those who get confused. No one wins
>> when that happens. At the end of the day, what differentiates
>> Cassandra's storage engine is that it supports static and dynamic
>> columns, which traditional RDBMS don't support today. Calling Cassandra
>> storage "distributed tables" doesn't really help, in my biased opinion.
>>
>> For example, if you tell a SqlServer or Oracle RAC admin "cassandra uses
>> distributed tables" they might answer "so what, sql server and oracle
>> can do that too." The difference is that with an RDBMS the partitioning
>> is optional and requires more work to configure.
>> Whereas with Cassandra you can have everything in 1 node, which means
>> there is only 1 partition and it is no different from 1 instance of sql
>> server. Where you win is when you need to add 2 more nodes: Cassandra
>> makes this easier, whereas with SqlServer and Oracle you have to do a
>> little bit more work. I've lost count of how many times I've explained
>> NoSQL databases to RDBMS admins and had to explain that the official
>> docs are stupid.
>>
>> On Sat, Oct 1, 2016 at 11:31 AM, Edward Capriolo wrote:
>>
>>> https://github.com/apache/cassandra
>>>
>>> Row store means that like relational databases, Cassandra organizes
>>> data by rows and columns. The Cassandra Query Language (CQL) is a
>>> close relative of SQL.
>>>
>>> I generally do not know what to say about these high level
>>> "oversimplifications" like "firewalls block hackers". Are there
>>> "firewalls" or do they mean IP routers with layer 4 packet inspections
>>> and layer 3 Access Control Lists?
>>>
>>> We say (and I catch myself doing it all the time) "like relational
>>> databases" often as if all relational databases work alike. A columnar
>>> store like HP Vertica is a relational database. MySQL has different
>>> storage engines; does MyISAM work like InnoDB?
>>>
>>> Google Docs organizes data by rows and columns as well. You can wrap
>>> any storage system into an API that makes it look like rows and
>>> columns. Microsoft LINQ can enumerate your network cards and query them
>>> https://msdn.microsoft.com/en-us/library/bb308959.aspx , that really
>>> does not make your network cards a "row store".
>>>
>>> "Theoretically a row can have 2 billion columns, but in practice it
>>> shouldn't have more than 100 million columns."
>>> In practice (in my experience) the number is much lower than 100
>>> million, and if the data actually is deleted and re-added frequently
>>> the number of live columns (rows, whatever) you can use happily is even
>>> lower.
>>>
>>> I believe on twitter (I am unable to find the tweet) someone was trying
>>> to convince me Cassandra was a "columnar analytic database". ROFL
>>>
>>> I believe telling someone it is a "row store" "like a database" is not
>>> a good idea. They might walk away content with that explanation. You
>>> are setting them up to walk into an anti-pattern, like a case where the
>>> user is attempting to write and delete 1 row and 1 column 6 billion
>>> times a day. Then you end up explaining to them
>>> http://stackoverflow.com/questions/21755286/what-exactly-happens-when-tombstone-limit-is-reached
>>> and how the Cassandra storage model is not "like a relational
>>> database".
>>>
>>> On Fri, Sep 30, 2016 at 9:22 PM, Edward Capriolo wrote:
>>>
>>>> I can iterate over JSON data stored in mongo and present it as a table
>>>> with rows and columns. It does not make mongo a rowstore.
>>>>
>>>> On Fri, Sep 30, 2016 at 9:16 PM, Edward Capriolo wrote:
>>>>
>>>>> The problem with calling it a row store:
>>>>>
>>>>> https://en.wikipedia.org/wiki/Row_(database)
>>>>>
>>>>> In the context of a relational database, a *row* (also called a
>>>>> record or tuple) represents a single, implicitly structured data item
>>>>> in a table. In simple terms, a database table can be thought of as
>>>>> consisting of *rows* and columns or fields.[1] Each row in a table
>>>>> represents a set of related data, and every row in the table has the
>>>>> same structure.
>>>>>
>>>>> When you have static columns and rows with maps and lists, it is hard
>>>>> to argue that every row has the same structure.
>>>>> Physically at the storage layer they do not have the same structure,
>>>>> and logically when accessing the data they barely have the same
>>>>> structure, as the static column just appears inside each row without
>>>>> actually being contained in it.
>>>>>
>>>>> On Fri, Sep 30, 2016 at 4:47 PM, Jonathan Haddad wrote:
>>>>>
>>>>>> +1000 to what Benedict says. I usually call it a "partitioned row
>>>>>> store", which usually needs some extra explanation but is more
>>>>>> accurate than "column family" or whatever other thrift era
>>>>>> terminology people still use.
>>>>>>
>>>>>> On Fri, Sep 30, 2016 at 1:53 PM DuyHai Doan wrote:
>>>>>>
>>>>>>> I used to present Cassandra as a NoSQL datastore with "distributed"
>>>>>>> tables. This definition is closer to CQL and has some academic
>>>>>>> background (distributed hash table).
>>>>>>>
>>>>>>> On Fri, Sep 30, 2016 at 7:43 PM, Benedict Elliott Smith
>>>>>>> <benedict@apache.org> wrote:
>>>>>>>
>>>>>>>> Cassandra is not a "wide column store" anymore. It has a schema.
>>>>>>>> Only thrift users no longer think they have a schema (though they
>>>>>>>> do), and thrift is being deprecated.
>>>>>>>>
>>>>>>>> I really wish everyone would kill the term "wide column store" with
>>>>>>>> fire. It seems to have never meant anything beyond "schema-less,
>>>>>>>> row-oriented", and a "column store" means literally the opposite of
>>>>>>>> this.
>>>>>>>>
>>>>>>>> Not only that, but people don't even seem to realise the term
>>>>>>>> "column store" existed long before "wide column store" and the
>>>>>>>> latter is often abbreviated to the former, as here:
>>>>>>>> http://www.planetcassandra.org/what-is-nosql/
>>>>>>>>
>>>>>>>> Since it no longer applies, let's all agree as a community to
>>>>>>>> forget this awful nomenclature ever existed.
>>>>>>>>
>>>>>>>> On 30 September 2016 at 18:09, Joaquin Casares
>>>>>>>> <joaquin@thelastpickle.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Mehdi,
>>>>>>>>>
>>>>>>>>> I can help clarify a few things.
>>>>>>>>>
>>>>>>>>> As Carlos said, Cassandra is a Wide Column Store. Theoretically a
>>>>>>>>> row can have 2 billion columns, but in practice it shouldn't have
>>>>>>>>> more than 100 million columns.
>>>>>>>>>
>>>>>>>>> Cassandra partitions data to certain nodes based on the partition
>>>>>>>>> key(s), but does provide the option of setting zero or more
>>>>>>>>> clustering keys. Together, the partition key(s) and clustering
>>>>>>>>> key(s) form the primary key.
>>>>>>>>>
>>>>>>>>> When writing to Cassandra, you will need to provide the full
>>>>>>>>> primary key; however, when reading from Cassandra, you only need
>>>>>>>>> to provide the full partition key.
>>>>>>>>>
>>>>>>>>> When you only provide the partition key for a read operation,
>>>>>>>>> you're able to return all columns that exist on that partition
>>>>>>>>> with low latency. These columns are displayed as "CQL rows" to
>>>>>>>>> make it easier to reason about.
>>>>>>>>>
>>>>>>>>> Consider the schema:
>>>>>>>>>
>>>>>>>>> CREATE TABLE foo (
>>>>>>>>>   bar uuid,
>>>>>>>>>   boz uuid,
>>>>>>>>>   baz timeuuid,
>>>>>>>>>   data1 text,
>>>>>>>>>   data2 text,
>>>>>>>>>   PRIMARY KEY ((bar, boz), baz)
>>>>>>>>> );
>>>>>>>>>
>>>>>>>>> When you write to Cassandra you will need to send bar, boz, and
>>>>>>>>> baz and optionally data*, if it's relevant for that CQL row. If
>>>>>>>>> you choose not to define a data* field for a particular CQL row,
>>>>>>>>> then nothing is stored or allocated on disk. But I wouldn't
>>>>>>>>> consider that caveat to be "schema-less".
>>>>>>>>>
>>>>>>>>> However, all writes to the same bar/boz will end up on the same
>>>>>>>>> Cassandra replica set (a configurable number of nodes) and be
>>>>>>>>> stored in the same place(s) on disk within the SSTable(s). And on
>>>>>>>>> disk, each field that's not a partition key is stored as a column,
>>>>>>>>> including clustering keys (this is optimized in Cassandra 3+, but
>>>>>>>>> now we're getting deep into internals).
>>>>>>>>>
>>>>>>>>> In this way you can get fast responses for all activity for
>>>>>>>>> bar/boz either over time, or for a specific time, with roughly the
>>>>>>>>> same number of disk seeks, with varying lengths on the disk scans.
>>>>>>>>>
>>>>>>>>> Hope that helps!
>>>>>>>>>
>>>>>>>>> Joaquin Casares
>>>>>>>>> Consultant
>>>>>>>>> Austin, TX
>>>>>>>>>
>>>>>>>>> Apache Cassandra Consulting
>>>>>>>>> http://www.thelastpickle.com
>>>>>>>>>
>>>>>>>>> On Fri, Sep 30, 2016 at 11:40 AM, Carlos Alonso
>>>>>>>>> <info@mrcalonso.com> wrote:
>>>>>>>>>
>>>>>>>>>> Cassandra is a Wide Column Store
>>>>>>>>>> http://db-engines.com/en/system/Cassandra
>>>>>>>>>>
>>>>>>>>>> Carlos Alonso | Software Engineer | @calonso
>>>>>>>>>>
>>>>>>>>>> On 30 September 2016 at 18:24, Mehdi Bada
>>>>>>>>>> <mehdi.bada@dbi-services.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> I have a theoretical question:
>>>>>>>>>>> - Is Apache Cassandra really a column store?
>>>>>>>>>>> A column store means storing the data as columns rather than as
>>>>>>>>>>> rows.
>>>>>>>>>>>
>>>>>>>>>>> In fact C* stores the data as rows, and data is partitioned by
>>>>>>>>>>> row key.
>>>>>>>>>>>
>>>>>>>>>>> Finally, for me, Cassandra is a row-oriented, schema-less
>>>>>>>>>>> DBMS. Is that true for you also?
>>>>>>>>>>>
>>>>>>>>>>> Many thanks in advance for your reply
>>>>>>>>>>>
>>>>>>>>>>> Best Regards
>>>>>>>>>>> Mehdi Bada
>>>>>>>>>>> ----
>>>>>>>>>>>
>>>>>>>>>>> *Mehdi Bada* | Consultant
>>>>>>>>>>> Phone: +41 32 422 96 00 | Mobile: +41 79 928 75 48 |
>>>>>>>>>>> Fax: +41 32 422 96 15
>>>>>>>>>>> dbi services, Rue de la Jeunesse 2, CH-2800 Delémont
>>>>>>>>>>> mehdi.bada@dbi-services.com
>>>>>>>>>>> www.dbi-services.com
>>>>>>>>>>>
>>>>>>>>>>> *⇒ dbi services is recruiting Oracle & SQL Server experts ! –
>>>>>>>>>>> Join the team*