From: Sylvain Lebresne
Date: Thu, 28 Jul 2011 16:59:36 +0200
Subject: Re: Changing the CLI, not a great idea!
To: user@cassandra.apache.org

On Thu, Jul 28, 2011 at 4:00 PM, Edward Capriolo wrote:
>
>
> On Thu, Jul 28, 2011 at 9:35 AM, Jonathan Ellis wrote:
>>
>> I'm talking about data compatibility, which is more important than cli
>> statement compatibility.
>>
>> Consider someone with a python program that creates a CF with the
>> default settings and inserts some (say) uuid columns and long data.
>>
>> If we changed CF creation to default to ascii we would break this program.
>>
>> So we had to leave CF comparator defaulting to BytesType, when we
>> changed the CLI to respect comparator/validator definitions when
>> parsing user input.
>>
>> You could argue that CLI should continue to parse BytesType as ascii
>> but then how could a user input actual binary data? The lesser evil
>> here is to educate users that "if you want to use ascii column names,
>> that is how you should declare the comparator."
>>
>> On Thu, Jul 28, 2011 at 8:23 AM, Edward Capriolo wrote:
>> >
>> >
>> > On Thu, Jul 28, 2011 at 8:46 AM, Jonathan Ellis wrote:
>> >>
>> >> It defaults to hex because that is how bytestype is represented. The
>> >> default remains bytestype to provide the kind of backwards
>> >> compatibility you are complaining about. :)
>> >>
>> >> On Thu, Jul 28, 2011 at 6:56 AM, Edward Capriolo wrote:
>> >> >
>> >> >
>> >> > On Thursday, July 28, 2011, Sasha Dolgy wrote:
>> >> >> Unfortunately, the perception that I have as a business consumer and
>> >> >> night-time hack, is that more importance and effort is placed on
>> >> >> ensuring information is up to date and correct on the
>> >> >> http://www.datastax.com/docs/0.8/index website and less on keeping
>> >> >> the wiki up to date or relevant... which forces people to be
>> >> >> introduced to a for-profit company to get relevant information ...
>> >> >> which just so happens to employ a substantial amount of Apache
>> >> >> Cassandra contributors ... not that there's anything wrong with
>> >> >> that, right?
>> >> >>
>> >> >> On Thu, Jul 28, 2011 at 10:46 AM, David Boxenhorn wrote:
>> >> >>> This is part of a much bigger problem, one which has many parts,
>> >> >>> among them:
>> >> >>>
>> >> >>> 1. Cassandra is complex. Getting a gestalt understanding of it
>> >> >>> makes me think I understand how Alzheimer's patients must feel.
>> >> >>> 2. There is no official documentation. Perhaps everything is out
>> >> >>> there somewhere, who knows?
>> >> >>> 3. Cassandra is a moving target. Books are out of date before they
>> >> >>> hit the press.
>> >> >>> 4. Most of the important knowledge about Cassandra exists in a kind
>> >> >>> of oral history, that is hard to keep up with, and even harder to
>> >> >>> understand once it's long past.
>> >> >>>
>> >> >>> I think it is clear that we need a better one-stop-shop for good
>> >> >>> documentation. What hasn't been talked about much - but I think
>> >> >>> it's just as important - is a good one-stop-shop for Cassandra's
>> >> >>> oral history.
>> >> >>>
>> >> >>> (You might think this list is the place, but it's too noisy to be
>> >> >>> useful, except at the very tip of the cowcatcher. Cassandra needs a
>> >> >>> canonized version of its oral history.)
>> >> >>
>> >> >
>> >> > Well the problem is not lack of documentation but changing things
>> >> > that probably do not matter and thus invalidating all documentation.
>> >> >
>> >> > To stay on point. Why does the cli default to hex. Come on who is
>> >> > doing inserts in hex?
>> >> > Would it be more natural for the cli to do this:
>> >> >
>> >> > 'ascii' auto function call ascii
>> >> > "utf8" auto function utf8
>> >> > 0xafaf auto function hex
>> >> >
>> >> > Or really do not change get, add a new statement
>> >> > Typedget
>> >> > And leave get alone
>> >> >
>> >> > The argument to have two methods that almost do the same thing is a
>> >> > bad one, but it is no worse than invalidating tons of docs. But
>> >> > really I can't support a hex default, I know no one with a hex
>> >> > keyboard.
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Jonathan Ellis
>> >> Project Chair, Apache Cassandra
>> >> co-founder of DataStax, the source for professional Cassandra support
>> >> http://www.datastax.com
>> >
>> > I am a little confused. How can it be backwards compatible if the same
>> > statements don't work across versions?
>> >
>> > I am sure there is a good reason, but isn't there some clever way this
>> > can be done on the CLI without forcing me to create the column family
>> > with metadata or wrapping everything in ascii('')? Something out of
>> > the box that is easy and makes both worlds happy?
>> >
>> > Remember I left the rdbms world to cure my addictions to schemas,
>> > don't be a 'schema pusher' :)
>> >
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>
> I agree that defaulting a column family to ascii to make the CLI happy is
> the wrong thing to do.
>
> But I think that the CLI is for users, users are almost always working with
> human readable data.
>
> I feel that most CLIs do not force users to wrap CLI strings in ascii('')
> and are capable of working with binary data.
> http://dev.mysql.com/doc/refman/5.0/en/string-syntax.html
>
> Maybe I am wrong, but I feel like this change could have been done without
> being disruptive and forcing users to re-educate.
> Can the antlr grammar be re-worked in any such way?

I'll play devil's advocate here, but it seems to me that changing it again
would do exactly what you complain about here. Another change would confuse
users even more.

We could argue that what you propose is vastly superior to what we have now
and would thus justify yet one more change. But that's another debate, one
that imho is debatable: I'm personally very happy that when I use the CLI,
provided I have set the right comparator, I don't have to care about quoting
my strings at all (having to care whether I need a single or double quote
would be even more annoying). Also, making sure that people understand
quickly that there are column comparators, that those are useful, and that
they had better use AsciiType or UTF8Type if this is what they want, is not
entirely a bad thing in my book. Again, just saying that it's debatable and
the current way the CLI does things doesn't seem so unreasonable to me.

That the change was confusing to users, I agree. But it's done, let's avoid
doing it again without a very good reason.

--
Sylvain
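
For readers following the comparator argument in this thread, here is a rough Python sketch of the idea that user input is interpreted according to the declared comparator. It is not the CLI's actual code: the function `parse_column_name` and its behavior are hypothetical, made up for illustration, and only the type names BytesType, AsciiType and UTF8Type come from the discussion above.

```python
def parse_column_name(text: str, comparator: str = "BytesType") -> bytes:
    """Hypothetical helper: turn typed input into the bytes of a column name.

    Assumption for illustration only: a BytesType comparator reads the
    input as hex digits, while AsciiType and UTF8Type read it as text.
    """
    if comparator == "BytesType":
        # Raw bytes have no natural text form, so the input is read as hex.
        # bytes.fromhex raises ValueError when the input is not valid hex.
        return bytes.fromhex(text)
    if comparator == "AsciiType":
        return text.encode("ascii")
    if comparator == "UTF8Type":
        return text.encode("utf-8")
    raise ValueError(f"unknown comparator: {comparator}")


if __name__ == "__main__":
    # With the default comparator the name has to be typed as hex.
    print(parse_column_name("6e616d65"))
    # With a text comparator the same name can be typed directly.
    print(parse_column_name("name", "AsciiType"))
```

Under these assumptions, typing `name` against the default comparator fails because it is not valid hex, which is the behavior the thread is arguing about; declaring a text comparator is what lets the unquoted string through.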