cassandra-user mailing list archives

From Edward Capriolo <>
Subject Re: Ball is rolling on High Performance Cassandra Cookbook second edition
Date Wed, 27 Jun 2012 21:32:37 GMT
On Wed, Jun 27, 2012 at 4:34 PM, Brian O'Neill <> wrote:
> RE: API method signatures changing
> That triggers another thought...
> What terminology will you use in the book to describe the data model?  CQL?
> When we wrote the RefCard on DZone, we intentionally favored/used CQL
> terminology.  On advisement from Jonathan and Kris Hahn, we wanted to start
> the process of sunsetting the legacy terms (keyspace, column family, etc.)
> in favor of the more familiar CQL terms (schema, table, etc.). I've gone on
> record in favor of the switch, but it is probably something worth noting in
> the book since that terminology does not yet align with all the client
> APIs (e.g. Hector, Astyanax, etc.).
> I'm not sure when the client APIs will catch up to the new terminology, but
> we may want to ask, so as to future-proof the recipes as much as possible.
> -brian
> On Wed, Jun 27, 2012 at 4:18 PM, Edward Capriolo <>
> wrote:
>> On Wed, Jun 27, 2012 at 3:08 PM, Courtney Robinson <>
>> wrote:
>> > Sounds good.
>> > One thing I'd like to see is more coverage on Cassandra Internals. Out
>> > of
>> > the box Cassandra's great but having a little inside knowledge can be
>> > very
>> > useful because it helps you design your applications to work with
>> > Cassandra;
>> > rather than having to later make endless optimizations that could
>> > probably
>> > have been avoided had you done your implementation slightly differently.
>> >
>> > Another thing that may be worth adding would be a recipe that showed an
>> > approach to evaluating Cassandra for your organization/use case. I
>> > realize
>> > that's going to vary on a case by case basis but one thing I've noticed
>> > is
>> > that some people dive in without really thinking through whether
>> > Cassandra
>> > is actually the right fit for what they're doing. It sort of becomes a
>> > hammer for anything that looks like a nail.
>> >
>> > On Tue, Jun 26, 2012 at 10:25 PM, Edward Capriolo
>> > <>
>> > wrote:
>> >>
>> >> Hello all,
>> >>
>> >> It has not been very long since the first book was published but
>> >> several things have been added to Cassandra and a few things have
>> >> changed. I am putting together a list of changed content, for example
>> >> features like the old per Column family memtable flush settings versus
>> >> the new system with the global variable.
>> >>
>> >> My editors have given me the green light to grow the second edition
>> >> from ~200 pages currently up to 300 pages! This gives us the ability
>> >> to add more items/sections to the text.
>> >>
>> >> Some things were missing from the first edition such as Hector
>> >> support. Nate has offered to help me in this area. Please feel free to contact
>> >> me with any ideas and suggestions of recipes you would like to see in
>> >> the book. Also get in touch if you want to write a recipe. Several
>> >> people added content to the first edition and it would be great to see
>> >> that type of participation again.
>> >>
>> >> Thank you,
>> >> Edward
>> >
>> >
>> >
>> >
>> > --
>> > Courtney Robinson
>> >
>> >
>> > 07535691628 (No private #s)
>> >
>> Thanks for the comments. Yes the "INTERNALS" chapter was a bit tricky.
>> The challenge of writing about internals is they go stale fairly
>> quickly. I was considering writing a partitioner for the internals
>> chapter but then I thought about it more:
>> 1) It's hard
>> 2) The APIs can change. (They work the same way across versions but
>> they may have a different signature etc)
>> 3) 99.99% of people should be using the random partitioner :)
>> But I agree the internals chapter can be made much stronger than it is.
>> The recipe format is strict. It naturally conflicts with the typical
>> use case style. In a use case you write a good amount of text
>> talking about the problem domain, previous solutions, and bragging
>> about company X. We cannot do that with the recipe style, but we can do
>> our best to make the recipes as real world as possible. I tried to do
>> that throughout the text; you do not find many examples like 'writing
>> foo records to bar column families'. However, the format does not allow
>> the extensive text blocks mentioned above, so it is difficult to set the
>> stage for a complex and detailed real world problem. Still, I think
>> for some examples we can take the next step and make the recipe more
>> real world practical and more use-case like.
> --
> Brian ONeill
> Lead Architect, Health Market Science
> mobile:215.588.6024

As for terminology, I guess you can consider me a hard-liner, as I have
a few problems with calling a column family a table. I might be in the
minority, but I know I am not alone. On one hand, aliases make the
integration easier, but on the other hand, if a user does not
understand what a column family is, they will likely use Cassandra
incorrectly.

Maybe this is just a semantics debate, because a table in a
column-oriented database is different than a table in a row-oriented
database, but the column family data model is one of the cornerstones
of Cassandra. Globally replacing column family with table throughout
the text is not a good idea.
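
To make the distinction concrete, here is a minimal sketch in Python. It is
an illustration only, not Cassandra's storage code or any client API: the
ColumnFamily class, its insert and get_slice methods, and the sample data
are all hypothetical. It models a column family as a map from row key to a
set of columns ordered by name, where each row can hold different columns,
which is not what most people picture when they hear "table".

```python
from collections import defaultdict


class ColumnFamily:
    """Toy model (hypothetical): row key -> {column name -> value}."""

    def __init__(self):
        self._rows = defaultdict(dict)

    def insert(self, row_key, column, value):
        # Rows are sparse: any row may carry any set of column names.
        self._rows[row_key][column] = value

    def get_slice(self, row_key, start="", finish=None):
        """Return (name, value) pairs with start <= name <= finish, by name."""
        cols = self._rows.get(row_key, {})
        return [
            (name, cols[name])
            for name in sorted(cols)  # columns are read back in name order
            if name >= start and (finish is None or name <= finish)
        ]


users = ColumnFamily()
users.insert("alice", "email", "alice@example.com")
users.insert("alice", "visit:2012-06-27", "book thread")
users.insert("bob", "visit:2012-06-26", "refcard")
users.insert("bob", "visit:2012-06-27", "book thread")

# A range of columns within one row, not a fixed set of fields.
bob_visits = users.get_slice("bob", "visit:", "visit:~")
```

Here "bob" has no email column at all, and his row can keep growing with one
column per visit. That wide, sparse row is the part of the model the word
"table" tends to hide.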

We will have to be smart about it, as Thrift, the CLI, the internals,
and the high-level clients will be like this for some time.

I definitely plan to add an entire chapter on CQL. I think we can put
it after the CLI chapter; the introduction of CQL can attempt to cover
the ground between the old school and the new school thinking.

