cassandra-dev mailing list archives

From "Mattmann, Chris A (3980)" <chris.a.mattm...@jpl.nasa.gov>
Subject Re: NewBie Question ~ Book for Cassandra
Date Mon, 13 Jun 2016 13:05:14 GMT
Hi Benjamin,



On 6/13/16, 6:38 AM, "Benjamin Lerer" <benjamin.lerer@datastax.com> wrote:

>Hi Chris,
>
>Disclaimer: I am a Datastax employee
>
>It is clear to me that the current official documentation is far from being
>enough. That's why I fully support the decision made by Jonathan to do our
>best to improve it.

Just as a small piece of advice - it seems like Jonathan is the “boss” of this
project. I’ve spoken with him here and there - he’s a great guy, don’t get me
wrong - but Apache projects don’t have bosses. He is the chair of the project -
that earns him the great glory of having to write a board report every month for
the first three months after the project is created, and quarterly thereafter.
The chair is expected to be the eyes and ears of the project for the board. The
project has a “Project Management Committee” (PMC) responsible jointly for
stewarding the project. There is also a “Committer” role at the ASF. Some
communities define PMC == C. The committer role does not have a binding VOTE on
releases of the software and/or on additions of new personnel to the project.

The reason I point this out (and I may have just misread) is that
it sounded like you suggested something like: Jonathan makes decisions for
the project; you all jump. And I am just saying I hope that’s not the case.
You all should have equal decision-making ability in the project, especially
on the PMC.

>
>As an Apache Cassandra Committer mostly working on the CQL layer, I know
>that we have done our best to keep the CQL documentation up to date
>(https://cassandra.apache.org/doc/cql3/CQL-3.0.html). Now, English not
>being the native language of some of us, and as we are not technical
>writers, I would not really be surprised if some external persons have done
>a better job than us.
>
>I think our goal should be to provide an accurate and reliable
>documentation for the project.

I would amend the above to add “for the project [at the ASF]”. That’s
the thing - as a *first* (and not *second*) thought, the ASF project
should be getting careful attention and that includes the documentation.


> Nevertheless, it seems legitimate to me to
>also provide links to external documentations, when people are asking for
>it, if others did a better job than us.

Sure, this happens in some projects from time to time. When there isn’t
a perception of control, it is possible to do this, especially if, alongside
the external links, there is some roadmap or plan for actually keeping
the ASF documentation up to date. Real data point here - I wrote a book about
Apache Tika, Tika in Action. This was done with frequent updates to
dev@tika.apache.org on what was going on. Eventually we worked with Manning
Publications to donate the code samples and examples from the book to the Apache
Tika project. Much of the book’s inspiration and examples made it into Apache Tika
in parallel to the goings-on outside.

In a neutral playing ground it’s sometimes fine to point to external sources.
When those external sources usually boil down to a company’s web pages, and
there is strong perception that company is controlling the project, you can see
the dichotomy here.

>
>The conclusion that we can draw from Bhuvan's response is that the official
>documentation is probably currently not good enough, as he is not pointing to
>it. I believe that once we have solved this problem, people will be
>more likely to make a reference to it. Until then, we should not be
>surprised if people are not pointing to it.

See above.

However also see that besides the current documentation, there needs to be
a roadmap for making Apache Cassandra and *its* documentation (not *DataStax’s*)
up to par for a basic user to build, deploy and run Cassandra. I don’t think that’s
the current case, is it?

Thanks for your email. I am hoping that we can work together to
get the project’s documentation (and also its governance) in a
better shape. 

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++




>
>On Sun, Jun 12, 2016 at 5:16 PM, Chris Mattmann <mattmann@apache.org> wrote:
>
>> Hi Harmeet,
>>
>> The dev list is the lifeblood of an Apache project, and
>> projects here at the ASF conduct 99% of their business in
>> public, not in private. The ASF is a non-profit for the
>> public good and we have a tradition of openness and
>> transparency.
>>
>> Even if the business isn’t pleasant sometimes, it must
>> be discussed, in public. The committers and PMC members for
>> the code base - the name of which is *Apache* Cassandra as
>> the project is here at the *Apache Software Foundation* -
>> are Apache Software Foundation committers first, when they
>> deal with or steward the Apache code-base. Even before their
>> $dayjobs.
>>
>> Cheers,
>> Chris
>>
>>
>> On 6/11/16, 11:54 PM, "mylisttech@gmail.com" <mylisttech@gmail.com> wrote:
>>
>> >Dear All,
>> >
>> >I am a user of Cassandra. I am grateful to each of you for providing your
>> >time as committers to the code base for a great product.
>> >
>> >This is what I wanted to suggest - could you gentlemen not create a group
>> >email ID to discuss matters of such importance amongst yourselves? I am not
>> >sure the dev list is the best place. I have been reading emails
>> >where insinuations have been made - that a particular company may hijack
>> >the code base etc.
>> >
>> >We are all developers, we love our code. I don't think this is the right
>> >forum to blow things out of proportion, read: wash dirty linen.
>> >
>> >Pardon me if you think my opinion or inputs are wrong.
>> >
>> >I am a newbie on Cassandra. I use it as an application developer. I don't
>> >have any intention to judge your experiences or thoughts. Just saying this
>> >could be done in a finer way without most of us getting to know about it.
>> >
>> >Regards,
>> >Harmeet
>> >
>> >
>> >
>> >On Jun 12, 2016, at 2:31, Tom Barber <tom.barber@meteorite.bi> wrote:
>> >
>> >> Looking at that thread, I'm surprised you didn't call Dave out as well,
>> >> that attitude did no one any favours.
>> >>
>> >>> Because lets all face the
>> >>> facts here, no one "likes" writing drivers and documentation, and I have
>> >>> done both for this project.
>> >>
>> >> That's clearly incorrect, I (and I suspect other people) like writing docs
>> >> because it means people can use your tools in a much easier manner than
>> >> looking through the code or unit tests.
>> >>
>> >> Tooling can be a burden but it doesn't excuse not writing docs, even if it
>> >> becomes a PMC type rule for committers to commit Docs for new features like
>> >> they should be committing unit tests. At least it improves what is shipped
>> >> with the Apache project in question.
>> >>
>> >> Tom
>> >>
>> >> On Sat, Jun 11, 2016 at 7:21 PM, Chris Mattmann <mattmann@apache.org> wrote:
>> >>
>> >>> Hi Russell,
>> >>>
>> >>> [CC/board@, board members may want to join the
>> >>> Apache Cassandra lists for specifics and further
>> >>> engagement]
>> >>>
>> >>> Multiple things that need to be addressed below, but TL;DR:
>> >>>
>> >>> 1. I have asked the Apache Cassandra PMC, and its chair, to provide
>> >>> a detailed description on how the project *isn’t* controlled by an
>> >>> external entity in its next monthly board report. The below further
>> >>> reinforces the control. Further, it reinforces the vitriol and
>> >>> name calling attitude when questioned and when someone suggests
>> >>> pointing to the Apache documentation and making it better as a first
>> >>> step. I plan on making it very loudly known at our next board meeting
>> >>> that something is awry. CC/board@ ahead of time on that.
>> >>>
>> >>> 2. You don’t seem to understand Apache. This is unfortunate.  I
>> >>> went to go look you up and see if you are a PMC member for Apache
>> >>> Cassandra. Funny enough, the main page doesn’t even link to the PMC
>> >>> (I couldn’t find a direct link). This isn’t even correct with respect
>> >>> to Apache branding guidelines here at the ASF. Shane, would you
>> >>> like to comment here? For an FYI to everyone, see:
>> >>> http://www.apache.org/foundation/marks/pmcs.html
>> >>>
>> >>> After a Google Search, I found this page:
>> >>> https://wiki.apache.org/cassandra/Committers
>> >>>
>> >>> That looks way out of date. Luckily there is the project.apache.org
>> >>> ASF page: https://projects.apache.org/committee.html?cassandra
>> >>>
>> >>> Which indicates you aren’t a committer or PMC member of the project.
>> >>> This is unfortunate. If you wrote a book for projects I work on, I
>> >>> would have hopefully long before and along the way got involved in
>> >>> the community, and encouraged you to contribute to the *core effort
>> >>> here at the ASF* and took you on the path towards becoming a PMC
>> >>> member in the *Apache project that is the core effort*.
>> >>>
>> >>> In short, I can see why you don’t understand Apache. It’s likely
>> >>> due to the fact that the Apache Cassandra PMC doesn’t seem to get
>> >>> it either. If they did, they would have worked to explain it to
>> >>> you.  More on that later.
>> >>>
>> >>> 3. The fact that you think “the companies that I try to [sic] vilify
>> >>> are the *future* of projects like this” isn’t just a statement that
>> >>> indicates you don’t get Apache. That someone in the community (which
>> >>> includes you even though you aren’t a committer or on the PMC) would
>> >>> think the “companies” are the “future” of any ASF project is just
>> >>> way way bad. Like way bad. Off the rails bad. We are *individuals*
>> >>> here, not companies.
>> >>>
>> >>> 4. You state you have written drivers and documentation for this
>> >>> project.  Yet you aren’t a PMC member or committer at the ASF. Ever
>> >>> scratch your head and wonder why? By itself, again, sometimes there
>> >>> are reasons for this. Taken in context, there is something REALLY
>> >>> wrong here.
>> >>>
>> >>> Now, more specific replies inline below. Jonathan and PMC members
>> >>> for Apache Cassandra. Please take time to explain in your report
>> >>> what’s going on. I’m hopeful with mentorship and guidance and time
>> >>> this can be addressed but right now, not really happy with what
>> >>> I’m seeing.
>> >>>
>> >>>
>> >>>
>> >>> **********
>> >>> Specific comments
>> >>>
>> >>> On 6/11/16, 9:48 AM, "Russell Bradberry" <rbradberry@gmail.com> wrote:
>> >>>
>> >>>> I respectfully disagree.  "Newbies" should be pointed in the direction that
>> >>>> will ensure the highest possibility of their success with the product.
>> >>>> This is the best decision for the project, regardless of where the
>> >>>> documentation may reside.
>> >>>
>> >>> While I agree with pointing Newbies to the point where
>> >>> there is the best documentation - I don’t agree that place
>> >>> should be outside of the Apache project.
>> >>>
>> >>>>
>> >>>> As one of the authors of an early book on Cassandra, the reason we wrote it
>> >>>> was because the ASF documentation was abysmal.
>> >>>
>> >>> What did you do to try and counteract this? Did you attempt to submit
>> >>> documentation patches and/or to submit documentation that would address
>> >>> that?
>> >>>
>> >>>> Now I am happy to say that
>> >>>> the book I wrote is obsolete, not just because it was written against an
>> >>>> early version of Cassandra, but because the external documentation is so
>> >>>> thorough the need for a book to be written is no longer present.
>> >>>
>> >>> I had no problem with your statement until you put “external” before the
>> >>> word “documentation”.
>> >>>
>> >>>>
>> >>>> If the ASF and the PMC want to promote internal documentation, then a
>> >>>> serious amount of time and effort needs to be put into the documentation.
>> >>>> This goes for every project in the ASF. The current state of documentation
>> >>>> in any of the Apache projects is sub-standard at best.
>> >>>
>> >>> This, unfortunately, is a strawman. I tell you that ASF projects should have
>> >>> the documentation that is required to run and should be the *first* place
>> >>> you point users to for your documentation. You respond, well the ASF projects
>> >>> have crappy documentation as a whole. I totally disagree with that. Here’s
>> >>> some examples: Tika, Nutch, Solr/Lucene, Subversion, HTTPD, Spark, Hadoop,
>> >>> Maven, I could easily go on.
>> >>>
>> >>> A project that has been around as long as *Apache* (note I keep putting
>> >>> *Apache* in front of the project name too - something I don’t see all too
>> >>> often so far and something you should get used to) Cassandra should know
>> >>> better. This isn’t a new Incubator project.
>> >>>
>> >>>>
>> >>>> You make mention, several times, of the community, and in this case the
>> >>>> community has decided that the best source of documentation is the one that
>> >>>> has had a company put financial investment into it.  You can't expect a
>> >>>> community of unpaid volunteers to be able to coordinate and contribute
>> >>>> something of that high quality.
>> >>>
>> >>> Yes, I can. And yes, we do. That’s what we do at the ASF. It’s worked
>> >>> for many, many years before Apache Cassandra. It will work long after
>> >>> it too.
>> >>>
>> >>>>
>> >>>> Full disclosure, I am *not* on the PMC, nor am I an employee of DataStax or
>> >>>> any other company that provides support for an open source project. I am a
>> >>>> member of the community that sees the highest probability of success of
>> >>>> this project being that the PMC supports the development of the core
>> >>>> product while the ancillary pieces like documentation and drivers get
>> >>>> supported by those who are paid to support it.  Because lets all face the
>> >>>> facts here, no one "likes" writing drivers and documentation, and I have
>> >>>> done both for this project.
>> >>>
>> >>> Plenty of people are paid to support OSS software, even OSS software at the
>> >>> ASF. But we must be diligent to wear our $dayjob hats, in contrast to the
>> >>> ASF hats, and to do what’s right for the effort at Apache, since in cases
>> >>> such as this, it is the *Apache* project, its community, and its license,
>> >>> that are friendly to downstream users (even companies).
>> >>>
>> >>>>
>> >>>> Suffice it to say, that in my opinion, these "companies" that you seem to
>> >>>> be trying so hard to vilify are the future of projects like this. They fill
>> >>>> the gap that the ASF leaves with its volunteer based model.
>> >>>>
>> >>>> Also, to address your thinly veiled and pointed comments as of late. It
>> >>>> seems you have already made up your mind about DataStax and are continuing
>> >>>> in an effort to prove your point.  Doing this in a public manner is toxic
>> >>>> for the community and will do nothing more than to divide it and risk
>> >>>> failure of the project.  I suggest you confer with the PMC and the company
>> >>>> *privately* to determine what is best for the project and ultimately the
>> >>>> community.
>> >>>
>> >>> This statement above, sadly, indicates how broken the governance of
>> >>> this project is. 99% of all discussion in the ASF is public. The only
>> >>> discussion in private is that about adding new PMC members and/or committers.
>> >>> Would have been nice for someone long long long before me, to tell you
>> >>> that.
>> >>>
>> >>> Cheers,
>> >>> Chris
>> >>>
>> >>>>
>> >>>> Best,
>> >>>> -Russell Bradberry
>> >>>>
>> >>>> On Sat, Jun 11, 2016 at 12:16 PM, Mattmann, Chris A (3980) <chris.a.mattmann@jpl.nasa.gov> wrote:
>> >>>>
>> >>>>> Hi Everyone,
>> >>>>>
>> >>>>> While this may be a current great source of documentation on
>> >>>>> Cassandra, and while it exists externally, the PMC should be
>> >>>>> promoting (and hopefully ensuring) that the source of documentation
>> >>>>> for Apache Cassandra is here at the ASF.
>> >>>>>
>> >>>>> I’m happy to be corrected that that is the case, and/or that
>> >>>>> I’ve missed something, but the first reply to questions like
>> >>>>> this from newbies shouldn’t be to point to an external website.
>> >>>>>
>> >>>>> Cheers,
>> >>>>> Chris
>> >>>>>
>> >>>>> On 6/11/16, 8:54 AM, "Bhuvan Rawal" <bhu1rawal@gmail.com> wrote:
>> >>>>>
>> >>>>>> Hi Deepak,
>> >>>>>>
>> >>>>>> You can try Datastax Docs, they are most extensive and updated
>> >>>>>> documentation available.
>> >>>>>> As Cassandra is a fast developing technology I wonder if there is a Book in
>> >>>>>> the market which covers latest features like Materialized Views/SASI Index
>> >>>>>> or new SSTable Format. I believe the best starting point would be the
>> >>>>>> Academy Tutorials and further Planet Cassandra - A week in Cassandra series
>> >>>>>> provides good overview of blogs and developments by Cassandra Evangelists.
>> >>>>>> It also provides link of top blogs which help understand internal working
>> >>>>>> of the Database.
>> >>>>>>
>> >>>>>> However if you still feel the need, you may refer to books, here are some
>> >>>>>> that I know of -
>> >>>>>> Beginning Apache Cassandra Development - Vivek Mishra - 2014 - Link
>> >>>>>> <https://www.amazon.com/Beginning-Apache-Cassandra-Development-Mishra/dp/1484201434>
>> >>>>>>
>> >>>>>> Cassandra Data Modeling and Analysis - 2014 C.Y. Kan - Link
>> >>>>>> <https://www.amazon.com/Cassandra-Data-Modeling-Analysis-C-Y/dp/1783988886/ref=sr_1_1?s=books&ie=UTF8&qid=1465659906&sr=1-1&keywords=cassandra+data+modeling+and+analysis>
>> >>>>>>
>> >>>>>> Mastering Apache Cassandra - Second Edition - March 26 2015 - Link
>> >>>>>> <https://www.amazon.com/gp/product/1784392618/ref=pd_lpo_sbs_dp_ss_3?pf_rd_p=1944687622&pf_rd_s=lpo-top-stripe-1&pf_rd_t=201&pf_rd_i=1484201434&pf_rd_m=ATVPDKIKX0DER&pf_rd_r=YVM1QBXHKAFK18J1XBAC>
>> >>>>>>
>> >>>>>> Cassandra Design Patterns - 2015 - Link
>> >>>>>> <https://www.amazon.com/Cassandra-Design-Patterns-Rajanarayanan-Thottuvaikkatumana/dp/178528570X/ref=sr_1_1?s=books&ie=UTF8&qid=1465659937&sr=1-1&keywords=cassandra+design+patterns>
>> >>>>>>
>> >>>>>> Cassandra High Availability - 2014 - Link
>> >>>>>> <https://www.amazon.com/Cassandra-High-Availability-Robbie-Strickland/dp/1783989122/ref=sr_1_1?s=books&ie=UTF8&qid=1465659975&sr=1-1&keywords=cassandra+high+availability>
>> >>>>>>
>> >>>>>> Learning Apache Cassandra - Manage Fault Tolerant and Scalable Real-Time
>> >>>>>> Data - 2015 - Link
>> >>>>>> <https://www.amazon.com/Learning-Apache-Cassandra-Tolerant-Real-Time/dp/1783989203/ref=sr_1_3?s=books&ie=UTF8&qid=1465659975&sr=1-3&keywords=cassandra+high+availability>
>> >>>>>>
>> >>>>>> Best Regards,
>> >>>>>> Bhuvan
>> >>>>>> Datastax Certified Architect
>> >>>>>>
>> >>>>>> On Sat, Jun 11, 2016 at 8:28 PM, Deepak Goel <deicool@gmail.com> wrote:
>> >>>>>>
>> >>>>>>> Hey
>> >>>>>>>
>> >>>>>>> Namaskara~Nalama~Guten Tag~Bonjour
>> >>>>>>>
>> >>>>>>> I am a newbie.
>> >>>>>>>
>> >>>>>>> Which would be the best book for a newbie to learn Cassandra?
>> >>>>>>>
>> >>>>>>> Thank You
>> >>>>>>> Deepak
>> >>>>>>>   --
>> >>>>>>> Keigu
>> >>>>>>>
>> >>>>>>> Deepak
>> >>>>>>> 73500 12833
>> >>>>>>> www.simtree.net, deepak@simtree.net
>> >>>>>>> deicool@gmail.com
>> >>>>>>>
>> >>>>>>> LinkedIn: www.linkedin.com/in/deicool
>> >>>>>>> Skype: thumsupdeicool
>> >>>>>>> Google talk: deicool
>> >>>>>>> Blog: http://loveandfearless.wordpress.com
>> >>>>>>> Facebook: http://www.facebook.com/deicool
>> >>>>>>>
>> >>>>>>> "Contribute to the world, environment and more :
>> >>>>>>> http://www.gridrepublic.org
>> >>>>>>> "
>> >>>>>>>
>> >>>>>
>> >>>
>> >>>
>>
>>