cassandra-dev mailing list archives

From Peter Tillotson <>
Subject Re: How is Cassandra being used?
Date Wed, 16 Nov 2011 09:10:43 GMT
I've read through the thread and have a few comments and an idea.

1) I can understand a preference for opt in
2) As a user I would have probably opted in every time I hit a performance issue
3) Opt in may well be skewed to poorer use cases or hardware issues
4) There is a trust gap that needs to be bridged before opt out is acceptable

Now for the idea: perhaps a report tool in nodetool that generates a human-readable profile,
with a manual submission process in the short term and perhaps full automation down the line.

So basically there are two good plans in your email
1) Standard reporting  (+1)
2) Automated feedback (opt in +1)


From: Jonathan Ellis <>
To: dev <>
Sent: Tuesday, 15 November 2011, 23:23
Subject: How is Cassandra being used?

I started a "users survey" thread over on the users list (replies are
still trickling in), but as useful as that is, I'd like to get
feedback that is more quantitative and with a broader base.  This will
let us prioritize our development efforts to better address what
people are actually using it for, with less guesswork.  For instance:
we put a lot of effort into compression for 1.0.0; if it turned out
that only 1% of 1.0.x users actually enable compression, then it means
that we should spend less effort fine-tuning that moving forward, and
use the energy elsewhere.

(Of course it could also mean that we did a terrible job getting the
word out about new features and explaining how to use them, but either
way, it would be good to know!)

I propose adding a basic cluster reporting feature to cassandra.yaml,
enabled by default.  It would send anonymous information about your
cluster to an VM.  Information like the number (but not names)
of keyspaces and columnfamilies, ks-level options like compression, cf
options like compaction strategy, data types (again, not names) of
columns, average row size (or better: the histogram data), and average
sstables per read.
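
To make the shape of such a report concrete, here is a minimal sketch in Python. It is not an existing Cassandra or nodetool feature: the function name, the input layout, and the report keys are all assumptions made up for illustration. It shows only the idea that names are read but never emitted, while counts, option values, and data types are kept. Row-size histograms are left out for brevity.

```python
from collections import Counter
from statistics import mean


def build_anonymous_report(keyspaces):
    """Summarise a cluster schema using counts and option values only.

    `keyspaces` is a hypothetical list of dicts. Keyspace, column family,
    and column names are read but never copied into the result.
    """
    cfs = [cf for ks in keyspaces for cf in ks["column_families"]]
    return {
        "keyspace_count": len(keyspaces),
        "column_family_count": len(cfs),
        # Tally each option setting, e.g. "compression=True".
        "keyspace_options": dict(Counter(
            f"{k}={v}" for ks in keyspaces for k, v in ks["options"].items())),
        "column_family_options": dict(Counter(
            f"{k}={v}" for cf in cfs for k, v in cf["options"].items())),
        # Only the data types of columns are kept, not the column names.
        "column_types": dict(Counter(
            t for cf in cfs for t in cf["columns"].values())),
        "avg_sstables_per_read": (
            mean(cf["sstables_per_read"] for cf in cfs) if cfs else None),
    }


# Hypothetical input; every name below is invented for the example.
example = [
    {
        "name": "billing",
        "options": {"compression": True},
        "column_families": [
            {"name": "invoices",
             "options": {"compaction_strategy": "SizeTieredCompactionStrategy"},
             "columns": {"customer_email": "UTF8Type", "total": "LongType"},
             "sstables_per_read": 2.0},
            {"name": "payments",
             "options": {"compaction_strategy": "LeveledCompactionStrategy"},
             "columns": {"card_ref": "UTF8Type"},
             "sstables_per_read": 1.0},
        ],
    },
]

report = build_anonymous_report(example)
print(report)
```

Aggregating into counts rather than listing per-table entries also keeps the payload small and makes it easy to inspect before anything is sent.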


Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support