incubator-cassandra-dev mailing list archives

From Dave Brosius <>
Subject Re: How is Cassandra being used?
Date Wed, 16 Nov 2011 01:24:51 GMT
+1 for an opt-in approach. To get better opt-in rates, perhaps prompt for it on start (once)
rather than hoping folks find it buried in the yaml.

Eric Evans <> wrote:

>On Tue, Nov 15, 2011 at 11:23 PM, Jonathan Ellis <> wrote:
>> I started a "users survey" thread over on the users list (replies are
>> still trickling in), but as useful as that is, I'd like to get
>> feedback that is more quantitative and with a broader base.  This will
>> let us prioritize our development efforts to better address what
>> people are actually using it for, with less guesswork.  For instance:
>> we put a lot of effort into compression for 1.0.0; if it turned out
>> that only 1% of 1.0.x users actually enable compression, then it means
>> that we should spend less effort fine-tuning that moving forward, and
>> use the energy elsewhere.
>> (Of course it could also mean that we did a terrible job getting the
>> word out about new features and explaining how to use them, but either
>> way, it would be good to know!)
>> I propose adding a basic cluster reporting feature to cassandra.yaml,
>> enabled by default.  It would send anonymous information about your
>> cluster to an VM.  Information like, number (but not names)
>> of keyspaces and columnfamilies, ks-level options like compression, cf
>> options like compaction strategy, data types (again, not names) of
>> columns, average row size (or better: the histogram data), and average
>> sstables per read.
>> Thoughts?
>I think this is potentially quite dangerous; there are a lot of people
>who get very twitchy at the idea of software that Phones Home.  I've
>seen this so many times, and in all cases it was for software a lot
>less sensitive than a database.
>I'm sure you've already considered this though, you're already talking
>about anonymity, and transparency, and what I assume is neutrality of
>the collection endpoint (can apache actually provide a VM; is that a
>thing?).  I'm just afraid that we'll scare people off before they can
>be properly convinced that it's all on the up-and-up.
>I'm curious to see what others think, but at the moment I'm hovering
>somewhere around a -0 if it were opt-in (off by default).
>Eric Evans
>Acunu | | @acunu