Subject: Re: Partition size
To: user@cassandra.apache.org
From: Mark Thomas
Date: Mon, 12 Sep 2016 12:10:50 +0100

On 09/09/2016 21:11, Benedict Elliott Smith wrote:
> Come on. This kind of inconsistent 'policing' is not helpful.

How is it inconsistent?
Since I subscribed to the mailing list on 22 August, this is the first instance I have seen of anyone providing a link to third party docs rather than the equivalent project hosted docs in response to a user question. If I missed any, please point them out. The lists are pretty busy and that, combined with my minimal technical knowledge of Cassandra, means it is perfectly possible I missed some.

I've done a quick double check of the user@ archives and while I do see a number of messages referencing 3rd party docs, those references were made by the OP rather than someone from the community providing an answer.

> By all means, push the /*committers*/ to improve the project docs as is
> happening, and to promote the internal resources over external ones.
>
> But Mark has absolutely no formal connection with the project, and his
> contributions have only been to file a couple of JIRA (all of which have
> so far been ignored by those of his colleagues who /are/ active
> community members, I'll note!). Shaming him for not linking docs that
> describe something /other/ than what he was even talking about is
> crossing the line IMO.

Any member of a project community (contributor, committer or PMC member) directing users to 3rd party docs in preference to project docs without a good reason is missing an opportunity to strengthen that project community.

> Linking to third-party resources is commonplace, the only difference I
> can see here is that these have been called "docs" by the authors,
> instead of a blog post, and Mark has a DataStax email address.

Linking to third party reference docs for an Apache project in response to a configuration question about that Apache project on one of the project's mailing lists is pretty unusual.

Linking to third party docs, blogs, etc is fairly common but they tend to be linked by the OP in the form of "I've followed the instructions I found here and it doesn't work".
The responses to such questions typically include links to the relevant parts of the Apache hosted docs. If the question is more involved then I have seen links to blogs, presentations, YouTube etc provided as an answer. If this happens multiple times for the same topic then it is usually added to an FAQ, wiki or similar along with an e-mail to the author to see if they'd be willing to contribute something to the docs.

> Would you have reacted this way if Aaron Morton linked a blog post by
> thelastpickle? Or a random user posted their own resources? Obviously not.

Wrong. My reaction was based on the content of the message (a link to 3rd party docs in response to a question when an equivalent link to project hosted docs was available) not on who sent it or their employer.

> I was initially all for the ASF endeavour to counteract DataStax'
> outsized influence on the project, and was hopeful you might achieve
> some positive change. Perhaps you may well still do. But it seems to
> me that the ASF behaviour is beginning to cross from constructive
> criticism of the project participants to prejudicially hostile behaviour
> against certain community members - and that is unlikely to result in a
> better project.
>
> You should be treating everyone consistently, in a manner that promotes
> project health.

It is not healthy if community members are directing users to 3rd party documentation in preference to the project's own documentation. If it is happening because the project's documentation is non-existent / wrong / poorly written / etc. then that is understandable (and would be an issue the project needed to address) but that was not the case in this instance.

There are many aspects to community health. In the grand scheme of things the single e-mail that started this particular discussion is in the noise. However, a consistent pattern of such e-mails would be much more troubling. My intent was to ensure that such a pattern did not form.
Whether people agree with my response or not, the community is hopefully more aware of the issue than it was previously.

Mark

> On Friday, 9 September 2016, Mark Thomas wrote:
>
> On 09/09/2016 16:46, Mark Curtis wrote:
> > If your partition sizes are over 100MB iirc then you'll normally see
> > warnings in your system.log, this will outline the partition key, at
> > least in Cassandra 2.0 and 2.1 as I recall.
> >
> > Your best friend here is nodetool cfstats which shows you the
> > min/mean/max partition sizes for your table. It's quite often used to
> > pinpoint large partitions on nodes in a cluster.
> >
> > More info here:
> > https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsCFstats.html
>
> Folks,
>
> It is *Apache* Cassandra. If you are going to point to docs, please
> point to the official Apache docs unless there is a very good reason
> not to.
>
> In this case:
>
> http://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html#compaction_large_partition_warning_threshold_mb
>
> looks to be the place.
>
> Mark
>
> > Thanks
> >
> > Mark
> >
> > On 9 September 2016 at 02:53, Anshu Vajpayee wrote:
> >
> > > Is there any way to get partition size for a partition key?