Return-Path: X-Original-To: apmail-cassandra-dev-archive@www.apache.org Delivered-To: apmail-cassandra-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 8F6A11113B for ; Tue, 3 Jun 2014 19:00:20 +0000 (UTC) Received: (qmail 19534 invoked by uid 500); 3 Jun 2014 19:00:19 -0000 Delivered-To: apmail-cassandra-dev-archive@cassandra.apache.org Received: (qmail 19496 invoked by uid 500); 3 Jun 2014 19:00:19 -0000 Mailing-List: contact dev-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@cassandra.apache.org Delivered-To: mailing list dev@cassandra.apache.org Received: (qmail 19486 invoked by uid 99); 3 Jun 2014 19:00:19 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 03 Jun 2014 19:00:19 +0000 X-ASF-Spam-Status: No, hits=1.0 required=5.0 tests=SPF_HELO_PASS,SPF_SOFTFAIL X-Spam-Check-By: apache.org Received-SPF: softfail (nike.apache.org: transitioning domain of schemouil@gmail.com does not designate 62.210.136.60 as permitted sender) Received: from [62.210.136.60] (HELO palantir.mithrandir.net) (62.210.136.60) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 03 Jun 2014 19:00:15 +0000 Received: from [192.168.142.11] (silmaril.mithrandir.net [82.241.251.98]) (Authenticated sender) by palantir.mithrandir.net (Postfix) with ESMTPSA id 8FF2820E1AD6 for ; Tue, 3 Jun 2014 20:59:50 +0200 (CEST) Message-ID: <538E1B25.5010605@gmail.com> Date: Tue, 03 Jun 2014 20:59:49 +0200 From: Simon Chemouil User-Agent: Thunderbird [Debian GNU/Linux, X11] MIME-Version: 1.0 To: dev@cassandra.apache.org Subject: Re: Refactoring cassandra service package References: <538D8CE6.3070408@gmail.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit X-Virus-Checked: Checked by ClamAV on apache.org Brandon Williams racontait le 03/06/2014 20:00: > Relevant: https://issues.apache.org/jira/browse/CASSANDRA-6881 Thanks for the pointer, couldn't find that issue. > On Tue, Jun 3, 2014 at 12:59 PM, Gary Dusbabek wrote: > >> On Tue, Jun 3, 2014 at 3:52 AM, Simon Chemouil >> wrote: >> >>> Hi, >>> >>> I'm new to Cassandra and felt like exploring and hacking on the code. I >>> was surprised to see the usage of so many mutable/global state statics >>> all over the service package (basically global variables/singletons). >>> >>> While I understand it can be practical to work with singletons, and that >>> in any case I'm not sure multi-tenant Cassandra (as in two different >>> Cassandra instances within the same process) would make sense at all (or >>> even work considering there is some native access going on with JNA), I >>> find static state can easily lead to tangled 'spaghetti' code (accessing >>> the singletons from anywhere, even where one shouldn't), and in general >>> it ties the code to the VM instance, rather than to the class. >>> >>> I tried to find if it was an actual design choice, but from my >>> understanding this is more something inherited from the early Cassandra >>> times at Facebook. I just found this thread[1] pointing to issue >>> CASSANDRA-741 (slightly more limited scope) that was marked as WONTFIX >>> because no one took it (but still marked as open for contribution). The >>> current code conventions also don't mention the usage of singletons >>> except by stating: "Do not extract interfaces (or abstract classes) >>> unless you actually need multiple implementations of it" (switching to a >>> "service"-style design doesn't require passing interfaces but it's >>> highly encouraged to help testability). >>> >>> So, I'd like to try to make this refactoring happen and remove all (or >>> most) mutable static state. It would be an easy way in for me in >>> Cassandra's internals (maybe to contribute further). I think it would >>> help testing (ability to unit test components without going to the >>> storage for instance) and in general modernize the code. It would also >>> make hacking on Cassandra easier because people could pick different >>> pieces without pulling the whole thing. >>> >>> It would definitely break backwards compatibility with current Java code >>> that directly embeds Cassandra / uses it as a library, but I would keep >>> the same abstraction so the refactoring would be easy. In any case, >>> backwards compatibility can be broken by many more changes than just >>> refactoring, and once this is done it will be easier to deal with >>> backwards compatibility. >>> >>> Obviously all ".instance" fields would be gone, and I'd try to fix >>> potential cyclic class dependencies and generally make sure classes >>> dependencies form a direct acyclic graph with CassandraDaemon as its >>> root. The basic idea is to have each 'service' component require all its >>> service dependencies in their constructor (and keeping them as a final >>> field), rather than getting them via the global namespace (singleton >>> instances). >>> >>> If I had it my way, I'd probably use a dependency injection framework, >>> namely Dagger which is as far as I knpw the lightest Java DI framework >>> actively developed (jointly developed by Square and Google's Java team >>> responsible for Guice & Guava), which has a neat compile-time annotation >>> processor that detects missing dependencies early on. It works with both >>> Android and J2SE and is very fast, simple and light (65kB vs 710kB for >>> Guice). >>> >>> So, the question is: would you guys accept such a patch? I'd rather not >>> do the work if it has no chance of being merged upstream :). >>> >> >> This has come up before. Let's face it, removing the singletons is a >> tempting proposition. >> >> Several of us have been down the path of trying to do it. >> >> At the end of the day, here's what you'd end up with (absolutely best >> case): >> >> 1. Modifying just about every class, sometimes substantially. >> 2. A huge patch for someone else to review. >> 3. No performance gains, no bug fixes. In fact, since so many classes have >> to be changed, I'd say that the risk of introducing a bug/regression is >> fairly likely. >> 4. Complicated merges when bugs need to be fixed in older versions. >> 5. More modular and testable code. >> >> So far, the positive aspects of 5 have not been able to trump the >> challenges presented by 1, 2, 3, and 4. >> >> Kind Regards, >> >> Gary. >> >> >>> >>> Cheers, >>> >>> -- >>> Simon >>> >>> >>> [1] >>> >>> >> http://grokbase.com/t/cassandra/dev/107xr48hek/creating-two-instances-in-code >>> >> >