From: Bobby Evans
To: "user@storm.incubator.apache.org"
CC: "dev@storm.incubator.apache.org"
Subject: Re: [DISCUSS] Pulling "Contrib" Modules into Apache
Date: Wed, 26 Feb 2014 17:35:46 +0000

I can see a lot of value in having a distribution of Storm that comes with batteries included, where everything is tested together and you know it works. But I don't see much long-term developer benefit in building them all together. If there is strong coupling between Storm and these external projects, so that they break when Storm changes, then we need to understand the coupling and decide whether we want to reduce it by stabilizing APIs, improving version numbering and the release process, etc., or whether the functionality is something that should be offered as a base service in Storm.

I can see politically the value of giving these other projects a home in Apache, and making them sub-projects is the simplest route to that. I'd love to have Storm on YARN inside Apache. I just don't want to go overboard with it. There was a time when HBase was a "contrib" module under Hadoop along with a lot of other things, and the Apache board came and told Hadoop to break it up.

Bringing storm-kafka into Storm does not sound like it will solve much from a developer's perspective, because there is at least as much coupling with Kafka as there is with Storm. I can see how it is a huge amount of overhead and pain to set up a new project just for a few hundred lines of code; as such, I am in favor of pulling in closely related projects, especially those that are spouts and state implementations. I just want to be sure that we do it carefully, with a good reason, and with enough people who are familiar with the code to support it long term.

If it starts to look like we are pulling in too many projects, perhaps we should look at something more like the Bigtop project (https://bigtop.apache.org/), which produces a tested distribution of Hadoop with many different sub-projects included in it.

I am also a bit concerned about these sub-projects becoming second-class citizens, where we break something but, because the build is off by default, we don't know it. I would prefer that they are built and tested by default. If the build and test time starts to take too long, to me that means we need to start wondering if we have too many contrib modules.

-Bobby

From: Brian Enochson
Reply-To: "user@storm.incubator.apache.org"
Date: Tuesday, February 25, 2014 at 9:50 PM
To: "user@storm.incubator.apache.org"
Cc: "dev@storm.incubator.apache.org"
Subject: Re: [DISCUSS] Pulling "Contrib" Modules into Apache

Hi,

I am in agreement with Taylor and believe I understand his intent. An incredible tool/framework/application like Storm is only enhanced by, and gains value from, the number of well-maintained and vetted modules that can be used for integration and for adding further functionality.

I am relatively new to the Storm community but have spent quite some time reviewing the contributed modules out there, reviewing various duplicates and running into some version incompatibilities. I understand the need to keep Storm itself pure, but I do think there needs to be some structure and governance added to the contributed modules. Look at the benefit a tool like npm brings to the Node community.

I like the idea of sponsorship, vetting and a community vote. I, as I am sure many would be, am willing to offer support and time to work through how to set this up and to help with the implementation if it is decided to pursue some solution.

I hope these views are taken in the spirit in which they are made: to make this incredible system even better, along with the surrounding ecosystem.

Thanks,
Brian

On Tue, Feb 25, 2014 at 9:36 PM, P. Taylor Goetz wrote:

Just to be clear (and play a little Devil's advocate :) ), I'm not suggesting that whatever a "contrib" project/module/subproject might become, it be a clearinghouse for anything Storm-related.

I see it as something that is well-vetted by the Storm community, subject to PPMC review, vote, etc. Entry would require community review, PPMC review, and in some cases ASF IP clearance/legal review. Anything added would require some level of commitment from the PPMC/committers to provide some level of support.

In other words, nothing "willy-nilly".

One option could be that any module added require (X > 0) number of committers to volunteer as "sponsor"s for the module, and commit to maintaining it.

That being said, I don't see storm-kafka being any different from anything else that provides integration points for Storm.

-Taylor

On Feb 25, 2014, at 7:53 PM, Nathan Marz wrote:

I'm only +1 for pulling in storm-kafka and updating it. Other projects put these contrib modules in a "contrib" folder and keep them managed as completely separate codebases. As it's not actually a "module" necessary for Storm, there's an argument there for doing it that way rather than via the multi-module route.

On Tue, Feb 25, 2014 at 4:39 PM, Milinda Pathirage wrote:

Hi Taylor,

I'm +1 for pulling these external libraries into the Apache codebase. This will certainly benefit the Storm community. I would also like to contribute to this process.

Thanks
Milinda

On Tue, Feb 25, 2014 at 5:28 PM, P. Taylor Goetz wrote:
> A while back I opened STORM-206 [1] to capture ideas for pulling in "contrib" modules to the Apache codebase.
>
> In the past, we had the storm-contrib github project [2], which subsequently got broken up into individual projects hosted on the stormprocessor github group [3] and elsewhere.
>
> The problem with this approach is that in certain cases it led to code rot (modules not being updated in step with Storm's API), fragmentation (multiple similar modules with the same name), and confusion.
>
> A good example of this is the storm-kafka module [4], since it is a widely used component. Because storm-contrib wasn't being tagged in github, a lot of users had trouble working out which versions of Storm it was compatible with. Some users built off specific commit hashes, some forked, and a few even pushed custom builds to repositories such as clojars. With kafka 0.8 now available, there are two main storm-kafka projects, the original (compatible with kafka 0.7) and an updated fork [5] (compatible with kafka 0.8).
>
> My intention is not to find fault in any way, but rather to point out the resulting pain, and work toward a better solution.
>
> I think it would be beneficial to the Storm user community to have certain commonly used modules like storm-kafka brought into the Apache Storm project. Another benefit worth considering is the licensing/legal oversight that the ASF provides, which is important to many users.
>
> If this is something we want to do, then the big question becomes what sort of governance process needs to be established to ensure that such things are properly maintained.
>
> Some random thoughts, questions, etc. that jump to mind include:
>
> What to call these things: "contrib modules", "connectors", "integration modules", etc.?
> Build integration: I imagine they would be a multi-module submodule of the main maven build. Probably turned off by default and enabled by a maven profile.
> Governance: Have one or more committer volunteers responsible for maintenance, merging patches, etc.? Proposal process for pulling new modules?
>
> I look forward to hearing others' opinions.
>
> - Taylor
>
> [1] https://issues.apache.org/jira/browse/STORM-206
> [2] https://github.com/nathanmarz/storm-contrib
> [3] https://github.com/stormprocessor
> [4] https://github.com/nathanmarz/storm-contrib/tree/master/storm-kafka
> [5] https://github.com/wurstmeister/storm-kafka-0.8-plus

--
Milinda Pathirage
PhD Student | Research Assistant
School of Informatics and Computing | Data to Insight Center
Indiana University
twitter: milindalakmal
skype: milinda.pathirage
blog: http://milinda.pathirage.org

--
Twitter: @nathanmarz
http://nathanmarz.com