Date: Thu, 24 Dec 2015 12:30:08 +0530
Subject: Re: Why are parser and formatter operator hidden in Malhar/contrib/schema
From: Shubham Pathak
To: dev@apex.incubator.apache.org
Hi,

Can we take a decision on whether all parsers need to be in one place
(either malhar lib or contrib)?

In this PR, https://github.com/apache/incubator-apex-malhar/pull/137, we are
moving the XML and JSON parsers from contrib to lib. I suggest we move the CSV
parsers as well.

I understand Thomas's point about dependencies, but shouldn't we also think
about users of Malhar who would want to have all parsers in a common bucket?

Thanks,
Shubham

On Fri, Dec 18, 2015 at 2:44 AM, Thomas Weise wrote:

> This has changed over time. Pieces that cannot be covered by unit tests
> should still be kept out of library.
>
> In addition, we cannot keep on expanding the number of dependencies in a
> single monolithic module. Hence, we are going to start creating smaller
> modules with their dependencies and tests set up correctly. The first of
> those will be Kafka, for which a PR is currently open.
>
> For lib, we can add more operators as long as they don't introduce new
> dependencies. When building an app and using one operator from lib,
> everything else comes with it. That's a problem for the application
> assembly we are trying to address.
>
> Thomas
>
>
> On Thu, Dec 17, 2015 at 1:06 PM, Sandesh Hegde wrote:
>
> > Even my understanding was along the lines, "if a unit test can't be run
> > on a developer machine without installing dependencies (RabbitMQ/Redis
> > etc.), it should go into contrib".
> >
> > Documenting the exact process as to what goes where helps.
> >
> > On Thu, Dec 17, 2015 at 12:16 PM Chinmay Kolhatkar
> > <chinmay@datatorrent.com> wrote:
> >
> > > Hi Isha,
> > >
> > > I think what Shubham meant to say is that any operator which has a
> > > dependency on an external entity (not a library) should go in contrib.
> > > Others should go in library.
> > >
> > > For example, DB-related operators need an instance/server of the
> > > respective database to be running. Without that running server, the
> > > operator cannot satisfy the functionality expected.
> > > Hence it goes into contrib.
> > >
> > > I think none of the parsers will have a dependency on an external
> > > entity, hence they should go in library and not contrib.
> > >
> > > - Chinmay.
> > > On 18 Dec 2015 00:42, "Isha Arkatkar" wrote:
> > >
> > > > Hi Shubham,
> > > >
> > > > I think it is somewhat subjective what goes in Contrib vs. Library.
> > > > Here is what I understood about the general guideline:
> > > > An operator would go in malhar-library if it does not have any
> > > > library dependency other than what is already available. If an
> > > > operator needs to include a dependency, it could be added to Contrib
> > > > but not library.
> > > >
> > > > If we add an external library dependency in Malhar-library, the size
> > > > of the lib jar keeps on growing. So, if we refer to malhar-lib in a
> > > > project, the size of the total package would increase, even if we
> > > > may not directly use all external libs.
> > > >
> > > > For this reason, in pull request #137, I moved only the JSON and XML
> > > > parsers to malhar-library. The CSV one had a library reference to
> > > > supercsv, so it is still in contrib.
> > > >
> > > > Please correct me if I missed something. :)
> > > >
> > > > Thanks!
> > > > Isha
> > > >
> > > > On Wed, Dec 16, 2015 at 11:27 PM, Shubham Pathak
> > > > <shubham@datatorrent.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > What do we mean by "additional dependency" when deciding contrib
> > > > > vs lib?
> > > > > As I understand it, an "additional dependency" is when an operator
> > > > > is interacting with external technologies, e.g. Kafka, MQ, HBase,
> > > > > etc.
> > > > > By this definition even CSV parsers need to be moved to malhar-lib.
> > > > >
> > > > > Thanks,
> > > > > Shubham
> > > > >
> > > > >
> > > > > On Thu, Dec 17, 2015 at 5:39 AM, Isha Arkatkar
> > > > > <isha@datatorrent.com> wrote:
> > > > >
> > > > > > Hi Chandni,
> > > > > >
> > > > > > I have moved that as well to lib, as the parsers depended on
> > > > > > that.
> > > > > >
> > > > > > Thanks,
> > > > > > Isha
> > > > > >
> > > > > > On Wed, Dec 16, 2015 at 1:01 PM, Chandni Singh
> > > > > > <chandni@datatorrent.com> wrote:
> > > > > >
> > > > > > > There is a converter package under com.datatorrent.contrib
> > > > > > > which has a Converter API. This belongs in library as well.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Chandni
> > > > > > >
> > > > > > > On Tue, Dec 15, 2015 at 1:29 PM, Chandni Singh
> > > > > > > <chandni@datatorrent.com> wrote:
> > > > > > >
> > > > > > > > Isha,
> > > > > > > >
> > > > > > > > Thanks for moving this. When you move these files, please
> > > > > > > > place them under a package which reflects its functionality.
> > > > > > > > I don't see the need for a package called schema.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Chandni
> > > > > > > >
> > > > > > > > On Tue, Dec 15, 2015 at 12:31 PM, Isha Arkatkar
> > > > > > > > <isha@datatorrent.com> wrote:
> > > > > > > >
> > > > > > > >> Hi,
> > > > > > > >>
> > > > > > > >> For the CSV parser there is an additional dependency. So,
> > > > > > > >> I'll move only JSON and XML to the new location.
> > > > > > > >>
> > > > > > > >> Thanks,
> > > > > > > >> Isha
> > > > > > > >>
> > > > > > > >> On Tue, Dec 15, 2015 at 11:42 AM, Thomas Weise
> > > > > > > >> <thomas@datatorrent.com> wrote:
> > > > > > > >>
> > > > > > > >> > As long as the operators don't introduce additional
> > > > > > > >> > dependencies they should be in lib.
> > > > > > > >> >
> > > > > > > >> >
> > > > > > > >> > On Tue, Dec 15, 2015 at 9:34 AM, Shubham Pathak
> > > > > > > >> > <shubham@datatorrent.com> wrote:
> > > > > > > >> >
> > > > > > > >> > > Hi Chandni,
> > > > > > > >> > >
> > > > > > > >> > > I had written those operators.
> > > > > > > >> > > Here is the JIRA for that:
> > > > > > > >> > > https://malhar.atlassian.net/browse/MLHR-1838
> > > > > > > >> > > You would find the entire discussion there.
> > > > > > > >> > >
> > > > > > > >> > > Why are all these operators under Malhar/contrib and not
> > > > > > > >> > > Malhar/lib?
> > > > > > > >> > > When I was writing the code I saw AbstractCsvParser in
> > > > > > > >> > > contrib and hence added them there.
> > > > > > > >> > >
> > > > > > > >> > > Recently I got to know which operators must go in
> > > > > > > >> > > contrib and which must go in lib.
> > > > > > > >> > > By that definition, these operators must belong to lib.
> > > > > > > >> > >
> > > > > > > >> > > Thanks,
> > > > > > > >> > > Shubham
> > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > >> > > On Tue, Dec 15, 2015 at 1:32 PM, Chandni Singh
> > > > > > > >> > > <chandni@datatorrent.com> wrote:
> > > > > > > >> > >
> > > > > > > >> > > > Hi,
> > > > > > > >> > > >
> > > > > > > >> > > > I just came across a couple of formatter and parser
> > > > > > > >> > > > operators which are under Malhar/contrib/schema.
> > > > > > > >> > > >
> > > > > > > >> > > > I have a couple of questions:
> > > > > > > >> > > > 1. What does schema denote here?
> > > > > > > >> > > > 2. Why are formatter/parser, which are functions,
> > > > > > > >> > > > placed under the schema package?
> > > > > > > >> > > > 3. Why are all these operators under Malhar/contrib
> > > > > > > >> > > > and not Malhar/lib?
> > > > > > > >> > > >
> > > > > > > >> > > > Thanks,
> > > > > > > >> > > > Chandni