apex-dev mailing list archives

From Pramod Immaneni <pra...@datatorrent.com>
Subject Re: More sensible modules/artifacts in malhar
Date Fri, 02 Oct 2015 23:28:08 GMT
We have to think about how people will find the operators and
dependencies when bundling their applications. The complaint I hear often
is that folks can't find the operators they are looking for. We should
be careful about how much more work this adds for users, who will now
have to search for and find all the dependencies.

Thanks

> On Oct 2, 2015, at 3:44 PM, David Yan <david@datatorrent.com> wrote:
>
> I actually don't think it makes sense any more to separate malhar-library
> and malhar-contrib after the breakup, especially since we are planning for
> a major release for these changes.
>
> People are often confused, myself included, about which operators should be
> in malhar-library and which ones should be in contrib.  Requiring a separate
> setup for unit tests should not be a criterion because the user of the
> library couldn't care less whether the unit test requires extra setup.  The
> factor of requiring extra dependencies isn't valid either because there are
> already dependencies of malhar-library now that apex does not have.
>
> We can retain them for backward compatibility purposes, but going forward new
> app packages should only use the baby artifacts, without denoting whether
> it's contrib or not.
>
> David
>
> On Tue, Sep 29, 2015 at 12:19 AM, Andy Perlitch <andy@datatorrent.com>
> wrote:
>
>> Hi all,
>>
>> This is a first cut at a plan to restructure malhar in a way that is more
>> portable and adherent to Maven's principles of modularity and dependency
>> management.
>>
>> Overview of Current Malhar Architecture
>> ---------------------------------------------------------------
>> The current malhar repo consists of several maven modules:
>>
>> * *malhar-library*
>>   operators which do not require additional transitive dependencies beyond
>> what Apex and Hadoop require
>> * *malhar-contrib*
>>   operators requiring other maven dependencies
>> * *malhar-demos*
>>   demo applications
>> * *malhar-samples*
>>   sample code showing example usage of malhar operators
>> * *malhar-apps*
>>   apex applications (currently only logstream)
>>
>>
>> Proposed Changes
>> ---------------------------------------------------------------
>>
>> 1. *Scrub malhar-library for any operators needing additional dependencies*
>>  `malhar-library` is intended to consist of only operators without extra
>> transitive dependencies. All operators should be checked for the necessity
>> of extra dependencies.
>>
>> 2. *Move operators from malhar-demos and malhar-apps into contrib (or
>> library if prudent)*
>>    There are various operators in both of these modules that are general
>> enough to move into library or contrib.
>>
>> 3. *Create modules for all contrib subfolders*
>>    All folders under `contrib/src/main/com/datatorrent/contrib/` should be
>> converted to modules of contrib and listed as such in `/contrib/pom.xml`.
>>    Additionally, each of these smaller contrib modules will have its own
>> version and dependencies.
>>
>> 4. *Use the Shade Plugin to allow for backwards-compatible fully-qualified
>> class names*
>>    This is made possible by the Shade Plugin's class relocation
>> <https://maven.apache.org/plugins/maven-shade-plugin/examples/class-relocation.html>
>> feature. This might be a bit error prone as well as confusing to use for
>> outside developers, but it must be done if these changes are to be made
>> prior to a major release.
>>
>>
>>
>> Let me know what you all think of this approach.
>>
>> Best,
>> Andy
>>
>>
>> On Tue, Sep 22, 2015 at 11:20 AM, Chetan Narsude <chetan@datatorrent.com>
>> wrote:
>>
>>> +1
>>>
>>> On Tue, Sep 22, 2015 at 11:08 AM, Gaurav Gupta <gaurav@datatorrent.com>
>>> wrote:
>>>
>>>> I agree with David. Each artifact should have its own version.
>>>>
>>>> Thanks
>>>> -Gaurav
>>>>
>>>>> On Tue, Sep 22, 2015 at 11:07 AM, David Yan <david@datatorrent.com>
>>>> wrote:
>>>>
>>>>> I actually think that each baby artifact should have its own version,
>>>>> because each artifact has its own interface and its own life cycle,
>>>>> especially after we break up the giant library, applications will depend on
>>>>> the baby artifacts instead of the giant library.  For example if there is
>>>>> no change in malhar-contrib-kafka (I think the name should actually be
>>>>> apex-malhar-kafka), we should not confuse users by bumping the version.
>>>>>
>>>>> David
>>>>>
>>>>> On Tue, Sep 22, 2015 at 9:03 AM, Andy Perlitch <andy@datatorrent.com>
>>>>> wrote:
>>>>>
>>>>>> Tushar,
>>>>>>
>>>>>> I agree that all modules should inherit the version from the "parent pom"
>>>>>> of the malhar repo. I think the benefits outweigh the cost of bumping
>>>>>> versions of components that haven't actually changed. I'd love to get
>>>>>> others' feedback on this as well.
>>>>>>
>>>>>> On another note, I plan on starting a spreadsheet/googledoc with the
>>>>>> possible groupings of operators into these modules. Stay tuned...
>>>>>>
>>>>>> -Andy
>>>>>>
>>>>>> On Mon, Sep 21, 2015 at 11:51 PM, Tushar Gosavi <tushar@datatorrent.com>
>>>>>> wrote:
>>>>>>
>>>>>>> +1 for the general idea
>>>>>>>
>>>>>>> Are these independent modules going to have independent versions? For
>>>>>>> example, if there is no change in the kafka operator between malhar 3.0 and
>>>>>>> malhar 4.0, will we increment the version of malhar-contrib-kafka to 4.0? I
>>>>>>> have learned from my previous project that it is easier to manage versions
>>>>>>> if we keep all modules at the same version level for a release, even if
>>>>>>> there is no change in a particular module.
>>>>>>>
>>>>>>> - Tushar.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Sep 18, 2015 at 12:18 AM, Timothy Farkas <tim@datatorrent.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I agree Andy's solution is better, but just for the sake of argument,
>>>>>>>> profiles can be inherited from a parent pom, so if the maven archetype
>>>>>>>> defines a new project with a parent pom with the correct profiles defined,
>>>>>>>> then the desired profiles can be activated in the pom of the new project.
>>>>>>>> It is no more complicated than adding additional dependencies to your
>>>>>>>> project.
>>>>>>>>
>>>>>>>> On Thu, Sep 17, 2015 at 10:32 AM, Sandesh Hegde <sandesh@datatorrent.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Currently all the dependencies in Malhar-Contrib are marked as optional.
>>>>>>>>> So users already have to modify the existing POM to use it in their
>>>>>>>>> project. So restructuring should be fine.
>>>>>>>>>
>>>>>>>>> On Thu, Sep 17, 2015 at 11:29 AM Chetan Narsude <chetan@datatorrent.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> The profiles are excellent when you are developing malhar-contrib. Profiles
>>>>>>>>>> do not work when you are using malhar-contrib. The problem Andy is trying
>>>>>>>>>> to solve is the latter. If there is an elegant solution using profiles
>>>>>>>>>> which I am missing, please correct me.
>>>>>>>>>>
>>>>>>>>>> The way Andy suggested is the way many successful projects do it. Look at
>>>>>>>>>> Netty as an example.
>>>>>>>>>>
>>>>>>>>>> +1 for that.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Chetan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Sep 17, 2015 at 11:22 AM, Timothy Farkas <tim@datatorrent.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I think restructuring the project in that way would be the technically
>>>>>>>>>>> correct thing to do, but if people are unwilling to accept the change in
>>>>>>>>>>> project structure you could achieve something similar by using maven
>>>>>>>>>>> profiles. With profiles the project structure would remain as is. Profiles
>>>>>>>>>>> could be added to the malhar pom, and a profile would define the
>>>>>>>>>>> dependencies needed for different types of operators. For example the hbase
>>>>>>>>>>> profile would define the dependencies for the hbase operator. Then any
>>>>>>>>>>> project using a malhar library would just activate the correct profile in
>>>>>>>>>>> its pom, and the correct dependencies would be pulled in.
>>>>>>>>>>>
>>>>>>>>>>> http://maven.apache.org/guides/introduction/introduction-to-profiles.html
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Sep 17, 2015 at 10:01 AM, Andy Perlitch <andy@datatorrent.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>
>>>>>>>>>>>> I am currently assigned to MLHR-1843
>>>>>>>>>>>> <https://malhar.atlassian.net/browse/MLHR-1843>, which essentially aims to
>>>>>>>>>>>> expose smaller, more consumable maven artifacts that would do away with the
>>>>>>>>>>>> need to manually include necessary dependencies based on the operators in
>>>>>>>>>>>> use.
>>>>>>>>>>>>
>>>>>>>>>>>> As an example, say I am building an app package that needs Kafka input and
>>>>>>>>>>>> output operators, but I don't want all the other transitive dependencies
>>>>>>>>>>>> that come via malhar-contrib. Currently I would need to specify
>>>>>>>>>>>> malhar-contrib as a dependency, and add an exclusions block in my app
>>>>>>>>>>>> package pom:
>>>>>>>>>>>>
>>>>>>>>>>>> <dependency>
>>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
>>>>>>>>>>>>   <artifactId>malhar-contrib</artifactId>
>>>>>>>>>>>>   <version>3.0.0</version>
>>>>>>>>>>>>   <!-- so none of malhar-contrib's deps are included -->
>>>>>>>>>>>>   <exclusions>
>>>>>>>>>>>>     <exclusion>
>>>>>>>>>>>>       <groupId>*</groupId>
>>>>>>>>>>>>       <artifactId>*</artifactId>
>>>>>>>>>>>>     </exclusion>
>>>>>>>>>>>>   </exclusions>
>>>>>>>>>>>> </dependency>
>>>>>>>>>>>>
>>>>>>>>>>>> Then, I would have to include the kafka library explicitly as a
>>>>>>>>>>>> dependency:
>>>>>>>>>>>>
>>>>>>>>>>>> <dependency>
>>>>>>>>>>>>   <groupId>org.apache.kafka</groupId>
>>>>>>>>>>>>   <artifactId>kafka_2.10</artifactId>
>>>>>>>>>>>>   <version>0.8.1.1</version>
>>>>>>>>>>>> </dependency>
>>>>>>>>>>>>
>>>>>>>>>>>> Wouldn't it be nice if I could just put this in my pom?
>>>>>>>>>>>>
>>>>>>>>>>>> <dependency>
>>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
>>>>>>>>>>>>   <artifactId>malhar-contrib-kafka</artifactId>
>>>>>>>>>>>>   <version>3.0.0</version>
>>>>>>>>>>>> </dependency>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> In order to make this possible, we will need to organize the malhar
>>>>>>>>>>>> project into more granular modules (artifacts). Specifically, the
>>>>>>>>>>>> malhar-contrib artifact would essentially just be a pom that specifies
>>>>>>>>>>>> each smaller module as a dependency:
>>>>>>>>>>>>
>>>>>>>>>>>> <!-- in malhar-contrib's pom.xml: -->
>>>>>>>>>>>>
>>>>>>>>>>>> <modules>
>>>>>>>>>>>>   <module>kafka</module>
>>>>>>>>>>>>   <module>twitter</module>
>>>>>>>>>>>>   <module>redis</module>
>>>>>>>>>>>>   <!-- other smaller modules -->
>>>>>>>>>>>> </modules>
>>>>>>>>>>>>
>>>>>>>>>>>> <dependency>
>>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
>>>>>>>>>>>>   <artifactId>malhar-contrib-kafka</artifactId>
>>>>>>>>>>>>   <version>3.0.0</version>
>>>>>>>>>>>> </dependency>
>>>>>>>>>>>>
>>>>>>>>>>>> <dependency>
>>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
>>>>>>>>>>>>   <artifactId>malhar-contrib-twitter</artifactId>
>>>>>>>>>>>>   <version>3.0.0</version>
>>>>>>>>>>>> </dependency>
>>>>>>>>>>>>
>>>>>>>>>>>> <dependency>
>>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
>>>>>>>>>>>>   <artifactId>malhar-contrib-redis</artifactId>
>>>>>>>>>>>>   <version>3.0.0</version>
>>>>>>>>>>>> </dependency>
>>>>>>>>>>>>
>>>>>>>>>>>> With these changes, there may be a risk of breaking backwards
>>>>>>>>>>>> compatibility, however I think the gain in usability of malhar merits the
>>>>>>>>>>>> effort to make this work.
>>>>>>>>>>>>
>>>>>>>>>>>> I am still relatively new to maven, so I would love to get some feedback
>>>>>>>>>>>> from other devs about this!
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Andy Perlitch
>>>>>>>>>>>> Software Engineer
>>>>>>>>>>>> DataTorrent Inc
>>>>>>>>>>>> (408)829-9319
>>
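
A minimal sketch of the profile approach Timothy describes in the quoted thread, assuming a parent pom that application projects inherit from. The profile id `hbase` follows his example; the `hbase-client` coordinates and version are placeholders chosen for illustration and do not come from the malhar pom.

```xml
<!-- Hypothetical fragment of a parent pom.xml.
     Profile id and dependency version are assumptions for illustration. -->
<profiles>
  <profile>
    <id>hbase</id>
    <dependencies>
      <!-- Pulled in only when the "hbase" profile is active -->
      <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-client</artifactId>
        <version>1.1.2</version>
      </dependency>
    </dependencies>
  </profile>
</profiles>
```

A project inheriting this parent could then build with `mvn package -P hbase`. As Chetan points out in the thread, this helps a project that inherits the pom; it does not help a project that merely lists malhar-contrib as a dependency.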

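The class relocation that Andy mentions in proposed change 4 could look roughly like the sketch below. It assumes a compatibility artifact that repackages a new module and moves its classes back to the old package name; both package names and the plugin version are hypothetical and not taken from the malhar build.

```xml
<!-- Hypothetical <build><plugins> entry for a backward-compatibility artifact.
     Package names and plugin version are assumptions for illustration. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.4.1</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- package used by the new, smaller module (assumed name) -->
            <pattern>com.datatorrent.malhar.kafka</pattern>
            <!-- package that existing applications already import (assumed name) -->
            <shadedPattern>com.datatorrent.contrib.kafka</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Relocation rewrites the class files and the references to them inside the shaded jar, so the output contains the classes only under the `shadedPattern` name. That is one reason the approach can be confusing for outside developers, as the proposal itself notes.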