incubator-general mailing list archives

From Seetharam Venkatesh <venkat...@innerzeal.com>
Subject Re: [VOTE] Accept Apex into the Apache Incubator
Date Thu, 13 Aug 2015 20:25:26 GMT
+1 (Non-binding)

On Thu, Aug 13, 2015 at 1:09 PM Julian Hyde <jhyde@apache.org> wrote:

> +1 (binding)
>
> Julian
>
>
> > On Aug 13, 2015, at 12:40 PM, Gaurav Gupta <gaurav@datatorrent.com> wrote:
> >
> > +1 (Non-binding)
> >
> > -Gaurav
> >
> >> On Aug 13, 2015, at 10:22 AM, Pramod Immaneni <pramod@datatorrent.com> wrote:
> >>
> >> +1 (Non-binding)
> >>
> >> On Thu, Aug 13, 2015 at 7:48 AM, P. Taylor Goetz <ptgoetz@apache.org> wrote:
> >>
> >>> Following the discussion thread [1], I would like to call a VOTE for
> >>> Accepting Apex as a new Apache Incubator project.
> >>>
> >>> The proposal is available on the wiki [2] and is also attached below.
> >>>
> >>> The VOTE will be open for at least 72 hours.
> >>>
> >>> [ ] +1 Accept Apex into the Incubator
> >>> [ ] ±0 No opinion
> >>> [ ] -1 Do not accept Apex into the Incubator because…
> >>>
> >>> Thanks,
> >>>
> >>> -Taylor
> >>>
> >>> [1] http://s.apache.org/apex_discuss
> >>> [2] https://wiki.apache.org/incubator/ApexProposal
> >>>
> >>>
> >>> == Abstract ==
> >>> Apex is an enterprise grade native YARN big data-in-motion platform that
> >>> unifies stream processing as well as batch processing. Apex processes big
> >>> data in-motion in a highly scalable, highly performant, fault tolerant,
> >>> stateful, secure, distributed, and easily operable way. It provides a
> >>> simple API that enables users to write or re-use generic Java code, thereby
> >>> lowering the expertise needed to write big data applications.
> >>>
> >>> Functional and operational specifications are separated. Apex is
> designed
> >>> in a way to enable users to write their own code (aka user defined
> >>> functions) as is and leave all operability to the platform. The API is
> very
> >>> simple and is designed to allow users to drop in their code as is. The
> >>> platform mainly deals with operability and treats functional code as a
> >>> black box. Operability includes fault tolerance, scalability, security,
> >>> ease of use, metrics api, webservices, etc. In other words there is no
> >>> separation of UDF (user defined functions), as all functional code is
> UDF.
> >>> This frees users to focus on functional development, and lets the platform
> >>> provide operability support. The same code runs as is with different
> >>> operability attributes. The data-in-motion architecture of Apex unifies
> >>> stream as well as batch processing in a single platform. Since Apex is
> a
> >>> native YARN application, it leverages all the components of YARN
> without
> >>> duplication. Apex was developed with YARN in mind and has no
> overlapping
> >>> components/functionality with YARN.
> >>>
> >>> The Apex platform is supplemented by project Malhar, which is a
> library of
> >>> operators that implement common business logic functions needed by
> >>> customers who want to quickly develop applications. These operators
> provide
> >>> access to HDFS, S3, NFS, FTP, and other file systems; Kafka, ActiveMQ,
> >>> RabbitMQ, JMS, and other message systems; MySQL, Cassandra, MongoDB,
> Redis,
> >>> HBase, CouchDB and other databases along with JDBC connectors. The
> Malhar
> >>> library also includes a host of other common business logic patterns
> that
> >>> help users to significantly reduce the time it takes to go into
> production.
> >>> Ease of integration with all other big data technologies is one of the
> >>> primary missions of Malhar.
> >>>
> >>> == Proposal ==
> >>> The goal of this proposal is to establish the core engine of the DataTorrent
> >>> RTS product as an Apache Software Foundation (ASF) project in order to
> >>> build a vibrant, diverse, and self-governed open source community
> around
> >>> the technology. DataTorrent will continue to sell management tools,
> >>> application building tools, easy to use big data applications, and
> custom
> >>> high end business logic operators. This proposal covers the Apex source
> >>> code (written in Java), Apex documentation and other materials
> currently
> >>> available on https://github.com/DataTorrent/Apex. This proposal also
> >>> covers the Malhar source code (written in Java), Malhar documentation,
> and
> >>> other materials currently available on
> >>> https://github.com/DataTorrent/Malhar. We have done a trademark check
> on
> >>> the name Apex, and have concluded that the Apex name is likely to be a
> >>> suitable project name.
> >>>
> >>> == Background ==
> >>> DataTorrent RTS is a mature and robust product developed as a native
> YARN
> >>> application. RTS 1.0 was launched in summer of 2014; RTS 2.0 was
> launched
> >>> in Jan 2015. Both were well received by customers. RTS 3.0 was launched at
> >>> the end of July 2015. RTS is among the first enterprise grade platforms that
> >>> were developed from the ground up as a native YARN application. DataTorrent RTS is
> >>> currently maintained by engineers as a closed source project. Even
> though
> >>> the engineers behind RTS are experienced software engineers and are
> >>> knowledge leaders in data-in-motion platforms, they have had little
> >>> exposure to the open source governance process. Customers are currently
> >>> running applications based on DataTorrent RTS in production.
> >>>
> >>> == Rationale ==
> >>> Big data applications written for non-Hadoop platforms typically
> require
> >>> major rewrites to get them to work with Hadoop. This rewriting creates a
> >>> significant bottleneck in terms of resources (expertise), which in turn
> >>> jeopardizes the viability of such an endeavour. It is hard enough to
> >>> acquire big data expertise; demanding additional expertise to do a major
> >>> code conversion makes it very hard for projects to successfully
> >>> migrate to Hadoop. Also, due to the batch processing nature of Hadoop’s
> >>> MapReduce paradigm, users often have to wait tens of minutes to see
> results
> >>> and act on them due to various delays in data flow. DataTorrent’s RTS
> >>> data-in-motion architecture is designed to address this problem. It
> enables
> >>> even the non big data developer to write code and operate it in a
> scalable,
> >>> fault tolerant manner. The big data-in-motion architecture of
> DataTorrent’s
> >>> RTS enables ease of integration into current enterprise infrastructure.
> >>> This goal was achieved by keeping the API simple and empowering users
> to
> >>> put in the connector code as is (or with minimal changes).
> >>>
> >>> Malhar is a manifestation of this reality: we or the customer
> >>> engineers were able to create these connectors within a day or so, or
> >>> otherwise within a week. Connectors include those that integrate with
> >>> message buses, file systems, databases, and other protocols, and more
> >>> continue to be added.
> >>> Over a period of time we expect users to simply pick a connector that
> >>> already exists in Malhar and quickly begin integrating with their
> current
> >>> enterprise infrastructure. Within the data-in-motion architecture a
> stream
> >>> application is one with connector(s) to say Kafka, JMS, or Flume;
> while a
> >>> batch application is one with connector(s) to HDFS, HBase, FTP, NFS,
> S3n
> >>> etc. This allows usage of the platform for both stream as well as batch
> >>> processing with the same business logic. Complete separation of user
> written
> >>> application code from all operational aspects of the system, as well as
> >>> support code for YARN, significantly expands the potential use cases
> that
> >>> can migrate to use Hadoop.
> >>>
> >>> Apex will enable the Hadoop eco-system to migrate a lot more use cases. It
> >>> will enable the Hadoop eco-system to deliver on a promise to rapidly
> >>> transform current IT infrastructure. Apex will help in significantly
> >>> increasing productization of big data projects. One of the main
> barometers
> >>> of success in the Hadoop eco-system is significant reduction of time to
> >>> market for big data applications migrating to Hadoop. We believe that
> Apex
> >>> will be one of the platforms that will enable users to extract value
> from
> >>> big data, by reducing time to market. This rapid innovation can be
> >>> optimally achieved through a vibrant, diverse, self-governed community
> >>> collectively innovating around Apex and the Malhar library, while at
> the
> >>> same time cross-pollinating with various other big data platforms. ASF
> is
> >>> an ideal place to meet this goal.
> >>>
> >>> == Initial Goals ==
> >>> Our initial goals are to bring Apex and Malhar repositories into the
> ASF,
> >>> adapt internal engineering processes to open development, and foster a
> >>> collaborative development model in accordance with the "Apache Way."
> >>> DataTorrent plans to develop new functionality in an open,
> community-driven
> >>> way. To get there, the existing internal build, test and release
> processes
> >>> will be refactored to support open development. We already have an
> active
> >>> user community on google groups that we intend to migrate to Apache.
> >>>
> >>> == Current Status ==
> >>> Currently, the project Apex code base is available under Apache 2.0
> >>> license (https://github.com/DataTorrent/Apex). Project Malhar code
> base
> >>> is available under Apache 2.0 license (
> >>> https://github.com/DataTorrent/Malhar). Project Malhar was open
> sourced 2
> >>> years ago which should make it easy for the project Malhar team to
> adapt to
> >>> an open, collaborative, and meritocratic environment. Contributors of
> >>> Malhar are employees of DataTorrent or have agreed to the shift to
> Apache.
> >>> Project Apex, in contrast, was developed as a proprietary,
> closed-source
> >>> product, but the internal engineering practices adopted by the
> development
> >>> team were common to Malhar, and should lend themselves well to an open
> >>> environment. DataTorrent plans to execute a software grant agreement as
> >>> part of the launch of the incubation of Apex as an Apache project.
> >>>
> >>> The DataTorrent team has always focused on building a robust end user
> >>> community of paying and non-paying customers. We think that the
> existing
> >>> community centered around the existing google groups mailing list
> should be
> >>> relatively easy to transform into an Apache-style community including
> both
> >>> users and developers.
> >>>
> >>> === Meritocracy ===
> >>> Our proposed list of initial committers includes the current RTS R&D
> team,
> >>> and our existing customers. This group will form a base for the broader
> >>> community we will invite to collaborate on the codebase. We intend to
> >>> radically expand the initial developer and user community by running
> the
> >>> project in accordance with the "Apache Way". Users and new contributors
> >>> will be treated with respect and welcomed. By participating in the
> >>> community and providing quality patches/support that move the project
> >>> forward, they will earn merit. They also will be encouraged to provide
> >>> non-code contributions (documentation, events, presentations, community
> >>> management, etc.) and will gain merit for doing so. Those with a proven
> >>> support and quality track record will be encouraged to become
> committers.
> >>>
> >>> === Community ===
> >>> If Apex is accepted for incubation, the primary initial goal will be
> >>> transitioning the core community towards embracing the Apache Way of
> >>> project governance. We will solicit major existing contributors to
> become
> >>> committers on the project from the start. It should be noted that the
> >>> existing community is already more diverse in many ways than some
> top-level
> >>> Apache projects. We expect that we can encourage even more diversity.
> >>>
> >>> === Core Developers ===
> >>> While a few core developers are skilled in working in openly governed
> >>> Apache communities, most of the core developers are currently NOT
> >>> affiliated with the ASF and would require new ICLAs before committing
> to
> >>> the project. There would also be a learning curve associated with this
> >>> on-boarding. Changing current development practices to be more open
> will be
> >>> an important step.
> >>>
> >>> === Alignment ===
> >>> The following existing ASF projects provide functionality related to that
> >>> provided by Apex and should be considered when reviewing the Apex proposal:
> >>>
> >>> Apache Hadoop® is a distributed storage and processing framework for
> very
> >>> large datasets focusing primarily on batch processing for analytic
> >>> purposes. Apex is a native YARN application. The Apex and Malhar
> roadmap
> >>> includes plans to continue to leverage YARN, and help the YARN
> community
> >>> develop the ability to support long running applications. Apex uses the DFS
> >>> interface for its core checkpoint/commit. Malhar has a large number of
> >>> operators that leverage HDFS and other Apache projects. Our roadmap
> >>> includes plans to continue to deepen the currently close integration
> with
> >>> HDFS.
> >>>
> >>> Apache HBase offers tabular data stored in Hadoop based on the Google
> >>> Bigtable model. Malhar has HBase connectors to ease integration with
> HBase.
> >>> The Malhar roadmap includes plans to continue to enhance integration with
> >>> Apache HBase.
> >>>
> >>> Apache Kafka offers distributed and durable publish-subscribe
> messaging.
> >>> Malhar integrates Kafka with Hadoop through feature rich connectors and
> >>> supports ingest as well as analytical functions on incoming data. Raw data
> >>> can be ingested from Kafka and results can be written to Kafka. The Malhar
> >>> roadmap includes plans to continue to enhance integration with Apache Kafka.
> >>>
> >>> Apache Flume is a distributed, reliable, and available service for
> >>> efficiently collecting, aggregating, and moving large amounts of log
> data.
> >>> Malhar has Flume connectors to ease integration with Flume. These
> >>> connectors ensure that ingestion with Flume is fault tolerant and thus can
> >>> be done in real-time with the same SLA as Flume’s HDFS connectors. The Malhar
> >>> roadmap includes plans to continue to enhance integration with Apache Flume.
> >>>
> >>> Apache Cassandra is a highly scalable, distributed key-value store that
> >>> focuses on eventual consistency. Malhar has connectors to ease integration
> >>> with Cassandra. The Malhar roadmap includes plans to continue to enhance
> >>> integration with Apache Cassandra.
> >>>
> >>> Apache Accumulo is a distributed key-value store based on Google’s
> >>> BigTable design. Malhar has connectors to ease integration with
> Accumulo.
> >>> The Malhar roadmap includes plans to continue to enhance integration
> with
> >>> Apache Accumulo.
> >>>
> >>> Apache Tez is aimed at building an application framework which allows for
> >>> a complex DAG of tasks for processing data. The Apex and Malhar roadmaps
> >>> include plans to integrate with Apache Tez but this is not currently
> >>> supported.
> >>>
> >>> Apache ActiveMQ and its sub project Apache Apollo offer a powerful
> >>> message queue framework. Malhar has ActiveMQ connectors that ease
> >>> integration with ActiveMQ.
> >>>
> >>> Apache Spark is an engine for processing large datasets, typically in a
> >>> Hadoop cluster. The Malhar project makes it easy for users to integrate with
> >>> Spark. The Malhar roadmap includes plans to continue to enhance integration
> >>> with Apache Spark.
> >>>
> >>> Apache Flink is an engine for scalable batch and stream data processing.
> >>> The Malhar project makes it easy for users to integrate with Flink. There is
> >>> overlap in how Flink leverages data-in-motion architecture for both stream
> >>> and batch processing, and it does subscribe to our thought process that
> >>> data-in-motion can handle both stream and batch, whereas a batch-only
> >>> engine will find it harder to manage streams. We differ in terms of how we
> >>> handle operability, user defined code, metrics, webservices etc. Apex is
> >>> very operations oriented, while Flink has much more focus on functional
> >>> elements. Malhar and rapid availability of common business logic is another
> >>> differentiator. We believe both these approaches are valid and the
> >>> community and innovation will gain through cross-pollination. We plan to
> >>> integrate with Apache Flink via HDFS for now.
> >>>
> >>> Apache Hive software facilitates querying and managing large datasets
> >>> residing in distributed storage. The Malhar project makes it easy for users
> >>> to integrate with Apache Hive. The Malhar roadmap includes plans to continue
> >>> to enhance integration with Apache Hive.
> >>>
> >>> Apache Pig is a platform for analyzing large data sets. Pig consists of a
> >>> high-level language for expressing data analysis programs, coupled with
> >>> infrastructure for evaluating these programs. The Apex and Malhar
> roadmaps
> >>> include plans to integrate with Apache Pig.
> >>>
> >>> Apache Storm is a distributed realtime computation system. Malhar makes it
> >>> easy for users to integrate with Apache Storm. We plan to integrate with
> >>> Apache Storm via HDFS for now. The Malhar roadmap includes plans to continue
> >>> to support mechanisms for integration with Apache Storm.
> >>>
> >>> Apache Samza is a distributed stream processing framework. Malhar makes it
> >>> easy for users to integrate with Apache Samza. We plan to integrate with
> >>> Apache Samza via HDFS or Apache Kafka for now. The Malhar roadmap includes
> >>> plans to continue to support mechanisms for integration with Apache Samza.
> >>>
> >>> Apache Slider is a YARN application to deploy existing distributed
> >>> applications on YARN, monitor them, and make them larger or smaller as
> >>> desired even when the application is running. Once Slider matures, we
> will
> >>> take a look at close integration of Apex with Slider.
> >>>
> >>> Project Malhar and Apex are aligned to many more Apache projects and
> other
> >>> open source projects as ease of integration with other technologies is
> one
> >>> of the primary goals of this project. These include Apache Solr,
> >>> ElasticSearch, MongoDB, Aerospike, ZeroMQ, CouchDB, CouchBase,
> MemCache,
> >>> Redis, RabbitMQ, Apache Derby.
> >>>
> >>> == Known Risks ==
> >>> Development has been sponsored mostly by a single company (DataTorrent,
> >>> Inc.) thus far and coordinated mainly by the core DataTorrent RTS and
> >>> Malhar team, with active participation from our current customers.
> >>>
> >>> For the project to fully transition to the Apache Way governance model,
> >>> development must shift towards the merit-centric model of growing a
> >>> community of contributors balanced with the needs for extreme
> stability and
> >>> core implementation coherency.
> >>>
> >>> The tools and development practices in place for the DataTorrent RTS
> and
> >>> Malhar products are compatible with the ASF infrastructure and thus we
> do
> >>> not anticipate any on-boarding pains. Migration from the current GitHub
> >>> repository is also expected to be straightforward.
> >>>
> >>> === Orphaned products ===
> >>> DataTorrent is fully committed to DataTorrent Apex and Malhar and the
> >>> product will continue to be based on the Apex project. Moreover,
> >>> DataTorrent has a vested interest in making Apex succeed by driving its
> >>> close integration with sister ASF projects. We expect this to further
> >>> reduce the risk of orphaning the product.
> >>>
> >>> === Inexperience with Open Source ===
> >>> DataTorrent has embraced open source software by open sourcing Malhar
> >>> project under Apache 2.0 license. The DataTorrent team includes
> veterans
> >>> from the Yahoo! Hadoop team. Although some of the initial committers
> have
> >>> not been developers on an entirely open source, community-driven
> project,
> >>> we expect to bring to bear the open development practices of Malhar to
> the
> >>> Apex project. Additionally, several ASF veterans agreed to mentor the
> >>> project and are listed in this proposal. The project will rely on their
> >>> guidance and collective wisdom to quickly transition the entire team of
> >>> initial committers towards practicing the Apache Way. DataTorrent is
> also
> >>> driving the Kafka on YARN (KOYA) initiative.
> >>>
> >>> === Homogeneous Developers ===
> >>> While most of the initial committers are employed by DataTorrent, we
> have
> >>> already seen a healthy level of interest from our existing customers
> and
> >>> partners. We intend to convert that interest directly into
> participation
> >>> and will be investing in activities to recruit additional committers
> from
> >>> other companies.
> >>>
> >>> === Reliance on Salaried Developers ===
> >>> Most of the contributors are paid to work in the Big Data space. While
> >>> they might wander from their current employers, they are unlikely to
> >>> venture far from their core expertise and thus will continue to be
> engaged
> >>> with the project regardless of their current employers.
> >>>
> >>> === Relationships with Other Apache Products ===
> >>> As mentioned in the Alignment section, Apex may consider various
> degrees
> >>> of integration and code exchange with Apache Hadoop (YARN and HDFS),
> Apache
> >>> Kafka, Apache HBase, Apache Flume, Apache Cassandra, Apache Accumulo,
> >>> Apache Tez, Apache Hive, Apache Pig, Apache Storm, Apache Samza, Apache
> >>> Spark, Apache Slider. Given the success that the DataTorrent RTS
> product
> >>> enjoyed, we expect integration points to be inside and outside the
> project.
> >>> We look forward to collaborating with these communities as well as
> other
> >>> communities under the Apache umbrella.
> >>>
> >>> === An Excessive Fascination with the Apache Brand ===
> >>> While we intend to leverage the Apache ‘branding’ when talking to other
> >>> projects as a testament to our project’s ‘neutrality’, we have no plans for
> >>> making use of the Apache brand in press releases or for posting billboards
> >>> advertising acceptance of Apex into the Apache Incubator.
> >>>
> >>>
> >>> == Documentation ==
> >>> The current state of the project documentation is available as part of
> >>> the GitHub repositories -
> >>> https://github.com/DataTorrent/Apex; https://github.com/DataTorrent/Malhar.
> >>> In addition, a list of demos that serve as a how-to guide is available at
> >>> https://github.com/DataTorrent/Malhar/tree/master/demos
> >>>
> >>> == Initial Source ==
> >>> DataTorrent has released the source code for Apex under the Apache 2.0
> >>> License at https://github.com/DataTorrent/Apex, and that of Malhar under the
> >>> Apache 2.0 License at https://github.com/DataTorrent/Malhar. We encourage ASF
> >>> community members interested in this proposal to download the source
> code,
> >>> review it and try out the software.
> >>>
> >>> == Source and Intellectual Property Submission Plan ==
> >>> As soon as Apex is approved to join the Apache Incubator, DataTorrent will
> >>> execute a Software Grant Agreement and the source code will be transitioned
> >>> onto ASF infrastructure. The code is already licensed under the Apache
> >>> Software License, version 2.0. We know of no legal encumbrances that would
> >>> inhibit the transfer of source code to the ASF.
> >>>
> >>> == External Dependencies ==
> >>> All dependencies fall under the permissive license categories, or weak
> >>> copyleft (http://www.apache.org/legal/resolved.html#category-b). We
> >>> intend to remove the dependencies on GPL licensed technologies on which
> >>> Apex or Malhar depend. These technologies are optional and have been marked
> >>> as such.
> >>>
> >>> Embedded dependencies (relocated):
> >>>  * None
> >>>
> >>> Runtime dependencies:
> >>>  * activemq-client
> >>>  * ant
> >>>  * async-http-client
> >>>  * bval-jsr303
> >>>  * commons-beanutils
> >>>  * commons-codec
> >>>  * commons-lang3
> >>>  * commons-compiler
> >>>  * embassador
> >>>  * fastutil
> >>>  * guava
> >>>  * hadoop-common
> >>>  * hadoop-common-tests
> >>>  * hadoop-yarn-client
> >>>  * httpclient
> >>>  * jackson-core-asl
> >>>  * jackson-mapper-asl
> >>>  * javax.mail
> >>>  * jersey-apache-client4
> >>>  * jersey-client
> >>>  * jetty-servlet
> >>>  * jetty-websocket
> >>>  * jline
> >>>  * kryo
> >>>  * named-regexp
> >>>  * netlet
> >>>  * rhino (GPL 2.0, optional)
> >>>  * slf4j-api
> >>>  * slf4j-log4j12
> >>>  * validation-api
> >>>  * xbean-asm5-shaded
> >>>  * zip4j
> >>>
> >>> Module or optional dependencies:
> >>>  * accumulo-core
> >>>  * aerospike-client
> >>>  * amqp-client
> >>>  * aws-java-sdk-kinesis
> >>>  * cassandra-driver-core
> >>>  * couchbase-client
> >>>  * CouchbaseMock
> >>>  * elasticsearch
> >>>  * geoip-api (LGPL, optional)
> >>>  * hbase
> >>>  * hbase-client
> >>>  * hbase-server
> >>>  * hive-exec
> >>>  * hive-service
> >>>  * hiveunit
> >>>  * javax.mail-api
> >>>  * jedis
> >>>  * jms-api
> >>>  * jri (GPL, optional)
> >>>  * jriengine (LGPL, optional)
> >>>  * jruby (LGPL, optional)
> >>>  * jython (PSF License, optional)
> >>>  * jzmq (LGPL, optional)
> >>>  * kafka_2.10
> >>>  * lettuce (GPL, optional)
> >>>  * libthrift
> >>>  * Memcached-Java-Client
> >>>  * mongo-java-driver
> >>>  * mqtt-client
> >>>  * mysql-connector-java (GPL2, optional)
> >>>  * org.ektorp
> >>>  * rengine (LGPL, optional)
> >>>  * rome
> >>>  * solr-core
> >>>  * solr-solrj
> >>>  * spymemcached
> >>>  * sqlite4java
> >>>  * super-csv
> >>>  * twitter4j-core
> >>>  * twitter4j-stream
> >>>  * uadetector-resources
> >>>  * org.apache.servicemix.bundles.splunk
> >>>
> >>> Build only dependencies:
> >>>  * None
> >>>
> >>> Test only dependencies:
> >>>  * activemq-broker
> >>>  * activemq-kahadb-store
> >>>  * greenmail
> >>>  * hadoop-yarn-server-tests
> >>>  * hsqldb
> >>>  * janino
> >>>  * junit
> >>>  * MockFtpServer
> >>>  * mockito-all
> >>>  * testng
> >>>
> >>> Cryptography: N/A
> >>>
> >>> == Required Resources ==
> >>> === Mailing lists ===
> >>>  * private@apex.incubator.apache.org (moderated subscriptions)
> >>>  * commits@apex.incubator.apache.org
> >>>  * dev@apex.incubator.apache.org
> >>>
> >>> === Git Repository ===
> >>>  * https://git-wip-us.apache.org/repos/asf/incubator-apex-core.git
> >>>  * https://git-wip-us.apache.org/repos/asf/incubator-apex-malhar.git
> >>>
> >>> === Issue Tracking ===
> >>>  * JIRA Project Apex (APEX_CORE) // If '_' is not allowed, use APEXCORE
> >>>  * JIRA Project Malhar (APEX_MALHAR) // If '_' is not allowed, use APEXMALHAR
> >>>
> >>> === Other Resources ===
> >>>  * Means of setting up regular builds for apex-core on builds.apache.org
> >>>  * Means of setting up regular builds for apex-malhar on builds.apache.org
> >>>
> >>> === Rationale for Malhar and Apex having separate git and jira ===
> >>> We managed Malhar and Apex as two repos and two jiras on purpose. Both
> >>> code bases are released under Apache 2.0 and are proposed for incubation.
> >>> In terms of our vision to enable innovation around a native YARN
> >>> data-in-motion platform that unifies stream processing as well as batch
> >>> processing, Malhar and Apex go hand in hand. Apex has a base API that
> >>> consists of a Java API (functional) and attributes (operability). Malhar is
> >>> a manifestation of this API, but from the user's perspective, Malhar is
> >>> itself an API to leverage business logic. Over the past three years we have
> >>> found that the cadence of releases and API changes in Malhar is much more
> >>> rapid than in Apex, and it was operationally much easier to separate them
> >>> into their own repos. Two repos will reflect a clear separation of engine
> >>> (Apex) and operators/business logic (Malhar). It will allow for independent
> >>> release cycles (operators can change independently of the engine thanks to
> >>> the stable API). We however do not believe in two levels of committers. We
> >>> believe there should be one community that works across both and innovates
> >>> with the idea that Malhar and Apex combined provide the value proposition.
> >>> We are proposing that the Apache incubation process help us foster
> >>> development of one community (mailing list, committers), and yet be ok with
> >>> two repos. We are proposing that this be taken up during incubation. The
> >>> community will learn if this works. The decision on whether to split them
> >>> into two projects can be taken after the learning curve during incubation.
> >>>
> >>> == Initial Committers ==
> >>>  * Roma Ahuja (rahuja at directv dot com)
> >>>  * Isha Arkatkar (isha at datatorrent dot com)
> >>>  * Raja Ali (raji at silverspringnet dot com)
> >>>  * Sunaina Chaudhary (SChaudhary at directv dot com)
> >>>  * Bhupesh Chawda (bhupesh at datatorrent dot com)
> >>>  * Chaitanya Chelobu (chaitanya at datatorrent dot com)
> >>>  * Bright Chen (bright at datatorrent dot com)
> >>>  * Pradeep Dalvi (pradeep dot dalvi at datatorrent dot com)
> >>>  * Sandeep Deshmukh (sandeep at datatorrent dot com)
> >>>  * Yogi Devendra (yogi at datatorrent dot com)
> >>>  * Cem Ezberci (hasan dot ezberci at ge dot com)
> >>>  * Timothy Farkas (tim at datatorrent dot com)
> >>>  * Ilya Ganelin (ilya dot ganelin at capitalone dot com)
> >>>  * Vitthal Gogate (vitthal_gogate at yahoo dot com)
> >>>  * Parag Goradia (parag dot goradia at ge dot com)
> >>>  * Tushar Gosavi (tushar at datatorrent dot com)
> >>>  * Priyanka Gugale (priyanka at datatorrent dot com)
> >>>  * Gaurav Gupta (gaurav at datatorrent dot com)
> >>>  * Sandesh Hegde (sandesh at datatorrent dot com)
> >>>  * Siyuan Hua (siyuan at datatorrent dot com)
> >>>  * Ajith Joseph (ajoseph at silverspring dot com)
> >>>  * Amol Kekre (amol at datatorrent dot com)
> >>>  * Chinmay Kolhatkar (chinmay at datatorrent dot com)
> >>>  * Pramod Immaneni (pramod at datatorrent dot com)
> >>>  * Anuj Lal (anuj dot lal at ge dot com)
> >>>  * Dongsu Lee (dlee3 at directv dot com)
> >>>  * Vitaly Li (blossom dot valley at gmail dot com)
> >>>  * Dean Lockgaard (dean at datatorrent dot com)
> >>>  * Rohan Mehta (rohan_mehta at apple dot com)
> >>>  * Adi Mishra (apmishra at directv dot com, adi dot mishra at gmail dot com)
> >>>  * Chetan Narsude (chetan at datatorrent dot com)
> >>>  * Darin Nee (dnee at silverspring dot com)
> >>>  * Alexander Parfenov (sasha at datatorrent dot com)
> >>>  * Andrew Perlitch (andy at datatorrent dot com)
> >>>  * Shubham Phatak (shubham at datatorrent dot com)
> >>>  * Ashwin Putta (ashwin at datatorrent dot com)
> >>>  * Rikin Shah (shah_rikin at yahoo dot com)
> >>>  * Luis Ramos (l dot ramos at ge dot com)
> >>>  * Munagala Ramanath (ram at datatorrent dot com)
> >>>  * Vlad Rozov (vlad dot rozov at datatorrent dot com)
> >>>  * Atri Sharma (atri dot jiit at gmail dot com)
> >>>  * Chandni Singh (chandni at datatorrent dot com)
> >>>  * Venkatesh Sivasubramanian (venkateshs at ge dot com)
> >>>  * Aniruddha Thombare (aniruddha at datatorrent dot com)
> >>>  * Jessica Wang (jessica at datatorrent dot com)
> >>>  * Thomas Weise (thomas at datatorrent dot com)
> >>>  * David Yan (david at datatorrent dot com)
> >>>  * Kevin Yang (yang dot k at ge dot com)
> >>>  * Brennon York (brennon dot york at capitalone dot com)
> >>>
> >>> == Affiliations ==
> >>>  * Apple: Vitaly Li, Rohan Mehta
> >>>  * Barclays: Atri Sharma
> >>>  * Class Software: Justin Mclean
> >>>  * CapitalOne: Ilya Ganelin, Brennon York
> >>>  * DataTorrent: everyone else on this proposal
> >>>  * Datachief: Rikin Shah
> >>>  * DirecTV: Roma Ahuja, Sunaina Chaudhary, Dongsu Lee, Adi Mishra
> >>>  * E8security: Vitthal Gogate
> >>>  * General Electric: Cem Ezberci, Parag Goradia, Anuj Lal, Luis Ramos,
> >>> Venkatesh Sivasubramanian, Kevin Yang
> >>>  * Hortonworks: Alan Gates, Taylor Goetz, Chris Nauroth, Hitesh Shah
> >>>  * MapR: Ted Dunning
> >>>  * SilverSpring Networks: Raja Ali, Ajith Joseph, Darin Nee
> >>>
> >>> == Sponsors ==
> >>>
> >>> === Champion ===
> >>> Ted Dunning
> >>>
> >>> === Nominated Mentors ===
> >>>
> >>> The initial mentors are listed below:
> >>>  * Ted Dunning - Apache Member, MapR
> >>>  * Alan Gates - Apache Member, Hortonworks
> >>>  * Taylor Goetz - Apache Member, Hortonworks
> >>>  * Justin Mclean - Apache Member, Class Software
> >>>  * Chris Nauroth - Apache Member, Hortonworks
> >>>  * Hitesh Shah - Apache Member, Hortonworks
> >>>
> >>> === Sponsoring Entity ===
> >>>
> >>> We would like to propose that the Apache Incubator sponsor this project.
> >>>
> >>>
> >
>
>
>
>
