incubator-cvs mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Incubator Wiki] Trivial Update of "ApexProposal" by AmolKekre
Date Tue, 04 Aug 2015 06:52:02 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Incubator Wiki" for change notification.

The "ApexProposal" page has been changed by AmolKekre:
https://wiki.apache.org/incubator/ApexProposal?action=diff&rev1=14&rev2=15

Comment:
cleanup

  == Abstract ==
  Apex is an enterprise-grade, native YARN big data-in-motion platform that unifies stream
processing and batch processing. Apex processes big data in motion in a highly scalable,
highly performant, fault-tolerant, stateful, secure, distributed, and easily operable way.
It provides a simple API that enables users to write or re-use generic Java code, thereby
lowering the expertise needed to write big data applications.
  
- Functional and operational specifications are separated. Apex is designed in a way to enable
users to write their own code (aka user defined functions) as is and leave all operability
to the platform. The API is very simple and is designed to allow users to drop in their code
as is. The platform mainly deals with operability and treats functional code as a black box.
Operability includes fault tolerance, scalability, security, ease of use, metrics api, webservices
etc. In other words there is no separation of UDF (user defined functions), as all functional
code is UDF. This frees users to focus on functional development, and lets platform provide
operability support. The same code runs as is with different operability attributes. The data-in-motion
architecture of Apex unifies stream as well as batch processing in a single platform. Since
Apex is a native YARN application, it leverages all the components of YARN without duplication.
Apex was developed with YARN in mind and has no overlapping components/functionality with
YARN. 
+ Functional and operational specifications are separated. Apex is designed in a way to enable
users to write their own code (aka user defined functions) as is and leave all operability
to the platform. The API is very simple and is designed to allow users to drop in their code
as is. The platform mainly deals with operability and treats functional code as a black box.
Operability includes fault tolerance, scalability, security, ease of use, metrics api, webservices
etc. In other words there is no separation of UDF (user defined functions), as all functional
code is UDF. This frees users to focus on functional development, and lets platform provide
operability support. The same code runs as is with different operability attributes. The data-in-motion
architecture of Apex unifies stream as well as batch processing in a single platform. Since
Apex is a native YARN application, it leverages all the components of YARN without duplication.
Apex was developed with YARN in mind and has no overlapping components/functionality with
YARN.
  
- The Apex platform is supplemented by project Malhar which is a library of operators that
implement common business logic functions needed by customers who want to quickly develop
applications. These operators provide access to HDFS, S3, NFS, FTP, and other file systems;
 Kafka, ActiveMQ, RabbitMQ, JMS, and other message systems; MySql, Cassandra, MongoDB, Redis,
HBase, CouchDB and other databases along with JDBC connectors. The Malhar library also includes
a host of other common business logic patterns that help users to significantly reduce the
time it takes to go into production. Ease of integration with all other big data technologies
is one of the primary missions of Malhar.
+ The Apex platform is supplemented by project Malhar, which is a library of operators that
implement common business logic functions needed by customers who want to quickly develop
applications. These operators provide access to HDFS, S3, NFS, FTP, and other file systems;
 Kafka, ActiveMQ, RabbitMQ, JMS, and other message systems; MySql, Cassandra, MongoDB, Redis,
HBase, CouchDB and other databases along with JDBC connectors. The Malhar library also includes
a host of other common business logic patterns that help users to significantly reduce the
time it takes to go into production. Ease of integration with all other big data technologies
is one of the primary missions of Malhar.
  
  == Proposal ==
  The goal of this proposal is to establish the core engine of the DataTorrent RTS product as
an Apache Software Foundation (ASF) project in order to build a vibrant, diverse, and self-governed
open source community around the technology. DataTorrent will continue to sell management
tools, application building tools, easy-to-use big data applications, and custom high-end
business logic operators. This proposal covers the Apex source code (written in Java), Apex
documentation, and other materials currently available on https://github.com/DataTorrent/Apex.
This proposal also covers the Malhar source code (written in Java), Malhar documentation,
and other materials currently available on https://github.com/DataTorrent/Malhar. We have
done a trademark check on the name Apex, and have concluded that the Apex name is likely to
be a suitable project name. 

---------------------------------------------------------------------
To unsubscribe, e-mail: cvs-unsubscribe@incubator.apache.org
For additional commands, e-mail: cvs-help@incubator.apache.org

