incubator-cvs mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Incubator Wiki] Trivial Update of "HivemallProposal" by MakotoYui
Date Mon, 22 Aug 2016 08:05:44 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Incubator Wiki" for change notification.

The "HivemallProposal" page has been changed by MakotoYui:
https://wiki.apache.org/incubator/HivemallProposal?action=diff&rev1=1&rev2=2

Comment:
Updated about the original license

  
  == Proposal ==
  
- Hivemall is a collection of machine learning algorithms and versatile data analystics functions.
It provides a number of ease of use machine learning functionalites through user-defined function
(UDF), user-defined aggregate function (UDAFs), and/or user-defined table generating functions
(UDTFs) of Apache Hive. It offers a variety of functionalities: regression, classification,
recommendation, anomaly detection, k-nearest neighbor, and feature engineering. Hivemall supports
state-of-the-art machine learning algorithms such as Soft Confidence Weighted, Adaptive Regularization
of Weight Vectors, Factorization Machines, and AdaDelta. Hivemall is mainly designed to run
on Apache Hive but it also supports Apache Pig and Apache Spark for the runtime.
+ Hivemall is a collection of machine learning algorithms and versatile data analytics functions.
It provides a number of ease of use machine learning functionalities through user-defined
function (UDF), user-defined aggregate function (UDAFs), and/or user-defined table generating
functions (UDTFs) of Apache Hive. It offers a variety of functionalities: regression, classification,
recommendation, anomaly detection, k-nearest neighbor, and feature engineering. Hivemall supports
state-of-the-art machine learning algorithms such as Soft Confidence Weighted, Adaptive Regularization
of Weight Vectors, Factorization Machines, and AdaDelta. Hivemall is mainly designed to run
on Apache Hive but it also supports Apache Pig and Apache Spark for the runtime.
  
  == Background ==
  
  Hivemall started as a research project of the main developer at National Institute of Advanced
Industrial Science and Technology (AIST) in 2013 and the initial version was released on 2
Oct, 2013 on Github: https://github.com/myui/hivemall.
  
- After the main developer moving to Treasure Data in 2015, the project has been actively
developed as an open source product and changed the license from GPL v2 to Apache License
v2 on Mar 16, 2015. The project copyright holders agreed to change the license then.
+ After the main developer moving to Treasure Data in 2015, the project has been actively
developed as an open source product and changed the license from GNU LGPL v2.1 to Apache License
v2 on Mar 16, 2015. The project copyright holders agreed to change the license then.
  
  The community is growing incrementally and the project has 15 contributors, 431 stars, and
131 forks on Github as of Aug 15, 2016. The project was awarded for the InfoWorld Bossie Awards
(the best open source big data tools) in 2014.
  
@@ -22, +22 @@

  
  == Rationale ==
  
- User-defined function is a powerful mechanism to enrich the expressive power of decrative
query languages like SQL, HiveQL, PigLatin, Spark SQL. Hive UDF interface is now becoming
the de-facto standard for SQL-on-Hadoop platforms; Apache Spark and Apache Pig have full supports
for Hive UDFs/UDAFs/UDTFs, and Apache Impala, Apache Drill, and Apache Tajo also have limited
supports for Hive UDFs/UDAFs.
+ User-defined function is a powerful mechanism to enrich the expressive power of declarative
query languages like SQL, HiveQL, PigLatin, Spark SQL. Hive UDF interface is now becoming
the de-facto standard for SQL-on-Hadoop platforms; Apache Spark and Apache Pig have full supports
for Hive UDFs/UDAFs/UDTFs, and Apache Impala, Apache Drill, and Apache Tajo also have limited
supports for Hive UDFs/UDAFs.
  
  Hivemall can be considered as a cross platform library for machine learning as Hivemall
is implemented as cross platform Hive UDFs/UDAFs/UDTFs; prediction models built by a batch
query of Apache Hive can be used on Apache Spark/Pig, and conversely, prediction models build
by Apache Spark can be used from Apache Hive/Pig.
  
@@ -37, +37 @@

  == Initial Goals ==
  
  The initial goals are as follows: 
-  - Establish the project governance in the Apache way and broaden the community
+  * Establish the project governance in the Apache way and broaden the community
-  - Improve documentations.
+  * Improve documentations.
-  - Adding more unit/scenario tests.
+  * Adding more unit/scenario tests.
-  - Handover of code and copyrights
+  * Handover of code and copyrights
  
  == Current Status ==
  
@@ -58, +58 @@

  
  === Community ===
  
- While there are 15 contributors in total, there are 3-4 active developers continously involved
for the major feature development at the moment.  We hope to extend our contributor base and
encourages suggestions and contributions from any potential user.
+ While there are 15 contributors in total, there are 3-4 active developers continuously involved
for the major feature development at the moment.  We hope to extend our contributor base and
encourages suggestions and contributions from any potential user.
  
  === Core Developers ===
  
@@ -66, +66 @@

  
  === Alignment ===
  
- Incubating at ASF is the natural choice for the Hivemall project because the Hivemall is
targetting to run on Apache Hive, Apache Spark, and Apache Pig. We encourage integrations
with other ASF data processing frameworks like Apache Impala and Apache Drill.
+ Incubating at ASF is the natural choice for the Hivemall project because the Hivemall is
targeting to run on Apache Hive, Apache Spark, and Apache Pig. We encourage integrations with
other ASF data processing frameworks like Apache Impala and Apache Drill.
  
  == Known Risks ==
  
@@ -88, +88 @@

  
  === Reliance on Salaried Developers ===
  
- The major developer is paid by his employer to contribute to this project and the other
developers are payed by their employers for Hadoop-related open source development. While
they might chage their affiliations over time, they are willing to have their expertise for
the open source development. So, the project would continue regardless their affiliations.
+ The major developer is paid by his employer to contribute to this project and the other
developers are payed by their employers for Hadoop-related open source development. While
they might change their affiliations over time, they are willing to have their expertise for
the open source development. So, the project would continue regardless their affiliations.
  
  === Relationships with Other Apache Products ===
  
@@ -143, +143 @@

  
  The dependencies all have Apache compatible licenses.
  
- Cryptography N/A
+ == Cryptography ==
+ 
+ N/A
  
  == Required resources ==
  
@@ -158, +160 @@

  
  https://git-wip-us.apache.org/repos/asf/incubator-hivemall.git
  
- === JIRA assistence ===
+ === JIRA assistance ===
  
  JIRA project Hivemall (HIVEMALL)
  
@@ -186, +188 @@

  == Sponsors ==
  
  === Champion ===
- 
- Roman Shaposhnik (Pivotal, ASF member, IPMC member) Apache Bigtop/Incubator PMC member 
+  * Roman Shaposhnik (Pivotal, ASF member, IPMC member) Apache Bigtop/Incubator PMC member

  
  === Nominated Mentors ===
  

---------------------------------------------------------------------
To unsubscribe, e-mail: cvs-unsubscribe@incubator.apache.org
For additional commands, e-mail: cvs-help@incubator.apache.org

