hama-dev mailing list archives

From Minho Kim <mi...@apache.org>
Subject Re: [DISCUSSION] Spinoff ANN package
Date Thu, 06 Aug 2015 02:20:53 GMT
+1
I would like to participate too. :-)

2015-08-06 6:12 GMT+09:00 Behroz Sikander <behroz89@gmail.com>:

> +1
> I would also like to participate :)
>
> On Wed, Aug 5, 2015 at 5:52 AM, Edward J. Yoon <edwardyoon@apache.org>
> wrote:
>
> > Guys,
> >
> > I plan to submit the 'DNN platform on top of Apache Hama' proposal
> > below. I know the Hama community is somewhat small, but the main
> > reason for the spinoff is that this domain-specific project is not a
> > good fit for the Apache Hama community. Recruiting volunteers is also
> > a hard problem. I expect this will become a very nice use case for
> > Apache Hama.
> >
> > If you have any suggestions or other opinions, please let me know.
> > Also, if you want to participate in this project, please feel free to
> > add your name here.
> >
> > Thanks!
> >
> > --
> > == Abstract ==
> >
> > Horn [hɔ:n] (a tentative name; "Horn" means "Spirit" in Korean) is a
> > neuron-centric programming API and execution framework for
> > large-scale deep learning, built on top of Apache Hama.
> >
> > == Proposal ==
> >
> > The goal of Horn is to provide a neuron-centric programming API that
> > lets users easily define the characteristics and structure of an
> > artificial neural network model, together with an execution framework
> > that leverages the heterogeneous resources of Hama and Hadoop YARN
> > clusters.
> >
> > == Background ==
> >
> > The initial ANN code was developed in the Apache Hama project by a
> > committer, Yexi Jiang (Facebook), in 2013. The motivation behind this
> > work is to build a framework that provides more intuitive programming
> > APIs, like Google's MapReduce or Pregel, and that supports
> > applications needing large models with huge memory consumption in a
> > distributed way.
> >
> > == Rationale ==
> >
> > While many open source deep learning packages still support only data
> > parallelism or only model parallelism, we aim to support both, along
> > with a fault-tolerant system design. The basic idea is to use a
> > remote parameter server to parallelize model creation and distribute
> > training across machines, and to use the BSP framework of Apache Hama
> > to perform asynchronous mini-batches. Within a single BSP job, each
> > task group works asynchronously, using region barrier synchronization
> > instead of global barrier synchronization, and trains a large-scale
> > neural network model on its assigned data sets in the BSP paradigm.
> > This architecture is inspired by Google's DistBelief (Jeff Dean et
> > al., 2012).
> >
> > == Initial Goals ==
> >
> > Some current goals include:
> >
> >  * build a new community
> >  * provide more intuitive programming APIs
> >  * support both data and model parallelism
> >  * run natively on both Hama and Hadoop2
> >  * support GPUs and InfiniBand
> >
> > == Current Status ==
> >
> > === Meritocracy ===
> >
> > The core developers understand what it means to have a process based
> > on meritocracy. We will make a continuous effort to build an
> > environment that supports this, encouraging community members to
> > contribute.
> >
> > === Community ===
> >
> > A small community has formed within the Apache Hama project and at
> > some companies, such as an instant messenger service company and a
> > mobile manufacturing company. Many people are also interested in the
> > large-scale deep learning platform itself. By bringing Horn into
> > Apache, we believe that the community will grow even bigger.
> >
> > === Core Developers ===
> >
> > Edward J. Yoon, Thomas Jungblut, and Dongjin Lee
> >
> > == Known Risks ==
> >
> > === Orphaned Products ===
> >
> > Apache Hama is already a core open source component at Samsung
> > Electronics, and Horn will also be used by Samsung Electronics, so
> > there is no direct risk of this project being orphaned.
> >
> > === Inexperience with Open Source ===
> >
> > Some of the initial developers are very new to open source, while the
> > others have experience using and/or working on Apache open source
> > projects.
> >
> > === Homogeneous Developers ===
> >
> > The initial committers are from different organizations, such as
> > Microsoft, Samsung Electronics, and Line Plus.
> >
> > === Reliance on Salaried Developers ===
> >
> > Other developers will also start working on the project in their spare
> > time.
> >
> > === Relationships with Other Apache Products ===
> >
> >  * Horn is based on Apache Hama
> >  * Apache ZooKeeper is used as the distributed locking service
> >  * Horn runs natively on Apache Hadoop and Mesos
> >  * Horn may overlap somewhat with the Singa podling
> >
> > === An Excessive Fascination with the Apache Brand ===
> >
> > Horn will hopefully benefit from Apache, both in attracting a
> > community and establishing a solid group of developers and in its
> > relationship with Apache Hama, a general-purpose BSP computing
> > engine. These are the main reasons we are submitting this proposal.
> >
> > == Documentation ==
> >
> > The initial plan for Horn can be found at
> > http://blog.udanax.org/2015/06/googles-distbelief-clone-project-on.html
> >
> > == Initial Source ==
> >
> > The initial source code has been released as part of the Apache Hama
> > project, developed under the Apache Software Foundation. The source
> > code is currently hosted at
> >
> > https://svn.apache.org/repos/asf/hama/trunk/ml/src/main/java/org/apache/hama/ml/ann/
> >
> > == Cryptography ==
> >
> > Not applicable.
> >
> > == Required Resources ==
> >
> > Mailing Lists
> >
> >  * horn-private
> >  * horn-dev
> >
> > Subversion Directory
> >
> >  * Git is the preferred source control system: git://git.apache.org/horn
> >
> > Issue Tracking
> >
> >  * a JIRA issue tracker, HORN
> >
> > == Initial Committers and Affiliations ==
> >
> >  * Thomas Jungblut (tjungblut at apache dot org)
> >  * Edward J. Yoon (edwardyoon at apache dot org)
> >  * Dongjin Lee (dongjin.lee.kr at gmail dot com)
> >  * Minho Kim (minwise.kim at samsung dot com)
> >  * TODO
> >
> > == Affiliations ==
> >
> >  * Thomas Jungblut (Microsoft)
> >  * Edward J. Yoon (Samsung Electronics)
> >  * Dongjin Lee (LINE Plus)
> >  * Minho Kim (Samsung Electronics)
> >  * TODO
> >
> > == Sponsors ==
> >
> > Champion
> >
> >  * Edward J. Yoon <edwardyoon at apache dot org>
> >
> > Nominated Mentors
> >
> >  * TODO
> >
> > Sponsoring Entity
> >
> > The Apache Incubator
> >
> > --
> > Best Regards, Edward J. Yoon
> >
>
