To: general@incubator.apache.org
Subject: Re: [DISCUSS] Horn Incubation Proposal
Date: Fri, 21 Aug 2015 12:13:49 +0800 (SGT)
From: ooibc
Cc: "Edward J. Yoon"
Message-ID: <8539e2431c0c0e274e2dc4488334036b@comp.nus.edu.sg>
User-Agent: Roundcube Webmail/1.0.5

Hi,

I am an initial committer of Apache (incubating) SINGA
(http://singa.incubator.apache.org/).

Both SINGA and the proposal follow the general parameter-server
architecture: workers compute gradients; servers update parameters.
SINGA has implemented the model and data parallelism discussed in the
Horn proposal: multiple worker groups for asynchronous training (data
parallelism), and multiple workers within one group for synchronous
training (model parallelism). One feature of SINGA's architecture is
that the servers can be organized in a hierarchical topology, which may
help reduce the communication bottleneck of servers organized in a flat
topology.

For the programming model, Horn currently proposes to support
feed-forward models, e.g., MLP and auto-encoders, while SINGA supports
all three categories of well-known models: feed-forward models (e.g.,
MLP, CNN), energy models (e.g., RBM, DBM), and recurrent models (e.g.,
RNN).
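To make the parameter-server split concrete, here is a minimal toy sketch of the pattern described above (workers compute gradients on their data shards and push them asynchronously; a server applies the updates). This is illustrative only, not SINGA or Horn code; all names (ParamServer, worker, shards) are assumptions.

```python
# Toy parameter-server sketch: one server holds the parameters,
# worker threads train asynchronously on separate data shards
# (data parallelism). Not SINGA/Horn code; names are illustrative.
import threading

class ParamServer:
    """Holds the shared parameter vector and applies gradient updates."""
    def __init__(self, dim, lr=0.05):
        self.params = [0.0] * dim
        self.lr = lr
        self.lock = threading.Lock()

    def pull(self):
        # Workers fetch a (possibly stale) snapshot of the parameters.
        with self.lock:
            return list(self.params)

    def push(self, grad):
        # SGD-style update: params -= lr * grad
        with self.lock:
            for i, g in enumerate(grad):
                self.params[i] -= self.lr * g

def worker(server, shard):
    # Each worker pulls current params, computes a gradient on its
    # mini-batch (here: squared error for a scalar model y = w * x),
    # and pushes the update back without waiting for other workers.
    for x, y in shard:
        w = server.pull()
        grad = [2 * (w[0] * x - y) * x]  # d/dw of (w*x - y)^2
        server.push(grad)

server = ParamServer(dim=1)
# Two worker groups running asynchronously; both shards are
# consistent with the true parameter w = 2.
shards = [[(1.0, 2.0)] * 20, [(2.0, 4.0)] * 20]
threads = [threading.Thread(target=worker, args=(server, s)) for s in shards]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(server.params[0])  # converges toward w = 2.0
```

Synchronous training within a group (model parallelism) would instead partition the model itself across the group's workers and barrier-synchronize them each step, which is where Hama's BSP barriers come in.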
SINGA provides good support for users to code, e.g., to implement new
parameter-updating protocols or layers, and it is being integrated with
HDFS as well. We will submit the first release and full documentation
to the mentors this weekend, and if all is OK, we will announce the
first full release soon. The GPU version is scheduled for an October
release.

Technical papers:
http://www.comp.nus.edu.sg/~ooibc/singa-mm15.pdf
http://www.comp.nus.edu.sg/~ooibc/singaopen-mm15.pdf

and the project website (which has more details than the Apache web site):
http://www.comp.nus.edu.sg/~dbsystem/singa/

There is plenty of room for collaboration indeed...

regards
beng chin
www.comp.nus.edu.sg/~ooibc

On 2015-08-21 08:27, Edward J. Yoon wrote:
> Hi all,
>
> We'd like to propose Horn (혼), a fully distributed system for
> large-scale deep learning, as an Apache Incubator project and start
> the discussion. The complete proposal can be found at:
> https://wiki.apache.org/incubator/HornProposal
>
> Any advice and help are welcome! Thanks, Edward.
>
> = Horn Proposal =
>
> == Abstract ==
>
> Horn (tentatively named "Horn [hɔ:n]"; the Korean meaning of Horn is
> "Spirit") is a neuron-centric programming API and execution framework
> for large-scale deep learning, built on top of Apache Hama.
>
> == Proposal ==
>
> The goal of Horn is to provide a neuron-centric programming API that
> allows users to easily define the characteristics and structure of an
> artificial neural network model, together with an execution framework
> that leverages the heterogeneous resources of a Hama and Hadoop YARN
> cluster.
>
> == Background ==
>
> The initial ANN code was developed in the Apache Hama project by a
> committer, Yexi Jiang (Facebook), in 2013. The motivation behind this
> work is to build a framework that provides more intuitive programming
> APIs, like Google's MapReduce or Pregel, and supports applications
> needing large models with huge memory consumption in a distributed
> way.
>
> == Rationale ==
>
> While many deep learning open source frameworks, such as Caffe,
> DeepDist, and NeuralGiraph, are still data-parallel or model-parallel
> only, we aim to support both data and model parallelism, as well as a
> fault-tolerant system design. The basic idea of data and model
> parallelism is to use a remote parameter server to parallelize model
> creation and distribute training across machines, and the BSP
> framework of Apache Hama for performing asynchronous mini-batches.
> Within a single BSP job, each task group works asynchronously using
> region barrier synchronization instead of global barrier
> synchronization, and trains a large-scale neural network model on its
> assigned data sets in the BSP paradigm. Thus, we achieve both data
> and model parallelism. This architecture is inspired by Google's
> !DistBelief (Jeff Dean et al., 2012).
>
> == Initial Goals ==
>
> Some current goals include:
> * build a new community
> * provide more intuitive programming APIs
> * support both data and model parallelism
> * run natively on both Hama and Hadoop2
> * support GPUs and InfiniBand (FPGAs if possible)
>
> == Current Status ==
>
> === Meritocracy ===
>
> The core developers understand what it means to have a process based
> on meritocracy. We will make continuous efforts to build an
> environment that supports this, encouraging community members to
> contribute.
>
> === Community ===
>
> A small community has formed within the Apache Hama project and at
> some companies, such as an instant messenger service company and a
> mobile manufacturer. Many people are interested in the large-scale
> deep learning platform itself. By bringing Horn into Apache, we
> believe that the community will grow even bigger.
>
> === Core Developers ===
>
> Edward J.
Yoon, Thomas Jungblut, and Dongjin Lee
>
> == Known Risks ==
>
> === Orphaned Products ===
>
> Apache Hama is already a core open source component at Samsung
> Electronics, and Horn will also be used by Samsung Electronics, so
> there is no direct risk of this project being orphaned.
>
> === Inexperience with Open Source ===
>
> Some committers are very new, while the others have experience using
> and/or working on Apache open source projects.
>
> === Homogeneous Developers ===
>
> The initial committers are from different organizations, such as
> Microsoft, Samsung Electronics, and Line Plus.
>
> === Reliance on Salaried Developers ===
>
> Few will work as full-time open source developers. Other developers
> will also start working on the project in their spare time.
>
> === Relationships with Other Apache Products ===
>
> * Horn is based on Apache Hama
> * Apache Zookeeper is used as a distributed locking service
> * Runs natively on Apache Hadoop and Mesos
> * Horn overlaps somewhat with the Singa podling (if possible, we'd
> also like to use Singa or Caffe to do the heavy lifting)
>
> === An Excessive Fascination with the Apache Brand ===
>
> Horn itself will hopefully benefit from Apache, in terms of
> attracting a community and establishing a solid group of developers,
> but also from the relation with Apache Hama, a general-purpose BSP
> computing engine. These are the main reasons for us to send this
> proposal.
>
> == Documentation ==
>
> The initial plan for Horn can be found at
> http://blog.udanax.org/2015/06/googles-distbelief-clone-project-on.html
>
> == Initial Source ==
>
> The initial source code has been released as part of the Apache Hama
> project, developed under the Apache Software Foundation. The source
> code is currently hosted at
> https://svn.apache.org/repos/asf/hama/trunk/ml/src/main/java/org/apache/hama/ml/ann/
>
> == Cryptography ==
>
> Not applicable.
>
> == Required Resources ==
>
> === Mailing Lists ===
>
> * horn-private
> * horn-dev
>
> === Subversion Directory ===
>
> * Git is the preferred source control system:
> git://git.apache.org/horn
>
> === Issue Tracking ===
>
> * a JIRA issue tracker, HORN
>
> == Initial Committers and Affiliations ==
>
> * Thomas Jungblut (tjungblut AT apache DOT org)
> * Edward J. Yoon (edwardyoon AT apache DOT org)
> * Dongjin Lee (dongjin.lee.kr AT gmail DOT com)
> * Minho Kim (minwise.kim AT samsung DOT com)
> * Chia-Hung Lin (chl501 AT apache DOT org)
> * Behroz Sikander (behroz.sikander AT tum DOT de)
> * Hyok S. Choi (hyok.choi AT samsung DOT com)
> * Kisuk Lee (ks881115 AT gmail DOT com)
>
> == Affiliations ==
>
> * Thomas Jungblut (Microsoft)
> * Edward J. Yoon (Samsung Electronics)
> * Dongjin Lee (LINE Plus)
> * Minho Kim (Samsung Electronics)
> * Chia-Hung Lin (Self)
> * Behroz Sikander (Technical University of Munich)
> * Hyok S. Choi (Samsung Electronics)
> * Kisuk Lee (Seoul National University)
>
> == Sponsors ==
>
> === Champion ===
>
> * Edward J. Yoon
>
> === Nominated Mentors ===
>
> * Luciano Resende
> * Robin Anil
> * Edward J. Yoon
>
> === Sponsoring Entity ===
>
> The Apache Incubator

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org