From: "Edward J. Yoon"
To: dev@hama.apache.org
Cc: general@incubator.apache.org
Subject: RE: Proposal for an Apache Hama sub-project
Date: Tue, 28 Feb 2017 09:13:01 +0900
Message-id: <027201d29157$6cb79ac0$4626d040$@samsung.com>
In-reply-to: <7f0ca21e0610482a9f68dcb5aa5a6f46@impetus.co.in>

Thanks for your proposal.

I of course think Apache Hama can be used for scheduling sync and async communication/computation networks with various topologies and resource allocation. However, I'm not sure whether this approach also fits a modern microservice architecture.

In my opinion, this can be discussed and developed in the Hama community as a sub-project until it's mature enough. (CC'ing general@i.a.o; I'll be happy to read more feedback from the ASF incubator community.)

P.S. It seems you referred to the incubation proposal template. There's no need to add me as an initial committer (I don't have much time to actively contribute to your project). Also, I recently left Samsung Electronics and joined a $200 billion O2O e-commerce company as CTO.

-----Original Message-----
From: Sachin Ghai [mailto:sachin.ghai@impetus.co.in]
Sent: Monday, February 27, 2017 5:16 PM
To: dev@hama.apache.org
Subject: Proposal for an Apache Hama sub-project

Hama Community,

I would like to propose a sub-project for Apache Hama and initiate discussion around the proposal. The proposed sub-project, named 'Scalar', is a scalable orchestration, training and serving system for machine learning and deep learning. Scalar would leverage Apache Hama to automate distributed training, model deployment and prediction serving. More details about the proposal are listed below, following the Apache project proposal template:

Abstract

Scalar is a general-purpose framework for simplifying massive-scale big data analytics and deep learning modelling, deployment and serving with high performance.

Proposal

The goal of Scalar is to provide an abstraction framework that allows users to easily scale the functions of training a model, deploying a model and serving predictions from an underlying machine learning or deep learning framework. Its execution framework is also designed to orchestrate heterogeneous workload graphs using Apache Hama, Apache Hadoop, Apache Spark and TensorFlow resources.

Background

The initial Scalar code was developed in 2016 and has been successfully beta tested for one of the largest insurance organizations in a client-specific PoC. The motivation behind this work is to build a framework that provides an abstraction over heterogeneous data science frameworks and helps users leverage them in the most performant way.

Rationale

There is a sudden deluge of machine learning and deep learning frameworks in the industry. For an application developer, it is hard to switch from one framework to another without rewriting the application. There is also additional plumbing to be done to retrieve the prediction results for each model in different frameworks.
We aim to provide an abstraction framework that can be used to seamlessly train and deploy a model at scale on multiple frameworks such as TensorFlow, Apache Horn or Caffe. The abstraction further provides a unified layer for serving predictions in the most performant, scalable and efficient way for a multi-tenant deployment. The key performance metrics will be reduced training time, a lower error rate and lower latency for serving models.

Scalar consists of a core engine that can be used to create flows described in terms of state, sequences and algorithms. The engine invokes the execution context of Apache Hama to train and deploy models on the target framework. Apache Hama is used for a variety of functions, including parameter tuning and scheduling computations on a distributed cluster. A data object layer provides access to data from heterogeneous sources such as HDFS, local storage and S3. A REST API layer serves the prediction functions to client applications. A caching layer in the middle reduces latency for various functions.

Initial Goals

Some current goals include:

* Build community.
* Provide a general-purpose API for machine learning and deep learning training, deployment and serving.
* Serve predictions with low latency.
* Run massive workloads via Apache Hama on TensorFlow, Apache Spark and Caffe.
* Provide CPU and GPU support, on-premise or in the cloud, to run the algorithms.

Current Status

Meritocracy

The core developers understand what it means to have a process based on meritocracy. We will make a continuous effort to build an environment that supports this, encouraging community members to contribute.

Community

A small community has formed within the Apache Hama project community and at companies such as an enterprise services and product company and an artificial intelligence startup. There is a lot of interest in data science serving systems and artificial intelligence simplification systems.
By bringing Scalar into Apache, we believe that the community will grow even bigger.

Core Developers

Edward J. Yoon, Sachin Ghai, Ishwardeep Singh, Rachna Gogia, Abhishek Soni, Nikunj Limbaseeya, Mayur Choubey

Known Risks

Orphaned Products

Apache Hama is already a core open source component in use at Samsung Electronics, and Scalar is already being adopted by major enterprise organizations. There is no direct risk of the Scalar project being orphaned.

Inexperience with Open Source

All contributors have experience using and/or working on Apache open source projects.

Homogeneous Developers

The initial committers are from different organizations such as Impetus, Chalk Digital, and Samsung Electronics.

Reliance on Salaried Developers

A few will be working as full-time open source developers. Other developers will also start working on the project in their spare time.

Relationships with Other Apache Products

* Scalar is being built on top of Apache Hama.
* Apache Spark is being used for machine learning.
* Apache Horn is being used for deep learning.
* The framework will run natively on Apache Hadoop and Apache Mesos.

An Excessive Fascination with the Apache Brand

Scalar will hopefully benefit from Apache in terms of attracting a community and establishing a solid group of developers, and also from its relationship with Apache Hadoop, Spark and Hama. These are the main reasons for us to send this proposal.

Documentation

Initial design of Scalar can be found at this link.

Initial Source

Impetus Technologies (Impetus) will contribute the initial orchestration code base to create this project. Impetus plans to contribute the Scalar code base, test cases, build files, and documentation to the ASF under the terms specified in the ASF Corporate Contributor License and to develop it further with the wider community. Once at Apache, the project will be licensed under the ASF license.

Cryptography

Not applicable.

Required Resources

Mailing Lists

* scalar-dev
* scalar-pmc

Subversion Directory

* Git is the preferred source control system: git://git.apache.org/scalar

Issue Tracking

* a JIRA issue tracker, SCALAR

Initial Committers

* Sachin Ghai (sachin.ghai AT impetus DOT co DOT in)
* Edward J. Yoon (edwardyoon AT apache DOT org)
* Abhishek Soni (abhishek.soni AT impetus DOT co DOT in)
* Ishwardeep Singh (ishwardeep AT chalkdigital DOT com)
* Nikunj Limbaseeya (nikunj.limbaseeya AT impetus DOT co DOT in)
* Rachna Gogia (rachna AT hadoopsphere DOT org)
* Mayur Choubey (mayur.choubey AT impetus DOT co DOT in)

Affiliations

* Sachin Ghai (Impetus)
* Edward J. Yoon (Samsung Electronics)
* Abhishek Soni (Impetus)
* Ishwardeep Singh (Chalk Digital)
* Nikunj Limbaseeya (Impetus)
* Rachna Gogia (HadoopSphere)
* Mayur Choubey (Impetus)

Sponsors

Champion

* Edward J. Yoon

Nominated Mentors

* Edward J. Yoon

Sponsoring Entity

The Apache Hama project

-- End of proposal --

Thanks,
Sachin Ghai