From: "Kate Eri (JIRA)"
To: issues@flink.apache.org
Date: Fri, 17 Feb 2017 12:55:41 +0000 (UTC)
Subject: [jira] [Updated] (FLINK-5782) Support GPU calculations

[ https://issues.apache.org/jira/browse/FLINK-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kate Eri updated FLINK-5782:
----------------------------
Description:

This ticket was initiated as a continuation of the dev discussion thread: [New Flink team member - Kate Eri (Integration with DL4J topic)|http://mail-archives.apache.org/mod_mbox/flink-dev/201702.mbox/browser]

Recently we proposed integrating [Deeplearning4J|https://deeplearning4j.org/index.html] with Apache Flink.

Training DL models is a resource-demanding process, and training on CPU can take far longer to converge than training on GPU.

GPU usage is attractive not only for DL training but also for accelerating graph analytics and other typical data manipulations; a good overview of the GPU-related problems is given in [Accelerating Spark workloads using GPUs|https://www.oreilly.com/learning/accelerating-spark-workloads-using-gpus].
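To make the performance claim concrete: the hot path of DL training is ordinary dense linear algebra, which is exactly what a GPU backend would take over. The sketch below is plain, illustrative Java (not Flink or DL4J code); the matrix-vector multiply shown here is the kind of kernel that, in a real integration, ND4J's native backend would execute on the GPU instead.

```java
// Illustrative only: a naive CPU matrix-vector multiply, i.e. the kind of
// dense linear-algebra kernel that dominates DL training time and that a
// GPU-backed library such as ND4J would be expected to accelerate.
public class MatVecDemo {
    static double[] matVec(double[][] m, double[] v) {
        double[] out = new double[m.length];
        for (int i = 0; i < m.length; i++) {
            double sum = 0.0;
            for (int j = 0; j < v.length; j++) {
                sum += m[i][j] * v[j];   // O(n^2) work per layer, per sample
            }
            out[i] = sum;
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] m = {{1, 2}, {3, 4}};
        double[] v = {1, 1};
        double[] r = matVec(m, v);
        System.out.println(r[0] + " " + r[1]);
    }
}
```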
Currently the community has pointed out the following issues to consider:

1) Flink would like to avoid writing its own GPU support from scratch, to reduce the engineering burden. That's why libraries like [ND4J|http://nd4j.org/userguide] should be considered.

2) Flink currently uses [Breeze|https://github.com/scalanlp/breeze] to optimize linear-algebra calculations. ND4J can't be integrated as-is, because it still doesn't support [sparse arrays|http://nd4j.org/userguide#faq]. Maybe sparse support should simply be contributed to ND4J to enable its usage?

3) The calculations have to work both with and without available GPUs. If the system detects that GPUs are available, then ideally it would exploit them. GPU resource management could thus be incorporated into [FLINK-5131|https://issues.apache.org/jira/browse/FLINK-5131] (only a suggestion).

4) It was mentioned that since Flink takes care of shipping data around the cluster, it would also have to dump data out to the GPU for calculation and load the results back. In practice, the lack of a persist method for intermediate results makes this troublesome (not because of GPUs, but because any sort of complex algorithm requires caching intermediate results). That's why [FLINK-1730|https://issues.apache.org/jira/browse/FLINK-1730] must be implemented to solve this problem.

5) It was also recommended to take a look at Apache Mahout, at least to learn from its experience with GPU integration, and to check its ViennaCL modules:
https://github.com/apache/mahout/tree/master/viennacl-omp
https://github.com/apache/mahout/tree/master/viennacl

6) For now, GPU support is proposed only for batch-calculation optimization; GPU support for streaming should be a separate ticket, because optimizing streaming with GPUs requires additional research.
7) Netflix's experience with this question could also be considered: [Distributed Neural Networks with GPUs in the AWS Cloud|http://techblog.netflix.com/search/label/CUDA]

This is considered the master ticket for GPU-related tickets.
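Point 3 above (work with and without GPUs) can be sketched as a backend-selection pattern. This is a minimal, hypothetical illustration in plain Java: the names `LinAlgBackend`, `CpuBackend`, `Backends`, and `isGpuAvailable` are invented for this sketch and are not existing Flink or ND4J API. In a real integration the availability probe would query the GPU library, and the GPU factory would be injected so the core keeps no hard GPU dependency.

```java
// Hypothetical sketch of point 3: pick a GPU-backed kernel when a GPU is
// available, otherwise fall back to a plain CPU implementation.
// All names here are illustrative, not real Flink or ND4J API.
import java.util.function.Supplier;

interface LinAlgBackend {
    double dot(double[] a, double[] b);
    String name();
}

class CpuBackend implements LinAlgBackend {
    public double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }
    public String name() { return "cpu"; }
}

final class Backends {
    // In a real integration this probe would ask the native library whether
    // a CUDA/OpenCL device exists; here it is a stub that reports "no GPU".
    static boolean isGpuAvailable() {
        return false;
    }

    // The GPU backend is supplied lazily, so it is never even constructed
    // (and its native libraries never loaded) on GPU-less machines.
    static LinAlgBackend select(Supplier<LinAlgBackend> gpuFactory) {
        return isGpuAvailable() ? gpuFactory.get() : new CpuBackend();
    }
}

public class BackendSelectionDemo {
    public static void main(String[] args) {
        LinAlgBackend backend = Backends.select(() -> {
            // Never reached in this sketch, since isGpuAvailable() is false.
            throw new IllegalStateException("no GPU backend on this machine");
        });
        double result = backend.dot(new double[]{1, 2, 3}, new double[]{4, 5, 6});
        System.out.println(backend.name() + " " + result);
    }
}
```

The same shape would let FLINK-5131-style resource management decide, per task slot, which factory to hand in.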
> Support GPU calculations
> ------------------------
>
>                 Key: FLINK-5782
>                 URL: https://issues.apache.org/jira/browse/FLINK-5782
>             Project: Flink
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.3.0
>            Reporter: Kate Eri
>            Priority: Minor
>

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)