Date: Mon, 16 Jun 2014 14:46:03 +0000 (UTC)
From: "Stephan Ewen (JIRA)"
To: dev@flink.incubator.apache.org
Subject: [jira] [Commented] (FLINK-939) Distribute required JAR files with seperate service

    [ https://issues.apache.org/jira/browse/FLINK-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032485#comment-14032485 ]

Stephan Ewen commented on FLINK-939:
------------------------------------

After looking a little more into the code, I think we need something more powerful, like a blob manager.

Issues we need to address are the following:

1. Distribute large JAR files from (a) client to job manager and (b) job manager to task managers
2. Ship large program functions (closures) from (a) client to job manager and (b) job manager to task managers
3. Ship intermediate results from (a) task managers to job manager and (b) job manager to client

All of these can be transferred in the form of blobs or large blocks/frames.

What we could use is a BlobManager on the JobManager that accepts the requests "put(JobId, key, byte[])" and "get(JobId, key)". Clients and task managers can put data there and transmit only the key via RPC. The BlobManager needs to store the data on disk, if needed, to prevent OutOfMemoryErrors.

I suggest starting with a simple service that has a "Map>>" and puts all received blobs on disk in the temp directory.

What is your opinion on that?

> Distribute required JAR files with seperate service
> ---------------------------------------------------
>
>                 Key: FLINK-939
>                 URL: https://issues.apache.org/jira/browse/FLINK-939
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: Ufuk Celebi
>            Assignee: Daniel Warneke
>
> Currently, required user JAR files are distributed via the RPC service in {{JobGraph.writeRequiredJarFiles(DataOutput, AbstractJobVertex[])}}. The RPC service then tries to allocate a buffer on the client-side heap to write the on-disk JAR, which [can lead to problems|https://github.com/apache/incubator-flink/pull/18].
> This should be replaced with a separate service.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
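
For discussion, here is a minimal sketch of what the simple first version of the proposed service might look like. It is not taken from the Flink code base. The class name SimpleBlobManager, the use of String for job IDs and keys, and the nested map layout (job ID to key to file) are all assumptions made for illustration; the comment only specifies the "put" and "get" requests and that blobs go to disk in the temp directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of the proposed blob service; not Flink's actual implementation. */
final class SimpleBlobManager {

    // Assumed layout: jobId -> (key -> file on disk holding the blob).
    private final Map<String, Map<String, Path>> blobs = new ConcurrentHashMap<>();

    // All blobs are written below a fresh directory inside the system temp directory.
    private final Path storageDir;

    SimpleBlobManager() throws IOException {
        this.storageDir = Files.createTempDirectory("blob-manager-");
    }

    /** Writes the blob to disk and remembers its location under (jobId, key). */
    void put(String jobId, String key, byte[] data) throws IOException {
        Path file = Files.createTempFile(storageDir, "blob-", ".bin");
        Files.write(file, data);
        Path previous = blobs
                .computeIfAbsent(jobId, id -> new ConcurrentHashMap<>())
                .put(key, file);
        if (previous != null) {
            // Replace semantics: drop the file of an overwritten blob.
            Files.deleteIfExists(previous);
        }
    }

    /** Returns the blob stored under (jobId, key), or null if there is none. */
    byte[] get(String jobId, String key) throws IOException {
        Map<String, Path> perJob = blobs.get(jobId);
        Path file = (perJob == null) ? null : perJob.get(key);
        return (file == null) ? null : Files.readAllBytes(file);
    }
}
```

Only the map of file locations stays on the heap between calls, which is the point of spilling to disk. This sketch still passes whole byte arrays through "put" and "get", so a version meant for large JAR files would more likely stream the data in chunks rather than materialize it in memory.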