Date: Sun, 27 Jul 2014 18:33:38 +0000 (UTC)
From: "Daniel Warneke (JIRA)"
To: issues@flink.incubator.apache.org
Reply-To: dev@flink.incubator.apache.org
Subject: [jira] [Assigned] (FLINK-1025) Improve BLOB Service

     [ https://issues.apache.org/jira/browse/FLINK-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Warneke reassigned FLINK-1025:
-------------------------------------

    Assignee: Daniel Warneke

> Improve BLOB Service
> --------------------
>
>                 Key: FLINK-1025
>                 URL: https://issues.apache.org/jira/browse/FLINK-1025
>             Project: Flink
>          Issue Type: Improvement
>          Components: JobManager
>    Affects Versions: 0.6-incubating
>            Reporter: Stephan Ewen
>            Assignee: Daniel Warneke
>             Fix For: 0.6-incubating
>
>
> I like the idea of making it transparent where the blob service runs, so the code on the server/client side is agnostic to that.
> The current merged code is in https://github.com/StephanEwen/incubator-flink/commits/blobservice
> Local tests pass, I am trying distributed tests now.
> There are a few suggestions for improvements:
>   - Since all the resources are bound to a job or session, it makes sense to make all puts/gets relative to a jobId (becoming session id) and to have a cleanup hook that deletes all resources associated with that job.
>   - The BLOB service is hardwired to compute a message digest for the contents and to use that as the key. While that may make sense for jar files (cached libraries), in many future cases it will be unnecessary and only impose overhead. I would vote to make this optional and allow plain UUIDs as keys. An example: the TaskManager puts part of an intermediate result into the blob store for the client to pick up.
>   - In most places, we have started moving away from configured ports because of configuration overhead and collisions in setups where multiple instances end up on one machine. The latter actually happens frequently with YARN. I would suggest having the JM open a port dynamically for the BlobService (similar to TaskManager#getAvailablePort()). The RPC calls to discover this configuration need to happen only once between client/JM and TM/JM. We can stomach that overhead ;-)
>   - The write method does not write the length a single time, but "per buffer". Why is it done that way? The array-based methods know the length up front, and when the contents come from an input stream, I think we know the length as well (for files: filesize, for network: sent up front).
>   - I am personally in favor of moving away from static singleton registries. They tend to cause trouble during testing and in pseudo-cluster modes (multiple workers within one JVM). How hard would it be to have a BlobService at the TaskManager / JobManager that we can pass by reference to the points where it is needed?



--
This message was sent by Atlassian JIRA
(v6.2#6252)
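
For illustration only, the first two suggestions in the issue (puts/gets relative to a jobId, plain UUID keys instead of a content digest, and a per-job cleanup hook) could look roughly like the sketch below. `JobScopedBlobStore` and its methods are hypothetical names, the store is an in-memory stand-in, and none of this is the actual BlobService API. It is a plain instance rather than a static singleton, so it can be passed by reference to wherever it is needed.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for a job-scoped blob store (not Flink's BlobService). */
final class JobScopedBlobStore {

    // jobId -> (blob key -> contents). All names here are illustrative assumptions.
    private final Map<String, Map<UUID, byte[]>> blobsByJob = new ConcurrentHashMap<>();

    /** Stores a copy of the bytes under a fresh random UUID, relative to the given job. */
    UUID put(String jobId, byte[] contents) {
        UUID key = UUID.randomUUID(); // no message digest is computed
        blobsByJob.computeIfAbsent(jobId, id -> new ConcurrentHashMap<>())
                  .put(key, contents.clone());
        return key;
    }

    /** Returns a copy of the blob, or null if the job or the key is unknown. */
    byte[] get(String jobId, UUID key) {
        Map<UUID, byte[]> blobs = blobsByJob.get(jobId);
        byte[] contents = (blobs == null) ? null : blobs.get(key);
        return (contents == null) ? null : contents.clone();
    }

    /** Cleanup hook: drops every blob associated with the job and returns how many were removed. */
    int cleanupJob(String jobId) {
        Map<UUID, byte[]> removed = blobsByJob.remove(jobId);
        return (removed == null) ? 0 : removed.size();
    }
}
```

A caller would create one store per JobManager or TaskManager, call `put("job-1", bytes)` to obtain a key, and call `cleanupJob("job-1")` when the job finishes.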
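
The question about writing the length "per buffer" can be made concrete with a sketch of the alternative: one length header, followed by the whole payload. `LengthPrefixedFraming` is a hypothetical helper and the 4-byte `int` header is an assumption; this is not the wire format used by the code under discussion.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical framing helper; not the wire format of the actual BLOB service. */
final class LengthPrefixedFraming {

    /** Writes the total length a single time, then the payload. */
    static void write(DataOutputStream out, byte[] contents) throws IOException {
        out.writeInt(contents.length); // assumed 4-byte big-endian header
        out.write(contents);
        out.flush();
    }

    /** Reads one length header, then exactly that many bytes. */
    static byte[] read(DataInputStream in) throws IOException {
        int length = in.readInt();
        if (length < 0) {
            throw new IOException("Negative length header: " + length);
        }
        byte[] contents = new byte[length];
        in.readFully(contents); // blocks until all bytes arrive or fails with EOFException
        return contents;
    }

    /** In-memory round trip, only to show that write and read agree. */
    static byte[] roundTrip(byte[] contents) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            write(new DataOutputStream(sink), contents);
            return read(new DataInputStream(new ByteArrayInputStream(sink.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

This only works when the length is known before the first byte is sent, which is the case the issue describes for arrays and files.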