flink-dev mailing list archives

From "Andrey (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FLINK-6627) Expose tmp directories via API
Date Thu, 18 May 2017 16:09:04 GMT
Andrey created FLINK-6627:

             Summary: Expose tmp directories via API
                 Key: FLINK-6627
                 URL: https://issues.apache.org/jira/browse/FLINK-6627
             Project: Flink
          Issue Type: Improvement
    Affects Versions: 1.2.0
            Reporter: Andrey

Currently, tmp/blob directories are created from a fixed baseDir plus a random suffix. For
example, the blob directory is created as:
new File(baseDir, String.format("blobStore-%s", UUID.randomUUID().toString()))
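
A minimal sketch of this naming scheme, for illustration only. The class and method names
(BlobDirNaming, newBlobDir) are hypothetical and are not Flink's actual code; only the
String.format pattern comes from the expression above. It shows that every call yields a
fresh, unpredictable directory name under the same baseDir.

```java
import java.io.File;
import java.util.UUID;

public class BlobDirNaming {

    // Hypothetical helper that mirrors the expression quoted in this issue.
    static File newBlobDir(File baseDir) {
        return new File(baseDir, String.format("blobStore-%s", UUID.randomUUID().toString()));
    }

    public static void main(String[] args) {
        File baseDir = new File(System.getProperty("java.io.tmpdir"));
        // Two "starts" against the same baseDir give two unrelated names,
        // so nothing outside the process can tell which one is live.
        System.out.println(newBlobDir(baseDir));
        System.out.println(newBlobDir(baseDir));
    }
}
```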

This directory name is not exposed externally, which causes several issues in the following
scenario:
1) Start task manager 1.
2) A random blob directory is created, for example "blob-1".
3) Start task manager 2.
4) A random blob directory is created, for example "blob-2".
5) Task manager 1 dies unexpectedly (kill -9 or OOM).
6) The directory "blob-1" is not deleted.
7) Task manager 1 is restarted automatically.
8) A new random blob directory is created, for example "blob-3".

The issues:
* The directory "blob-1" will never be deleted.
* An external cleanup script cannot find out which directories are currently in use, because
that information is not exposed externally. So it cannot delete the unused directories.
* Sorting directories by creation time and keeping the last X won't help, because one faulty
task manager could generate X+1 new directories.
* Giving each task manager a different "blob.storage.directory" is not a scalable solution
for cloud/Docker deployments, because it would require a central store that tracks the
currently running task managers.

Proposed solution:
* Expose the current working directory for blob/tmp via the REST API. A cleanup script could then:
** get all blob/tmp directories currently in use from the cluster
** list all blob/tmp directories on disk ("ls")
** find the blob/tmp directories that are not in use
** delete them
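
The cleanup steps above could be sketched roughly as follows. This is an illustration under
stated assumptions, not an existing Flink tool: the class BlobDirCleanup and its methods are
hypothetical, and the set of in-use directory names is passed in as a parameter because the
REST endpoint that would supply it is exactly what this issue proposes and does not exist yet.
The "blobStore-*" glob is taken from the naming pattern quoted earlier.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class BlobDirCleanup {

    // Deletes every "blobStore-*" directory under baseDir whose name is not in inUse.
    // inUse is assumed to come from the cluster (hypothetical REST call, not shown).
    // Returns the directories that were deleted.
    static List<Path> deleteUnused(Path baseDir, Set<String> inUse) throws IOException {
        List<Path> stale = new ArrayList<>();
        // The "ls" step: list candidate directories on disk.
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(baseDir, "blobStore-*")) {
            for (Path dir : dirs) {
                boolean unused = !inUse.contains(dir.getFileName().toString());
                if (Files.isDirectory(dir) && unused) {
                    stale.add(dir);
                }
            }
        }
        for (Path dir : stale) {
            deleteRecursively(dir);
        }
        return stale;
    }

    // Removes a directory tree, children before parents.
    static void deleteRecursively(Path root) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    // Usage: java BlobDirCleanup <baseDir> [inUseDirName ...]
    public static void main(String[] args) throws IOException {
        Path baseDir = Paths.get(args[0]);
        Set<String> inUse = new HashSet<>(Arrays.asList(args).subList(1, args.length));
        for (Path removed : deleteUnused(baseDir, inUse)) {
            System.out.println("deleted " + removed);
        }
    }
}
```

In the scenario above, running this with the in-use set {"blob-2", "blob-3"} would remove only
the orphaned "blob-1", provided the directories carry the "blobStore-" prefix the glob expects.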

This message was sent by Atlassian JIRA
