flink-user mailing list archives

From Konstantin Knauf <konstantin.kn...@tngtech.com>
Subject Re: Blobstorage Locally and on HDFS
Date Tue, 04 Oct 2016 14:54:46 GMT
Hi Ufuk,

Any ideas? Could any configuration be wrong?

Cheers,

Konstantin

On 30.09.2016 13:13, Konstantin Knauf wrote:
> Hi Ufuk,
> 
> thanks for your quick answer.
> 
> Setup: 2 Servers, each running a JM as well as TM
> 
> 1) Removing all existing blobstores locally (/tmp) as well as on HDFS
> 2) Starting a flink streaming job
> 
> Now there are the following BLOBs:
> 
> Local:
> 
> *Leader JM:
> 
> 4.0K    /tmp/blobStore-563a8820-9617-4d89-97a7-fc3cc258dff4/incoming
> 
> 64M     /tmp/blobStore-563a8820-9617-4d89-97a7-fc3cc258dff4
> 
> 64M     /tmp/blobStore-563a8820-9617-4d89-97a7-fc3cc258dff4/cache
> 
> 64M     /tmp/blobStore-c6b93d41-8916-4a8d-b595-6e35f0b10401
> 
> 64M     /tmp/blobStore-c6b93d41-8916-4a8d-b595-6e35f0b10401/cache
> 
> *Standby JM:
> 
> 64M     /tmp/blobStore-4cbfd3c0-2a70-4485-8fc0-045ca7f08cea
> 
> 64M     /tmp/blobStore-4cbfd3c0-2a70-4485-8fc0-045ca7f08cea/cache
> 
> HDFS:
> 
> 66595700 2016-09-30 13:03
> <..>/flink/blob/cache/blob_da76e12b949a83404f97b6eb59416deaa31a907b
> 
> 
> 3) Canceling both jobs via command line:
> 
> Now there are the following BLOBs:
> 
> **same as above**
> 
> When starting the same job again, no new blobs are created.
> 
> Is it a problem to delete local blobStores of running jobs or will the
> blobs just be downloaded again from HDFS if needed?
> 
> Cheers,
> 
> Konstantin
> 
> 
> 
> Is it correct, that ea
> 
> On 30.09.2016 10:28, Ufuk Celebi wrote:
>> On Fri, Sep 30, 2016 at 9:12 AM, Konstantin Knauf
>> <konstantin.knauf@tngtech.com> wrote:
>>> we are running a Flink (1.1.2) Stand-Alone Cluster with JM HA, and HDFS
>>> as checkpoint and recovery storage dir. What we see is that blobStores
>>> are stored in HDFS as well as in the local /tmp directories of the
>>> JobManagers and TaskManagers.
>>>
>>> Is this the expected behaviour? Is there any documentation on which
>>> blobs are stored locally and which are stored in HDFS in our case? In
>>> particular, we would need to know when it is safe to delete blobs stored
>>> locally because they are not cleaned up by Flink and eventually fill up
>>> the /tmp partition.
>>
>> BLOBs are copied to another directory in case of HA in order to be
>> available for other job managers that might take over.
>>
>> On regular termination (cancel, finish) all BLOBs should be cleaned
>> up. With hard failures, it can happen that BLOBs are not cleaned up.
>>
>> Do you know in which cases you see BLOBs not being cleaned up? If it
>> is the first one, that sounds like a bug to me.
>>
>> – Ufuk
>>
> 

-- 
Konstantin Knauf * konstantin.knauf@tngtech.com * +49-174-3413182
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
Sitz: Unterföhring * Amtsgericht München * HRB 135082
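
Editor's sketch: the local listing quoted in the thread (sizes of the /tmp/blobStore-* directories) could be collected before and after canceling a job with a small script like the one below, to check whether anything was cleaned up. It is only a minimal sketch under stated assumptions: the function names are hypothetical, the /tmp default and the blobStore-* pattern are taken from the listing in the thread, and the script only reports sizes. It never deletes anything and says nothing about when deletion is safe.

```python
import os
from pathlib import Path
from typing import List, Tuple


def dir_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files below root (symlinks are skipped).

    Note: this adds up file sizes, whereas `du` reports allocated disk
    blocks, so the numbers can differ slightly from the listing in the thread.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                if not path.is_symlink():
                    total += path.stat().st_size
            except OSError:
                # The file disappeared or is unreadable while scanning; skip it.
                pass
    return total


def list_blob_stores(tmp_dir: str = "/tmp") -> List[Tuple[str, int]]:
    """Return (path, size_in_bytes) for every blobStore-* directory in tmp_dir.

    The directory name pattern is an assumption based on the paths shown
    in the thread (e.g. /tmp/blobStore-563a8820-...).
    """
    stores = []
    for store in sorted(Path(tmp_dir).glob("blobStore-*")):
        if store.is_dir():
            stores.append((str(store), dir_size_bytes(store)))
    return stores


if __name__ == "__main__":
    for path, size in list_blob_stores():
        print(f"{size / (1024 * 1024):8.1f} MiB  {path}")
```

Running it once after step 2 and once after step 3 and comparing the two outputs would show whether the local directories changed after the cancel.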

