flink-user mailing list archives

From Andrey Zagrebin <and...@data-artisans.com>
Subject Re: number of files in checkpoint directory grows endlessly
Date Thu, 29 Nov 2018 14:37:41 GMT
Could you share the logs so that we can check for possible failures to subsume or remove previous checkpoints?
What are the sizes of the files? That can help to understand how compaction is going.
Could you also provide more details on how you set up TtlDB with Flink?
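
A minimal sketch for collecting those numbers, assuming the checkpoint directory is reachable as a local or mounted path and that the files sit directly under "shared". The function name and the size buckets are only illustrative, not part of Flink:

```python
import os
from collections import Counter

MIB = 1 << 20  # bytes in one mebibyte


def summarize_shared_dir(shared_dir):
    """Return (file_count, total_bytes, size_histogram) for the regular
    files directly under shared_dir. Subdirectories are ignored."""
    histogram = Counter()
    count = 0
    total = 0
    with os.scandir(shared_dir) as entries:
        for entry in entries:
            if not entry.is_file():
                continue
            size = entry.stat().st_size
            count += 1
            total += size
            # Illustrative buckets; adjust them to the SST sizes you expect.
            if size < MIB:
                histogram["< 1 MiB"] += 1
            elif size < 64 * MIB:
                histogram["1-64 MiB"] += 1
            else:
                histogram[">= 64 MiB"] += 1
    return count, total, dict(histogram)

# Example call (the path is a placeholder):
# print(summarize_shared_dir("/path/to/checkpoints/<job-id>/shared"))
```

Many small files that never disappear would hint that compaction is not merging them; a few large ones would rather point to state that is never removed.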

Best,
Andrey

> On 29 Nov 2018, at 11:34, Andrey Zagrebin <andrey@data-artisans.com> wrote:
> 
> Compaction merges SST files in the background using native threads. While merging, it filters
> out removed and expired data. In general, the idea is that there are enough resources for
> compaction to keep up with the DB update rate and reduce storage. It can be quite IO intensive.
> Compaction has a lot of tuning knobs and statistics to monitor the process [1]. Tuning them is
> usually out of the scope of Flink because it depends on the state access pattern of the
> application. You can create and set a RocksDBStateBackend for your application in Flink and
> configure it with custom RocksDB/column-specific options.
> 
> [1] https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide
> [2] https://github.com/facebook/rocksdb/wiki/Compaction
> 
>> On 29 Nov 2018, at 11:20, <Bernd.Winterstein@Dev.Helaba.de> wrote:
>> 
>> We use TtlDB because the state contents should expire automatically after 24 hours.
>> Therefore the only thing we changed is that the state backend uses TtlDB with a fixed
>> retention time instead of plain RocksDB.
>> 
>> Our IO is slow because we only have SAN volumes available. Can you further clarify
>> the problem with slow compaction?
>> 
>> Regards,
>> 
>> Bernd
>> 
>> 
>> -----Original Message-----
>> From: Andrey Zagrebin [mailto:andrey@data-artisans.com]
>> Sent: Thursday, 29 November 2018 11:01
>> To: Winterstein, Bernd
>> Cc: Kostas Kloudas; user; s.richter@data-artisans.com; till@data-artisans.com; stephan@data-artisans.com
>> Subject: Re: number of files in checkpoint directory grows endlessly
>> 
>> If you use incremental checkpoints, the state backend stores the raw RocksDB SST files, which
>> together represent all state data. Each checkpoint adds only the SST files with new updates
>> that are not present in the previous checkpoint, basically their difference.
>> 
>> One of the following could be happening:
>> - old keys are never explicitly deleted and never expire (depending on how TtlDB is used)
>> - compaction is too slow to merge away the older SST files, so the latest checkpoint still
>> references them and they cannot be deleted together with the previous checkpoints
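
The second case can be illustrated with a toy model. This is only a sketch under simplifying assumptions (one new SST file per checkpoint, a compaction that merges everything into a single file), with hypothetical function names; it is not how Flink or RocksDB implement it:

```python
def simulate(num_checkpoints, compact_every=None):
    """Return, oldest first, the set of SST files each checkpoint references.

    Toy assumptions: one new SST file is flushed between checkpoints, and a
    compaction (every `compact_every` checkpoints, if set) merges all live
    files into one new file.
    """
    live = set()
    history = []
    for i in range(1, num_checkpoints + 1):
        live.add(f"sst-{i}")  # new updates since the previous checkpoint
        if compact_every and i % compact_every == 0:
            live = {f"compacted-{i}"}  # older files are merged away
        history.append(set(live))
    return history


def files_kept_in_shared(history, num_retained=1):
    """Files that must stay in 'shared': everything still referenced by one
    of the last `num_retained` checkpoints (num_retained must be >= 1)."""
    kept = set()
    for referenced in history[-num_retained:]:
        kept |= referenced
    return kept


# Without compaction the latest checkpoint references every file ever written.
print(len(files_kept_in_shared(simulate(10))))                   # 10
# With compaction after every 4th checkpoint, only recent files remain.
print(len(files_kept_in_shared(simulate(10, compact_every=4))))  # 3
```

Even with a single retained checkpoint, the number of shared files is bounded only if compaction keeps replacing old files with merged ones.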
>> 
>>> On 29 Nov 2018, at 10:48, <Bernd.Winterstein@Dev.Helaba.de> wrote:
>>> 
>>> Hi
>>> We use Flink 1.6.2. In the checkpoint directory there is only one chk-xxx
>>> directory. Therefore I would expect that only one checkpoint remains.
>>> The value of 'state.checkpoints.num-retained' is not set explicitly.
>>> 
>>> The problem is not the number of checkpoints but the number of files in the "shared"
directory next to the chk-xxx directory.
>>> 
>>> 
>>> -----Original Message-----
>>> From: Andrey Zagrebin [mailto:andrey@data-artisans.com]
>>> Sent: Thursday, 29 November 2018 10:39
>>> To: Kostas Kloudas
>>> Cc: Winterstein, Bernd; user; Stefan Richter; Till Rohrmann; Stephan Ewen
>>> Subject: Re: number of files in checkpoint directory grows endlessly
>>> 
>>> Hi Bernd,
>>> 
>>> Did you change 'state.checkpoints.num-retained' in flink-conf.yaml? By default,
>>> only one checkpoint should be retained.
>>> 
>>> Which version of Flink do you use?
>>> Can you check the Job Master logs for a warning like this:
>>> `Fail to subsume the old checkpoint`?
>>> 
>>> Best,
>>> Andrey
>>> 
>>>> On 29 Nov 2018, at 10:18, Kostas Kloudas <k.kloudas@data-artisans.com> wrote:
>>>> 
>>>> Hi Bernd,
>>>> 
>>>> I think Till, Stefan or Stephan (cc'ed) are the best people to answer your question.
>>>> 
>>>> Cheers,
>>>> Kostas
>>> 
>>> ________________________________
>>> 
>>> 
>>> Landesbank Hessen-Thueringen Girozentrale Anstalt des oeffentlichen
>>> Rechts
>>> Sitz: Frankfurt am Main / Erfurt
>>> Amtsgericht Frankfurt am Main, HRA 29821 / Amtsgericht Jena, HRA
>>> 102181
>>> 
>>> Bitte nutzen Sie die E-Mail-Verbindung mit uns ausschliesslich zum Informationsaustausch.
Wir koennen auf diesem Wege keine rechtsgeschaeftlichen Erklaerungen (Auftraege etc.) entgegennehmen.
>>> 
>>> Der Inhalt dieser Nachricht ist vertraulich und nur fuer den angegebenen Empfaenger
bestimmt. Jede Form der Kenntnisnahme oder Weitergabe durch Dritte ist unzulaessig. Sollte
diese Nachricht nicht fur Sie bestimmt sein, so bitten wir Sie, sich mit uns per E-Mail oder
telefonisch in Verbindung zu setzen.
>>> 
>>> Please use your E-mail connection with us exclusively for the exchange of information.
We do not accept legally binding declarations (orders, etc.) by this means of communication.
>>> 
>>> The contents of this message is confidential and intended only for the
>>> recipient indicated. Taking notice of this message or disclosure by third parties
is not permitted. In the event that this message is not intended for you, please contact us
via E-mail or phone.
>> 
> 

