flink-user mailing list archives

From vinay patil <vinay18.pa...@gmail.com>
Subject Re: Checkpointing with RocksDB as statebackend
Date Mon, 20 Feb 2017 18:04:43 GMT
Hi Xiaogang,

Thank you for your inputs.

Yes, I have already tried setting MaxBackgroundFlushes and
MaxBackgroundCompactions to higher values (tried 2, 4, and 8), but I am
still not getting the expected results.

System.getProperty("java.io.tmpdir") points to /tmp, but I could not find
the RocksDB logs there. Can you please let me know where I can find them?
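
A minimal sketch for hunting down the file, assuming RocksDB's usual naming
(an info log called LOG, with rotated copies called LOG.old.<timestamp>) and
assuming the database directory sits somewhere below the directory being
searched. The class and method names are hypothetical. The program only walks
a directory tree with the Java standard library and does not touch Flink or
RocksDB. It defaults to java.io.tmpdir, but any directory can be passed as the
first argument, for example a task manager's working directory if the temp
location is overridden on the cluster.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: searches a directory tree for RocksDB info logs.
class RocksDbLogFinder {

    // Assumption: the info log is named "LOG"; rotated logs are "LOG.old.<timestamp>".
    public static boolean isRocksDbLogName(String fileName) {
        return fileName.equals("LOG") || fileName.startsWith("LOG.old.");
    }

    // Collects every regular file under root whose name looks like a RocksDB info log.
    // Returns an empty list if root is not a directory.
    public static List<Path> findLogFiles(Path root) {
        List<Path> found = new ArrayList<>();
        if (!Files.isDirectory(root)) {
            return found;
        }
        try {
            Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                    Path name = file.getFileName();
                    if (name != null && attrs.isRegularFile()
                            && isRocksDbLogName(name.toString())) {
                        found.add(file);
                    }
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult visitFileFailed(Path file, IOException e) {
                    // Skip entries that cannot be read (common under a shared /tmp).
                    return FileVisitResult.CONTINUE;
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        // Default to the JVM temp directory mentioned in this thread.
        String start = args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir");
        for (Path log : findLogFiles(Paths.get(start))) {
            System.out.println(log);
        }
    }
}
```

It needs to run on the host where the task manager runs, as a user that can
read the task manager's directories. If it prints nothing for /tmp, the
RocksDB working directory is probably configured elsewhere.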

Regards,
Vinay Patil

On Mon, Feb 20, 2017 at 7:32 AM, xiaogang.sxg [via Apache Flink User
Mailing List archive.] <ml-node+s2336050n11731h13@n4.nabble.com> wrote:

> Hi Vinay
>
> Can you provide the LOG file from RocksDB? It helps a lot in figuring out
> the problem because it records the options and the events that happened
> during execution. Unless configured otherwise, it should be located at the
> path set in System.getProperty("java.io.tmpdir").
>
> Typically, a large amount of memory is consumed by RocksDB to store
> necessary indices. To avoid unbounded growth in memory consumption, you
> can put these indices into the block cache (set CacheIndexAndFilterBlocks
> to true) and set the block cache size appropriately.
>
> You can also increase the number of background threads to improve the
> performance of flushes and compactions (via MaxBackgroundFlushes and
> MaxBackgroundCompactions).
>
> In YARN clusters, task managers will be killed if their memory utilization
> exceeds the allocation size. Currently Flink does not count the memory used
> by RocksDB in the allocation. We are working on fine-grained resource
> allocation (see FLINK-5131). It may help to avoid such problems.
>
> Hope this information helps.
>
> Regards,
> Xiaogang
>
>
> ------------------------------------------------------------------
> From: Vinay Patil <[hidden email]
> <http:///user/SendEmail.jtp?type=node&node=11731&i=0>>
> Sent: Friday, February 17, 2017, 21:19
> To: user <[hidden email]
> <http:///user/SendEmail.jtp?type=node&node=11731&i=1>>
> Subject: Re: Checkpointing with RocksDB as statebackend
>
> Hi Guys,
>
> There seems to be an issue with RocksDB memory utilization.
>
> Within a few minutes of starting the job, physical memory usage increases
> by 4-5 GB, and it keeps increasing.
> I have tried different values for Max Buffer Size (30MB, 64MB, 128MB,
> 512MB) with Min Buffer to Merge set to 2, but physical memory keeps
> increasing.
>
> According to the RocksDB documentation, these are the main options that
> control flushing to storage.
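
As a rough illustration of how the buffer size feeds into memory use, the
sketch below estimates an upper bound on memtable memory. It assumes, following
the RocksDB tuning guide, that each column family can hold up to
max_write_buffer_number buffers of write_buffer_size bytes in memory, and it
assumes each piece of keyed state is stored in its own column family. The
values 4 and 8 are hypothetical placeholders, not numbers from this thread, and
the estimate ignores the block cache, indices, and filters. The class and
method names are made up for this example.

```java
// Hypothetical helper: back-of-the-envelope bound on RocksDB memtable memory.
class RocksDbMemoryEstimate {

    public static final long MB = 1024L * 1024L;

    // Upper bound under the stated assumptions:
    // write buffer size x max number of write buffers x number of column families.
    public static long memtableBytes(long writeBufferSizeBytes,
                                     int maxWriteBufferNumber,
                                     int columnFamilies) {
        return writeBufferSizeBytes * maxWriteBufferNumber * columnFamilies;
    }

    public static void main(String[] args) {
        int maxWriteBufferNumber = 4; // hypothetical value, not from the thread
        int columnFamilies = 8;       // hypothetical: states x parallel instances per task manager

        // Buffer sizes tried in the message above.
        for (long sizeMb : new long[] {30, 64, 128, 512}) {
            long total = memtableBytes(sizeMb * MB, maxWriteBufferNumber, columnFamilies);
            System.out.println(sizeMb + " MB buffers: up to " + (total / MB) + " MB in memtables");
        }
    }
}
```

Under these assumptions the bound grows linearly with the buffer size, so
raising the buffer size alone would not be expected to reduce physical memory
usage.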
>
> Can you please point out what I am doing wrong? I have tried different
> configuration options, but each time the Task Manager gets killed after
> some time :)
>
> Regards,
> Vinay Patil
>
> On Thu, Feb 16, 2017 at 6:02 PM, Vinay Patil <[hidden email]
> <http:///user/SendEmail.jtp?type=node&node=11731&i=2>> wrote:
> I think it is more related to RocksDB. I am not familiar with RocksDB
> either, but I am reading the tuning guide to understand the important
> values that can be set.
>
> Regards,
> Vinay Patil
>
> On Thu, Feb 16, 2017 at 5:48 PM, Stefan Richter [via Apache Flink User
> Mailing List archive.] <[hidden email]
> <http:///user/SendEmail.jtp?type=node&node=11731&i=3>> wrote:
> What kind of problem are we talking about: S3-related or RocksDB-related?
> I am not aware of problems with RocksDB per se. I think seeing logs for
> this would be very helpful.
>
> On 16.02.2017 at 11:56, Aljoscha Krettek <[hidden email]
> <http:///user/SendEmail.jtp?type=node&node=11673&i=0>> wrote:
>
> [hidden email] <http:///user/SendEmail.jtp?type=node&node=11673&i=1> and
> [hidden email] <http:///user/SendEmail.jtp?type=node&node=11673&i=2> could
> this be the same problem that you recently saw when working with other
> people?
>
> On Wed, 15 Feb 2017 at 17:23 Vinay Patil <[hidden email]
> <http:///user/SendEmail.jtp?type=node&node=11673&i=3>> wrote:
> Hi Guys,
>
> Can anyone please help me with this issue?
>
> Regards,
> Vinay Patil
>
> On Wed, Feb 15, 2017 at 6:17 PM, Vinay Patil <[hidden email]
> <http:///user/SendEmail.jtp?type=node&node=11673&i=4>> wrote:
> Hi Ted,
>
> I have 3 boxes in my pipeline: the 1st and 2nd boxes each contain a source
> and an S3 sink, and the 3rd box is a window operator followed by chained
> operators and an S3 sink.
>
> In the details section I can see that the S3 sink is taking time to
> acknowledge, and it is not even reaching the window operator chain.
>
> But as shown in the snapshot, checkpoint id 19 did not get any
> acknowledgement. I am not sure what is causing the issue.
>
> Regards,
> Vinay Patil
>
> On Wed, Feb 15, 2017 at 5:51 PM, Ted Yu [via Apache Flink User Mailing
> List archive.] <[hidden email]
> <http:///user/SendEmail.jtp?type=node&node=11673&i=5>> wrote:
> What did the More Details link say ?
>
> Thanks
>
> > On Feb 15, 2017, at 3:11 AM, vinay patil <[hidden email]
> > <http://user/SendEmail.jtp?type=node&node=11641&i=0>> wrote:
> >
> > Hi,
> >
> > I have set the checkpointing interval to 6 secs and the minimum pause
> > between checkpoints to 5 secs. While testing the pipeline I observed
> > that some checkpoints take a long time. As you can see in the attached
> > snapshot, checkpoint id 19 took the longest before it failed, although
> > it had not received any acknowledgements. During these 10 minutes the
> > entire pipeline made no progress and no data was processed. (For
> > example: in 13 minutes 20M records were processed, and when the
> > checkpoint took this long there was no progress for the next 10 minutes.)
> >
> > I have even tried setting the max checkpoint timeout to 3 min, but in
> > that case as well multiple checkpoints failed.
> >
> > I have set RocksDB FLASH_SSD_OPTION.
> > What could be the issue?
> >
> > P.S. I am writing to 3 S3 sinks
> >
> > checkpointing_issue.PNG
> > <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/n11640/checkpointing_issue.PNG>
> >
> >
> >
> > --
> > View this message in context:
> > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpointing-with-RocksDB-as-statebackend-tp11640.html
> > Sent from the Apache Flink User Mailing List archive. mailing list
> > archive at Nabble.com.
>
>
> ------------------------------
> If you reply to this email, your message will be added to the discussion
> below:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpointing-with-RocksDB-as-statebackend-tp11640p11641.html




--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Re-Checkpointing-with-RocksDB-as-statebackend-tp11752.html
Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.