flink-user mailing list archives

From Yang Wang <danrtsey...@gmail.com>
Subject Re: Flink Conf "yarn.flink-dist-jar" Question
Date Tue, 17 Mar 2020 01:36:32 GMT
Hi Hailu,

Sorry for the late response. If the Flink cluster (e.g. a Yarn application)
is stopped directly by `yarn application -kill`, the staging directory will
be left behind, since the jobmanager does not get any chance to clean up the
staging directory. This may also happen when the jobmanager crashes and
reaches Yarn's attempt limit.
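
As an interim cleanup for such leftovers, something along the following lines might work. This is only a minimal sketch, not anything Flink provides: the helper name `stale_staging_dirs`, the second application ID, and the `notes.txt` entry are made up for illustration, and it assumes the staging directories are named after the Yarn application ID, as in the `/user/<user>/.flink/application_...` path from your launch_container.sh. It only decides which directories are candidates; the listing and deletion steps are left to commands such as `hdfs dfs -ls`, `yarn application -list -appStates ...` and `hdfs dfs -rm -r`.

```python
import re
from typing import Iterable, List, Set

# Yarn application IDs look like application_<clusterTimestamp>_<sequence>.
APP_DIR_PATTERN = re.compile(r"^application_\d+_\d+$")


def stale_staging_dirs(staging_root: str,
                       dir_names: Iterable[str],
                       active_app_ids: Set[str]) -> List[str]:
    """Return staging paths whose application is not in the active set."""
    stale = []
    for name in dir_names:
        # Skip anything that is not an application staging directory.
        if not APP_DIR_PATTERN.match(name):
            continue
        # Keep directories of applications that are still reported as active.
        if name in active_app_ids:
            continue
        stale.append(f"{staging_root.rstrip('/')}/{name}")
    return stale


if __name__ == "__main__":
    # Hypothetical inputs. In practice the names could come from
    # `hdfs dfs -ls /user/<user>/.flink` and the active IDs from
    # `yarn application -list -appStates RUNNING,ACCEPTED`.
    names = ["application_1583031705852_117863",
             "application_1583031705852_117999",
             "notes.txt"]
    active = {"application_1583031705852_117999"}
    for path in stale_staging_dirs("/user/delp/.flink", names, active):
        # Each printed path is a candidate for `hdfs dfs -rm -r <path>`.
        print(path)
```

An application that has just uploaded its files may not show up in the active list yet, so it is probably safer to also skip directories younger than some age threshold before deleting anything.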

For FLINK-13938, yes, it is trying to use the Yarn public cache to
accelerate the container
launch.


Best,
Yang

Hailu, Andreas <Andreas.Hailu@gs.com> wrote on Tue, Mar 10, 2020 at 4:38 AM:

> Also may I ask what causes these application ID directories to be left
> behind? Is it a job failure, or can they persist even if the application
> succeeds? I’d like to know so that I can implement my own cleanup in the
> interim to prevent exceeding user disk space quotas.
>
>
>
> *// *ah
>
>
>
> *From:* Hailu, Andreas [Engineering]
> *Sent:* Monday, March 9, 2020 1:20 PM
> *To:* 'Yang Wang' <danrtsey.wy@gmail.com>
> *Cc:* tison <wander4096@gmail.com>; user@flink.apache.org
> *Subject:* RE: Flink Conf "yarn.flink-dist-jar" Question
>
>
>
> Hi Yang,
>
>
>
> Yes, a combination of these two would be very helpful for us. We have a
> single shaded binary which we use to run all of the jobs on our YARN
> cluster. If we could designate a single location in HDFS for that as well,
> we could also greatly benefit from FLINK-13938.
>
>
>
> It sounds like a general public cache solution is what’s being called for?
>
>
>
> *// *ah
>
>
>
> *From:* Yang Wang <danrtsey.wy@gmail.com>
> *Sent:* Sunday, March 8, 2020 10:52 PM
> *To:* Hailu, Andreas [Engineering] <Andreas.Hailu@ny.email.gs.com>
> *Cc:* tison <wander4096@gmail.com>; user@flink.apache.org
> *Subject:* Re: Flink Conf "yarn.flink-dist-jar" Question
>
>
>
> Hi Hailu, tison,
>
>
>
> I created a very similar ticket before to accelerate Flink submission on
> Yarn[1]. However, we did not reach a consensus in the PR. Maybe it's time
> to revive the discussion and try to find a common solution for both
> tickets[1][2].
>
>
>
>
>
> [1]. https://issues.apache.org/jira/browse/FLINK-13938
>
> [2]. https://issues.apache.org/jira/browse/FLINK-14964
>
>
>
>
>
> Best,
>
> Yang
>
>
>
> Hailu, Andreas <Andreas.Hailu@gs.com> wrote on Sat, Mar 7, 2020 at 11:21 AM:
>
> Hi Tison, thanks for the reply. I’ve replied to the ticket. I’ll be
> watching it as well.
>
>
>
> *// *ah
>
>
>
> *From:* tison <wander4096@gmail.com>
> *Sent:* Friday, March 6, 2020 1:40 PM
> *To:* Hailu, Andreas [Engineering] <Andreas.Hailu@ny.email.gs.com>
> *Cc:* user@flink.apache.org
> *Subject:* Re: Flink Conf "yarn.flink-dist-jar" Question
>
>
>
> FLINK-13938 seems a bit different from your requirement. The one that fully
> matches is FLINK-14964 (https://issues.apache.org/jira/browse/FLINK-14964).
> I would appreciate it if you could share your opinion on the JIRA ticket.
>
>
>
> Best,
>
> tison.
>
>
>
>
>
> tison <wander4096@gmail.com> wrote on Sat, Mar 7, 2020 at 2:35 AM:
>
> Yes, your requirement is exactly what the community has taken into
> consideration. We currently have an open JIRA ticket for this specific
> feature[1], and work on loosening the constraint on the flink-jar scheme to
> support a DFS location should happen.
>
>
>
> Best,
>
> tison.
>
>
>
> [1] https://issues.apache.org/jira/browse/FLINK-13938
>
>
>
>
>
> Hailu, Andreas <Andreas.Hailu@gs.com> wrote on Sat, Mar 7, 2020 at 2:03 AM:
>
> Hi,
>
>
>
> We noticed that every time an application runs, it uploads the flink-dist
> artifact to the /user/<user>/.flink HDFS directory. This causes a user disk
> space quota issue, as we submit thousands of apps to our cluster every hour.
> We had a similar problem with our Spark applications, where the Spark
> Assembly package was uploaded for every app. Spark provides an option to
> point its applications at a location in HDFS so they don’t need to upload
> the package for every run, and that was our solution (see the
> “spark.yarn.jar” configuration if interested).
>
>
>
> Looking at the Resource Orchestration Frameworks page
> (https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#yarn-flink-dist-jar),
> I see there might be a similar concept through a “yarn.flink-dist-jar”
> configuration option. I wanted to place the flink-dist package we’re using
> in a location in HDFS and configure our jobs to point to it, e.g.
>
>
>
> yarn.flink-dist-jar: hdfs:////user/delp/.flink/flink-dist_2.11-1.9.1.jar
>
>
>
> Am I correct in that this is what I’m looking for? I gave this a try with
> some jobs today, and based on what I’m seeing in the launch_container.sh in
> our YARN application, it still looks like it’s being uploaded:
>
>
>
> export _FLINK_JAR_PATH="hdfs://d279536/user/delp/.flink/application_1583031705852_117863/flink-dist_2.11-1.9.1.jar"
>
>
>
> How can I confirm? Or is this perhaps not the config I’m looking for?
>
>
>
> Best,
>
> Andreas
>
>
> ------------------------------
>
>
> Your Personal Data: We may collect and process information about you that
> may be subject to data protection laws. For more information about how we
> use and disclose your personal data, how we protect your information, our
> legal basis to use your information, your rights and who you can contact,
> please refer to: www.gs.com/privacy-notices
>
>
>
