airflow-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bas Harenslak <basharens...@godatadriven.com>
Subject Re: Need more pairs of eyes for the flaky LocalExecutorTest fix
Date Mon, 29 Apr 2019 20:25:47 GMT
@Fokko I believe that’s implicitly created by multiprocessing.Manager()<https://docs.python.org/3.4/library/multiprocessing.html?highlight=process#multiprocessing.sharedctypes.multiprocessing.Manager>:

….. The returned manager object corresponds to a spawned child process and ….

I’m in favour of the Manager route, since this doesn’t introduce additional complex multiprocessing
code in Airflow. I’m reviewing right now although it’s a bit out of my comfort zone...

Bas

On 29 Apr 2019, at 21:46, Driesprong, Fokko <fokko@driesprong.frl<mailto:fokko@driesprong.frl>>
wrote:

I'm missing the part of another process? This is within the Scheduler
process if I understand correctly.

Cheers, Fokko

Op ma 29 apr. 2019 om 21:33 schreef Jarek Potiuk <Jarek.Potiuk@polidea.com<mailto:Jarek.Potiuk@polidea.com>>:

I am also leaning towards the manager. I updated the
https://github.com/apache/airflow/pull/5200 PR now after review and once
it
passes CI I think we can merge it.
If anyone wants to have a look as well, happy to hear it.

J.

On Mon, Apr 29, 2019 at 2:14 PM Ash Berlin-Taylor <ash@apache.org<mailto:ash@apache.org>>
wrote:

I think I lean towards the built-in/manager approach as it is less
concurrency code we have to manage/maintain in Airflow, though I'm not
hugely happy about another process :(

-ash

On 29 Apr 2019, at 07:33, Jarek Potiuk <Jarek.Potiuk@polidea.com<mailto:Jarek.Potiuk@polidea.com>>
wrote:

Hello Everyone,

I think we need some more pairs of eyes to take a look at potential
fixes
we have for the pesky LocalExecutorTest that we are all experiencing
with
our Travis builds. Once we solve it I think we should be much closer to
have stable builds - including some other flaky test fixes merged
recently.

It turned out that the problem relates to quite deep internals of how
data
is passed between processes using multiprocessing queues. It's really
deep
in the core processing of Airflow so I think it would be great if also
other experienced Airflowers review and comment it and help to select
the
best solution as we could have missed something.

I was looking at it together with Ash and Bas and I (a bit too fast)
merged
a preliminary version of the fix last week. We reverted it later as it
turned out to have some side effects, so we know we have to be careful
with
this one.

After more detailed analysis and discussions with Omar, we have now two
potential candidates to fix it. Both are green and from local testing -
both are solving the problem in a different way.

 - https://github.com/apache/airflow/pull/5199
 - https://github.com/apache/airflow/pull/5200

I tried to describe the problem, solution candidates with Pros and Cons
in
the JIRA ticket :
https://issues.apache.org/jira/browse/AIRFLOW-4401

I'd love if we can get reviews in the PRs and input to discussion on
which
solution to choose.

J.


--

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
E: jarek.potiuk@polidea.com<mailto:jarek.potiuk@polidea.com>



--

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
E: jarek.potiuk@polidea.com<mailto:jarek.potiuk@polidea.com>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message