ambari-user mailing list archives

From Gonzalo Herreros <gherre...@gmail.com>
Subject Ambari 2.5.1 Agent thread deadlock
Date Wed, 13 Sep 2017 09:42:00 GMT
Hi,

I published this issue on the Hortonworks forum a while ago but didn't get
any answer:
https://community.hortonworks.com/questions/110609/ambari-agent-251-deadlock.html
Hopefully somebody on this list can advise.

The issue is that AMBARI-20070 fixes a potential concurrency issue (one I never
experienced) but in turn creates a thread deadlock in the agents.

If I run the 2.5.1 agent, agents start becoming unresponsive within a matter
of minutes (yellow icon in Ambari). Before a day goes by, all agents are
marked as "unknown" and need to be restarted.

A thread dump reveals that all working threads are waiting for a lock
introduced in the fix, which is never released.
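
For anyone who wants to check for the same symptom, a thread dump can be
taken from inside a Python process roughly as sketched below. This is a
generic illustration, not code from the Ambari agent: dump_thread_stacks is a
hypothetical helper name, and how you would trigger it inside a running agent
(a signal handler, a debug hook, and so on) is left open.

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return a text report with the current stack of every live thread.

    Hypothetical helper for illustration; not part of ambari_agent.
    """
    # Map thread identifiers to readable names where the threading
    # module knows about the thread.
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    # sys._current_frames() maps each thread id to its topmost frame.
    for ident, frame in sys._current_frames().items():
        lines.append("Thread %s (%s)" % (names.get(ident, "unknown"), ident))
        lines.extend(l.rstrip("\n") for l in traceback.format_stack(frame))
        lines.append("")
    return "\n".join(lines)
```

In a deadlocked process, many threads showing the same innermost frame (for
example, all of them waiting to acquire the same lock) is the pattern to look
for.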

I manually commented out the line fix_subprocess_popen() in
/ambari-agent/src/main/python/ambari_agent/main.py
and with that change I have been running 2.5.1 on development environments
for months without any issues. I'm surprised nobody else has seen this. So far
I have only tested it on VMs, so that might be a factor.

Thanks,
Gonzalo
