amaterasu-dev mailing list archives

From "Nadav Har Tzvi (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AMATERASU-5) Allocated resources aren't cleaned up properly on crash/unexpected halt
Date Mon, 13 Nov 2017 18:52:01 GMT
Nadav Har Tzvi created AMATERASU-5:
--------------------------------------

             Summary: Allocated resources aren't cleaned up properly on crash/unexpected halt
                 Key: AMATERASU-5
                 URL: https://issues.apache.org/jira/browse/AMATERASU-5
             Project: AMATERASU
          Issue Type: Bug
         Environment: CentOS 7 in Parallels, 2 CPUs allocated, 8 GB memory
            Reporter: Nadav Har Tzvi
            Priority: Critical
         Attachments: Screen Shot 2017-11-13 at 20.44.34.png, Screen Shot 2017-11-13 at 20.45.24.png

Alright, it goes like this:
Given a slave with N CPUs and M memory.
Given that each job requires 1 CPU and X memory.
When you run a job using ama-start.
When you hit ctrl-c in the middle of the run.
Then the next time you start executing Amaterasu, you will have N-1 CPUs.
And the next time you start executing Amaterasu, you will have M-X memory.
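
To make the arithmetic concrete, here is a minimal Python sketch of the accounting described in the steps above. It is only a model of the reported symptom, not Amaterasu code: the Slave class and the offer_after_interrupted_runs function are hypothetical names, and the 1 GB per-job memory figure is an assumed stand-in for X, which this report does not state.

```python
from dataclasses import dataclass


@dataclass
class Slave:
    """Hypothetical model of the resources a slave can offer."""
    cpus: int
    mem_gb: float


def offer_after_interrupted_runs(slave, job_mem_gb, interrupted_runs):
    """Return what the slave would offer after runs killed with ctrl-c.

    Models the reported behaviour: each interrupted run leaks 1 CPU and
    job_mem_gb of memory, and nothing is given back until a reboot.
    """
    # Leaked amounts are capped so the offer never goes negative.
    leaked_cpus = min(interrupted_runs, slave.cpus)
    leaked_mem = min(interrupted_runs * job_mem_gb, slave.mem_gb)
    return Slave(slave.cpus - leaked_cpus, slave.mem_gb - leaked_mem)


# The VM from this report: 2 CPUs, 8 GB. job_mem_gb=1 is an assumed value for X.
vm = Slave(cpus=2, mem_gb=8)
for runs in range(3):
    print(runs, offer_after_interrupted_runs(vm, job_mem_gb=1, interrupted_runs=runs))
```

With N = 2, two interrupted runs are enough to leave the slave with no CPUs to offer.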

The missing resources come back only after the machine is rebooted. Pretty darn problematic,
as it will kill slaves in no time.

I attached images showing some execution trace logs. I am using a VM with 2 CPUs and 8
GB memory. You will see the number of CPUs drop from 2 to 1 in the first image, and
then from 1 to 0 (actually, CPUs are not mentioned at all) in the second image. Available
memory behaves in a similar way.

I accidentally discovered it while developing ama-cli where I screwed up execution quite a
bit.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
