mesos-dev mailing list archives

From "Timothy St. Clair (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MESOS-1037) Exited child process status
Date Tue, 25 Feb 2014 21:17:23 GMT
Timothy St. Clair created MESOS-1037:
----------------------------------------

             Summary: Exited child process status
                 Key: MESOS-1037
                 URL: https://issues.apache.org/jira/browse/MESOS-1037
             Project: Mesos
          Issue Type: Bug
          Components: libprocess
    Affects Versions: 0.17.0
         Environment: Fedora 20 - Linux 3.12.10-300.fc20.x86_64 #1 SMP Thu Feb 6 22:11:48
UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
            Reporter: Timothy St. Clair


During initial packaging I had turned a blind eye to some failing tests. Upon further investigation,
there appears to be an issue with how processes are reaped on this kernel
(Linux 3.12.10-300.fc20.x86_64 #1 SMP Thu Feb 6 22:11:48 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux).

It appears that the status of a child process is read from its /proc entry before waitpid
is called. In my environment the entry is already gone at that point, and the tests yield the following:

[----------] 3 tests from Reap
[ RUN      ] Reap.NonChildProcess
libprocess/tests/reap_tests.cpp:84: Failure
status.get() is NONE
[  FAILED  ] Reap.NonChildProcess (42 ms)
[ RUN      ] Reap.ChildProcess
libprocess/tests/reap_tests.cpp:123: Failure
status.get() is NONE
[  FAILED  ] Reap.ChildProcess (31 ms)
[ RUN      ] Reap.TerminatedChildProcess
libprocess/tests/reap_tests.cpp:150: Failure
process is NONE
Process 18856 reaped unexpectedly
[  FAILED  ] Reap.TerminatedChildProcess (3 ms)
[----------] 3 tests from Reap (76 ms total)


[----------] 4 tests from Subprocess
[ RUN      ] Subprocess.status
libprocess/tests/subprocess_tests.cpp:40: Failure
s.get().status().get() is NONE
[  FAILED  ] Subprocess.status (30 ms)
[ RUN      ] Subprocess.output
libprocess/tests/subprocess_tests.cpp:129: Failure
s.get().status().get() is NONE
[  FAILED  ] Subprocess.output (31 ms)
[ RUN      ] Subprocess.input
libprocess/tests/subprocess_tests.cpp:182: Failure
s.get().status().get() is NONE
[  FAILED  ] Subprocess.input (31 ms)
[ RUN      ] Subprocess.splice
libprocess/tests/subprocess_tests.cpp:221: Failure
s.get().status().get() is NONE
[  FAILED  ] Subprocess.splice (32 ms)
[----------] 4 tests from Subprocess (124 ms total)

Trace of failure:
process::ReaperProcess::reap () at reap.cpp:36
os::process () at linux.hpp:54
proc::status () at proc.hpp:174
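
To illustrate the ordering concern described above, here is a minimal sketch (not the libprocess
implementation) of a non-blocking poll that asks waitpid first and only falls back to a /proc
lookup when the pid is not a waitable child. The names Poll, State, procEntryExists and pollOnce
are hypothetical and exist only for this example; it assumes Linux with /proc mounted.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical result types for this sketch; these are not libprocess types.
enum class State { Running, Terminated, Gone };

struct Poll {
  State state;
  int status;  // Raw waitpid status; meaningful only when state == Terminated.
};

// Returns true if /proc/<pid> currently exists (Linux-specific check).
inline bool procEntryExists(pid_t pid) {
  struct stat s;
  const std::string path = "/proc/" + std::to_string(pid);
  return ::stat(path.c_str(), &s) == 0;
}

// Polls `pid` once without blocking. waitpid is consulted first, so the exit
// status of a child is collected before anything depends on its /proc entry.
inline Poll pollOnce(pid_t pid) {
  // waitpid treats 0 and negative values as process-group selectors.
  assert(pid > 0);

  int status = 0;
  pid_t result;
  do {
    result = ::waitpid(pid, &status, WNOHANG);
  } while (result == -1 && errno == EINTR);

  if (result == pid) {
    return {State::Terminated, status};  // Child exited or was killed.
  }
  if (result == 0) {
    return {State::Running, 0};  // Child exists but has not terminated yet.
  }

  // waitpid failed (typically ECHILD): the pid is not our child or was
  // already reaped. Only now fall back to /proc; no status is available.
  return {procEntryExists(pid) ? State::Running : State::Gone, 0};
}
```

A caller would invoke pollOnce(pid) periodically and decode a Terminated result with WIFEXITED
and WEXITSTATUS. With this ordering, a missing /proc entry can only be observed for pids whose
status waitpid cannot report anyway.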

Branch for 0.18.0-rc2 build can be found here: https://github.com/timothysc/mesos/tree/0.18.0-post-shuffle

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
