mesos-issues mailing list archives

From "chenqiuhao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-2706) When the docker-tasks grow, the time spare between Queuing task and Starting container grows
Date Fri, 15 May 2015 11:26:00 GMT

    [ https://issues.apache.org/jira/browse/MESOS-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545342#comment-14545342 ]

chenqiuhao commented on MESOS-2706:
-----------------------------------

Using strace, I found that the slave process reads /proc/<pid>/stat and /proc/<pid>/cmdline
for every process on the host, round after round, to collect usage statistics for each docker-task.
For example, if I launch one docker-task on an OS that is already running 500 other processes
(counted with ps -ef|grep wc), the mesos slave process keeps reading the stat and cmdline files
500+500 times per round. Once the number of docker-tasks reached 50, the sheer volume of these
/proc reads consumed an entire CPU core.
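
A rough sketch of the scan pattern described above, written in Python rather than the slave's
actual C++ code. The names scan_proc and reads_per_round are hypothetical, and the idea that
every docker-task triggers its own full scan is an assumption inferred from the strace
observation, not something confirmed against the Mesos source.

```python
import os


def scan_proc(proc_root="/proc"):
    """Read the stat and cmdline files of every numeric entry under proc_root.

    Returns (processes_seen, file_reads). Hypothetical helper for illustration.
    """
    processes = 0
    reads = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-PID entries such as "self" or "meminfo"
        saw_process = False
        for name in ("stat", "cmdline"):
            try:
                with open(os.path.join(proc_root, entry, name), "rb") as f:
                    f.read()
                reads += 1
                saw_process = True
            except OSError:
                # The process may have exited between listdir() and open().
                pass
        if saw_process:
            processes += 1
    return processes, reads


def reads_per_round(num_processes, num_tasks):
    # Assumption: each task causes one full scan, two files per process.
    return 2 * num_processes * num_tasks
```

Under that assumption, reads_per_round(500, 1) gives the 500+500 = 1000 reads mentioned above,
and reads_per_round(500, 50) gives 50000 reads per round, before counting the extra processes
that the 50 containers themselves add.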


> When the docker-tasks grow, the time spare between Queuing task and Starting container grows
> --------------------------------------------------------------------------------------------
>
>                 Key: MESOS-2706
>                 URL: https://issues.apache.org/jira/browse/MESOS-2706
>             Project: Mesos
>          Issue Type: Bug
>          Components: docker
>    Affects Versions: 0.22.0
>         Environment: My environment info:
> Mesos 0.22.0 & Marathon 0.82-RC1, both running on one host server.
> Each docker-task requires 0.02 CPU and 128 MB of memory, and the server has 8 CPUs and 24 GB of memory.
> So Mesos can launch thousands of tasks in theory.
> Each docker-task is very lightweight: it only launches an sshd service.
>            Reporter: chenqiuhao
>
> At the beginning, Marathon can launch docker-tasks very fast, but once the number of tasks
> on the single mesos-slave host reached 50, Marathon seemed to launch docker-tasks slowly.
> So I checked the mesos-slave log and found that the time gap between "Queuing task" and
> "Starting container" had grown.
> For example,
> launching the 1st docker task takes about 0.008s:
> [root@CNSH231434 mesos-slave]# tail -f slave.out |egrep 'Queuing task|Starting container'
> I0508 15:54:00.188350 225779 slave.cpp:1378] Queuing task 'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b' for executor dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b of framework '20150202-112355-2684495626-5050-26153-0000
> I0508 15:54:00.196832 225781 docker.cpp:581] Starting container 'd0b0813a-6cb6-4dfd-bbce-f1b338744285' for task 'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b' (and executor 'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b') of framework '20150202-112355-2684495626-5050-26153-0000'
> launching the 50th docker task takes about 4.9s:
> I0508 16:12:10.908596 225781 slave.cpp:1378] Queuing task 'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b' for executor dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b of framework '20150202-112355-2684495626-5050-26153-0000
> I0508 16:12:15.801503 225778 docker.cpp:581] Starting container '482dd47f-b9ab-4b09-b89e-e361d6f004a4' for task 'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b' (and executor 'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b') of framework '20150202-112355-2684495626-5050-26153-0000'
> And when I launch the 100th docker task, it takes about 13s!
> I ran the same test on a server with 24 CPUs and 256 GB of memory and got the same result.
> Has anybody had the same experience, or can someone help run the same stress test?
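
The delays quoted in the issue can be recomputed from the log timestamps. The following is a
minimal sketch, assuming the usual glog line prefix (severity letter, MMDD, then
HH:MM:SS.microseconds). The helper names glog_time and launch_delay are hypothetical, and
because glog lines carry no year, the year parameter is an assumption supplied by the caller.

```python
import re
from datetime import datetime

# glog prefix: severity letter, MMDD, then HH:MM:SS.microseconds
_GLOG_TS = re.compile(r"[IWEF](\d{4} \d{2}:\d{2}:\d{2}\.\d{6})")


def glog_time(line, year=2015):
    """Extract the timestamp from a glog-style line (year is assumed)."""
    match = _GLOG_TS.search(line)
    if match is None:
        raise ValueError("no glog timestamp found in line")
    return datetime.strptime(
        "%d %s" % (year, match.group(1)), "%Y %m%d %H:%M:%S.%f"
    )


def launch_delay(queuing_line, starting_line, year=2015):
    """Seconds between a 'Queuing task' line and its 'Starting container' line."""
    delta = glog_time(starting_line, year) - glog_time(queuing_line, year)
    return delta.total_seconds()
```

Applied to the two pairs of log lines quoted above, this gives roughly 0.008 s for the 1st
task and 4.89 s for the 50th, in line with the figures in the issue.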



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
