mesos-dev mailing list archives

From "Sam Whitlock (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-38) Executor resource monitoring and local reporting of usage stats
Date Fri, 14 Oct 2011 21:32:11 GMT

    [ https://issues.apache.org/jira/browse/MESOS-38?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127874#comment-13127874
] 

Sam Whitlock commented on MESOS-38:
-----------------------------------

This would probably work by associating one or more reporting classes with the executor,
each specific to the isolation module in use.

lxc doesn't have to have a pluggable reporting module, but it might as well, so long as that
doesn't make it too abstract.

Process-based isolation will need a pluggable reporting module because of the syscall interface
differences between OSX and Linux. This would allow OSX reporting to be implemented at
a later date.
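
A minimal sketch of what such a pluggable reporter could look like, written in Python for
brevity rather than in Mesos's own C++. Every name here (UsageReporter, NoOpReporter,
LinuxProcReporter, make_reporter, parse_stat_line) is hypothetical and not part of Mesos.
The Linux variant assumes usage is read from /proc/<pid>/stat, whose field positions follow
the proc(5) man page; the no-op variant stands in for OSX until real support exists.

```python
import abc


def parse_stat_line(line: str) -> dict:
    """Parse one /proc/<pid>/stat line into the fields a reporter might need.

    The command name (field 2) may contain spaces or parentheses, so split
    on the LAST ')' and index the remaining fields from there.
    """
    rest = line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field N lives at rest[N - 3].
    return {
        "pgid": int(rest[2]),        # field 5: process group ID
        "sid": int(rest[3]),         # field 6: session ID
        "utime": int(rest[11]),      # field 14: user-mode CPU time, clock ticks
        "stime": int(rest[12]),      # field 15: kernel-mode CPU time, clock ticks
        "rss_pages": int(rest[21]),  # field 24: resident set size, pages
    }


class UsageReporter(abc.ABC):
    """Hypothetical interface: one implementation per isolation module/platform."""

    @abc.abstractmethod
    def collect(self, executor_pid: int) -> dict:
        """Return a usage sample for the executor, or {} if unsupported."""


class NoOpReporter(UsageReporter):
    """Placeholder for platforms without an implementation yet (e.g. OSX)."""

    def collect(self, executor_pid: int) -> dict:
        return {}


class LinuxProcReporter(UsageReporter):
    """Reads a single process's usage from /proc (Linux only)."""

    def collect(self, executor_pid: int) -> dict:
        with open(f"/proc/{executor_pid}/stat") as f:
            return parse_stat_line(f.read())


def make_reporter(platform: str) -> UsageReporter:
    """Pick a reporter from a sys.platform-style string."""
    if platform.startswith("linux"):
        return LinuxProcReporter()
    return NoOpReporter()
```

A caller would pass sys.platform to make_reporter and call collect() on each sampling
interval; an OSX implementation could later replace NoOpReporter without touching callers.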
                
> Executor resource monitoring and local reporting of usage stats
> ---------------------------------------------------------------
>
>                 Key: MESOS-38
>                 URL: https://issues.apache.org/jira/browse/MESOS-38
>             Project: Mesos
>          Issue Type: New Feature
>         Environment: Initial executor monitoring for linux only. Dummy monitoring capability
(no-op) for OSX, with functionality to be filled in later.
>            Reporter: Sam Whitlock
>              Labels: monitoring
>
> Implement reporting of resource usage on executors and log it to a local log file (for
now). Eventually these statistics will be reported to the Mesos master in
order to build a timeline for the webui, a top-like command-line interface, or both.
This improvement ticket covers only the local monitoring and log file reporting. A reporting
system (to the master node) will be a later improvement ticket.
> With the current version of Mesos, it is not possible to monitor individual tasks. Therefore
the best this sort of system can do is monitor the usage of an individual executor and aggregate
the resource usage over the executor's tasks and resource allocations. If frameworks have
a 1-to-1 relationship of a job to an executor, then the aggregate statistics will be more
meaningful.
> Reporting will be available for both lxc isolation and process-based isolation. For lxc
isolation the task is easier because of the isolation facilities of lxc. Process-based isolation
is more difficult as processes can become re-parented from the process tree of the executor
(e.g. double fork). The session ID and the process group ID will likely still be the same
as that of the executor except for the uncommon case of the process resetting both of those.
> Initial reporting will be to a local log file. This will be a 'heartbeat' style log akin
to pidstat output (from the sysstat package). This may not be incredibly useful, but local monitoring
of resource usage is separate from the reporting and timeline building mentioned above.
> When usage statistics are eventually reported to the Mesos master, it may be possible
to use them to oversubscribe slave nodes.
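
The process-based case described in the issue above can be sketched roughly as follows:
treat any process that shares the executor's session ID or process group ID as belonging
to it, so that double-forked, re-parented processes are still counted, then sum usage and
emit one heartbeat line. This is an illustration in Python, not Mesos code. The Proc record,
the function names, and the log line layout are all assumptions; on Linux the IDs could be
taken from /proc/<pid>/stat or from os.getsid() and os.getpgid().

```python
from typing import List, NamedTuple


class Proc(NamedTuple):
    """Hypothetical snapshot of one process on the slave."""
    pid: int
    ppid: int
    pgid: int
    sid: int
    cpu_ticks: int
    rss_pages: int


def executor_members(procs: List[Proc], executor_pid: int) -> List[Proc]:
    """Return processes sharing the executor's session ID or process group ID.

    Parent PIDs are deliberately ignored: a double-forked process is
    re-parented away from the executor but normally keeps both IDs.
    """
    root = next(p for p in procs if p.pid == executor_pid)
    return [p for p in procs if p.sid == root.sid or p.pgid == root.pgid]


def heartbeat_line(timestamp: int, executor_pid: int, members: List[Proc]) -> str:
    """Format one aggregated sample as a single log line (layout is made up)."""
    cpu = sum(p.cpu_ticks for p in members)
    rss = sum(p.rss_pages for p in members)
    return (f"{timestamp} executor={executor_pid} procs={len(members)} "
            f"cpu_ticks={cpu} rss_pages={rss}")


snapshot = [
    Proc(100, 1, 100, 100, 50, 2000),    # the executor itself
    Proc(101, 100, 100, 100, 30, 1000),  # direct child
    Proc(102, 1, 100, 100, 20, 500),     # double-forked, re-parented to init
    Proc(200, 1, 200, 200, 99, 9999),    # unrelated process
]
members = executor_members(snapshot, 100)
line = heartbeat_line(1318627931, 100, members)
```

A process that resets both its session ID and its process group ID would be missed by
this approach, which matches the uncommon case the issue description calls out.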

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
