mesos-issues mailing list archives

From "haosdent (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-4705) Slave failed to sample container with perf event
Date Wed, 06 Apr 2016 05:25:25 GMT

    [ https://issues.apache.org/jira/browse/MESOS-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15227744#comment-15227744 ]

haosdent commented on MESOS-4705:
---------------------------------

[~bmahler] Thanks for your comments. As I checked earlier in https://reviews.apache.org/r/43283,
the {{perf stat}} output went from 3 to 4 tokens after the commit
[tools/perf/stat: Add event unit and scale support | https://github.com/torvalds/linux/commit/410136f5dd96b6013fe6d1011b523b1c247e1ccb],
which has been in mainline since 3.14. It then went from 4 to 6 tokens after the commit
[perf stat: Output running time and run/enabled ratio in CSV mode | https://github.com/torvalds/linux/commit/d73515c03c6a2706e088094ff6095a3abefd398b],
which has been in mainline since 4.1.

However, CentOS 7 backported both patches to its current kernel version (3.10):
{code}
kernel>c7$: grep 'Add event unit and scale support' SPECS/kernel.spec
- [tools] perf/stat: Add event unit and scale support (Jiri Olsa) [1133083]
kernel>c7$: grep 'Output running time and run/enabled ratio in CSV mode' SPECS/kernel.spec
- [perf] stat: Output running time and run/enabled ratio in CSV mode (Jiri Olsa) [1222189]
{code}

This is why I suggested using [~wangcong]'s perf event API: we cannot determine
the perf event format from the kernel version. But if we want to keep the current approach, I
think we need to loosen the rule. For example, match the perf event format only by the number
of tokens instead of checking the kernel version.
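
To illustrate the token-count idea, here is a minimal sketch in Python (not the actual Mesos C++ code in {{src/linux/perf.cpp}}). The function name {{parse_perf_stat_line}} and the {{PerfSample}} type are hypothetical. The 6-token layout is the one quoted in this ticket; the 3-token and 4-token layouts are my assumption of what the older formats look like.

```python
from typing import NamedTuple, Optional


class PerfSample(NamedTuple):
    """One parsed line of `perf stat` CSV output (hypothetical type)."""
    value: str
    unit: Optional[str]
    event: str
    cgroup: str
    running: Optional[str]
    ratio: Optional[str]


def parse_perf_stat_line(line: str, sep: str = ",") -> PerfSample:
    """Pick the format from the number of tokens, not the kernel version."""
    tokens = line.strip().split(sep)

    if len(tokens) == 3:
        # Assumed oldest layout: value,event,cgroup
        value, event, cgroup = tokens
        return PerfSample(value, None, event, cgroup, None, None)

    if len(tokens) == 4:
        # Assumed layout after "Add event unit and scale support":
        # value,unit,event,cgroup
        value, unit, event, cgroup = tokens
        return PerfSample(value, unit, event, cgroup, None, None)

    if len(tokens) == 6:
        # Layout reported in this ticket: value,unit,event,cgroup,running,ratio
        value, unit, event, cgroup, running, ratio = tokens
        return PerfSample(value, unit, event, cgroup, running, ratio)

    raise ValueError(
        "Unexpected number of fields (%d) in perf sample line %r"
        % (len(tokens), line)
    )
```

With this rule, the line from the error report parses as a 6-token sample on a 3.10 kernel, because the kernel version is never consulted. The trade-off is that two formats with the same token count but different field order could not be told apart.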

> Slave failed to sample container with perf event
> ------------------------------------------------
>
>                 Key: MESOS-4705
>                 URL: https://issues.apache.org/jira/browse/MESOS-4705
>             Project: Mesos
>          Issue Type: Bug
>          Components: cgroups, isolation
>    Affects Versions: 0.27.1
>            Reporter: Fan Du
>            Assignee: Fan Du
>
> When sampling a container with perf events on CentOS 7 with kernel
> 3.10.0-123.el7.x86_64, the slave complained with the error below:
> {code}
> E0218 16:32:00.591181  8376 perf_event.cpp:408] Failed to get perf sample: Failed to parse perf sample: Failed to parse perf sample line '25871993253,,cycles,mesos/5f23ffca-87ed-4ff6-84f2-6ec3d4098ab8,10059827422,100.00': Unexpected number of fields
> {code}
> It is caused by the current perf format [assumption | https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob;f=src/linux/perf.cpp;h=1c113a2b3f57877e132bbd65e01fb2f045132128;hb=HEAD#l430] for kernel versions below 3.12.
> On the 3.10.0-123.el7.x86_64 kernel, the format has 6 tokens, as below:
> value,unit,event,cgroup,running,ratio
> A local modification fixed this error on my test bed. Please review this ticket.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
