We use perf top to show CPU usage. The output lists two functions:

samples    pcnt    function
------     ----    ---------
...        ...     ....
12617.00   6.8%    func_outside
 8691.00   4.7%    func_inside


func_outside() {
    ...
    func_inside();
    ...
}

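Should I read the perf top output as meaning that the 4.7% is already included in the 6.8%? In other words, if the cost of func_inside is excluded, is the cost of func_outside only 2.1% (6.8 - 4.7)?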


Answer 1:

Short Answer

No. Each percentage reported is for that specific function only, so the func_inside samples are not counted in func_outside. In your output, func_outside itself accounts for the full 6.8%, not 2.1%.


The way perf works is that it periodically collects performance samples. By default, perf top simply checks which function is currently running and adds the sample to that function's count.
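As a rough illustration of the two accounting schemes (a toy model under stated assumptions, not perf's actual implementation), the sketch below attributes one sample to a call stack in two ways. The names FN_MAIN, FN_OUTER, FN_INNER, attribute_self and attribute_inclusive are hypothetical and exist only for this example; the stack is assumed to be an array of function IDs with the currently running function last.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function IDs for a main -> outer -> inner call chain. */
enum { FN_MAIN, FN_OUTER, FN_INNER, FN_COUNT };

/* Flat ("self") attribution: only the innermost frame, i.e. the
 * function that was running when the sample fired, is credited. */
void attribute_self(const int *stack, size_t depth, long counts[FN_COUNT]) {
    assert(depth > 0);
    counts[stack[depth - 1]]++;
}

/* Inclusive attribution: every distinct function on the stack is
 * credited once, so callers also accumulate their callees' samples. */
void attribute_inclusive(const int *stack, size_t depth, long counts[FN_COUNT]) {
    assert(depth > 0);
    int seen[FN_COUNT] = {0};
    for (size_t i = 0; i < depth; i++) {
        if (!seen[stack[i]]) {
            seen[stack[i]] = 1;
            counts[stack[i]]++;
        }
    }
}
```

With three samples taken while inner is running and one while outer is running, the flat scheme credits inner with 3 and outer with 1, whereas the inclusive scheme credits outer with all 4. The default perf top listing corresponds to the flat scheme.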

I was fairly sure that perf top counts each sample only against the function that is running, but I wanted to verify that this is how it displays its results, so I wrote a quick test program. The program has two functions of interest, outer and inner. The outer function calls inner in a loop, and the amount of work that inner does is controlled by an argument. When compiling, be sure to use -O0 to avoid inlining. The command-line arguments control the ratio of work between the two functions.

Running with parameters ./a.out 1 1 1000000000 gives results:

49.20%  a.out             [.] outer    
23.69%  a.out             [.] main    
21.32%  a.out             [.] inner    

Running with parameters ./a.out 1 10 1000000000 gives results:

66.06%  a.out             [.] inner    
17.77%  a.out             [.] outer    
 9.50%  a.out             [.] main    

Running with parameters ./a.out 1 100 1000000000 gives results:

88.53%  a.out             [.] inner    
 2.85%  a.out             [.] outer    
 1.09%  a.out             [.] main    

If the samples for inner were included in outer, the percentage for outer would always be at least as high as the percentage for inner. As these results show, that is not the case.

The test program I used is below; it was compiled with gcc -O0 -g --std=c11 test.c.

#include <stdlib.h>
#include <stdio.h>

/* Sums 0..count-1; the amount of work scales with count. */
long inner(int count) {
  long sum = 0;
  for(int i = 0; i < count; i++) {
    sum += i;
  }
  return sum;
}

/* Calls inner() count_out times, each call doing count_in iterations. */
long outer(int count_out, int count_in) {
  long sum = 0;
  for(int i = 0; i < count_out; i++) {
    sum += inner(count_in);
  }
  return sum;
}

int main(int argc, char **argv)  {
  if(argc < 4) {
    printf("Usage: %s <outer_cnt> <inner_cnt> <loop>\n",argv[0]);
    return 1;
  }

  int outer_cnt = atoi(argv[1]);
  int inner_cnt = atoi(argv[2]);
  int loops     = atoi(argv[3]);

  long res = 0;
  for(int i = 0; i < loops; i++) {
    res += outer(outer_cnt, inner_cnt);
  }

  /* Print the result so the computed work has an observable effect. */
  printf("res is %ld\n", res);
  return 0;
}
