Use gprof to check your code for performance issues

August 29th, 2007 mysurface Posted in Developer, gcc, gprof

After reading the article Speed your code with the GNU profiler from IBM developerWorks, I learned how to use gprof to make it easier to identify the performance bottlenecks in my modules. Here, I would like to share my experience of how I tracked down a clog in my own code.

Let us first look at the basic steps for using the GNU profiler.

To use gprof, the C/C++ code must be compiled by gcc with the -pg option. Assume the source file to be compiled is gp-test.c.

gcc -pg -g2 -o gp-test{,.c} 

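-pg enables the profiling instrumentation that gprof needs, -g2 requests debug information at level 2 (the default level, the same as plain -g), and -o specifies the name of the output binary. The curly brackets are shell brace expansion, which I use to shorten my typing: gp-test{,.c} expands to gp-test gp-test.c.

Next, run the binary. When the program exits normally, gmon.out is written to the current working directory.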

./gp-test

With gmon.out in place, you can now extract the profiling information for your code by running gprof.

gprof gp-test gmon.out > result.txt 

I like to save the results to a text file ‘result.txt’ for further comparison and analysis.

Let's look at a sample C program and try to catch the choke point.


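#include <stdio.h>

/* 10000 x 10000 ints: about 400 MB of zero-initialized static storage
   (assuming a 4-byte int). */
int twoD[10000][10000] = {0};

/* Writes one element in every row: walks down column 1. */
void update_d1(void)
{
    int i, k = 0;
    for (i = 0; i < 10000; i++)
        twoD[i][1] = k++;
}

/* Writes every element of row 1: walks along a single row. */
void update_d2(void)
{
    int i, k = 0;
    for (i = 0; i < 10000; i++)
        twoD[1][i] = k++;
}

int main(int argc, char *argv[])
{
    /* Expect exactly one argument: '1' or '2' selects the function to run. */
    if (argc != 2)
        return -1;
    if (*(argv[1]) == '1')
        update_d1();
    else if (*(argv[1]) == '2')
        update_d2();
    else
        printf("\nInvalid value %s\n", argv[1]);

    return 0;
}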

Both update_d1() and update_d2() write to the 2D array with the same number of loop iterations. Treating the 2D array as twoD[row][column], update_d1() steps through the rows, writing one element in each row, whereas update_d2() steps through the columns of a single row. It turns out that the time the two functions need differs greatly. Let's compile the program and profile it with gprof.


gcc -pg -g -o gp-test{,.c}
./gp-test 1
gprof gp-test gmon.out > t1
./gp-test 2
gprof gp-test gmon.out > t2

Observe the extracted results:


using update_d1() :
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
100.52      0.06     0.06        1    60.31    60.31  update_d1

using update_d2() :
  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
  0.00      0.00     0.00        1     0.00     0.00  update_d2

update_d1() takes 0.06 seconds, while update_d2() takes less than 0.01 seconds. Why?

Look at the 2D array again, twoD[row][column]. The twoD array is physically mapped to one large contiguous chunk of memory in row-major order, not to separate rows and columns. The first element in memory is row 0, column 0, and the first column of row 1 is actually located at the 10001st element.

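Imagine how update_d1() accesses the memory. To get from one row to the next it has to leap over 10000 elements (40000 bytes with a 4-byte int), so on typical hardware no two of its writes fall in the same cache line or memory page, whereas update_d2() accesses 10000 consecutive elements without leaping. That is the reason for the delay.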

