Use gprof to check your codes for performance issues

August 29th, 2007 mysurface Posted in Developer, gcc, gprof | Hits: 62562 | 13 Comments »

By reading the article Speed your code with the GNU profiler from IBM DevelopWorks, I have gain the knowledge of using gprof to easy my work to identify my module’s performance’s bottleneck. Here, I would like to share my experience on how I discover the clog of my codes.

Let us first look at the simple steps on how GNU profiler works.

In order to make use of gprof, the c/c++ codes must be compiled by gcc with -pg options. Assume the source code to be compiled is gp-test.c.

gcc -pg -g2 -o gp-test{,.c} 

-pg is to enable gprof, -g2 is to enable debugging mode 2, -o is to specified the output of the binaries and I am using the curly brackets to shorten my typing.

Next, run the binaries and gmon.out will be generated.


With gmon.out, now you can extract the profiling info of your codes by running gprof.

gprof gp-test gmon.out > result.txt 

I like to save the results to a text file ‘result.txt’ for further comparison and analysis.

Lets look at a sample c code, and try to catch the choke point.


int twoD[10000][10000]={0};

int update_d1()
    int i,k=0;
    for (i=0;i<10000;i++)

int update_d2()
    int i,k=0;
    for (i=0;i<10000;i++)

int main(int argc, char * argv[])
    int i,j,k=0;
    if (argc!=2)
        return -1;
    if (*(argv[1])=='1')
    else if (*(argv[1])=='2')
        printf("\nInvalid value %s\n",argv[1]);

    return 1;

Both function update_d1() and update_d2() are accessing the 2D array with same amount of loops. Assume the 3D array twoD[row][column], update_d1() accessing row, where update_d2() accessing column. We discovered that the amount of time used to complete the function are in great differences. Lets compile and profile it with gprof.

gcc -pg -g -o gp-test{,.c}
./gp-test 1
gprof gp-test gmon.out > t1
./gp-test 2
gprof gp-test gmon.out > t2

Observed the extracted results

using update_d1() :
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
100.52      0.06     0.06        1    60.31    60.31  update_d1

using update_d2() :
  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
  0.00      0.00     0.00        1     0.00     0.00  update_d2

update_d1() uses 0.06 seconds, and update_d2() uses less than 0.01 seconds, Why?

Look at the 2D array again, twoD[row][column]. The twoD array is physically map to large one chunk of memory instead of rows and columns. The first block of memory is begins with row 0 column 1, the first column of row 1 is actually located at 10001th block.

Imagine how update_d1() accessing the memory. By accessing each row, it has to leap over 10000 blocks, where update_d2() consequently access 10000 blocks without leaping. Thats the reason of the delays.


13 Responses to “Use gprof to check your codes for performance issues”

  1. simple and good example for gprof command

  2. gud work……

  3. great tip. I like the idea that you can call mount your root partition with simply: LABEL=Root too

  4. i like your blog, continue

  5. Nice! Now that I know WHY I keep finding a gmon.out in my home directory and what it IS, I just need to figure out what the heck binary is making it :)
    Thank you for your site; it has helped me quickly and concisely.

  6. best site for learning gprof

  7. good example keep it up

  8. Ghassen Smaoui Says:

    Very helpful. lots of thanks

  9. Ghassen Smaoui Says:

    if someone can tell me how does gprof2dot works

  10. Reza Akbari Says:

    very very gooooooooooooood

  11. It was hard for me to find your website in google search results.
    I found it on 16 place, you have to build a lot of quality backlinks , it
    will help you to increase traffic. I know how to help you, just search in google – k2 seo

  12. If an individual has non-Hodgkin’s lymphoma for example, the liver and
    spleen grow in size as the cancer progresses.
    (3) Vaccines have alo taken on an undeserved mythical quality.
    If you need additional help, executive or technical presentation, please feel free to call
    us: 1-866-528-0577, We speak your language: English, Spanish, Portuguese, Chinese, Russian, Latvian, French.

    Look intto my homepage ??????????????

  13. Each game is played on a big, illuminated screen with bold
    graphics. When you choose between laptops,
    a big reason why you choose one in preference to another is because you like the look of it.
    For those who are hard-core fans of baccarat, the 12Pearl Club is
    the best ticket to experience the game at its most spectacular.

Leave a Reply