Compiler optimization and its effect on debugger line information

Learn how to decipher the output of debugging line information, and what tools you can use to look deeper.
2 readers like this.
magnifying glass on computer screen, finding a bug in the code

Opensource.com

In my previous article, I described the DWARF information used to map regular and inlined functions between an executable binary and its source code. Functions can be dozens of lines, so you might like to know specifically where the processor is in your source code. The compiler includes information mapping between instructions and specific lines in the source code to provide a precise location. In this article, I describe line mapping information, and some of the issues caused by compiler optimizations.

Start with the same example code from the previous article:

#include <stdlib.h>
#include <stdio.h>

int a;
double b;

int
main(int argc, char* argv[])
{
	a = atoi(argv[1]);
	b = atof(argv[2]);
	a = a + 1;
	b = b / 42.0;
	printf ("a = %d, b = %f\n", a, b);
	return 0;
}

The compiler only includes the line mapping information when the code is compiled with debugging information enabled (the -g option):

$ gcc -O2 -g example.c -o example

Examining line number information

Line information is stored in a machine readable format, but human readable output can be generated with llvm-objdump or odjdump.

$ llvm-objdump --line-numbers example

For the main function, you get output listing the assembly code instruction with the file and line number associated with the instruction:

0000000000401060 <main>:
; main():
; /home/wcohen/present/202207youarehere/example.c:9
  401060: 53                      		 pushq    %rbx
; /usr/include/stdlib.h:364
  401061: 48 8b 7e 08             		 movq    8(%rsi), %rdi
; /home/wcohen/present/202207youarehere/example.c:9
  401065: 48 89 f3                		 movq    %rsi, %rbx
; /usr/include/stdlib.h:364
  401068: ba 0a 00 00 00          		 movl    $10, %edx
  40106d: 31 f6                   		 xorl    %esi, %esi
  40106f: e8 dc ff ff ff          		 callq    0x401050 <strtol@plt>
; /usr/include/bits/stdlib-float.h:27
  401074: 48 8b 7b 10             		 movq    16(%rbx), %rdi
  401078: 31 f6                   		 xorl    %esi, %esi
; /usr/include/stdlib.h:364
  40107a: 89 05 c8 2f 00 00       		 movl    %eax, 12232(%rip)   	# 0x404048 <a>
; /usr/include/bits/stdlib-float.h:27
  401080: e8 ab ff ff ff          		 callq    0x401030 <strtod@plt>
; /home/wcohen/present/202207youarehere/example.c:12
  401085: 8b 05 bd 2f 00 00       		 movl    12221(%rip), %eax   	# 0x404048 <a>
; /home/wcohen/present/202207youarehere/example.c:14
  40108b: bf 10 20 40 00          		 movl    $4202512, %edi      	# imm = 0x402010
; /home/wcohen/present/202207youarehere/example.c:13
  401090: f2 0f 5e 05 88 0f 00 00 		 divsd    3976(%rip), %xmm0   	# 0x402020 <__dso_handle+0x18>
  401098: f2 0f 11 05 a0 2f 00 00 		 movsd    %xmm0, 12192(%rip)  	# 0x404040 <b>
; /home/wcohen/present/202207youarehere/example.c:12
  4010a0: 8d 70 01                		 leal    1(%rax), %esi
; /home/wcohen/present/202207youarehere/example.c:14
  4010a3: b8 01 00 00 00          		 movl    $1, %eax
; /home/wcohen/present/202207youarehere/example.c:12
  4010a8: 89 35 9a 2f 00 00       		 movl    %esi, 12186(%rip)   	# 0x404048 <a>
; /home/wcohen/present/202207youarehere/example.c:14
  4010ae: e8 8d ff ff ff          		 callq    0x401040 <printf@plt>
; /home/wcohen/present/202207youarehere/example.c:16
  4010b3: 31 c0                   		 xorl    %eax, %eax
  4010b5: 5b                      		 popq    %rbx
  4010b6: c3

The first instruction at 0x401060 maps to the original source code file example.c line 9, the opening { for the main function.

The next instruction 0x401061 maps to line 364 of stdlib.h line 364, the inlined atoi function. This is setting up one of the arguments to the later strtol call.

The instruction 0x401065 is also associated with the opening { of the main function.

Instructions 0x401068 and 0x40106d set the remaining arguments for the strtol call that takes place at 0x40106f. In this case, you can see that the compiler has reordered the instructions and causes some bouncing between line 9 of example.c and line 364, or the stdlib.h include file, as you step through the instructions on the debugger.

You can also see some mixing of instructions for lines 12, 13, and 14 from example.c in the output of llvm-objdump above. The compiler has moved the divide instructions (0x40190) for line 13 before some of the instructions for line 12 to hide the latency of the divide. As you step through the instructions in the debugger for this code, you see the debugger jump back and forth between lines rather than doing all the instructions from one line before moving on to the next line. Also notice as you step though that line 13 with the divide operation was not shown, but the divide definitely occurred to produce the output. You can see GDB bouncing between lines when stepping through the  program's main function:

(gdb) run 1 2
Starting program: /home/wcohen/present/202207youarehere/example 1 2
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 1, main (argc=3, argv=0x7fffffffdbe8) at /usr/include/stdlib.h:364
364      return (int) strtol (__nptr, (char **) NULL, 10);
(gdb) print $pc
$10 = (void (*)()) 0x401060 <main>
(gdb) next
10   	 a = atoi(argv[1]);
(gdb) print $pc
$11 = (void (*)()) 0x401061 <main+1>
(gdb) next
11   	 b = atof(argv[2]);
(gdb) print $pc
$12 = (void (*)()) 0x401074 <main+20>
(gdb) next
10   	 a = atoi(argv[1]);
(gdb) print $pc
$13 = (void (*)()) 0x40107a <main+26>
(gdb) next
11   	 b = atof(argv[2]);
(gdb) print $pc
$14 = (void (*)()) 0x401080 <main+32>
(gdb) next
12   	 a = a + 1;
(gdb) print $pc
$15 = (void (*)()) 0x401085 <main+37>
(gdb) next
14   	 printf ("a = %d, b = %f\n", a, b);
(gdb) print $pc
$16 = (void (*)()) 0x4010ae <main+78>
(gdb) next
a = 2, b = 0.047619
15   	 return 0;
(gdb) print $pc
$17 = (void (*)()) 0x4010b3 <main+83>

With this simple example, you can see that the order of instructions does not match the original source code. When the program is running normally, you would never observe those changes. However, they are quite visible when using a debugger to step through the code. The boundaries between lines of code become blurred. This has other implications. When you decide to set a breakpoint to a line following a line with variable update, the compiler scheduler may have moved the variable after the location you expect the variable to be updated, and you don’t get the expected value for the variable at the breakpoint.

Which of the instructions for a line get the breakpoint?

With the previous example.c, the compiler generated multiple instructions to implement individual lines of code. How does the debugger know which of those instructions should be the one that it places the breakpoint on? There’s an additional statement flag in the line information that marks the recommended locations to place the breakpoints. You can see those instructions marked with S in the column below SBPE in eu-readelf --debug-dump=decodedline example:

DWARF section [31] '.debug_line' at offset 0x50fd:

 CU [c] example.c
  line:col SBPE* disc isa op address (Statement Block Prologue Epilogue *End)
  /home/wcohen/present/202207youarehere/example.c (mtime: 0, length: 0)
     9:1   S        0   0  0 0x0000000000401060 <main>
    10:2   S        0   0  0 0x0000000000401060 <main>
  /usr/include/stdlib.h (mtime: 0, length: 0)
   362:1   S        0   0  0 0x0000000000401060 <main>
   364:3   S        0   0  0 0x0000000000401060 <main>
  /home/wcohen/present/202207youarehere/example.c (mtime: 0, length: 0)
     9:1            0   0  0 0x0000000000401060 <main>
  /usr/include/stdlib.h (mtime: 0, length: 0)
   364:16           0   0  0 0x0000000000401061 <main+0x1>
   364:16           0   0  0 0x0000000000401065 <main+0x5>
  /home/wcohen/present/202207youarehere/example.c (mtime: 0, length: 0)
     9:1            0   0  0 0x0000000000401065 <main+0x5>
  /usr/include/stdlib.h (mtime: 0, length: 0)
   364:16           0   0  0 0x0000000000401068 <main+0x8>
   364:16           0   0  0 0x000000000040106f <main+0xf>
   364:16           0   0  0 0x0000000000401074 <main+0x14>
  /usr/include/bits/stdlib-float.h (mtime: 0, length: 0)
    27:10           0   0  0 0x0000000000401074 <main+0x14>
  /usr/include/stdlib.h (mtime: 0, length: 0)
   364:10           0   0  0 0x000000000040107a <main+0x1a>
  /home/wcohen/present/202207youarehere/example.c (mtime: 0, length: 0)
    11:2   S        0   0  0 0x0000000000401080 <main+0x20>
  /usr/include/bits/stdlib-float.h (mtime: 0, length: 0)
    25:1   S        0   0  0 0x0000000000401080 <main+0x20>
    27:3   S        0   0  0 0x0000000000401080 <main+0x20>
    27:10           0   0  0 0x0000000000401080 <main+0x20>
    27:10           0   0  0 0x0000000000401085 <main+0x25>
  /home/wcohen/present/202207youarehere/example.c (mtime: 0, length: 0)
    12:2   S        0   0  0 0x0000000000401085 <main+0x25>
    12:8            0   0  0 0x0000000000401085 <main+0x25>
    14:2            0   0  0 0x000000000040108b <main+0x2b>
    13:8            0   0  0 0x0000000000401090 <main+0x30>
    13:4            0   0  0 0x0000000000401098 <main+0x38>
    12:8            0   0  0 0x00000000004010a0 <main+0x40>
    14:2            0   0  0 0x00000000004010a3 <main+0x43>
    12:4            0   0  0 0x00000000004010a8 <main+0x48>
    13:2   S        0   0  0 0x00000000004010ae <main+0x4e>
    14:2   S        0   0  0 0x00000000004010ae <main+0x4e>
    15:2   S        0   0  0 0x00000000004010b3 <main+0x53>
    16:1            0   0  0 0x00000000004010b3 <main+0x53>
    16:1            0   0  0 0x00000000004010b6 <main+0x56>
    16:1       *    0   0  0 0x00000000004010b6 <main+0x56>

  • Groups of instructions are delimited by the path to the source file for those instructions.
  • The left column contains the line number and column that the instruction maps back to, followed by the flags.
  • The hexadecimal number is the address of the instruction, followed by the offset into the function of the instruction.

If you look carefully at the output, you see that some instructions map back to multiple lines in the code. For example, 0x0000000000401060 maps to both line 9 and 10 of example.c. The same instruction also maps to lines 362 and 364 of /usr/include/stdlib.h. The mappings are not one-to-one. One line of source code may map to multiple instructions, and one instruction may map to multiple lines of code. When the debugger decides to print out a single line mapping for an instruction, it might not be the one that you expect.

Merging and eliminating of lines

As you saw in the output of the detailed line mapping information, mappings are not one-to-one. There are cases where the compiler can eliminate instructions because they have no effect on the final result of the program. The compiler may also merge instructions from separate lines through optimizations, such as common subexpression elimination (CSE), and omit that the instruction could have come from more than one place in the code.

The following example was compiled on an x86_64 Fedora 36 machine, using GCC-12.2.1. Depending on the particular environment, you may not get the same results, because different versions of compilers may optimize the code differently.

Note the if-else statement in the code. Both have statements doing the same expensive divides. The compiler factors out the divide operation.

#include <stdlib.h>
#include <stdio.h>

int
main(int argc, char* argv[])
{
    int a,b,c;
    a = atoi(argv[1]);
    b = atoi(argv[2]);
    if (b) {
   	 c = 100/a;
    } else {
   	 c = 100/a;
    }
    printf ("a = %d, b = %d, c = %d\n", a, b, c);
    return 0;
}

Looking at objdump -dl whichline, you see one divide operation in the binary:

/home/wcohen/present/202207youarehere/whichline.c:13
  401085:    b8 64 00 00 00  		 mov	$0x64,%eax
  40108a:    f7 fb           		 idiv   %ebx

Line 13 is one of the lines with a divide, but you might suspect that there are other line numbers associated with those addresses. Look at the output of eu-readelf --debug-dump=decodedline whichline to see whether there are other line numbers associated with those addresses.

Line 11, where the other divide occurs, is not in this list:

  /usr/include/stdlib.h (mtime: 0, length: 0)
   364:16       0   0  0 0x0000000000401082 <main+0x32>
   364:16       0   0  0 0x0000000000401085 <main+0x35>
  /home/wcohen/present/202207youarehere/whichline.c (mtime: 0, length: 0)
   10:2   S    	0   0  0 0x0000000000401085 <main+0x35>
   13:3   S    	0   0  0 0x0000000000401085 <main+0x35>
   15:2   S    	0   0  0 0x0000000000401085 <main+0x35>
   13:5        	0   0  0 0x0000000000401085 <main+0x35>

If the results are unused, the compiler may completely eliminate generating code for some lines.

Consider the following example, where the else clause computes c = 100 * a, but does not use it:

#include <stdlib.h>
#include <stdio.h>

int
main(int argc, char* argv[])
{
    int a,b,c;
    a = atoi(argv[1]);
    b = atoi(argv[2]);
    if (b) {
   	 c = 100/a;
   	 printf ("a = %d, b = %d, c = %d\n", a, b, c);
    } else {
   	 c = 100 * a;
   	 printf ("a = %d, b = %d\n", a, b);
    }
    return 0;
}

Compile eliminate.c with GCC:

$ gcc -O2 -g eliminate.c -o eliminate

When looking through the output generated by objdump -dl eliminate, there’s no sign of the multiplication for 100 * a (line 14) of eliminate.c. The compiler has determined that the value was not used and eliminated it.

When looking through the output of objdump -dl eliminate, there is no:

/home/wcohen/present/202207youarehere/eliminate.c:14

Maybe it’s hidden as one of the other views of line information. You can use eu-readelf with the --debug-dump option to get a complete view of the line information:

$ eu-readelf --debug-dump=decodedline eliminate > eliminate.lines

It turns out that GCC did record some mapping information. It seems that 0x4010a5 maps to the multiplication statement, in addition to the printf at line 15:

 /home/wcohen/present/202207youarehere/eliminate.c (mtime: 0, length: 0)
…
	18:1        	0   0  0 0x00000000004010a4 <main+0x54>
	14:3   S    	0   0  0 0x00000000004010a5 <main+0x55>
	15:3   S    	0   0  0 0x00000000004010a5 <main+0x55>
	15:3        	0   0  0 0x00000000004010b0 <main+0x60>
	15:3   	*	0   0  0 0x00000000004010b6 <main+0x66>

Optimization affects line information

The line information included into compiled binaries is helpful when pinpointing where in code the processor is. However, optimization can affect the line information, and what you see when debugging the code.

When using a debugger, expect that boundaries between lines of code are fuzzy, and the debugger is likely to bounce between them when stepping through the code. An instruction might map to multiple lines of source code, but the debugger may only reports one. The compiler may entirely eliminate instructions associated with a line of code, and it may or may not include line mapping information. The line information generated by the compiler is helpful, but keep in mind that something might be lost in translation.

What to read next
Will Cohen with sunflowers
William Cohen has been a developer of performance tools at Red Hat for over a decade and has worked on a number of the performance tools in Red Hat Enterprise Linux and Fedora such as OProfile, PAPI, SystemTap, and Dyninst.

Comments are closed.

Creative Commons LicenseThis work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.