A programmer's guide to GNU C Compiler

Get a behind-the-scenes look at the steps it takes to produce a binary file so that when something goes wrong, you know how to step through the process to resolve problems.

Opensource.com

C is a well-known programming language, popular with experienced and new programmers alike. Source code written in C uses standard English terms, so it's considered human-readable. However, computers only understand binary code. To convert code into machine language, you use a tool called a compiler.

A very common compiler is GCC (the GNU Compiler Collection, originally named the GNU C Compiler). Compiling with GCC involves several intermediate steps and a few adjacent tools.

Install GCC

To confirm whether GCC is already installed on your system, use the gcc command:

$ gcc --version

If necessary, install GCC using your package manager. On Fedora-based systems, use dnf:

$ sudo dnf install gcc libgcc

On Debian-based systems, use apt:

$ sudo apt install build-essential

After installation, if you want to check where GCC is installed, then use:

$ whereis gcc

Simple C program using GCC

Here's a simple C program to demonstrate how to compile code using GCC. Open your favorite text editor and paste in this code:

// hellogcc.c
#include <stdio.h>

int main(void) {
    printf("Hello, GCC!\n");
    return 0;
}

Save the file as hellogcc.c and then compile it:

$ ls
hellogcc.c

$ gcc hellogcc.c

$ ls -1
a.out
hellogcc.c

As you can see, a.out is the default executable generated as a result of compilation. To see the output of your newly-compiled application, just run it as you would any local binary:

$ ./a.out
Hello, GCC!

Name the output file

The filename a.out isn't very descriptive, so if you want to give a specific name to your executable file, you can use the -o option:

$ gcc -o hellogcc hellogcc.c

$ ls
a.out  hellogcc  hellogcc.c

$ ./hellogcc
Hello, GCC!

This option is useful when developing a large application that needs to compile multiple C source files.

Intermediate steps in GCC compilation

There are actually four steps to compiling, even though GCC performs them automatically in simple use-cases.

  1. Pre-Processing: The GNU C Preprocessor (cpp) inserts the contents of header files (#include directives), expands macros (#define directives), strips comments, and generates an intermediate file such as hellogcc.i with expanded source code.
  2. Compilation: During this stage, the compiler converts pre-processed source code into assembly code for a specific CPU architecture. The resulting assembly file is named with a .s extension, such as hellogcc.s in this example.
  3. Assembly: The assembler (as) converts the assembly code into machine code in an object file, such as hellogcc.o.
  4. Linking: The linker (ld) links the object code with the library code to produce an executable file, such as hellogcc.

When running GCC, use the -v option to see each step in detail.

$ gcc -v -o hellogcc hellogcc.c

[Figure: Compiler flowchart (Jayashree Huttanagoudar, CC BY-SA 4.0)]

Manually compile code

It can be useful to experience each step of compilation because, under some circumstances, you don't need GCC to go through all the steps.

First, delete the files generated by GCC in the current folder, except the source file.

$ rm a.out hellogcc

$ ls
hellogcc.c

Pre-processor

First, start the pre-processor, redirecting its output to hellogcc.i:

$ cpp hellogcc.c > hellogcc.i

$ ls
hellogcc.c  hellogcc.i

Take a look at the output file and notice how the pre-processor has included the headers and expanded the macros.

Compiler

Now you can compile the code into assembly. Use the -S option to tell GCC to stop after producing assembly code.

$ gcc -S hellogcc.i

$ ls
hellogcc.c  hellogcc.i  hellogcc.s

$ cat hellogcc.s

Take a look at the assembly code to see what's been generated.

Assembly

Use the assembly code you've just generated to create an object file:

$ as -o hellogcc.o hellogcc.s

$ ls
hellogcc.c  hellogcc.i  hellogcc.o  hellogcc.s

Linking

To produce an executable file, you must link the object file to the libraries it depends on. This isn't quite as easy as the previous steps, but it's educational:

$ ld -o hellogcc hellogcc.o
ld: warning: cannot find entry symbol _start; defaulting to 0000000000401000
ld: hellogcc.o: in function `main':
hellogcc.c:(.text+0xa): undefined reference to `puts'

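The linker reports two problems. The warning about _start appears because no startup object was linked, and the undefined reference to puts appears because the linker was never told to link against the C library (libc), which provides that function. (The source calls printf, but GCC commonly replaces a printf call that prints a constant string ending in a newline with the simpler puts.) To resolve this, you must find suitable linker options to link the required libraries. This is no small feat, and it depends on how your system is laid out.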

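When linking, you must link your code with the C runtime (CRT) startup objects, a set of routines that help binary executables launch. One of them, crt1.o, provides the _start entry point that the warning above complains about. The linker also needs to know where to find important system libraries, including libc and libgcc. Because these libraries can refer to each other, they're listed between the --start-group and --end-group options, which tell the linker to search the enclosed archives repeatedly until no more undefined references can be resolved. GCC's own link command typically also adds crtbegin.o and crtend.o from its installation directory.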

This example uses paths as they appear on a RHEL 8 install, so you may need to adapt the paths depending on your system.

$ ld -dynamic-linker \
/lib64/ld-linux-x86-64.so.2 \
-o hellogcc \
/usr/lib64/crt1.o /usr/lib64/crti.o \
--start-group \
-L/usr/lib/gcc/x86_64-redhat-linux/8 \
-L/usr/lib64 -L/lib64 hellogcc.o \
-lgcc \
--as-needed -lgcc_s \
--no-as-needed -lc -lgcc \
--end-group \
/usr/lib64/crtn.o

The same linker procedure on Slackware uses a different set of paths, but you can see the similarity in the process:

$ ld -static -o hellogcc \
-L/usr/lib64/gcc/x86_64-slackware-linux/11.2.0/ \
/usr/lib64/crt1.o /usr/lib64/crti.o \
hellogcc.o /usr/lib64/crtn.o \
--start-group -lc -lgcc -lgcc_eh \
--end-group

Now run the resulting executable:

$ ./hellogcc
Hello, GCC!

Some helpful utilities

Below are a few utilities that help examine the file type, symbol table, and the libraries linked with the executable.

Use the file utility to determine the type of file:

$ file hellogcc.c
hellogcc.c: C source, ASCII text

$ file hellogcc.o
hellogcc.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped

$ file hellogcc
hellogcc: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=bb76b241d7d00871806e9fa5e814fee276d5bd1a, for GNU/Linux 3.2.0, not stripped

Then use the nm utility to list the symbol table of an object file:

$ nm hellogcc.o
0000000000000000 T main
                          U puts

Use the ldd utility to list the shared libraries that a dynamically linked executable depends on:

$ ldd hellogcc
linux-vdso.so.1 (0x00007ffe3bdd7000)
libc.so.6 => /lib64/libc.so.6 (0x00007f223395e000)
/lib64/ld-linux-x86-64.so.2 (0x00007f2233b7e000)

Wrap up

In this article, you learned the intermediate steps in GCC compilation and the utilities to examine the file type, symbol table, and libraries linked with an executable. The next time you use GCC, you'll understand the steps it takes to produce a binary file for you, and when something goes wrong, you'll know how to step through the process to resolve problems.

Jayashree Huttanagoudar is a software engineer at Red Hat India Pvt. Ltd. She works with the Middleware OpenJDK team and is always curious to learn new things that add to her work.
This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.