At some point, your C program is going to grow. Maintaining all that code in a single file is painful. What is more, you may be using several libraries, and typing all those flags every time you want to compile your program also becomes painful. When you get to that point, it is time to Make and Split.
Hello World
We will use a simple hello world program to introduce the basics of writing Makefiles and of splitting your program into different files to keep it tidy and easy to maintain. Let’s start with some code:
#include <stdio.h>
int
greetings (char *str)
{
printf ("Hello %s\n", str);
return 0;
}
int
main (int argc, char *argv[])
{
greetings ("World!");
return 0;
}
This case is very simple and you can compile it just by typing make helloworld or something like gcc -o helloworld helloworld.c. However, for educational purposes, we are going to write our own Makefile to build this simple program.
A Makefile
A Makefile
for our example may look like this:
helloworld: helloworld.c
	${CC} ${CFLAGS} -o $@ $<
.PHONY: clean
clean:
	rm helloworld
Makefiles are usually named Makefile. If you just run make, the tool will look for that file in the current directory. If you want to give it some fancy name, then you need to use the -f flag followed by the name you chose: make -f fancy_make_file.mk.
OK, the first thing you need to know about Makefiles
is that they are a list of rules describing how to produce a file from another file or set of files. Rules follow the structure depicted below:
target: dependencies
<TAB>command to run
Let’s take a closer look at the first rule in our Makefile to understand how those rules work:
- target. This is the name of the file you want to generate. In this case it is helloworld, the name of the executable we want to build.
- dependencies. This is the list of files required to generate the file specified by the target. Whenever one of those dependencies changes (the file is modified), make will know that something has changed and the rule will be fired again. If none have changed, make will just let you know that there is nothing to do. In our example, whenever we change helloworld.c and we run make, the commands below will be executed.
- command to run. If the target file does not exist, or any of the dependencies has changed, the commands specified in this block will be executed. This is usually a compilation command that generates the target file. All those commands have to be prefixed with a <TAB> character… a bunch of spaces will not work. Let’s look in detail at how we build the compilation command and why we did it like that.
Environmental variables
The compilation command in our helloworld rule makes use of some default environment variables and some make built-in variables.
- The CC variable is the standard way to specify the compiler to use. CC stands for C compiler. This variable is usually unset in the environment and, in that case, make will use its default value, cc, the traditional name of the C compiler on UNIX systems. On modern GNU/Linux systems, cc is usually a link to gcc, maybe through the alternatives system.
- The CFLAGS variable is also a standard variable, used to specify the compiler flags we want to use to build our executable.
You may be wondering what standard means in the bullets above. Well, let’s explain this. make provides some implicit rules to make our lives easier. These rules are the ones that let us run make helloworld to just compile our program. Let’s see:
$ rm helloworld; make helloworld
cc -o helloworld helloworld.c
As we can see, make uses cc as the default compiler, and it also uses the content of the variable CFLAGS to set the compiler options… well, in this case the variable is empty, so we just see an extra space. Let’s put something in that variable:
$ rm helloworld; CFLAGS="-static" make helloworld
cc -static -o helloworld helloworld.c
Voilà! We have built a static binary just by setting the CFLAGS variable. Try changing CC to gcc or, even better, to arm-linux-gnueabi-gcc. Wow, you are cross-compiling your program now…
When we write our own rules in a makefile, this default behaviour is overridden, and if we want to keep it (and we should) we have to manually add these variables to the rule. That is why we use:
${CC} ${CFLAGS} -o $@ $<
instead of
gcc -static -o helloworld helloworld.c
The second is also a valid command. You can just add it to the makefile and it will work. However, it does not really take advantage of the make tool, because the compiler, the flags and the file names are all hard-coded.
Make Built-in Variables
make is a powerful tool and it defines some internal variables, including the so-called automatic variables, whose value depends on the rule being run. In our simple example we are using two of those variables:
- $@. This variable represents the target of the current rule. In our example it is the same as typing helloworld, the name of the binary. The advantage of this is that you can easily change the name of your binary in just one place.
- $<. This variable represents the first dependency, that is, the first file after the colon (helloworld.c in our example). It is also a convenient way to save some keystrokes.
PHONY commands
In general, make reads the Makefile and, unless told otherwise, builds the first target it finds, firing that rule (and the rules of its dependencies) whenever the target does not exist or the dependencies have changed since the last build (the modification time of a dependency is later than the modification time of the target).
However, we will want to fire some rules manually. Two classic examples of those rules are clean and install. These targets do not produce a file named after them, so we declare them as so-called .PHONY targets: make will then run their commands every time we ask for them, even if a file named clean or install happens to exist. To fire those rules we have to explicitly name them on the command line. For instance, to run our clean rule, which deletes the binary, we should run:
make clean
In this example we just delete the executable. In a bigger project you may need to also delete intermediate object files, libraries,…
Well, this is a pretty basic introduction to the make tool. In most cases it is enough for working with small projects. For bigger projects you are better off using a build system like GNU autotools or CMake.
Splitting
So, to finish with this introduction on how to go from small demo programs to mini projects ;), we need to know the basics of how to split our program into pieces so it is easier to deal with. Again, we are going to split this minimal hello world program into pieces just for educational purposes. In general, files above a few thousand lines should be split but, as usual, in the end this is a matter of personal taste.
We are going to move our greetings function into a separate file, create a header file to be able to properly access the function, and change our Makefile to compile everything together.
Let’s start by moving the function into a new file called greet.c.
#include <stdio.h>
#include "greet.h"
int
greetings (const char *str)
{
printf ("Hello %s\n", str);
return 0;
}
As you can see, we have to keep the stdio.h include in this file because the code is using printf (stdio.h contains the declaration of that function). We are also including the new header for our component/module. Strictly speaking, in this case it is not necessary, but it lets the compiler check that the prototype and the implementation agree and, in general, if you are chopping up a big program, you will surely need to include your specific header in the .c file (the implementation) as well.
In this case we are using quotes to include greet.h because we do not want this file to be installed in the system. We will just use it during the compilation of our program, and it will not be used anywhere else. Using the quotes instructs the pre-processor to look for the file first in the directory of the file that includes it, and only then in the system directories (such as /usr/include).
The Interface
Now we have a C file with the code we want to split, but we need an interface so this code can be invoked from other files. This is what the header files are for (among other things). So, let’s write a basic general header file for our greetings
function:
#ifndef GREET_HEADER
#define GREET_HEADER
#ifdef __cplusplus
extern "C" {
#endif
int greetings (const char *str);
#ifdef __cplusplus
}
#endif
#endif
Here we see quite some pre-processor stuff. The first two lines and the last one (the so-called include guard) are intended to avoid multiple inclusions of the file.
So, the header file instructs the pre-processor to check if the macro GREET_HEADER has been defined. Note that a macro name must be a valid C identifier, so it cannot start with a digit. If it is defined, meaning that this file has already been included, then the whole file is skipped. There is no need to process it again. If it is not defined, then the file is processed, and the first thing it does is to define the GREET_HEADER macro, so future includes will be discarded.
Towards the middle of the file we see the prototype of our function, the one we are moving into a different file. This is all we need in this simple case. In a real project you may have quite some functions defined here as well as a bunch of data types used by those functions. Whenever you have to do this, you will know what has to be included here.
As you can see, a C prototype is just the declaration of the function: its return type, name and parameter list, followed by a semicolon instead of the code.
The main program
The main program will now look like this:
#include "greet.h"
int
main (int argc, char *argv[])
{
greetings ("World!");
return 0;
}
Now, we do not need stdio.h anymore, as we are not calling printf from here (in this example). We need our new greet.h, which contains the declaration of the greetings function we are calling from main.
Now you can compile the program with
gcc -o hello hello.c greet.c
Or change your current Makefile
Improving our Makefile
If you tried to change your makefile, you may have noticed that $< only expands to the first file in the dependency list of the rule. However, now we have two files. One way to solve this is to use a variable in our Makefile that lists all the source files we want to use. Something like this:
SRC= hello.c greet.c
hello1: $(SRC)
	${CC} ${CFLAGS} -o $@ ${SRC}
This works but, unfortunately, it is not the best way to do it: every source file is recompiled whenever any of them changes. I will leave the good Makefile for you to try or maybe for the comments, because there is still one thing we have to discuss and this post is already quite long.
Using C code in a C++ application
You may have noticed that I skipped a couple of pre-processor commands in our greet.h file. I reserved those for the very end. Those lines are intended for mixing C and C++ code. This is a bit difficult to explain, but I will try my best. I will get the Makefile out of the scene now so we can focus on what is going on with the pre-processor.
First we will recompile our simple program using g++
instead of gcc
.
$ g++ -o hellocpp hello.c greet.c
$ nm hellocpp | grep greet
000000000040054d T greetings
We see that our function gets into the binary with the proper name greetings
. Now, let’s remove the extra pre-processor lines from greet.h
. It will look like this now:
#ifndef GREET_HEADER
#define GREET_HEADER
int greetings (const char *str);
#endif
And if we try to recompile our program:
$ g++ -o hellocpp hello.c greet.c
/tmp/ccENSxa4.o: In function `main':
hello.c:(.text+0x15): undefined reference to `greetings(char*)'
collect2: error: ld returned 1 exit status
The linker cannot find the function now. Let’s see what is going on by compiling just the main program, without linking it, to avoid the linking error:
$ g++ -c -o hellocpp.o hello.c
$ nm hellocpp.o | grep greet
U _Z9greetingsPKc
Yes, that is a pretty strange name for our greetings function. The characters around our function name encode its signature (here, PKc stands for a pointer to const char). This technique is known as name mangling, and it is how C++ supports function overloading, among other things. In other words, this is why you can define functions in C++ with the same name but a different list of parameters. Mangled type names also show up in other fancy things C++ can do, such as RTTI (Run-Time Type Identification).
We could talk a lot about this, about the ABI issues it has caused in C++ over the years, and more… but I will not bother you with this now.
The missing preprocessor lines
I will reproduce here the relevant part of the header file for the reader’s convenience.
#ifdef __cplusplus
extern "C" {
#endif
int greetings (const char *str);
#ifdef __cplusplus
}
#endif
The pre-processor lines above check if the __cplusplus macro is defined. This macro is defined by every C++ compiler, and this is the way for source code to know whether it is being compiled by a C or by a C++ compiler. So, if we are using a C++ compiler, the function prototype ends up wrapped in an extern "C" { ... } block. This tells the C++ compiler that the function has C linkage, so its name must not be mangled and it can be found under its plain name, greetings.
You may be wondering why you should care about this. Well, many important libraries in the system are written in C. When you code in C++ and need to use one of those libraries, you will face this problem if the library headers are not properly written. So if you write some C code (sometimes this is the only way) and you want your C++ mates to use your code, you should add those lines to your header.
Conclusions
I would say this is the very basics to use make
and split your code in different files. There are a lot of other things to learn from here, but I believe that from this point on you can do it yourself. Just RTFM.
https://www.gnu.org/software/make/manual/html_node/index.html#SEC_Contents
To finish, as an example that all this is pretty common, take a look, for instance, at the beginning of /usr/include/pcap/pcap.h.
Hack Fun!