Improving Makefile Quality

Paul Sander

paul@wakawaka.com

May 12, 2018

Make is the original build tool in the standard Unix environment to coordinate the compilation of medium- to large-scale projects, and to perform other computations during a software build. It is regarded as a tool that is easy to learn but hard to master. Programs written in its language are notoriously prone to error and are difficult to debug. Typical errors include rebuilding artifacts that are already up to date, failing to build artifacts that are out of date, and building artifacts in an unexpected order (thus producing incorrect build results). This paper identifies several sources of these errors and provides solutions that correct them.

Background

The Make programming language has both declarative and procedural aspects. A source file containing the Make programming language, or Makefile, models software artifacts and the relationships between them using a directed acyclic graph, called a dependency graph, which is declarative in nature. The procedural aspect is a collection of routines that consume some set of input artifacts to produce an output artifact.

In a Makefile, rules represent each node of the dependency graph. Each rule identifies one software artifact, or target, which must almost always be a regular file. Each rule has associated with it a recipe that produces the file, and a prerequisite list that identifies files that must exist and be up to date (i.e., newer than their prerequisites, in turn) before the recipe runs. The recipe is usually a script written in the system’s command shell language. The prerequisite list is a flat list of files that are consumed by the recipe to produce the node’s target artifact. The prerequisite list must be complete, meaning that it must exhaustively identify every input consumed by the recipe. In most cases, the prerequisite list is also minimal, meaning that it contains nothing extra.

Most of the errors observed when running Make fall into one of three general categories:

A target is needlessly rebuilt (which indicates an unneeded entry in the prerequisite list).
An out of date target is not rebuilt (which indicates a missing entry in the prerequisite list).
Targets are built in an unexpected order (which is due to one or more of the above two conditions).

This paper will identify several common coding practices that lead to these conditions, and provide patterns and examples to code correct solutions.

The examples presented in this paper assume that pattern rules are available. Pattern rules are a common feature in newer variations of Make, but they are not part of the POSIX standard, nor of the Single UNIX Specification. The Make programs provided on BSD-based platforms conform more strictly to the standard and do not implement pattern rules. Gnu Make implements pattern rules and will operate on many systems.

Review of Make Features

This section provides a brief review of Make’s features and some widespread conventions. Most readers should already be familiar with this material, but it is included here to provide a baseline upon which the main topic can build. This is not a comprehensive tutorial of all of Make’s features.

Basic Makefile Rules

When the Make tool is invoked by the user, it looks in the present working directory for a file named “Makefile” or “makefile”, and executes a program written in the Make programming language. The name of the file can also be specified on the command line, using the "-f" command line option.

The basic syntax of a Makefile rule is shown in Figure 1. The target identifies the file that is produced by the rule. The recipe is a shell script that produces the target. The prerequisites are the files consumed by the recipe to produce the target, and they must be up to date before the recipe executes. A target is up to date if its inode modification time (or mod time) in the filesystem is newer than those of its prerequisites. Exactly one target can be built by any rule. When the rule executes, the target is brought up to date.

target: prerequisites
	recipe

Figure 1: Basic syntax of a Makefile rule.

The recipe is a script, which means that it can have multiple commands, and each command runs in a new command shell. Each line of the recipe must begin with a TAB character. (If this syntax requirement is not met, then a mysterious “Missing separator” message is displayed and Make exits with an error condition.) Commands that are too long to fit in one line of text may be continued by preceding the end of line delimiter (or newline character) with a backslash (“\”). Continuation lines must also begin with a TAB character.
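As a minimal sketch of the one-shell-per-line behavior described above, the two rules below differ only in how their commands are joined. The target names and the "subdir" directory are hypothetical, and "subdir" is assumed to exist.

```make
# Each recipe line runs in its own shell, so the "cd" on the first line
# of this rule has no effect on the "pwd" on the second line.
separate:
	cd subdir
	pwd

# Joining the commands with "&&" and a backslash continuation makes them
# one logical line, so both commands run in the same shell.
joined:
	cd subdir && \
	pwd
```

Making the "separate" target should report the directory that contains the Makefile, whereas making the "joined" target should report the path of "subdir".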

In Figure 2, the file “c” is the result of concatenating files “a” and “b”. Whenever Make is invoked, it builds “c” if it does not exist, or if either file “a” or “b” has a newer mod time than “c”. If “c” exists and is newer than both files “a” and “b” then no action is taken.

c: a b
	cat a b > c

Figure 2: Concatenate two files "a" and "b" to produce "c".

The relationship between target and prerequisites requires that all of the prerequisites be brought up to date before the target. There is no implication of order among the prerequisites. In the example above, we guarantee that "c" is newer than "a" and "b" because "c" depends on "a" and "b". We cannot guarantee that "b" is newer than "a" (or vice-versa) because there is no dependence between them, even if "a" and "b" are targets of other rules. (If "a" and "b" are targets of other rules, then their rules may define a dependency that "c" does not know or care about.)

The example in Figure 2 is not typical of a real world build procedure. Figure 3 illustrates the same principle in the context of compiling a C program that uses two header files.

foo: foo.c foo.h bar.h
	cc -o foo foo.c

Figure 3: Compile a C source file using two header files.

The prerequisite list of the example above includes the C source file and the two header files. Although only the foo.c source file is compiled by the recipe, the header files are also declared as prerequisites because they are also read by the compiler by way of #include directives within the source code. This is an important point because any change to the header files implies that the target must be rebuilt, even if the C source file remains unchanged.

If a target has no prerequisites then its recipe executes if the target does not exist. Such rules will execute just once in a clean source tree, and never again until after the target has been removed. Such a rule is not useful if the contents of the target are static (because if this were true then the target might as well be a source file), but it can be used to capture aspects of the environment during the first build of a clean source tree.
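A rule of this kind might look like the following sketch, which records a description of the build host the first time a clean tree is built. The file name "build-host.txt" is a hypothetical choice.

```make
# No prerequisites: the recipe runs only when build-host.txt is missing.
build-host.txt:
	uname -a > $@
```

Once the file exists, later invocations of Make leave it alone until it is removed, for example by cleaning the source tree.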

If no recipe is specified for a given target then the default recipe is a non-operation. In this situation, the target file is never created. Such targets are useful as high level checkpoints or milestones. The “all” target in Figure 4 is such a target, which ensures that the “foo” binary is up to date. Normally, such rules have many prerequisites.

all: foo

foo: foo.c foo.h bar.h
	cc -o foo foo.c

Figure 4: The “all” target has an empty recipe.

Make will build every target specified on the command line. If no target is specified, then the first target specified in the Makefile is built by default. By convention, that target is the “all” target, which depends on all of the desired artifacts of the build.

Rules can be split up throughout the Makefile. That is, any specific target may be declared multiple times in a Makefile. Each occurrence may have zero or more prerequisites listed with it, and at most one occurrence may have a recipe. The example in Figure 5 declares the “all” target at the top of the Makefile, and fills in its dependency graph later.

all:

foo: foo.c foo.h bar.h
	cc -o foo foo.c

all: foo

Figure 5: The “all” target is the Makefile’s default target, and its prerequisite list is distributed throughout the Makefile.

Oftentimes the first prerequisite appearing in a target’s prerequisite list is given some prominence over others. For example, a program source file may be seen as prominent over the header files that it uses. Splitting rules can articulate this kind of prominence as shown in Figure 6.

all:

foo: foo.c
	cc -o foo foo.c

all: foo

foo: foo.h

foo: bar.h

Figure 6: A target’s first prerequisite has prominence over others. Here, the “foo.c” prerequisite has prominence for the “foo” target.

In this example, the foo target depends on the foo.c, foo.h, and bar.h files, and it derives from these files by invoking the C compiler. The all target depends on the foo file, and it has an empty recipe. Make builds the all target by default if no target is specified in its command line.

Makefile Rules are Transitive

Given a target, Make will apply rules and traverse the dependency graph until it finds the shortest paths to existing prerequisites (or a rule that has no prerequisites). The Makefile in Figure 7 defines two rules that apply in sequence to compile a C source file into an executable file: The first rule compiles the program's source code to an object file, and a second rule links the object file to produce the executable binary.

When the "make" command runs, it discovers that the "all" target depends on the "foo" target. Then it discovers that, to make the "foo" target, it must first make the "foo.o" target. Therefore, to complete its goal, it first compiles the source code into an object file using the first rule, then it links the object file to produce an executable binary.

all: foo

foo.o: foo.c
	cc -c -o foo.o foo.c

foo: foo.o
	cc -o foo foo.o

foo.o: foo.h
foo.o: bar.h

Figure 7: Separate rules produce an object file from source and header files, and an executable binary from the object file.

Makefile Inclusion

Make has a file inclusion capability, and the naming conventions for included files are to use a “Makefile.” prefix, or the “.d” or “.mk” extensions. The “include” directive is used in a Makefile to read another Makefile and merge the dependency graphs. The included Makefile must exist, otherwise Make will fail.

Figure 8 shows two Makefiles: The first is named “Makefile” and specifies the rules that build an application, and then includes a second Makefile named “foo.d”. The foo.d file defines the C source file's header file prerequisites.

# Contents of Makefile:

all:

foo.o: foo.c
	cc -c -o foo.o foo.c

foo: foo.o
	cc -o foo foo.o

all: foo

include foo.d

# Contents of foo.d:
foo.o: foo.h
foo.o: bar.h

Figure 8: The “include” directive merges the dependency graphs of two Makefiles.

It is common for included files to be produced by the Makefile itself. The “sinclude” (or sometimes “-include”) directive tells Make to include the file, but to continue if the file does not already exist. Some variants of Make, such as the very popular Gnu Make, build the included files first, then read them. Other variants require that one or more rules build the included Makefiles (taking advantage of the sinclude behavior of not failing due to missing Makefiles), and invoke Make again to take advantage of the derived Makefiles.

Figure 9 uses the C compiler to compute the header file prerequisite list (rather than hard-coding it as above). Gnu Make will build the dependency graph and compile the executable with a single command. Other implementations of Make may require that the “depend” target be built first, then the “all” target be built by a subsequent invocation of Make.

all:

depend:

foo.o: foo.c
	cc -c -o foo.o foo.c

foo: foo.o
	cc -o foo foo.o

all: foo

foo.d: foo.c
	cc -M foo.c > foo.d

depend: foo.d

sinclude foo.d

Figure 9: A C program’s header file dependencies are computed by the compiler and read by the “sinclude” directive.

Conventional Top-Level or Phony Targets

The “all” and “depend” targets are among several top-level targets in widespread use, by convention. The targets shown in Table 1 fit into this general category. These targets do not correspond to actual artifacts in the filesystem. For this reason, they are also described as phony targets.

Phony Target	Action
all	Depends on all desired end products of the build. This is often the default target in a Makefile.
clean	Depends on nothing but has the side effect of removing all derived files from the source tree, leaving it clean.
depend	Depends on all generated Makefiles.
includes	Depends on all generated header files.
install	Depends on all desired end products of the build, and installs them on the host for general use. Usually requires root access to complete successfully.
libs	Depends on all archive and shared libraries.
Table 1: Conventional phony targets and their actions.
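Because phony targets never correspond to files, a stray file named "clean" or "all" in the working directory could cause Make to treat the target as up to date. Gnu Make, among other variations, provides a ".PHONY" special target that guards against this. The following is an illustrative sketch of that feature and not part of the conventions listed in Table 1.

```make
# Declare targets that never exist as files (a special target in Gnu Make).
.PHONY: all clean

all: foo

foo: foo.c
	cc -o foo foo.c

clean:
	rm -f foo
```

With this declaration, the recipe of the "clean" target runs whenever it is requested, even if a file named "clean" exists.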

Double-Colon Rules

Make's basic rules are suitable for deriving files that contain a single kind of data and are rebuilt using the same procedure regardless of which prerequisite has changed. However, they are not suitable for files that have multiple sections that derive differently depending on which prerequisites have changed.

For that purpose, Make has a second kind of rule, called the double-colon rule. Its syntax is intuitive, as shown in Figure 10.

target:: prerequisites
	recipe

Figure 10: Syntax of a double-colon rule.

Double-colon rules differ from basic rules in the following ways: Every double-colon rule has a recipe, and that recipe executes whenever the target is out of date with respect to any of the prerequisites listed in that specific rule. If no recipe is given, a default is assumed. If the prerequisite list is empty, then the recipe always runs (even if the target already exists).

A single target can have multiple double-colon rules, and multiple recipes can run (in arbitrary order) to bring the target up to date. A double-colon rule therefore cannot be split up in the same way that a basic Makefile rule can be, to compute long prerequisite lists that are distributed throughout the Makefile (or in included Makefiles); each double-colon rule must be complete.
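The independence of double-colon rules can be sketched with a real target. In this hypothetical example, the file names are assumptions, and each recipe appends a line to the same log file.

```make
# Each rule is evaluated on its own: a recipe runs only when its own
# prerequisite is newer than changes.log (or when changes.log is missing).
changes.log:: errors.txt
	echo "errors.txt changed" >> $@

changes.log:: warnings.txt
	echo "warnings.txt changed" >> $@
```

If only warnings.txt is modified after a build, then only the second recipe should run on the next invocation of Make.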

Double-colon rules are rarely used to build real targets, but they are often used to build phony targets. For example, the “clean” target has no prerequisites and many actions. Therefore, a double-colon rule is useful to implement it. Figure 11 shows an example in which making the “clean” target deletes all of the generated artifacts.

all:

clean::

depend:

foo.o: foo.c
	cc -c -o foo.o foo.c

foo: foo.o
	cc -o foo foo.o

all: foo

clean::
	rm -f foo foo.o

foo.d: foo.c
	cc -M foo.c > foo.d

clean::
	rm -f foo.d

depend: foo.d

sinclude foo.d

Figure 11: Using double-colon rules to delete derived artifacts.

Macros

Make has macros, which are often erroneously called “variables.” Macros are used to identify targets or prerequisite lists, or are used as an abstraction tool to tune recipes. Assignment is straightforward, as shown in Figure 12.

name = value

Figure 12: Syntax of a macro definition.

Here, “name” is the name of the macro, and “value” is the string that is assigned to it. When a macro stores a long value, the assignment can be split into multiple lines using a continuation character: Add a backslash (\) to the end of the line and continue on the next line.

Macro expansion can be done in a few ways: $name, if the name is a single letter (or character, as shown below), or either $(name) or ${name} regardless of the length of the name. Undefined macros expand to the null string.

Make delays expansion of macros for as long as possible. When used in targets and prerequisite lists, they are expanded at the time the rule is read while the dependency graph is computed. When used in recipes, macros are expanded at the time the recipe executes.
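The effect of delayed expansion can be seen in a small sketch. The macro and target names here are hypothetical.

```make
# MSG refers to NAME before NAME has been defined.
MSG = hello $(NAME)
NAME = world

# $(MSG) is expanded when the recipe executes, by which time NAME is
# defined, so this prints "hello world".
show:
	@echo $(MSG)
```

Had the macro been expanded at the point of its definition, the reference to NAME would have produced the null string.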

Make imports environment variables and sets corresponding macros. Some variations of Make have additional built-in macros. Some variations of Make also have ways to change the way that macros are expanded (to expand early or to perform string substitutions), and Gnu Make has parameterized macros.

Many environment variables can have a detrimental effect on a build. Any that are known to cause failures should be cleared. Some variations of Make include an “unset” directive that will delete the setting of a macro that corresponds to an environment variable, and also deletes it from the environment of commands spawned by a recipe. If the “unset” directive is not available then the macro can be set to a null value.

Pattern Rules

Nearly all variations of Make have pattern rules. Those that don’t have pattern rules have suffix rules, which are less flexible. Suffix rules are included in the POSIX standard for Make, whereas pattern rules are not. For simplicity, this paper assumes that pattern rules are available, and that suffix rules are not used.

Pattern rules are an abstraction tool to define how files having a certain naming convention derive other files having a related naming convention, based on wildcards. The wildcard character is the percent sign (%). It appears in the target of a rule, and in a prerequisite that is specified on the same line as the target. Additional prerequisites without patterns may also be specified.

Make defines special macros for use within the recipes of pattern rules to use in place of actual file names. These predefined macros are listed in Table 2. Some macros are useful only in the recipes of pattern rules, others can be used in any recipe.

Macro name	Macro value
$?	List of prerequisites that are newer than the target, can be used in any recipe.
$@	The name of the target, can be used in any recipe.
${@D}	Directory containing the target file, with no trailing slash.
${@F}	Basename of the target file.
$<	In a pattern rule, this is the prerequisite that matches the pattern. Otherwise it is the first prerequisite listed for the target.
${<D}	Directory containing the prerequisite file, with no trailing slash.
${<F}	Basename of the prerequisite file.
$*	This is useful only in pattern rules. In a pattern rule, this is the substring that matches the pattern’s wildcard, and nothing else.
$$	The dollar sign, useful when recipes use shell variables.
Table 2: Predefined macros.
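As a brief illustration of the macros in Table 2, the following sketch copies one file to another in a different directory. The paths are hypothetical, and the expansions noted in the comments assume the behavior described in the table.

```make
# $@ is out/report.txt, ${@D} is out, ${@F} is report.txt,
# $< is src/data.txt, and ${<F} is data.txt.
out/report.txt: src/data.txt
	mkdir -p ${@D}
	cp $< $@
	@echo "built ${@F} in ${@D} from ${<F}"
```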

Pattern rules are normally factored out into a Makefile that is included. This provides a way to standardize the recipes used to build various types of artifacts. Figure 13 illustrates an example Makefile, named “patterns.mk”, containing pattern rules for compiling C programs and computing their header file dependencies.

# Contents of patterns.mk file

%: %.o
	cc -o $@ $<

%.o: %.c
	cc -c -o $@ $<

%.d: %.c
	cc -M $< > $@

Figure 13: Pattern rules to compile a C source program to an executable binary and to produce a Makefile containing header file prerequisites.

Figure 14 shows an example Makefile that uses the pattern rule definitions in the patterns.mk file by including the file. This Makefile defines rules to compile a C source file and also to compute its header file dependencies (which are also loaded by the sinclude directive). The program is compiled by making the “all” or “foo” targets. Some variations of Make (e.g., Gnu Make) will also compute the header file dependencies, but others may require the “depend” target to be built first.

# Define standard targets

all:

clean::

depend:

# Read the pattern rules
include patterns.mk

# Compile foo.c to executable
foo: foo.o

all: foo

# Compute foo.c header dependencies
foo.d: foo.c
depend: foo.d

sinclude foo.d

# Clean the source tree
clean::
	rm -f foo foo.o foo.d

Figure 14: Makefile using pattern rules defined in a second Makefile named “patterns.mk”.

Patterns do not always represent the stem of a filename. They may match directory prefixes, or even intermediate directory names, as well! This can be useful for non-recursive procedures to descend into source trees from a root directory containing the Makefile for an entire project.
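A sketch of such a pattern, assuming Gnu Make and a hypothetical layout with sources under "src" and objects under "obj", might look like this:

```make
# For the target obj/util/list.o the wildcard matches "util/list",
# so the prerequisite is src/util/list.c.
obj/%.o: src/%.c
	mkdir -p ${@D}
	cc -c -o $@ $<
```

A single rule of this form, placed in the Makefile at the root of the project, can serve every subdirectory of the source tree.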

Multiple Targets Define Multiple Rules

It is valid syntax in a Makefile to specify more than one target when writing rules. However, Make does not model this in a way that the user expects. Make’s design requires that a rule can produce exactly one target. So when multiple targets are given in the definition of a rule, what really happens internally is that each target defines a node in the dependency graph having an identical prerequisite list and recipe to the other targets specified in the rule. Figure 15 illustrates this behavior of Make by presenting two equivalent Makefiles.

# First Makefile: one rule naming two targets
y.tab.c y.tab.h: grammar.y
	yacc -d grammar.y

# Second, equivalent Makefile: one rule per target
y.tab.c: grammar.y
	yacc -d grammar.y

y.tab.h: grammar.y
	yacc -d grammar.y

Figure 15: Two equivalent Makefiles producing two targets using identical prerequisite lists and recipes.

This causes the Yacc tool to run twice, even though both invocations produce both the y.tab.c and y.tab.h files. Although this might seem like a minor and tolerable performance problem, particularly with Yacc, it can cause problems for a couple of reasons: Updating the mod time on a target that has already been built can cause unpredictable results, and the second invocation of the tool may actually produce different results than the first and thus reduce consistency of the build.
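One possible remedy, specific to Gnu Make, is a pattern rule with several targets, which Gnu Make treats as a single rule that produces all of its targets. The following is a sketch under that assumption; it is not the only remedy, and it may not be portable to other variations of Make. It also assumes a Yacc that accepts the POSIX "-b" option to set the prefix of the output file names, and the "parser.o" target is hypothetical.

```make
# Gnu Make runs this recipe once to produce both grammar.tab.c and
# grammar.tab.h from grammar.y ("-b $*" replaces the default "y" prefix).
%.tab.c %.tab.h: %.y
	yacc -d -b $* $<

parser.o: grammar.tab.c grammar.tab.h
	cc -c -o $@ grammar.tab.c
```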

Suppressing Output

Make normally displays the commands of each recipe as it invokes them. This can be suppressed if the first character of a command is the “@” character. The pattern rule in Figure 16 to generate C header file dependencies has been modified so that it is not displayed.

%.d: %.c
	@cc -M $< > $@

Figure 16: Suppressing the display of recipes.

Common Makefile Errors

Now that the review of the Make programming language is complete, the following sections discuss some of the common coding errors that cause unexpected results. Many of the examples that follow are incompatible with the POSIX standard for Make because they rely on pattern rules.

Undeclared Prerequisites

Undeclared prerequisites are artifacts that are consumed by a recipe but are not listed in a rule’s prerequisite list. They cause targets to be skipped when they should be rebuilt. Such inputs can be explicit or implicit.

The Makefile in Figure 17 demonstrates the condition with a contrived example. A file named "undeclared_prerequisite" located in the present working directory is consumed by the recipe. But it is incorrectly omitted from the dependency graph. As a result, the xx.out file is not rebuilt when the undeclared_prerequisite file is changed. Furthermore, the xx.out file will fail to build if the undeclared_prerequisite file is missing.

xx.out: xx.in
	grep "some-value" xx.in undeclared_prerequisite > $@

Figure 17: Makefile rule demonstrating an undeclared prerequisite file.
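The rule of Figure 17 could be corrected by declaring the second input, as in the following sketch. The file keeps its name from the figure for ease of comparison, although it is no longer undeclared.

```make
# Same rule as Figure 17, with every file read by grep declared.
xx.out: xx.in undeclared_prerequisite
	grep "some-value" xx.in undeclared_prerequisite > $@
```

With this change, xx.out is rebuilt when either input changes, and Make reports a missing input before the recipe runs.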

On many systems, the operation of the grep program also depends on the GREP_COLORS environment variable, which affects the contents of the target. This environment variable may be set to a value tailored to the user, and its value may vary between different users. Some shells, such as Bash, read a configuration file located in the user's home directory whenever they launch (without being given specific command line parameters to prevent it). Because Make invokes the shell to execute the Makefile's recipes, it introduces this configuration file as an undeclared prerequisite in its default behavior.

Examples of explicit undeclared prerequisites are the names of tools like the compiler and the rest of the standard Unix tool set, or system-supplied libraries like libm.a or libm.so that are supplied to the linker (perhaps with an abbreviated notation like -lm). Examples of implicit undeclared prerequisites are archive or shared libraries such as libc.a or ld.so that are read by the linker without explicit direction to the contrary, or hidden files located in the user’s home directory that could affect the output of tools like grep or ls or rpmbuild. Also implicit are groups of related artifacts that are represented by a placeholder or proxy target, which is described in a later section of this paper.

Generally speaking, any file located in the source tree that is consumed by a recipe should be included in the rule’s prerequisite list. This is true for scripts that are included with the source code, and tools that derive from the source tree and are used later by the build. Including these files ensures that the tools are up to date before they are used, and also that any change to a tool causes the artifacts built by that tool to be rebuilt. The Makefile itself should be a prerequisite of a target that derives from no other source than the rule that creates it. The Makefile may also be included in the prerequisite lists of specific targets while the Makefile is in development, but would normally be removed as the rules stabilize. An exception is a set of artifacts that are represented by a proxy target.
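As a sketch of this guideline, the rule below lists a script from the source tree among the prerequisites of the file that it generates. The names "gen-table.sh", "table.in", and "table.h" are hypothetical.

```make
# The generator script is an input: editing it makes table.h out of date.
table.h: table.in gen-table.sh
	sh gen-table.sh table.in > $@
```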

Artifacts that are supplied by the system (i.e. created while the operating system and auxiliary packages are installed) with read-only permissions may be omitted from the prerequisite list. (But note that C and C++ header files are normally included by the compiler’s dependency graph generator.) Most shops accept that a source tree must be rebuilt from scratch in a clean sandbox if the build environment is upgraded.

Artifacts that are neither located in the source tree nor provided with read-only permission by the system are uncontrolled sources and can cause failures that are difficult to debug. Such files might be configuration files stored in the user's home directory, or tools or data owned by prominent users and referenced via a path within their home directory. Uncontrolled sources like these should be eliminated from the build wherever possible. Otherwise they should be deleted or emptied, worked around, or (as a last resort) provided with standardized content so that all users will have similar configurations.

Environment variables such as LD_LIBRARY_PATH affect the way that recipes consume prerequisites, whether they are declared or not. Environment variables in general are difficult because there are so many of them and they can be poorly documented and difficult to discover. The fact that they can be set on the user's command line means that they can affect the build in non-repeatable ways. Environment variables that are known to affect the build can be set to reasonable values (or be unset) in a Makefile. An alternative is to invoke Make through a wrapper script that sets up a standardized environment (and unsets all environment variables other than those specified in the standardized environment).
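In Gnu Make, one way to pin down the environment from within a Makefile is with the "export" and "unexport" directives. The sketch below assumes Gnu Make; the choice of variables and their values is an assumption that each project must adapt, and the "show-env" target is hypothetical.

```make
# Pass fixed values to every recipe, whatever the user's environment holds.
export LC_ALL = C
export PATH = /usr/bin:/bin

# Keep these variables out of the environment of every recipe.
unexport LD_LIBRARY_PATH GREP_COLORS

# Print the environment that recipes actually see.
show-env:
	@env | sort
```

Making the "show-env" target is a quick way to check which variables still leak into the build.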

Side-Effects

Side-effects occur when a rule produces one or more files in addition to the target artifact. The pattern rule in Figure 18 truncates a space-separated table to 5 columns, then turns it into comma-separated values. The side-effect is a temporary file named with a ".csv.t" suffix. (Note that the temporary file is correctly cleaned by the "clean" rule.)

%.csv: %.table
	cut -d ' ' -f 1-5 < $< > $@.t
	tr ' ' ',' < $@.t > $@

clean::
	rm -f *.csv *.csv.t

Figure 18: Pattern rule producing a side-effect.

By themselves, side-effects don’t cause much trouble. The problem occurs when the side-effect is used as an input to another rule, whether or not it is included in that rule’s prerequisite list. The pattern rule in Figure 19 illustrates a case where the side-effect from Figure 18 is included in a prerequisite list.

%.5: %.csv.t
	( echo "col1 col2 col3 col4 col5"; cat $< ) > $@

clean::
	rm -f *.5

Figure 19: Incorrect use of a side-effect as a prerequisite.

Because the side-effect is not a target for any rule, Make’s dependency graph does not guarantee that the side-effect is produced before it is consumed. This can cause incorrect or non-repeatable results or a “Don’t know how to make” error. If the rule is modified to hard-code the side-effect (removing it from the prerequisite list), the recipe will simply fail or produce non-deterministic results.

To fill in the dependency graph, the rule that produces the side-effect should be split into two rules. One rule changes the side-effect into a target, the other updates the final target’s prerequisite list to depend on the new target. Figure 20 refactors the pattern rule of Figure 18 to turn the side-effect into a target, thus correctly filling in the dependency graph.

%.csv.t: %.table
	cut -d ' ' -f 1-5 < $< > $@

%.csv: %.csv.t
	tr ' ' ',' < $< > $@

Figure 20: Refactor the pattern rule of Figure 18 to correct the dependency graph.

There are some common exceptions in which side-effects are produced by design and must be brought back into the dependency graph. These are discussed in later sections about tools that produce multiple output files.

Target Files and Idempotence

Some recipes perform in such a way that, when they complete, the target file appears to be unchanged except that its inode modification time (or "mod time") has been updated. This can cause a lot of unnecessary work later during the build because everything depending on this target has gone out of date according to the mod times. And yet, the result of rebuilding the dependent targets can also produce identical results.

For example, a code generator may produce a header file, and the interfaces defined in the header file do not change even though the implementation written in the prerequisite file has changed. Another case is when a file is written from scratch by a Makefile recipe, and its only prerequisite is the Makefile (or it has no prerequisites at all).

An implementation detail of Make seems to be that it checks the mod times of prerequisites while it traverses the dependency graph, and it compares the mod times of the target vs. prerequisites as it visits each node. This opens the possibility to preserve timestamps of files whose contents do not appear to be changed when a recipe runs.

Figure 21 shows an example Makefile that defines a version number and writes it into a header file.

VERSION = 1.2.3

version.h: Makefile
	echo '#define VERSION "$(VERSION)"' > $@

Figure 21: Makefile rule writes a version number into a C header file.

The contents of the version.h file change only when the value of the $(VERSION) macro changes, but not when the Makefile changes in any other way. This appears to be the proper implementation because if the Makefile prerequisite is removed, the header file would never be updated except after the source tree has been cleaned. Conversely, the version.h file is updated properly if the value of the VERSION macro is changed, or if the rule that produces the version.h file is changed. However, the recipe executes every time the Makefile changes in any way, thus updating the mod time of the version.h file. Then every file that depends on the version.h file will be rebuilt, even if the Makefile changes do not affect the version number.

Figure 22 illustrates how to maintain the mod time of the target file if it is left unchanged after the main logic of the recipe completes. In this recipe, the new header file is derived in a temporary file. The cmp program compares its input files (if they exist) and exits successfully if both files are bit-for-bit identical. If version.h does not exist or differs from version.h.t, the temporary file replaces the target. Finally, the temporary file is forcefully removed. Now the Makefile can be changed and objects that depend on the version.h file won’t recompile unless the version number has changed.

VERSION = 1.2.3
version.h: Makefile
	echo '#define VERSION "$(VERSION)"' > $@.t
	cmp $@ $@.t || mv -f $@.t $@
	rm -f $@.t

Figure 22: Recipe preserves the mod time of the target unless its content changes.
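The same technique can be applied to the code generator case mentioned earlier. The following sketch assumes a hypothetical generator named "gen-iface" that reads an interface description and writes a C header to its standard output; the ".idl" suffix is likewise an assumption.

```make
# Replace the header only when the generated content differs from it.
# "cmp -s" compares silently and fails if the target is missing.
%.h: %.idl
	gen-iface $< > $@.t
	cmp -s $@ $@.t || mv -f $@.t $@
	rm -f $@.t
```

One consequence of this design is that an unchanged header remains older than its prerequisite, so the recipe runs again on each invocation of Make. This is usually acceptable when the generator is cheap compared with recompiling everything that includes the header.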

Auto-discovery

Auto-discovery is a computational procedure in which Make explores the source tree to identify work to do. In its normal operation, Make's dependency graph is computed from rules that are coded into the Makefile. But the sinclude directive combined with rules that write Makefiles provides a way to amend Make's dependency graph with rules that are computed while probing the source tree.

A form of auto-discovery was presented in earlier sections involving C header file dependency checking, in which the C compiler produced Makefiles to be consumed by Makefile inclusion. An alternative implementation of this capability, attributed to Tom Tromey, computes C header prerequisite lists as side effects of the compilation rules. A pattern rule for this appears in Figure 23. This method relies on knowing that any change to the dependency graph related to C header files, be it an addition or deletion, also depends on a change to at least one of the remaining C source or header files. This method ensures that, even if the dependency graph is incorrect at the moment that Make is subsequently invoked, all of the necessary files are processed to recompute the corrected graph.

all:

%: %.c
	cc -o $@ $<
	cc -M $< > $*.d

foo: foo.c

sinclude foo.d

all: foo

Figure 23: Tom Tromey's method to produce C header prerequisite lists as a side effect of compilation.

Another form of auto-discovery is to scan the source tree for C files that contain the definition of a function named "main" and fill in the dependency graph to produce executable binaries. This is illustrated in Figure 24. This Makefile does the following:

Define a conventional phony target named "all" that depends on executable binary files.
Define a pattern rule to compile a C source file to an object file.
Define a pattern rule to link an object file to produce an executable binary.
Include a Makefile named "binaries.d".
Derive the "binaries.d" file upon every invocation of Make:
- Identify all C source files containing a definition of a function named "main".
- Add the executable binaries to the "all" target's prerequisite list.
- Rely on Make's transitivity of pattern rules to complete the dependency graph to connect object and source files to the executable binaries.

all:

%: %.o
	cc -o $@ $<

%.o: %.c
	cc -o $@ -c $<

FORCE:

binaries.d: Makefile FORCE
	find . -type f -name '*.c' -print | \
	xargs fgrep -l ' main(' | \
	sed -e 's/^../all: /' -e 's/..$$//' > $@.t
	cmp $@.t $@ || mv -f $@.t $@
	rm -f $@.t

depend: binaries.d

sinclude binaries.d

Figure 24: Auto-discovering executable binary files.
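To see what the binaries.d recipe writes, its discovery pipeline can be run by hand on a small tree. The sketch below is an illustration only: the file names hello.c and util.c are made up, "grep -F" stands in for fgrep, and Make's "$$" is written as a plain "$" because the pipeline runs directly in the shell.

```shell
#!/bin/sh
# Build a throwaway source tree (file names are hypothetical).
tree=$(mktemp -d)
printf 'int main(void) { return 0; }\n' > "$tree/hello.c"
printf 'int helper(void) { return 1; }\n' > "$tree/util.c"

# Same pipeline as the binaries.d recipe: find the C files, keep those
# that mention " main(", then turn "./name.c" into "all: name".
discovered=$(cd "$tree" &&
    find . -type f -name '*.c' -print |
    xargs grep -F -l ' main(' |
    sed -e 's/^../all: /' -e 's/..$//')

echo "$discovered"    # prints: all: hello
```

Only hello.c defines main, so only the hello binary is added to the prerequisite list of the "all" target.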

This method is useful in other use cases, such as producing libraries from the entire contents of a directory. However, any derived C source files must be mentioned explicitly in one of the Makefiles, be it hard-coded or computed.

This method is not applicable in situations where the names of the derived files are not predictable. In other words, auto-discovery is not applicable for discovering derived files. Proxy targets, which are discussed in a later section, should be used instead.

This method does not automatically clean up the directory by removing derived files that were created as a result of auto-discovery. Auto-deletion is discussed in the following section.

Auto-deletion

Auto-deletion is the automatic deletion of artifacts during a build, driven by a computation to identify the unwanted artifacts. This is distinct from other removal actions such as the conventional "clean" target that removes artifacts using hard-coded wildcards or other patterns.

Because Make can only model the creation of target artifacts, removals cannot be modeled in the dependency graph. Instead, auto-deletion is done as a side-effect of another rule. For example, obsolete binaries might be removed from a sandbox in the recipe of a phony target, such as the conventional "all" target.

The basic approach to auto-deletion is to take inventory of the relevant artifacts in the sandbox, then compute the set of desirable artifacts (which might be coded into the Makefile, or auto-discovery may be used), and remove the set difference of the two lists. In a recipe, this can be done by writing the two lists into temporary files, then using a tool like the standard "comm" program to compute the difference and removing the files listed in its output. GNU Make also has functions to compute the set difference between two lists stored in macros.
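The set difference itself is a small computation. As a sketch under assumed file names (bin/foo, bin/bar and bin/old are hypothetical), the following shell fragment lists the artifacts that are present in the sandbox but no longer wanted.

```shell
#!/bin/sh
work=$(mktemp -d)

# Inventory of what is currently in the sandbox (hypothetical names).
printf '%s\n' bin/foo bin/old bin/bar > "$work/have"
# The artifacts that the build still wants to produce.
printf '%s\n' bin/foo bin/bar > "$work/want"

# comm expects both inputs to be sorted in the same collating order.
LC_ALL=C sort -o "$work/have" "$work/have"
LC_ALL=C sort -o "$work/want" "$work/want"

# -2 drops lines found only in the second file and -3 drops lines common
# to both, leaving the lines found only in the first file.
obsolete=$(LC_ALL=C comm -23 "$work/have" "$work/want")

echo "$obsolete"    # prints: bin/old
```

In a real recipe the output of comm would be piped to a command such as "xargs rm -f" instead of being printed.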

The Makefile in Figure 25 uses auto-discovery to identify desirable executable binary files, which are stored in a directory named "bin", and saves the list in a Makefile macro named EXECS. It also takes inventory of the "bin" directory at the start of the build, and stores it in the macro named BIN_EXECS. The recipe of a new phony target named "autodelete" computes the set difference between $(BIN_EXECS) and $(EXECS) and removes the obsolete artifacts. The conventional "all" target depends on the desirable artifacts and on the new autodelete target. The conventional "clean" target removes all of the executable binaries.

all:

EXECS =
BIN_EXECS =

FORCE:

bin/%: %.o
	mkdir -p $(@D)
	cc -o $@ $<

%.o: %.c
	cc -c -o $@ $<

# printf writes the trailing backslashes because the way echo treats
# backslashes differs between shells.

binaries.d: Makefile FORCE
	{ printf 'EXECS = \\\n'; \
	find . -type f -name '*.c' -print | \
	xargs fgrep -l ' main(' | \
	sort | \
	sed -e 's@^..@ bin/@' -e 's@..$$@ \\@'; \
	echo; \
	printf 'BIN_EXECS = \\\n'; \
	ls -1 bin/* | \
	sort | \
	sed -e 's@$$@ \\@'; \
	echo; \
	} > $@.t
	cmp $@.t $@ || mv -f $@.t $@
	rm -f $@.t

sinclude binaries.d

autodelete:
	for f in $(BIN_EXECS); do echo "$$f"; done > $@.1.t
	for f in $(EXECS); do echo "$$f"; done > $@.2.t
	comm -23 $@.1.t $@.2.t | xargs -t rm -f
	rm -f $@.1.t $@.2.t

all: autodelete $(EXECS)

clean::
	rm -rf bin/*

Figure 25: Auto-deletion is done in the "all" target's recipe

The autodelete target deletes any executable binary that is not identified through auto-discovery. That means that, if the source file containing a program's "main" function is removed from the source tree, the derived executable binary will also be removed during the next compilation using the "all" or "autodelete" target.

Working With Tools That Produce Multiple Output Files

Some tools produce multiple files as output, making them difficult to model in Make. Make’s design assumes that any given rule will produce exactly one target. Under these conditions, poorly coded workarounds often involve a broken dependency graph: Undeclared prerequisites cause expected artifacts to be skipped by a build, and idempotence can cause extra work.

One approach to solving these problems is to create a uniquely named temporary directory, process the input there, and copy the output back out to create the derived artifacts. Only one of the output files is treated as a target during processing, and the remaining artifacts are connected to the dependency graph separately.

Tools that produce multiple output files fall into three general categories. In the first category, the tool produces output files with fixed names, as Yacc does. In the second category, the recipe produces output files with predictable names, such as when the compiler produces dependency graphs, or when Bison is used. In the third category, the tool produces files with names that cannot be predicted by Make, as rpmbuild does.

Tools Producing Output with Fixed Names

Tools like Yacc take arbitrary files as input and produce multiple output files having fixed names. Yacc in particular will read an arbitrary file, then produce two files. By convention, its input file is named with a “.y” extension, but to avoid conflict with built-in rules the ".yy" extension will be used here. Yacc creates files named y.tab.c and y.tab.h in the present working directory that contain C source code and interface definitions, respectively.

Such tools are hard to use if they must be applied to multiple input files. Furthermore, some invocations may be idempotent with respect to a subset of the output files. A typical approach to using tools like this is to try to limit their use to just one invocation per directory in the source tree. In Figure 26, y.tab.c is the actual output of the tool. The y.tab.h file is a side-effect that gets reconnected to the dependency graph by making it depend artificially on the y.tab.c file. We also ignore the idempotence issue, so that every change to the grammar.yy file causes all artifacts that depend on either the y.tab.c or y.tab.h files to rebuild, even if their contents do not change.

# Run Yacc to produce y.tab.c and y.tab.h files, leaving
# y.tab.h disconnected from the dependency graph.

y.tab.c: grammar.yy
yacc -d grammar.yy

# Reconnect y.tab.h to the dependency graph.

y.tab.h: y.tab.c
touch y.tab.h

Figure 26: Yacc produces y.tab.c and y.tab.h files, but must be modeled by two rules. The y.tab.h file reconnects to the dependency graph but ignores idempotence.

Figure 27's pattern rules illustrate how using this type of tool can be generalized in a Makefile, using Yacc for a project that requires two or more parsers. The pattern rules create a uniquely named temporary directory for each source file, copy the source file into the temporary directory (using a hard link), run the tool there, and copy its output back out (again using a hard link). The rules also consider the idempotence issue.

Note that the Makefile of Figure 27 does not address a different issue when using Yacc. Yacc's output files define many symbols whose standard names share the "yy" prefix. As a consequence, the y.tab.c and y.tab.h files should be edited further, perhaps using a tool like sed, to make these symbols unique if a specific binary executable contains more than one parser.

all: grammar.c grammar.h

# Copy source file to a similarly named temporary directory and
# run Yacc. Leave %.yd/y.tab.h disconnected from the dependency
# graph.

%.yd/y.tab.c: %.yy
	rm -rf ${@D}
	mkdir -p ${@D}
	ln -f $< ${@D}/${<F}
	cd ${@D} && yacc -d ${<F}

# Copy and rename y.tab.c out of the temporary directory, if changed by Yacc.

%.c: %.yd/y.tab.c
	cmp $@ $< || ln -f $< $@

# Copy y.tab.h out of the temporary directory if changed by Yacc, and
# reconnect it to the dependency graph.

%.h: %.yd/y.tab.c
	cmp $@ ${<D}/y.tab.h || ln -f ${<D}/y.tab.h $@

# Remove temporary directories created by Yacc pattern rules.

clean::
	rm -rf *.yd grammar.c grammar.h

Figure 27: Pattern rules invoke Yacc in a temporary directory, working around its naming convention and considering idempotence.
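The symbol renaming mentioned in the note about Yacc's standard names could look roughly like the sed filter below. This is only a sketch: the prefix "calc_" and the two sample declarations are invented for illustration, and the filter handles only lowercase "yy" at the start of an identifier. Uppercase names such as YYSTYPE and text inside strings or comments are not treated specially. POSIX also specifies a -p option for yacc that changes the prefix of external names, which may be preferable where it is available.

```shell
#!/bin/sh
# Hypothetical per-parser prefix; each parser in the binary gets its own.
prefix=calc_

# Two sample declarations stand in for the contents of y.tab.c.
# The first expression handles "yy" at the start of a line, the second
# handles "yy" that follows a character that cannot be part of a name.
renamed=$(printf '%s\n' 'int yyparse(void);' 'extern int yylval;' |
    sed -e "s/^yy/${prefix}yy/" \
        -e "s/\([^A-Za-z0-9_]\)yy/\1${prefix}yy/g")

echo "$renamed"
# prints:
#   int calc_yyparse(void);
#   extern int calc_yylval;
```

In the pattern rules, such a filter would run while the files are copied out of the temporary directory, in place of the plain hard link.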

Tools Producing Output with Predictable Names

Some tools will derive the names of their output files from the names of their input files, and some are flexible enough to allow the user to specify the names of the output files. Pattern rules can take advantage of these tools.

Bison is such a tool. Bison is a follow-on to Yacc, and it also reads one file and produces two files: a compilable source file and a header file. However, unlike Yacc, Bison names its output after its input. In Figure 28, the input file is copied into a temporary directory, Bison runs there, and all of its output files are written there. Each of the derived %.tab.c and %.tab.h files is checked for idempotence as it is copied back out of the temporary directory.

# Copy %.yy file to "bison" temporary directory.

bison/%.yy: %.yy
mkdir -p bison
ln -f $< $@

# Invoke Bison in temporary directory to produce %.tab.c and
# %.tab.h files.

bison/%.tab.c: bison/%.yy
cd ${<D} && bison -d -o ${@F} ${<F}

# Copy %.tab.c file back out of temporary directory if changed.

%.tab.c: bison/%.tab.c
cmp $< $@ || ln -f $< $@

# Reconnect %.tab.h file to dependency graph and copy it out if
# changed.

%.tab.h: bison/%.tab.c
cmp ${<D}/${@F} $@ || ln -f ${<D}/${@F} $@

# Clean the temporary directory.

clean::
rm -rf bison

Figure 28: Pattern rules run Bison in a temporary directory, reconnect its derived header file to the dependency graph, and check for idempotence.

Tools Producing Output with Unpredictable Names and Proxy Targets

Some tools just do what they will and produce results having names that cannot be represented by pattern rules. Tools like rpmbuild name their output based on the contents of their input files. Tools like createrepo use hash values to name their output, which (in a Makefile) are indistinguishable from random names. The Java compiler produces class files named after the classes defined in the source code, and synthesizes the names of class files containing anonymous classes.

To use such a tool in a Makefile, we create a flat file with a predictable name, called a proxy target, to represent the output of the tool. Recipes of rules that depend on the proxy target must understand how to consume the actual data that the proxy represents. A useful technique is to probe the sandbox after the tool completes and write the names of the built files into the proxy file. Recipes of dependent rules can read the proxy file for the actual identities of the produced files.

Note that this is different from auto-discovery. Auto-discovery computes portions of Make's dependency graph by examining source files. Although it is possible that artifacts such as C header files can be derived, there are usually references to such derived files made within other source files. With auto-discovery, the dependency graph can be computed before artifacts are derived from the discovered artifacts. In contrast, proxy targets identify derived files that are not easily added to the dependency graph while it is being traversed.

With proxy targets, new artifacts are derived first, and then they are discovered and inventoried after the fact. The proxy target represents the derived files in the dependency graph, and the inventory stored within drives the consumption of the artifacts in rules that depend upon the proxy target.
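The two halves of this arrangement can be sketched in plain shell. In the fragment below, the touch command merely stands in for a tool such as rpmbuild, and the package file names are invented. The producer takes inventory into a proxy file, and the consumer reads the proxy to learn the real names.

```shell
#!/bin/sh
out=$(mktemp -d)

# Stand-in for a tool whose output names cannot be predicted by Make
# (the names below are made up).
touch "$out/pkg-1.0-1.x86_64.rpm" "$out/pkg-devel-1.0-1.x86_64.rpm"

# Producer side: take inventory of what was built and store the list
# in a proxy file with a predictable name.
proxy="$out.proxy"
find "$out" -type f -print | LC_ALL=C sort > "$proxy"

# Consumer side: a dependent recipe reads the proxy file to find the
# actual artifacts, one path per line.
count=0
while IFS= read -r f; do
    [ -f "$f" ] && count=$((count + 1))
done < "$proxy"

echo "$count artifacts listed in proxy"    # prints: 2 artifacts listed in proxy
```

Make sees only the proxy file and its timestamp, while the recipes handle the files that the proxy names.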

In Figure 29, the present working directory contains two child directories named "src" and "specs". The src directory contains one or more child directories that contain source code and the procedures to build them. The specs directory contains one or more RPM spec files that describe how to create binary RPMs that correspond to the source trees. The name of each source tree matches the basename of the corresponding spec file. Pattern rules create a standard RPM build tree. One rule produces a gzip'ed tar archive of the source tree (using auto-discovery) in the proper place in the RPM build tree. Another pattern rule copies the spec file to its proper place. Another rule builds a binary RPM (which depends on the source code archive and the copied spec file). Because the name of the rpm file derives from the contents of the spec file, it cannot be determined by a pattern rule. Instead, the RPM build area is searched for the new RPM and its identity is written into a proxy file whose name can derive from the name of the spec file. Finally, a top-level target named “rpm” is defined and the customary “clean” rule is amended.

# Define standard top-level phony targets.

rpm:
depends:
clean::

# Identify the RPM.

PKG = foo

# Specify the layout of a standard RPM build tree.

RPMBUILD = rpmbuild/
RPMRPMS = $(RPMBUILD)RPMS
RPMSRC = $(RPMBUILD)SOURCES/
RPMSPECS = $(RPMBUILD)SPECS/

# Phony target used as a prerequisite to force building artifacts.

.FORCE:

# Pattern rule to take inventory of RPM source tree, producing
# a Makefile having a .src.d extension. The Makefile establishes
# the RPM source tar archive as depending on each and every file
# that exists in the src tree. This inventory is sinclude'd at a
# later time. The sed expression uses "@" as its delimiter because
# $(RPMSRC) contains slashes.

%.src.d: .FORCE
	find src/$* -type f -print | sort | \
	sed -e 's@^@$(RPMSRC)$*.tar.gz: @' > $@.t
	cmp $@ $@.t || mv -f $@.t $@
	rm -f $@.t

# Pattern rule to build an RPM source tar archive. This target is
# rebuilt if the source tree inventory changes (a file was added
# or removed, causing the %.src.d file to change), or if one of the
# files listed in it has changed (because it's sinclude'd later).

$(RPMSRC)%.tar.gz: %.src.d
	mkdir -p $(@D)
	tar cf - src/$* | gzip > $@

Figure 29: Using proxy targets in pattern rules when building binary RPMs.

# Pattern rule to copy RPM spec file to RPM build area.

$(RPMSPECS)%.spec: specs/%.spec
	mkdir -p $(@D)
	cp -f $< $@

# Pattern rule to build an RPM from a spec file. Also requires
# the RPM's source tar archive as prerequisite. The --define option
# takes the macro name and its value as a single argument separated
# by a space, and _topdir is given as an absolute path.

%.rpm.proxy: $(RPMSPECS)%.spec
	rpmbuild -bb --define "_topdir `pwd`/$(RPMBUILD)" $<
	find $(RPMRPMS) -type f -print | sort > $@

# Complete the dependency graph: Read the source tree inventory
# file, and make the RPM depend on its spec file and source tar
# archive file.

sinclude $(PKG).src.d

$(PKG).rpm.proxy: $(RPMSRC)$(PKG).tar.gz $(RPMSPECS)$(PKG).spec

# Implement the standard high-level phony targets.

depends: $(PKG).src.d

rpm: $(PKG).rpm.proxy

clean::
	rm -rf $(RPMBUILD) *.rpm.proxy *.src.d

Figure 29 (continued): Using proxy targets in pattern rules when building binary RPMs.

Figure 30 extends the above example by creating a Yum repository in the RPMS tree where binary RPMs were stored. It runs the createrepo command in the rpmbuild/RPMS directory, then takes inventory of the tree and writes the result into a proxy file. The inventory contains only those files created by the createrepo command. Another top-level target named “yum” is created. The "clean" target is amended to remove only the Yum repo's proxy target because the actual repo is covered by the rule that cleans RPMs.

# Define a new phony top-level target for Yum repo.

yum:

# Rule to produce a Yum repo after RPMs have been built. Depends on
# the RPM's proxy target, which contains a list of actual RPMs.
# Create the Yum repository, then take inventory of the result,
# filtering out the pre-existing RPMs.

yum.repo.proxy: $(PKG).rpm.proxy
rm -f $@
cd $(RPMRPMS) && rm -rf repodata && createrepo .
find $(RPMRPMS) -type f -print | sort | comm -23 - $< > $@

# Add the new repo to the top-level target.

yum: yum.repo.proxy

# Clean the Yum repo's proxy target. Note that the actual Yum repo
# is cleaned by the same rule that cleans RPMs.

clean::
rm -f yum.repo.proxy

Figure 30: Using proxy targets as a prerequisite and a target to create a Yum repository.

The yum.repo.proxy target that represents the actual Yum repository depends on the RPM(s) to be included in it. This ensures that the Yum repository is built after all of the RPMs have been built.

Prerequisites Are Not Built in List Order

There is a common misconception that prerequisites are built in the order in which they appear in a prerequisite list. The contrived example of Figure 31 illustrates this.

all:

tool: tool.c tool.h
cc -o $@ $<

frob.dat: frob.in
./tool $< > $@

all: tool frob.dat

clean::
rm -f tool frob.dat

Figure 31: The "all" target incorrectly assumes that its prerequisites are built in the order listed.

In this example, there is a home-built tool that filters a proprietary data file (named "frob.in") to produce a shippable version of that file (named "frob.dat"). The idea is that, when building the “all” target, both the tool and the shippable data file are produced. This behavior is, in fact, observed when running the “make all” command for the first time.

However, the Makefile of Figure 31 is not correct. After building the “all” target, the user can modify tool.c and run “make all” again. Under this condition, the tool is rebuilt as expected. But the frob.dat file is not rebuilt, even though the change to the tool could change the contents of the frob.dat file. Even worse is that for the tool change to take effect, the user must build the “clean” target, or remove the “frob.dat” file directly, or modify the “frob.in” file, before building the "all" target again.

Furthermore, some variations of Make permit multiple targets to be built concurrently. Running “make all” in this mode may cause the tool and frob.dat file to be made concurrently, and the frob.dat file may either fail to build due to the absence of the tool, or it may build incorrectly because the tool is out of date at the moment that the frob.dat recipe runs.

The reason this Makefile appears to work in a common use case is due to an implementation detail of many variations of Make: When there are fewer execution threads available than there are targets to build, Make does seem to traverse lists from left to right and process what it finds.

The correct Makefile eliminates the undeclared prerequisite of the frob.dat file by adding the tool to its prerequisite list. This will ensure that the tool exists and is up to date before the frob.dat file is generated. It also ensures that the frob.dat file is rebuilt whenever the tool changes. Figure 32 shows the correct Makefile.

all:

tool: tool.c tool.h
cc -o $@ $<

frob.dat: frob.in tool
./tool $< > $@

all: tool frob.dat

clean::
rm -f tool frob.dat

Figure 32: In the corrected Makefile, the frob.dat file depends on the tool that produces it.

Note that in Figure 32, the “all” target’s prerequisite list may be minimized to contain only the frob.dat file. This is because the frob.dat file depends on the tool, so both artifacts are guaranteed to exist and be up to date after the “all” target builds. However, there is no harm in including the tool in the “all” target’s prerequisite list. It is, in fact, a best practice to make the “all” target depend on all of the shippable artifacts produced by the Makefile.

Figure 33 illustrates the same concept, using a compiled binary and static library rather than a tool and a data file. Here, a file named libfoo.c is compiled to produce a static library named libfoo.a, and a tool named foo is compiled from its source file foo.c and is linked with the static library.

all:

%.o: %.c
cc -c -o $@ $<

%.a: %.o
rm -f $@
ar rcv $@ $<
ranlib $@

foo.o: libfoo.h

foo: foo.o
cc -o $@ $< -L. -lfoo

all: libfoo.a foo

clean::
rm -f foo libfoo.a foo.o

Figure 33: The "all" target incorrectly assumes that libfoo.a is made before the foo executable binary.

In this case, the first run of the “make all” command will successfully produce both the libfoo.a library and the foo executable. If the libfoo.c file is modified and “make all” is run again, only the static library is built and the executable is skipped. Running “make foo” is also a non-operation. The user is therefore inclined to run “make clean” and then “make all” to get a correct build. The root cause is an undeclared prerequisite: The library is omitted from the executable’s prerequisite list. Figure 34 shows the corrected Makefile.

all:

%.o: %.c
cc -c -o $@ $<

%.a: %.o
rm -f $@
ar rcv $@ $<
ranlib $@

foo.o: libfoo.h

foo: foo.o libfoo.a
cc -o $@ $< -L. -lfoo

all: libfoo.a foo

clean::
rm -f foo libfoo.a foo.o

Figure 34: In the corrected Makefile, the foo binary executable depends on the libfoo.a library.

Standardize Target and Prerequisite Paths

Some variations of Make are a little dumb when building the dependency graph in the sense that each target must be a unique path. Furthermore, each occurrence of any given file appearing as a target or in a prerequisite list must be represented by the identical path in the filesystem. For example, the distinct filesystem paths “foo” and “./foo” identify the same file, but to some variations of Make they represent two different nodes in the dependency graph. This can confuse Make’s dependency analysis and produce incorrect results.

It is therefore necessary to identify every file within the Makefile using a single path. Removing components such as “.” and “..” from the middles of target and prerequisite paths is important, as is resolving symbolic links. For example, the path “../obj/lib/foo/../bar.a” should be minimized to “../obj/lib/bar.a” consistently throughout the Makefile.

By convention, paths to files located within the source tree should be converted to relative paths. References to files outside of the source tree (such as system-supplied header files and libraries) are represented by full paths. This is so that the Makefile will continue to work even if the root of the source tree is renamed.

Minimize Prerequisite Lists

The Makefile in Figure 35 creates two libraries and two executables. Object files and archive libraries are built by pattern rules and are discovered through the dependency graph. Abstractions for libraries and executable binaries are defined using macros. Executables are built explicitly due to their unique linker instructions.

all:

libs:

%.a: %.o
rm -f $@
ar rcv $@ $<
ranlib $@

%.o: %.c
cc -c -o $@ $<

LIBS = lib1.a lib2.a
PROGS = prog1 prog2

libs: $(LIBS)

all: $(PROGS) $(LIBS)

prog1.o: lib1.h lib2.h

prog1: prog1.o $(LIBS)
cc -o $@ $< -L. -l1 -l2

prog2.o: lib2.h

prog2: prog2.o $(LIBS)
cc -o $@ $< -L. -l2

clean::
rm -r $(PROGS) $(LIBS) *.o

Figure 35: Prerequisite list of prog2 is not minimal, causing it to be incorrectly built when lib1.c changes.

In Figure 35, the prerequisite list of prog2 is not minimal: The prerequisite list includes the lib1.a library even though it is never used by the recipe. This causes prog2 to be needlessly re-linked whenever the lib1.c file changes.

To optimize the dependency graph to avoid this condition, list the required libraries explicitly in the prerequisite list. Figure 36 corrects the dependency graph accordingly.

all:

libs:

%.a: %.o
rm -f $@
ar rcv $@ $<
ranlib $@

%.o: %.c
cc -c -o $@ $<

LIBS = lib1.a lib2.a
PROGS = prog1 prog2

libs: $(LIBS)

all: $(PROGS) $(LIBS)

prog1.o: lib1.h lib2.h

prog1: prog1.o lib1.a lib2.a
cc -o $@ $< -L. -l1 -l2

prog2.o: lib2.h

prog2: prog2.o lib2.a
cc -o $@ $< -L. -l2

clean::
rm -r $(PROGS) $(LIBS) *.o

Figure 36: Minimizing prog2's prerequisite list avoids unnecessary rebuilds.

Avoid Using Phony Targets As Prerequisites

Phony targets are targets that, when built, leave no corresponding artifacts in the filesystem. Oftentimes they are top-level targets that a user can build for convenience rather than listing out the names of the artifacts explicitly on Make’s command line. This makes them attractive as abstractions for portions of the dependency graph, to be used as checkpoints in the course of a large build. But because they do not identify actual artifacts in the filesystem, they have no timestamps to compare to their prerequisites. Consequently, phony targets are always out of date and are always rebuilt when they are encountered in the dependency graph. That also means that every rule listing a phony target in its prerequisite list is also always out of date and always gets rebuilt. Figure 37 shows an example.

all:

libs:

%.a: %.o
rm -f $@
ar rcv $@ $<
ranlib $@

%.o: %.c
cc -c -o $@ $<

LIBS = lib1.a lib2.a
PROGS = prog1 prog2

libs: $(LIBS)

all: $(PROGS)

prog1: prog1.o libs
cc -o $@ $< -L. -l1 -l2

prog2: prog2.o libs
cc -o $@ $< -L. -l2

clean::
rm -r $(PROGS) $(LIBS) *.o

Figure 37: Targets prog1 and prog2 depend on the libs phony target.

In this case, both prog1 and prog2 depend on the libs target, which in turn depends on all of the libraries. This will guarantee that the libraries are all up to date before the executables are linked. However, the libs phony target is always out of date so the executables are always relinked whenever they are encountered in the dependency graph, even if they are up to date with respect to the libraries.

To correct this condition, replace the phony targets appearing in prerequisite lists with real artifacts. Figure 38 shows the corrected Makefile.

all:

libs:

%.a: %.o
rm -f $@
ar rcv $@ $<
ranlib $@

%.o: %.c
cc -c -o $@ $<

LIBS = lib1.a lib2.a
PROGS = prog1 prog2

libs: $(LIBS)

all: $(PROGS) $(LIBS)

prog1: prog1.o lib1.a lib2.a
cc -o $@ $< -L. -l1 -l2

prog2: prog2.o lib2.a
cc -o $@ $< -L. -l2

clean::
rm -r $(PROGS) $(LIBS) *.o

Figure 38: Replacing phony targets as prerequisites with actual artifacts as prerequisites.

In general, it is best to limit the use of phony targets to top-level targets that the user can specify on the command line. Phony targets can be used as prerequisites for other phony targets, but they should be avoided as prerequisites for actual artifacts unless the artifact must be rebuilt independently of source code changes. In this case, the derived artifact should be checked for idempotence to limit the impact of the phony target prerequisite.

Avoid Using Directories As Targets and Prerequisites

When using Make to build a project, there is an assumption that targets are only modified by Make. Because Make depends on timestamps of artifacts identified to it as targets and prerequisites, changing the timestamps outside of the context of a running Make confuses it. Touching a target, for example, causes Make to skip building that target until one of its prerequisites is also touched (or the target is removed).

Timestamps of directories change as side effects of manipulating their contents. For example, deleting a file will update a directory’s timestamp, as will creating an empty sub-directory. This happens often enough that directories should not be used as proxy targets for other artifacts.
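This side effect is easy to observe. The sketch below uses a scratch directory and made-up dates: it backdates a directory, creates a reference file that is newer than the directory, and then deletes a file inside the directory, as a "clean" target might.

```shell
#!/bin/sh
box=$(mktemp -d)
mkdir "$box/include"
: > "$box/include/foo.h"

# Backdate the directory, and create a reference file that is newer
# than the directory (both dates are arbitrary).
touch -t 200001010000 "$box/include"
touch -t 200101010000 "$box/ref"

# The directory is older than the reference file, so find prints nothing.
before=$(find "$box/include" -prune -newer "$box/ref")

# Deleting a file inside the directory modifies the directory itself.
rm -f "$box/include/foo.h"

# The directory is now newer than the reference file, so find prints it.
after=$(find "$box/include" -prune -newer "$box/ref")

echo "before: '$before'"
echo "after:  '$after'"
```

From Make's point of view the directory now looks freshly built, even though the only thing that happened to it was a deletion.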

In the example of Figure 39, a directory named “src” contains original source code. A Makefile target called “include” copies header files from the src directory to a sibling directory named “include”. The directory and target are the same, so that the timestamp of the directory determines whether or not the target is up to date. A C source file is compiled, using header files found in the "include" directory. The executable also depends on the "include" target so that the header files are brought up to date before it compiles.

all:

include:

clean::

include:
mkdir -p $(@D)
cp -f $< $@

bin/%: src/%.c include
mkdir -p $(@D)
cc -Iinclude -o $@ $<

include: src/foo.h

all: bin/foo

clean::
rm -f bin/* include/*

Figure 39: Using a directory as a target causes unexpected results.

The problem with this Makefile is that making the “clean” target also modifies the timestamp of the "include" directory, bringing the "include" target up to date. This causes subsequent compilations to fail due to the absence of header files in the "include" directory, even though the timestamp on the directory is newer than the headers that should be copied into it.

Because the "include" directory represents all of the header files of the project, every compiled binary (in theory) depends on every header file from Make’s point of view. That means that any change to any header causes every binary to recompile. This is probably not optimal because usually any given C source file includes only a subset of available header files.

It is a very rare case when using a directory as a target is the correct approach. Instead, a proper dependency analysis should be done using regular files as targets. The Makefile in Figure 40 models the executable binary correctly by adding header files to its prerequisite list. It also forces the "include" target by making it depend on a phony target.

all:

clean::

.FORCE:

include: .FORCE

include/%: src/%
	mkdir -p $(@D)
	cp -f $< $@

bin/%: src/%.c
	mkdir -p $(@D)
	cc -Iinclude -o $@ $<

include: include/foo.h

bin/foo: src/foo.c include/foo.h

all: bin/foo

clean::
	rm -f bin/* include/*

Figure 40: Header files are prerequisites of executable binaries, and the "include" target becomes a pure top-level target.

Delayed Macro Expansion

Makefile macro expansion is delayed until a macro is actually used. When Make initially reads the Makefile and constructs the dependency graph, macros appearing in targets and prerequisite lists are expanded before the rules begin executing. But macros appearing in recipes aren’t expanded until the recipes execute. This can lead to a condition in which a value appears to be used before it is set in the Makefile. Figure 41 is an example:

all: prog1 prog2

CCOPTIONS=-g

prog1: prog1.c
cc $(CCOPTIONS) -o $@ $<

CCOPTIONS=-O

prog2: prog2.c
cc $(CCOPTIONS) -o $@ $<

Figure 41: prog1 is compiled unexpectedly with optimization without debugging symbols due to the re-definition of the CCOPTIONS macro and deferred macro expansion.

Whenever prog1 compiles, it is compiled with optimization and without debugging symbols. This happens because the recipe is expanded only when the rule producing the prog1 binary executes, and by then the last assignment to the CCOPTIONS macro, made while the dependency graph was being constructed, selected the “-O” compiler option.

Delayed expansion can also cause discrepancies between the target and prerequisite list vs. the recipe. Figure 42 shows an example of a program that links incorrectly.

all: prog1 prog2
libs: lib1.a lib2.a

%.o: %.c
	cc -c -o $@ $<

%.a: %.o
	rm -f $@
	ar rcv $@ $<
	ranlib $@

%: %.c
	cc -o $@ $< $(LIBS)

LIBS=lib1.a lib2.a

prog1: prog1.c lib1.h lib2.h $(LIBS)

LIBS=lib2.a

prog2: prog2.c lib2.h $(LIBS)

clean::
	rm -f prog1 prog2 *.a *.o

Figure 42: prog1 links incorrectly due to the re-definition of the LIBS macro and deferred macro expansion, even though its prerequisite list is correct.

In this case, the prerequisite list of prog1 includes both lib1.a and lib2.a because that part of the dependency graph is computed while the $(LIBS) macro lists both of the libraries. While Make continues to build the dependency graph, the LIBS macro is reassigned to be a shorter list of libraries. When the prog1 compilation recipe runs (after the dependency graph is completed), $(LIBS) contains only the lib2.a library, causing prog1 to link incorrectly. The prog2 program builds correctly because the $(LIBS) macro has been set to the correct value before prog2’s node in the dependency graph is computed, and it remains unchanged until that recipe executes.

Some variations of Make have the ability to define macros with a per-rule scope. But the best approach to correct this condition is to define each macro once, before writing the prerequisite lists and recipes that use it. Figure 43 corrects the errors by defining multiple macros that name the proper sets of libraries to link with the executables, and it reorders the Makefile so that macros are defined near the top.

all:

libs:

LIB2=lib2.a
LIBS=lib1.a $(LIB2)

libs: $(LIBS)

%.o: %.c
	cc -c -g -o $@ $<

%.a: %.o
	rm -f $@
	ar rcv $@ $<
	ranlib $@

prog1: prog1.c lib1.h lib2.h $(LIBS)
	cc -O -o $@ $< $(LIBS)

prog2: prog2.c lib2.h $(LIB2)
	cc -O -o $@ $< $(LIB2)

all: prog1 prog2 $(LIBS)

clean::
	rm -f prog1 prog2 *.a *.o

Figure 43: Macros are defined near the top of the Makefile to avoid problems with deferred expansion.
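
The per-rule scope mentioned above can be sketched for Make variants that support it. The following assumes GNU Make, where it takes the form of target-specific variable values. It reworks the examples of Figures 41 and 42 and is not part of the method recommended here. A target-specific value is visible only in recipes, not in prerequisite lists, so the libraries are still named by ordinary macros that are defined once, before the rules. LIBS1 and LIBS2 are hypothetical names.

```make
# Assumes GNU Make (target-specific variable values).
# LIBS1 and LIBS2 are hypothetical names, each assigned exactly once.
LIBS1=lib1.a lib2.a
LIBS2=lib2.a

all: prog1 prog2

%.o: %.c
	cc -c -o $@ $<

%.a: %.o
	rm -f $@
	ar rcv $@ $<
	ranlib $@

# The recipe sees whichever values are scoped to the target being built.
%: %.c
	cc $(CCOPTIONS) -o $@ $< $(LIBS)

# Per-target values: these apply in recipes only, not in prerequisite lists.
prog1: CCOPTIONS=-g
prog1: LIBS=$(LIBS1)
prog2: CCOPTIONS=-O
prog2: LIBS=$(LIBS2)

prog1: prog1.c lib1.h lib2.h $(LIBS1)
prog2: prog2.c lib2.h $(LIBS2)
```

Because GNU Make also applies a target-specific value while building that target's prerequisites, this form is best kept to macros that only the final recipe consults.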

Tracing and Logging

Make normally logs to the standard output stream the recipes that it runs to produce each target. In Figure 44, the pattern rules might be used to compile and link C programs. In real Makefiles, the commands encoded into recipes can be quite long and difficult to read.

BINDIR = ../bin/
INCLUDES = -I/usr/local/include
LIBDIRS = -L/usr/local/lib
LIBS = -lmalloc

$(BINDIR)%: %.o
	test -d $(BINDIR) || mkdir -p $(BINDIR)
	cc -o $@ $(INCLUDES) $(LIBDIRS) $< $(LIBS)

%.o: %.c
	cc -M $(INCLUDES) $< > $*.d
	cc -c -o $@ $(INCLUDES) $<

Figure 44: Commands invoked in recipes are logged by default and can be difficult to read.

The shell commands coded into the recipes of the pattern rules can overwhelm interactive users. As a result, the recipes might be modified to simply show the names of the files being compiled and linked, along with the compiler's diagnostic messages, as shown in Figure 45.

BINDIR = ../bin/
INCLUDES = -I/usr/local/include
LIBDIRS = -L/usr/local/lib
LIBS = -lmalloc

$(BINDIR)%: %.o
	@echo "Linking $@"
	@test -d $(BINDIR) || mkdir -p $(BINDIR)
	@cc -o $@ $(INCLUDES) $(LIBDIRS) $< $(LIBS)

%.o: %.c
	@echo "Compiling $@"
	@cc -M $(INCLUDES) $< > $*.d
	@cc -c -o $@ $(INCLUDES) $<

Figure 45: Make's usual display of recipe commands is replaced by progress messages.

On the other hand, software build engineers invoke Make through automation and don’t need short progress messages. Instead, they want to log all of the commands that were actually executed to produce each target, so that they can review the full log after the fact to debug problems that might arise during the build. Sometimes the delay between the build and the analysis can be significant, perhaps long enough for the rules to change in the source code so that they no longer match the build logs. This limits the value of abbreviated logs in this use case.

In Figure 46, the shell’s "set -x" option traces only the commands that the shell actually executes. As a result, the mkdir command is shown only when it actually runs (when the first executable is linked), and not for every executable. These pattern rules perform the same actions but log the actions more appropriately for automated builds.

BINDIR = ../bin/
INCLUDES = -I/usr/local/include
LIBDIRS = -L/usr/local/lib
LIBS = -lmalloc

$(BINDIR)%: %.o
	@echo "Linking $@"
	@set -x; test -d $(BINDIR) || mkdir -p $(BINDIR)
	cc -o $@ $(INCLUDES) $(LIBDIRS) $< $(LIBS)

%.o: %.c
	@echo "Compiling $@"
	cc -M $(INCLUDES) $< > $*.d
	cc -c -o $@ $(INCLUDES) $<

Figure 46: Actual commands invoked by the recipes are logged by build automation.

To solve this conflict of requirements, use macros to set the proper level of output. The $(Q) macro is customarily used to conditionally silence simple commands. Use the $(QS) macro for complex commands that use control structures. Define the macros to assume default values suitable for the developers and other users who invoke Make interactively. Additional macros $(Q_B) and $(QS_B) are the corresponding values that are suitable for build engineers. Figure 47 shows how to use macros to control the verbosity of Make's log output.

BINDIR = ../bin/
INCLUDES = -I/usr/local/include
LIBDIRS = -L/usr/local/lib
LIBS = -lmalloc

Q = @
QS = @
Q_B =
QS_B = @set -x;

$(BINDIR)%: %.o
	@echo "Linking $@"
	$(QS) test -d $(BINDIR) || mkdir -p $(BINDIR)
	$(Q) cc -o $@ $(INCLUDES) $(LIBDIRS) $< $(LIBS)

%.o: %.c
	@echo "Compiling $@"
	$(Q) cc -M $(INCLUDES) $< > $*.d
	$(Q) cc -c -o $@ $(INCLUDES) $<

Figure 47: The $(Q) and $(QS) macros control verbosity of Make's log.

Developers can invoke Make interactively and use the default settings of the Q and QS macros to see the amount of verbosity that they want. Build automation can use Make’s command line to override the definitions of those macros, as shown in Figure 48. The macros expand as desired because Make defers their expansion until the moment the recipes run. In this use case, the Makefile defines the level of verbosity so that the build automation need not be aware of the actual values of the macros.

make 'Q=$(Q_B)' 'QS=$(QS_B)'

Figure 48: Command line enabling verbose logging.
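
Build automation that prefers a single switch over overriding two macros could be accommodated as follows. This is a sketch that assumes GNU Make's conditional directives. The VERBOSE macro name is hypothetical and is not part of the method described above.

```make
Q = @
QS = @
Q_B =
QS_B = @set -x;

# VERBOSE is a hypothetical switch: "make VERBOSE=1" selects the
# build engineer's values without naming Q_B or QS_B on the command line.
ifeq ($(VERBOSE),1)
Q = $(Q_B)
QS = $(QS_B)
endif
```

If these lines replace the macro definitions in the Makefile of Figure 47, "make VERBOSE=1" should log the same commands as the invocation in Figure 48, while a plain "make" keeps the defaults intended for interactive users.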

Build engineers can also embed messages to be seen only in their logs by using the shell’s built-in “:” (colon) command. This command is functionally equivalent to the /bin/true command (i.e., it’s a non-operation exiting with a successful status). Make logs it like any other command. Combining it with the $(Q) macro and supplying a message as its command line argument produces a message that appears only in the build engineer’s logs. Figure 49 is an example that can be used in a Makefile.

all:
	$(Q): "The $@ target is completed"

Figure 49: Embedding messages in a verbose log.

Summary

Programs written in the Make programming language (i.e., Makefiles) are notoriously prone to error. Typical negative experiences include rebuilding artifacts that are already up to date, or failing to build artifacts that are out of date, or building artifacts in an unexpected order (thus producing incorrect build results). The root causes of most errors fall into one of a small number of general categories:

Missing edges in the dependency graph cause expected work to be skipped
- Undeclared prerequisites, often referring to side effects of other rules
- Assumption that artifacts are built in the order they are listed in prerequisite lists
Unneeded edges in the dependency graph cause extra work to be done
Rules and artifacts with incorrect timestamps yield incomplete or suboptimal scheduling
- Recipes yielding idempotent artifacts
- Artifacts are identified by multiple paths
- Phony targets appearing in prerequisite lists
- Directories used as prerequisites or targets
Deferred expansion of macros causes incorrect scheduling or recipes

Following a review of Make’s features, these failure modes were demonstrated with examples, and solutions to them were presented.

In some projects, the source tree can be examined to identify artifacts to be built. This discovery is done programmatically rather than by listing each artifact in the Makefile explicitly. Automatic removal of artifacts is also possible by comparing the set of desired artifacts to the set of existing artifacts, and removing the difference. Methods to perform auto-discovery and auto-deletion were presented, with examples.

Make's method to model the build assumes that each step of a build produces exactly one file. However, there are many situations in which a recipe can produce multiple files. Output files can have fixed names, predictable names, or unpredictable names. Each of these cases was presented with examples.

Software developers and build engineers have different requirements of the logs produced by Make. A method for selecting the verbosity level of logging and tracing was presented. The method satisfies the needs of developers by default, but can be overridden by build automation to meet the requirements of the build engineer.

The methods presented here have been used in practice. The resulting Makefiles have increased the reliability and correctness of software builds based on Make, and have reduced the unnecessary work that was done before these methods were employed.