GCC supports a number of command-line options that control adding run-time instrumentation to the code it normally generates. For example, one purpose of instrumentation is to collect profiling statistics for use in finding program hot spots, code coverage analysis, or profile-guided optimizations. Another class of program instrumentation is adding run-time checking to detect programming errors like invalid pointer dereferences or out-of-bounds array accesses, as well as deliberately hostile attacks such as stack smashing or C++ vtable hijacking. There is also a general hook which can be used to implement other forms of tracing or function-level instrumentation for debugging or program analysis purposes.
-p
-pg
You can use the function attribute no_instrument_function to suppress profiling of individual functions when compiling with these options.
See Common Function Attributes.
-fprofile-arcs
When the compiled program exits it saves this data to a file called auxname.gcda for each source file. The data may be used for profile-directed optimizations (-fbranch-probabilities), or for test coverage analysis (-ftest-coverage). Each object file's auxname is generated from the name of the output file, if explicitly specified and it is not the final executable, otherwise it is the basename of the source file. In both cases any suffix is removed (e.g. foo.gcda for input file dir/foo.c, or dir/foo.gcda for output file specified as -o dir/foo.o).
Note that if a command line directly links source files, the corresponding .gcda files will be prefixed with the unsuffixed name of the output file. E.g. gcc a.c b.c -o binary would generate binary-a.gcda and binary-b.gcda files.
See Cross-profiling.
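The auxname rule for separately compiled object files can be sketched as a small helper. This is only an illustration of the rule as worded above, not GCC's own code: the function name gcda_name, its parameters, and the assumption that '/' is the directory separator are all hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Builds the .gcda name into buf.  `output` is the explicit -o name of
   an object file, or NULL when none was given.  Hypothetical helper;
   assumes '/' is the directory separator.  */
static void
gcda_name (const char *source, const char *output, char *buf, size_t size)
{
  const char *name = output;
  if (name == NULL)
    {
      /* No -o: use the basename of the source file.  */
      const char *base = strrchr (source, '/');
      name = base ? base + 1 : source;
    }

  /* Drop the suffix, but only if the dot belongs to the last path
     component.  */
  const char *slash = strrchr (name, '/');
  const char *dot = strrchr (name, '.');
  size_t stem = strlen (name);
  if (dot != NULL && (slash == NULL || dot > slash))
    stem = (size_t) (dot - name);

  snprintf (buf, size, "%.*s.gcda", (int) stem, name);
}
```

With source dir/foo.c and no output name the helper produces foo.gcda; with the output name dir/foo.o it produces dir/foo.gcda, matching the two cases in the text.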
--coverage
fork calls are detected and correctly handled without double counting.
Moreover, an object file can be recompiled multiple times and the corresponding .gcda file merges as long as the source file and the compiler options are unchanged.
With -fprofile-arcs, for each function of your program GCC
creates a program flow graph, then finds a spanning tree for the graph.
Only arcs that are not on the spanning tree have to be instrumented: the
compiler adds code to count the number of times that these arcs are
executed. When an arc is the only exit or only entrance to a block, the
instrumentation code can be added to the block; otherwise, a new basic
block must be created to hold the instrumentation code.
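The saving from the spanning tree can be illustrated with a little arithmetic. The sketch below is not GCC's implementation. It assumes a connected flow graph and uses a hypothetical if/else diamond (blocks A, B, C, D) with an assumed closing arc from D back to A; all names are invented for the example.

```c
#include <stddef.h>

/* Counters needed for a connected flow graph: a spanning tree over
   `blocks` nodes uses blocks - 1 arcs; every other arc gets a counter.
   Assumes blocks >= 1 and arcs >= blocks - 1.  */
static size_t
instrumented_arc_count (size_t blocks, size_t arcs)
{
  return arcs - (blocks - 1);
}

/* Hypothetical if/else diamond A->B, A->C, B->D, C->D plus an assumed
   closing arc D->A.  Only A->C and D->A are counted; the three tree
   arcs follow from "flow in == flow out" at each block.  */
struct diamond_counts
{
  unsigned long a_to_b, b_to_d, c_to_d;
};

static struct diamond_counts
derive_tree_arcs (unsigned long d_to_a, unsigned long a_to_c)
{
  struct diamond_counts r;
  r.a_to_b = d_to_a - a_to_c; /* block A: in (D->A) = A->B + A->C */
  r.b_to_d = r.a_to_b;        /* block B: one arc in, one arc out */
  r.c_to_d = a_to_c;          /* block C: one arc in, one arc out */
  return r;
}
```

For this diamond, four blocks and five arcs need only two counters; the remaining three arc counts are derived after the run.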
-ftest-coverage
-fprofile-abs-path
-fprofile-dir=path
When an executable is run in a massively parallel environment, it is recommended to save the profile data to different folders. That can be done with variables in path that are expanded at run time:
%p
%q{VAR}
-fprofile-generate
-fprofile-generate=path
The following options are enabled: -fprofile-arcs, -fprofile-values, -finline-functions, and -fipa-bit-cp.
If path is specified, GCC looks at the path to find the profile feedback data files. See -fprofile-dir.
To optimize the program based on the collected profile information, use
-fprofile-use. See Optimize Options, for more information.
-fprofile-info-section
-fprofile-info-section=name
The section name defaults to .gcov_info. A pointer to the profile information generated by -fprofile-arcs is placed in the specified section for each translation unit. This option disables the profile information registration through a constructor and it disables the profile information processing through a destructor. This option is not intended to be used in hosted environments such as GNU/Linux. It targets free-standing environments (for example embedded systems) with limited resources which do not support constructors/destructors or the C library file I/O.
The linker could collect the input sections in a contiguous memory block and define start and end symbols. A GNU linker script example which defines a linker output section follows:
.gcov_info :
{
  PROVIDE (__gcov_info_start = .);
  KEEP (*(.gcov_info))
  PROVIDE (__gcov_info_end = .);
}
The program could dump the profiling information registered in this linker set for example like this:
#include <gcov.h>
#include <stdio.h>
#include <stdlib.h>

extern const struct gcov_info *__gcov_info_start[];
extern const struct gcov_info *__gcov_info_end[];

static void
filename (const char *f, void *arg)
{
  puts (f);
}

static void
dump (const void *d, unsigned n, void *arg)
{
  const unsigned char *c = d;

  for (unsigned i = 0; i < n; ++i)
    printf ("%02x", c[i]);
}

static void *
allocate (unsigned length, void *arg)
{
  return malloc (length);
}

static void
dump_gcov_info (void)
{
  const struct gcov_info **info = __gcov_info_start;
  const struct gcov_info **end = __gcov_info_end;

  /* Obfuscate variable to prevent compiler optimizations.  */
  __asm__ ("" : "+r" (info));

  while (info != end)
  {
    void *arg = NULL;
    __gcov_info_to_gcda (*info, filename, dump, allocate, arg);
    putchar ('\n');
    ++info;
  }
}

int
main ()
{
  dump_gcov_info ();
  return 0;
}
-fprofile-note=path
-fprofile-prefix-path=path
-fprofile-prefix-map=old=new
-fprofile-update=method
Warning: When an application does not properly join all threads (or creates a detached thread), a profile file can still be corrupted.
`prefer-atomic' is transformed either to `atomic', when supported by the target, or to `single' otherwise. The GCC driver automatically selects `prefer-atomic' when -pthread is present on the command line.
-fprofile-filter-files=regex
For example, -fprofile-filter-files=main\.c;module.*\.c will instrument only main.c and all C files starting with 'module'.
-fprofile-exclude-files=regex
For example, -fprofile-exclude-files=/usr/.* will prevent instrumentation of all files that are located in the /usr/ folder.
-fprofile-reproducible=[multithreaded|parallel-runs|serial]
Controls the level of reproducibility of the profile gathered by -fprofile-generate. This makes it possible to rebuild a program with the same outcome, which is useful, for example, for distribution packages.
With -fprofile-reproducible=serial the profile gathered by
-fprofile-generate is reproducible provided the trained program
behaves the same at each invocation of the train run, it is not
multi-threaded and profile data streaming is always done in the same
order. Note that profile streaming happens at the end of the program run but also before the fork function is invoked.
Note that it is quite common that execution counts of some parts of programs depend, for example, on the length of temporary file names or memory space randomization (that may affect hash-table collision rate). Such non-reproducible parts of programs may be annotated with the no_instrument_function function attribute. gcov-dump with -l can be used to dump gathered data and verify that they are indeed reproducible.
With -fprofile-reproducible=parallel-runs the collected profile stays reproducible regardless of the order of streaming of the data into gcda files. This setting makes it possible to run multiple instances of an instrumented program in parallel (such as with make -j). This reduces the quality of gathered data, in particular of indirect call profiling.
-fsanitize=address
When the sanitizer's run-time options include help=1, the available options are shown at startup of the instrumented program. See https://github.com/google/sanitizers/wiki/AddressSanitizerFlags#run-time-flags for a list of supported options.
The option cannot be combined with -fsanitize=thread or -fsanitize=hwaddress. Note that -fsanitize=hwaddress is currently supported only on AArch64.
-fsanitize=kernel-address
-fsanitize=hwaddress
When the sanitizer's run-time options include help=1, the available options are shown at startup of the instrumented program.
The option cannot be combined with -fsanitize=thread or -fsanitize=address, and is currently only available on AArch64.
-fsanitize=kernel-hwaddress
Note: This option has different defaults from -fsanitize=hwaddress. Instrumenting the stack and alloca calls is not on by default but is still possible by specifying the command-line options --param hwasan-instrument-stack=1 and --param hwasan-instrument-allocas=1 respectively. Using a random frame tag is not implemented for kernel instrumentation.
-fsanitize=pointer-compare
Add detect_invalid_pointer_pairs=2 to the environment variable ASAN_OPTIONS. Using detect_invalid_pointer_pairs=1 detects invalid operations only when both pointers are non-null.
-fsanitize=pointer-subtract
Add detect_invalid_pointer_pairs=2 to the environment variable ASAN_OPTIONS. Using detect_invalid_pointer_pairs=1 detects invalid operations only when both pointers are non-null.
-fsanitize=shadow-call-stack
Currently it only supports the AArch64 platform. It is specifically designed for Linux kernels that enable the CONFIG_SHADOW_CALL_STACK option. For user-space programs, runtime support is not currently provided in libc and libgcc. Users who want to use this feature in user space need to provide their own runtime support. Note that this may break the ABI rules.
On AArch64, the instrumentation makes use of the platform register x18. This generally means that any code that may run on the same thread as code compiled with ShadowCallStack must be compiled with the flag -ffixed-x18, otherwise functions compiled without -ffixed-x18 might clobber x18 and so corrupt the shadow stack pointer.
Also, because there is no userspace runtime support, code compiled with ShadowCallStack cannot use exception handling. Use -fno-exceptions to turn off exceptions.
See https://clang.llvm.org/docs/ShadowCallStack.html for more
details.
-fsanitize=thread
Note that sanitized atomic builtins cannot throw exceptions when
operating on invalid memory addresses with non-call exceptions
(-fnon-call-exceptions).
-fsanitize=leak
malloc and other allocator functions. See https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer for more details. The run-time behavior can be influenced using the LSAN_OPTIONS environment variable.
The option cannot be combined with -fsanitize=thread.
-fsanitize=undefined
-fsanitize=shift
-fsanitize=shift-exponent
-fsanitize=shift-base
-fsanitize=integer-divide-by-zero
-fsanitize=unreachable
Turns the __builtin_unreachable call into a diagnostic message call instead. When reaching the __builtin_unreachable call, the behavior is undefined.
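A typical place where __builtin_unreachable appears is after a switch that is believed to cover every case. The sketch below is only an illustration; the enum and function names are hypothetical. If a caller ever passes a value outside the enumeration, the program reaches the built-in, which is the situation this sanitizer is meant to report.

```c
enum color { RED, GREEN, BLUE };

static const char *
color_name (enum color c)
{
  switch (c)
    {
    case RED:
      return "red";
    case GREEN:
      return "green";
    case BLUE:
      return "blue";
    }
  /* The caller promises that c is a valid enumerator; reaching this
     point means that promise was broken.  */
  __builtin_unreachable ();
}
```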
-fsanitize=vla-bound
-fsanitize=null
-fsanitize=return
-fsanitize=signed-integer-overflow
Checks that the result of +, *, and both unary and binary - does not overflow in signed arithmetic. This also detects INT_MIN / -1 signed division. Note that integer promotion rules must be taken into account. That is, the following is not an overflow:
signed char a = SCHAR_MAX;
a++;
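The effect of integer promotion can be made visible with a small sketch. It assumes that int is wider than signed char, as on common targets, and uses the GCC built-in __builtin_add_overflow to test for overflow without actually triggering it; the function names are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* a + 1 is computed in int after promotion, so it cannot overflow for
   a signed char operand (assuming int is wider than char).  Storing
   the result back into a signed char would be a conversion, not an
   arithmetic overflow.  */
static int
promoted_increment (signed char a)
{
  return a + 1;
}

/* The same increment on an int operand can overflow.
   __builtin_add_overflow reports that without invoking undefined
   behavior.  */
static bool
int_increment_overflows (int a)
{
  int result;
  return __builtin_add_overflow (a, 1, &result);
}
```

Here promoted_increment (SCHAR_MAX) simply yields SCHAR_MAX + 1 as an int, while int_increment_overflows (INT_MAX) reports the kind of overflow this sanitizer diagnoses.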
-fsanitize=bounds
-fsanitize=bounds-strict
-fsanitize=alignment
-fsanitize=object-size
__builtin_object_size function. Various out-of-bounds pointer accesses are detected.
-fsanitize=float-divide-by-zero
-fsanitize=float-cast-overflow
FE_INVALID exceptions enabled.
-fsanitize=nonnull-attribute
nonnull function attribute.
-fsanitize=returns-nonnull-attribute
returns_nonnull function attribute, to detect returning of null values from such functions.
-fsanitize=bool
-fsanitize=enum
-fsanitize=vptr
-fsanitize=pointer-overflow
-fsanitize=builtin
For example, passing 0 as the argument to __builtin_ctz or __builtin_clz invokes undefined behavior and is diagnosed by this option.
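Code that may receive a zero argument can guard the built-in explicitly. The wrapper below is a minimal sketch; the name safe_ctz and the choice of returning the bit width for zero are assumptions made for the example, not something GCC prescribes.

```c
#include <limits.h>

/* Count trailing zero bits, returning the bit width of unsigned int
   for x == 0 instead of calling __builtin_ctz (0), which is undefined.
   The fallback value is an arbitrary choice made for this sketch.  */
static int
safe_ctz (unsigned int x)
{
  if (x == 0)
    return (int) (sizeof (unsigned int) * CHAR_BIT);
  return __builtin_ctz (x);
}
```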
While -ftrapv causes traps for signed overflows to be emitted,
-fsanitize=undefined gives a diagnostic message.
This currently works only for the C family of languages.
-fno-sanitize=all
-fasan-shadow-offset=number
-fsanitize-sections=s1,s2,...
-fsanitize-recover[=opts]
Currently this feature only works for -fsanitize=undefined (and its suboptions except for -fsanitize=unreachable and -fsanitize=return), -fsanitize=float-cast-overflow, -fsanitize=float-divide-by-zero, -fsanitize=bounds-strict, -fsanitize=kernel-address and -fsanitize=address. For these sanitizers error recovery is turned on by default, except -fsanitize=address, for which this feature is experimental. -fsanitize-recover=all and -fno-sanitize-recover=all are also accepted; the former enables recovery for all sanitizers that support it, the latter disables recovery for all sanitizers that support it.
Even if a recovery mode is enabled on the compiler side, it also needs to be enabled on the runtime library side, otherwise the failures are still fatal. The runtime library defaults to halt_on_error=0 for ThreadSanitizer and UndefinedBehaviorSanitizer, while the default value for AddressSanitizer is halt_on_error=1. This can be overridden by setting the halt_on_error flag in the corresponding environment variable.
Syntax without an explicit opts parameter is deprecated. It is equivalent to specifying an opts list of:
undefined,float-cast-overflow,float-divide-by-zero,bounds-strict
-fsanitize-address-use-after-scope
-fsanitize-undefined-trap-on-error
__builtin_trap rather than a libubsan library routine. The advantage of this is that the libubsan library is not needed and is not linked in, so this is usable even in freestanding environments.
-fsanitize-coverage=trace-pc
__sanitizer_cov_trace_pc into every basic block.
-fsanitize-coverage=trace-cmp
__sanitizer_cov_trace_cmp1, __sanitizer_cov_trace_cmp2, __sanitizer_cov_trace_cmp4 or __sanitizer_cov_trace_cmp8 for integral comparison with both operands variable or __sanitizer_cov_trace_const_cmp1, __sanitizer_cov_trace_const_cmp2, __sanitizer_cov_trace_const_cmp4 or __sanitizer_cov_trace_const_cmp8 for integral comparison with one operand constant, __sanitizer_cov_trace_cmpf or __sanitizer_cov_trace_cmpd for float or double comparisons and __sanitizer_cov_trace_switch for switch statements.
-fcf-protection=[full|branch|return|none|check]
The value branch tells the compiler to implement checking of validity of control-flow transfer at the point of indirect branch instructions, i.e. call/jmp instructions. The value return implements checking of validity at the point of returning from a function. The value full is an alias for specifying both branch and return. The value none turns off instrumentation.
The value check is used for the final link with link-time optimization (LTO). An error is issued if LTO object files are compiled with different -fcf-protection values. The value check is ignored at compile time.
The macro __CET__ is defined when -fcf-protection is used. The first bit of __CET__ is set to 1 for the value branch and the second bit of __CET__ is set to 1 for the value return.
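The bit layout of __CET__ can be decoded with a couple of masks. The sketch below reads "first bit" as the least significant bit; the struct and function names are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>

struct cet_modes
{
  bool branch; /* indirect branch checking requested */
  bool ret;    /* return checking requested */
};

/* Decode a __CET__-style value: bit 0 is branch, bit 1 is return.  */
static struct cet_modes
decode_cet (unsigned value)
{
  struct cet_modes m;
  m.branch = (value & 1u) != 0;
  m.ret = (value & 2u) != 0;
  return m;
}

/* Report how this translation unit was compiled.  When the macro is
   not defined, -fcf-protection was not in effect.  */
static struct cet_modes
compiled_cet_modes (void)
{
#ifdef __CET__
  return decode_cet (__CET__);
#else
  return decode_cet (0);
#endif
}
```

Under this reading, a value of 3 corresponds to full, 1 to branch only, and 2 to return only.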
You can also use the nocf_check attribute to identify which functions and calls should be skipped from instrumentation (see Function Attributes).
Currently the x86 GNU/Linux target provides an implementation based on Intel Control-flow Enforcement Technology (CET) which works for i686 processors or newer.
-fharden-compares
__builtin_trap if the results do not match. Use with `-fharden-conditional-branches' to cover all conditionals.
-fharden-conditional-branches
__builtin_trap if the result is unexpected. Use with `-fharden-compares' to cover all conditionals.
-fstack-protector
alloca, and functions with buffers larger than or equal to 8 bytes. The guards are initialized when a function is entered and then checked when the function exits. If a guard check fails, an error message is printed and the program exits. Only variables that are actually allocated on the stack are considered; optimized-away variables or variables allocated in registers don't count.
-fstack-protector-all
-fstack-protector-strong
-fstack-protector-explicit
stack_protect attribute.
-fstack-check
Note that this switch does not actually cause checking to be done; the operating system or the language runtime must do that. The switch causes generation of code to ensure that they see the stack being extended.
You can additionally specify a string parameter: `no' means no checking, `generic' means force the use of old-style checking, `specific' means use the best checking method and is equivalent to bare -fstack-check.
Old-style checking is a generic mechanism that requires no specific target support in the compiler but comes with the following drawbacks:
Note that old-style stack checking is also the fallback method for `specific' if no target support has been added in the compiler.
`-fstack-check=' is designed for Ada's needs to detect infinite recursion
and stack overflows. `specific' is an excellent choice when compiling
Ada code. It is not generally sufficient to protect against stack-clash
attacks. To protect against those you want `-fstack-clash-protection'.
-fstack-clash-protection
Most targets do not fully support stack clash protection. However, on
those targets -fstack-clash-protection will protect dynamic stack
allocations. -fstack-clash-protection may also provide limited
protection for static stack allocations if the target supports
-fstack-check=specific.
-fstack-limit-register=reg
-fstack-limit-symbol=sym
-fno-stack-limit
For instance, if the stack starts at absolute address `0x80000000' and grows downwards, you can use the flags -fstack-limit-symbol=__stack_limit and -Wl,--defsym,__stack_limit=0x7ffe0000 to enforce a stack limit of 128KB. Note that this may only work with the GNU linker.
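The address passed to --defsym in that example is just the stack start minus the size budget. A quick check of the arithmetic, with a hypothetical helper name and 32-bit addresses assumed:

```c
#include <stdint.h>

/* For a stack that grows downwards, the lowest usable address is the
   stack start minus the size budget.  Hypothetical helper for
   checking the arithmetic; 32-bit addresses assumed.  */
static uint32_t
stack_limit_address (uint32_t stack_start, uint32_t size_bytes)
{
  return stack_start - size_bytes;
}
```

With a start of 0x80000000 and a budget of 128 * 1024 bytes (0x20000), the result is 0x7ffe0000, the value used above.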
You can locally override stack limit checking by using the no_stack_limit function attribute (see Function Attributes).
-fsplit-stack
When code compiled with -fsplit-stack calls code compiled
without -fsplit-stack, there may not be much stack space
available for the latter code to run. If compiling all code,
including library code, with -fsplit-stack is not an option,
then the linker can fix up these calls so that the code compiled
without -fsplit-stack always has a large stack. Support for
this is implemented in the gold linker in GNU binutils release 2.21
and later.
-fvtable-verify=[std|preinit|none]
This option causes run-time data structures to be built at program startup, which are used for verifying the vtable pointers. The options `std' and `preinit' control the timing of when these data structures are built. In both cases the data structures are built before execution reaches main. Using -fvtable-verify=std causes the data structures to be built after shared libraries have been loaded and initialized. -fvtable-verify=preinit causes them to be built before shared libraries have been loaded and initialized.
If this option appears multiple times in the command line with different
values specified, `none' takes highest priority over both `std' and
`preinit'; `preinit' takes priority over `std'.
-fvtv-debug
Note: This feature appends data to the log file. If you want a fresh log
file, be sure to delete any existing one.
-fvtv-counts
Note: This feature appends data to the log files. To get fresh log
files, be sure to delete any existing ones.
-finstrument-functions
(On some platforms, __builtin_return_address does not work beyond the current function, so the call site information may not be available to the profiling functions otherwise.)
void __cyg_profile_func_enter (void *this_fn, void *call_site);
void __cyg_profile_func_exit (void *this_fn, void *call_site);
The first argument is the address of the start of the current function, which may be looked up exactly in the symbol table.
This instrumentation is also done for functions expanded inline in other
functions. The profiling calls indicate where, conceptually, the
inline function is entered and exited. This means that addressable
versions of such functions must be available. If all your uses of a
function are expanded inline, this may mean an additional expansion of
code size. If you use extern inline in your C code, an
addressable version of such functions must be provided. (This is
normally the case anyway, but if you get lucky and the optimizer always
expands the functions inline, you might have gotten away without
providing static copies.)
A function may be given the attribute no_instrument_function, in
which case this instrumentation is not done. This can be used, for
example, for the profiling functions listed above, high-priority
interrupt routines, and any functions from which the profiling functions
cannot safely be called (perhaps signal handlers, if the profiling
routines generate output or allocate memory).
See Common Function Attributes.
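A minimal pair of profiling hooks might look like the following sketch. It assumes that writing to stderr is acceptable in the contexts where instrumented functions run, and it uses a hypothetical call_depth counter for indentation that is not thread-safe. Both hooks carry no_instrument_function so that they are not themselves instrumented.

```c
#include <stdio.h>

static int call_depth; /* hypothetical nesting counter for this sketch */

void __cyg_profile_func_enter (void *this_fn, void *call_site)
  __attribute__ ((no_instrument_function));
void __cyg_profile_func_exit (void *this_fn, void *call_site)
  __attribute__ ((no_instrument_function));

void
__cyg_profile_func_enter (void *this_fn, void *call_site)
{
  /* Indent by nesting depth, then record the function and caller.  */
  fprintf (stderr, "%*senter %p (called from %p)\n",
           2 * call_depth, "", this_fn, call_site);
  ++call_depth;
}

void
__cyg_profile_func_exit (void *this_fn, void *call_site)
{
  --call_depth;
  fprintf (stderr, "%*sexit  %p (back to %p)\n",
           2 * call_depth, "", this_fn, call_site);
}
```

The printed addresses can be mapped back to names with the symbol table, as noted above for the first argument.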
-finstrument-functions-exclude-file-list=file,file,...
For example:
-finstrument-functions-exclude-file-list=/bits/stl,include/sys
excludes any inline function defined in files whose pathnames contain /bits/stl or include/sys.
If, for some reason, you want to include the letter `,' in one of sym, write `\,'. For example, -finstrument-functions-exclude-file-list='\,\,tmp' (note the single quotes surrounding the option).
-finstrument-functions-exclude-function-list=sym,sym,...
The function name is matched in its user-visible form, such as vector<int> blah(const vector<int> &), not the internal mangled name (e.g., _Z4blahRSt6vectorIiSaIiEE). The match is done on substrings: if the sym parameter is a substring of the function name, it is considered to be a match. For C99 and C++ extended identifiers, the function name must be given in UTF-8, not using universal character names.
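The substring rule can be sketched with strstr. This is an illustration of the matching rule as described, not GCC's implementation; the function name excluded_by_list and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true when any entry of `syms` occurs as a substring of the
   user-visible function name.  Hypothetical helper for this sketch.  */
static bool
excluded_by_list (const char *function_name,
                  const char *const *syms, size_t nsyms)
{
  for (size_t i = 0; i < nsyms; ++i)
    if (strstr (function_name, syms[i]) != NULL)
      return true;
  return false;
}
```

Because the match is on substrings, a short sym such as "init" would also exclude functions like "reinitialize", so longer and more specific strings are usually the safer choice.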
-fpatchable-function-entry=N[,M]
M defaults to 0 so the function entry points to the address just at the first NOP.
The NOP instructions reserve extra space which can be used to patch in
any desired instrumentation at run time, provided that the code segment
is writable. The amount of space is controllable indirectly via
the number of NOPs; the NOP instruction used corresponds to the instruction
emitted by the internal GCC back-end interface gen_nop. This behavior
is target-specific and may also depend on the architecture variant and/or
other compilation options.
For run-time identification, the starting addresses of these areas,
which correspond to their respective function entries minus M,
are additionally collected in the __patchable_function_entries
section of the resulting binary.
Note that the value of __attribute__ ((patchable_function_entry (N,M))) takes precedence over command-line option -fpatchable-function-entry=N,M. This can be used to increase the area size or to remove it completely on a single function. If N=0, no pad location is recorded.
The NOP instructions are inserted at (and maybe before, depending on M) the function entry address, even before the prologue.
The maximum value of N and M is 65535.
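The per-function override can be sketched as follows. The function names and the values 4 and 2 are arbitrary choices for the example, and the attribute is only honored on targets that implement patchable function entries.

```c
/* Reserve 4 NOPs for this function, 2 of them before the entry point
   (N = 4, M = 2).  The values are arbitrary for this sketch.  */
__attribute__ ((patchable_function_entry (4, 2)))
int
traced_add (int a, int b)
{
  return a + b;
}

/* N = 0 removes the patch area for this one function even when the
   command line asks for one.  */
__attribute__ ((patchable_function_entry (0)))
int
untraced_add (int a, int b)
{
  return a + b;
}
```

The attribute only changes the padding around the entry point; both functions behave like ordinary additions when called.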