
Overview of Arm® Fortran Compiler (armflang)

This topic introduces Arm Fortran Compiler.

For more information on Arm Fortran Compiler, see the Arm Fortran Compiler Reference Guide or the Arm Fortran Compiler product web page.

Invoking Arm Fortran Compiler

To invoke Arm Fortran Compiler for preprocessing, compilation, assembly, and linking, use armflang.

To access compiler details and documentation, use:

GNU and Arm Compiler commands

Arm

Version details

armflang --version

Help and documentation

armflang --help

man armflang

Supported file types

The extensions .f90, .f95, .f03, and .f08 are used for modern, free-form source code that conforms to the Fortran 90, Fortran 95, Fortran 2003, or Fortran 2008 standards.

The extensions .F90, .F95, .F03, and .F08 are used for source code that requires preprocessing, and which is preprocessed automatically.

It is possible to instruct armflang to preprocess source irrespective of file extension by using the -cpp flag, as detailed in the next section.

Typically, .f and .for extensions are used for older, fixed-form code, such as FORTRAN77.
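As a minimal sketch of the two preprocessing routes described above, the following shell snippet writes the same source under a lower-case and an upper-case extension and compiles each one. The file names and the VERBOSE macro are placeholders chosen for illustration, and the -D option is assumed to behave as it does in other Clang-based drivers. The compile step is skipped when armflang is not on the PATH.

```shell
# Scratch directory; all file names below are placeholders.
workdir=$(mktemp -d)

# Lower-case .f90: free-form source that is not preprocessed by default.
cat > "$workdir/hello.f90" <<'EOF'
program hello
  implicit none
#ifdef VERBOSE
  print *, 'verbose build'
#else
  print *, 'quiet build'
#endif
end program hello
EOF

# Same source with an upper-case extension, which is preprocessed automatically.
cp "$workdir/hello.f90" "$workdir/hello.F90"

if command -v armflang >/dev/null 2>&1; then
  # -cpp forces preprocessing despite the lower-case extension.
  armflang -cpp -DVERBOSE -o "$workdir/hello_cpp" "$workdir/hello.f90"
  # No -cpp needed: the .F90 extension already implies preprocessing.
  armflang -DVERBOSE -o "$workdir/hello_F90" "$workdir/hello.F90"
else
  echo "armflang not found; sources written to $workdir"
fi
```

Without -cpp, the lower-case file would reach the compiler with its # lines intact, which is why the flag is needed in the first command only.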

Optimization remarks

Optimization remarks are described in Optimize.

For more information on optimization remarks, see the Fortran and C/C++ compiler reference guides.

Arm hardware flags

GCC and Arm Compiler have three hardware compiler flags in common: -march, -mtune, and -mcpu:

  • -march=X: Tells the compiler that X is the minimal architecture the binary must run on. The compiler is free to use architecture-specific instructions. This flag behaves differently on Arm and x86. On Arm, -march does not override -mtune, but on x86 -march does override both -mtune and -mcpu.

  • -mtune=X: Tells the compiler to optimize for microarchitecture X, but does not allow the compiler to change the ABI or make assumptions about available instructions. This flag has more or less the same meaning on Arm and x86.

  • -mcpu=X: On Arm, this flag is a combination of -march and -mtune. It simultaneously specifies the target architecture and optimizes for a given microarchitecture. On x86, this flag is a deprecated synonym for -mtune.

GCC and Arm Compiler support passing the special parameter value native to these flags. The native value tells the compiler to automatically detect the architecture or microarchitecture of the machine on which the compiler is executing.

Note

Arm Compiler does not support the use of -march=native. To aid portability, GCC on AArch64 does support the use of -march=native.

These compiler options control binary code generation. Correctly using these options can greatly improve run-time performance. If you are not cross compiling, the simplest and easiest method to get the best performance on Arm, with both GCC and LLVM-based compilers, is to only use -mcpu=native, and actively avoid using -mtune or -march.

Note

Automatic detection of the architecture and processor is independent of the optimization level that is denoted by the -On flag and similar flags, as detailed in the Commonly used flags and Optimization compiler options sections in each compiler guide.
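The recommendation for native (non-cross) builds can be sketched as a single flag set. This is an illustrative example only: app.f90 is a placeholder source name, and the -O3 level is an assumption rather than a recommendation from this guide.

```shell
# Native build: let -mcpu=native select both the architecture and the tuning,
# and leave -march and -mtune unset.
FFLAGS="-O3 -mcpu=native"

# app.f90 is a placeholder; $FFLAGS is left unquoted on purpose so that it
# splits into separate options.
if command -v armflang >/dev/null 2>&1 && [ -f app.f90 ]; then
  armflang $FFLAGS -o app app.f90
else
  echo "would run: armflang $FFLAGS -o app app.f90"
fi
```

Because the value native is resolved on the build machine, a binary built this way might not run on an older processor than the one it was compiled on.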

Optimized math functions with Arm Performance Libraries

Arm Performance Libraries (ArmPL) provide the following optimized standard core math libraries for high-performance computing applications on Arm processors:

  • BLAS - Basic Linear Algebra Subprograms (including XBLAS which is extended precision BLAS).

  • LAPACK - a comprehensive package of higher-level linear algebra routines.

  • FFT - a set of Fast Fourier Transform routines for real and complex data.

  • Math routines - optimized implementations of common maths intrinsics (on by default in Arm Performance Libraries versions 19.3+).

  • Auto-vectorization of Fortran math intrinsics (disable this with -fno-simdmath).

Arm Compiler for HPC 19.0+ introduces the -armpl compiler flag that simplifies using Arm Performance Libraries. This new flag provides a simple interface for selecting thread-parallelism and architectural tuning. Arm Performance Libraries also provides improved Fortran math intrinsics with auto-vectorization.

The -armpl and -mcpu flags enable the compiler to find appropriate Arm Performance Libraries header files during compilation, and appropriate libraries during linking. Both flags are required to achieve the best results.

Note

If your build process compiles and links as two separate steps, ensure that you add the same -armpl and -mcpu options to both.

For more information on Arm Performance Libraries, see Arm Performance Libraries.
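The following sketch shows a build that compiles and links in two separate steps while passing the same -armpl and -mcpu options to both. The source file, the program name, and the ARMPL_FLAGS variable are hypothetical; the only library routine used is the standard BLAS function DDOT.

```shell
workdir=$(mktemp -d)

# A small program that calls the BLAS dot-product routine DDOT.
cat > "$workdir/dot.f90" <<'EOF'
program dot
  implicit none
  double precision :: x(3), y(3)
  double precision, external :: ddot
  x = [1.0d0, 2.0d0, 3.0d0]
  y = [4.0d0, 5.0d0, 6.0d0]
  print *, ddot(3, x, 1, y, 1)
end program dot
EOF

# Keep the library-related options in one place so that the compile and
# link steps cannot drift apart.
ARMPL_FLAGS="-armpl -mcpu=native"

if command -v armflang >/dev/null 2>&1; then
  (
    cd "$workdir" || exit 1
    armflang -O2 $ARMPL_FLAGS -c dot.f90 -o dot.o   # compile step
    armflang $ARMPL_FLAGS dot.o -o dot              # link step, same flags
  )
else
  echo "armflang not found; source written to $workdir"
fi
```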

Compiler directives

Directives are used to provide additional information to the compiler, and to control the compilation of specific code blocks, for example, loops. The Arm Fortran Compiler supports the following common directives:

Arm Fortran Compiler directives

Directive

Usage

Description

IVDEP

!DIR$ IVDEP

<do loop>

A generic directive which forces the compiler to ignore any potential memory dependencies of iterative loops, and to vectorize the loop.

OMP SIMD

!$OMP SIMD

<do loop>

An OpenMP directive to indicate that a loop can be transformed into a Single Instruction Multiple Data (SIMD) loop.

Note

  • -fopenmp must be set.

  • There is no support for OMP SIMD clauses.

VECTOR ALWAYS

!DIR$ VECTOR ALWAYS

<do loop>

Forces the compiler to vectorize a loop, and ignores any potential performance implications.

Note

The loop must be vectorizable.

NOVECTOR

!DIR$ NOVECTOR

<do loop>

Disables the vectorization of a loop.

UNROLL

!DIR$ UNROLL

<do loop>

Instructs the compiler optimizer to unroll a DO loop when optimization is enabled with the compiler optimization flags -O2 or higher.
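To illustrate how the directives in the table are placed, the sketch below writes a subroutine that uses IVDEP and OMP SIMD immediately before the loops they apply to, and then compiles it. The subroutine, its arguments, and the file name are invented for this example, and IVDEP is only safe here under the stated assumption that idx contains no repeated values.

```shell
workdir=$(mktemp -d)

cat > "$workdir/loops.f90" <<'EOF'
subroutine scale_add(n, a, b, c, idx)
  implicit none
  integer, intent(in) :: n, idx(n)
  real, intent(in) :: b(n), c(n)
  real, intent(inout) :: a(n)
  integer :: i

  ! Assumption: idx holds no duplicates, so the indirect updates are independent.
  !DIR$ IVDEP
  do i = 1, n
    a(idx(i)) = a(idx(i)) + b(i)
  end do

  ! Plain OMP SIMD without clauses; requires -fopenmp.
  !$OMP SIMD
  do i = 1, n
    a(i) = a(i) * c(i)
  end do
end subroutine scale_add
EOF

if command -v armflang >/dev/null 2>&1; then
  armflang -O2 -fopenmp -c "$workdir/loops.f90" -o "$workdir/loops.o"
else
  echo "armflang not found; source written to $workdir"
fi
```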

Generating Position Independent Code (PIC) with -fPIC on AArch64

The generation of Position Independent Code (PIC) is typically required for building shared libraries. Supplying the command line flag -fPIC at compile time instructs armflang to generate Position Independent Code (PIC), and is generally consistent with the behavior of other compilers.

Note

PGI compilers do not differentiate between -fPIC and -fpic which are documented as interchangeable on x86 architectures. For more information on migrating from the PGI pgfortran compiler to Arm Compiler, see armflang for pgfortran users.

However, while the use of -fpic is often interchangeable with -fPIC on x86, it is not the case with GCC on AArch64. -fpic uses an address mode with a smaller number of entries in the Global Offset Table. As a result, -fpic is not considered to be portable between x86_64 and AArch64 architectures.
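A shared-library build with -fPIC might look like the following sketch. The source and library names are placeholders, and the -shared link option is assumed to behave as it does in GCC and other Clang-based drivers.

```shell
workdir=$(mktemp -d)

cat > "$workdir/greet.f90" <<'EOF'
subroutine greet()
  implicit none
  print *, 'hello from a shared library'
end subroutine greet
EOF

if command -v armflang >/dev/null 2>&1; then
  (
    cd "$workdir" || exit 1
    # Use -fPIC (not -fpic) so that the same command stays portable
    # between x86_64 and AArch64.
    armflang -fPIC -c greet.f90 -o greet.o
    armflang -shared -o libgreet.so greet.o
  )
else
  echo "armflang not found; source written to $workdir"
fi
```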

Allocating stack variables

  • Thread-safe recursion

    The -frecursive flag allocates all local variables on the stack. This allows thread-safe recursion and is applied implicitly for source compiled with the -fopenmp flag.

    Use the -frecursive option when compiling a procedure that:

    • Has no OpenMP elements and is not compiled using the -fopenmp flag.

    • Is called from within an OpenMP parallel region in source and is compiled with the -fopenmp flag.

  • Automatic arrays

    This feature of Fortran 2003 allows allocatable arrays to be allocated and dynamically resized without the need for calls to ALLOCATE and DEALLOCATE. Automatic arrays are stored on the heap, regardless of the -frecursive flag, unless -fstack-arrays is specified.

Note

Use of the stack for local variables and automatic arrays can have implications for the stack size. To avoid running out of stack, it might be necessary to increase the stack size. For example, to remove the stack-size limit, enter ulimit -s unlimited at the command line.
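The -frecursive case described above can be sketched with two source files: a worker procedure that contains no OpenMP constructs, and a main program that calls it from inside a parallel region. All file and procedure names here are hypothetical.

```shell
workdir=$(mktemp -d)

# worker.f90: no OpenMP constructs, so it is compiled without -fopenmp.
cat > "$workdir/worker.f90" <<'EOF'
subroutine work(i, total)
  implicit none
  integer, intent(in) :: i
  real, intent(out) :: total
  real :: scratch(64)   ! local array that must not be shared between threads
  scratch = real(i)
  total = sum(scratch)
end subroutine work
EOF

# main.f90: calls work() from inside an OpenMP parallel region.
cat > "$workdir/main.f90" <<'EOF'
program main
  implicit none
  integer :: i
  real :: totals(8)
  !$OMP PARALLEL DO
  do i = 1, 8
    call work(i, totals(i))
  end do
  !$OMP END PARALLEL DO
  print *, sum(totals)
end program main
EOF

if command -v armflang >/dev/null 2>&1; then
  (
    cd "$workdir" || exit 1
    armflang -frecursive -c worker.f90 -o worker.o   # locals go on the stack
    armflang -fopenmp -c main.f90 -o main.o          # implies stack locals
    armflang -fopenmp main.o worker.o -o main
    # Raise the stack limit for this subshell only, if the system allows it.
    ulimit -s unlimited 2>/dev/null || echo "could not raise the stack limit"
    ./main
  )
else
  echo "armflang not found; sources written to $workdir"
fi
```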

Line lengths

The Fortran standard for free-form source (from Fortran 90 onwards) sets a maximum line length of 132 characters. Statements can be broken over a maximum of 255 lines using the ampersand, &, continuation mark. Many compilers permit the use of lines longer than 132 characters.

armflang limits line lengths to 2100 characters and generates a compile time error if there are source lines, including comments, longer than 2100 characters. To compile with Arm Fortran Compiler, you must ensure that all source lines are within this limit.

Note

Arm Compiler versions that are earlier than 19.3 limited line lengths to 264 characters. Using compiler macros in versions that are earlier than 19.3 can lead to the generation of source lines longer than 264 characters at compile time.
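Before compiling, it can be useful to scan a source tree for lines that exceed the limit. The helper below is a hypothetical convenience function, not part of Arm Compiler; it uses awk to report over-long lines and returns a non-zero status if any are found. Note that it inspects the source as written, so it cannot detect lines that only become too long after macro expansion.

```shell
# Report every line longer than the given limit in the listed files.
# Usage: check_line_lengths LIMIT FILE...
check_line_lengths() {
  limit=$1
  shift
  awk -v limit="$limit" '
    length($0) > limit {
      printf "%s:%d: %d characters\n", FILENAME, FNR, length($0)
      found = 1
    }
    END { exit (found ? 1 : 0) }
  ' "$@"
}

# Small demonstration on a placeholder source file.
workdir=$(mktemp -d)
printf '%s\n' 'program ok' 'end program ok' > "$workdir/ok.f90"
check_line_lengths 2100 "$workdir/ok.f90" && echo "all lines within limit"
```

For compiler versions earlier than 19.3, the same function can be called with 264 as the limit.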

Language extensions

Many existing compilers, including armflang, support several common extensions to the Fortran language, generally for legacy reasons. Often, the required functionality is now part of the language standard, although with a different syntax. The following table shows common language extensions and their standard-compliant alternatives, where available.

Fortran language extensions and their standard-compliant alternatives

Extension

Purpose

Standard-compliant alternative

Notes

IARGC()

Function call which returns the number of command line arguments supplied

COMMAND_ARGUMENT_COUNT()

Introduced with 2003 standard.

GETARG(pos, arg)

Subroutine call which returns, as arg, the pos-th argument that was passed on the command line when the program was invoked.

GET_COMMAND_ARGUMENT(pos,arg, len, status)

Introduced with 2003 standard.

arg, len, and status are OPTIONAL arguments.

GETENV(name, arg)

Subroutine call which returns the environment variable name as arg.

GET_ENVIRONMENT_VARIABLE(name, arg, len, status, trim_name)

Introduced with 2003 standard.

arg, len, and status are OPTIONAL arguments.

GETCWD(dir, status)

Subroutine call which returns the current working directory as dir. status is an OPTIONAL argument which returns 0 on success, and a nonzero error code when not successful.

No equivalent functionality at present.

No equivalent functionality in the 2003 standard.
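The standard-compliant alternatives in the table can be combined as in the following sketch, which writes a small program using COMMAND_ARGUMENT_COUNT, GET_COMMAND_ARGUMENT, and GET_ENVIRONMENT_VARIABLE. The program name, the buffer length of 256, and the choice of the HOME variable are arbitrary choices for illustration.

```shell
workdir=$(mktemp -d)

cat > "$workdir/show_args.f90" <<'EOF'
program show_args
  implicit none
  integer :: i, n, length, status
  character(len=256) :: arg, home

  n = command_argument_count()                         ! replaces IARGC()
  do i = 1, n
    call get_command_argument(i, arg, length, status)  ! replaces GETARG
    if (status == 0) print *, i, arg(1:length)
  end do

  ! replaces GETENV; status is nonzero if HOME is unset or truncated
  call get_environment_variable('HOME', home, length, status)
  if (status == 0) print *, 'HOME = ', home(1:length)
end program show_args
EOF

if command -v armflang >/dev/null 2>&1; then
  armflang -o "$workdir/show_args" "$workdir/show_args.f90"
  "$workdir/show_args" first second
else
  echo "armflang not found; source written to $workdir"
fi
```

Checking status before using arg or home avoids printing a truncated value when an argument is longer than the buffer.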

Some commonly supported language extensions are not supported in armflang:

GCC and Arm Compiler options

Extension

Purpose

armflang equivalent

Notes

ISNAN(x)

Logical function returns .TRUE. if the REAL argument x is Not-a-Number (NaN).

IEEE_IS_NAN(x)

Introduced with 2003 standard. Requires IEEE_ARITHMETIC module.

For more information on supported language extensions, see the Fortran intrinsics chapter in the Arm Fortran Compiler Reference Guide.
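Code that relies on ISNAN can usually be ported by switching to the IEEE_ARITHMETIC module. The following sketch, with a hypothetical program name, creates a NaN value and tests it with IEEE_IS_NAN.

```shell
workdir=$(mktemp -d)

cat > "$workdir/nan_check.f90" <<'EOF'
program nan_check
  use, intrinsic :: ieee_arithmetic, only: ieee_is_nan, ieee_value, ieee_quiet_nan
  implicit none
  real :: x

  x = ieee_value(x, ieee_quiet_nan)
  if (ieee_is_nan(x)) print *, 'x is NaN'   ! replaces ISNAN(x)
end program nan_check
EOF

if command -v armflang >/dev/null 2>&1; then
  armflang -o "$workdir/nan_check" "$workdir/nan_check.f90"
  "$workdir/nan_check"
else
  echo "armflang not found; source written to $workdir"
fi
```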

Pre-defined macros

armflang has the following compiler-specific and machine-specific predefined preprocessor macros:

Pre-defined macros

Macro

Value

Purpose

__aarch64__

1

Selection of architecture-dependent source at compile time.

__ARM_ARCH

8

Selection of architecture-dependent source at compile time.

__FLANG

1

Selection of compiler-dependent source at compile time.

__clang__

1

Selection of compiler-dependent source at compile time.

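__clang_version__

"7.1.0"

Underlying Clang version details.

__clang_major__

7

Underlying Clang version details.

__clang_minor__

1

Underlying Clang version details.

__clang_patchlevel__

0

Underlying Clang version details.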
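As a sketch of how these macros select source at compile time, the snippet below writes a preprocessed (.F90) program that branches on __FLANG and __aarch64__. The program name and the messages are made up for this example.

```shell
workdir=$(mktemp -d)

# Upper-case .F90 so that the source is preprocessed automatically.
cat > "$workdir/which_compiler.F90" <<'EOF'
program which_compiler
  implicit none
#if defined(__FLANG)
  print *, 'built with a Flang-based compiler'
#else
  print *, 'built with another compiler'
#endif
#if defined(__aarch64__)
  print *, 'targeting AArch64'
#endif
end program which_compiler
EOF

if command -v armflang >/dev/null 2>&1; then
  armflang -o "$workdir/which_compiler" "$workdir/which_compiler.F90"
  "$workdir/which_compiler"
else
  echo "armflang not found; source written to $workdir"
fi
```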

Detailed compiler options

Passing the flag -### to armflang causes it to print the complete options used at each stage of the compilation, without executing them.

Understand the optimization choices the compiler makes

Arm Compiler incorporates two tools to help you better understand the optimization decisions that it makes:

Optimization Remarks

Optimization Remarks can be used to see which code has been inlined or to understand why a loop has not been vectorized.

To enable Optimization Remarks, pass one or more of the following -Rpass flags at compile time:

Options to enable Optimization Remarks

-Rpass flags

Description

-Rpass=<regexp>

To request information about what Arm Compiler has optimized.

-Rpass-analysis=<regexp>

To request information about what Arm Compiler has analyzed.

-Rpass-missed=<regexp>

To request information about what Arm Compiler failed to optimize.

In each case, <regexp> selects the type of remarks to provide. For example, use loop-vectorize for information on vectorization, inline for information on inlining, or .* to report all optimization remarks. -Rpass accepts regular expressions, so (loop-vectorize|inline) can be used to capture any remark on vectorization or inlining.

For example, to get actionable information on which loops can and cannot be vectorized at compile time, pass:

-Rpass=loop-vectorize -Rpass-missed=loop-vectorize -Rpass-analysis=loop-vectorize -g

Note

  • Optimization Remarks are only available when you have set an appropriate debug flag, such as -g.

  • Optimization Remarks are piped to stdout at compile time.

For more information, refer to the Optimization Remarks documentation in the Arm Fortran Compiler reference guide.
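Putting the flags together, a vectorization-remark run on a simple loop might look like the following sketch. The saxpy subroutine and the remarks.txt file name are placeholders, and the -O2 level is an assumption added so that the vectorizer has something to report. Both output streams are captured so that the remarks are saved regardless of which stream they are written to.

```shell
workdir=$(mktemp -d)

cat > "$workdir/saxpy.f90" <<'EOF'
subroutine saxpy(n, a, x, y)
  implicit none
  integer, intent(in) :: n
  real, intent(in) :: a, x(n)
  real, intent(inout) :: y(n)
  integer :: i
  do i = 1, n
    y(i) = a * x(i) + y(i)
  end do
end subroutine saxpy
EOF

# One regular expression shared by the three remark flags.
REMARKS="loop-vectorize"

if command -v armflang >/dev/null 2>&1; then
  armflang -O2 -g \
    -Rpass="$REMARKS" -Rpass-missed="$REMARKS" -Rpass-analysis="$REMARKS" \
    -c "$workdir/saxpy.f90" -o "$workdir/saxpy.o" > "$workdir/remarks.txt" 2>&1
  cat "$workdir/remarks.txt"
else
  echo "armflang not found; source written to $workdir"
fi
```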

Arm Optimization Report

Arm Optimization Report is a new, beta-quality feature of Arm Compiler for Linux version 19.3 that builds upon the llvm-opt-report tool available in open-source LLVM. The new Arm Optimization Report feature makes it easier to see what optimization decisions the compiler is making about unrolling, vectorizing, and interleaving, in-line with your source code.

To enable Arm Optimization Report:

  1. At compile time, add the -fsave-optimization-record flag to the command line.

    A <filename>.opt.yaml report is generated by the compiler, where <filename> is the name of the binary.

  2. Use Arm Optimization Report (arm-opt-report) to inspect the <filename>.opt.yaml report as augmented source code:

    arm-opt-report <filename>.opt.yaml
    

    The annotated source code appears in the terminal.
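The two steps can be sketched end to end as follows. The source file is a placeholder, the -O3 level is an assumption, and the loop over *.opt.yaml is used instead of a fixed report name so that the sketch does not depend on exactly how the record file is named.

```shell
workdir=$(mktemp -d)

cat > "$workdir/saxpy.f90" <<'EOF'
subroutine saxpy(n, a, x, y)
  implicit none
  integer, intent(in) :: n
  real, intent(in) :: a, x(n)
  real, intent(inout) :: y(n)
  integer :: i
  do i = 1, n
    y(i) = a * x(i) + y(i)
  end do
end subroutine saxpy
EOF

if command -v armflang >/dev/null 2>&1 && command -v arm-opt-report >/dev/null 2>&1; then
  (
    cd "$workdir" || exit 1
    # Step 1: compile and record the optimization decisions.
    armflang -O3 -fsave-optimization-record -c saxpy.f90 -o saxpy.o
    # Step 2: render every generated record as annotated source.
    for report in *.opt.yaml; do
      [ -e "$report" ] || continue
      arm-opt-report "$report"
    done
  )
else
  echo "armflang or arm-opt-report not found; source written to $workdir"
fi
```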