Control the placement of OpenMP threads

The OMP_PROC_BIND environment variable controls how threads are assigned to cores on your system (also known as thread affinity). If OMP_PROC_BIND is set to "false" or is unset, threads are unpinned: they might migrate between cores during execution, and thread migration is likely to degrade performance significantly.

Arm recommends setting OMP_PROC_BIND to either "true", "close" or "spread", as required.

If set to "close" then the OpenMP threads are pinned to cores close to the parent thread. OMP_PROC_BIND=close is useful where threads in a team are working on locally shared data. For example, if threads are pinned to neighboring cores there might be a performance benefit from the data being stored in a shared level of cache.

If set to "spread" then the OpenMP threads are pinned to cores that are distant from the parent thread. OMP_PROC_BIND=spread is useful to avoid contention on hardware resources. For example, if threads are working on large amounts of private data then there might be an advantage to using "spread" to reduce contention on a shared level of cache or memory bandwidth.

Setting the value to "true" avoids thread migration, but does not specify a particular affinity policy.

Another option is to set OMP_PROC_BIND to "master". If OMP_PROC_BIND=master, all OpenMP threads in a team are pinned to the same core as the master thread.

Notes:

  • OMP_PROC_BIND can be set to a comma-separated list of the "master", "close" and "spread" policies described above, which sets the affinity policy separately for each level of nested parallelism.
  • The values assigned to OpenMP environment variables are case insensitive.
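
As a minimal sketch of the values and notes above, the following Python helper checks and normalizes an OMP_PROC_BIND setting before it is passed to a program. The name proc_bind_env is hypothetical and is not part of any OpenMP runtime. The sketch assumes, following the OpenMP specification, that "true" and "false" are only valid on their own and that only "master", "close" and "spread" can be combined in a list.

```python
import os

# Values that must appear alone, and policies that may be combined in a
# comma-separated list (one entry per level of nested parallelism).
STANDALONE = {"true", "false"}
POLICIES = {"master", "close", "spread"}


def proc_bind_env(value, base_env=None):
    """Return a copy of the environment with a validated OMP_PROC_BIND.

    Hypothetical helper for illustration only.
    """
    # OpenMP environment variable values are case insensitive, so normalize.
    parts = [part.strip().lower() for part in value.split(",")]
    if len(parts) == 1:
        valid = parts[0] in STANDALONE | POLICIES
    else:
        valid = all(part in POLICIES for part in parts)
    if not valid:
        raise ValueError(f"unsupported OMP_PROC_BIND value: {value!r}")

    env = dict(os.environ if base_env is None else base_env)
    env["OMP_PROC_BIND"] = ",".join(parts)
    return env


# Outer parallel regions are spread out; nested regions stay close.
env = proc_bind_env("Spread,CLOSE", base_env={})
print(env["OMP_PROC_BIND"])  # prints: spread,close
```

The returned dictionary can be passed as the env argument of subprocess.run when launching an OpenMP program, so the setting applies to that program only.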

The sections above describe how OpenMP threads are pinned to cores in the system. However, the OpenMP specification uses the term "place" to denote a hardware resource for which threads can have affinity. The environment variable OMP_PLACES allows you to define what is meant by a "place" in the system.

OMP_PLACES can be set to one of three pre-defined values: "threads", "cores" or "sockets". Setting OMP_PLACES=threads assigns OpenMP threads to hardware threads in the system. On a system where a single core supports multiple hardware threads (for example, Marvell ThunderX2 systems with SMT>1), assigning OpenMP threads to hardware threads allows for the co-location of several threads in a single core.

If the value is set to "cores" then each OpenMP thread is assigned to a different core in the system, which might support more than one hardware thread.

If the value is set to "sockets" then each OpenMP thread is assigned to a single socket in the system, which contains multiple cores. Where "sockets" is set, the OpenMP threads might migrate between cores within the assigned socket.
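
The pre-defined values can be combined with OMP_PROC_BIND when a program is launched. The sketch below shows one way to do that from Python, setting both variables for the child process only. The name run_pinned is hypothetical, and the Python interpreter stands in for an OpenMP executable so that the example runs anywhere. In practice the command would be your own OpenMP binary.

```python
import os
import subprocess
import sys

# The three pre-defined OMP_PLACES values described above.
PREDEFINED_PLACES = {"threads", "cores", "sockets"}


def run_pinned(command, places="cores", proc_bind="close"):
    """Run `command` with OMP_PLACES and OMP_PROC_BIND set for the child only.

    Hypothetical helper for illustration only.
    """
    if places.lower() not in PREDEFINED_PLACES:
        raise ValueError(f"not a pre-defined OMP_PLACES value: {places!r}")
    env = dict(os.environ)  # copy, so the parent environment is untouched
    env["OMP_PLACES"] = places
    env["OMP_PROC_BIND"] = proc_bind
    return subprocess.run(
        command, env=env, check=True, capture_output=True, text=True
    )


# Stand-in child process that reports the settings it inherited.
child = [
    sys.executable,
    "-c",
    "import os; print(os.environ['OMP_PLACES'], os.environ['OMP_PROC_BIND'])",
]
result = run_pinned(child, places="sockets", proc_bind="spread")
print(result.stdout.strip())  # prints: sockets spread
```

Which combination performs best depends on the workload and the machine, so it is worth timing a few settings rather than assuming one.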

To control the placement of OpenMP threads more finely, set OMP_PLACES to a list of numbers that indicate the IDs of hardware places in your system (typically hardware threads). OMP_PLACES offers a considerable amount of flexibility, including the ability to exclude places from thread placement. If you are interested in this level of control, refer to the OpenMP specification and experiment on your system.
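
An explicit list is written as comma-separated groups of IDs in braces, one group per place. The following sketch builds such a string in Python. The name format_places is hypothetical, and the IDs used are purely illustrative: how the numbers map to hardware is implementation defined, so check the numbering on your own system before relying on it.

```python
def format_places(groups):
    """Format groups of hardware IDs as an explicit OMP_PLACES list.

    Each inner sequence becomes one place, written as {id,id,...}.
    Hypothetical helper for illustration only.
    """
    formatted = []
    for group in groups:
        ids = list(group)
        if not ids:
            raise ValueError("a place must contain at least one ID")
        if any(not isinstance(i, int) or i < 0 for i in ids):
            raise ValueError("place IDs must be non-negative integers")
        formatted.append("{" + ",".join(str(i) for i in ids) + "}")
    return ",".join(formatted)


# Four places of two hardware threads each. IDs 0 and 1 are left out,
# so no OpenMP thread is placed on them.
print(format_places([[2, 3], [4, 5], [6, 7], [8, 9]]))
# prints: {2,3},{4,5},{6,7},{8,9}
```

The resulting string is the value to assign to OMP_PLACES. Leaving IDs out of every group is a simple way to keep threads off particular hardware threads.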
