
OpenMP

Application Programming
Interface

Version 5.2 November 2021

Copyright © 1997-2021 OpenMP Architecture Review Board.
Permission to copy without fee all or part of this material is granted, provided the OpenMP
Architecture Review Board copyright notice and the title of this document appear. Notice is
given that copying is by permission of the OpenMP Architecture Review Board.
Contents

1 Overview of the OpenMP API 1


1.1 Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2.1 Threading Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2.2 OpenMP Language Terminology . . . . . . . . . . . . . . . . . . . . . . . 2
1.2.3 Loop Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.2.4 Synchronization Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.2.5 Tasking Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.2.6 Data Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.2.7 Implementation Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.2.8 Tool Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.3 Execution Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.4 Memory Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.4.1 Structure of the OpenMP Memory Model . . . . . . . . . . . . . . . . . . . 26
1.4.2 Device Data Environments . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.4.3 Memory Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.4.4 The Flush Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.4.5 Flush Synchronization and Happens Before . . . . . . . . . . . . . . . . . . 30
1.4.6 OpenMP Memory Consistency . . . . . . . . . . . . . . . . . . . . . . . . 32
1.5 Tool Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
1.5.1 OMPT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
1.5.2 OMPD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
1.6 OpenMP Compliance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
1.7 Normative References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
1.8 Organization of this Document . . . . . . . . . . . . . . . . . . . . . . . . . . 36

2 Internal Control Variables 38
2.1 ICV Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.2 ICV Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.3 Modifying and Retrieving ICV Values . . . . . . . . . . . . . . . . . . . . . . 42
2.4 How the Per-Data Environment ICVs Work . . . . . . . . . . . . . . . . . . . 45
2.5 ICV Override Relationships . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

3 Directive and Construct Syntax 48


3.1 Directive Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.1.1 Fixed Source Form Directives . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.1.2 Free Source Form Directives . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.2 Clause Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.2.1 OpenMP Argument Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.2.2 Reserved Locators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.2.3 OpenMP Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.2.4 Array Shaping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
3.2.5 Array Sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.2.6 iterator Modifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.3 Conditional Compilation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
3.3.1 Fixed Source Form Conditional Compilation Sentinels . . . . . . . . . . . . 70
3.3.2 Free Source Form Conditional Compilation Sentinel . . . . . . . . . . . . . 71
3.4 if Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
3.5 destroy Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

4 Base Language Formats and Restrictions 74


4.1 OpenMP Types and Identifiers . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.2 OpenMP Stylized Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.3 Structured Blocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.3.1 OpenMP Context-Specific Structured Blocks . . . . . . . . . . . . . . . . . 77
4.4 Loop Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.4.1 Canonical Loop Nest Form . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.4.2 OpenMP Loop-Iteration Spaces and Vectors . . . . . . . . . . . . . . . . . 91
4.4.3 collapse Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
4.4.4 ordered Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
4.4.5 Consistent Loop Schedules . . . . . . . . . . . . . . . . . . . . . . . . . . 95

5 Data Environment 96
5.1 Data-Sharing Attribute Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5.1.1 Variables Referenced in a Construct . . . . . . . . . . . . . . . . . . . . . . 96
5.1.2 Variables Referenced in a Region but not in a Construct . . . . . . . . . . . 100
5.2 threadprivate Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.3 List Item Privatization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.4 Data-Sharing Attribute Clauses . . . . . . . . . . . . . . . . . . . . . . . . . 108
5.4.1 default Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
5.4.2 shared Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
5.4.3 private Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
5.4.4 firstprivate Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
5.4.5 lastprivate Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
5.4.6 linear Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
5.4.7 is_device_ptr Clause . . . . . . . . . . . . . . . . . . . . . . . . . . 120
5.4.8 use_device_ptr Clause . . . . . . . . . . . . . . . . . . . . . . . . . . 121
5.4.9 has_device_addr Clause . . . . . . . . . . . . . . . . . . . . . . . . . 122
5.4.10 use_device_addr Clause . . . . . . . . . . . . . . . . . . . . . . . . . 123
5.5 Reduction Clauses and Directives . . . . . . . . . . . . . . . . . . . . . . . . 124
5.5.1 OpenMP Reduction Identifiers . . . . . . . . . . . . . . . . . . . . . . . . 124
5.5.2 OpenMP Reduction Expressions . . . . . . . . . . . . . . . . . . . . . . . 125
5.5.3 Implicitly Declared OpenMP Reduction Identifiers . . . . . . . . . . . . . . 128
5.5.4 initializer Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
5.5.5 Properties Common to All Reduction Clauses . . . . . . . . . . . . . . . . 131
5.5.6 Reduction Scoping Clauses . . . . . . . . . . . . . . . . . . . . . . . . . . 133
5.5.7 Reduction Participating Clauses . . . . . . . . . . . . . . . . . . . . . . . . 134
5.5.8 reduction Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
5.5.9 task_reduction Clause . . . . . . . . . . . . . . . . . . . . . . . . . . 137
5.5.10 in_reduction Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
5.5.11 declare reduction Directive . . . . . . . . . . . . . . . . . . . . . . 139
5.6 scan Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
5.6.1 inclusive Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
5.6.2 exclusive Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
5.7 Data Copying Clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
5.7.1 copyin Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
5.7.2 copyprivate Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
5.8 Data-Mapping Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.8.1 Implicit Data-Mapping Attribute Rules . . . . . . . . . . . . . . . . . . . . 148
5.8.2 Mapper Identifiers and mapper Modifiers . . . . . . . . . . . . . . . . . . 149
5.8.3 map Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
5.8.4 enter Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
5.8.5 link Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
5.8.6 Pointer Initialization for Device Data Environments . . . . . . . . . . . . . 160
5.8.7 defaultmap Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
5.8.8 declare mapper Directive . . . . . . . . . . . . . . . . . . . . . . . . . 162
5.9 Data-Motion Clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
5.9.1 to Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
5.9.2 from Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
5.10 uniform Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
5.11 aligned Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169

6 Memory Management 171


6.1 Memory Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
6.2 Memory Allocators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
6.3 align Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
6.4 allocator Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
6.5 allocate Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
6.6 allocate Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
6.7 allocators Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
6.8 uses_allocators Clause . . . . . . . . . . . . . . . . . . . . . . . . . . 181

7 Variant Directives 183


7.1 OpenMP Contexts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
7.2 Context Selectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
7.3 Matching and Scoring Context Selectors . . . . . . . . . . . . . . . . . . . . . 188
7.4 Metadirectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
7.4.1 when Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
7.4.2 otherwise Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
7.4.3 metadirective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
7.4.4 begin metadirective . . . . . . . . . . . . . . . . . . . . . . . . . . 192
7.5 Declare Variant Directives . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
7.5.1 match Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
7.5.2 adjust_args Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
7.5.3 append_args Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
7.5.4 declare variant Directive . . . . . . . . . . . . . . . . . . . . . . . . 197
7.5.5 begin declare variant Directive . . . . . . . . . . . . . . . . . . . . 198
7.6 dispatch Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
7.6.1 novariants Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
7.6.2 nocontext Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
7.7 declare simd Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
7.7.1 branch Clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
7.8 Declare Target Directives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
7.8.1 declare target Directive . . . . . . . . . . . . . . . . . . . . . . . . . 206
7.8.2 begin declare target Directive . . . . . . . . . . . . . . . . . . . . . 207
7.8.3 indirect Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209

8 Informational and Utility Directives 210


8.1 at Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
8.2 requires Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
8.2.1 requirement Clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
8.3 Assumption Directives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
8.3.1 assumption Clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
8.3.2 assumes Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
8.3.3 assume Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
8.3.4 begin assumes Directive . . . . . . . . . . . . . . . . . . . . . . . . . . 215
8.4 nothing Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
8.5 error Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
8.5.1 severity Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
8.5.2 message Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217

9 Loop Transformation Constructs 219
9.1 tile Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
9.1.1 sizes Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
9.2 unroll Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
9.2.1 full Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
9.2.2 partial Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221

10 Parallelism Generation and Control 223


10.1 parallel Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
10.1.1 Determining the Number of Threads for a parallel Region . . . . . . . . 226
10.1.2 num_threads Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
10.1.3 Controlling OpenMP Thread Affinity . . . . . . . . . . . . . . . . . . . . . 228
10.1.4 proc_bind Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
10.2 teams Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
10.2.1 num_teams Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
10.3 order Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
10.4 simd Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
10.4.1 nontemporal Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
10.4.2 safelen Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
10.4.3 simdlen Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
10.5 masked Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
10.5.1 filter Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239

11 Work-Distribution Constructs 240


11.1 single Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
11.2 scope Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
11.3 sections Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
11.3.1 section Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
11.4 workshare Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
11.5 Worksharing-Loop Constructs . . . . . . . . . . . . . . . . . . . . . . . . . . 247
11.5.1 for Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
11.5.2 do Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
11.5.3 schedule Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
11.6 distribute Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
11.6.1 dist_schedule Clause . . . . . . . . . . . . . . . . . . . . . . . . . . 256
11.7 loop Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
11.7.1 bind Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258

12 Tasking Constructs 260


12.1 untied Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
12.2 mergeable Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
12.3 final Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
12.4 priority Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
12.5 task Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
12.5.1 affinity Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
12.5.2 detach Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
12.6 taskloop Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
12.6.1 grainsize Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
12.6.2 num_tasks Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
12.7 taskyield Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
12.8 Initial Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
12.9 Task Scheduling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272

13 Device Directives and Clauses 275


13.1 device_type Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
13.2 device Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
13.3 thread_limit Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
13.4 Device Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
13.5 target data Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
13.6 target enter data Construct . . . . . . . . . . . . . . . . . . . . . . . . 280
13.7 target exit data Construct . . . . . . . . . . . . . . . . . . . . . . . . . 282
13.8 target Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
13.9 target update Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . 289

14 Interoperability 291
14.1 interop Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
14.1.1 OpenMP Foreign Runtime Identifiers . . . . . . . . . . . . . . . . . . . . . 293
14.1.2 init Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
14.1.3 use Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
14.2 Interoperability Requirement Set . . . . . . . . . . . . . . . . . . . . . . . . . 294

15 Synchronization Constructs and Clauses 296


15.1 Synchronization Hints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
15.1.1 Synchronization Hint Type . . . . . . . . . . . . . . . . . . . . . . . . . . 296
15.1.2 hint Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
15.2 critical Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
15.3 Barriers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
15.3.1 barrier Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
15.3.2 Implicit Barriers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
15.3.3 Implementation-Specific Barriers . . . . . . . . . . . . . . . . . . . . . . . 304
15.4 taskgroup Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
15.5 taskwait Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
15.6 nowait Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
15.7 nogroup Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
15.8 OpenMP Memory Ordering . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
15.8.1 memory-order Clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
15.8.2 atomic Clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
15.8.3 extended-atomic Clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
15.8.4 atomic Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
15.8.5 flush Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
15.8.6 Implicit Flushes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
15.9 OpenMP Dependences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
15.9.1 task-dependence-type Modifier . . . . . . . . . . . . . . . . . . . . . . . . 321
15.9.2 Depend Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
15.9.3 update Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
15.9.4 depobj Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322
15.9.5 depend Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
15.9.6 doacross Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
15.10 ordered Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
15.10.1 Stand-alone ordered Construct . . . . . . . . . . . . . . . . . . . . . . . 329
15.10.2 Block-associated ordered Construct . . . . . . . . . . . . . . . . . . . . 330
15.10.3 parallelization-level Clauses . . . . . . . . . . . . . . . . . . . . . . . . . . 331

16 Cancellation Constructs 332
16.1 cancel Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
16.2 cancellation point Construct . . . . . . . . . . . . . . . . . . . . . . . 336

17 Composition of Constructs 338


17.1 Nesting of Regions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
17.2 Clauses on Combined and Composite Constructs . . . . . . . . . . . . . . . . 339
17.3 Combined and Composite Directive Names . . . . . . . . . . . . . . . . . . . 342
17.4 Combined Construct Semantics . . . . . . . . . . . . . . . . . . . . . . . . . 343
17.5 Composite Construct Semantics . . . . . . . . . . . . . . . . . . . . . . . . . 343

18 Runtime Library Routines 345


18.1 Runtime Library Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
18.2 Thread Team Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
18.2.1 omp_set_num_threads . . . . . . . . . . . . . . . . . . . . . . . . . . 348
18.2.2 omp_get_num_threads . . . . . . . . . . . . . . . . . . . . . . . . . . 349
18.2.3 omp_get_max_threads . . . . . . . . . . . . . . . . . . . . . . . . . . 350
18.2.4 omp_get_thread_num . . . . . . . . . . . . . . . . . . . . . . . . . . 350
18.2.5 omp_in_parallel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
18.2.6 omp_set_dynamic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
18.2.7 omp_get_dynamic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
18.2.8 omp_get_cancellation . . . . . . . . . . . . . . . . . . . . . . . . . 353
18.2.9 omp_set_nested (Deprecated) . . . . . . . . . . . . . . . . . . . . . . 353
18.2.10 omp_get_nested (Deprecated) . . . . . . . . . . . . . . . . . . . . . . 354
18.2.11 omp_set_schedule . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
18.2.12 omp_get_schedule . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
18.2.13 omp_get_thread_limit . . . . . . . . . . . . . . . . . . . . . . . . . 357
18.2.14 omp_get_supported_active_levels . . . . . . . . . . . . . . . . 358
18.2.15 omp_set_max_active_levels . . . . . . . . . . . . . . . . . . . . . 358
18.2.16 omp_get_max_active_levels . . . . . . . . . . . . . . . . . . . . . 359
18.2.17 omp_get_level . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360
18.2.18 omp_get_ancestor_thread_num . . . . . . . . . . . . . . . . . . . 360
18.2.19 omp_get_team_size . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
18.2.20 omp_get_active_level . . . . . . . . . . . . . . . . . . . . . . . . . 362
18.3 Thread Affinity Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
18.3.1 omp_get_proc_bind . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
18.3.2 omp_get_num_places . . . . . . . . . . . . . . . . . . . . . . . . . . 364
18.3.3 omp_get_place_num_procs . . . . . . . . . . . . . . . . . . . . . . 365
18.3.4 omp_get_place_proc_ids . . . . . . . . . . . . . . . . . . . . . . . 365
18.3.5 omp_get_place_num . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
18.3.6 omp_get_partition_num_places . . . . . . . . . . . . . . . . . . 367
18.3.7 omp_get_partition_place_nums . . . . . . . . . . . . . . . . . . 368
18.3.8 omp_set_affinity_format . . . . . . . . . . . . . . . . . . . . . . 368
18.3.9 omp_get_affinity_format . . . . . . . . . . . . . . . . . . . . . . 369
18.3.10 omp_display_affinity . . . . . . . . . . . . . . . . . . . . . . . . . 370
18.3.11 omp_capture_affinity . . . . . . . . . . . . . . . . . . . . . . . . . 371
18.4 Teams Region Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372
18.4.1 omp_get_num_teams . . . . . . . . . . . . . . . . . . . . . . . . . . . 372
18.4.2 omp_get_team_num . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373
18.4.3 omp_set_num_teams . . . . . . . . . . . . . . . . . . . . . . . . . . . 373
18.4.4 omp_get_max_teams . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
18.4.5 omp_set_teams_thread_limit . . . . . . . . . . . . . . . . . . . . 375
18.4.6 omp_get_teams_thread_limit . . . . . . . . . . . . . . . . . . . . 376
18.5 Tasking Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377
18.5.1 omp_get_max_task_priority . . . . . . . . . . . . . . . . . . . . . 377
18.5.2 omp_in_explicit_task . . . . . . . . . . . . . . . . . . . . . . . . . 377
18.5.3 omp_in_final . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378
18.6 Resource Relinquishing Routines . . . . . . . . . . . . . . . . . . . . . . . . 378
18.6.1 omp_pause_resource . . . . . . . . . . . . . . . . . . . . . . . . . . 378
18.6.2 omp_pause_resource_all . . . . . . . . . . . . . . . . . . . . . . . 380
18.7 Device Information Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
18.7.1 omp_get_num_procs . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
18.7.2 omp_set_default_device . . . . . . . . . . . . . . . . . . . . . . . 382
18.7.3 omp_get_default_device . . . . . . . . . . . . . . . . . . . . . . . 382
18.7.4 omp_get_num_devices . . . . . . . . . . . . . . . . . . . . . . . . . . 383
18.7.5 omp_get_device_num . . . . . . . . . . . . . . . . . . . . . . . . . . 384
18.7.6 omp_is_initial_device . . . . . . . . . . . . . . . . . . . . . . . . 384
18.7.7 omp_get_initial_device . . . . . . . . . . . . . . . . . . . . . . . 385
18.8 Device Memory Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
18.8.1 omp_target_alloc . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
18.8.2 omp_target_free . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
18.8.3 omp_target_is_present . . . . . . . . . . . . . . . . . . . . . . . . 389
18.8.4 omp_target_is_accessible . . . . . . . . . . . . . . . . . . . . . . 390
18.8.5 omp_target_memcpy . . . . . . . . . . . . . . . . . . . . . . . . . . . 391
18.8.6 omp_target_memcpy_rect . . . . . . . . . . . . . . . . . . . . . . . 392
18.8.7 omp_target_memcpy_async . . . . . . . . . . . . . . . . . . . . . . 394
18.8.8 omp_target_memcpy_rect_async . . . . . . . . . . . . . . . . . . 396
18.8.9 omp_target_associate_ptr . . . . . . . . . . . . . . . . . . . . . . 399
18.8.10 omp_target_disassociate_ptr . . . . . . . . . . . . . . . . . . . 401
18.8.11 omp_get_mapped_ptr . . . . . . . . . . . . . . . . . . . . . . . . . . 402
18.9 Lock Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
18.9.1 omp_init_lock and omp_init_nest_lock . . . . . . . . . . . . . 405
18.9.2 omp_init_lock_with_hint and omp_init_nest_lock_with_hint . . . . . . . . . . . . . . . . . . 406
18.9.3 omp_destroy_lock and omp_destroy_nest_lock . . . . . . . . . 408
18.9.4 omp_set_lock and omp_set_nest_lock . . . . . . . . . . . . . . . 409
18.9.5 omp_unset_lock and omp_unset_nest_lock . . . . . . . . . . . . 410
18.9.6 omp_test_lock and omp_test_nest_lock . . . . . . . . . . . . . 412
18.10 Timing Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
18.10.1 omp_get_wtime . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
18.10.2 omp_get_wtick . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
18.11 Event Routine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
18.11.1 omp_fulfill_event . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
18.12 Interoperability Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 416
18.12.1 omp_get_num_interop_properties . . . . . . . . . . . . . . . . . 417
18.12.2 omp_get_interop_int . . . . . . . . . . . . . . . . . . . . . . . . . . 417
18.12.3 omp_get_interop_ptr . . . . . . . . . . . . . . . . . . . . . . . . . . 418
18.12.4 omp_get_interop_str . . . . . . . . . . . . . . . . . . . . . . . . . . 419
18.12.5 omp_get_interop_name . . . . . . . . . . . . . . . . . . . . . . . . . 420
18.12.6 omp_get_interop_type_desc . . . . . . . . . . . . . . . . . . . . . 421
18.12.7 omp_get_interop_rc_desc . . . . . . . . . . . . . . . . . . . . . . 421
18.13 Memory Management Routines . . . . . . . . . . . . . . . . . . . . . . . . . 422
18.13.1 Memory Management Types . . . . . . . . . . . . . . . . . . . . . . . . . 422
18.13.2 omp_init_allocator . . . . . . . . . . . . . . . . . . . . . . . . . . 425
18.13.3 omp_destroy_allocator . . . . . . . . . . . . . . . . . . . . . . . . 426
18.13.4 omp_set_default_allocator . . . . . . . . . . . . . . . . . . . . . 427
18.13.5 omp_get_default_allocator . . . . . . . . . . . . . . . . . . . . . 428
18.13.6 omp_alloc and omp_aligned_alloc . . . . . . . . . . . . . . . . . 428
18.13.7 omp_free . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430
18.13.8 omp_calloc and omp_aligned_calloc . . . . . . . . . . . . . . . . 431
18.13.9 omp_realloc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 433
18.14 Tool Control Routine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 435
18.15 Environment Display Routine . . . . . . . . . . . . . . . . . . . . . . . . . . 438

19 OMPT Interface 440


19.1 OMPT Interfaces Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . 440
19.2 Activating a First-Party Tool . . . . . . . . . . . . . . . . . . . . . . . . . . . 440
19.2.1 ompt_start_tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 440
19.2.2 Determining Whether a First-Party Tool Should be Initialized . . . . . . . . 442
19.2.3 Initializing a First-Party Tool . . . . . . . . . . . . . . . . . . . . . . . . . 443
19.2.4 Monitoring Activity on the Host with OMPT . . . . . . . . . . . . . . . . . 446
19.2.5 Tracing Activity on Target Devices with OMPT . . . . . . . . . . . . . . . 447
19.3 Finalizing a First-Party Tool . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
19.4 OMPT Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
19.4.1 Tool Initialization and Finalization . . . . . . . . . . . . . . . . . . . . . . 451
19.4.2 Callbacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452
19.4.3 Tracing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 453
19.4.4 Miscellaneous Type Definitions . . . . . . . . . . . . . . . . . . . . . . . . 456
19.5 OMPT Tool Callback Signatures and Trace Records . . . . . . . . . . . . . . 473
19.5.1 Initialization and Finalization Callback Signature . . . . . . . . . . . . . . . 473
19.5.2 Event Callback Signatures and Trace Records . . . . . . . . . . . . . . . . . 474
19.6 OMPT Runtime Entry Points for Tools . . . . . . . . . . . . . . . . . . . . . . 510
19.6.1 Entry Points in the OMPT Callback Interface . . . . . . . . . . . . . . . . . 511
19.6.2 Entry Points in the OMPT Device Tracing Interface . . . . . . . . . . . . . 526

19.6.3 Lookup Entry Points: ompt_function_lookup_t . . . . . . . . . . . 537

20 OMPD Interface 539


20.1 OMPD Interfaces Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . 540
20.2 Activating a Third-Party Tool . . . . . . . . . . . . . . . . . . . . . . . . . . 540
20.2.1 Enabling Runtime Support for OMPD . . . . . . . . . . . . . . . . . . . . . 540
20.2.2 ompd_dll_locations . . . . . . . . . . . . . . . . . . . . . . . . . . 540
20.2.3 ompd_dll_locations_valid . . . . . . . . . . . . . . . . . . . . . . 541
20.3 OMPD Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 542
20.3.1 Size Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 542
20.3.2 Wait ID Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 542
20.3.3 Basic Value Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 542
20.3.4 Address Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543
20.3.5 Frame Information Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543
20.3.6 System Device Identifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . 544
20.3.7 Native Thread Identifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 544
20.3.8 OMPD Handle Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545
20.3.9 OMPD Scope Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545
20.3.10 ICV ID Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 546
20.3.11 Tool Context Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 546
20.3.12 Return Code Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547
20.3.13 Primitive Type Sizes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 548
20.4 OMPD Third-Party Tool Callback Interface . . . . . . . . . . . . . . . . . . . 549
20.4.1 Memory Management of OMPD Library . . . . . . . . . . . . . . . . . . . 549
20.4.2 Context Management and Navigation . . . . . . . . . . . . . . . . . . . . . 551
20.4.3 Accessing Memory in the OpenMP Program or Runtime . . . . . . . . . . . 554
20.4.4 Data Format Conversion: ompd_callback_device_host_fn_t . . . 558
20.4.5 ompd_callback_print_string_fn_t . . . . . . . . . . . . . . . . 559
20.4.6 The Callback Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 560
20.5 OMPD Tool Interface Routines . . . . . . . . . . . . . . . . . . . . . . . . . 561
20.5.1 Per OMPD Library Initialization and Finalization . . . . . . . . . . . . . . 562
20.5.2 Per OpenMP Process Initialization and Finalization . . . . . . . . . . . . . 565
20.5.3 Thread and Signal Safety . . . . . . . . . . . . . . . . . . . . . . . . . . . 569
20.5.4 Address Space Information . . . . . . . . . . . . . . . . . . . . . . . . . . 569

20.5.5 Thread Handles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 570
20.5.6 Parallel Region Handles . . . . . . . . . . . . . . . . . . . . . . . . . . . . 576
20.5.7 Task Handles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 580
20.5.8 Querying Thread States . . . . . . . . . . . . . . . . . . . . . . . . . . . . 586
20.5.9 Display Control Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . 588
20.5.10 Accessing Scope-Specific Information . . . . . . . . . . . . . . . . . . . . 590
20.6 Runtime Entry Points for OMPD . . . . . . . . . . . . . . . . . . . . . . . . . 594
20.6.1 Beginning Parallel Regions . . . . . . . . . . . . . . . . . . . . . . . . . . 594
20.6.2 Ending Parallel Regions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 595
20.6.3 Beginning Task Regions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 595
20.6.4 Ending Task Regions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 596
20.6.5 Beginning OpenMP Threads . . . . . . . . . . . . . . . . . . . . . . . . . . 596
20.6.6 Ending OpenMP Threads . . . . . . . . . . . . . . . . . . . . . . . . . . . 597
20.6.7 Initializing OpenMP Devices . . . . . . . . . . . . . . . . . . . . . . . . . 597
20.6.8 Finalizing OpenMP Devices . . . . . . . . . . . . . . . . . . . . . . . . . . 598

21 Environment Variables 599


21.1 Parallel Region Environment Variables . . . . . . . . . . . . . . . . . . . . . 600
21.1.1 OMP_DYNAMIC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 600
21.1.2 OMP_NUM_THREADS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 600
21.1.3 OMP_THREAD_LIMIT . . . . . . . . . . . . . . . . . . . . . . . . . . . . 601
21.1.4 OMP_MAX_ACTIVE_LEVELS . . . . . . . . . . . . . . . . . . . . . . . . 601
21.1.5 OMP_NESTED (Deprecated) . . . . . . . . . . . . . . . . . . . . . . . . . . 602
21.1.6 OMP_PLACES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 602
21.1.7 OMP_PROC_BIND . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 604
21.2 Program Execution Environment Variables . . . . . . . . . . . . . . . . . . . 605
21.2.1 OMP_SCHEDULE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 605
21.2.2 OMP_STACKSIZE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 606
21.2.3 OMP_WAIT_POLICY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 607
21.2.4 OMP_DISPLAY_AFFINITY . . . . . . . . . . . . . . . . . . . . . . . . . 607
21.2.5 OMP_AFFINITY_FORMAT . . . . . . . . . . . . . . . . . . . . . . . . . . 608
21.2.6 OMP_CANCELLATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . 610
21.2.7 OMP_DEFAULT_DEVICE . . . . . . . . . . . . . . . . . . . . . . . . . . 610
21.2.8 OMP_TARGET_OFFLOAD . . . . . . . . . . . . . . . . . . . . . . . . . . 610

21.2.9 OMP_MAX_TASK_PRIORITY . . . . . . . . . . . . . . . . . . . . . . . . 611
21.3 OMPT Environment Variables . . . . . . . . . . . . . . . . . . . . . . . . . . 611
21.3.1 OMP_TOOL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 611
21.3.2 OMP_TOOL_LIBRARIES . . . . . . . . . . . . . . . . . . . . . . . . . . 612
21.3.3 OMP_TOOL_VERBOSE_INIT . . . . . . . . . . . . . . . . . . . . . . . . 612
21.4 OMPD Environment Variables . . . . . . . . . . . . . . . . . . . . . . . . . . 613
21.4.1 OMP_DEBUG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613
21.5 Memory Allocation Environment Variables . . . . . . . . . . . . . . . . . . . 614
21.5.1 OMP_ALLOCATOR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 614
21.6 Teams Environment Variables . . . . . . . . . . . . . . . . . . . . . . . . . . 614
21.6.1 OMP_NUM_TEAMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 615
21.6.2 OMP_TEAMS_THREAD_LIMIT . . . . . . . . . . . . . . . . . . . . . . . 615
21.7 OMP_DISPLAY_ENV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 615

A OpenMP Implementation-Defined Behaviors 616

B Features History 625


B.1 Deprecated Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625
B.2 Version 5.1 to 5.2 Differences . . . . . . . . . . . . . . . . . . . . . . . . . . 626
B.3 Version 5.0 to 5.1 Differences . . . . . . . . . . . . . . . . . . . . . . . . . . 628
B.4 Version 4.5 to 5.0 Differences . . . . . . . . . . . . . . . . . . . . . . . . . . 631
B.5 Version 4.0 to 4.5 Differences . . . . . . . . . . . . . . . . . . . . . . . . . . 635
B.6 Version 3.1 to 4.0 Differences . . . . . . . . . . . . . . . . . . . . . . . . . . 636
B.7 Version 3.0 to 3.1 Differences . . . . . . . . . . . . . . . . . . . . . . . . . . 637
B.8 Version 2.5 to 3.0 Differences . . . . . . . . . . . . . . . . . . . . . . . . . . 638

Index 641

List of Figures
19.1 First-Party Tool Activation Flow Chart . . . . . . . . . . . . . . . . . . . . . . . . 442

List of Tables
2.1 ICV Scopes and Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.2 ICV Initial Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.3 Ways to Modify and to Retrieve ICV Values . . . . . . . . . . . . . . . . . . . . . 42
2.4 ICV Override Relationships . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

3.1 Syntactic Properties for Clauses, Arguments and Modifiers . . . . . . . . . . . . . 58

5.1 Implicitly Declared C/C++ Reduction Identifiers . . . . . . . . . . . . . . . . . . . 129


5.2 Implicitly Declared Fortran Reduction Identifiers . . . . . . . . . . . . . . . . . . 129
5.3 Map-Type Decay of Map Type Combinations . . . . . . . . . . . . . . . . . . . . 163

6.1 Predefined Memory Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171


6.2 Allocator Traits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
6.3 Predefined Allocators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

11.1 ompt_callback_work Callback Work Types for Worksharing-Loop . . . . . . 249

12.1 ompt_callback_task_create Callback Flags Evaluation . . . . . . . . . . 263

18.1 Required Values of the omp_interop_property_t enum Type . . . . . . . . 416


18.2 Required Values for the omp_interop_rc_t enum Type . . . . . . . . . . . . 417
18.3 Standard Tool Control Commands . . . . . . . . . . . . . . . . . . . . . . . . . . 436

19.1 OMPT Callback Interface Runtime Entry Point Names and Their Type Signatures . 445
19.2 Callbacks for which ompt_set_callback Must Return ompt_set_always 447
19.3 Callbacks for which ompt_set_callback May Return Any Non-Error Code . . 448
19.4 OMPT Tracing Interface Runtime Entry Point Names and Their Type Signatures . . 449

20.1 Mapping of Scope Type and OMPD Handles . . . . . . . . . . . . . . . . . . . . 546

21.1 Predefined Abstract Names for OMP_PLACES . . . . . . . . . . . . . . . . . . . . 603


21.2 Available Field Types for Formatting OpenMP Thread Affinity Information . . . . 609

1 1 Overview of the OpenMP API
2 The compiler directives, library routines, and environment variables that this document
3 describes collectively define the specification of the OpenMP Application Program
4 Interface (OpenMP API) for parallelism in C, C++ and Fortran programs.
5 This specification provides a model for parallel programming that is portable across architectures
6 from different vendors. Compilers from numerous vendors support the OpenMP API. More
7 information about the OpenMP API can be found at the following web site:
8 http://www.openmp.org
9 The directives, library routines, environment variables, and tool support that this document defines
10 allow users to create, to manage, to debug and to analyze parallel programs while permitting
11 portability. The directives extend the C, C++ and Fortran base languages with single program
12 multiple data (SPMD) constructs, tasking constructs, device constructs, work-distribution
13 constructs, and synchronization constructs, and they provide support for sharing, mapping and
14 privatizing data. The functionality to control the runtime environment is provided by library
15 routines and environment variables. Compilers that support the OpenMP API often include
16 command line options to enable or to disable interpretation of some or all OpenMP directives.

17 1.1 Scope
18 The OpenMP API covers only user-directed parallelization, wherein the programmer explicitly
19 specifies the actions to be taken by the compiler and runtime system in order to execute the program
20 in parallel. OpenMP-compliant implementations are not required to check for data dependences,
21 data conflicts, race conditions, or deadlocks. Compliant implementations also are not required to
22 check for any code sequences that cause a program to be classified as non-conforming. Application
23 developers are responsible for correctly using the OpenMP API to produce a conforming program.
24 The OpenMP API does not cover compiler-generated automatic parallelization.

1 1.2 Glossary
2 1.2.1 Threading Concepts

3 thread An execution entity with a stack and associated threadprivate memory.


4 OpenMP thread A thread that is managed by the OpenMP implementation.
5 thread number A number that the OpenMP implementation assigns to an OpenMP thread. For
6 threads within the same team, zero identifies the primary thread and consecutive
7 numbers identify the other threads of this team.
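
As an illustration of the thread number definition, the following C sketch (not part of the specification) records the number that each thread of one team reports and checks that the numbers are 0 through team size minus 1. The helper name thread_numbers_are_consecutive, the MAX_THREADS bound, the request for four threads, and the fallback stubs used when the compiler does not enable OpenMP are assumptions made for this example.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Assumed fallback stubs so the sketch also builds without OpenMP. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

enum { MAX_THREADS = 4 };  /* hypothetical upper bound for this example */

/* Returns 1 if the threads of one team report the numbers 0..size-1. */
int thread_numbers_are_consecutive(void)
{
    int seen[MAX_THREADS] = {0};
    int team_size = 0;

    #pragma omp parallel num_threads(MAX_THREADS)
    {
        int tid = omp_get_thread_num();   /* 0 identifies the primary thread */
        assert(tid >= 0 && tid < MAX_THREADS);
        seen[tid] = 1;                    /* each thread writes its own slot */

        #pragma omp single
        team_size = omp_get_num_threads();
    }

    for (int i = 0; i < team_size; i++)
        if (!seen[i])
            return 0;
    return team_size >= 1;
}
```

The implementation may provide fewer threads than requested, so the check uses the team size that the runtime reports rather than the requested value.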
8 idle thread An OpenMP thread that is not currently part of any parallel region.
9 thread-safe routine A routine that performs the intended function even when executed concurrently (by
10 more than one thread).
11 processor Implementation-defined hardware unit on which one or more OpenMP threads can
12 execute.
13 device An implementation-defined logical execution engine.
14 COMMENT: A device could have one or more processors.
15 host device The device on which the OpenMP program begins execution.
16 target device A device with respect to which the current device performs an operation, as specified
17 by a device construct or an OpenMP device memory routine.
18 parent device For a given target region, the device on which the corresponding target
19 construct was encountered.

20 1.2.2 OpenMP Language Terminology


21 base language A programming language that serves as the foundation of the OpenMP specification.
22 COMMENT: See Section 1.7 for a listing of current base languages for
23 the OpenMP API.
24 base program A program written in a base language.
25 preprocessed code For C/C++, a sequence of preprocessing tokens that result from the first six phases of
26 translation, as defined by the base language.
27 program order An ordering of operations performed by the same thread as determined by the
28 execution sequence of operations specified by the base language.

1 COMMENT: For versions of C and C++ that include base language
2 support for threading, program order corresponds to the sequenced before
3 relation between operations performed by the same thread.
4 structured block For C/C++, an executable statement, possibly compound, with a single entry at the
5 top and a single exit at the bottom, or an OpenMP construct.
6 For Fortran, a strictly structured block or a loosely structured block.
7 structured block sequence For C/C++, a sequence of zero or more executable statements (including OpenMP
8 constructs) that together have a single entry at the top and a single exit at the bottom.
9 For Fortran, a block of zero or more executable constructs (including OpenMP
10 constructs) with a single entry at the top and a single exit at the bottom.
11 strictly structured block A single Fortran BLOCK construct, with a single entry at the top and a single exit at
12 the bottom.
13 loosely structured block A block of zero or more executable constructs (including OpenMP constructs),
14 where the first executable construct (if any) is not a Fortran BLOCK construct, with a
15 single entry at the top and a single exit at the bottom.
16 compilation unit For C/C++, a translation unit.
17 For Fortran, a program unit.
18 enclosing context For C/C++, the innermost scope enclosing an OpenMP directive.
19 For Fortran, the innermost scoping unit enclosing an OpenMP directive.
20 directive A base language mechanism to specify OpenMP program behavior.
21 COMMENT: See Section 3.1 for a description of OpenMP directive
22 syntax in each base language.
23 white space A non-empty sequence of space and/or horizontal tab characters.
24 OpenMP program A program that consists of a base program that is annotated with OpenMP directives
25 or that calls OpenMP API runtime library routines.
26 conforming program An OpenMP program that follows all rules and restrictions of the OpenMP
27 specification.
28 implementation code Implicit code that is introduced by the OpenMP implementation.
29 metadirective A directive that conditionally resolves to another directive.
30 declarative directive An OpenMP directive that may only be placed in a declarative context and results in
31 one or more declarations only; it is not associated with the immediate execution of
32 any user code or implementation code. For C++, if a declarative directive applies to a
33 function declaration or definition and it is specified with one or more C++ attribute
34 specifiers, the specified attributes must be applied to the function as permitted by the
1 base language. For Fortran, a declarative directive must appear after any USE,
2 IMPORT, and IMPLICIT statements in a declarative context.
3 executable directive An OpenMP directive that appears in an executable context and results in
4 implementation code and/or prescribes the manner in which associated user code
5 must execute.
6 informational directive An OpenMP directive that is neither declarative nor executable, but otherwise
7 conveys user code properties to the compiler.
8 utility directive An OpenMP directive that facilitates interactions with the compiler and/or supports
9 code readability; it may be either informational or executable.
10 stand-alone directive An OpenMP construct with which no user code is associated, but which may produce
11 implementation code.
12 construct An OpenMP executable directive and its paired end directive (if any) and the
13 associated structured block (if any) not including the code in any called routines.
14 That is, the lexical extent of an executable directive.
15 subsidiary directive An OpenMP directive that is not an executable directive and that appears only as part
16 of an OpenMP construct.
17 combined construct A construct that is a shortcut for specifying one construct immediately nested inside
18 another construct. A combined construct is semantically identical to
19 explicitly specifying the first construct containing one instance of the second
20 construct and no other statements.
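
The combined construct definition can be sketched in C as follows. This is an illustrative example rather than specification text: the function names sum_combined and sum_nested are hypothetical, and placing the reduction clause on the enclosing parallel construct is one possible way to write the nested form.

```c
#include <assert.h>

/* Combined construct: parallel and for are specified together. */
long sum_combined(int n)
{
    assert(n >= 0);
    long sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += i;
    return sum;
}

/* Explicit nesting: a parallel construct that contains exactly one
   for construct and no other statements. */
long sum_nested(int n)
{
    assert(n >= 0);
    long sum = 0;
    #pragma omp parallel reduction(+:sum)
    {
        #pragma omp for
        for (int i = 0; i < n; i++)
            sum += i;
    }
    return sum;
}
```

Both functions should return n(n-1)/2 for any non-negative n, whether or not the compiler enables OpenMP.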
21 composite construct A construct that is composed of two constructs but does not have identical semantics
22 to specifying one of the constructs immediately nested inside the other. A composite
23 construct either adds semantics not included in the constructs from which it is
24 composed or provides an effective nesting of the one construct inside the other that
25 would otherwise be non-conforming.
26 constituent construct For a given combined or composite construct, a construct from which it, or any one
27 of its constituent constructs, is composed.
28 COMMENT: The constituent constructs of a
29 target teams distribute parallel for simd construct are the
30 following constructs: target,
31 teams distribute parallel for simd, teams,
32 distribute parallel for simd, distribute,
33 parallel for simd, parallel, for simd, for, and simd.
34 leaf construct For a given combined or composite construct, a constituent construct that is not itself
35 a combined or composite construct.
36 COMMENT: The leaf constructs of a
37 target teams distribute parallel for simd construct are the
1 following constructs: target, teams, distribute, parallel,
2 for, and simd.
3 combined target construct A combined construct that is composed of a target construct along with another
4 construct.
5 region All code encountered during a specific instance of the execution of a given construct,
6 structured block sequence or OpenMP library routine. A region includes any code in
7 called routines as well as any implementation code. The generation of a task at the
8 point where a task generating construct is encountered is a part of the region of the
9 encountering thread. However, an explicit task region that corresponds to a task
10 generating construct is not part of the region of the encountering thread unless it is
11 an included task region. The point where a target or teams directive is
12 encountered is a part of the region of the encountering thread, but the region that
13 corresponds to the target or teams directive is not.
14 COMMENTS:
15 A region may also be thought of as the dynamic or runtime extent of a
16 construct or of an OpenMP library routine.
17 During the execution of an OpenMP program, a construct may give rise
18 to many regions.
19 active parallel region A parallel region that is executed by a team consisting of more than one thread.

20 inactive parallel region A parallel region that is executed by a team of only one thread.
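
A minimal C sketch of an inactive parallel region follows, assuming the function is called from the sequential part of the program. The function name serialized_region_is_inactive and the fallback stubs for builds without OpenMP are assumptions of this example. An if clause that evaluates to false causes the region to be executed by a team of only one thread.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Assumed fallback stubs so the sketch also builds without OpenMP. */
static int omp_get_num_threads(void) { return 1; }
static int omp_in_parallel(void) { return 0; }
#endif

/* Returns 1 if a parallel region with a false if clause runs with a
   team of one thread and is not reported as an active region. */
int serialized_region_is_inactive(void)
{
    int size = 0;
    int active = 1;

    #pragma omp parallel if(0) num_threads(4)
    {
        /* Only one thread executes this block, so the plain
           assignments to the shared variables do not race. */
        size = omp_get_num_threads();
        active = omp_in_parallel();
    }

    assert(size >= 1);
    return size == 1 && active == 0;
}
```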
21 active target region A target region that is executed on a device other than the device that encountered
22 the target construct.

23 inactive target region A target region that is executed on the same device that encountered the target
24 construct.
25 sequential part All code encountered during the execution of an initial task region that is not part of
26 a parallel region corresponding to a parallel construct or a task region
27 corresponding to a task construct.
28 COMMENTS:
29 A sequential part is enclosed by an implicit parallel region.
30 Executable statements in called routines may be in both a sequential part
31 and any number of explicit parallel regions at different points in the
32 program execution.
33 primary thread An OpenMP thread that has thread number 0. A primary thread may be an initial
34 thread or the thread that encounters a parallel construct, creates a team,
35 generates a set of implicit tasks, and then executes one of those tasks as thread
36 number 0.

1 worker thread An OpenMP thread that is not the primary thread of a team and that executes one of
2 the implicit tasks of a parallel region.
3 parent thread The thread that encountered the parallel construct and generated a parallel
4 region is the parent thread of each of the threads in the team of that parallel
5 region. The primary thread of a parallel region is the same thread as its parent
6 thread with respect to any resources associated with an OpenMP thread.
7 child thread When a thread encounters a parallel construct, each of the threads in the
8 generated parallel region’s team is a child thread of the encountering thread.
9 The target or teams region’s initial thread is not a child thread of the thread that
10 encountered the target or teams construct.
11 ancestor thread For a given thread, its parent thread or one of its parent thread’s ancestor threads.
12 descendent thread For a given thread, one of its child threads or one of its child threads’ descendent
13 threads.
14 team A set of one or more threads participating in the execution of a parallel region.
15 COMMENTS:
16 For an active parallel region, the team comprises the primary thread and
17 at least one additional thread.
18 For an inactive parallel region, the team comprises only the primary
19 thread.
20 league The set of teams created by a teams construct.
21 contention group An initial thread and its descendent threads.
22 implicit parallel region An inactive parallel region that is not generated from a parallel construct.
23 Implicit parallel regions surround the whole OpenMP program, all target regions,
24 and all teams regions.
25 initial thread The thread that executes an implicit parallel region.
26 initial team The team that comprises an initial thread executing an implicit parallel region.
27 nested construct A construct (lexically) enclosed by another construct.
28 closely nested construct A construct nested inside another construct with no other construct nested between
29 them.
30 explicit region A region that corresponds to either a construct of the same name or a library routine
31 call that explicitly appears in the program.
32 nested region A region (dynamically) enclosed by another region. That is, a region generated from
33 the execution of another region or one of its nested regions.

1 COMMENT: Some nestings are conforming and some are not. See
2 Section 17.1 for the restrictions on nesting.
3 closely nested region A region nested inside another region with no parallel region nested between
4 them.
5 strictly nested region A region nested inside another region with no other explicit region nested between
6 them.
7 all threads All OpenMP threads participating in the OpenMP program.
8 current team All threads in the team executing the innermost enclosing parallel region.
9 encountering thread For a given region, the thread that encounters the corresponding construct.
10 all tasks All tasks participating in the OpenMP program.
11 current team tasks All tasks encountered by the corresponding team. The implicit tasks constituting the
12 parallel region and any descendent tasks encountered during the execution of
13 these implicit tasks are included in this set of tasks.
14 generating task For a given region, the task for which execution by a thread generated the region.
15 binding thread set The set of threads that are affected by, or provide the context for, the execution of a
16 region.
17 The binding thread set for a given region can be all threads on a specified set of
18 devices, all threads in a contention group, all primary threads executing an enclosing
19 teams region, the current team, or the encountering thread.
20 COMMENT: The binding thread set for a particular region is described in
21 its corresponding subsection of this specification.
22 binding task set The set of tasks that are affected by, or provide the context for, the execution of a
23 region.
24 The binding task set for a given region can be all tasks, the current team tasks, all
25 tasks of the current team that are generated in the region, the binding implicit task, or
26 the generating task.
27 COMMENT: The binding task set for a particular region (if applicable) is
28 described in its corresponding subsection of this specification.
29 binding region The enclosing region that determines the execution context and limits the scope of
30 the effects of the bound region is called the binding region.
31 Binding region is not defined for regions for which the binding thread set is all
32 threads or the encountering thread, nor is it defined for regions for which the binding
33 task set is all tasks.
34 orphaned construct A construct that gives rise to a region for which the binding thread set is the current
35 team, but is not nested within another construct that gives rise to the binding region.
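
The following C sketch illustrates an orphaned construct under stated assumptions: the names fill_squares and orphaned_for_demo and the array size are hypothetical. The for construct in fill_squares is not lexically nested in a parallel construct. When the routine is called from a parallel region, the loop iterations are shared among the current team; when it is called from the sequential part, the team consists of the initial thread only, which executes every iteration.

```c
#include <assert.h>

/* Orphaned worksharing-loop: no enclosing parallel construct appears
   in this routine. */
static void fill_squares(int *v, int n)
{
    assert(n >= 0);
    #pragma omp for
    for (int i = 0; i < n; i++)
        v[i] = i * i;
}

/* Calls the routine from inside and outside a parallel region and
   returns 1 if both calls filled the array completely. */
int orphaned_for_demo(void)
{
    enum { N = 100 };
    int inside[N] = {0};
    int outside[N] = {0};

    #pragma omp parallel
    fill_squares(inside, N);   /* binds to the team of this region */

    fill_squares(outside, N);  /* binds to the implicit parallel region */

    for (int i = 0; i < N; i++)
        if (inside[i] != i * i || outside[i] != i * i)
            return 0;
    return 1;
}
```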

1 work-distribution construct A construct that is cooperatively executed by threads in the binding thread set of the
2 corresponding region.
3 worksharing construct A work-distribution construct that is executed by the thread team of the innermost
4 enclosing parallel region and includes, by default, an implicit barrier.
5 device construct An OpenMP construct that accepts the device clause.
6 cancellable construct An OpenMP construct that can be cancelled.
7 device routine A function (for C/C++ and Fortran) or subroutine (for Fortran) that can be executed
8 on a target device, as part of a target region.
9 target variant A version of a device routine that can only be executed as part of a target region.
10 foreign runtime environment A runtime environment that exists outside the OpenMP runtime with which the
11 OpenMP implementation may interoperate.
12 foreign execution context A context that is instantiated from a foreign runtime environment in order to facilitate
13 execution on a given device.
14 foreign task A unit of work executed in a foreign execution context.
15 indirect device invocation An indirect call to the device version of a procedure on a device other than the host
16 device, through a function pointer (C/C++), a pointer to a member function (C++) or
17 a procedure pointer (Fortran) that refers to the host version of the procedure.
18 place An unordered set of processors on a device.
19 place list The ordered list that describes all OpenMP places available to the execution
20 environment.
21 place partition An ordered list that corresponds to a contiguous interval in the OpenMP place list. It
22 describes the places currently available to the execution environment for a given
23 parallel region.
24 place number A number that uniquely identifies a place in the place list, with zero identifying the
25 first place in the place list, and each consecutive whole number identifying the next
26 place in the place list.
27 thread affinity A binding of threads to places within the current place partition.
28 SIMD instruction A single machine instruction that can operate on multiple data elements.
29 SIMD lane A software or hardware mechanism capable of processing one data element from a
30 SIMD instruction.
31 SIMD chunk A set of iterations executed concurrently, each by a SIMD lane, by a single thread by
32 means of SIMD instructions.
33 memory A storage resource to store and to retrieve variables accessible by OpenMP threads.

1 memory space A representation of storage resources from which memory can be allocated or
2 deallocated. More than one memory space may exist.
3 memory allocator An OpenMP object that fulfills requests to allocate and to deallocate memory for
4 program variables from the storage resources of its associated memory space.
5 handle An opaque reference that uniquely identifies an abstraction.

6 1.2.3 Loop Terminology


7 canonical loop nest A loop nest that complies with the rules and restrictions defined in Section 4.4.1.
8 loop-associated directive An OpenMP executable directive for which the associated user code must be a
9 canonical loop nest.
10 associated loop A loop from a canonical loop nest that is controlled by a given loop-associated
11 directive.
12 loop nest depth For a canonical loop nest, the maximal number of loops, including the outermost
13 loop, that can be associated with a loop-associated directive.
14 logical iteration space For a loop-associated directive, the sequence $0, \ldots, N-1$ where $N$ is the number of
15 iterations of the loops associated with the directive. The logical numbering denotes
16 the sequence in which the iterations would be executed if the set of associated loops
17 were executed sequentially.
18 logical iteration An iteration from the associated loops of a loop-associated directive, designated by a
19 logical number from the logical iteration space of the associated loops.
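
For a single associated loop, the logical numbering can be sketched in C as shown below. The example assumes a loop of the form for (i = lb; i < ub; i += step) with a positive step; the function names iteration_count and logical_number are hypothetical and do not come from the specification.

```c
#include <assert.h>

/* Number of iterations N of: for (i = lb; i < ub; i += step), step > 0. */
long iteration_count(long lb, long ub, long step)
{
    assert(step > 0);
    return (ub > lb) ? (ub - lb + step - 1) / step : 0;
}

/* Logical iteration number (0 .. N-1) of the iteration in which the
   loop variable has the value i. */
long logical_number(long lb, long step, long i)
{
    assert(step > 0 && i >= lb && (i - lb) % step == 0);
    return (i - lb) / step;
}
```

For example, the loop for (i = 3; i < 20; i += 4) visits i = 3, 7, 11, 15, 19, so N is 5 and the iteration with i = 11 has logical number 2.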
20 logical iteration vector space For a loop-associated directive with $n$ associated nested loops, the set of $n$-tuples
21 $(i_1, \ldots, i_n)$. For the $k$th associated loop, from outermost to innermost, $i_k$ is its
22 logical iteration number as if it were the only associated loop.
23 logical iteration vector An iteration from the associated nested loops of a loop-associated directive, where $n$
24 is the number of associated loops, designated by an $n$-tuple from the logical iteration
25 vector space of the associated loops.
26 lexicographic order The total order of two logical iteration vectors $\omega_a = (i_1, \ldots, i_n)$ and
27 $\omega_b = (j_1, \ldots, j_n)$, denoted by $\omega_a \leq_{lex} \omega_b$, where either $\omega_a = \omega_b$ or
28 $\exists m \in \{1, \ldots, n\}$ such that $i_m < j_m$ and $i_k = j_k$ for all $k \in \{1, \ldots, m-1\}$.
29 product order The partial order of two logical iteration vectors $\omega_a = (i_1, \ldots, i_n)$ and
30 $\omega_b = (j_1, \ldots, j_n)$, denoted by $\omega_a \leq_{product} \omega_b$, where $i_k \leq j_k$ for all $k \in \{1, \ldots, n\}$.
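
The two orders on logical iteration vectors can be compared with a short C sketch. The function names lex_le and product_le are hypothetical, and the vectors are stored in zero-based arrays, whereas the definitions above index components from 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Lexicographic order: a <= b if the vectors are equal or the first
   differing component of a is smaller than that of b. */
bool lex_le(const int *a, const int *b, int n)
{
    assert(n > 0);
    for (int k = 0; k < n; k++) {
        if (a[k] < b[k]) return true;
        if (a[k] > b[k]) return false;
    }
    return true;  /* all components are equal */
}

/* Product order: a <= b only if every component of a is less than or
   equal to the matching component of b. */
bool product_le(const int *a, const int *b, int n)
{
    assert(n > 0);
    for (int k = 0; k < n; k++)
        if (a[k] > b[k])
            return false;
    return true;
}
```

The vectors (1, 5) and (2, 0) show the difference: they are ordered lexicographically because 1 < 2, but they are not comparable in the product order because 5 > 0 in the second component, which is why the product order is only a partial order.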
31 loop transformation construct A construct that is replaced by the loops that result from applying the transformation
32 as defined by its directive to its associated loops.



1 generated loop A loop that is generated by a loop transformation construct and is one of the
2 resulting loops that replace the construct.
3 SIMD loop A loop that includes at least one SIMD chunk.
4 non-rectangular loop For a loop nest, a loop for which a loop bound references the iteration variable of a
5 surrounding loop in the loop nest.
6 perfectly nested loop A loop that has no intervening code between it and the body of its surrounding loop.
7 The outermost loop of a loop nest is always perfectly nested.
8 doacross loop nest A loop nest, consisting of loops that may be associated with the same
9 loop-associated directive, that has cross-iteration dependences. An iteration is
10 dependent on one or more lexicographically earlier iterations.
11 COMMENT: The ordered clause parameter on a worksharing-loop
12 directive identifies the loops associated with the doacross loop nest.
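A doacross loop nest can be illustrated with a running sum, where each iteration depends on the lexicographically earlier iteration. This is a sketch only: the function name `prefix_sum` is hypothetical, and the sketch uses the `depend(sink: ...)` and `depend(source)` forms on the `ordered` directive because they are accepted by a wide range of compilers. A compiler that targets this version of the specification may prefer the `doacross` clause instead.

```c
/* Running sum with a cross-iteration dependence: iteration i reads the
 * value that iteration i - 1 wrote. The ordered(1) clause marks one
 * associated loop as forming the doacross loop nest. */
void prefix_sum(const int *in, int *out, int n)
{
    if (n <= 0) return;
    out[0] = in[0];

    #pragma omp parallel for ordered(1)
    for (int i = 1; i < n; ++i) {
        /* Wait until iteration i - 1 has signaled completion. A sink
         * that falls outside the iteration space is ignored. */
        #pragma omp ordered depend(sink: i - 1)
        out[i] = out[i - 1] + in[i];
        /* Signal that this iteration's result is available. */
        #pragma omp ordered depend(source)
    }
}
```

When the directives are ignored, the loop runs sequentially and produces the same result.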

13 1.2.4 Synchronization Terminology


14 barrier A point in the execution of a program encountered by a team of threads, beyond
15 which no thread in the team may execute until all threads in the team have reached
16 the barrier and all explicit tasks generated by the team have executed to completion.
17 If cancellation has been requested, threads may proceed to the end of the canceled
18 region even if some threads in the team have not reached the barrier.
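As a minimal sketch of the barrier definition, the following hypothetical function `two_phase` fills one array and then reads it in a second phase. The explicit barrier ensures that no thread starts the second phase before every thread has finished the first.

```c
/* Phase 1 writes a[]; phase 2 reads elements of a[] that another
 * thread may have written, so all threads must finish phase 1 first. */
void two_phase(int *a, int *b, int n)
{
    #pragma omp parallel
    {
        /* nowait removes the implicit barrier at the end of this loop,
         * so the explicit barrier below is the only synchronization. */
        #pragma omp for nowait
        for (int i = 0; i < n; ++i) {
            a[i] = i;
        }

        #pragma omp barrier

        #pragma omp for
        for (int i = 0; i < n; ++i) {
            b[i] = a[n - 1 - i];
        }
    }
}
```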
19 cancellation An action that cancels (that is, aborts) an OpenMP region and causes executing
20 implicit or explicit tasks to proceed to the end of the canceled region.
21 cancellation point A point at which implicit and explicit tasks check if cancellation has been requested.
22 If cancellation has been observed, they perform the cancellation.
23 flush An operation that a thread performs to enforce consistency between its view and
24 other threads’ view of memory.
25 device-set The set of devices for which a flush operation may enforce memory consistency.
26 flush property A property that determines the manner in which a flush operation enforces memory
27 consistency. The defined flush properties are:
28 • strong: flushes a set of variables from the current thread’s temporary view of the
29 memory to the memory;
30 • release: orders memory operations that precede the flush before memory
31 operations performed by a different thread with which it synchronizes;



1 • acquire: orders memory operations that follow the flush after memory operations
2 performed by a different thread that synchronizes with it.
3 COMMENT: Any flush operation has one or more flush properties.
4 strong flush A flush operation that has the strong flush property.
5 release flush A flush operation that has the release flush property.
6 acquire flush A flush operation that has the acquire flush property.
7 atomic operation An operation that is specified by an atomic construct or is implicitly performed by
8 the OpenMP implementation and that atomically accesses and/or modifies a specific
9 storage location.
10 atomic read An atomic operation that is specified by an atomic construct on which the read
11 clause is present.
12 atomic write An atomic operation that is specified by an atomic construct on which the write
13 clause is present.
14 atomic update An atomic operation that is specified by an atomic construct on which the
15 update clause is present.
atomic captured update An atomic update operation that is specified by an atomic construct on which the
capture clause is present.
atomic conditional update An atomic update operation that is specified by an atomic construct on which the
compare clause is present.
20 read-modify-write An atomic operation that reads and writes to a given storage location.
21 COMMENT: Any atomic update is a read-modify-write operation.
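The difference between an atomic update and an atomic captured update can be sketched as follows. The function names `count_even` and `take_ticket` are hypothetical and serve only as illustration.

```c
/* Atomic update: several threads increment one shared counter.
 * Each increment is a read-modify-write on the location `count`. */
int count_even(const int *v, int n)
{
    int count = 0;

    #pragma omp parallel for shared(count)
    for (int i = 0; i < n; ++i) {
        if (v[i] % 2 == 0) {
            #pragma omp atomic update
            count += 1;
        }
    }
    return count;
}

/* Atomic captured update: read the old value and advance the counter
 * as a single atomic operation, so concurrent callers that share
 * `next` never receive the same ticket. */
int take_ticket(int *next)
{
    int mine;

    #pragma omp atomic capture
    { mine = *next; *next += 1; }

    return mine;
}
```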
sequentially consistent atomic construct An atomic construct for which the seq_cst clause is specified.
non-sequentially consistent atomic construct An atomic construct for which the seq_cst clause is not specified.
sequentially consistent atomic operation An atomic operation that is specified by a sequentially consistent atomic construct.



1 1.2.5 Tasking Terminology
2 task A specific instance of executable code and its data environment that the OpenMP
3 implementation can schedule for execution by threads.
4 task region A region consisting of all code encountered during the execution of a task.
5 implicit task A task generated by an implicit parallel region or generated when a parallel
6 construct is encountered during execution.
7 binding implicit task The implicit task of the current thread team assigned to the encountering thread.
8 explicit task A task that is not an implicit task.
9 initial task An implicit task associated with an implicit parallel region.
10 current task For a given thread, the task corresponding to the task region in which it is executing.
11 encountering task For a given region, the current task of the encountering thread.
12 child task A task is a child task of its generating task region. A child task region is not part of
13 its generating task region.
14 sibling tasks Tasks that are child tasks of the same task region.
15 descendent task A task that is the child task of a task region or of one of its descendent task regions.
16 task completion A condition that is satisfied when a thread reaches the end of the executable code that
17 is associated with the task and any allow-completion event that is created for the task
18 has been fulfilled.
19 COMMENT: Completion of the initial task that is generated when the
20 program begins occurs at program exit.
21 task scheduling point A point during the execution of the current task region at which it can be suspended
22 to be resumed later; or the point of task completion, after which the executing thread
23 may switch to a different task region.
24 task switching The act of a thread switching from the execution of one task to another task.
25 tied task A task that, when its task region is suspended, can be resumed only by the same
26 thread that was executing it before suspension. That is, the task is tied to that thread.
27 untied task A task that, when its task region is suspended, can be resumed by any thread in the
28 team. That is, the task is not tied to any thread.
29 undeferred task A task for which execution is not deferred with respect to its generating task region.
30 That is, its generating task region is suspended until execution of the structured block
31 associated with the undeferred task is completed.
32 included task A task for which execution is sequentially included in the generating task region.
33 That is, an included task is undeferred and executed by the encountering thread.



1 merged task A task for which the data environment, inclusive of ICVs, is the same as that of its
2 generating task region.
3 mergeable task A task that may be a merged task if it is an undeferred task or an included task.
4 final task A task that forces all of its child tasks to become final and included tasks.
5 detachable task An explicit task that only completes after an associated event variable that represents
6 an allow-completion event is fulfilled and execution of the associated structured
7 block has completed.
8 task dependence An ordering relation between two sibling tasks: the dependent task and a previously
9 generated predecessor task. The task dependence is fulfilled when the predecessor
10 task has completed.
11 dependent task A task that because of a task dependence cannot be executed until its predecessor
12 tasks have completed.
mutually exclusive tasks Tasks that may be executed in any order, but not at the same time.
14 predecessor task A task that must complete before its dependent tasks can be executed.
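A task dependence between two sibling tasks can be sketched with the `depend` clause. The function name `pipeline` is hypothetical. The first task is the predecessor task and the second is the dependent task, which cannot run until the first has completed.

```c
/* Two sibling tasks generated by one thread. The dependence on x
 * orders them: the second task starts only after the first completes. */
int pipeline(int seed)
{
    int x = 0;
    int y = 0;

    #pragma omp parallel
    #pragma omp single
    {
        /* Predecessor task: produces x. */
        #pragma omp task depend(out: x) shared(x)
        x = seed + 1;

        /* Dependent task: consumes x and produces y. */
        #pragma omp task depend(in: x) depend(out: y) shared(x, y)
        y = x * 2;

        /* Wait for both child tasks before leaving the region. */
        #pragma omp taskwait
    }
    return y;
}
```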
task synchronization construct A taskwait, taskgroup, or a barrier construct.
task generating construct A construct that generates one or more explicit tasks that are child tasks of the
encountering task.

18 target task A mergeable and untied task that is generated by a device construct or a call to a
19 device memory routine and that coordinates activity between the current device and
20 the target device.
21 taskgroup set A set of tasks that are logically grouped by a taskgroup region, such that a task is
22 a member of the taskgroup set if and only if its task region is nested in the
23 taskgroup region and it binds to the same parallel region as the taskgroup
24 region.



1 1.2.6 Data Terminology

2 variable A named data storage block, for which the value can be defined and redefined during
3 the execution of a program.
4 COMMENT: An array element or structure element is a variable that is
5 part of another variable.
6 scalar variable For C/C++, a scalar variable, as defined by the base language.
7 For Fortran, a scalar variable with intrinsic type, as defined by the base language,
8 excluding character type.
9 aggregate variable A variable, such as an array or structure, composed of other variables. For Fortran, a
10 variable of character type is considered an aggregate variable.
11 array section A designated subset of the elements of an array that is specified using a subscript
12 notation that can select more than one element.
13 array item An array, an array section, or an array element.
14 shape-operator For C/C++, an array shaping operator that reinterprets a pointer expression as an
15 array with one or more specified dimensions.
16 implicit array For C/C++, the set of array elements of non-array type T that may be accessed by
17 applying a sequence of [] operators to a given pointer that is either a pointer to type T
18 or a pointer to a multidimensional array of elements of type T .
19 For Fortran, the set of array elements for a given array pointer.
20 COMMENT: For C/C++, the implicit array for pointer p with type T
21 (*)[10] consists of all accessible elements p[i][j], for all i and j=0,1,...,9.
22 base pointer For C/C++, an lvalue pointer expression that is used by a given lvalue expression or
23 array section to refer indirectly to its storage, where the lvalue expression or array
24 section is part of the implicit array for that lvalue pointer expression.
25 For Fortran, a data pointer that appears last in the designator for a given variable or
26 array section, where the variable or array section is part of the pointer target for that
27 data pointer.
COMMENT: For the array section
(*p0).x0[k1].p1->p2[k2].x1[k3].x2[4][0:n], where identifiers $p_i$ have a
pointer type declaration and identifiers $x_i$ have an array type declaration,
the base pointer is: (*p0).x0[k1].p1->p2.
32 named pointer For C/C++, the base pointer of a given lvalue expression or array section, or the base
33 pointer of one of its named pointers.



1 For Fortran, the base pointer of a given variable or array section, or the base pointer
2 of one of its named pointers.
COMMENT: For the array section
(*p0).x0[k1].p1->p2[k2].x1[k3].x2[4][0:n], where identifiers $p_i$ have a
pointer type declaration and identifiers $x_i$ have an array type declaration,
the named pointers are: p0, (*p0).x0[k1].p1, and (*p0).x0[k1].p1->p2.
7 containing array For C/C++, a non-subscripted array (a containing array) to which a series of zero or
8 more array subscript operators and/or . (dot) operators are applied to yield a given
9 lvalue expression or array section for which storage is contained by the array.
10 For Fortran, an array (a containing array) without the POINTER attribute and
11 without a subscript list to which a series of zero or more array subscript operators
12 and/or component selectors are applied to yield a given variable or array section for
13 which storage is contained by the array.
COMMENT: An array is a containing array of itself. For the array section
(*p0).x0[k1].p1->p2[k2].x1[k3].x2[4][0:n], where identifiers $p_i$ have a
pointer type declaration and identifiers $x_i$ have an array type declaration,
the containing arrays are: (*p0).x0[k1].p1->p2[k2].x1 and
(*p0).x0[k1].p1->p2[k2].x1[k3].x2.
19 containing structure For C/C++, a structure to which a series of zero or more . (dot) operators and/or
20 array subscript operators are applied to yield a given lvalue expression or array
21 section for which storage is contained by the structure.
22 For Fortran, a structure to which a series of zero or more component selectors and/or
23 array subscript selectors are applied to yield a given variable or array section for
24 which storage is contained by the structure.
25 COMMENT: A structure is a containing structure of itself. For C/C++, a
26 structure pointer p to which the -> operator applies is equivalent to the
27 application of a . (dot) operator to (*p) for the purposes of determining
28 containing structures.
For the array section (*p0).x0[k1].p1->p2[k2].x1[k3].x2[4][0:n], where
identifiers $p_i$ have a pointer type declaration and identifiers $x_i$ have an
array type declaration, the containing structures are: *(*p0).x0[k1].p1,
(*(*p0).x0[k1].p1).p2[k2] and (*(*p0).x0[k1].p1).p2[k2].x1[k3].
33 base array For C/C++, a containing array of a given lvalue expression or array section that does
34 not appear in the expression of any of its other containing arrays.
35 For Fortran, a containing array of a given variable or array section that does not
36 appear in the designator of any of its other containing arrays.
COMMENT: For the array section
(*p0).x0[k1].p1->p2[k2].x1[k3].x2[4][0:n], where identifiers $p_i$ have a
pointer type declaration and identifiers $x_i$ have an array type declaration,
the base array is: (*p0).x0[k1].p1->p2[k2].x1[k3].x2.
3 named array For C/C++, a containing array of a given lvalue expression or array section, or a
4 containing array of one of its named pointers.
5 For Fortran, a containing array of a given variable or array section, or a containing
6 array of one of its named pointers.
COMMENT: For the array section
(*p0).x0[k1].p1->p2[k2].x1[k3].x2[4][0:n], where identifiers $p_i$ have a
pointer type declaration and identifiers $x_i$ have an array type declaration,
the named arrays are: (*p0).x0, (*p0).x0[k1].p1->p2[k2].x1, and
(*p0).x0[k1].p1->p2[k2].x1[k3].x2.
12 base expression The base array of a given array section or array element, if it exists; otherwise, the
13 base pointer of the array section or array element.
COMMENT: For the array section
(*p0).x0[k1].p1->p2[k2].x1[k3].x2[4][0:n], where identifiers $p_i$ have a
pointer type declaration and identifiers $x_i$ have an array type declaration,
the base expression is: (*p0).x0[k1].p1->p2[k2].x1[k3].x2.
18 More examples for C/C++:
19 • The base expression for x[i] and for x[i:n] is x, if x is an array or pointer.
20 • The base expression for x[5][i] and for x[5][i:n] is x, if x is a pointer to
21 an array or x is 2-dimensional array.
22 • The base expression for y[5][i] and for y[5][i:n] is y[5], if y is an array
23 of pointers or y is a pointer to a pointer.
24 Examples for Fortran:
25 • The base expression for x(i) and for x(i:j) is x.
26 base variable For a given data entity that is a variable or array section, a variable denoted by a base
27 language identifier that is either the data entity or is a containing array or containing
28 structure of the data entity.
29 COMMENT:
30 Examples for C/C++:
31 • The data entities x, x[i], x[:n], x[i].y[j] and x[i].y[:n], where x and y
32 have array type declarations, all have the base variable x.
• The lvalue expressions and array sections p[i], p[:n], p[i].y[j] and
p[i].y[:n], where p has a pointer type and p[i].y has an array type, have a
base pointer p but do not have a base variable.



1 Examples for Fortran:
2 • The data objects x, x(i), x(:n), x(i)%y(j) and x(i)%y(:n), where x and y
3 have array type declarations, all have the base variable x.
• The data objects p(i), p(:n), p(i)%y(j) and p(i)%y(:n), where p has a
pointer type and p(i)%y has an array type, have a base pointer p but do
not have a base variable.
7 • For the associated pointer p, p is both its base variable and base pointer.
8 attached pointer A pointer variable in a device data environment to which the effect of a map clause
9 assigns the address of an object, minus some offset, that is created in the device data
10 environment. The pointer is an attached pointer for the remainder of its lifetime in
11 the device data environment.

simply contiguous array section An array section that statically can be determined to have contiguous storage or that,
in Fortran, has the CONTIGUOUS attribute.

14 structure A structure is a variable that contains one or more variables.


15 For C/C++: Implemented using struct types.
16 For C++: Implemented using class types.
17 For Fortran: Implemented using derived types.
18 string literal For C/C++, a string literal.
19 For Fortran, a character literal constant.
20 private variable With respect to a given set of task regions or SIMD lanes that bind to the same
21 parallel region, a variable for which the name provides access to a different
22 block of storage for each task region or SIMD lane.
23 A variable that is part of another variable (as an array element or a structure element)
24 cannot be made private independently of other components. If a variable is
25 privatized, its components are also private.
26 shared variable With respect to a given set of task regions that bind to the same parallel region, a
27 variable for which the name provides access to the same block of storage for each
28 task region.
29 A variable that is part of another variable (as an array element or a structure element)
30 cannot be shared independently of the other components, except for static data
31 members of C++ classes.
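The distinction between private and shared variables can be sketched with a worksharing loop. The function name `scale` and the variable `scratch` are hypothetical. The arrays are shared, so every thread accesses the same storage, while `scratch` is private, so each thread works on its own copy.

```c
/* in, out, and factor are shared: one block of storage for all threads.
 * scratch is private: each thread gets its own uninitialized copy, so
 * the assignments below cannot interfere with each other. */
void scale(const double *in, double *out, int n, double factor)
{
    double scratch;

    #pragma omp parallel for shared(in, out, factor) private(scratch)
    for (int i = 0; i < n; ++i) {
        scratch = in[i] * factor;
        out[i] = scratch; /* distinct elements, so no data race */
    }
}
```

If `scratch` were shared instead, two threads could overwrite each other's intermediate value between the two statements of the loop body.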



1 threadprivate variable A variable that is replicated, one instance per thread, by the OpenMP
2 implementation. Its name then provides access to a different block of storage for each
3 thread.
4 A variable that is part of another variable (as an array element or a structure element)
5 cannot be made threadprivate independently of the other components, except for
6 static data members of C++ classes. If a variable is made threadprivate, its
7 components are also threadprivate.
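A threadprivate variable can be sketched as a per-thread counter. The names `calls` and `count_iterations` are hypothetical. Each thread counts the iterations it executes in its own instance of `calls`, and the per-thread counts are then combined with a reduction.

```c
/* One instance of `calls` exists per thread. */
static int calls;
#pragma omp threadprivate(calls)

int count_iterations(int n)
{
    int total = 0;

    #pragma omp parallel reduction(+: total)
    {
        calls = 0; /* each thread resets its own instance */

        #pragma omp for
        for (int i = 0; i < n; ++i) {
            calls += 1; /* no synchronization needed: storage is per thread */
        }

        total += calls; /* combine the per-thread counts */
    }
    return total;
}
```

The result equals `n` regardless of how many threads execute the region, because every iteration is counted exactly once in some thread's instance.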
8 threadprivate memory The set of threadprivate variables associated with each thread.
9 data environment The variables associated with the execution of a given region.
device data environment The initial data environment associated with a device.
11 device address An address of an object that may be referenced on a target device.
12 device pointer An implementation-defined handle that refers to a device address.
13 mapped variable An original variable in a data environment with a corresponding variable in a device
14 data environment.
15 COMMENT: The original and corresponding variables may share storage.
16 mapping operation An operation that establishes or removes a correspondence between a variable in one
17 data environment and another variable in a device data environment.
18 mapper An operation that defines how variables of given type are to be mapped or updated
19 with respect to a device data environment.
20 user-defined mapper A mapper that is defined by a declare mapper directive.
21 map-type decay The process that determines the final map types of the map operations that result
22 from mapping a variable with a user-defined mapper.
23 mappable type A type that is valid for a mapped variable. If a type is composed from other types
24 (such as the type of an array element or a structure element) and any of the other
25 types are not mappable then the type is not mappable.
26 COMMENT: Pointer types are mappable but the memory block to which
27 the pointer refers is not mapped.
28 For C, the type must be a complete type.
29 For C++, the type must be a complete type.
30 In addition, for class types:
31 • All member functions accessed in any target region must appear in a declare
32 target directive.



1 For Fortran, no restrictions on the type except that for derived types:
2 • All type-bound procedures accessed in any target region must appear in a
3 declare target directive.
4 defined For variables, the property of having a valid value.
5 For C, for the contents of variables, the property of having a valid value.
6 For C++, for the contents of variables of POD (plain old data) type, the property of
7 having a valid value.
8 For variables of non-POD class type, the property of having been constructed but not
9 subsequently destructed.
10 For Fortran, for the contents of variables, the property of having a valid value. For
11 the allocation or association status of variables, the property of having a valid status.
12 COMMENT: Programs that rely upon variables that are not defined are
13 non-conforming programs.
14 class type For C++, variables declared with one of the class, struct, or union keywords.
15 static storage duration For C/C++, the lifetime of an object with static storage duration, as defined by the
16 base language.
17 For Fortran, the lifetime of a variable with a SAVE attribute, implicit or explicit, a
18 common block object or a variable declared in a module.
19 NULL A null pointer. For C, the value NULL. For C++, the value NULL or the value
20 nullptr. For Fortran, the value C_NULL_PTR.
21 non-null value A value that is not NULL.
22 non-null pointer A pointer that is not NULL.

23 1.2.7 Implementation Terminology


supported active levels of parallelism An implementation-defined maximum number of active parallel regions that may
enclose any region of code in the program.
OpenMP API support Support of at least one active level of parallelism.
nested parallelism support Support of more than one active level of parallelism.
internal control variable A conceptual variable that specifies runtime behavior of a set of threads or tasks in
an OpenMP program.



1 COMMENT: The acronym ICV is used interchangeably with the term
2 internal control variable throughout this specification.
OpenMP Additional Definitions document A document that exists outside of the OpenMP specification and defines additional
values that may be used in a conforming program. The OpenMP Additional
Definitions document is available at http://www.openmp.org/.
compliant implementation An implementation of the OpenMP specification that compiles and executes any
conforming program as defined by the specification.
8 COMMENT: A compliant implementation may exhibit unspecified
9 behavior when compiling or executing a non-conforming program.
10 unspecified behavior A behavior or result that is not specified by the OpenMP specification or not known
11 prior to the compilation or execution of an OpenMP program.
12 Such unspecified behavior may result from:
13 • Issues documented by the OpenMP specification as having unspecified behavior.
14 • A non-conforming program.
15 • A conforming program exhibiting an implementation-defined behavior.
16 implementation defined Behavior that must be documented by the implementation, and is allowed to vary
17 among different compliant implementations. An implementation is allowed to define
18 this behavior as unspecified.
19 COMMENT: All features that have implementation-defined behavior are
20 documented in Appendix A.
21 deprecated For a construct, clause, or other feature, the property that it is normative in the
22 current specification but is considered obsolescent and will be removed in the future.
23 Deprecated features may not be fully specified. In general, a deprecated feature was
24 fully specified in the version of the specification immediately prior to the one in
25 which it is deprecated. In most cases, a new feature replaces the deprecated feature.
26 Unless otherwise specified, whether any modifications provided by the replacement
27 feature apply to the deprecated feature is implementation defined.



1 1.2.8 Tool Terminology

2 tool Code that can observe and/or modify the execution of an application.
3 first-party tool A tool that executes in the address space of the program that it is monitoring.
4 third-party tool A tool that executes as a separate process from the process that it is monitoring and
5 potentially controlling.
6 activated tool A first-party tool that successfully completed its initialization.
7 event A point of interest in the execution of a thread.
8 native thread A thread defined by an underlying thread implementation.
9 tool callback A function that a tool provides to an OpenMP implementation to invoke when an
10 associated event occurs.
11 registering a callback Providing a tool callback to an OpenMP implementation.
dispatching a callback at an event Processing a callback when an associated event occurs in a manner consistent with
the return code provided when a first-party tool registered the callback.
14 thread state An enumeration type that describes the current OpenMP activity of a thread. A
15 thread can be in only one state at any time.
16 wait identifier A unique opaque handle associated with each data object (for example, a lock) that
17 the OpenMP runtime uses to enforce mutual exclusion and potentially to cause a
18 thread to wait actively or passively.
19 frame A storage area on a thread’s stack associated with a procedure invocation. A frame
20 includes space for one or more saved registers and often also includes space for saved
21 arguments, local variables, and padding for alignment.
canonical frame address An address associated with a procedure frame on a call stack that was the value of the
stack pointer immediately prior to calling the procedure for which the frame
represents the invocation.
25 runtime entry point A function interface provided by an OpenMP runtime for use by a tool. A runtime
26 entry point is typically not associated with a global function symbol.
27 trace record A data structure in which to store information associated with an occurrence of an
28 event.
29 native trace record A trace record for an OpenMP device that is in a device-specific format.
30 signal A software interrupt delivered to a thread.
31 signal handler A function called asynchronously when a signal is delivered to a thread.



1 async signal safe The guarantee that interruption by signal delivery will not interfere with a set of
2 operations. An async signal safe runtime entry point is safe to call from a signal
3 handler.
4 code block A contiguous region of memory that contains code of an OpenMP program to be
5 executed on a device.
6 OMPT An interface that helps a first-party tool monitor the execution of an OpenMP
7 program.
8 OMPT interface state A state that indicates the permitted interactions between a first-party tool and the
9 OpenMP implementation.
10 OMPT active An OMPT interface state in which the OpenMP implementation is prepared to accept
11 runtime calls from a first-party tool and will dispatch any registered callbacks and in
12 which a first-party tool can invoke runtime entry points if not otherwise restricted.
13 OMPT pending An OMPT interface state in which the OpenMP implementation can only call
14 functions to initialize a first-party tool and in which a first-party tool cannot invoke
15 runtime entry points.
16 OMPT inactive An OMPT interface state in which the OpenMP implementation will not make any
17 callbacks and in which a first-party tool cannot invoke runtime entry points.
18 OMPD An interface that helps a third-party tool inspect the OpenMP state of a program that
19 has begun execution.
20 OMPD library A dynamically loadable library that implements the OMPD interface.
21 image file An executable or shared library.
22 address space A collection of logical, virtual, or physical memory address ranges that contain code,
23 stack, and/or data. Address ranges within an address space need not be contiguous.
24 An address space consists of one or more segments.
25 segment A portion of an address space associated with a set of address ranges.
26 OpenMP architecture The architecture on which an OpenMP region executes.
27 tool architecture The architecture on which an OMPD tool executes.
28 OpenMP process A collection of one or more threads and address spaces. A process may contain
29 threads and address spaces for multiple OpenMP architectures. At least one thread
30 in an OpenMP process is an OpenMP thread. A process may be live or a core file.
31 address space handle A handle that refers to an address space within an OpenMP process.
32 thread handle A handle that refers to an OpenMP thread.
33 parallel handle A handle that refers to an OpenMP parallel region.



1 task handle A handle that refers to an OpenMP task region.
2 descendent handle An output handle that is returned from the OMPD library in a function that accepts
3 an input handle: the output handle is a descendent of the input handle.
4 ancestor handle An input handle that is passed to the OMPD library in a function that returns an
5 output handle: the input handle is an ancestor of the output handle. For a given
6 handle, the ancestors of the handle are also the ancestors of the handle’s descendent.
7 COMMENT: A tool cannot use a handle in an OMPD call if any ancestor
8 of the handle has been released, except for OMPD calls that release it.
9 tool context An opaque reference provided by a tool to an OMPD library. A tool context uniquely
10 identifies an abstraction.
11 address space context A tool context that refers to an address space within a process.
12 thread context A tool context that refers to a native thread.
13 native thread identifier An identifier for a native thread defined by a thread implementation.

14 1.3 Execution Model


15 The OpenMP API uses the fork-join model of parallel execution. Multiple threads of execution
16 perform tasks defined implicitly or explicitly by OpenMP directives. The OpenMP API is intended
17 to support programs that will execute correctly both as parallel programs (multiple threads of
18 execution and a full OpenMP support library) and as sequential programs (directives ignored and a
19 simple OpenMP stubs library). However, a conforming OpenMP program may execute correctly as
20 a parallel program but not as a sequential program, or may produce different results when executed
21 as a parallel program compared to when it is executed as a sequential program. Further, using
22 different numbers of threads may result in different numeric results because of changes in the
23 association of numeric operations. For example, a serial addition reduction may have a different
24 pattern of addition associations than a parallel reduction. These different associations may change
25 the results of floating-point addition.
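The effect of changing the association of floating-point additions can be seen without any parallelism. In this sketch, the hypothetical functions `sum_left` and `sum_right` add the same three values with different groupings, which is what may happen when a reduction is split across a different number of threads.

```c
/* Same operands, different association. */
double sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

double sum_right(double a, double b, double c)
{
    return a + (b + c);
}
```

With the operands 1e30, -1e30, and 1.0, `sum_left` yields 1.0 because the two large values cancel first. `sum_right` yields 0.0 because 1.0 is too small to change -1e30 when the two are added, so the small term is lost before the cancellation.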
An OpenMP program begins as a single thread of execution, called an initial thread. An initial
thread executes sequentially, as if the code encountered is part of an implicit task region, called an
initial task region, that is generated by the implicit parallel region surrounding the whole program.

The thread that executes the implicit parallel region that surrounds the whole program executes on
the host device. An implementation may support other devices besides the host device. If
supported, these devices are available to the host device for offloading code and data. Each device
has its own threads that are distinct from threads that execute on another device. Threads cannot
migrate from one device to another device. Each device is identified by a device number. The
device number for the host device is the value of the total number of non-host devices, while each
non-host device has a unique device number that is greater than or equal to zero and less than the
device number for the host device. Additionally, the constant omp_initial_device can be
used as an alias for the host device and the constant omp_invalid_device can be used to
specify an invalid device number. A conforming device number is either a non-negative integer that
is less than or equal to omp_get_num_devices() or equal to omp_initial_device or
omp_invalid_device.
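The rule for conforming device numbers can be written as a small predicate. This is a sketch only: is_conforming_device_number is a hypothetical helper, and the values that a real program would obtain from omp_get_num_devices(), omp_initial_device and omp_invalid_device are passed in as parameters so that the example does not depend on a particular runtime.

```c
#include <stdbool.h>

/* Returns true if dev is a conforming device number under the rule above.
 * The extra parameters are stand-ins supplied by the caller:
 *   num_devices - what omp_get_num_devices() would return,
 *   initial_dev - the value of omp_initial_device,
 *   invalid_dev - the value of omp_invalid_device. */
bool is_conforming_device_number(int dev, int num_devices,
                                 int initial_dev, int invalid_dev)
{
    /* Non-host devices are 0 .. num_devices-1; the host is num_devices. */
    if (dev >= 0 && dev <= num_devices)
        return true;
    return dev == initial_dev || dev == invalid_dev;
}
```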
When a target construct is encountered, a new target task is generated. The target task region
encloses the target region. The target task is complete after the execution of the target region
is complete.

When a target task executes, the enclosed target region is executed by an initial thread. The
initial thread executes sequentially, as if the target region is part of an initial task region that is
generated by an implicit parallel region. The initial thread may execute on the requested target
device, if it is available and supported. If the target device does not exist or the implementation
does not support it, all target regions associated with that device execute on the host device.

The implementation must ensure that the target region executes as if it were executed in the data
environment of the target device unless an if clause is present and the if clause expression
evaluates to false.
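A minimal sketch of a target region follows. The function name scale_on_device and its offload parameter are illustrative assumptions; on a system without a supported target device the region simply executes on the host, as described above.

```c
/* Multiplies each of the n elements of a by factor inside a target region.
 * The map clause makes a[0:n] available in the device data environment and
 * copies it back when the region ends. When offload is zero, the if clause
 * evaluates to false and the region executes on the host. */
void scale_on_device(double *a, int n, double factor, int offload)
{
    #pragma omp target map(tofrom : a[0:n]) if(offload)
    for (int i = 0; i < n; i++)
        a[i] *= factor;
}
```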
The teams construct creates a league of teams, where each team is an initial team that comprises
an initial thread that executes the teams region. Each initial thread executes sequentially, as if the
code encountered is part of an initial task region that is generated by an implicit parallel region
associated with each team. Whether the initial threads concurrently execute the teams region is
unspecified, and a program that relies on their concurrent execution for the purposes of
synchronization may deadlock.

If a construct creates a data environment, the data environment is created at the time the construct is
encountered. The description of a construct defines whether it creates a data environment.
When any thread encounters a parallel construct, the thread creates a team of itself and zero or
more additional threads and becomes the primary thread of the new team. A set of implicit tasks,
one per thread, is generated. The code for each task is defined by the code inside the parallel
construct. Each task is assigned to a different thread in the team and becomes tied; that is, it is
always executed by the thread to which it is initially assigned. The task region of the task being
executed by the encountering thread is suspended, and each member of the new team executes its
implicit task. An implicit barrier occurs at the end of the parallel region. Only the primary
thread resumes execution beyond the end of the parallel construct, resuming the task region
that was suspended upon encountering the parallel construct. Any number of parallel
constructs can be specified in a single program.
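The one-implicit-task-per-thread rule can be observed with a short C sketch. The function count_implicit_tasks is a hypothetical name; the guard on _OPENMP is an assumption that lets the same source also build as a sequential program, in which case the team has a single member.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Runs one parallel region and returns the number of implicit tasks that
 * executed it. *team_size receives the team size reported by the runtime
 * (1 when the code is compiled without OpenMP support). */
int count_implicit_tasks(int *team_size)
{
    int tasks = 0;
    int size = 1;

    #pragma omp parallel shared(tasks, size)
    {
        /* Every thread of the team runs this block as its implicit task. */
        #pragma omp atomic update
        tasks++;

        /* Exactly one thread records the team size. */
        #pragma omp single
        {
#ifdef _OPENMP
            size = omp_get_num_threads();
#endif
        }
    }   /* implicit barrier: all implicit tasks are complete here */

    *team_size = size;
    return tasks;
}
```

After the region only the encountering (primary) thread continues, so the two values can be compared without further synchronization.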
parallel regions may be arbitrarily nested inside each other. If nested parallelism is disabled, or
is not supported by the OpenMP implementation, then the new team that is created by a thread that
encounters a parallel construct inside a parallel region will consist only of the
encountering thread. However, if nested parallelism is supported and enabled, then the new team
can consist of more than one thread. A parallel construct may include a proc_bind clause to
specify the places to use for the threads in the team within the parallel region.
When any team encounters a worksharing construct, the work inside the construct is divided among
the members of the team, and executed cooperatively instead of being executed by every thread. An
implicit barrier occurs at the end of any region that corresponds to a worksharing construct for
which the nowait clause is not specified. Redundant execution of code by every thread in the
team resumes after the end of the worksharing construct.
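As a sketch of a worksharing construct, the hypothetical function fill_squares below uses the for construct to divide loop iterations among the threads of a team.

```c
/* Stores i*i in out[i] for 0 <= i < n. The for construct divides the
 * iterations among the threads of the enclosing parallel region, so each
 * element is written exactly once by exactly one thread. */
void fill_squares(long *out, int n)
{
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < n; i++)
            out[i] = (long)i * i;
        /* implicit barrier here, because nowait is not specified */
    }
}
```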
When any thread encounters a task generating construct, one or more explicit tasks are generated.
Execution of explicitly generated tasks is assigned to one of the threads in the current team, subject
to the thread’s availability to execute work. Thus, execution of the new task could be immediate, or
deferred until later according to task scheduling constraints and thread availability. Threads are
allowed to suspend the current task region at a task scheduling point in order to execute a different
task. If the suspended task region is for a tied task, the initially assigned thread later resumes
execution of the suspended task region. If the suspended task region is for an untied task, then any
thread may resume its execution. Completion of all explicit tasks bound to a given parallel region is
guaranteed before the primary thread leaves the implicit barrier at the end of the region.
Completion of a subset of all explicit tasks bound to a given parallel region may be specified
through the use of task synchronization constructs. Completion of all explicit tasks bound to the
implicit parallel region is guaranteed by the time the program exits.
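Explicit tasks and task synchronization can be sketched with a recursive Fibonacci computation. The names fib_task and fib are illustrative, and the approach is chosen for brevity rather than efficiency.

```c
/* Computes the n-th Fibonacci number; each recursive call is generated as
 * an explicit task. The function is also correct when executed serially. */
static long fib_task(int n)
{
    long a = 0, b = 0;
    if (n < 2)
        return n;

    #pragma omp task shared(a)
    a = fib_task(n - 1);
    #pragma omp task shared(b)
    b = fib_task(n - 2);

    /* Task synchronization: wait for both child tasks before using a, b. */
    #pragma omp taskwait
    return a + b;
}

long fib(int n)
{
    long result = 0;
    #pragma omp parallel
    {
        /* One thread generates the root of the task tree; the other
         * threads of the team execute tasks as they become available. */
        #pragma omp single
        result = fib_task(n);
    }
    return result;
}
```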
When any thread encounters a simd construct, the iterations of the loop associated with the
construct may be executed concurrently using the SIMD lanes that are available to the thread.
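A minimal sketch of the simd construct, using a hypothetical dot-product function:

```c
/* Returns the dot product of x and y. The simd construct permits several
 * iterations to be executed at once in SIMD lanes; the reduction clause
 * gives each lane its own partial sum. */
long dot(const int *x, const int *y, int n)
{
    long sum = 0;
    #pragma omp simd reduction(+ : sum)
    for (int i = 0; i < n; i++)
        sum += (long)x[i] * y[i];
    return sum;
}
```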
When a loop construct is encountered, the iterations of the loop associated with the construct are
executed in the context of its encountering threads, as determined according to its binding region. If
the loop region binds to a teams region, the region is encountered by the set of primary threads
that execute the teams region. If the loop region binds to a parallel region, the region is
encountered by the team of threads that execute the parallel region. Otherwise, the region is
encountered by a single thread.

If the loop region binds to a teams region, the encountering threads may continue execution
after the loop region without waiting for all iterations to complete; the iterations are guaranteed to
complete before the end of the teams region. Otherwise, all iterations must complete before the
encountering threads continue execution after the loop region. All threads that encounter the
loop construct may participate in the execution of the iterations. Only one of these threads may
execute any given iteration.
The cancel construct can alter the previously described flow of execution in an OpenMP region.
The effect of the cancel construct depends on its construct-type-clause. If a task encounters a
cancel construct with a taskgroup construct-type-clause, then the task activates cancellation
and continues execution at the end of its task region, which implies completion of that task. Any
other task in that taskgroup that has begun executing completes execution unless it encounters a
cancellation point construct, in which case it continues execution at the end of its task
region, which implies its completion. Other tasks in that taskgroup region that have not begun
execution are aborted, which implies their completion.

For all other construct-type-clause values, if a thread encounters a cancel construct, it activates
cancellation of the innermost enclosing region of the type specified and the thread continues
execution at the end of that region. Threads check if cancellation has been activated for their region
at cancellation points and, if so, also resume execution at the end of the canceled region.

If cancellation has been activated, regardless of construct-type-clause, threads that are waiting
inside a barrier other than an implicit barrier at the end of the canceled region exit the barrier and
resume execution at the end of the canceled region. This action can occur before the other threads
reach that barrier.
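The following sketch shows cancellation of a worksharing loop. The function find_index is a hypothetical example; cancellation only takes effect when it is enabled in the runtime (for instance through the OMP_CANCELLATION environment variable), and otherwise the loop simply runs to completion with the same result.

```c
/* Returns an index i with a[i] == key, or -1 if there is none. Once a match
 * is found, the finding thread activates cancellation of the loop region;
 * other threads notice this at their next cancellation point. */
int find_index(const int *a, int n, int key)
{
    int found = -1;

    #pragma omp parallel shared(found)
    {
        #pragma omp for
        for (int i = 0; i < n; i++) {
            if (a[i] == key) {
                /* Several threads could find a match, so the store is atomic. */
                #pragma omp atomic write
                found = i;
                #pragma omp cancel for
            }
            #pragma omp cancellation point for
        }
    }
    return found;
}
```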
When compile-time error termination is performed, the effect is as if an error directive for which
sev-level is fatal and action-time is compilation is encountered. When runtime error
termination is performed, the effect is as if an error directive for which sev-level is fatal and
action-time is execution is encountered.

Synchronization constructs and library routines are available in the OpenMP API to coordinate
tasks and data access in parallel regions. In addition, library routines and environment
variables are available to control or to query the runtime environment of OpenMP programs.

The OpenMP specification makes no guarantee that input or output to the same file is synchronous
when executed in parallel. In this case, the programmer is responsible for synchronizing input and
output processing with the assistance of OpenMP synchronization constructs or library routines.
For the case where each thread accesses a different file, the programmer does not need to
synchronize access.

All concurrency semantics defined by the base language with respect to threads of execution apply
to OpenMP threads, unless specified otherwise.

1.4 Memory Model

1.4.1 Structure of the OpenMP Memory Model

The OpenMP API provides a relaxed-consistency, shared-memory model. All OpenMP threads
have access to a place to store and to retrieve variables, called the memory. A given storage location
in the memory may be associated with one or more devices, such that only threads on associated
devices have access to it. In addition, each thread is allowed to have its own temporary view of the
memory. The temporary view of memory for each thread is not a required part of the OpenMP
memory model, but can represent any kind of intervening structure, such as machine registers,
cache, or other local storage, between the thread and the memory. The temporary view of memory
allows the thread to cache variables and thereby to avoid going to memory for every reference to a
variable. Each thread also has access to another type of memory that must not be accessed by other
threads, called threadprivate memory.

A directive that accepts data-sharing attribute clauses determines two kinds of access to variables
used in the directive’s associated structured block: shared and private. Each variable referenced in
the structured block has an original variable, which is the variable by the same name that exists in
the program immediately outside the construct. Each reference to a shared variable in the structured
block becomes a reference to the original variable. For each private variable referenced in the
structured block, a new version of the original variable (of the same type and size) is created in
memory for each task or SIMD lane that contains code associated with the directive. Creation of
the new version does not alter the value of the original variable. However, the impact of attempts to
access the original variable from within the region corresponding to the directive is unspecified; see
Section 5.4.3 for additional details. References to a private variable in the structured block refer to
the private version of the original variable for the current task or SIMD lane. The relationship
between the value of the original variable and the initial or final value of the private version
depends on the exact clause that specifies it. Details of this issue, as well as other issues with
privatization, are provided in Chapter 5.
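The difference between shared and private access can be sketched as follows. The function cube_all and its scratch variable tmp are illustrative assumptions: the arrays are shared, while each thread works on its own version of tmp.

```c
/* Stores in[i] cubed in out[i] for 0 <= i < n. The arrays referenced through
 * in and out are shared by all threads. tmp is listed in a private clause,
 * so every thread gets its own uninitialized version of it; without that
 * clause tmp would be shared and the loop would contain a data race. */
void cube_all(const int *in, long *out, int n)
{
    long tmp = 0;   /* original variable */

    #pragma omp parallel for private(tmp)
    for (int i = 0; i < n; i++) {
        tmp = (long)in[i] * in[i];   /* refers to the private version */
        out[i] = tmp * in[i];
    }
}
```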
The minimum size at which a memory update may also read and write back adjacent variables that
are part of another variable (as array elements or structure elements) is implementation defined but
is no larger than the base language requires.

A single access to a variable may be implemented with multiple load or store instructions and, thus,
is not guaranteed to be atomic with respect to other accesses to the same variable. Accesses to
variables smaller than the implementation-defined minimum size or to C or C++ bit-fields may be
implemented by reading, modifying, and rewriting a larger unit of memory, and may thus interfere
with updates of variables or fields in the same unit of memory.

Two memory operations are considered unordered if the order in which they must complete, as seen
by their affected threads, is not specified by the memory consistency guarantees listed in
Section 1.4.6. If multiple threads write to the same memory unit (defined consistently with the
above access considerations) then a data race occurs if the writes are unordered. Similarly, if at
least one thread reads from a memory unit and at least one thread writes to that same memory unit
then a data race occurs if the read and write are unordered. If a data race occurs then the result of
the program is unspecified.
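One common way to avoid such a race on a shared counter is the atomic construct, sketched below with a hypothetical function count_even.

```c
/* Counts how many of the n entries of a are even. All threads update the
 * shared counter, so the update is made atomic; a plain "evens++" would be
 * an unordered read-modify-write from several threads, i.e. a data race. */
int count_even(const int *a, int n)
{
    int evens = 0;

    #pragma omp parallel for shared(evens)
    for (int i = 0; i < n; i++) {
        if (a[i] % 2 == 0) {
            #pragma omp atomic update
            evens++;
        }
    }
    return evens;
}
```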
A private variable in a task region that subsequently generates an inner nested parallel region is
permitted to be made shared for implicit tasks in the inner parallel region. A private variable in
a task region can also be shared by an explicit task region generated during its execution. However,
the programmer must use synchronization that ensures that the lifetime of the variable does not end
before completion of the explicit task region sharing it. Any other access by one task to the private
variables of another task results in unspecified behavior.

A storage location in memory that is associated with a given device has a device address that may
be dereferenced by a thread executing on that device, but it may not be generally accessible from
other devices. A different device may obtain a device pointer that refers to this device address. The
manner in which a program can obtain the referenced device address from a device pointer, outside
of mechanisms specified by OpenMP, is implementation defined.

1.4.2 Device Data Environments

When an OpenMP program begins, an implicit target data region for each device surrounds
the whole program. Each device has a device data environment that is defined by its implicit
target data region. Any declare target directives and directives that accept data-mapping
attribute clauses determine how an original variable in a data environment is mapped to a
corresponding variable in a device data environment.

When an original variable is mapped to a device data environment and a corresponding variable is
not present in the device data environment, a new corresponding variable (of the same type and size
as the original variable) is created in the device data environment. Conversely, the original variable
becomes the new variable’s corresponding variable in the device data environment of the device
that performs a mapping operation.

The corresponding variable in the device data environment may share storage with the original
variable. Writes to the corresponding variable may alter the value of the original variable. The
impact of this possibility on memory consistency is discussed in Section 1.4.6. When a task
executes in the context of a device data environment, references to the original variable refer to the
corresponding variable in the device data environment. If an original variable is not currently
mapped and a corresponding variable does not exist in the device data environment then accesses to
the original variable result in unspecified behavior unless the unified_shared_memory
clause is specified on a requires directive for the compilation unit.

The relationship between the value of the original variable and the initial or final value of the
corresponding variable depends on the map-type. Details of this issue, as well as other issues with
mapping a variable, are provided in Section 5.8.3.

The original variable in a data environment and a corresponding variable in a device data
environment may share storage. Without intervening synchronization, data races can occur.

If a variable has a corresponding variable with which it does not share storage, a write to a storage
location designated by the variable causes the value at the corresponding storage location to
become undefined.
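A mapping that outlives a single target region can be sketched with a target data construct. The function add_two_on_device is a hypothetical example; on a system without a supported device both regions execute on the host and the result is the same.

```c
/* Adds 1 to every element of a twice, using two target regions that share
 * one mapping. The enclosing target data region creates the corresponding
 * variable for a[0:n] once; the inner map clauses then find it already
 * present, so no new corresponding variable is created for them. */
void add_two_on_device(int *a, int n)
{
    #pragma omp target data map(tofrom : a[0:n])
    {
        #pragma omp target map(tofrom : a[0:n])
        for (int i = 0; i < n; i++)
            a[i] += 1;

        #pragma omp target map(tofrom : a[0:n])
        for (int i = 0; i < n; i++)
            a[i] += 1;
    }   /* tofrom: the original variable receives the final values here */
}
```

The host code does not read or write a between the two target regions, which avoids the undefined values described in the last paragraph above.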

1.4.3 Memory Management

The host device, and other devices that an implementation may support, have attached storage
resources where program variables are stored. These resources can have different traits. A memory
space in an OpenMP program represents a set of these storage resources. Memory spaces are
defined according to a set of traits, and a single resource may be exposed as multiple memory
spaces with different traits or may be part of multiple memory spaces. In any device, at least one
memory space is guaranteed to exist.

An OpenMP program can use a memory allocator to allocate memory in which to store variables.
This memory will be allocated from the storage resources of the memory space associated with the
memory allocator. Memory allocators are also used to deallocate previously allocated memory.
When an OpenMP memory allocator is not used to allocate memory, OpenMP does not prescribe
the storage resource for the allocation; the memory for the variables may be allocated in any storage
resource.
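A minimal sketch of allocator use follows, relying on the omp_alloc and omp_free routines with the predefined omp_default_mem_alloc allocator. The function sum_with_allocator and the HAVE_OMP_ALLOC macro are hypothetical, and the version check on _OPENMP is an assumption about how a build might detect allocator support; when the check fails, the sketch falls back to malloc and free.

```c
#include <stdlib.h>
#if defined(_OPENMP) && _OPENMP >= 201811
#include <omp.h>
#define HAVE_OMP_ALLOC 1
#else
#define HAVE_OMP_ALLOC 0
#endif

/* Allocates n doubles, fills them with 0..n-1, and returns their sum, or
 * -1.0 if the allocation fails. When the compiler reports OpenMP 5.0 or
 * later, storage comes from the memory space of the default allocator. */
double sum_with_allocator(int n)
{
    size_t bytes = (size_t)n * sizeof(double);
#if HAVE_OMP_ALLOC
    double *buf = (double *)omp_alloc(bytes, omp_default_mem_alloc);
#else
    double *buf = (double *)malloc(bytes);
#endif
    if (buf == NULL)
        return -1.0;

    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        buf[i] = (double)i;
        sum += buf[i];
    }

    /* Memory is released through the same allocator that provided it. */
#if HAVE_OMP_ALLOC
    omp_free(buf, omp_default_mem_alloc);
#else
    free(buf);
#endif
    return sum;
}
```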

1.4.4 The Flush Operation

The memory model has relaxed consistency because a thread’s temporary view of memory is not
required to be consistent with memory at all times. A value written to a variable can remain in the
thread’s temporary view until it is forced to memory at a later time. Likewise, a read from a
variable may retrieve the value from the thread’s temporary view, unless it is forced to read from
memory. OpenMP flush operations are used to enforce consistency between a thread’s temporary
view of memory and memory, or between multiple threads’ views of memory.

A flush operation has an associated device-set that constrains the threads with which it enforces
memory consistency. Consistency is only guaranteed to be enforced between the view of memory
of its thread and the view of memory of other threads executing on devices in its device-set. Unless
otherwise stated, the device-set of a flush operation only includes the current device.

If a flush operation is a strong flush, it enforces consistency between a thread’s temporary view and
memory. A strong flush operation is applied to a set of variables called the flush-set. A strong flush
restricts how an implementation may reorder memory operations. Implementations must not
reorder the code for a memory operation for a given variable, or the code for a flush operation for
the variable, with respect to a strong flush operation that refers to the same variable.

If a thread has performed a write to its temporary view of a shared variable since its last strong flush
of that variable then, when it executes another strong flush of the variable, the strong flush does not
complete until the value of the variable has been written to the variable in memory. If a thread
performs multiple writes to the same variable between two strong flushes of that variable, the strong
flush ensures that the value of the last write is written to the variable in memory. A strong flush of a
variable executed by a thread also causes its temporary view of the variable to be discarded, so that
if its next memory operation for that variable is a read, then the thread will read from memory and
capture the value in its temporary view. When a thread executes a strong flush, no later memory
operation by that thread for a variable involved in that strong flush is allowed to start until the strong
flush completes. The completion of a strong flush executed by a thread is defined as the point at
which all writes to the flush-set performed by the thread before the strong flush are visible in
memory to all other threads, and at which that thread’s temporary view of the flush-set is discarded.

A strong flush operation provides a guarantee of consistency between a thread’s temporary view
and memory. Therefore, a strong flush can be used to guarantee that a value written to a variable by
one thread may be read by a second thread. To accomplish this, the programmer must ensure that
the second thread has not written to the variable since its last strong flush of the variable, and that
the following sequence of events is completed in this specific order:
1. The value is written to the variable by the first thread;
2. The variable is flushed, with a strong flush, by the first thread;
3. The variable is flushed, with a strong flush, by the second thread; and
4. The value is read from the variable by the second thread.
If a flush operation is a release flush or acquire flush, it can enforce consistency between the views
of memory of two synchronizing threads. A release flush guarantees that any prior operation that
writes or reads a shared variable will appear to be completed before any operation that writes or
reads the same shared variable and follows an acquire flush with which the release flush
synchronizes (see Section 1.4.5 for more details on flush synchronization). A release flush will
propagate the values of all shared variables in its temporary view to memory prior to the thread
performing any subsequent atomic operation that may establish a synchronization. An acquire flush
will discard any value of a shared variable in its temporary view to which the thread has not written
since last performing a release flush, and it will load any value of a shared variable propagated by a
release flush that synchronizes with it into its temporary view so that it may be subsequently read.
Therefore, release and acquire flushes may also be used to guarantee that a value written to a
variable by one thread may be read by a second thread. To accomplish this, the programmer must
ensure that the second thread has not written to the variable since its last acquire flush, and that the
following sequence of events happens in this specific order:
1. The value is written to the variable by the first thread;
2. The first thread performs a release flush;
3. The second thread performs an acquire flush; and
4. The value is read from the variable by the second thread.
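The four steps above can be sketched with atomic constructs that carry release and acquire memory-order clauses. The function pass_value and its variable names are hypothetical; the sketch assumes a compiler that accepts these clauses, and it falls back to plain program order when the team consists of a single thread or the code is built as a sequential program.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Passes the value 42 from thread 0 to thread 1 through the shared variable
 * data, using flag for synchronization. Returns the value that was read. */
int pass_value(void)
{
    int data = 0, flag = 0, received = -1;

    #pragma omp parallel num_threads(2) shared(data, flag, received)
    {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        if (tid == 0) {
            data = 42;                          /* step 1: write the value */
            #pragma omp atomic write release    /* step 2: release flush */
            flag = 1;
        } else if (tid == 1) {
            int seen = 0;
            while (!seen) {
                #pragma omp atomic read acquire /* step 3: acquire flush */
                seen = flag;
            }
            received = data;                    /* step 4: read the value */
        }
    }

    /* A team of one thread never runs the reader branch; program order
     * alone makes the write to data visible in that case. */
    if (received < 0)
        received = data;
    return received;
}
```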

Note – OpenMP synchronization operations, described in Chapter 15 and in Section 18.9, are
recommended for enforcing this order. Synchronization through variables is possible but is not
recommended because the proper timing of flushes is difficult.

The flush properties that define whether a flush operation is a strong flush, a release flush, or an
acquire flush are not mutually disjoint. A flush operation may be a strong flush and a release flush;
it may be a strong flush and an acquire flush; it may be a release flush and an acquire flush; or it
may be all three.

1.4.5 Flush Synchronization and Happens Before

OpenMP supports thread synchronization with the use of release flushes and acquire flushes. For
any such synchronization, a release flush is the source of the synchronization and an acquire flush is
the sink of the synchronization, such that the release flush synchronizes with the acquire flush.

A release flush has one or more associated release sequences that define the set of modifications
that may be used to establish a synchronization. A release sequence starts with an atomic operation
that follows the release flush and modifies a shared variable and additionally includes any
read-modify-write atomic operations that read a value taken from some modification in the release
sequence. The following rules determine the atomic operation that starts an associated release
sequence.
• If a release flush is performed on entry to an atomic operation, that atomic operation starts its
  release sequence.
• If a release flush is performed in an implicit flush region, an atomic operation that is provided
  by the implementation and that modifies an internal synchronization variable starts its release
  sequence.
• If a release flush is performed by an explicit flush region, any atomic operation that modifies a
  shared variable and follows the flush region in its thread’s program order starts an associated
  release sequence.

An acquire flush is associated with one or more prior atomic operations that read a shared variable
and that may be used to establish a synchronization. The following rules determine the associated
atomic operation that may establish a synchronization.
• If an acquire flush is performed on exit from an atomic operation, that atomic operation is its
  associated atomic operation.
• If an acquire flush is performed in an implicit flush region, an atomic operation that is
  provided by the implementation and that reads an internal synchronization variable is its
  associated atomic operation.
• If an acquire flush is performed by an explicit flush region, any atomic operation that reads a
  shared variable and precedes the flush region in its thread’s program order is an associated
  atomic operation.
A release flush synchronizes with an acquire flush if the following conditions are satisfied:
• An atomic operation associated with the acquire flush reads a value written by a modification
  from a release sequence associated with the release flush; and
• The device on which each flush is performed is in both of their respective device-sets.

An operation X simply happens before an operation Y if any of the following conditions are
satisfied:
1. X and Y are performed by the same thread, and X precedes Y in the thread’s program order;
2. X synchronizes with Y according to the flush synchronization conditions explained above or
   according to the base language’s definition of synchronizes with, if such a definition exists; or
3. Another operation, Z, exists such that X simply happens before Z and Z simply happens before Y.

An operation X happens before an operation Y if any of the following conditions are satisfied:
1. X happens before Y according to the base language’s definition of happens before, if such a
   definition exists; or
2. X simply happens before Y.

A variable with an initial value is treated as if the value is stored to the variable by an operation that
happens before all operations that access or modify the variable in the program.

1.4.6 OpenMP Memory Consistency

The following rules guarantee an observable completion order for a given pair of memory
operations in race-free programs, as seen by all affected threads. If both memory operations are
strong flushes, the affected threads are all threads on devices in both of their respective device-sets.
If exactly one of the memory operations is a strong flush, the affected threads are all threads on
devices in its device-set. Otherwise, the affected threads are all threads.
• If two operations performed by different threads are sequentially consistent atomic operations or
  they are strong flushes that flush the same variable, then they must be completed as if in some
  sequential order, seen by all affected threads.
• If two operations performed by the same thread are sequentially consistent atomic operations or
  they access, modify, or, with a strong flush, flush the same variable, then they must be completed
  as if in that thread’s program order, as seen by all affected threads.
• If two operations are performed by different threads and one happens before the other, then they
  must be completed as if in that happens before order, as seen by all affected threads, if:
  – both operations access or modify the same variable;
  – both operations are strong flushes that flush the same variable; or
  – both operations are sequentially consistent atomic operations.
• Any two atomic memory operations from different atomic regions must be completed as if in
  the same order as the strong flushes implied in their respective regions, as seen by all affected
  threads.

The flush operation can be specified using the flush directive, and is also implied at various
locations in an OpenMP program: see Section 15.8.5 for details.

Note – Since flush operations by themselves cannot prevent data races, explicit flush operations
are only useful in combination with non-sequentially consistent atomic directives.

OpenMP programs that:
• Do not use non-sequentially consistent atomic directives;
• Do not rely on the accuracy of a false result from omp_test_lock and
  omp_test_nest_lock; and
• Correctly avoid data races as required in Section 1.4.1,
behave as though operations on shared variables were simply interleaved in an order consistent with
the order in which they are performed by each thread. The relaxed consistency model is invisible
for such programs, and any explicit flush operations in such programs are redundant.

1.5 Tool Interfaces

The OpenMP API includes two tool interfaces, OMPT and OMPD, to enable development of
high-quality, portable tools that support monitoring, performance, or correctness analysis and
debugging of OpenMP programs developed using any implementation of the OpenMP API.

An implementation of the OpenMP API may differ from the abstract execution model described by
its specification. The ability of tools that use the OMPT or OMPD interfaces to observe such
differences does not constrain implementations of the OpenMP API in any way.

12 1.5.1 OMPT
13 The OMPT interface, which is intended for first-party tools, provides the following:
14 • A mechanism to initialize a first-party tool;
15 • Routines that enable a tool to determine the capabilities of an OpenMP implementation;
16 • Routines that enable a tool to examine OpenMP state information associated with a thread;
17 • Mechanisms that enable a tool to map implementation-level calling contexts back to their
18 source-level representations;
19 • A callback interface that enables a tool to receive notification of OpenMP events;
20 • A tracing interface that enables a tool to trace activity on OpenMP target devices; and
21 • A runtime library routine that an application can use to control a tool.
22 OpenMP implementations may differ with respect to the thread states that they support, the mutual
23 exclusion implementations that they employ, and the OpenMP events for which tool callbacks are
24 invoked. For some OpenMP events, OpenMP implementations must guarantee that a registered
25 callback will be invoked for each occurrence of the event. For other OpenMP events, OpenMP
26 implementations are permitted to invoke a registered callback for some or no occurrences of the
27 event; for such OpenMP events, however, OpenMP implementations are encouraged to invoke tool
28 callbacks on as many occurrences of the event as is practical. Section 19.2.4 specifies the subset of
29 OMPT callbacks that an OpenMP implementation must support for a minimal implementation of
30 the OMPT interface.
31 With the exception of the omp_control_tool runtime library routine for tool control, all other
32 routines in the OMPT interface are intended for use only by tools and are not visible to



1 applications. For that reason, a Fortran binding is provided only for omp_control_tool; all
2 other OMPT functionality is described with C syntax only.

3 1.5.2 OMPD
4 The OMPD interface is intended for third-party tools, which run as separate processes. An
5 OpenMP implementation must provide an OMPD library that can be dynamically loaded and used
6 by a third-party tool. A third-party tool, such as a debugger, uses the OMPD library to access
7 OpenMP state of a program that has begun execution. OMPD defines the following:
8 • An interface that an OMPD library exports, which a tool can use to access OpenMP state of a
9 program that has begun execution;
10 • A callback interface that a tool provides to the OMPD library so that the library can use it to
11 access the OpenMP state of a program that has begun execution; and
12 • A small number of symbols that must be defined by an OpenMP implementation to help the tool
13 find the correct OMPD library to use for that OpenMP implementation and to facilitate
14 notification of events.
15 Chapter 20 describes OMPD in detail.

16 1.6 OpenMP Compliance


17 The OpenMP API defines constructs that operate in the context of the base language that is
18 supported by an implementation. If the implementation of the base language does not support a
19 language construct that appears in this document, a compliant OpenMP implementation is not
20 required to support it, with the exception that for Fortran, the implementation must allow case
21 insensitivity for directive and API routine names, and must allow identifiers of more than six
22 characters. An implementation of the OpenMP API is compliant if and only if it compiles and
23 executes all other conforming programs, and supports the tool interfaces, according to the syntax
24 and semantics laid out in Chapters 1 through 20. Appendices A and B as well as sections designated
25 as Notes (see Section 1.8) are for information purposes only and are not part of the specification.
26 All library, intrinsic and built-in routines provided by the base language must be thread-safe in a
27 compliant implementation. In addition, the implementation of the base language must also be
28 thread-safe. For example, ALLOCATE and DEALLOCATE statements must be thread-safe in
29 Fortran. Unsynchronized concurrent use of such routines by different threads must produce correct
30 results (although not necessarily the same as serial execution results, as in the case of random
31 number generation routines).
32 Starting with Fortran 90, variables with explicit initialization have the SAVE attribute implicitly.
33 This is not the case in Fortran 77. However, a compliant OpenMP Fortran implementation must
34 give such a variable the SAVE attribute, regardless of the underlying base language version.



1 Appendix A lists certain aspects of the OpenMP API that are implementation defined. A compliant
2 implementation must define and document its behavior for each of the items in Appendix A.

3 1.7 Normative References


4 • ISO/IEC 9899:1990, Information Technology - Programming Languages - C.
5 This OpenMP API specification refers to ISO/IEC 9899:1990 as C90.
6 • ISO/IEC 9899:1999, Information Technology - Programming Languages - C.
7 This OpenMP API specification refers to ISO/IEC 9899:1999 as C99.
8 • ISO/IEC 9899:2011, Information Technology - Programming Languages - C.
9 This OpenMP API specification refers to ISO/IEC 9899:2011 as C11.
10 • ISO/IEC 9899:2018, Information Technology - Programming Languages - C.
11 This OpenMP API specification refers to ISO/IEC 9899:2018 as C18.
12 • ISO/IEC 14882:1998, Information Technology - Programming Languages - C++.
13 This OpenMP API specification refers to ISO/IEC 14882:1998 as C++98.
14 • ISO/IEC 14882:2011, Information Technology - Programming Languages - C++.
15 This OpenMP API specification refers to ISO/IEC 14882:2011 as C++11.
16 • ISO/IEC 14882:2014, Information Technology - Programming Languages - C++.
17 This OpenMP API specification refers to ISO/IEC 14882:2014 as C++14.
18 • ISO/IEC 14882:2017, Information Technology - Programming Languages - C++.
19 This OpenMP API specification refers to ISO/IEC 14882:2017 as C++17.
20 • ISO/IEC 14882:2020, Information Technology - Programming Languages - C++.
21 This OpenMP API specification refers to ISO/IEC 14882:2020 as C++20.
22 • ISO/IEC 1539:1980, Information Technology - Programming Languages - Fortran.
23 This OpenMP API specification refers to ISO/IEC 1539:1980 as Fortran 77.
24 • ISO/IEC 1539:1991, Information Technology - Programming Languages - Fortran.
25 This OpenMP API specification refers to ISO/IEC 1539:1991 as Fortran 90.
26 • ISO/IEC 1539-1:1997, Information Technology - Programming Languages - Fortran.
27 This OpenMP API specification refers to ISO/IEC 1539-1:1997 as Fortran 95.



1 • ISO/IEC 1539-1:2004, Information Technology - Programming Languages - Fortran.
2 This OpenMP API specification refers to ISO/IEC 1539-1:2004 as Fortran 2003.
3 • ISO/IEC 1539-1:2010, Information Technology - Programming Languages - Fortran.
4 This OpenMP API specification refers to ISO/IEC 1539-1:2010 as Fortran 2008.
5 • ISO/IEC 1539-1:2018, Information Technology - Programming Languages - Fortran.
6 This OpenMP API specification refers to ISO/IEC 1539-1:2018 as Fortran 2018. While future
7 versions of the OpenMP specification are expected to address the following features, currently
8 their use may result in unspecified behavior.
9 – Declared type of a polymorphic allocatable component in structure constructor
10 – SELECT RANK construct
11 – Assumed-rank dummy argument
12 – Assumed-type dummy argument
13 – Interoperable procedure enhancements
14 Where this OpenMP API specification refers to C, C++ or Fortran, reference is made to the base
15 language supported by the implementation.

16 1.8 Organization of this Document


17 The remainder of this document is structured as normative chapters that define the directives,
18 including their syntax and semantics, the runtime routines and the tool interfaces that comprise the
19 OpenMP API. The document also includes appendices that facilitate maintaining a compliant
20 implementation of the API.
21 Some sections of this document only apply to programs written in a certain base language. Text that
22 applies only to programs for which the base language is C or C++ is shown as follows:
C / C++
23 C/C++ specific text...
C / C++
24 Text that applies only to programs for which the base language is C only is shown as follows:
C
25 C specific text...
C
26 Text that applies only to programs for which the base language is C++ only is shown as follows:



C++
1 C++ specific text...
C++
2 Text that applies only to programs for which the base language is Fortran is shown as follows:
Fortran
3 Fortran specific text...
Fortran
4 Where an entire page consists of base language specific text, a marker is shown at the top of the
5 page. For Fortran-specific text, the marker is:

Fortran (cont.)

6 For C/C++-specific text, the marker is:

C/C++ (cont.)

7 Some text is for information only, and is not part of the normative specification. Such text is
8 designated as a note or comment, like this:
Note – Non-normative text...

COMMENT: Non-normative text...



1 2 Internal Control Variables
2 An OpenMP implementation must act as if internal control variables (ICVs) control the behavior of
3 an OpenMP program. These ICVs store information such as the number of threads to use for future
4 parallel regions. One copy of each ICV exists per instance of its scope. Possible ICV scopes
5 are: global; device; implicit task; and data environment. If an ICV has global scope then one copy
6 exists for the whole program. The ICVs are given values at various times (described below) during
7 the execution of the program. They are initialized by the implementation itself and may be given
8 values through OpenMP environment variables and through calls to OpenMP API routines. The
9 program can retrieve the values of these ICVs only through OpenMP API routines.
10 For purposes of exposition, this document refers to the ICVs by certain names, but an
11 implementation is not required to use these names or to offer any way to access the variables other
12 than through the ways shown in Section 2.2.

13 2.1 ICV Descriptions


14 Table 2.1 shows the scope and description of each ICV.

TABLE 2.1: ICV Scopes and Descriptions

ICV Scope Description

active-levels-var data environment Number of nested active parallel regions
such that all parallel regions are enclosed by
the outermost initial task region on the device
affinity-format-var device Controls the thread affinity format when displaying
thread affinity
bind-var data environment Controls the binding of OpenMP threads to
places; when binding is requested, indicates that
the execution environment is advised not to move
threads between places; can also provide default
thread affinity policies
cancel-var global Controls the desired behavior of the cancel
construct and cancellation points

debug-var global Controls whether an OpenMP implementation
will collect information that an OMPD library
can access to satisfy requests from a tool
def-allocator-var implicit task Controls the memory allocator used by memory
allocation routines, directives and clauses that do
not specify one explicitly
default-device-var data environment Controls the default target device
display-affinity-var global Controls the display of thread affinity
dyn-var data environment Enables dynamic adjustment of the number of
threads used for encountered parallel regions
explicit-task-var data environment Whether a given task is an explicit task
final-task-var data environment Whether a given task is a final task
levels-var data environment Number of nested parallel regions such that
all parallel regions are enclosed by the outermost
initial task region on the device
max-active-levels-var data environment Controls the maximum number of nested active
parallel regions when the innermost
parallel region is generated by a given task
max-task-priority-var global Controls the maximum value that can be specified
in the priority clause
nteams-var device Controls the number of teams requested for encountered
teams regions
nthreads-var data environment Controls the number of threads requested for
encountered parallel regions
num-procs-var device The number of processors available on the device
place-partition-var implicit task Controls the place partition available for encountered
parallel regions
run-sched-var data environment Controls the schedule used for worksharing-loop
regions that specify the runtime schedule kind
stacksize-var device Controls the stack size for threads that the
OpenMP implementation creates
target-offload-var global Controls the offloading behavior
team-size-var data environment Size of the current team
teams-thread-limit-var device Controls the maximum number of threads in each
contention group that a teams construct creates
thread-limit-var data environment Controls the maximum number of threads that
participate in the contention group

thread-num-var data environment Thread number of an implicit task within its
binding team
tool-libraries-var global List of absolute paths to tool libraries
tool-var global Indicates that a tool will be registered
tool-verbose-init-var global Controls whether an OpenMP implementation
will verbosely log the registration of a tool
wait-policy-var device Controls the desired behavior of waiting threads

1 2.2 ICV Initialization


2 Table 2.2 shows the ICVs, associated environment variables, and initial values.

TABLE 2.2: ICV Initial Values

ICV Environment Variable Initial Value


active-levels-var (none) Zero
affinity-format-var OMP_AFFINITY_FORMAT Implementation defined
bind-var OMP_PROC_BIND Implementation defined
cancel-var OMP_CANCELLATION False
debug-var OMP_DEBUG disabled
def-allocator-var OMP_ALLOCATOR Implementation defined
default-device-var OMP_DEFAULT_DEVICE See below
display-affinity-var OMP_DISPLAY_AFFINITY False
dyn-var OMP_DYNAMIC Implementation defined
explicit-task-var (none) False
final-task-var (none) False
levels-var (none) Zero
max-active-levels-var OMP_MAX_ACTIVE_LEVELS, OMP_NESTED, OMP_NUM_THREADS, OMP_PROC_BIND Implementation defined
max-task-priority-var OMP_MAX_TASK_PRIORITY Zero
nteams-var OMP_NUM_TEAMS Zero
nthreads-var OMP_NUM_THREADS Implementation defined
num-procs-var (none) Implementation defined
place-partition-var OMP_PLACES Implementation defined
run-sched-var OMP_SCHEDULE Implementation defined
stacksize-var OMP_STACKSIZE Implementation defined

target-offload-var OMP_TARGET_OFFLOAD default
team-size-var (none) One
teams-thread-limit-var OMP_TEAMS_THREAD_LIMIT Zero
thread-limit-var OMP_THREAD_LIMIT Implementation defined
thread-num-var (none) Zero
tool-libraries-var OMP_TOOL_LIBRARIES empty string
tool-var OMP_TOOL enabled
tool-verbose-init-var OMP_TOOL_VERBOSE_INIT disabled
wait-policy-var OMP_WAIT_POLICY Implementation defined
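
As an informal illustration of the initial values in Table 2.2, the following C sketch reads four ICVs from the sequential part of a program through the routines listed in Table 2.3. The names icv_snapshot and initial_icv_snapshot are hypothetical, and the fallback definitions used when _OPENMP is not defined are an assumption of this sketch rather than part of the OpenMP API.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Assumed fallbacks for a non-OpenMP compilation. They mirror the
   initial values in Table 2.2 and are not part of the OpenMP API. */
static int omp_get_level(void)        { return 0; }
static int omp_get_active_level(void) { return 0; }
static int omp_get_num_threads(void)  { return 1; }
static int omp_get_thread_num(void)   { return 0; }
#endif

/* Hypothetical record of four ICVs as seen by the calling task. */
struct icv_snapshot {
    int levels;        /* levels-var        */
    int active_levels; /* active-levels-var */
    int team_size;     /* team-size-var     */
    int thread_num;    /* thread-num-var    */
};

/* Intended to be called outside of any parallel region. */
struct icv_snapshot initial_icv_snapshot(void)
{
    struct icv_snapshot s;
    s.levels        = omp_get_level();
    s.active_levels = omp_get_active_level();
    s.team_size     = omp_get_num_threads();
    s.thread_num    = omp_get_thread_num();
    return s;
}
```

When called from the initial thread before any parallel region is encountered, the snapshot is expected to hold zero, zero, one and zero, matching the table.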

1 If an ICV has an associated environment variable and that ICV does not have global scope then the
2 ICV has a set of associated device-specific environment variables that extend the associated
3 environment variable with the following syntax:
4 <ENVIRONMENT VARIABLE>_DEV[_<device>]
5 where <ENVIRONMENT VARIABLE> is the associated environment variable and <device> is the
6 device number as specified in the device clause (see Section 13.2).

7 Semantics
8 • The initial value of dyn-var is implementation defined if the implementation supports dynamic
9 adjustment of the number of threads; otherwise, the initial value is false.
10 • If target-offload-var is mandatory and the number of non-host devices is zero then the
11 default-device-var is initialized to omp_invalid_device. Otherwise, the initial value is an
12 implementation-defined non-negative integer that is less than or, if target-offload-var is not
13 mandatory, equal to omp_get_initial_device().
14 • The value of the nthreads-var ICV is a list.
15 • The value of the bind-var ICV is a list.
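
The rule for the initial value of default-device-var can be restated as a predicate. The sketch below is only a model of that rule: the function name, its parameters and the SKETCH_INVALID_DEVICE constant are hypothetical (the actual value of omp_invalid_device is implementation defined), and initial_device stands for the value that omp_get_initial_device() would return.

```c
#include <stdbool.h>

/* Hypothetical stand-in for omp_invalid_device. */
enum { SKETCH_INVALID_DEVICE = -2 };

/* Returns true if `value` is a permitted initial value of
   default-device-var under the rule stated above. */
bool default_device_initial_ok(int value, bool offload_mandatory,
                               int num_nonhost_devices, int initial_device)
{
    /* Mandatory offload with no non-host device: only the invalid device. */
    if (offload_mandatory && num_nonhost_devices == 0)
        return value == SKETCH_INVALID_DEVICE;

    if (value < 0)
        return false;                 /* must be non-negative */
    if (value < initial_device)
        return true;                  /* strictly less is always allowed */

    /* Equality is allowed only when offload is not mandatory. */
    return !offload_mandatory && value == initial_device;
}
```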
16 The host and non-host device ICVs are initialized before any OpenMP API construct or OpenMP
17 API routine executes. After the initial values are assigned, the values of any OpenMP environment
18 variables that were set by the user are read and the associated ICVs are modified accordingly. If no
19 <device> number is specified on the device-specific environment variable then the value is applied
20 to all non-host devices.
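
The device-specific naming syntax can be sketched as a small string helper. The function device_env_name and its convention of passing a negative device number to mean "no device number" are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes "<base>_DEV" when device is negative
   (the form that applies to all non-host devices) and
   "<base>_DEV_<device>" otherwise. Returns the snprintf result. */
int device_env_name(char *out, size_t cap, const char *base, int device)
{
    assert(out != NULL && base != NULL);
    if (device < 0)
        return snprintf(out, cap, "%s_DEV", base);
    return snprintf(out, cap, "%s_DEV_%d", base, device);
}
```

For example, since nthreads-var does not have global scope, passing "OMP_NUM_THREADS" and device 2 yields OMP_NUM_THREADS_DEV_2, while a negative device yields OMP_NUM_THREADS_DEV.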

21 Cross References
22 • OMP_AFFINITY_FORMAT, see Section 21.2.5
23 • OMP_ALLOCATOR, see Section 21.5.1
24 • OMP_CANCELLATION, see Section 21.2.6



1 • OMP_DEBUG, see Section 21.4.1
2 • OMP_DEFAULT_DEVICE, see Section 21.2.7
3 • OMP_DISPLAY_AFFINITY, see Section 21.2.4
4 • OMP_DYNAMIC, see Section 21.1.1
5 • OMP_MAX_ACTIVE_LEVELS, see Section 21.1.4
6 • OMP_MAX_TASK_PRIORITY, see Section 21.2.9
7 • OMP_NESTED (Deprecated), see Section 21.1.5
8 • OMP_NUM_TEAMS, see Section 21.6.1
9 • OMP_NUM_THREADS, see Section 21.1.2
10 • OMP_PLACES, see Section 21.1.6
11 • OMP_PROC_BIND, see Section 21.1.7
12 • OMP_SCHEDULE, see Section 21.2.1
13 • OMP_STACKSIZE, see Section 21.2.2
14 • OMP_TARGET_OFFLOAD, see Section 21.2.8
15 • OMP_TEAMS_THREAD_LIMIT, see Section 21.6.2
16 • OMP_THREAD_LIMIT, see Section 21.1.3
17 • OMP_TOOL, see Section 21.3.1
18 • OMP_TOOL_LIBRARIES, see Section 21.3.2
19 • OMP_WAIT_POLICY, see Section 21.2.3

20 2.3 Modifying and Retrieving ICV Values


21 Table 2.3 shows methods for modifying and retrieving the ICV values. If (none) is listed for an
22 ICV, the OpenMP API does not support its modification or retrieval. Calls to OpenMP API routines
23 retrieve or modify data environment scoped ICVs in the data environment of their binding tasks.

TABLE 2.3: Ways to Modify and to Retrieve ICV Values

ICV Ways to Modify Value Ways to Retrieve Value


active-levels-var (none) omp_get_active_level
affinity-format-var omp_set_affinity_format omp_get_affinity_format
bind-var (none) omp_get_proc_bind
cancel-var (none) omp_get_cancellation

debug-var (none) (none)
def-allocator-var omp_set_default_allocator omp_get_default_allocator
default-device-var omp_set_default_device omp_get_default_device
display-affinity-var (none) (none)
dyn-var omp_set_dynamic omp_get_dynamic
explicit-task-var (none) omp_in_explicit_task
final-task-var (none) omp_in_final
levels-var (none) omp_get_level
max-active-levels-var omp_set_max_active_levels, omp_set_nested omp_get_max_active_levels
max-task-priority-var (none) omp_get_max_task_priority
nteams-var omp_set_num_teams omp_get_max_teams
nthreads-var omp_set_num_threads omp_get_max_threads
num-procs-var (none) omp_get_num_procs
place-partition-var (none) omp_get_partition_num_places, omp_get_partition_place_nums, omp_get_place_num_procs, omp_get_place_proc_ids
run-sched-var omp_set_schedule omp_get_schedule
stacksize-var (none) (none)
target-offload-var (none) (none)
team-size-var (none) omp_get_num_threads
teams-thread-limit-var omp_set_teams_thread_limit omp_get_teams_thread_limit
thread-limit-var thread_limit omp_get_thread_limit
thread-num-var (none) omp_get_thread_num
tool-libraries-var (none) (none)
tool-var (none) (none)
tool-verbose-init-var (none) (none)
wait-policy-var (none) (none)

1 Semantics
2 • The value of the bind-var ICV is a list. The runtime call omp_get_proc_bind retrieves the
3 value of the first element of this list.
4 • The value of the nthreads-var ICV is a list. The runtime call omp_set_num_threads sets
5 the value of the first element of this list, and omp_get_max_threads retrieves the value of
6 the first element of this list.
7 • Detailed values in the place-partition-var ICV are retrieved using the listed runtime calls.



1 • The thread_limit clause sets the thread-limit-var ICV for the region of the construct on
2 which it appears.
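
The relationship between omp_set_num_threads, omp_get_max_threads and the first element of nthreads-var can be illustrated with a short C sketch. The function set_and_query_nthreads is a hypothetical name, and the fallback definitions for a non-OpenMP compilation are an assumption of the sketch.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Assumed fallbacks for a non-OpenMP compilation: a single variable
   stands in for the first element of nthreads-var. */
static int sketch_nthreads_first = 1;
static void omp_set_num_threads(int n) { sketch_nthreads_first = n; }
static int  omp_get_max_threads(void)  { return sketch_nthreads_first; }
#endif

/* Sets the first element of nthreads-var for the calling task and
   reads it back through the retrieval routine. */
int set_and_query_nthreads(int n)
{
    omp_set_num_threads(n);
    return omp_get_max_threads();
}
```

Calling set_and_query_nthreads(3) is expected to return 3, because both routines operate on the same list element in the data environment of the calling task.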

3 Cross References
4 • omp_get_active_level, see Section 18.2.20
5 • omp_get_affinity_format, see Section 18.3.9
6 • omp_get_cancellation, see Section 18.2.8
7 • omp_get_default_allocator, see Section 18.13.5
8 • omp_get_default_device, see Section 18.7.3
9 • omp_get_dynamic, see Section 18.2.7
10 • omp_get_level, see Section 18.2.17
11 • omp_get_max_active_levels, see Section 18.2.16
12 • omp_get_max_task_priority, see Section 18.5.1
13 • omp_get_max_teams, see Section 18.4.4
14 • omp_get_max_threads, see Section 18.2.3
15 • omp_get_num_procs, see Section 18.7.1
16 • omp_get_num_threads, see Section 18.2.2
17 • omp_get_partition_num_places, see Section 18.3.6
18 • omp_get_partition_place_nums, see Section 18.3.7
19 • omp_get_place_num_procs, see Section 18.3.3
20 • omp_get_place_proc_ids, see Section 18.3.4
21 • omp_get_proc_bind, see Section 18.3.1
22 • omp_get_schedule, see Section 18.2.12
23 • omp_get_supported_active_levels, see Section 18.2.14
24 • omp_get_teams_thread_limit, see Section 18.4.6
25 • omp_get_thread_limit, see Section 18.2.13
26 • omp_get_thread_num, see Section 18.2.4
27 • omp_in_final, see Section 18.5.3
28 • omp_set_affinity_format, see Section 18.3.8
29 • omp_set_default_allocator, see Section 18.13.4
30 • omp_set_default_device, see Section 18.7.2



1 • omp_set_dynamic, see Section 18.2.6
2 • omp_set_max_active_levels, see Section 18.2.15
3 • omp_set_nested (Deprecated), see Section 18.2.9
4 • omp_set_num_teams, see Section 18.4.3
5 • omp_set_num_threads, see Section 18.2.1
6 • omp_set_schedule, see Section 18.2.11
7 • omp_set_teams_thread_limit, see Section 18.4.5
8 • thread_limit clause, see Section 13.3

9 2.4 How the Per-Data Environment ICVs Work


10 When a task construct, a parallel construct or a teams construct is encountered, each
11 generated task inherits the values of the data environment scoped ICVs from each generating task’s
12 ICV values.
13 When a parallel construct is encountered, the value of each ICV with implicit task scope is
14 inherited from the implicit binding task of the generating task unless otherwise specified.
15 When a task construct is encountered, the generated task inherits the value of nthreads-var from
16 the generating task’s nthreads-var value. When a parallel construct is encountered, and the
17 generating task’s nthreads-var list contains a single element, the generated implicit tasks inherit
18 that list as the value of nthreads-var. When a parallel construct is encountered, and the
19 generating task’s nthreads-var list contains multiple elements, the generated implicit tasks inherit
20 the value of nthreads-var as the list obtained by deletion of the first element from the generating
21 task’s nthreads-var value. The bind-var ICV is handled in the same way as the nthreads-var ICV.
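
The list handling described above can be modeled in a few lines of C. This is a sketch of the inheritance rule only, not of any implementation; the function name inherit_on_parallel and the use of a plain int array for the list are assumptions.

```c
#include <stddef.h>

/* Copies into `child` the list that the implicit tasks generated by a
   parallel construct inherit from a generating task whose list is
   `parent` (length n, n >= 1). Returns the length of the child list.
   `child` must have room for n elements. */
size_t inherit_on_parallel(const int *parent, size_t n, int *child)
{
    /* The first element is deleted only when more elements follow it. */
    size_t skip = (n > 1) ? 1 : 0;
    for (size_t i = skip; i < n; i++)
        child[i - skip] = parent[i];
    return n - skip;
}
```

With a generating task whose nthreads-var list is 4, 3, 2, the implicit tasks of the parallel region inherit 3, 2; a single-element list such as 8 is inherited unchanged.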
22 When a target task executes an active target region, the generated initial task uses the values of
23 the data environment scoped ICVs from the device data environment ICV values of the device that
24 will execute the region.
25 When a target task executes an inactive target region, the generated initial task uses the values
26 of the data environment scoped ICVs from the data environment of the task that encountered the
27 target construct.
28 If a target construct with a thread_limit clause is encountered, the thread-limit-var ICV
29 from the data environment of the generated initial task is instead set to an implementation defined
30 value between one and the value specified in the clause.
31 If a target construct with no thread_limit clause is encountered, the thread-limit-var ICV
32 from the data environment of the generated initial task is set to an implementation defined value
33 that is greater than zero.



1 If a teams construct with a thread_limit clause is encountered, the thread-limit-var ICV
2 from the data environment of the initial task for each team is instead set to an implementation
3 defined value between one and the value specified in the clause.
4 If a teams construct with no thread_limit clause is encountered, the thread-limit-var ICV
5 from the data environment of the initial task of each team is set to an implementation defined value
6 that is greater than zero and does not exceed teams-thread-limit-var, if teams-thread-limit-var is
7 greater than zero.
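
The two teams cases can be summarized as a check on the value that an implementation chooses for thread-limit-var. The predicate below is a hedged model: its name and parameters are hypothetical, and it assumes that "between one and the value specified in the clause" includes both endpoints.

```c
#include <stdbool.h>

/* Returns true if `chosen` is an acceptable thread-limit-var value for
   the initial task of a team, under the assumptions stated above. */
bool teams_thread_limit_ok(int chosen, bool has_clause, int clause_value,
                           int teams_thread_limit_var)
{
    if (chosen < 1)
        return false;                           /* must exceed zero */
    if (has_clause)
        return chosen <= clause_value;          /* bounded by the clause */
    if (teams_thread_limit_var > 0)
        return chosen <= teams_thread_limit_var; /* bounded by the ICV */
    return true;                                /* no upper bound stated */
}
```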
8 When encountering a worksharing-loop region for which the runtime schedule kind is specified,
9 all implicit task regions that constitute the binding parallel region must have the same value for
10 run-sched-var in their data environments. Otherwise, the behavior is unspecified.
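
A minimal C sketch of a worksharing-loop that specifies the runtime schedule kind follows. The function name sum_runtime_schedule is hypothetical. The sketch does not modify run-sched-var inside the parallel region, so every implicit task inherits the same value from the generating task, which is the situation that the requirement above calls for.

```c
/* Sums the integers 0 .. n-1. The schedule is taken from run-sched-var
   at run time; the result does not depend on which schedule is used. */
long sum_runtime_schedule(int n)
{
    long sum = 0;
    #pragma omp parallel for schedule(runtime) reduction(+ : sum)
    for (int i = 0; i < n; i++)
        sum += i;
    return sum;
}
```

If the code is compiled without OpenMP support, the directive is ignored and the loop runs sequentially with the same result.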

11 2.5 ICV Override Relationships


12 Table 2.4 shows the override relationships among construct clauses and ICVs. The table only lists
13 ICVs that can be overridden by a clause.

TABLE 2.4: ICV Override Relationships

ICV construct clause, if used


bind-var proc_bind
def-allocator-var allocate, allocator
nteams-var num_teams
nthreads-var num_threads
run-sched-var schedule
teams-thread-limit-var thread_limit

14 Semantics
15 • The num_threads clause overrides the value of the first element of the nthreads-var ICV.
16 • If a schedule clause specifies a modifier then that modifier overrides any modifier that is
17 specified in the run-sched-var ICV.
18 • If bind-var is not set to false then the proc_bind clause overrides the value of the first element
19 of the bind-var ICV; otherwise, the proc_bind clause has no effect.
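
As a small illustration of a clause overriding an ICV, the following C sketch requests a team of at most two threads with the num_threads clause, regardless of the first element of nthreads-var. The function name team_size_with_clause is hypothetical; the reduction is used here only to count the threads that participate.

```c
/* Returns the number of threads that executed the parallel region. */
int team_size_with_clause(void)
{
    int members = 0;
    #pragma omp parallel num_threads(2) reduction(+ : members)
    members += 1;   /* each thread in the team contributes one */
    return members;
}
```

The returned value is one or two: the clause sets the number of threads requested, and the implementation may still provide fewer.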

20 Cross References
21 • allocate clause, see Section 6.6
22 • allocator clause, see Section 6.4
23 • num_teams clause, see Section 10.2.1



1 • num_threads clause, see Section 10.1.2
2 • proc_bind clause, see Section 10.1.4
3 • schedule clause, see Section 11.5.3
4 • thread_limit clause, see Section 13.3



1 3 Directive and Construct Syntax
2 This chapter describes the syntax of OpenMP directives, clauses and any related base language
3 code. OpenMP directives are specified with various base-language mechanisms that allow
4 compilers to ignore OpenMP directives and conditionally compiled code if support of the OpenMP
5 API is not provided or enabled. A compliant implementation must provide an option or interface
6 that ensures that underlying support of all OpenMP directives and OpenMP conditional
7 compilation mechanisms is enabled. In the remainder of this document, the phrase OpenMP
8 compilation is used to mean a compilation with these OpenMP features enabled.

9 Restrictions
10 The following restrictions apply to OpenMP directives:
11 • Unless otherwise specified, a program must not depend on any ordering of the evaluations of the
12 expressions that appear in the clauses specified on a directive.
13 • Unless otherwise specified, a program must not depend on any side effects of the evaluations of
14 the expressions that appear in the clauses specified on a directive.
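
One way to stay within these two restrictions is to evaluate any expression with side effects before the directive and to pass only the resulting value to the clause. The sketch below shows that pattern; take_ticket and team_size_hoisted are hypothetical names introduced for illustration.

```c
/* Hypothetical helper with a side effect: returns the current value of
   *counter and then increments it. */
static int take_ticket(int *counter)
{
    return (*counter)++;
}

int team_size_hoisted(int *counter)
{
    /* Evaluated exactly once, before the directive, so the program does
       not rely on how the clause expression itself is evaluated. */
    int requested = take_ticket(counter) + 1;

    int members = 0;
    #pragma omp parallel num_threads(requested) reduction(+ : members)
    members += 1;
    return members;
}
```

Writing num_threads(take_ticket(counter) + 1) directly would instead make the final value of the counter depend on the evaluation of the clause expression, which is what the restrictions warn against.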
15 Restrictions on explicit OpenMP regions (that arise from executable directives) are as follows:
C++
16 • A throw executed inside a region that arises from a thread-limiting directive must cause
17 execution to resume within the same region, and the same thread that threw the exception must
18 catch it. If the directive is also exception-aborting then whether the exception is caught or the
19 throw results in runtime error termination is implementation defined.
C++
Fortran
20 • A directive may not appear in a pure procedure unless it is pure.
21 • A directive may not appear in a WHERE, FORALL or DO CONCURRENT construct.
22 • If more than one image is executing the program, any image control statement, ERROR STOP
23 statement, FAIL IMAGE statement, collective subroutine call or access to a coindexed object that
24 appears in an explicit OpenMP region will result in unspecified behavior.
Fortran



1 3.1 Directive Format
2 This section defines several categories of directives and constructs. OpenMP directives are
3 specified with a directive-specification. A directive-specification consists of the directive-specifier
4 and any clauses that may optionally be associated with the OpenMP directive:
5 directive-specifier [[,] clause[ [,] clause] ... ]

6 The directive-specifier is:


7 directive-name

8 or for argument-modified directives:


9 directive-name[(directive-arguments)]

C / C++
10 White space in a directive-name is not optional.
C / C++
11 Some OpenMP directives specify a paired end directive, where the directive-name of the paired
12 end directive is:
13 • If directive-name starts with begin, the end-directive-name replaces begin with end;
14 • Otherwise, it is end directive-name unless otherwise specified.
15 The directive-specification of a paired end directive may include one or more optional end-clauses:
16 directive-specifier [[,] end-clause[ [,] end-clause]...]

17 where end-clause has the end-clause property, which explicitly allows it on a paired end directive.
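
The default naming rule for paired end directives can be sketched as a string transformation. The helper end_directive_name is hypothetical, and it models only the default rule; directives for which the name is "otherwise specified" are not handled.

```c
#include <stdio.h>
#include <string.h>

/* Writes the default paired end directive name for `name` into `out`
   and returns the snprintf result. */
int end_directive_name(char *out, size_t cap, const char *name)
{
    /* "begin" as a whole leading word is replaced with "end". */
    if (strncmp(name, "begin", 5) == 0 &&
        (name[5] == ' ' || name[5] == '\0'))
        return snprintf(out, cap, "end%s", name + 5);

    /* Otherwise "end" is placed in front of the directive name. */
    return snprintf(out, cap, "end %s", name);
}
```

Under this sketch, "begin declare target" maps to "end declare target" and "parallel" maps to "end parallel".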
C / C++
18 An OpenMP directive may be specified as a pragma directive:
19 #pragma omp directive-specification new-line

20 or a pragma operator:
21 _Pragma("omp directive-specification")

22 The use of omp as the first preprocessing token of a pragma directive is reserved for OpenMP
23 directives that are defined in this specification. The use of ompx as the first preprocessing token of
24 a pragma directive is reserved for implementation-defined extensions to the OpenMP directives.



Note – In this directive, directive-name is depobj, directive-arguments is o, directive-specifier is
depobj(o) and directive-specification is depobj(o) depend(inout: d):
#pragma omp depobj(o) depend(inout: d)

6 White space can be used before and after the #. Preprocessing tokens in directive-specification of
7 #pragma and _Pragma pragmas are subject to macro expansion.
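
Because the pragma operator form is subject to macro expansion, a directive can be produced by a macro. The following C sketch uses a common stringification idiom; the macro names DO_PRAGMA and PARALLEL_SUM and the function sum_squares are hypothetical and are not defined by the OpenMP API.

```c
/* Turns its argument into the string literal that _Pragma requires. */
#define DO_PRAGMA(x) _Pragma(#x)

/* Expands to: _Pragma("omp parallel for reduction(+ : var)") with the
   macro argument substituted for var. */
#define PARALLEL_SUM(var) DO_PRAGMA(omp parallel for reduction(+ : var))

/* Returns 1*1 + 2*2 + ... + n*n. */
long sum_squares(int n)
{
    long total = 0;
    PARALLEL_SUM(total)
    for (int i = 1; i <= n; i++)
        total += (long)i * i;
    return total;
}
```

The #pragma form cannot be generated this way, since a preprocessing directive cannot be the result of a macro expansion; that is the main reason to reach for _Pragma.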
C / C++
C++
8 In C++11 and higher, an OpenMP directive may be specified as a C++ attribute specifier:
9 [[ omp :: directive-attr ]]

10 or
11 [[ using omp : directive-attr ]]

12 where directive-attr is
13 directive( directive-specification )

14 or
15 sequence( [omp::]directive-attr [[, [omp::]directive-attr] ... ] )

16 Multiple attributes on the same statement are allowed. Attribute directives that apply to the same
17 statement are unordered unless the sequence attribute is specified, in which case the right-to-left
18 ordering applies. The omp:: namespace qualifier within a sequence attribute is optional. The
19 application of multiple attributes in a sequence attribute is ordered as if each directive had been
20 specified as a pragma directive on subsequent lines.
Note – This example shows the expected transformation:

[[ omp::sequence(directive(parallel), directive(for)) ]]
for(...) {}
// becomes
#pragma omp parallel
#pragma omp for
for(...) {}


30 The use of omp as the attribute namespace of an attribute specifier, or as the optional namespace
31 qualifier within a sequence attribute, is reserved for OpenMP directives that are defined in this
32 specification. The use of ompx as the attribute namespace of an attribute specifier, or as the



1 optional namespace qualifier within a sequence attribute, is reserved for implementation-defined
2 extensions to the OpenMP directives.
3 The pragma and attribute forms are interchangeable for any OpenMP directive. Some OpenMP
4 directives may be composed of consecutive attribute specifiers if specified in their syntax. Any two
5 consecutive attribute specifiers may be reordered or expressed as a single attribute specifier, as
6 permitted by the base language, without changing the behavior of the OpenMP directive.
C++
C / C++
7 Directives are case-sensitive. Each expression used in the OpenMP syntax inside of a clause must
8 be a valid assignment-expression of the base language unless otherwise specified.
C / C++
C++
9 Directives may not appear in constexpr functions or in constant expressions.
C++
Fortran
10 An OpenMP directive for Fortran is specified with a stylized comment as follows:
11 sentinel directive-specification

12 All OpenMP compiler directives must begin with a directive sentinel. The format of a sentinel
13 differs between fixed form and free form source files, as described in Section 3.1.1 and
14 Section 3.1.2. In order to simplify the presentation, free form is used for the syntax of OpenMP
15 directives for Fortran throughout this document, except as noted.
16 Directives are case insensitive. Directives cannot be embedded within continued statements, and
17 statements cannot be embedded within directives. Each expression used in the OpenMP syntax
18 inside of a clause must be a valid expression of the base language unless otherwise specified.
Fortran
19 A directive may be categorized as one of the following:
20 • meta
21 • declarative
22 • executable
23 • informational
24 • utility
25 • subsidiary



1 Base language code can be associated with directives. The directive’s association can be
2 categorized as:
3 • none
4 • block-associated
5 • loop-associated
6 • declaration-associated
7 • delimited
8 • separating
9 A directive and its associated base language code constitute a syntactic formation that follows the
10 syntax given below. The end-directive in a specified formation refers to the paired end directive for
11 the directive. An OpenMP construct is a formation for which the directive is executable.
12 Directives with an association of none are not associated with any base language code. The
13 resulting formation therefore has the following syntax:
14 directive

15 Formations that result from a block-associated directive have the following syntax:
C / C++
16 directive
17 structured-block
C / C++
Fortran
18 directive
19 structured-block
20 [end-directive]

21 If structured-block is a loosely structured block, end-directive is required. If structured-block is a
22 strictly structured block, end-directive is optional. An end-directive that immediately follows a
23 directive and its associated strictly structured block is always paired with that directive.
Fortran
24 Loop-associated directives are block-associated directives for which the associated structured-block
25 is loop-nest, a canonical loop nest.
Fortran
26 For a loop-associated directive, the paired end directive is optional.
Fortran
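As an illustration of the formations above, the following C sketch nests a loop-associated directive inside a block-associated one. The function name sum_array is hypothetical and exists only for this example. The code is written so that it returns the same value when a compiler without OpenMP support ignores the pragmas.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums v[0..n-1]. */
long sum_array(const int *v, size_t n) {
    long total = 0;
    assert(v != NULL || n == 0);

    /* Block-associated formation: directive followed by a structured block. */
    #pragma omp parallel
    {
        /* Loop-associated formation: the structured block is a canonical loop nest. */
        #pragma omp for reduction(+: total)
        for (size_t i = 0; i < n; ++i)
            total += v[i];
    }
    return total;
}
```

Calling sum_array on an array that holds 1, 2, 3 and 4 yields 10 regardless of the number of threads, because the reduction combines the per-thread partial sums.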

C / C++
1 Formations that result from a declaration-associated directive have the following syntax:
2 declaration-associated-specification

3 where declaration-associated-specification is either:


4 directive
5 function-definition-or-declaration

6 or:
7 directive
8 declaration-associated-specification

9 In all cases the directive is associated with the function-definition-or-declaration.


C / C++
Fortran
10 The formation that results from a declaration-associated directive in Fortran has the same syntax as
11 the formation for a directive with an association of none.
12 If a directive appears in the specification part of a module then the behavior is as if that directive
13 appears after any references to that module.
Fortran
14 The formation that results from a delimited directive has the following syntax:
15 directive
16 base-language-code
17 end-directive

18 Separating directives may be used to separate a structured-block into multiple
19 structured-block-sequences.
20 Separating directives and the containing structured block have the following syntax:
21 structured-block-sequence
22 directive
23 structured-block-sequence
24 [directive
25 structured-block-sequence ...]

26 wrapped in a single compound statement for C/C++ or optionally wrapped in a single BLOCK
27 construct for Fortran.
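A minimal C sketch of this formation follows, assuming that the section directive is used as the separating directive inside a sections construct. The function name fill_pair is hypothetical. Each structured-block-sequence is kept to a single statement so that the sketch also works with implementations that predate structured-block sequences.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes 1 to *a and 2 to *b, one store per section. */
void fill_pair(int *a, int *b) {
    assert(a != NULL && b != NULL && a != b);

    #pragma omp parallel sections
    {   /* the single compound statement that wraps the whole formation */
        *a = 1;              /* first structured-block-sequence  */
        #pragma omp section  /* separating directive             */
        *b = 2;              /* second structured-block-sequence */
    }
}
```

The two stores touch different objects, so the result does not depend on which thread executes which section.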

1 Restrictions
2 Restrictions to directive format are as follows:
3 • Orphaned separating directives are prohibited. That is, a separating directive must appear
4 within the structured block associated with the same construct with which it is associated and
5 must not be encountered elsewhere in the region of that associated construct.
6 • A stand-alone directive may be placed only at a point where a base language executable
7 statement is allowed.
Fortran
8 • OpenMP directives, except simd and declarative directives, may not appear in pure procedures.
9 • OpenMP directives may not appear in the WHERE, FORALL or DO CONCURRENT constructs.
Fortran
C++
10 • A directive that uses the attribute syntax cannot be applied to the same statement or associated
11 declaration as a directive that uses the pragma syntax.
12 • For any directive that has a paired end directive, both directives must use either the attribute
13 syntax or the pragma syntax.
14 • Neither a stand-alone directive nor a declarative directive may be used in place of a substatement
15 in a selection statement or iteration statement, or in place of the statement that follows a label.
C++
C
16 • Neither a stand-alone directive nor a declarative directive may be used in place of a substatement
17 in a selection statement, in place of the loop body in an iteration statement, or in place of the
18 statement that follows a label.
C
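The last restriction can be illustrated with a short C sketch, assuming taskwait as the stand-alone directive. The function name maybe_wait is hypothetical. The directive is placed inside a compound statement, so the substatement of the if is the compound statement rather than the directive itself.

```c
/* Hypothetical helper: waits for child tasks only when asked to. */
int maybe_wait(int wait_for_children) {
    int waited = 0;

    /* Writing the taskwait directive directly as the substatement of the if,
     * without the braces, would be non-conforming. */
    if (wait_for_children) {
        #pragma omp taskwait
        waited = 1;
    }
    return waited;
}
```

The braces also keep the behavior identical when the pragma is ignored by a compiler without OpenMP support.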
Fortran

19 3.1.1 Fixed Source Form Directives


20 The following sentinels are recognized in fixed form source files:
21 !$omp | c$omp | *$omp | !$omx | c$omx | *$omx

22 The sentinels that end with omp are reserved for OpenMP directives that are defined in this
23 specification. The sentinels that end with omx are reserved for implementation-defined extensions
24 to the OpenMP directives.

1 Sentinels must start in column 1 and appear as a single word with no intervening characters.
2 Fortran fixed form line length, white space, continuation, and column rules apply to the directive
3 line. Initial directive lines must have a space or a zero in column 6, and continuation directive lines
4 must have a character other than a space or a zero in column 6.
5 Comments may appear on the same line as a directive. The exclamation point initiates a comment
6 when it appears after column 6. The comment extends to the end of the source line and is ignored.
7 If the first non-blank character after the directive sentinel of an initial or continuation directive line
8 is an exclamation point, the line is ignored.
9

10 Note – In the following example, the three formats for specifying the directive are equivalent (the
11 first line represents the position of the first 9 columns):
c23456789
!$omp parallel do shared(a,b,c)

c$omp parallel do
c$omp+shared(a,b,c)

c$omp paralleldoshared(a,b,c)
19
Fortran
Fortran

20 3.1.2 Free Source Form Directives


21 The following sentinels are recognized in free form source files:
22 !$omp | !$ompx

23 The !$omp sentinel is reserved for OpenMP directives that are defined in this specification. The
24 !$ompx sentinel is reserved for implementation-defined extensions to the OpenMP directives.
25 The sentinel can appear in any column as long as it is preceded only by white space. It must appear
26 as a single word with no intervening white space. Fortran free form line length and white space
27 rules apply to the directive line. Initial directive lines must have a space after the sentinel. The
28 initial line of a directive must not be a continuation line for a base language statement. Fortran free
29 form continuation rules apply. Thus, continued directive lines must have an ampersand (&) as the
30 last non-blank character on the line, prior to any comment placed inside the directive; continuation
31 directive lines can have an ampersand after the directive sentinel with optional white space before
32 and after the ampersand.
33 Comments may appear on the same line as a directive. The exclamation point (!) initiates a
34 comment. The comment extends to the end of the source line and is ignored. If the first non-blank
35 character after the directive sentinel is an exclamation point, the line is ignored.

1 One or more blanks or horizontal tabs are optional to separate adjacent keywords in
2 directive-names unless otherwise specified.
3
4 Note – In the following example the three formats for specifying the directive are equivalent (the
5 first line represents the position of the first 9 columns):
!23456789
!$omp parallel do &
!$omp shared(a,b,c)

!$omp parallel &
!$omp&do shared(a,b,c)

!$omp paralleldo shared(a,b,c)

14
Fortran

15 3.2 Clause Format


16 This section defines the format and categories of OpenMP clauses. OpenMP clauses are specified
17 as part of a directive-specification. Clauses are optional and, thus, may be omitted from a
18 directive-specification unless otherwise specified. The order in which clauses appear on directives
19 is not significant unless otherwise specified. A clause-specification specifies each OpenMP clause
20 in a directive-specification where clause-specification for inarguable clauses is simply:
21 clause-name

22 Inarguable clauses often form natural groupings that have similar semantic effect and so are
23 frequently specified as a clause grouping. For argument-modified clauses, clause-specification is:
24 clause-name[(clause-argument-specification [; clause-argument-specification [;...]])]
C / C++
25 White space in a clause-name is prohibited. White space within a clause-argument-specification
26 and between one clause-argument-specification and the next is optional.
C / C++
27 An implementation may allow clauses with clause names that start with the ompx_ prefix for use
28 on any OpenMP directive, and the format and semantics of any such clause are implementation
29 defined. All other clause names are reserved.
30 For argument-modified clauses, the first clause-argument-specification is required unless otherwise
31 explicitly stated while additional ones are only permitted on clauses that explicitly allow them.
32 When the first one is omitted, the syntax is identical to an inarguable clause. Clause arguments may
33 be unmodified or modified. For an unmodified argument, clause-argument-specification is:

1 clause-argument-list

2 Unless otherwise specified, modified arguments are pre-modified, for which the format is:
3 [modifier-specification [[, modifier-specification] ,... ] :]clause-argument-list

4 A few modified arguments are explicitly specified as post-modified, for which the format is:
5 clause-argument-list[: modifier-specification [[, modifier-specification] ,... ]]

6 For many OpenMP clauses, clause-argument-list is an OpenMP argument list, which is a
7 comma-separated list of a specific kind of list items (see Section 3.2.1), in which case the format of
8 clause-argument-list is:
9 argument-name

10 For all other OpenMP clauses, clause-argument-list is a comma-separated list of arguments so the
11 format is:
12 argument-name [, argument-name [,... ]]

13 In most of these cases, the list only has a single item so the format of clause-argument-list is again:
14 argument-name

15 In all cases, white space in clause-argument-list is optional.


16 Clause argument modifiers may be simple or complex. Almost all clause argument modifiers are simple, for
17 which the format of modifier-specification is:
18 modifier-name

19 The format of a complex modifier is:


20 modifier-name(modifier-parameter-specification)

21 where modifier-parameter-specification is a comma-separated list of arguments as defined above for
22 clause-argument-list. The position of each modifier-argument-name in the list is significant.
23 Each argument-name and modifier-name is an OpenMP term that may be used in the definitions of
24 the clause and any directives on which the clause may appear. Syntactically, each of these terms is
25 one of the following:
26 • keyword: An OpenMP keyword
27 • OpenMP identifier: An OpenMP identifier
28 • OpenMP argument list: An OpenMP argument list
29 • expression: An expression of some OpenMP type
30 • OpenMP stylized expression: An OpenMP stylized expression

1 A particular lexical instantiation of an argument specifies a parameter of the clause, while a lexical
2 instantiation of a modifier and its parameters affects how or when the argument is applied.
3 The order of arguments must match the order in the clause-specification. The order of modifiers in
4 a clause-argument-specification is not significant unless otherwise specified.
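The pre-modified format can be seen in two common clauses. The C sketch below is illustrative only, and the function name sum_squares is hypothetical. It assumes that in schedule(nonmonotonic: dynamic, 4) the modifier is nonmonotonic and the arguments are the kind dynamic and the chunk size 4, in that fixed order, and that in reduction(+: sum) the reduction identifier + acts as a modifier of the list argument sum.

```c
#include <assert.h>

/* Hypothetical helper: returns 0*0 + 1*1 + ... + (n-1)*(n-1). */
long sum_squares(int n) {
    long sum = 0;
    assert(n >= 0);

    /* modifier-specification ':' clause-argument-list, for both clauses */
    #pragma omp parallel for schedule(nonmonotonic: dynamic, 4) reduction(+: sum)
    for (int i = 0; i < n; ++i)
        sum += (long)i * i;

    return sum;
}
```

Swapping the two arguments of schedule would not be accepted, because argument order must match the clause-specification. The order of the two clauses on the directive is not significant.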
5 General syntactic properties govern the use of clauses, clause and directive arguments, and
6 modifiers in an OpenMP directive. These properties are summarized in Table 3.1, along with the
7 respective default properties for clauses, arguments and modifiers.

TABLE 3.1: Syntactic Properties for Clauses, Arguments and Modifiers

Property    Property Description               Inverse      Clause       Argument     Modifier
                                               Property     defaults     defaults     defaults
required    must be present                    optional     optional     required     optional
unique      may appear at most once            repeatable   repeatable   unique       unique
exclusive   must appear alone                  compatible   compatible   compatible   compatible
ultimate    must lexically appear last         free         free         free         free
            (or first for a modifier in
            a post-modified clause)
8 A clause, argument or modifier with a given property implies that it does not have the
9 corresponding inverse property, and vice versa. The ultimate property implies the unique property.
10 If all arguments and modifiers of an argument-modified clause or directive are optional and omitted
11 then the parentheses of the syntax for the clause or directive are also omitted.
12 Arguments and modifiers that are expressions may additionally have any of the following value
13 properties: constant, positive, non-negative, and region-invariant.
14

15 Note – In this example, clause-specification is depend(inout: d), clause-name is depend
16 and clause-argument-specification is inout: d. The depend clause has an argument for which
17 argument-name is locator-list, which syntactically is the OpenMP locator list d in the example.
18 Similarly, the depend clause accepts a simple clause modifier with the name
19 task-dependence-type. Syntactically, task-dependence-type is the keyword inout in the example.
#pragma omp depobj(o) depend(inout: d)

21

22 The clauses that a directive accepts may form sets. These sets may imply restrictions on their use
23 on that directive or may otherwise capture properties for the clauses on the directive. While specific
24 properties may be defined for a clause set on a particular directive, the following clause-set
25 properties have general meanings and implications as indicated by the restrictions below: required,
26 unique, and exclusive.

1 All clauses that are specified as a clause grouping form a clause set for which properties are
2 specified with the specification of the grouping. Some directives accept a clause grouping for which
3 each member is a directive-name of a directive that has a specific property. These groupings are
4 required, unique and exclusive unless otherwise specified.

5 Restrictions
6 Restrictions to clauses and clause sets are as follows:
7 • A required clause for a directive must appear on the directive.
8 • A unique clause for a directive may appear at most once on the directive.
9 • An exclusive clause for a directive must not appear if a clause with a different clause-name also
10 appears on the directive.
11 • An ultimate clause for a directive must be the lexically last clause to appear on the directive.
12 • If a clause set has the required property, at least one clause in the set must be present on the
13 directive for which the clause set is specified.
14 • If a clause is a member of a set that has the unique property for a directive then the clause has the
15 unique property for that directive regardless of whether it has the unique property when it is not
16 part of such a set.
17 • If one clause of a clause set with the exclusive property appears on a directive, no other clauses
18 with a different clause-name in that set may appear on the directive.
19 • A required argument must appear in the clause-specification.
20 • A unique argument may appear at most once in a clause-argument-specification.
21 • An exclusive argument must not appear if an argument with a different argument-name appears
22 in the clause-argument-specification.
23 • A required modifier must appear in the clause-argument-specification.
24 • A unique modifier may appear at most once in a clause-argument-specification.
25 • An exclusive modifier must not appear if a modifier with a different modifier-name also appears
26 in the clause-argument-specification.
27 • If a clause is pre-modified, an ultimate modifier must be the last modifier in a
28 clause-argument-specification in which any modifier appears.
29 • If a clause is post-modified, an ultimate modifier must be the first modifier in a
30 clause-argument-specification in which any modifier appears.
31 • A modifier that is an expression must neither lexically match the name of a simple modifier
32 defined for the clause that is an OpenMP keyword nor modifier-name parenthesized-tokens,
33 where modifier-name is the modifier-name of a complex modifier defined for the clause and
34 parenthesized-tokens is a token sequence that starts with ( and ends with ).

1 • A constant argument or parameter must be a compile-time constant.
2 • A positive argument or parameter must be greater than zero; a non-negative argument or
3 parameter must be greater than or equal to zero.
4 • A region-invariant argument or parameter must have the same value throughout any given
5 execution of the construct or, for declarative directives, execution of the function or subroutine
6 with which the declaration is associated.

7 Cross References
8 • Directive Format, see Section 3.1
9 • OpenMP Argument Lists, see Section 3.2.1
10 • OpenMP Stylized Expressions, see Section 4.2
11 • OpenMP Types and Identifiers, see Section 4.1

12 3.2.1 OpenMP Argument Lists


13 The OpenMP API defines several kinds of lists, each of which can be used as syntactic instances of
14 clause arguments. A list of any OpenMP type consists of a comma-separated collection of
15 expressions of that OpenMP type. A variable list consists of a comma-separated collection of one
16 or more variable list items. An extended list consists of a comma-separated collection of one or
17 more extended list items. A locator list consists of a comma-separated collection of one or more
18 locator list items. A parameter list consists of a comma-separated collection of one or more
19 parameter list items. A type-name list consists of a comma-separated collection of one or more
20 type-name list items. A directive-name list consists of a comma-separated collection of one or more
21 directive-name list items, each of which is the directive-name of some OpenMP directive. A foreign
22 runtime preference list consists of a comma-separated collection of one or more foreign-runtime list
23 items each of which is an OpenMP foreign-runtime identifier; the order of list items on a foreign
24 runtime preference list is significant. An OpenMP operation list consists of a comma-separated
25 collection of one or more OpenMP operation list items, each of which is an OpenMP operation
26 defined in Section 3.2.3; the order of the list items in an OpenMP operation list is significant.
C / C++
27 A variable list item is a variable or an array section. An extended list item is a variable list item or a
28 function name. A locator list item is any lvalue expression including variables, array sections, and
29 reserved locators. A parameter list item is the name of a function parameter. A type-name list item
30 is a type name.
C / C++

Fortran
1 A variable list item is one of the following:
2 • a variable that is not coindexed and that is not a substring;
3 • an array section that is not coindexed and that does not contain an element that is a substring;
4 • a named constant;
5 • an associate name that may appear in a variable definition context; or
6 • a common block name (enclosed in slashes).
7 An extended list item is a variable list item or a procedure name. A locator list item is a variable list
8 item, or a reserved locator. A parameter list item is a dummy argument of a subroutine or function.
9 A type-name list item is a type specifier that must not be CLASS(*) or an abstract type.
10 A named constant as a list item can appear only in clauses where it is explicitly allowed.
11 When a named common block appears in an OpenMP argument list, it has the same meaning and
12 restrictions as if every explicit member of the common block appeared in the list. An explicit
13 member of a common block is a variable that is named in a COMMON statement that specifies the
14 common block name and is declared in the same scoping unit in which the clause appears. Named
15 common blocks do not include the blank common block.
16 Although variables in common blocks can be accessed by use association or host association,
17 common block names cannot. As a result, a common block name specified in a clause must be
18 declared to be a common block in the same scoping unit in which the clause appears.
19 If a list item that appears in a directive or clause is an optional dummy argument that is not present,
20 the directive or clause for that list item is ignored.
21 If the variable referenced inside a construct is an optional dummy argument that is not present, any
22 explicitly determined, implicitly determined, or predetermined data-sharing and data-mapping
23 attribute rules for that variable are ignored. Otherwise, if the variable is an optional dummy
24 argument that is present, it is present inside the construct.
Fortran
25 Restrictions
26 The restrictions to OpenMP lists are as follows:
27 • Unless otherwise specified, OpenMP list items must be directive-wide unique, i.e., a list item can
28 only appear once in one OpenMP list of all arguments, clauses, and modifiers of the directive.
29 • All list items must be visible, according to the scoping rules of the base language.
C
30 • Unless otherwise specified, a variable that is part of another variable (as an array element or a
31 structure element) cannot be a variable list item, an extended list item or a locator list item.
C

C++
1 • Unless otherwise specified, a variable that is part of another variable (as an array element or a
2 structure element) cannot be a variable list item, an extended list item or locator list item except
3 if the list appears on a clause that is associated with a construct within a class non-static member
4 function and the variable is an accessible data member of the object for which the non-static
5 member function is invoked.
C++
Fortran
6 • Unless otherwise specified, a variable that is part of another variable (as an array element or a
7 structure element) cannot be a variable list item, an extended list item or locator list item.
Fortran

8 3.2.2 Reserved Locators


9 On some directives, some clauses accept the use of reserved locators as special identifiers that
10 represent system storage not necessarily bound to any base language storage item. Reserved
11 locators may only appear in clauses and directives where they are explicitly allowed and may not
12 otherwise be referenced in the program. The list of reserved locators is:
13 omp_all_memory

14 The reserved locator omp_all_memory is a reserved identifier that denotes a list item treated as
15 having storage that corresponds to the storage of all other objects in memory.

16 3.2.3 OpenMP Operations


17 On some directives, some clauses accept the use of OpenMP operations. An OpenMP operation
18 named <generic_name> is a special expression that may be specified in an OpenMP operation list
19 and that is used to construct an object of the <generic_name> OpenMP type (see Section 4.1). In
20 general, the format of an OpenMP operation is the following:
21 <generic_name>(operation-parameter-specification)

C / C++

1 3.2.4 Array Shaping


2 If an expression has a type of pointer to T, then a shape-operator can be used to specify the extent of
3 that pointer. In other words, the shape-operator is used to reinterpret, as an n-dimensional array, the
4 region of memory to which that expression points.
5 Formally, the syntax of the shape-operator is as follows:
6 shaped-expression := ([s1][s2]...[sn])cast-expression

7 The result of applying the shape-operator to an expression is an lvalue expression with an
8 n-dimensional array type with dimensions s1 × s2 × ... × sn and element type T.
9 The precedence of the shape-operator is the same as a type cast.
10 Each si is an integral type expression that must evaluate to a positive integer.
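The reinterpretation performed by the shape-operator has a rough base-language analogy in C, sketched below. This is not the shape-operator itself, which may only appear in clauses that allow it; it is a plain C cast to a pointer to a variably modified array type. The function name shaped_get is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: reads element (i, j) of the rows x cols array at p. */
int shaped_get(int *p, size_t rows, size_t cols, size_t i, size_t j) {
    assert(p != NULL && rows > 0 && cols > 0);
    assert(i < rows && j < cols);

    /* Plain C analogy of ([rows][cols])p: view the memory that p points to
     * as an lvalue of type int[rows][cols]. */
    int (*m)[rows][cols] = (int (*)[rows][cols])p;
    return (*m)[i][j];
}
```

As with the shape-operator, every extent must be positive and the element type must be complete. The same six-element buffer can be viewed as 2 × 3 or as 3 × 2, and only the index arithmetic changes.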

11 Restrictions
12 Restrictions to the shape-operator are as follows:
13 • The type T must be a complete type.
14 • The shape-operator can appear only in clauses for which it is explicitly allowed.
15 • The result of a shape-operator must be a named array of a list item.
16 • The type of the expression upon which a shape-operator is applied must be a pointer type.
C++
17 • If the type T is a reference to a type T’, then the type will be considered to be T’ for all purposes
18 of the designated array.
C++
C / C++

1 3.2.5 Array Sections
2 An array section designates a subset of the elements in an array.
C / C++
3 To specify an array section in an OpenMP directive, array subscript expressions are extended with
4 one of the following syntaxes:
5 [ lower-bound : length : stride]
6 [ lower-bound : length : ]
7 [ lower-bound : length ]
8 [ lower-bound : : stride]
9 [ lower-bound : : ]
10 [ lower-bound : ]
11 [ : length : stride]
12 [ : length : ]
13 [ : length ]
14 [ : : stride]
15 [::]
16 [:]

17 The array section must be a subset of the original array.


18 Array sections are allowed on multidimensional arrays. Base language array subscript expressions
19 can be used to specify length-one dimensions of multidimensional array sections.
20 Each of the lower-bound, length, and stride expressions if specified must be an integral type
21 expression of the base language. When evaluated they represent a set of integer values as follows:
22 { lower-bound, lower-bound + stride, lower-bound + 2 * stride,... , lower-bound + ((length - 1) *
23 stride) }
24 The length must evaluate to a non-negative integer.
25 The stride must evaluate to a positive integer.
26 When the size of the array dimension is not known, the length must be specified explicitly.
27 When the stride is absent it defaults to 1.
28 When the length is absent it defaults to $\lceil (\textit{size} - \textit{lower-bound}) / \textit{stride} \rceil$, where size is the size of the
29 array dimension.
30 When the lower-bound is absent it defaults to 0.
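The defaulting rules above can be worked through in a few lines of C. The sketch below is only an arithmetic illustration of the rules and is not part of any OpenMP API; the function names section_default_length and section_index are hypothetical.

```c
#include <assert.h>

/* Default length when it is omitted: ceil((size - lower) / stride). */
long section_default_length(long size, long lower, long stride) {
    assert(stride > 0 && lower >= 0 && lower <= size);
    return (size - lower + stride - 1) / stride;  /* integer ceiling division */
}

/* k-th index designated by [lower : length : stride], for 0 <= k < length. */
long section_index(long lower, long stride, long k) {
    assert(k >= 0);
    return lower + k * stride;
}
```

For an array dimension of size 11, a section written with lower-bound 1 and no length has a default length of 10. For a dimension of size 10 with stride 3 and no lower-bound, the default length is 4 and the designated indices are 0, 3, 6 and 9.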

1 The precedence of a subscript operator that uses the array section syntax is the same as the
2 precedence of a subscript operator that does not use the array section syntax.
3
4 Note – The following are examples of array sections:
a[0:6]
a[0:6:1]
a[1:10]
a[1:]
a[:10:2]
b[10][:][:]
b[10][:][:0]
c[42][0:6][:]
c[42][0:6:2][:]
c[1:10][42][0:6]
S.c[:100]
p->y[:10]
this->a[:N]
(p+10)[:N]

19 Assume a is declared to be a 1-dimensional array with dimension size 11. The first two examples
20 are equivalent, and the third and fourth examples are equivalent. The fifth example specifies a stride
21 of 2 and therefore is not contiguous.
22 Assume b is declared to be a pointer to a 2-dimensional array with dimension sizes 10 and 10. The
23 sixth example refers to all elements of the 2-dimensional array given by b[10]. The seventh
24 example is a zero-length array section.
25 Assume c is declared to be a 3-dimensional array with dimension sizes 50, 50, and 50. The eighth
26 example is contiguous, while the ninth and tenth examples are not contiguous.
27 The final four examples show array sections that are formed from more general base expressions.
28 The following are examples that are non-conforming array sections:
s[:10].x
p[:10]->y
*(xp[:10])
32 For all three examples, a base language operator is applied in an undefined manner to an array
1 section. The only operator that may be applied to an array section is a subscript operator for which
2 the array section appears as the postfix expression.
3
4
C / C++
Fortran
5 Fortran has built-in support for array sections although some restrictions apply to their use in
6 OpenMP directives, as enumerated in the following section.
Fortran
7 Restrictions
8 Restrictions to array sections are as follows:
9 • An array section can appear only in clauses for which it is explicitly allowed.
10 • A stride expression may not be specified unless otherwise stated.
C / C++
11 • An element of an array section with a non-zero size must have a complete type.
12 • The base expression of an array section must have an array or pointer type.
13 • If a consecutive sequence of array subscript expressions appears in an array section, and the first
14 subscript expression in the sequence uses the extended array section syntax defined in this
15 section, then only the last subscript expression in the sequence may select array elements that
16 have a pointer type.
C / C++
C++
17 • If the type of the base expression of an array section is a reference to a type T, then the type will
18 be considered to be T for all purposes of the array section.
19 • An array section cannot be used in an overloaded [] operator.
C++
Fortran
20 • If a stride expression is specified, it must be positive.
21 • The upper bound for the last dimension of an assumed-size dummy array must be specified.
22 • If a list item is an array section with vector subscripts, the first array element must be the lowest
23 in the array element order of the array section.
24 • If a list item is an array section, the last part-ref of the list item must have a section subscript list.
Fortran

1 3.2.6 iterator Modifier
2 Modifiers
Name       Modifies       Type                               Properties
iterator   locator-list   Complex, name: iterator            unique
                          Arguments: iterator-specifier
                          OpenMP expression (repeatable)

4 Clauses
5 affinity, depend, from, map, to
6 An iterator modifier is a unique, complex modifier that defines a set of iterators, each of which is an
7 iterator-identifier and an associated set of values. An iterator-identifier expands to those values in
8 the clause argument for which it is specified. Each member of the modifier-parameter-specification
9 list of an iterator modifier is an iterator-specifier with this format:
C / C++
10 [ iterator-type ] iterator-identifier = range-specification
C / C++
Fortran
11 [ iterator-type :: ] iterator-identifier = range-specification
Fortran
12 where:
13 • iterator-identifier is a base-language identifier.
14 • iterator-type is a type that is permitted in a type-name list.
15 • range-specification is of the form begin:end[:step], where begin and end are expressions
16 whose types can be converted to iterator-type and step is an integral expression.
C / C++
17 In an iterator-specifier, if the iterator-type is not specified then that iterator is of int type.
C / C++
Fortran
18 In an iterator-specifier, if the iterator-type is not specified then that iterator has default integer type.
Fortran
19 In a range-specification, if the step is not specified its value is implicitly defined to be 1.
20 An iterator only exists in the context of the clause argument that it modifies. An iterator also hides
21 all accessible symbols with the same name in the context of that clause argument.
22 The use of a variable in an expression that appears in the range-specification causes an implicit
23 reference to the variable in all enclosing constructs.

C / C++
1 The values of the iterator are the set of values $i_0, \ldots, i_{N-1}$ where:
2 • $i_0$ = (iterator-type) begin;
3 • $i_j$ = (iterator-type) ($i_{j-1}$ + step), where $j \ge 1$; and
4 • if step > 0,
5 – $i_0$ < (iterator-type) end;
6 – $i_{N-1}$ < (iterator-type) end; and
7 – (iterator-type) ($i_{N-1}$ + step) ≥ (iterator-type) end;
8 • if step < 0,
9 – $i_0$ > (iterator-type) end;
10 – $i_{N-1}$ > (iterator-type) end; and
11 – (iterator-type) ($i_{N-1}$ + step) ≤ (iterator-type) end.
C / C++
Fortran
12 The values of the iterator are the set of values $i_1, \ldots, i_N$ where:
13 • $i_1$ = begin;
14 • $i_j$ = $i_{j-1}$ + step, where $j \ge 2$; and
15 • if step > 0,
16 – $i_1$ ≤ end;
17 – $i_N$ ≤ end; and
18 – $i_N$ + step > end;
19 • if step < 0,
20 – $i_1$ ≥ end;
21 – $i_N$ ≥ end; and
22 – $i_N$ + step < end.
Fortran
23 The set of values will be empty if no possible value complies with the conditions above.
24 If an iterator-identifier appears in a list-item expression of the modified argument, the effect is as if
25 the list item is instantiated within the clause for each member of the iterator value set, substituting
26 each occurrence of iterator-identifier in the list-item expression with the iterator value. If the
27 iterator value set is empty then the effect is as if the list item was not specified.
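
As a rough illustration of the C/C++ rules above, the following sketch enumerates the value set of an iterator of type int for a range-specification begin:end:step. The function name iterator_values and its buffer interface are assumptions made for this example and are not part of the OpenMP API. The sketch also assumes that step is non-zero and that every sum i + step is representable in int.

```c
#include <stddef.h>

/* Hypothetical helper (not an OpenMP routine): writes up to cap values of the
 * iterator value set for begin:end:step into out and returns N, the size of
 * the set. Assumes step != 0 and that no i + step overflows int. */
static size_t iterator_values(int begin, int end, int step,
                              int *out, size_t cap)
{
    size_t n = 0;
    if (step > 0) {
        /* i_0 < end and i_{N-1} < end; the loop stops once i + step >= end. */
        for (int i = begin; i < end; i += step) {
            if (n < cap)
                out[n] = i;
            ++n;
        }
    } else if (step < 0) {
        /* i_0 > end and i_{N-1} > end; the loop stops once i + step <= end. */
        for (int i = begin; i > end; i += step) {
            if (n < cap)
                out[n] = i;
            ++n;
        }
    }
    return n;   /* 0 means that the value set is empty */
}
```

Under these rules, a clause such as depend(iterator(i=0:10:3), in: a[i]) would behave as if the list items a[0], a[3], a[6] and a[9] had been specified.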



1 Restrictions
2 Restrictions to iterator modifiers are as follows:
3 • The iterator-type must not declare a new type.
4 • For each value i in an iterator value set, the mathematical result of i + step must be
5 representable in iterator-type.
C / C++
6 • The iterator-type must be an integral or pointer type.
7 • The iterator-type must not be const qualified.
C / C++
Fortran
8 • The iterator-type must be an integer type.
Fortran
9 • If the step expression of a range-specification equals zero, the behavior is unspecified.
10 • Each iterator-identifier can only be defined once in the modifier-parameter-specification.
11 • Iterators cannot appear in the range-specification.

12 Cross References
13 • affinity clause, see Section 12.5.1
14 • depend clause, see Section 15.9.5
15 • from clause, see Section 5.9.2
16 • map clause, see Section 5.8.3
17 • to clause, see Section 5.9.1

18 3.3 Conditional Compilation


19 In implementations that support a preprocessor, the _OPENMP macro name is defined to have the
20 decimal value yyyymm where yyyy and mm are the year and month designations of the version of
21 the OpenMP API that the implementation supports.
22 If a #define or a #undef preprocessing directive in user code defines or undefines the
23 _OPENMP macro name, the behavior is unspecified.
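
A minimal sketch of how a C program might use the macro follows. Only _OPENMP itself comes from this section; the helper names openmp_version_date and openmp_version_string are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helpers for this sketch; only _OPENMP itself is defined by the
 * specification. */

/* Returns the yyyymm value of _OPENMP, or 0 when the macro is not defined
 * (for example, when OpenMP support is not enabled). */
static long openmp_version_date(void)
{
#ifdef _OPENMP
    return (long)_OPENMP;
#else
    return 0L;
#endif
}

/* Formats the version date as "yyyy-mm" (or "none") into buf. */
static void openmp_version_string(char *buf, size_t len)
{
    long v = openmp_version_date();
    if (v > 0)
        snprintf(buf, len, "%04ld-%02ld", v / 100, v % 100);
    else
        snprintf(buf, len, "none");
}
```

Because the code only tests the macro and never defines or undefines it, the same source compiles with or without OpenMP support.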
Fortran
24 The OpenMP API requires Fortran lines to be compiled conditionally, as described in the following
25 sections.
Fortran



Fortran

1 3.3.1 Fixed Source Form Conditional Compilation Sentinels


2 The following conditional compilation sentinels are recognized in fixed form source files:
3 !$ | *$ | c$

4 To enable conditional compilation, a line with a conditional compilation sentinel must satisfy the
5 following criteria:
6 • The sentinel must start in column 1 and appear as a single word with no intervening white space;
7 • After the sentinel is replaced with two spaces, initial lines must have a space or zero in column 6
8 and only white space and numbers in columns 1 through 5; and
9 • After the sentinel is replaced with two spaces, continuation lines must have a character other than
10 a space or zero in column 6 and only white space in columns 1 through 5.
11 If these criteria are met, the sentinel is replaced by two spaces. If these criteria are not met, the line
12 is left unchanged.
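
The criteria above can be sketched as a small C checker. This is an illustration only: the function name is hypothetical, columns beyond the end of a short line are treated as blanks, and an upper-case C is accepted as a sentinel letter; both of these choices are assumptions of the sketch rather than statements of the specification.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical checker (not part of OpenMP): returns true if a fixed source
 * form line would be conditionally compiled, that is, if its sentinel would
 * be replaced by two spaces. */
static bool fixed_form_sentinel_applies(const char *line)
{
    size_t len = strlen(line);

    /* The sentinel (!$, *$ or c$) must occupy columns 1 and 2. */
    if (len < 2 || line[1] != '$')
        return false;
    if (line[0] != '!' && line[0] != '*' && line[0] != 'c' && line[0] != 'C')
        return false;

    /* View columns 1-6 as they look after the sentinel becomes two spaces. */
    char col[6];
    for (size_t i = 0; i < 6; ++i)
        col[i] = (i < 2 || i >= len) ? ' ' : line[i];

    /* Column 6 decides between an initial line and a continuation line. */
    bool continuation = (col[5] != ' ' && col[5] != '0');
    for (size_t i = 2; i < 5; ++i) {
        unsigned char c = (unsigned char)col[i];
        if (isspace(c))
            continue;
        /* Initial lines may carry a numeric label; continuation lines may not. */
        if (continuation || !isdigit(c))
            return false;
    }
    return true;
}
```

With this sketch, a line that starts with !$omp is rejected because columns 3 through 5 hold letters, so the check does not mistake a directive sentinel for a conditional compilation sentinel.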

14 Note – In the following example, the two forms for specifying conditional compilation in fixed
15 source form are equivalent (the first line represents the position of the first 9 columns):
c23456789
!$ 10 iam = omp_get_thread_num() +
!$   &          index

#ifdef _OPENMP
   10 iam = omp_get_thread_num() +
     &          index
#endif

Fortran



Fortran

1 3.3.2 Free Source Form Conditional Compilation Sentinel


2 The following conditional compilation sentinel is recognized in free form source files:
3 !$

4 To enable conditional compilation, a line with a conditional compilation sentinel must satisfy the
5 following criteria:
6 • The sentinel can appear in any column but must be preceded only by white space;
7 • The sentinel must appear as a single word with no intervening white space;
8 • Initial lines must have a blank character after the sentinel; and
9 • Continued lines must have an ampersand as the last non-blank character on the line, prior to any
10 comment appearing on the conditionally compiled line.
11 Continuation lines can have an ampersand after the sentinel, with optional white space before and
12 after the ampersand. If these criteria are met, the sentinel is replaced by two spaces. If these criteria
13 are not met, the line is left unchanged.

15 Note – In the following example, the two forms for specifying conditional compilation in free
16 source form are equivalent (the first line represents the position of the first 9 columns):
c23456789
!$ iam = omp_get_thread_num() + &
!$&    index

#ifdef _OPENMP
   iam = omp_get_thread_num() + &
       index
#endif

Fortran



1 3.4 if Clause
2 Name: if    Properties: default

3 Arguments
4 Name             Type                          Properties
  if-expression    expression of logical type    default

5 Modifiers
6 Name                       Modifies         Type                       Properties
  directive-name-modifier    if-expression    Keyword: directive-name    unique

7 Directives
8 cancel, parallel, simd, target, target data, target enter data, target
9 exit data, target update, task, taskloop
10 Semantics
11 If no directive-name-modifier is specified then the effect is as if a directive-name-modifier was
12 specified with the directive-name of the directive on which the clause appears.
13 The effect of the if clause depends on the construct to which it is applied. If the construct is not a
14 combined or composite construct then the effect is described in the section that describes that
15 construct. For combined or composite constructs, the if clause only applies to the semantics of the
16 construct named in the directive-name-modifier. For a combined or composite construct, if no
17 directive-name-modifier is specified then the if clause applies to all constituent constructs to
18 which an if clause can apply.
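
A minimal sketch of an if clause with a directive-name-modifier on a combined construct follows. The function scale and the threshold parameter min_parallel are assumptions made for this example. When the expression is false, the region is expected to execute with a team of one thread, as the description of the parallel construct specifies.

```c
/* Scales a[0..n-1] by factor. The parallel region is only made active when
 * the array is long enough; min_parallel is a tuning knob assumed for this
 * sketch. Without OpenMP support the pragma is ignored and the loop runs
 * serially. */
static void scale(double *a, int n, double factor, int min_parallel)
{
    /* directive-name-modifier "parallel": the if clause applies to the
     * parallel constituent of the combined parallel for construct. */
    #pragma omp parallel for if(parallel: n >= min_parallel)
    for (int i = 0; i < n; ++i)
        a[i] *= factor;
}
```

Writing if(n >= min_parallel) without the modifier should have the same effect here, since parallel is the only constituent construct of parallel for to which an if clause can apply.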
19 Restrictions
20 Restrictions to the if clause are as follows:
21 • At most one if clause can be specified that applies to the semantics of any construct or
22 constituent construct of a directive-specification.
23 • The directive-name-modifier must specify the directive-name of the construct or of a constituent
24 construct of the directive-specification on which the if clause appears.
25 Cross References
26 • cancel directive, see Section 16.1
27 • parallel directive, see Section 10.1
28 • simd directive, see Section 10.4
29 • target data directive, see Section 13.5
30 • target directive, see Section 13.8
31 • target enter data directive, see Section 13.6



1 • target exit data directive, see Section 13.7
2 • target update directive, see Section 13.9
3 • task directive, see Section 12.5
4 • taskloop directive, see Section 12.6

5 3.5 destroy Clause


6 Name: destroy Properties: default

7 Arguments
8 Name           Type                                Properties
  destroy-var    variable of OpenMP variable type    default

9 Directives
10 depobj, interop
11 Additional information
12 When the destroy clause appears on the depobj construct, the destroy-var argument may be
13 omitted. This syntax has been deprecated.
14 Semantics
15 If the destroy clause appears on a depobj construct and destroy-var is not specified, the effect
16 is as if destroy-var refers to the same OpenMP depend object as the depobj argument of the
17 construct. The syntax of the destroy clause on the depobj construct that does not specify
18 destroy-var has been deprecated. When the destroy clause appears on a depobj construct, the
19 state of destroy-var is set to uninitialized.
20 When the destroy clause appears on an interop construct, the interop-type is inferred based
21 on the interop-type used to initialize destroy-var, and destroy-var is set to the value of
22 omp_interop_none after resources associated with destroy-var are released. The object
23 referred to by destroy-var is unusable after destruction and the effect of using values associated
24 with it is unspecified until it is initialized again by another interop construct.
25 Restrictions
26 • destroy-var must be non-const.
27 • If the destroy clause appears on a depobj construct, destroy-var must refer to the same
28 depend object as the depobj argument of the construct.
29 • If the destroy clause appears on an interop construct, destroy-var must refer to a variable of
30 OpenMP interop type.
31 Cross References
32 • depobj directive, see Section 15.9.4
33 • interop directive, see Section 14.1



1 4 Base Language Formats and
2 Restrictions
3 This section defines concepts and restrictions on base language code used in OpenMP. The concepts
4 help support base language neutrality for OpenMP directives and their associated semantics.

5 Restrictions
6 The following restrictions apply generally for base language code in an OpenMP program:
7 • Programs must not declare names that begin with the omp_ or ompx_ prefix, as these are
8 reserved for the OpenMP implementation.
C++
9 • Programs must not declare a namespace with the omp or ompx names, as these are reserved for
10 the OpenMP implementation.
C++

11 4.1 OpenMP Types and Identifiers


12 An OpenMP identifier is a special identifier for use within OpenMP directives and clauses for some
13 specific purpose. For example, OpenMP reduction identifiers specify the combiner operation to use
14 in a reduction, OpenMP mapper identifiers specify the name of a user-defined mapper, and
15 OpenMP foreign runtime identifiers specify the name of a foreign runtime.
16 Generic OpenMP types specify the type of expression or variable that is used in OpenMP contexts
17 regardless of the base language. These types support the definition of many important OpenMP
18 concepts independently of the base language in which they are used.
19 The assignable OpenMP type instance is defined to facilitate base language neutrality. An
20 assignable OpenMP type instance can be used as an argument of an OpenMP construct in order for
21 the implementation to modify the value of that instance.
C / C++
22 An assignable OpenMP type instance is an lvalue expression of that OpenMP type.
C / C++
Fortran
23 An assignable OpenMP type instance is a variable of that OpenMP type.
Fortran



1 The OpenMP logical type supports logical variables and expressions in any base language.
C / C++
2 Any OpenMP logical expression is a scalar expression. This document uses true as a generic term
3 for a non-zero integer value and false as a generic term for an integer value of zero.
C / C++
Fortran
4 Any OpenMP logical expression is a scalar logical expression. This document uses true as a generic
5 term for a logical value of .TRUE. and false as a generic term for a logical value of .FALSE..
Fortran
6 The OpenMP integer type supports integer variables and expressions in any base language.
C / C++
7 Any OpenMP integer expression is an integer expression.
C / C++
Fortran
8 Any OpenMP integer expression is a scalar integer expression.
Fortran
9 The OpenMP string type supports character string variables and expressions in any base language.
C / C++
10 Any OpenMP string expression is an expression of type qualified or unqualified const char *
11 or char * pointing to a null-terminated character string.
C / C++
Fortran
12 Any OpenMP string expression is a character string of default kind.
Fortran
13 OpenMP function identifiers support procedure names in any base language. Regardless of the base
14 language, any OpenMP function identifier is the name of a procedure as a base language identifier.
15 Each OpenMP type other than those specifically defined in this section has a generic name,
16 <generic_name>, by which it is referred to throughout this document and that is used to construct the
17 base language construct that corresponds to that OpenMP type.
C / C++
18 A variable of <generic_name> OpenMP type is a variable of type omp_<generic_name>_t.
C / C++
Fortran
19 A variable of <generic_name> OpenMP type is a scalar integer variable of kind
20 omp_<generic_name>_kind.
Fortran



1 Cross References
2 • OpenMP Foreign Runtime Identifiers, see Section 14.1.1
3 • OpenMP Reduction Identifiers, see Section 5.5.1
4 • mapper modifier, see Section 5.8.2

5 4.2 OpenMP Stylized Expressions


6 An OpenMP stylized expression is a base language expression that is subject to restrictions that
7 enable its use within an OpenMP implementation. These expressions often make use of special
8 variable identifiers that the implementation binds to well-defined internal state.

9 Cross References
10 • OpenMP Combiner Expressions, see Section 5.5.2.1
11 • OpenMP Initializer Expressions, see Section 5.5.2.2

12 4.3 Structured Blocks


13 This section specifies the concept of a structured block. A structured block:
14 • may contain infinite loops where the point of exit is never reached;
15 • may halt due to an IEEE exception;
C / C++
16 • may contain calls to exit(), _Exit(), quick_exit(), abort() or functions with a
17 _Noreturn specifier (in C) or a noreturn attribute (in C/C++);
18 • may be an expression statement, iteration statement, selection statement, or try block, provided
19 that the corresponding compound statement obtained by enclosing it in { and } would be a
20 structured block; and
C / C++
Fortran
21 • may contain STOP or ERROR STOP statements.
Fortran
C / C++
22 A structured block sequence that consists of no statements or more than one statement may appear
23 only for executable directives that explicitly allow it. The corresponding compound statement
24 obtained by enclosing the sequence in { and } must be a structured block and the structured block
25 sequence then should be considered to be a structured block with all of its restrictions.
C / C++



1 Restrictions
2 Restrictions to structured blocks are as follows:
3 • Entry to a structured block must not be the result of a branch.
4 • The point of exit cannot be a branch out of the structured block.
C / C++
5 • The point of entry to a structured block must not be a call to setjmp.
6 • longjmp must not violate the entry/exit criteria of structured blocks.
C / C++
C++
7 • throw, co_await, co_yield and co_return must not violate the entry/exit criteria of
8 structured blocks.
C++
Fortran
9 • If a BLOCK construct appears in a structured block, that BLOCK construct must not contain any
10 ASYNCHRONOUS or VOLATILE statements, nor any specification statements that include the
11 ASYNCHRONOUS or VOLATILE attributes.
Fortran
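
The entry and exit restrictions can be illustrated with a short C sketch. The function checked_sum is hypothetical. It records an error in a flag instead of returning from inside the block, so the structured block is entered at the top and left at the bottom.

```c
#include <stddef.h>

/* Hypothetical example: sums a[0..n-1] into *out, returning 0 on success and
 * -1 on invalid arguments. */
static int checked_sum(const int *a, int n, long *out)
{
    int status = 0;
    long total = 0;

    #pragma omp parallel
    {
        #pragma omp single
        {
            if (a == NULL || n < 0) {
                /* A "return -1;" here would branch out of the block. */
                status = -1;
            } else {
                for (int i = 0; i < n; ++i)
                    total += a[i];
            }
        }
    }   /* the single point of exit of the parallel structured block */

    if (status != 0)
        return status;
    *out = total;
    return 0;
}
```

The return statements appear only after the closing brace of the parallel region, which keeps the block conforming.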

12 4.3.1 OpenMP Context-Specific Structured Blocks


13 An OpenMP context-specific structured block consists of statements that conform to specific
14 restrictions so that OpenMP can treat them as a structured block or a structured block sequence.
15 The restrictions depend on the context in which the context-specific structured block can be used.

16 4.3.1.1 OpenMP Allocator Structured Blocks


Fortran
17 An OpenMP allocator structured block consists of allocate-stmt, where allocate-stmt is a Fortran
18 ALLOCATE statement. Allocator structured blocks are considered strictly structured blocks for the
19 purpose of the allocators construct.
Fortran
20 Cross References
21 • allocators directive, see Section 6.7



1 4.3.1.2 OpenMP Function Dispatch Structured Blocks
2 An OpenMP function dispatch structured block is a context-specific structured block that identifies
3 the location of a function dispatch.
C / C++
4 A function dispatch structured block is an expression statement with one of the following forms:
5 lvalue-expression = target-call ( [expression-list] );

6 or
7 target-call ( [expression-list] );
C / C++
Fortran
8 A function dispatch structured block is an expression statement with one of the following forms:
9 expression = target-call ( [arguments] )

10 or
11 CALL target-call [ ( [arguments] )]

12 For purposes of the dispatch construct, the expression statement is considered a strictly
13 structured block.
Fortran
14 Restrictions
15 Restrictions to the function dispatch structured blocks are as follows:
C++
16 • The target-call expression can only be a direct call.
C++
Fortran
17 • target-call must be a procedure name.
18 • target-call must not be a procedure pointer.
Fortran
19 Cross References
20 • dispatch directive, see Section 7.6



1 4.3.1.3 OpenMP Atomic Structured Blocks
2 An OpenMP atomic structured block is a context-specific structured block that can appear in an
3 atomic construct. The form of an atomic structured block depends on the atomic semantics that
4 the directive enforces.
5 In the following definitions:
C / C++
6 • x, r (result), and v (as applicable) are lvalue expressions with scalar type.
7 • e (expected) is an expression with scalar type.
8 • d (desired) is an expression with scalar type.
9 • e and v may refer to, or access, the same storage location.
10 • expr is an expression with scalar type.
11 • The order operation, ordop, is one of < or >.
12 • binop is one of +, *, -, /, &, ^, |, <<, or >>.
13 • == comparisons are performed by comparing the value representation of operand values for
14 equality after the usual arithmetic conversions; if the object representation does not have any
15 padding bits, the comparison is performed as if with memcmp.
16 • For forms that allow multiple occurrences of x, the number of times that x is evaluated is
17 unspecified but will be at least one.
18 • For forms that allow multiple occurrences of expr, the number of times that expr is evaluated is
19 unspecified but will be at least one.
20 • The number of times that r is evaluated is unspecified but will be at least one.
21 • Whether d is evaluated if x == e evaluates to false is unspecified.
C / C++
Fortran
22 • x, v, d and e (as applicable) are scalar variables of intrinsic type.
23 • expr is a scalar expression.
24 • expr-list is a comma-separated, non-empty list of scalar expressions.
25 • intrinsic-procedure-name is one of MAX, MIN, IAND, IOR, or IEOR.
26 • operator is one of +, *, -, /, .AND., .OR., .EQV., or .NEQV..
27 • equalop is ==, .EQ., or .EQV..
28 • == or .EQ. comparisons are performed by comparing the physical representation of operand
29 values for equality after the usual conversions as described in the base language, while ignoring
30 padding bits, if any.



1 • .EQV. comparisons are performed as described in the base language.
2 • For forms that allow multiple occurrences of x, the number of times that x is evaluated is
3 unspecified but will be at least one.
4 • For forms that allow multiple occurrences of expr, the number of times that expr is evaluated is
5 unspecified but will be at least one.
6 • The number of times that r is evaluated is unspecified but will be at least one.
7 • Whether d is evaluated if x equalop e evaluates to false is unspecified.
Fortran
8 A read-atomic structured block can be specified for atomic directives that enforce atomic read
9 semantics but not capture semantics.
C / C++
10 A read-atomic structured block is read-expr-stmt, a read expression statement that has the following
11 form:
12 v = x;
C / C++
Fortran
13 A read-atomic structured block is read-statement, a read statement that has the following form:
14 v = x
Fortran
15 A write-atomic structured block can be specified for atomic directives that enforce atomic write
16 semantics but not capture semantics.
C / C++
17 A write-atomic structured block is write-expr-stmt, a write expression statement that has the
18 following form:
19 x = expr;
C / C++
Fortran
20 A write-atomic structured block is write-statement, a write statement that has the following form:
21 x = expr
Fortran
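
The read-atomic and write-atomic forms might be used in C as in the following sketch. The variable ready_flag and the functions publish and observe are names assumed for this example.

```c
/* Hypothetical flag shared between threads for this sketch. */
static int ready_flag = 0;

/* write-atomic structured block of the form x = expr; */
static void publish(int value)
{
    #pragma omp atomic write
    ready_flag = value;
}

/* read-atomic structured block of the form v = x; */
static int observe(void)
{
    int v;
    #pragma omp atomic read
    v = ready_flag;
    return v;
}
```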
22 An update-atomic structured block can be specified for atomic directives that enforce atomic
23 update semantics but not capture semantics.



C / C++
1 An update-atomic structured block is update-expr-stmt, an update expression statement that has one
2 of the following forms:
3 x++;
4 x--;
5 ++x;
6 --x;
7 x binop= expr;
8 x = x binop expr;
9 x = expr binop x;
C / C++
Fortran
10 An update-atomic structured block is update-statement, an update statement that has one of the
11 following forms:
12 x = x operator expr
13 x = expr operator x
14 x = intrinsic-procedure-name (x, expr-list)
15 x = intrinsic-procedure-name (expr-list, x)
Fortran
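
As an illustration of an update-atomic structured block in C, the following sketch counts occurrences into shared bins. The function histogram and its signature are assumptions of this example; values outside the bin range are skipped.

```c
/* Counts how often each value in [0, nbins) occurs in data[0..n-1]. */
static void histogram(const int *data, int n, int *bins, int nbins)
{
    for (int b = 0; b < nbins; ++b)
        bins[b] = 0;

    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        int b = data[i];
        if (b < 0 || b >= nbins)
            continue;
        /* update-atomic structured block of the form x binop= expr; */
        #pragma omp atomic update
        bins[b] += 1;
    }
}
```

Several threads may increment the same bin, so each increment is made atomic; the index b is private to the iteration and needs no protection.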
16 A conditional-update-atomic structured block can be specified for atomic directives that enforce
17 atomic conditional update semantics but not capture semantics.
C / C++
18 A conditional-update-atomic structured block is either cond-expr-stmt, a conditional expression
19 statement that has one of the following forms:
20 x = expr ordop x ? expr : x;
21 x = x ordop expr ? expr : x;
22 x = x == e ? d : x;

23 or cond-update-stmt, a conditional update statement that has one of the following forms:
24 if(expr ordop x) { x = expr; }
25 if(x ordop expr) { x = expr; }
26 if(x == e) { x = d; }
C / C++
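
A conditional update of the form if(x ordop expr) { x = expr; } can express an atomic maximum, as sketched below. The names atomic_max and max_of are assumptions of this example. The sketch uses the compare clause only when _OPENMP reports at least 202011, which is assumed here to be the first version date that provides the clause; otherwise a critical region serves as a fallback.

```c
/* Raises *x to value if value is larger. */
static void atomic_max(int *x, int value)
{
#if defined(_OPENMP) && _OPENMP >= 202011
    /* cond-update-stmt of the form if(x ordop expr) { x = expr; } */
    #pragma omp atomic compare
    if (*x < value) { *x = value; }
#else
    /* Fallback for older or absent OpenMP support. */
    #pragma omp critical(atomic_max_fallback)
    {
        if (*x < value)
            *x = value;
    }
#endif
}

/* Returns the largest element of a[0..n-1]; assumes n >= 1. */
static int max_of(const int *a, int n)
{
    int best = a[0];
    #pragma omp parallel for
    for (int i = 1; i < n; ++i)
        atomic_max(&best, a[i]);
    return best;
}
```

A max reduction would usually be the more efficient choice for this particular task; the loop is only meant to exercise the conditional update form.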



Fortran
1 A conditional-update-atomic structured block is conditional-update-statement, a conditional update
2 statement that has one of the following forms:
3 if (x equalop e) then
4 x = d
5 end if

6 or
7 if (x equalop e) x = d

8 read-atomic, write-atomic, update-atomic, and conditional-update-atomic structured blocks are
9 considered strictly structured blocks for the purpose of the atomic construct.
Fortran
10 A capture-atomic structured block can be specified for atomic directives that enforce capture
11 semantics. They are further categorized as write-capture-atomic, update-capture-atomic, and
12 conditional-update-capture-atomic structured blocks, which can be specified for atomic
13 directives that enforce write, update or conditional update atomic semantics in addition to capture
14 semantics.
C / C++
15 A capture-atomic structured block is capture-stmt, a capture statement that has one of the following
16 forms:
17 v = expr-stmt
18 { v = x; expr-stmt }
19 { expr-stmt v = x; }

20 If expr-stmt is write-expr-stmt or expr-stmt is update-expr-stmt as specified above then it is an
21 update-capture-atomic structured block. If expr-stmt is cond-expr-stmt as specified above then it is
22 a conditional-update-capture-atomic structured block. In addition, a
23 conditional-update-capture-atomic structured block can have one of the following forms:
24 { v = x; cond-update-stmt }
25 { cond-update-stmt v = x; }
26 if(x == e) { x = d; } else { v = x; }
27 { r = x == e; if(r) { x = d; } }
28 { r = x == e; if(r) { x = d; } else { v = x; } }
C / C++
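
A capture-stmt of the form { v = x; expr-stmt } might be used as in the following sketch of a ticket dispenser. The names next_ticket_value and take_ticket are assumptions of this example.

```c
/* Hypothetical shared counter for this sketch. */
static int next_ticket_value = 0;

/* Returns a distinct ticket number on each call. */
static int take_ticket(void)
{
    int v;
    /* capture-stmt of the form { v = x; expr-stmt } with the
     * update-expr-stmt x++; v receives the value before the update. */
    #pragma omp atomic capture
    { v = next_ticket_value; next_ticket_value++; }
    return v;
}
```

The shorter statement v = next_ticket_value++; matches the form v = expr-stmt and captures the same value.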



Fortran
1 A capture-atomic structured block has one of the following forms:
2 statement
3 capture-statement

4 or
5 capture-statement
6 statement

7 where capture-statement has the following form:
8 v = x

9 If statement is write-statement as specified above then it is a write-capture-atomic structured block.
10 If statement is update-statement as specified above then it is an update-capture-atomic structured
11 block. If statement is conditional-update-statement as specified above then it is a
12 conditional-update-capture-atomic structured block. In addition, for a
13 conditional-update-capture-atomic structured block, statement can have the following form:
14 x = expr

15 In addition, a conditional-update-capture-atomic structured block can have the following form:
16 if (x equalop e) then
17 x = d
18 else
19 v = x
20 end if

21 All capture-atomic structured blocks are considered loosely structured blocks for the purpose of the
22 atomic construct.
Fortran
23 Restrictions
24 Restrictions to OpenMP atomic structured blocks are as follows:
C / C++
25 • In forms where e is assigned it must be an lvalue.
26 • r must be of integral type.
27 • During the execution of an atomic region, multiple syntactic occurrences of x must designate
28 the same storage location.
29 • During the execution of an atomic region, multiple syntactic occurrences of r must designate
30 the same storage location.



1 • During the execution of an atomic region, multiple syntactic occurrences of expr must evaluate
2 to the same value.
3 • None of v, x, r, d and expr (as applicable) may access the storage location designated by any
4 other symbol in the list.
5 • In forms that capture the original value of x in v, v and e may not refer to, or access, the same
6 storage location.
7 • binop, binop=, ordop, ==, ++, and -- are not overloaded operators.
8 • The expression x binop expr must be numerically equivalent to x binop (expr). This requirement
9 is satisfied if the operators in expr have precedence greater than binop, or by using parentheses
10 around expr or subexpressions of expr.
11 • The expression expr binop x must be numerically equivalent to (expr) binop x. This requirement
12 is satisfied if the operators in expr have precedence equal to or greater than binop, or by using
13 parentheses around expr or subexpressions of expr.
14 • The expression x ordop expr must be numerically equivalent to x ordop (expr). This requirement
15 is satisfied if the operators in expr have precedence greater than ordop, or by using parentheses
16 around expr or subexpressions of expr.
17 • The expression expr ordop x must be numerically equivalent to (expr) ordop x. This requirement
18 is satisfied if the operators in expr have precedence equal to or greater than ordop, or by using
19 parentheses around expr or subexpressions of expr.
20 • The expression x == e must be numerically equivalent to x == (e). This requirement is satisfied
21 if the operators in e have precedence equal to or greater than ==, or by using parentheses around
22 e or subexpressions of e.
C / C++
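
The equivalence requirement on x binop expr can be seen in a small worked case. In this sketch, the variable total and the function subtract_difference are assumed names; the intent is to subtract the difference a - b from total atomically.

```c
/* Hypothetical shared accumulator for this sketch. */
static double total = 0.0;

/* Subtracts (a - b) from total. */
static void subtract_difference(double a, double b)
{
    /* x = x binop expr with binop '-' and expr 'a - b'. The parentheses are
     * needed: without them, total - a - b parses as (total - a) - b, which is
     * not numerically equivalent to total - (a - b). */
    #pragma omp atomic update
    total = total - (a - b);
}
```

For example, with total equal to 10, a equal to 5 and b equal to 2, the parenthesized form yields 7, whereas total - a - b would yield 3.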
Fortran
23 • x must not have the ALLOCATABLE attribute.
24 • During the execution of an atomic region, multiple syntactic occurrences of x must designate
25 the same storage location.
26 • During the execution of an atomic region, multiple syntactic occurrences of r must designate
27 the same storage location.
28 • During the execution of an atomic region, multiple syntactic occurrences of expr must evaluate
29 to the same value.
30 • None of v, expr, and expr-list (as applicable) may access the same storage location as x.
31 • None of x, expr, and expr-list (as applicable) may access the same storage location as v.
32 • In forms that capture the original value of x in v, v may not access the same storage location as e.



1 • If intrinsic-procedure-name refers to IAND, IOR, or IEOR, exactly one expression must appear
2 in expr-list.
3 • The expression x operator expr must be, depending on its type, either mathematically or logically
4 equivalent to x operator (expr). This requirement is satisfied if the operators in expr have
5 precedence greater than operator, or by using parentheses around expr or subexpressions of expr.
6 • The expression expr operator x must be, depending on its type, either mathematically or
7 logically equivalent to (expr) operator x. This requirement is satisfied if the operators in expr
8 have precedence equal to or greater than operator, or by using parentheses around expr or
9 subexpressions of expr.
10 • The expression x equalop e must be, depending on its type, either mathematically or logically
11 equivalent to x equalop (e). This requirement is satisfied if the operators in e have precedence
12 equal to or greater than equalop, or by using parentheses around e or subexpressions of e.
13 • intrinsic-procedure-name must refer to the intrinsic procedure name and not to other program
14 entities.
15 • operator must refer to the intrinsic operator and not to a user-defined operator.
16 • All assignments must be intrinsic assignments.
Fortran
17 Cross References
18 • atomic directive, see Section 15.8.4

19 4.4 Loop Concepts


20 OpenMP semantics frequently involve loops that occur in the base language code. As detailed in
21 this section, OpenMP defines several concepts that facilitate the specification of those semantics
22 and their associated syntax.

23 4.4.1 Canonical Loop Nest Form


24 A loop nest has canonical loop nest form if it conforms to loop-nest in the following grammar:

25 Symbol Meaning

26 loop-nest One of the following:


C / C++
27 for (init-expr; test-expr; incr-expr)
28 loop-body
C / C++



1 or
C++
2 for (range-decl: range-expr)
3 loop-body

4 A range-based for loop is equivalent to a regular for loop using iterators, as
5 defined in the base language. A range-based for loop has no iteration variable.
C++
6 or
Fortran
7 DO [ label ] var = lb , ub [ , incr ]
8 [intervening-code]
9 loop-body
10 [intervening-code]
11 [ label ] END DO

12 If the loop-nest is a nonblock-do-construct, it is treated as a block-do-construct for
13 each DO construct.
14 The value of incr is the increment of the loop. If not specified, its value is assumed to
15 be 1.
Fortran
16 or
17 loop-transformation-construct

18 or
19 generated-canonical-loop

20 loop-body One of the following:


21 loop-nest

22 or
C / C++
23 {
24 [intervening-code]
25 loop-body
26 [intervening-code]
27 }
C / C++
28 or



Fortran
1 BLOCK
2 [intervening-code]
3 loop-body
4 [intervening-code]
5 END BLOCK
Fortran
6 or if none of the previous productions match
7 final-loop-body

8 loop-transformation-construct A loop transformation construct.

9 generated-canonical-loop A generated loop from a loop transformation construct that has canonical loop nest
10 form and for which the loop body matches loop-body.

11 intervening-code A non-empty structured block sequence that does not contain OpenMP directives or
12 calls to the OpenMP runtime API in its corresponding region, referred to as
13 intervening code. If intervening code is present, then a loop at the same depth within
14 the loop nest is not a perfectly nested loop.
C / C++
15 It must not contain iteration statements, continue statements or break statements
16 that apply to the enclosing loop.
C / C++
Fortran
17 It must not contain loops, array expressions, CYCLE statements or EXIT statements.
Fortran

18 final-loop-body A structured block that terminates the scope of loops in the loop nest. If the loop nest
19 is associated with a loop-associated directive, loops in this structured block cannot be
20 associated with that directive.
C / C++

21 init-expr One of the following:


22 var = lb
23 integer-type var = lb



C
1 pointer-type var = lb
C
C++
2 random-access-iterator-type var = lb
C++

3 test-expr One of the following:


4 var relational-op ub
5 ub relational-op var

6 relational-op One of the following:


7 <
8 <=
9 >
10 >=
11 !=

12 incr-expr One of the following:


13 ++var
14 var++
15 --var
16 var--
17 var += incr
18 var -= incr
19 var = var + incr
20 var = incr + var
21 var = var - incr
22 The value of incr, respectively 1 and -1 for the increment and decrement operators, is
23 the increment of the loop.
C / C++

24 var One of the following:


C / C++
25 A variable of a signed or unsigned integer type.
C / C++



C
1 A variable of a pointer type.
C
C++
3 A variable of a random access iterator type.
C++
Fortran
5 A scalar variable of integer type.
Fortran
6 var is the iteration variable of the loop. It must not be modified during the execution
7 of intervening-code or loop-body in the loop.

8 lb, ub One of the following:


9 Expressions of a type compatible with the type of var that are loop invariant with
10 respect to the outermost loop.
11 or
12 One of the following:
13 var-outer
14 var-outer + a2
15 a2 + var-outer
16 var-outer - a2
17 where var-outer is of a type compatible with the type of var.
18 or
19 If var is of an integer type, one of the following:
20 a2 - var-outer
21 a1 * var-outer
22 a1 * var-outer + a2
23 a2 + a1 * var-outer
24 a1 * var-outer - a2
25 a2 - a1 * var-outer
26 var-outer * a1
27 var-outer * a1 + a2
28 a2 + var-outer * a1
29 var-outer * a1 - a2
30 a2 - var-outer * a1



1 where var-outer is of an integer type.
2 lb and ub are loop bounds. A loop for which lb or ub refers to var-outer is a
3 non-rectangular loop. If var is of an integer type, var-outer must be of an integer
4 type with the same signedness and bit precision as the type of var.
5 The coefficient in a loop bound is 0 if the bound does not refer to var-outer. If a loop
6 bound matches a form in which a1 appears, the coefficient is -a1 if the product of
7 var-outer and a1 is subtracted from a2, and otherwise the coefficient is a1. For other
8 matched forms where a1 does not appear, the coefficient is −1 if var-outer is
9 subtracted from a2, and otherwise the coefficient is 1.

10 a1, a2, incr Integer expressions that are loop invariant with respect to the outermost loop of the
11 loop nest.
12 If the loop is associated with a loop-associated directive, the expressions are
13 evaluated before the construct formed from that directive.

14 var-outer The loop iteration variable of a surrounding loop in the loop nest.
C++

15 range-decl A declaration of a variable as defined by the base language for range-based for
16 loops.

17 range-expr An expression that is valid as defined by the base language for range-based for
18 loops. It must be invariant with respect to the outermost loop of the loop nest and the
19 iterator derived from it must be a random access iterator.
C++
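As an illustration of the grammar above, the following C sketch shows two loops that appear to match canonical loop nest form: one with an integer var and a `var += incr` increment, and one with a pointer var. The function names `sum_strided` and `sum_by_pointer` are hypothetical and not part of the OpenMP API. The code is written so that it gives the same result if the pragmas are ignored.

```c
#include <assert.h>

/* Hypothetical helper: integer-type var, test-expr "var < ub",
 * incr-expr "var += incr". lb (0), ub (n) and incr (stride) are
 * loop invariant, as the canonical form requires. */
long sum_strided(const int *a, int n, int stride)
{
    assert(stride > 0);   /* with '<', incr-expr must increase var */
    long sum = 0;
    #pragma omp parallel for reduction(+ : sum)
    for (int i = 0; i < n; i += stride)
        sum += a[i];
    return sum;
}

/* Hypothetical helper: pointer-type var (allowed in C),
 * test-expr "var < ub", incr-expr "++var". */
long sum_by_pointer(const int *a, int n)
{
    assert(n >= 0);
    long sum = 0;
    #pragma omp parallel for reduction(+ : sum)
    for (const int *p = a; p < a + n; ++p)
        sum += *p;
    return sum;
}
```

For example, `sum_strided(a, 6, 2)` visits elements 0, 2 and 4 of `a`.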
20 Restrictions
21 Restrictions to canonical loop nests are as follows:
C / C++
• If test-expr is of the form var relational-op ub and relational-op is < or <= then incr-expr must cause var to increase on each iteration of the loop. If test-expr is of the form var relational-op ub and relational-op is > or >= then incr-expr must cause var to decrease on each iteration of the loop. Increase and decrease are determined by the order that relational-op induces.
• If test-expr is of the form ub relational-op var and relational-op is < or <= then incr-expr must cause var to decrease on each iteration of the loop. If test-expr is of the form ub relational-op var and relational-op is > or >= then incr-expr must cause var to increase on each iteration of the loop. Increase and decrease are determined by the order that relational-op induces.

1 • If relational-op is != then incr-expr must cause var to always increase by 1 or always decrease
2 by 1 and the increment must be a constant expression.
3 • final-loop-body must not contain any break statement that would cause the termination of the
4 innermost loop.
C / C++
Fortran
5 • final-loop-body must not contain any EXIT statement that would cause the termination of the
6 innermost loop.
Fortran
7 • A loop-nest must also be a structured block.
8 • For a non-rectangular loop, if var-outer is referenced in lb and ub then they must both refer to the
9 same iteration variable.
• For a non-rectangular loop, let $a_{lb}$ and $a_{ub}$ be the respective coefficients in lb and ub, $incr_{inner}$ the increment of the non-rectangular loop and $incr_{outer}$ the increment of the loop referenced by var-outer. $incr_{inner} (a_{ub} - a_{lb})$ must be a multiple of $incr_{outer}$.
13 • The loop iteration variable may not appear in a threadprivate directive.
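The multiple-of restriction on non-rectangular loops can be checked with simple arithmetic. The sketch below is only an illustration: `nonrect_increments_ok` is a hypothetical helper, and its parameters stand for the increment of the inner loop, the coefficients of var-outer in lb and ub, and the increment of the outer loop.

```c
#include <assert.h>

/* Hypothetical check of the non-rectangular restriction:
 * incr_inner * (a_ub - a_lb) must be a multiple of incr_outer.
 * Returns 1 if the restriction holds and 0 otherwise. */
int nonrect_increments_ok(long incr_inner, long a_lb, long a_ub,
                          long incr_outer)
{
    assert(incr_outer != 0);   /* a loop increment of zero makes no sense */
    return (incr_inner * (a_ub - a_lb)) % incr_outer == 0;
}
```

For a nest such as `for (i = 0; i < n; i += 2) for (j = i; j < n; j++)`, the coefficient in lb is 1, the coefficient in ub is 0, and the product 1 * (0 - 1) is not a multiple of 2, so the check fails. With an outer increment of 1 the same nest passes.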

14 Cross References
15 • Loop Transformation Constructs, see Chapter 9
16 • threadprivate directive, see Section 5.2

4.4.2 OpenMP Loop-Iteration Spaces and Vectors


18 A loop-associated directive controls some number of the outermost loops of an associated loop
19 nest, called the associated loops, in accordance with its specified clauses. These associated loops
20 and their loop iteration variables form an OpenMP loop-iteration space. OpenMP loop-iteration
21 vectors allow other directives to refer to points in that loop-iteration space.
22 A loop transformation construct that appears inside a loop nest is replaced according to its
23 semantics before any loop can be associated with a loop-associated directive that is applied to the
24 loop nest. The depth of the loop nest is determined according to the loops in the loop nest, after any
25 such replacements have taken place. A loop counts towards the depth of the loop nest if it is a base
26 language loop statement or generated loop and it matches loop-nest while applying the production
27 rules for canonical loop nest form to the loop nest.
28 The canonical loop nest form allows the iteration count of all associated loops to be computed
29 before executing the outermost loop.
30 For any associated loop, the iteration count is computed as follows:

C / C++
1 • If var has a signed integer type and the var operand of test-expr after usual arithmetic
2 conversions has an unsigned integer type then the loop iteration count is computed from lb,
3 test-expr and incr using an unsigned integer type corresponding to the type of var.
4 • Otherwise, if var has an integer type then the loop iteration count is computed from lb, test-expr
5 and incr using the type of var.
C / C++
C
6 • If var has a pointer type then the loop iteration count is computed from lb, test-expr and incr
7 using the type ptrdiff_t.
C
C++
8 • If var has a random access iterator type then the loop iteration count is computed from lb,
9 test-expr and incr using the type
10 std::iterator_traits<random-access-iterator-type>::difference_type.
11 • For range-based for loops, the loop iteration count is computed from range-expr using the type
12 std::iterator_traits<random-access-iterator-type>::difference_type where
13 random-access-iterator-type is the iterator type derived from range-expr.
C++
Fortran
14 • The loop iteration count is computed from lb, ub and incr using the type of var.
Fortran
15 The behavior is unspecified if any intermediate result required to compute the iteration count
16 cannot be represented in the type determined above.
17 No synchronization is implied during the evaluation of the lb, ub, incr or range-expr expressions.
18 Whether, in what order, or how many times any side effects within the lb, ub, incr, or range-expr
19 expressions occur is unspecified.
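To make the iteration count computation concrete, here is a minimal sketch for one common case: an `int` iteration variable, a test of the form `var < ub`, and a positive increment. The helper name `iteration_count_lt` is hypothetical, and an implementation is free to compute the count differently.

```c
#include <assert.h>

/* Hypothetical helper: iteration count of
 *     for (var = lb; var < ub; var += incr)
 * with incr > 0, computed in the type of var (int here). */
int iteration_count_lt(int lb, int ub, int incr)
{
    assert(incr > 0);
    if (lb >= ub)
        return 0;                        /* the loop body never runs */
    return (ub - lb + incr - 1) / incr;  /* ceiling of (ub - lb) / incr */
}
```

The intermediate value `ub - lb + incr - 1` can overflow `int` for extreme bounds, which corresponds to the case that the text above leaves unspecified.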
Let the number of loops associated with a construct be n. The OpenMP loop-iteration space is the n-dimensional space defined by the values of $var_i$, $1 \le i \le n$, the iteration variables of the associated loops, with $i = 1$ referring to the outermost loop of the loop nest. An OpenMP loop-iteration vector, which may be used as an argument of OpenMP directives and clauses, then has the form:
$var_1 [\pm offset_1], var_2 [\pm offset_2], \ldots, var_n [\pm offset_n]$
where $offset_i$ is a compile-time constant non-negative OpenMP integer expression that facilitates identification of relative points in the loop-iteration space.

The iterations of some number of associated loops can be collapsed into one larger iteration space that is called the logical iteration space. The particular integer type used to compute the iteration count for the collapsed loop is implementation defined, but its bit precision must be at least that of the widest type that the implementation would use for the iteration count of each loop if it were the only associated loop. OpenMP defines a special loop-iteration vector, omp_cur_iteration, for which $offset_i = 0$ for all $i$. This loop-iteration vector enables identification of relative points in the logical iteration space as:
omp_cur_iteration [± logical_offset]
where logical_offset is a compile-time constant non-negative OpenMP integer expression.
10 For directives that result in the execution of a collapsed logical iteration space, the number of times
11 that any intervening code between any two loops of the same logical iteration space will be
12 executed is unspecified but will be the same for all intervening code at the same depth, at least once
13 per iteration of the loop that encloses the intervening code and at most once per logical iteration. If
14 the iteration count of any loop is zero and that loop does not enclose the intervening code, the
15 behavior is unspecified.
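One way to picture a logical iteration space is as a single counter that is mapped back to a loop-iteration vector. The sketch below assumes a rectangular two-loop nest with unit increments and zero lower bounds. `logical_to_vector` is a hypothetical helper, and real implementations may use a different mapping.

```c
#include <assert.h>

/* Hypothetical mapping from a logical iteration number k of a collapsed,
 * rectangular two-loop nest
 *     for (i = 0; i < n; i++) for (j = 0; j < m; j++)
 * back to the loop-iteration vector (i, j). */
void logical_to_vector(long k, long m, long *i, long *j)
{
    assert(m > 0 && k >= 0);
    *i = k / m;   /* index of the outer loop */
    *j = k % m;   /* index of the inner loop */
}
```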

4.4.3 collapse Clause


Name: collapse    Properties: unique

Arguments
Name    Type                          Properties
n       expression of integer type    default

20 Directives
21 distribute, do, for, loop, simd, taskloop

22 Semantics
23 The collapse clause associates one or more loops with the directive on which it appears for the
24 purpose of identifying the portion of the depth of the canonical loop nest to which to apply the
25 semantics of the directive. The argument n specifies the number of loops of the associated loop nest
26 to which to apply those semantics. On all directives on which the collapse clause may appear,
27 the effect is as if a value of one was specified for n if the collapse clause is not specified.

28 Restrictions
29 • n must not evaluate to a value greater than the depth of the associated loop nest.
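A short usage sketch of the collapse clause follows. It assumes a perfectly nested, rectangular loop nest of depth two, so that `collapse(2)` does not exceed the depth of the nest. The function name `fill_table` is hypothetical.

```c
#include <assert.h>

/* Hypothetical example: both loops of the nest are associated with the
 * worksharing-loop construct, so their iterations form one logical
 * iteration space that is divided among the threads. */
void fill_table(int rows, int cols, int *table)
{
    assert(rows >= 0 && cols >= 0);
    #pragma omp parallel for collapse(2)
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
            table[r * cols + c] = r * 10 + c;  /* each cell is written once */
}
```

Because every iteration writes a distinct element, the result does not depend on how the logical iterations are distributed.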

1 Cross References
2 • distribute directive, see Section 11.6
3 • do directive, see Section 11.5.2
4 • for directive, see Section 11.5.1
5 • loop directive, see Section 11.7
6 • ordered clause, see Section 4.4.4
7 • simd directive, see Section 10.4
8 • taskloop directive, see Section 12.6

4.4.4 ordered Clause


Name: ordered    Properties: unique
Arguments
Name    Type                          Properties
n       expression of integer type    optional, constant, positive
13 Directives
14 do, for, simd
15 Semantics
The ordered clause associates one or more loops with the directive on which it appears for the purpose of identifying cross-iteration dependences. The argument n specifies the number of loops of the associated loop nest to use for that purpose. If n is not specified then the behavior is as if n is specified with the same value as is specified for the collapse clause on the construct.
20 Restrictions
21 • None of the associated loops may be non-rectangular loops.
22 • The ordered clause must not appear on a worksharing-loop directive if the associated loops
23 include the generated loops of a tile directive.
24 • n must not evaluate to a value greater than the depth of the associated loop nest.
25 • If n is explicitly specified, the associated loops must be perfectly nested.
• If n is explicitly specified and the collapse clause is also specified on the same construct, n must be greater than or equal to the n specified for the collapse clause.
28 • If n is explicitly specified, a linear clause must not be specified on the same directive.
C++
29 • If n is explicitly specified, none of the associated loops may be a range-based for loop.
C++
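The following sketch shows the ordered clause without an argument on a worksharing-loop construct, combined with an ordered region in the loop body. The function name `record_in_order` and the choice of `schedule(static, 1)` are assumptions made for illustration only.

```c
#include <assert.h>

/* Hypothetical example: the ordered region executes in the order of the
 * loop iterations, so slots are filled as in a sequential run. */
int record_in_order(int n, int *slots)
{
    assert(n >= 0);
    int next = 0;                   /* shared: next free slot */
    #pragma omp parallel for ordered schedule(static, 1)
    for (int i = 0; i < n; i++) {
        int value = i * i;          /* may be computed concurrently */
        #pragma omp ordered
        {
            slots[next++] = value;  /* one iteration at a time, in order */
        }
    }
    return next;
}
```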

1 Cross References
2 • collapse clause, see Section 4.4.3
3 • do directive, see Section 11.5.2
4 • for directive, see Section 11.5.1
5 • linear clause, see Section 5.4.6
6 • simd directive, see Section 10.4
7 • tile directive, see Section 9.1

4.4.5 Consistent Loop Schedules


For two constructs formed from loop-associated directives that have consistent schedules, the implementation will guarantee that memory effects of a logical iteration in the first loop nest happen before the execution of the same logical iteration in the second loop nest.
12 Two constructs formed from loop-associated directives have consistent schedules if all of the
13 following conditions hold:
14 • The constructs have the same directive-name;
15 • The regions that correspond to the two constructs have the same binding region;
16 • The constructs have the same reproducible schedule;
17 • The associated loop nests have identical logical iteration vector spaces; and
18 • The associated loop nests are either both rectangular or both non-rectangular.
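The sketch below shows the kind of code that relies on consistent schedules. It assumes that `schedule(static)` counts as a reproducible schedule for the implementation in use. The two loops have the same directive name, the same binding parallel region and identical iteration spaces. The function name `scale_then_shift` is hypothetical.

```c
#include <assert.h>

/* Hypothetical example: under the assumptions stated above, iteration i
 * of the second loop observes the write made by iteration i of the first
 * loop, even though nowait removes the barrier between the loops. */
void scale_then_shift(int n, const int *a, int *b, int *c)
{
    assert(n >= 0);
    #pragma omp parallel
    {
        #pragma omp for schedule(static) nowait
        for (int i = 0; i < n; i++)
            b[i] = 2 * a[i];

        #pragma omp for schedule(static)
        for (int i = 0; i < n; i++)
            c[i] = b[i] + 1;   /* reads only the b[i] of the same i */
    }
}
```

If the second loop read `b[i + 1]` instead, the guarantee would not cover that access, and the nowait clause would have to be dropped.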

5 Data Environment
2 This chapter presents directives and clauses for controlling data environments. These clauses and
3 directives include the data-environment attribute clauses, which explicitly determine the attributes
4 of list items specified in a list parameter. The data-environment attribute clauses form a general
5 clause set for which certain restrictions apply to their use on directives that accept any members of
6 the set. In addition, these clauses are divided into two subsets that also form general clause sets:
7 data-sharing attribute clauses and data-mapping attribute clauses. Data-sharing attribute clauses
8 control the data-sharing attributes of variables in a construct, indicating whether a variable is
9 shared or private in the outermost scope of the construct. Data-mapping attribute clauses control
10 the data-mapping attributes of variables in a data environment, indicating whether a variable is
11 mapped from the data environment to another device data environment. Additional restrictions
12 apply to the use of these sets on directives that accept any members of them.

5.1 Data-Sharing Attribute Rules


14 This section describes how the data-sharing attributes of variables referenced in data environments
15 are determined. The following two cases are described separately:
16 • Section 5.1.1 describes the data-sharing attribute rules for variables referenced in a construct.
17 • Section 5.1.2 describes the data-sharing attribute rules for variables referenced in a region, but
18 outside any construct.

5.1.1 Variables Referenced in a Construct


20 The data-sharing attributes of variables that are referenced in a construct can be predetermined,
21 explicitly determined, or implicitly determined, according to the rules outlined in this section.
22 Specifying a variable in a copyprivate clause or a data-sharing attribute clause other than the
23 private clause on an enclosed construct causes an implicit reference to the variable in the
24 enclosing construct. Specifying a variable in a map clause of an enclosed construct may cause an
25 implicit reference to the variable in the enclosing construct. Such implicit references are also
26 subject to the data-sharing attribute rules outlined in this section.
Fortran
27 A type parameter inquiry or complex part designator that is referenced in a construct is treated as if
28 its designator is referenced.
Fortran

1 Certain variables and objects have predetermined data-sharing attributes for the construct in which
2 they are referenced. The first matching rule from the following list of predetermined data-sharing
3 attribute rules applies for variables and objects that are referenced in a construct.
Fortran
4 • Variables declared within a BLOCK construct inside a construct that do not have the SAVE
5 attribute are private.
Fortran
6 • Variables and common blocks (in Fortran) that appear as arguments in threadprivate
7 directives or variables with the _Thread_local (in C) or thread_local (in C++)
8 storage-class specifier are threadprivate.
C
9 • Variables with automatic storage duration that are declared in a scope inside the construct are
10 private.
C
C++
11 • Variables of non-reference type with automatic storage duration that are declared in a scope
12 inside the construct are private.
C++
C / C++
13 • Objects with dynamic storage duration are shared.
C / C++
14 • The loop iteration variable in the associated loop of a simd construct with just one associated
15 loop is linear with a linear-step that is the increment of the associated loop.
16 • The loop iteration variables in the associated loops of a simd construct with multiple associated
17 loops are lastprivate.
18 • The loop iteration variable in any associated loop of a loop construct is lastprivate.
19 • The loop iteration variable in any associated loop of a loop-associated construct is otherwise
20 private.
C++
21 • The implicitly declared variables of a range-based for loop are private.
C++
Fortran
22 • Loop iteration variables inside parallel, teams, or task generating constructs are private in
23 the innermost such construct that encloses the loop.
24 • Implied-do, FORALL and DO CONCURRENT indices are private.
Fortran

C / C++
1 • Variables with static storage duration that are declared in a scope inside the construct are shared.
2 • If a list item in a map clause on the target construct has a base pointer, and the base pointer is
3 a scalar variable that does not appear in a map clause on the construct, the base pointer is
4 firstprivate.
5 • If a list item in a reduction or in_reduction clause on the construct has a base pointer
6 then the base pointer is private.
7 • Static data members are shared.
8 • The __func__ variable and similar function-local predefined variables are shared.
C / C++
Fortran
9 • Cray pointees have the same data-sharing attribute as the storage with which their Cray pointers
10 are associated. Cray pointer support has been deprecated.
11 • Assumed-size arrays and named constants are shared.
12 • An associate name that may appear in a variable definition context is shared if its association
13 occurs outside of the construct and otherwise it has the same data-sharing attribute as the
14 selector with which it is associated.
Fortran
15 Variables with predetermined data-sharing attributes may not be listed in data-sharing attribute
16 clauses, except for the cases listed below. For these exceptions only, listing a predetermined
17 variable in a data-sharing attribute clause is allowed and overrides the variable’s predetermined
18 data-sharing attributes.
19 • The loop iteration variable in any associated loop of a loop-associated construct may be listed in
20 a private or lastprivate clause.
21 • If a simd construct has just one associated loop then its loop iteration variable may be listed in a
22 linear clause with a linear-step that is the increment of the associated loop.
C / C++
23 • Variables with const-qualified type with no mutable members may be listed in a
24 firstprivate clause, even if they are static data members.
25 • The __func__ variable and similar function-local predefined variables may be listed in a
26 shared or firstprivate clause.
C / C++

Fortran
1 • Loop iteration variables of loops that are not associated with any OpenMP directive may be
2 listed in data-sharing attribute clauses on the surrounding teams, parallel or task generating
3 construct, and on enclosed constructs, subject to other restrictions.
4 • Assumed-size arrays may be listed in a shared clause.
5 • Named constants may be listed in a shared or firstprivate clause.
Fortran
6 Additional restrictions on the variables that may appear in individual clauses are described with
7 each clause in Section 5.4.
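One of the exceptions above allows the loop iteration variable of an associated loop to be listed in a lastprivate clause. The sketch below illustrates that case. The function name `last_index_after_loop` is hypothetical, and the example is restricted to `n > 0` so that a sequentially last iteration exists.

```c
#include <assert.h>

/* Hypothetical example: the loop iteration variable i is predetermined
 * private for the construct, and listing it in lastprivate overrides
 * that attribute so the original i is updated when the loop ends. */
int last_index_after_loop(int n, int *out)
{
    assert(n > 0);
    int i = 0;
    #pragma omp parallel for lastprivate(i)
    for (i = 0; i < n; i++)
        out[i] = i;
    return i;
}
```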
8 Variables with explicitly determined data-sharing attributes are those that are referenced in a given
9 construct and are listed in a data-sharing attribute clause on the construct.
10 Variables with implicitly determined data-sharing attributes are those that are referenced in a given
11 construct and do not have predetermined or explicitly determined data-sharing attributes in that
12 construct.
13 Rules for variables with implicitly determined data-sharing attributes are as follows:
14 • In a parallel, teams, or task generating construct, the data-sharing attributes of these
15 variables are determined by the default clause, if present (see Section 5.4.1).
16 • In a parallel construct, if no default clause is present, these variables are shared.
17 • For constructs other than task generating constructs, if no default clause is present, these
18 variables reference the variables with the same names that exist in the enclosing context.
19 • In a target construct, variables that are not mapped after applying data-mapping attribute
20 rules (see Section 5.8) are firstprivate.
C++
21 • In an orphaned task generating construct, if no default clause is present, formal arguments
22 passed by reference are firstprivate.
C++
Fortran
23 • In an orphaned task generating construct, if no default clause is present, dummy arguments
24 are firstprivate.
Fortran

1 • In a task generating construct, if no default clause is present, a variable for which the
2 data-sharing attribute is not determined by the rules above and that in the enclosing context is
3 determined to be shared by all implicit tasks bound to the current team is shared.
4 • In a task generating construct, if no default clause is present, a variable for which the
5 data-sharing attribute is not determined by the rules above is firstprivate.
6 A program is non-conforming if a variable in a task generating construct is implicitly determined to
7 be firstprivate according to the above rules but is not permitted to appear in a firstprivate
8 clause according to the restrictions specified in Section 5.4.4.
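The implicit rules above can be seen in a small tasking sketch. No default clause is used, so `n` and `out` are shared in the parallel construct, while the loop variable declared inside the construct is private there and therefore implicitly firstprivate in each task. The function name `squares_with_tasks` is hypothetical.

```c
#include <assert.h>

/* Hypothetical example of implicitly determined data-sharing attributes. */
void squares_with_tasks(int n, int *out)
{
    assert(n >= 0);
    /* n and out: no default clause on parallel, so they are shared. */
    #pragma omp parallel
    #pragma omp single
    for (int i = 0; i < n; i++) {
        /* i is private in the enclosing context, so each task gets its
         * own firstprivate copy holding the value at task creation. */
        #pragma omp task
        out[i] = i * i;
    }
}
```

All tasks are complete once the parallel region ends, so the caller can read `out` immediately after the call returns.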

5.1.2 Variables Referenced in a Region but not in a Construct
11 The data-sharing attributes of variables that are referenced in a region, but not in the corresponding
12 construct, are determined as follows:
C / C++
13 • Variables with static storage duration that are declared in called routines in the region are shared.
14 • File-scope or namespace-scope variables referenced in called routines in the region are shared
15 unless they appear as arguments in a threadprivate directive.
16 • Objects with dynamic storage duration are shared.
17 • Static data members are shared unless they appear as arguments in a threadprivate
18 directive.
19 • In C++, formal arguments of called routines in the region that are passed by reference have the
20 same data-sharing attributes as the associated actual arguments.
21 • Other variables declared in called routines in the region are private.
C / C++
Fortran
22 • Local variables declared in called routines in the region and that have the SAVE attribute, or that
23 are data initialized, are shared unless they appear as arguments in a threadprivate directive.
24 • Variables belonging to common blocks, or accessed by host or use association, and referenced in
25 called routines in the region are shared unless they appear as arguments in a threadprivate
26 directive.
27 • Dummy arguments of called routines in the region that have the VALUE attribute are private.
28 • A dummy argument of a called routine in the region that does not have the VALUE attribute is
29 private if the associated actual argument is not shared.

1 • A dummy argument of a called routine in the region that does not have the VALUE attribute is
2 shared if the actual argument is shared and it is a scalar variable, structure, an array that is not a
3 pointer or assumed-shape array, or a simply contiguous array section. Otherwise, the
4 data-sharing attribute of the dummy argument is implementation defined if the associated actual
5 argument is shared.
6 • Cray pointees have the same data-sharing attribute as the storage with which their Cray pointers
7 are associated. Cray pointer support has been deprecated.
8 • Implied-do indices, DO CONCURRENT indices, FORALL indices, and other local variables
9 declared in called routines in the region are private.
Fortran
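For the C rules in this section, the sketch below calls a routine from inside a parallel region. The static local variable of the routine is shared by all threads, so it is updated atomically, while the automatic local variable is private to each call. Both function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical routine that is called from a parallel region. */
static int square_and_count(int x, int *call_number)
{
    static int calls = 0;   /* static storage duration: shared */
    int mine;               /* automatic: private to this call */
    #pragma omp atomic capture
    mine = ++calls;
    *call_number = mine;
    return x * x;
}

/* Hypothetical caller: returns the highest call number seen so far and
 * stores the sum of squares of 0 .. n-1 in *sum_out. */
int count_calls(int n, long *sum_out)
{
    assert(n >= 0);
    int max_call = 0;
    long sum = 0;
    #pragma omp parallel for reduction(+ : sum) reduction(max : max_call)
    for (int i = 0; i < n; i++) {
        int call_number;    /* declared inside the construct: private */
        sum += square_and_count(i, &call_number);
        if (call_number > max_call)
            max_call = call_number;
    }
    *sum_out = sum;
    return max_call;
}
```

Because `calls` is a single shared object that keeps its value between invocations, a second call to `count_calls` continues counting where the first one stopped.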

5.2 threadprivate Directive


Name: threadprivate    Association: none
Category: declarative    Properties: default

Arguments
threadprivate(list)
Name    Type                               Properties
list    list of variable list item type    default

15 Semantics
16 The threadprivate directive specifies that variables are replicated, with each thread having its
17 own copy. Unless otherwise specified, each copy of a threadprivate variable is initialized once, in
18 the manner specified by the program, but at an unspecified point in the program prior to the first
19 reference to that copy. The storage of all copies of a threadprivate variable is freed according to
20 how static variables are handled in the base language, but at an unspecified point in the program.
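A minimal C sketch of the replication described above: each thread counts the iterations that it executes in its own copy of a file-scope variable, and the copies are then combined. The names `tp_hits` and `count_iterations` are hypothetical. The code gives the same total if the pragmas are ignored.

```c
#include <assert.h>

static int tp_hits = 0;   /* hypothetical per-thread counter */
#pragma omp threadprivate(tp_hits)

/* Hypothetical example: returns the number of loop iterations executed
 * by all threads together, which is n. */
int count_iterations(int n)
{
    assert(n >= 0);
    int total = 0;        /* shared accumulator */
    #pragma omp parallel
    {
        tp_hits = 0;      /* each thread resets its own copy */
        #pragma omp for
        for (int i = 0; i < n; i++)
            tp_hits++;    /* no data race: the copy belongs to this thread */
        #pragma omp atomic
        total += tp_hits;
    }
    return total;
}
```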
C++
21 Each copy of a block-scope threadprivate variable that has a dynamic initializer is initialized the
22 first time its thread encounters its definition; if its thread does not encounter its definition, its
23 initialization is unspecified.
C++
24 The content of a threadprivate variable can change across a task scheduling point if the executing
25 thread switches to another task that modifies the variable. For more details on task scheduling, see
26 Section 1.3 and Chapter 12.
27 In parallel regions, references by the primary thread are to the copy of the variable in the thread
28 that encountered the parallel region.
During a sequential part, references are to the initial thread’s copy of the variable. The values of data in the initial thread’s copy of a threadprivate variable are guaranteed to persist between any two consecutive references to the variable in the program, provided that no teams construct that is not nested inside of a target construct is encountered between the references and that the initial thread is not executing code inside of a teams region. For initial threads that are executing code inside of a teams region, the values of data in the copies of a threadprivate variable of those initial threads are guaranteed to persist between any two consecutive references to the variable inside that teams region.
7 The values of data in the threadprivate variables of threads that are not initial threads are
8 guaranteed to persist between two consecutive active parallel regions only if all of the
9 following conditions hold:
10 • Neither parallel region is nested inside another explicit parallel region;
11 • The sizes of the thread teams used to execute both parallel regions are the same;
12 • The thread affinity policies used to execute both parallel regions are the same;
13 • The value of the dyn-var internal control variable in the enclosing task region is false at entry to
14 both parallel regions;
15 • No teams construct that is not nested inside of a target construct is encountered between the
16 parallel regions;
17 • No construct with an order clause that specifies concurrent is encountered between the
18 parallel regions; and
19 • Neither the omp_pause_resource nor omp_pause_resource_all routine is called.
20 If these conditions all hold, and if a threadprivate variable is referenced in both regions, then threads
21 with the same thread number in their respective regions reference the same copy of that variable.
C / C++
22 If the above conditions hold, the storage duration, lifetime, and value of a thread’s copy of a
23 threadprivate variable that does not appear in any copyin clause on the corresponding construct
24 of the second region spans the two consecutive active parallel regions. Otherwise, the storage
25 duration, lifetime, and value of a thread’s copy of the variable in the second region is unspecified.
C / C++
Fortran
If the above conditions hold, the definition, association, or allocation status of a thread’s copy of a threadprivate variable or a variable in a threadprivate common block that is not affected by any copyin clause that appears on the corresponding construct of the second region (a variable is affected by a copyin clause if the variable appears in the copyin clause or it is in a common block that appears in the copyin clause) spans the two consecutive active parallel regions. Otherwise, the definition and association status of a thread’s copy of the variable in the second region are undefined, and the allocation status of an allocatable variable is implementation defined.
If a threadprivate variable or a variable in a threadprivate common block is not affected by any copyin clause that appears on the corresponding construct of the first parallel region in which it is referenced, the thread’s copy of the variable inherits the declared type parameter and the default parameter values from the original variable. The variable or any subobject of the variable is initially defined or undefined according to the following rules:
4 • If it has the ALLOCATABLE attribute, each copy created has an initial allocation status of
5 unallocated;
6 • If it has the POINTER attribute, each copy has the same association status as the initial
7 association status.
8 • If it does not have either the POINTER or the ALLOCATABLE attribute:
9 – If it is initially defined, either through explicit initialization or default initialization, each copy
10 created is so defined;
11 – Otherwise, each copy created is undefined.
Fortran
C++
12 The order in which any constructors for different threadprivate variables of class type are called is
13 unspecified. The order in which any destructors for different threadprivate variables of class type
14 are called is unspecified.
C++
15 Restrictions
16 Restrictions to the threadprivate directive are as follows:
17 • A thread must not reference another thread’s copy of a threadprivate variable.
18 • A threadprivate variable must not appear as the base variable of a list item in any clause except
19 for the copyin and copyprivate clauses.
20 • A program in which an untied task accesses threadprivate storage is non-conforming.
C / C++
21 • Each list item must be a file-scope, namespace-scope, or static block-scope variable.
22 • No list item may have an incomplete type.
23 • The address of a threadprivate variable must not be an address constant.
24 • If the value of a variable referenced in an explicit initializer of a threadprivate variable is
25 modified prior to the first reference to any instance of the threadprivate variable, the behavior is
26 unspecified.
27 • A variable that is part of another variable (as an array element or a structure element) cannot
28 appear in a threadprivate directive unless it is a static data member of a C++ class.
29 • A threadprivate directive for file-scope variables must appear outside any definition or
30 declaration, and must lexically precede all references to any of the variables in its list.

1 • A threadprivate directive for namespace-scope variables must appear outside any
2 definition or declaration other than the namespace definition itself and must lexically precede all
3 references to any of the variables in its list.
4 • Each variable in the list of a threadprivate directive at file, namespace, or class scope must
5 refer to a variable declaration at file, namespace, or class scope that lexically precedes the
6 directive.
7 • A threadprivate directive for static block-scope variables must appear in the scope of the
8 variable and not in a nested scope. The directive must lexically precede all references to any of
9 the variables in its list.
10 • Each variable in the list of a threadprivate directive in block scope must refer to a variable
11 declaration in the same scope that lexically precedes the directive. The variable must have static
12 storage duration.
13 • If a variable is specified in a threadprivate directive in one translation unit, it must be
14 specified in a threadprivate directive in every translation unit in which it is declared.
C / C++
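The placement restrictions for static block-scope variables can be illustrated as follows. The directive appears in the same scope as the declaration, after it and before the first reference. The function name `next_local_id` is hypothetical.

```c
#include <assert.h>

/* Hypothetical example: a per-thread identifier generator that is based
 * on a static block-scope threadprivate variable. */
int next_local_id(void)
{
    static int counter = 0;   /* static block-scope variable */
    /* Same scope as the declaration, and before any reference. */
    #pragma omp threadprivate(counter)
    return ++counter;
}
```

When called repeatedly from the initial thread in a sequential part, the function yields 1, 2, 3 and so on, because the initial thread keeps referring to its own copy.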
C++
15 • A threadprivate directive for static class member variables must appear in the class
16 definition, in the same scope in which the member variables are declared, and must lexically
17 precede all references to any of the variables in its list.
18 • A threadprivate variable must not have an incomplete type or a reference type.
19 • A threadprivate variable with class type must have:
20 – An accessible, unambiguous default constructor in the case of default initialization without a
21 given initializer;
22 – An accessible, unambiguous constructor that accepts the given argument in the case of direct
23 initialization; and
24 – An accessible, unambiguous copy constructor in the case of copy initialization with an explicit
25 initializer.
C++
Fortran
26 • Each list item must be a named variable or a named common block; a named common block
27 must appear between slashes.
• The list argument must not include any coarrays or associate names.
29 • The threadprivate directive must appear in the declaration section of a scoping unit in
30 which the common block or variable is declared.



1 • If a threadprivate directive that specifies a common block name appears in one program
2 unit, then such a directive must also appear in every other program unit that contains a COMMON
3 statement that specifies the same name. It must appear after the last such COMMON statement in
4 the program unit.
5 • If a threadprivate variable or a threadprivate common block is declared with the BIND attribute,
6 the corresponding C entities must also be specified in a threadprivate directive in the C
7 program.
8 • A variable may only appear as an argument in a threadprivate directive in the scope in
9 which it is declared. It must not be an element of a common block or appear in an
10 EQUIVALENCE statement.
11 • A variable that appears as an argument in a threadprivate directive must be declared in the
12 scope of a module or have the SAVE attribute, either explicitly or implicitly.
13 • The effect of an access to a threadprivate variable in a DO CONCURRENT construct is unspecified.
Fortran
14 Cross References
15 • Determining the Number of Threads for a parallel Region, see Section 10.1.1
16 • copyin clause, see Section 5.7.1
17 • dyn-var ICV, see Table 2.1
18 • order clause, see Section 10.3

19 5.3 List Item Privatization


20 Some data-sharing attribute clauses, including reduction clauses, specify that list items that appear
21 in their list argument may be privatized for the construct on which they appear. Each task that
22 references a privatized list item in any statement in the construct receives at least one new list item
23 if the construct has one or more associated loops, and otherwise each such task receives one new
24 list item. Each SIMD lane used in a simd construct that references a privatized list item in any
25 statement in the construct receives at least one new list item. Language-specific attributes for new
26 list items are derived from the corresponding original list item. Inside the construct, all references to
27 the original list item are replaced by references to a new list item received by the task or SIMD lane.
28 If the construct has one or more associated loops then, within the same logical iteration of the
29 loops, the same new list item replaces all references to the original list item. For any two logical
30 iterations, if the references to the original list item are replaced by the same list item then the logical
31 iterations must execute in some sequential order.
32 In the rest of the region, whether references are to a new list item or the original list item is
33 unspecified. Therefore, if an attempt is made to reference the original item, its value after the



1 region is also unspecified. If a task or a SIMD lane does not reference a privatized list item,
2 whether the task or SIMD lane receives a new list item is unspecified.
3 The value and/or allocation status of the original list item will change only:
4 • If accessed and modified via a pointer;
5 • If possibly accessed in the region but outside of the construct;
6 • As a side effect of directives or clauses; or
Fortran
7 • If accessed and modified via construct association.
Fortran
C++
8 If the construct is contained in a member function, whether accesses anywhere in the region
9 through the implicit this pointer refer to the new list item or the original list item is unspecified.
C++
C / C++
10 A new list item of the same type, with automatic storage duration, is allocated for the construct.
11 The storage and thus lifetime of these list items last until the block in which they are created exits.
12 The size and alignment of the new list item are determined by the type of the variable. This
13 allocation occurs once for each task generated by the construct and once for each SIMD lane used
14 by the construct.
15 The new list item is initialized, or has an undefined initial value, as if it had been locally declared
16 without an initializer.
C / C++
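Because the new list item behaves as if it were declared locally without an initializer, code inside the construct should assign to it before reading it. A minimal C sketch, assuming a hypothetical helper named sum_of_squares:

```c
/* Hypothetical helper: sums i*i for i in [0, n). */
long sum_of_squares(int n)
{
    long total = 0;
    int sq = -1;   /* original list item; its value is not copied into the new list items */

    #pragma omp parallel for private(sq) reduction(+:total)
    for (int i = 0; i < n; i++) {
        sq = i * i;    /* assign first: the private copy starts with an undefined value */
        total += sq;
    }
    return total;
}
```

The function does not read sq after the loop, so it does not depend on which storage a later reference would see.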
C++
17 If the type of a list item is a reference to a type T then the type will be considered to be T for all
18 purposes of the clause.
19 The order in which any default constructors for different private variables of class type are called is
20 unspecified. The order in which any destructors for different private variables of class type are
21 called is unspecified.
C++
Fortran
22 If any statement of the construct references a list item, a new list item of the same type and type
23 parameters is allocated. This allocation occurs once for each task generated by the construct and
24 once for each SIMD lane used by the construct. If the type of the list item has default initialization,
25 the new list item has default initialization. Otherwise, the initial value of the new list item is
26 undefined. The initial status of a private pointer is undefined.



1 For a list item or the subobject of a list item with the ALLOCATABLE attribute:
2 • If the allocation status is unallocated, the new list item or the subobject of the new list item will
3 have an initial allocation status of unallocated;
4 • If the allocation status is allocated, the new list item or the subobject of the new list item will
5 have an initial allocation status of allocated; and
6 • If the new list item or the subobject of the new list item is an array, its bounds will be the same as
7 those of the original list item or the subobject of the original list item.
8 A privatized list item may be storage-associated with other variables when the data-sharing
9 attribute clause is encountered. Storage association may exist because of base language constructs
10 such as EQUIVALENCE or COMMON. If A is a variable that is privatized by a construct and B is a
11 variable that is storage-associated with A then:
12 • The contents, allocation, and association status of B are undefined on entry to the region;
13 • Any definition of A, or of its allocation or association status, causes the contents, allocation, and
14 association status of B to become undefined; and
15 • Any definition of B, or of its allocation or association status, causes the contents, allocation, and
16 association status of A to become undefined.
17 A privatized list item may be a selector of an ASSOCIATE or SELECT TYPE construct. If the
18 construct association is established prior to a parallel region, the association between the
19 associate name and the original list item will be retained in the region.
20 Finalization of a list item of a finalizable type or subobjects of a list item of a finalizable type
21 occurs at the end of the region. The order in which any final subroutines for different variables of a
22 finalizable type are called is unspecified.
Fortran
23 If a list item appears in both firstprivate and lastprivate clauses, the update required
24 for the lastprivate clause occurs after all initializations for the firstprivate clause.

25 Restrictions
26 The following restrictions apply to any list item that is privatized unless otherwise stated for a given
27 data-sharing attribute clause:
C++
28 • A variable of class type (or array thereof) that is privatized requires an accessible, unambiguous
29 default constructor for the class type.
C++



C / C++
1 • A variable that is privatized must not have a const-qualified type unless it is of class type with
2 a mutable member. This restriction does not apply to the firstprivate clause.
3 • A variable that is privatized must not have an incomplete type or be a reference to an incomplete
4 type.
C / C++
Fortran
5 • Variables that appear in namelist statements, in variable format expressions, and in expressions
6 for statement function definitions, must not be privatized.
7 • Pointers with the INTENT(IN) attribute must not be privatized. This restriction does not apply
8 to the firstprivate clause.
9 • A private variable must not be coindexed or appear as an actual argument to a procedure where
10 the corresponding dummy argument is a coarray.
11 • Assumed-size arrays must not be privatized in a target, teams, or distribute construct.
Fortran

12 5.4 Data-Sharing Attribute Clauses


13 Several constructs accept clauses that allow a user to control the data-sharing attributes of variables
14 referenced in the construct. Not all of the clauses listed in this section are valid on all directives.
15 The set of clauses that is valid on a particular directive is described with the directive. The
16 reduction data-sharing attribute clauses are explained in Section 5.5.
17 A list item may be specified in both firstprivate and lastprivate clauses.
C++
18 If a variable referenced in a data-sharing attribute clause has a type derived from a template and the
19 program does not otherwise reference that variable, any behavior related to that variable is
20 unspecified.
C++
Fortran
21 If individual members of a common block appear in a data-sharing attribute clause other than the
22 shared clause, the variables no longer have a Fortran storage association with the common block.
Fortran



1 5.4.1 default Clause
2 Name: default Properties: unique

3 Arguments
4 Name                    Type                                          Properties
  data-sharing-attribute  Keyword: firstprivate, none, private, shared  default

5 Directives
6 parallel, task, taskloop, teams

7 Semantics
8 The default clause determines the implicit data-sharing attribute of certain variables that are
9 referenced in the construct, in accordance with the rules given in Section 5.1.1.
10 If data-sharing-attribute is not none, the data-sharing attribute of all variables referenced in the
11 construct that have implicitly determined data-sharing attributes will be data-sharing-attribute. If
12 data-sharing-attribute is none, the data-sharing attribute is not implicitly determined.

13 Restrictions
14 Restrictions to the default clause are as follows:
15 • If data-sharing-attribute is none, each variable that is referenced in the construct and does not
16 have a predetermined data-sharing attribute must have its data-sharing attribute explicitly
17 determined by being listed in a data-sharing attribute clause.
C / C++
18 • If data-sharing-attribute is firstprivate or private, each variable with static storage
19 duration that is declared in a namespace or global scope, is referenced in the construct, and does
20 not have a predetermined data-sharing attribute must have its data-sharing attribute explicitly
21 determined by being listed in a data-sharing attribute clause.
C / C++
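As an illustration of the first restriction, the following C sketch uses default(none) and therefore lists every referenced variable that lacks a predetermined attribute. The function name scale and its parameters are hypothetical.

```c
/* Hypothetical helper: multiplies each of the n elements of v by factor. */
void scale(double *v, int n, double factor)
{
    /* With default(none), v, n, and factor must each appear in a
       data-sharing attribute clause. The loop variable i is declared
       inside the construct, so its attribute is predetermined. */
    #pragma omp parallel for default(none) shared(v) firstprivate(n, factor)
    for (int i = 0; i < n; i++) {
        v[i] *= factor;
    }
}
```

Removing any of the three variables from the clauses would make the program non-conforming, which is why default(none) is often used to catch accidental sharing.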
22 Cross References
23 • parallel directive, see Section 10.1
24 • task directive, see Section 12.5
25 • taskloop directive, see Section 12.6
26 • teams directive, see Section 10.2



1 5.4.2 shared Clause
2 Name: shared    Properties: data-environment attribute, data-sharing attribute

3 Arguments
4 Name  Type                             Properties
  list  list of variable list item type  default

5 Directives
6 parallel, task, taskloop, teams

7 Semantics
8 The shared clause declares one or more list items to be shared by tasks generated by the construct
9 on which it appears. All references to a list item within a task refer to the storage area of the
10 original variable at the point the directive was encountered.
11 The programmer must ensure, by adding proper synchronization, that storage shared by an explicit
12 task region does not reach the end of its lifetime before the explicit task region completes its
13 execution.
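The lifetime requirement can be sketched in C as follows. This is one possible way to satisfy it, using taskwait; the names compute_in_task, partial, and answer are hypothetical.

```c
/* Hypothetical helper: doubles input inside an explicit task and adds one. */
int compute_in_task(int input)
{
    int answer = 0;   /* shared in the parallel region by default */

    #pragma omp parallel
    #pragma omp single
    {
        int partial = 0;   /* automatic storage: lives only until this block exits */

        #pragma omp task shared(partial) firstprivate(input)
        partial = input * 2;

        /* Wait for the task so that it completes before partial's
           lifetime ends at the closing brace below. */
        #pragma omp taskwait

        answer = partial + 1;
    }
    return answer;
}
```

Without the taskwait, the block could exit while the task still refers to partial, and the read of partial would race with the task's write.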
Fortran
14 The association status of a shared pointer becomes undefined upon entry to and exit from the
15 construct if it is associated with a target or a subobject of a target that appears as a privatized list
16 item in a data-sharing attribute clause on the construct. A reference to the shared storage that is
17 associated with the dummy argument by any other task must be synchronized with the reference to
18 the procedure to avoid possible data races.
Fortran
19 Cross References
20 • parallel directive, see Section 10.1
21 • task directive, see Section 12.5
22 • taskloop directive, see Section 12.6
23 • teams directive, see Section 10.2



1 5.4.3 private Clause
2 Name: private    Properties: data-environment attribute, data-sharing attribute, privatization

3 Arguments
4 Name  Type                             Properties
  list  list of variable list item type  default

5 Directives
6 distribute, do, for, loop, parallel, scope, sections, simd, single, target,
7 task, taskloop, teams

8 Semantics
9 The private clause specifies that its list items are to be privatized according to Section 5.3. Each
10 task or SIMD lane that references a list item in the construct receives only one new list item, unless
11 the construct has one or more associated loops and an order clause that specifies concurrent
12 is also present.

13 Restrictions
14 Restrictions to the private clause are as specified in Section 5.3.

15 Cross References
16 • List Item Privatization, see Section 5.3
17 • distribute directive, see Section 11.6
18 • do directive, see Section 11.5.2
19 • for directive, see Section 11.5.1
20 • loop directive, see Section 11.7
21 • parallel directive, see Section 10.1
22 • scope directive, see Section 11.2
23 • sections directive, see Section 11.3
24 • simd directive, see Section 10.4
25 • single directive, see Section 11.1
26 • target directive, see Section 13.8
27 • task directive, see Section 12.5
28 • taskloop directive, see Section 12.6
29 • teams directive, see Section 10.2



1 5.4.4 firstprivate Clause
2 Name: firstprivate    Properties: data-environment attribute, data-sharing attribute, privatization

3 Arguments
4 Name  Type                             Properties
  list  list of variable list item type  default

5 Directives
6 distribute, do, for, parallel, scope, sections, single, target, task,
7 taskloop, teams

8 Semantics
9 The firstprivate clause provides a superset of the functionality provided by the private
10 clause. A list item that appears in a firstprivate clause is subject to the private clause
11 semantics described in Section 5.4.3, except as noted. In addition, the new list item is initialized
12 from the original list item that exists before the construct. The initialization of the new list item is
13 done once for each task that references the list item in any statement in the construct. The
14 initialization is done prior to the execution of the construct.
15 For a firstprivate clause on a construct that is not a work-distribution construct, the initial
16 value of the new list item is the value of the original list item that exists immediately prior to the
17 construct in the task region where the construct is encountered unless otherwise specified. For a
18 firstprivate clause on a work-distribution construct, the initial value of the new list item for
19 each implicit task of the threads that execute the construct is the value of the original list item that
20 exists in the implicit task immediately prior to the point in time that the construct is encountered
21 unless otherwise specified.
22 To avoid data races, concurrent updates of the original list item must be synchronized with the read
23 of the original list item that occurs as a result of the firstprivate clause.
C / C++
24 For variables of non-array type, the initialization occurs by copy assignment. For an array of
25 elements of non-array type, each element is initialized as if by assignment from an element of the
26 original array to the corresponding element of the new array.
C / C++
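The element-wise initialization of an array list item can be sketched as follows. The example is illustrative only; weighted_sum and the weight table w are hypothetical names.

```c
/* Hypothetical helper: sums x[i] weighted by a repeating 1, 2, 3 pattern. */
int weighted_sum(const int *x, int n)
{
    int w[3] = {1, 2, 3};   /* original array */
    int total = 0;

    /* Each new copy of w is initialized element by element from the
       original array before the construct executes. */
    #pragma omp parallel for firstprivate(w) reduction(+:total)
    for (int i = 0; i < n; i++) {
        total += w[i % 3] * x[i];
    }
    return total;
}
```

With private(w) in place of firstprivate(w), the reads of w inside the loop would see undefined values.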
C++
27 For each variable of class type:
28 • If the firstprivate clause is not on a target construct then a copy constructor is invoked
29 to perform the initialization; and
30 • If the firstprivate clause is on a target construct then how many copy constructors, if
31 any, are invoked is unspecified.



1 If copy constructors are called, the order in which copy constructors for different variables of class
2 type are called is unspecified.
C++
Fortran
3 If the original list item does not have the POINTER attribute, initialization of the new list items
4 occurs as if by intrinsic assignment unless the original list item has a compatible type-bound
5 defined assignment, in which case initialization of the new list items occurs as if by the defined
6 assignment. If the original list item that does not have the POINTER attribute has the allocation
7 status of unallocated, the new list items will have the same status.
8 If the original list item has the POINTER attribute, the new list items receive the same association
9 status as the original list item, as if by pointer assignment.
10 The list items that appear in a firstprivate clause may include named constants.
Fortran
11 Restrictions
12 Restrictions to the firstprivate clause are as follows:
13 • A list item that is private within a parallel region must not appear in a firstprivate
14 clause on a worksharing construct if any of the worksharing regions that arise from the
15 worksharing construct ever bind to any of the parallel regions that arise from the
16 parallel construct.
17 • A list item that is private within a teams region must not appear in a firstprivate clause
18 on a distribute construct if any of the distribute regions that arise from the
19 distribute construct ever bind to any of the teams regions that arise from the teams
20 construct.
21 • A list item that appears in a reduction clause of a parallel construct must not appear in a
22 firstprivate clause on a worksharing, task, or taskloop construct if any of the
23 worksharing or task regions that arise from the worksharing, task, or taskloop construct
24 ever bind to any of the parallel regions that arise from the parallel construct.
25 • A list item that appears in a reduction clause of a teams construct must not appear in a
26 firstprivate clause on a distribute construct if any of the distribute regions that
27 arise from the distribute construct ever bind to any of the teams regions that arise from the
28 teams construct.
29 • A list item that appears in a reduction clause of a worksharing construct must not appear in a
30 firstprivate clause in a task construct encountered during execution of any of the
31 worksharing regions that arise from the worksharing construct.



C++
1 • A variable of class type (or array thereof) that appears in a firstprivate clause requires an
2 accessible, unambiguous copy constructor for the class type.
3 • If the original list item in a firstprivate clause on a work-distribution construct has a
4 reference type then it must bind to the same object for all threads in the binding thread set of the
5 work-distribution region.
C++
Fortran
6 • If the list item is a polymorphic variable with the ALLOCATABLE attribute, the behavior is
7 unspecified.
Fortran
8 Cross References
9 • distribute directive, see Section 11.6
10 • do directive, see Section 11.5.2
11 • for directive, see Section 11.5.1
12 • parallel directive, see Section 10.1
13 • private clause, see Section 5.4.3
14 • scope directive, see Section 11.2
15 • sections directive, see Section 11.3
16 • single directive, see Section 11.1
17 • target directive, see Section 13.8
18 • task directive, see Section 12.5
19 • taskloop directive, see Section 12.6
20 • teams directive, see Section 10.2



1 5.4.5 lastprivate Clause
2 Name: lastprivate    Properties: data-environment attribute, data-sharing attribute, privatization

3 Arguments
4 Name  Type                             Properties
  list  list of variable list item type  default

5 Modifiers
6 Name                  Modifies  Type                  Properties
  lastprivate-modifier  list      Keyword: conditional  default

7 Directives
8 distribute, do, for, loop, sections, simd, taskloop

9 Semantics
10 The lastprivate clause provides a superset of the functionality provided by the private
11 clause. A list item that appears in a lastprivate clause is subject to the private clause
12 semantics described in Section 5.4.3. In addition, when a lastprivate clause without the
13 conditional modifier appears on a directive and the list item is not an iteration variable of any
14 associated loop, the value of each new list item from the sequentially last iteration of the associated
15 loops, or the lexically last structured block sequence associated with a sections construct, is
16 assigned to the original list item. When the conditional modifier appears on the clause or the
17 list item is an iteration variable of one of the associated loops, if sequential execution of the loop
18 nest would assign a value to the list item then the original list item is assigned the value that the list
19 item would have after sequential execution of the loop nest.
C++
20 For class types, the copy assignment operator is invoked. The order in which copy assignment
21 operators for different variables of the same class type are invoked is unspecified.
C++
C / C++
22 For an array of elements of non-array type, each element is assigned to the corresponding element
23 of the original array.
C / C++
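A minimal C sketch of the unconditional form, assuming a hypothetical helper named last_square and a loop with at least one iteration:

```c
/* Hypothetical helper: returns (n-1)*(n-1) for n >= 1. */
int last_square(int n)
{
    int sq = -1;

    /* After the loop, the original sq receives the value that the
       private copy held in the sequentially last iteration (i == n-1). */
    #pragma omp parallel for lastprivate(sq)
    for (int i = 0; i < n; i++) {
        sq = i * i;
    }
    return sq;
}
```

Every iteration assigns sq, so the sequentially last iteration always defines the value that is copied out.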
Fortran
24 If the original list item does not have the POINTER attribute, its update occurs as if by intrinsic
25 assignment unless it has a type-bound procedure as a defined assignment.
26 If the original list item has the POINTER attribute, its update occurs as if by pointer assignment.
Fortran



1 When the conditional modifier does not appear on the lastprivate clause, any list item
2 that is not an iteration variable of the associated loops and that is not assigned a value by the
3 sequentially last iteration of the loops, or by the lexically last structured block sequence associated
4 with a sections construct, has an unspecified value after the construct. When the
5 conditional modifier does not appear on the lastprivate clause, a list item that is the
6 iteration variable of an associated loop and that would not be assigned a value during sequential
7 execution of the loop nest has an unspecified value after the construct. Unassigned subcomponents
8 also have unspecified values after the construct.
9 If the lastprivate clause is used on a construct to which neither the nowait nor the
10 nogroup clauses are applied, the original list item becomes defined at the end of the construct. To
11 avoid data races, concurrent reads or updates of the original list item must be synchronized with the
12 update of the original list item that occurs as a result of the lastprivate clause.
13 Otherwise, if the lastprivate clause is used on a construct to which the nowait or the
14 nogroup clauses are applied, accesses to the original list item may create a data race. To avoid
15 this data race, if an assignment to the original list item occurs then synchronization must be inserted
16 to ensure that the assignment completes and the original list item is flushed to memory.
17 If a list item that appears in a lastprivate clause with the conditional modifier is
18 modified in the region by an assignment outside the construct or not to the list item then the value
19 assigned to the original list item is unspecified.
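The conditional modifier can be sketched in C as follows, assuming a compiler that supports OpenMP 5.0 or later. The helper name last_negative_index is hypothetical.

```c
/* Hypothetical helper: index of the last negative element of a. */
int last_negative_index(const int *a, int n)
{
    int idx = -1;

    /* Only some iterations assign idx. With the conditional modifier,
       the original idx receives the value that sequential execution of
       the loop would have left in it. */
    #pragma omp parallel for lastprivate(conditional: idx)
    for (int i = 0; i < n; i++) {
        if (a[i] < 0) {
            idx = i;
        }
    }
    return idx;
}
```

The list item is a scalar and is modified only by direct assignment inside the construct, as the text above requires for a specified result.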

20 Restrictions
21 Restrictions to the lastprivate clause are as follows:
22 • A list item must not appear in a lastprivate clause on a work-distribution construct if the
23 corresponding region binds to the region of a parallelism-generating construct in which the list
24 item is private.
25 • A list item that appears in a lastprivate clause with the conditional modifier must be a
26 scalar variable.
C++
27 • A variable of class type (or array thereof) that appears in a lastprivate clause requires an
28 accessible, unambiguous default constructor for the class type, unless the list item is also
29 specified in a firstprivate clause.
30 • A variable of class type (or array thereof) that appears in a lastprivate clause requires an
31 accessible, unambiguous copy assignment operator for the class type.
32 • If an original list item in a lastprivate clause on a work-distribution construct has a
33 reference type then it must bind to the same object for all threads in the binding thread set of the
34 work-distribution region.
C++



Fortran
1 • A variable that appears in a lastprivate clause must be definable.
2 • If the original list item has the ALLOCATABLE attribute, the corresponding list item of which the
3 value is assigned to the original item must have an allocation status of allocated upon exit from
4 the sequentially last iteration or lexically last structured block sequence associated with a
5 sections construct.
6 • If the list item is a polymorphic variable with the ALLOCATABLE attribute, the behavior is
7 unspecified.
Fortran
8 Cross References
9 • distribute directive, see Section 11.6
10 • do directive, see Section 11.5.2
11 • for directive, see Section 11.5.1
12 • loop directive, see Section 11.7
13 • private clause, see Section 5.4.3
14 • sections directive, see Section 11.3
15 • simd directive, see Section 10.4
16 • taskloop directive, see Section 12.6

17 5.4.6 linear Clause


18 Name: linear    Properties: data-environment attribute, data-sharing attribute, privatization, post-modified

19 Arguments
20 Name  Type                             Properties
   list  list of variable list item type  default

21 Modifiers
22 Name                   Modifies  Type                                   Properties
   step-simple-modifier   list      OpenMP integer expression              exclusive, region-invariant, unique
   step-complex-modifier  list      Complex, name: step                    unique
                                    Arguments: linear-step, expression of
                                    integer type (region-invariant)
   linear-modifier        list      Keyword: ref, uval, val                unique



1 Directives
2 declare simd, do, for, simd

3 Additional information
4 list and linear-modifier may instead be specified as linear-modifier(list) for linear clauses that
5 appear on a declare simd directive. This syntax has been deprecated.
6 Semantics
7 The linear clause provides a superset of the functionality provided by the private clause. A
8 list item that appears in a linear clause is subject to the private clause semantics described in
9 Section 5.4.3, except as noted. If the step-simple-modifier is specified, the behavior is as if the
10 step-complex-modifier is instead specified with step-simple-modifier as its linear-step argument. If
11 linear-step is not specified, it is assumed to be 1.
12 When a linear clause is specified on a construct, the value of the new list item on each logical
13 iteration of the associated loops corresponds to the value of the original list item before entering the
14 construct plus the logical number of the iteration times linear-step. The value corresponding to the
15 sequentially last logical iteration of the associated loops is assigned to the original list item.
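The relationship between the logical iteration number and the value of a linear list item can be sketched in C as follows. The helper gather_strided and its parameters are hypothetical; the sketch assumes stride does not change during the loop.

```c
/* Hypothetical helper: copies every stride-th element of src into dst. */
void gather_strided(const int *src, int *dst, int n, int stride)
{
    int j = 0;

    /* In logical iteration i, the private j equals 0 + i * stride.
       The body advances j by exactly stride per iteration, as the
       clause requires. */
    #pragma omp simd linear(j: stride)
    for (int i = 0; i < n; i++) {
        dst[i] = src[j];
        j += stride;
    }
}
```

Declaring j as linear tells the implementation that its value in each iteration is known in advance, so iterations can be executed in SIMD lanes without a loop-carried dependence on j.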
16 When a linear clause is specified on a declare simd directive, the list items refer to
17 parameters of the procedure to which the directive applies. For a given call to the procedure, the
18 clause determines whether the SIMD version generated by the directive may be called. If the clause
19 does not specify the ref linear-modifier, the SIMD version requires that the value of the
20 corresponding argument at the callsite is equal to the value of the argument from the first lane plus
21 the logical number of the lane times the linear-step. If the clause specifies the ref linear-modifier,
22 the SIMD version requires that the storage locations of the corresponding arguments at the callsite
23 from each SIMD lane correspond to locations within a hypothetical array of elements of the same
24 type, indexed by the logical number of the lane times the linear-step.

25 Restrictions
26 Restrictions to the linear clause are as follows:
27 • Only a loop iteration variable of a loop that is associated with the construct may appear as a list
28 item in a linear clause if a reduction clause with the inscan modifier also appears on
29 the construct.
30 • A linear-modifier may be specified as ref or uval only on a declare simd directive.
31 • For a linear clause that appears on a loop-associated construct, the difference between the
32 value of a list item at the end of a logical iteration and its value at the beginning of the logical
33 iteration must be equal to linear-step.
34 • If linear-modifier is uval for a list item in a linear clause that is specified on a
35 declare simd directive and the list item is modified during a call to the SIMD version of the
36 procedure, the program must not depend on the value of the list item upon return from the
37 procedure.



1 • If linear-modifier is uval for a list item in a linear clause that is specified on a
2 declare simd directive, the program must not depend on the storage of the argument in the
3 procedure being the same as the storage of the corresponding argument at the callsite.
C
4 • All list items must be of integral or pointer type.
5 • If specified, linear-modifier must be val.
C
C++
6 • If linear-modifier is not ref, all list items must be of integral or pointer type, or must be a
7 reference to an integral or pointer type.
8 • If linear-modifier is ref or uval, all list items must be of a reference type.
9 • If a list item in a linear clause on a worksharing construct has a reference type then it must
10 bind to the same object for all threads of the team.
11 • If a list item in a linear clause that is specified on a declare simd directive is of a reference
12 type and linear-modifier is not ref, the difference between the value of the argument on exit
13 from the function and its value on entry to the function must be the same for all SIMD lanes.
C++
Fortran
14 • If linear-modifier is not ref, all list items must be of type integer.
15 • If linear-modifier is ref or uval, all list items must be dummy arguments without the VALUE
16 attribute.
17 • List items must not be Cray pointers or variables that have the POINTER attribute. Cray pointer
18 support has been deprecated.
19 • If linear-modifier is not ref and a list item has the ALLOCATABLE attribute, the allocation
20 status of the list item in the sequentially last iteration must be allocated upon exit from that
21 iteration.
22 • If linear-modifier is ref, list items must be polymorphic variables, assumed-shape arrays, or
23 variables with the ALLOCATABLE attribute.
24 • If a list item in a linear clause that is specified on a declare simd directive is a dummy
25 argument without the VALUE attribute and linear-modifier is not ref, the difference between the
26 value of the argument on exit from the procedure and its value on entry to the procedure must be
27 the same for all SIMD lanes.
28 • A common block name must not appear in a linear clause.
Fortran



1 Cross References
2 • declare simd directive, see Section 7.7
3 • do directive, see Section 11.5.2
4 • for directive, see Section 11.5.1
5 • private clause, see Section 5.4.3
6 • simd directive, see Section 10.4
7 • taskloop directive, see Section 12.6

8 5.4.7 is_device_ptr Clause


9 Name: is_device_ptr    Properties: data-environment attribute, data-sharing attribute

10 Arguments
11 Name  Type                             Properties
   list  list of variable list item type  default

12 Directives
13 dispatch, target

14 Semantics
15 The is_device_ptr clause indicates that its list items are device pointers. Support for device
16 pointers created outside of OpenMP, specifically outside of any OpenMP mechanism that returns a
17 device pointer, is implementation defined.
18 If the is_device_ptr clause is specified on a target construct, each list item is privatized
19 inside the construct and the new list item is initialized to the device address to which the original
20 list item refers.
Fortran
21 If the is_device_ptr clause is specified on a target construct and any list item is not of type
22 C_PTR, the behavior is as if that list item appeared in a has_device_addr clause. Support for
23 such list items in an is_device_ptr clause is deprecated.
Fortran
24 Restrictions
25 Restrictions to the is_device_ptr clause are as follows:
26 • Each list item must be a valid device pointer for the device data environment.
C
27 • Each list item must have a type of pointer or array.
C

C++
1 • Each list item must have a type of pointer, array, reference to pointer or reference to array.
C++
Fortran
2 • Each list item must be of type C_PTR unless the clause appears on a target directive; the use
3 of list items on the target directive that are not of type C_PTR has been deprecated.
Fortran
4 Cross References
5 • dispatch directive, see Section 7.6
6 • has_device_addr clause, see Section 5.4.9
7 • target directive, see Section 13.8

8 5.4.8 use_device_ptr Clause


Name: use_device_ptr
Properties: data-environment attribute, data-sharing attribute

Arguments
Name    Type                               Properties
list    list of variable list item type    default

12 Directives
13 target data

14 Semantics
C / C++
15 If a list item that appears in a use_device_ptr clause is a pointer to an object that is mapped to
16 the device data environment, references to the list item in the structured block that is associated
17 with the construct on which the clause appears are converted into references to a device pointer that
18 is local to the structured block and that refers to the device address of the corresponding object. If
19 the list item does not point to a mapped object, it must contain a valid device address for the target
20 device, and the list item references are instead converted to references to a local device pointer that
21 refers to this device address.
C / C++

Fortran
1 If a list item that appears in a use_device_ptr clause is of type C_PTR and points to a data
2 entity that is mapped to the device data environment, references to the list item in the structured
3 block that is associated with the construct on which the clause appears are converted into references
4 to a device pointer that is local to the structured block and that refers to the device address of the
5 corresponding entity. If a list item of type C_PTR does not point to a mapped object, it must
6 contain a valid device address for the target device, and the list item references are instead
7 converted to references to a local device pointer that refers to this device address. If a list item in a
8 use_device_ptr clause is not of type C_PTR, the behavior is as if the list item appeared in a
9 use_device_addr clause. Support for such list items in a use_device_ptr clause is
10 deprecated.
Fortran
11 Restrictions
12 Restrictions to the use_device_ptr clause are as follows:
13 • Each list item must not be a structure element.
C / C++
14 • Each list item must be a pointer for which the value is the address of an object that has
15 corresponding storage in the device data environment or is accessible on the target device.
C / C++
Fortran
16 • The value of a list item that is of type C_PTR must be the address of a data entity that has
17 corresponding storage in the device data environment or is accessible on the target device.
Fortran
18 Cross References
19 • target data directive, see Section 13.5
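
The following non-normative C sketch illustrates how use_device_ptr and is_device_ptr can be combined. It assumes a hypothetical helper named scale_on_device (not part of the OpenMP API) that maps an array, obtains a device pointer to it, and then uses that pointer in a target region. If the code is compiled without OpenMP support, the pragmas are ignored and the loop simply runs on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: multiplies a[0..n-1] by factor on the default device. */
void scale_on_device(double *a, int n, double factor)
{
    assert(a != NULL && n > 0);

    /* Map the host array so that it has corresponding storage on the device. */
    #pragma omp target data map(tofrom: a[0:n])
    {
        /* Inside this region, references to a are converted into references
         * to a local device pointer that holds the device address of the array. */
        #pragma omp target data use_device_ptr(a)
        {
            /* is_device_ptr states that a already holds a device address,
             * so the target construct does not try to map it again. */
            #pragma omp target is_device_ptr(a)
            for (int i = 0; i < n; ++i)
                a[i] *= factor;
        }
    }
}
```

In practice the device pointer obtained through use_device_ptr is often passed to a device-side library routine instead of a target region; the target region is used here only to keep the sketch self-contained.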

20 5.4.9 has_device_addr Clause


Name: has_device_addr
Properties: data-environment attribute, data-sharing attribute

Arguments
Name    Type                               Properties
list    list of variable list item type    default

24 Directives
25 target

1 Semantics
2 The has_device_addr clause indicates that its list items already have device addresses and
3 therefore they may be directly accessed from a target device. If the device address of a list item is
4 not for the device on which the region that is associated with the construct on which the clause
5 appears executes, accessing the list item inside the region results in unspecified behavior. The list
6 items may include array sections.

7 Restrictions
8 Restrictions to the has_device_addr clause are as follows:
9 • Each list item must have a valid device address for the device data environment.

10 Cross References
11 • target directive, see Section 13.8

12 5.4.10 use_device_addr Clause


Name: use_device_addr
Properties: data-environment attribute, data-sharing attribute

Arguments
Name    Type                               Properties
list    list of variable list item type    default

16 Directives
17 target data

18 Semantics
19 If a list item has corresponding storage in the device data environment, references to the list item in
20 the structured block that is associated with the construct on which the use_device_addr clause
21 appears are converted into references to the corresponding list item. If the list item is not a mapped
22 list item, it is assumed to be accessible on the target device. Inside the structured block, the list item
23 has a device address and its storage may not be accessible from the host device. The list items that
24 appear in a use_device_addr clause may include array sections.
C / C++
25 If a list item in a use_device_addr clause is an array section that has a base pointer, the effect
26 of the clause is to convert the base pointer to a pointer that is local to the structured block and that
27 contains the device address. This conversion may be elided if the list item was not already mapped.
C / C++

1 Restrictions
2 Restrictions to the use_device_addr clause are as follows:
3 • Each list item must have a corresponding list item in the device data environment or be
4 accessible on the target device.
5 • Each list item must not be a structure element.
C / C++
6 • If a list item is an array section, the base expression must be a base language identifier.
C / C++
Fortran
7 • If a list item is an array section, the designator of the base expression must be a name without any
8 selectors.
Fortran
9 Cross References
10 • target data directive, see Section 13.5
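
As a non-normative illustration of use_device_addr in C, the sketch below maps a scalar, takes its device address inside a use_device_addr region, and updates it through that address in a target region. The function name increment_on_device is hypothetical. Without OpenMP support the pragmas are ignored and the update happens directly on the host variable.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: returns start + 1, computed through a device address. */
int increment_on_device(int start)
{
    assert(start < INT_MAX);   /* avoid signed overflow in the increment */
    int x = start;

    /* Give x corresponding storage in the device data environment. */
    #pragma omp target data map(tofrom: x)
    {
        /* Inside this region x has a device address. */
        #pragma omp target data use_device_addr(x)
        {
            int *px = &x;      /* address of the corresponding device storage */

            /* px already holds a device address, so it is passed as is. */
            #pragma omp target is_device_ptr(px)
            *px += 1;
        }
    }
    /* On exit from the outer region the device value is copied back to x. */
    return x;
}
```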

11 5.5 Reduction Clauses and Directives


12 The reduction clauses are data-sharing attribute clauses that can be used to perform some forms of
13 recurrence calculations in parallel. Reduction clauses include reduction scoping clauses and
14 reduction participating clauses. Reduction scoping clauses define the region in which a reduction is
15 computed. Reduction participating clauses define the participants in the reduction.

16 5.5.1 OpenMP Reduction Identifiers


17 The syntax of an OpenMP reduction identifier is defined as follows:
C
18 A reduction identifier is either an identifier or one of the following operators: +, - (deprecated), *,
19 &, |, ^, && and ||.
C
C++
20 A reduction identifier is either an id-expression or one of the following operators: +,
21 - (deprecated), *, &, |, ^, && and ||.
C++
Fortran
22 A reduction identifier is either a base language identifier, or a user-defined operator, or one of the
23 following operators: +, - (deprecated), *, .and., .or., .eqv., .neqv., or one of the
24 following intrinsic procedure names: max, min, iand, ior, ieor.
Fortran

1 5.5.2 OpenMP Reduction Expressions
2 A reduction expression is an OpenMP stylized expression that is relevant to reduction clauses. It is
3 either a combiner expression or an initializer expression.

4 Restrictions
5 Restrictions to reduction expressions are as follows:
6 • If execution of a reduction expression results in the execution of an OpenMP construct or an
7 OpenMP API call, the behavior is unspecified.
C / C++
8 • If a reduction expression corresponds to a reduction identifier that is used in a target region, a
9 declare target directive must be specified for any function that can be accessed through the
10 expression.
C / C++
Fortran
11 • Any subroutine or function used in a reduction expression must be an intrinsic function, or must
12 have an accessible interface.
13 • Any user-defined operator, defined assignment or extended operator used in a reduction
14 expression must have an accessible interface.
15 • If any subroutine, function, user-defined operator, defined assignment or extended operator is
16 used in a reduction expression, it must be accessible to the subprogram in which the
17 corresponding reduction clause is specified.
18 • Any subroutine used in a reduction expression must not have any alternate returns appear in the
19 argument list.
20 • If the list item in the corresponding reduction clause is an array or array section, any
21 procedure used in a reduction expression must either be elemental or have dummy arguments that
22 are scalar.
23 • Any procedure called in the region of a reduction expression must be pure and may not reference
24 any host-associated variables.
25 • If a reduction expression corresponds to a reduction identifier that is used in a target region, a
26 declare target directive must be specified for any function or subroutine that can be
27 accessed through the expression.
Fortran

1 5.5.2.1 OpenMP Combiner Expressions
2 A combiner expression specifies how a reduction combines partial results into a single value.
Fortran
3 A combiner expression is an assignment statement or a subroutine name followed by an argument
4 list.
Fortran
5 In the definition of a combiner expression, omp_in and omp_out correspond to two special
6 variable identifiers that refer to storage of the type of the reduction list item to which the reduction
7 applies. If the list item is an array or array section, the identifiers to which omp_in and omp_out
8 correspond each refer to an array element. Each of the two special variable identifiers denotes one
9 of the values to be combined before executing the combiner expression. The special omp_out
10 identifier refers to the storage that holds the resulting combined value after executing the combiner
11 expression. The number of times that the combiner expression is executed and the order of these
12 executions for any reduction clause are unspecified.
Fortran
13 If the combiner expression is a subroutine name with an argument list, the combiner expression is
14 evaluated by calling the subroutine with the specified argument list. If the combiner expression is an
15 assignment statement, the combiner expression is evaluated by executing the assignment statement.
16 If a generic name is used in a combiner expression and the list item in the corresponding reduction
17 clause is an array or array section, it is resolved to the specific procedure that is elemental or only
18 has scalar dummy arguments.
Fortran
19 Restrictions
20 Restrictions to combiner expressions are as follows:
21 • The only variables allowed in a combiner expression are omp_in and omp_out.
Fortran
22 • Any selectors in the designator of omp_in and omp_out must be component selectors.
Fortran
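
A combiner expression in C can be sketched as follows. The example is non-normative; the type stats_t, the reduction identifier stats_add and the function accumulate are hypothetical names chosen for illustration. The combiner refers only to omp_in and omp_out. No initializer is given, so the sketch relies on the C rule that private copies are initialized like objects with static storage duration, that is, to zero.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    long   count;   /* number of samples seen */
    double total;   /* sum of the samples     */
} stats_t;

/* Combiner: folds the partial result omp_in into omp_out. */
#pragma omp declare reduction(stats_add : stats_t : \
        omp_out.count += omp_in.count, omp_out.total += omp_in.total)

/* Hypothetical helper: counts and sums the elements of v[0..n-1]. */
stats_t accumulate(const double *v, int n)
{
    assert(v != NULL && n >= 0);
    stats_t s = { 0, 0.0 };

    #pragma omp parallel for reduction(stats_add : s)
    for (int i = 0; i < n; ++i) {
        s.count += 1;      /* updates the private copy of s */
        s.total += v[i];
    }
    return s;
}
```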

23 5.5.2.2 OpenMP Initializer Expressions


24 An initializer expression determines the initializer for the private copies of reduction list items. If
25 the initialization of the copies is not determined a priori, the syntax of an initializer expression is as
26 follows:
C
27 omp_priv = initializer
C

1 or
C++
2 omp_priv initializer
C++
3 or
C / C++
4 function-name(argument-list)
C / C++
5 or
Fortran
6 omp_priv = expression

7 or
8 subroutine-name(argument-list)
Fortran
9 In the definition of an initializer expression, the omp_priv special variable identifier refers to the
10 storage to be initialized. The special variable identifier omp_orig can be used in an initializer
11 expression to refer to the storage of the original variable to be reduced. The number of times that an
12 initializer expression is evaluated and the order of these evaluations are unspecified.
C / C++
13 If an initializer expression is a function name with an argument list, it is evaluated by calling the
14 function with the specified argument list. Otherwise, an initializer expression specifies how
15 omp_priv is declared and initialized.
C / C++
Fortran
If an initializer expression is a subroutine name with an argument list, the initializer expression is
evaluated by calling the subroutine with the specified argument list. If an initializer expression is an
assignment statement, the initializer expression is evaluated by executing the assignment statement.
Fortran
C
19 The a priori initialization of private copies that are created for reductions follows the rules for
20 initialization of objects with static storage duration.
C

C++
1 The a priori initialization of private copies that are created for reductions follows the rules for
2 default-initialization.
C++
Fortran
3 The rules for a priori initialization of private copies that are created for reductions are as follows:
4 • For complex, real, or integer types, the value 0 will be used.
5 • For logical types, the value .false. will be used.
6 • For derived types for which default initialization is specified, default initialization will be used.
7 • Otherwise, the behavior is unspecified.
Fortran
8 Restrictions
9 Restrictions to initializer expressions are as follows:
10 • The only variables allowed in an initializer expression are omp_priv and omp_orig.
11 • If an initializer expression modifies the variable omp_orig, the behavior is unspecified.
C
12 • If an initializer expression is a function name with an argument list, one of the arguments must
13 be the address of omp_priv.
C
C++
14 • If an initializer expression is a function name with an argument list, one of the arguments must
15 be omp_priv or the address of omp_priv.
C++
Fortran
16 • If an initializer expression is a subroutine name with an argument list, one of the arguments must
17 be omp_priv.
Fortran
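
The function form of an initializer expression can be illustrated with the following non-normative C sketch. The names range_t, range_init, range_merge, range_union and value_range are hypothetical. In line with the C restriction above, the address of omp_priv is passed as an argument to the initializer function.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

typedef struct { int lo; int hi; } range_t;

/* Hypothetical initializer: sets *r to the identity of the range union. */
void range_init(range_t *r) { r->lo = INT_MAX; r->hi = INT_MIN; }

/* Hypothetical combiner: widens *out so that it also covers *in. */
void range_merge(range_t *out, const range_t *in)
{
    if (in->lo < out->lo) out->lo = in->lo;
    if (in->hi > out->hi) out->hi = in->hi;
}

/* The initializer is a function call that receives the address of omp_priv. */
#pragma omp declare reduction(range_union : range_t : \
        range_merge(&omp_out, &omp_in)) \
        initializer(range_init(&omp_priv))

/* Hypothetical helper: returns the smallest and largest value in v[0..n-1]. */
range_t value_range(const int *v, int n)
{
    assert(v != NULL && n >= 0);
    range_t r;
    range_init(&r);   /* the original list item also starts at the identity */

    #pragma omp parallel for reduction(range_union : r)
    for (int i = 0; i < n; ++i) {
        if (v[i] < r.lo) r.lo = v[i];
        if (v[i] > r.hi) r.hi = v[i];
    }
    return r;
}
```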

18 5.5.3 Implicitly Declared OpenMP Reduction Identifiers


C / C++
19 Table 5.1 lists each reduction identifier that is implicitly declared at every scope for arithmetic types
20 and its semantic initializer value. The actual initializer value is that value as expressed in the data
21 type of the reduction list item.

TABLE 5.1: Implicitly Declared C/C++ Reduction Identifiers

Identifier        Initializer                                Combiner
+                 omp_priv = 0                               omp_out += omp_in
- (deprecated)    omp_priv = 0                               omp_out += omp_in
*                 omp_priv = 1                               omp_out *= omp_in
&                 omp_priv = ~0                              omp_out &= omp_in
|                 omp_priv = 0                               omp_out |= omp_in
^                 omp_priv = 0                               omp_out ^= omp_in
&&                omp_priv = 1                               omp_out = omp_in && omp_out
||                omp_priv = 0                               omp_out = omp_in || omp_out
max               omp_priv = Minimal representable number    omp_out = omp_in > omp_out ? omp_in : omp_out
                  in the reduction list item type
min               omp_priv = Maximal representable number    omp_out = omp_in < omp_out ? omp_in : omp_out
                  in the reduction list item type

C / C++
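
The following non-normative C sketch uses two of the implicitly declared identifiers from Table 5.1. The function names max_element and all_positive are hypothetical. In each case the original list item is set to the same value as the initializer in the table, so the result is the same whether or not the pragmas are honored.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: returns the largest element of v[0..n-1]. */
int max_element(const int *v, int n)
{
    assert(v != NULL && n > 0);
    int m = INT_MIN;   /* minimal representable int, as for the max initializer */

    #pragma omp parallel for reduction(max : m)
    for (int i = 0; i < n; ++i)
        m = v[i] > m ? v[i] : m;
    return m;
}

/* Hypothetical helper: returns 1 if every element is positive, otherwise 0. */
int all_positive(const int *v, int n)
{
    assert(v != NULL && n >= 0);
    int ok = 1;        /* initializer value of the && identifier */

    #pragma omp parallel for reduction(&& : ok)
    for (int i = 0; i < n; ++i)
        ok = ok && (v[i] > 0);
    return ok;
}
```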
Fortran
1 Table 5.2 lists each reduction identifier that is implicitly declared for numeric and logical types and
2 its semantic initializer value. The actual initializer value is that value as expressed in the data type
3 of the reduction list item.

TABLE 5.2: Implicitly Declared Fortran Reduction Identifiers

Identifier        Initializer                                Combiner
+                 omp_priv = 0                               omp_out = omp_in + omp_out
- (deprecated)    omp_priv = 0                               omp_out = omp_in + omp_out
*                 omp_priv = 1                               omp_out = omp_in * omp_out
.and.             omp_priv = .true.                          omp_out = omp_in .and. omp_out
.or.              omp_priv = .false.                         omp_out = omp_in .or. omp_out
.eqv.             omp_priv = .true.                          omp_out = omp_in .eqv. omp_out
.neqv.            omp_priv = .false.                         omp_out = omp_in .neqv. omp_out
max               omp_priv = Minimal representable number    omp_out = max(omp_in, omp_out)
                  in the reduction list item type
min               omp_priv = Maximal representable number    omp_out = min(omp_in, omp_out)
                  in the reduction list item type
iand              omp_priv = All bits on                     omp_out = iand(omp_in, omp_out)
ior               omp_priv = 0                               omp_out = ior(omp_in, omp_out)
ieor              omp_priv = 0                               omp_out = ieor(omp_in, omp_out)

Fortran

1 5.5.4 initializer Clause


Name: initializer
Properties: unique

Arguments
Name                Type                              Properties
initializer-expr    expression of initializer type    default

5 Directives
6 declare reduction

7 Semantics
8 The initializer clause can be used to specify initializer-expr as the initializer expression for a
9 user-defined reduction.

10 Cross References
11 • declare reduction directive, see Section 5.5.11

1 5.5.5 Properties Common to All Reduction Clauses
2 The clause-specification of a reduction clause has a clause-argument-specification that specifies an
3 OpenMP variable list argument and has a required reduction-identifier modifier that specifies the
4 reduction identifier to use for the reduction. The reduction identifier must match a previously
5 declared reduction identifier of the same name and type for each of the list items. This match is
6 done by means of a name lookup in the base language.
7 The list items that appear in a reduction clause may include array sections.
C++
8 If the type is a derived class then any reduction identifier that matches its base classes is also a
9 match if no specific match for the type has been specified.
10 If the reduction identifier is not an id-expression then it is implicitly converted to one by prepending
11 the keyword operator (for example, + becomes operator+).
12 If the reduction identifier is qualified then a qualified name lookup is used to find the declaration.
13 If the reduction identifier is unqualified then an argument-dependent name lookup must be
14 performed using the type of each list item.
C++
If a list item is an array or array section, it is treated as if a reduction clause were applied
to each separate element of the array or array section.
17 If a list item is an array section, the elements of any copy of the array section will be stored
18 contiguously.
Fortran
19 If the original list item has the POINTER attribute, any copies of the list item are associated with
20 private targets.
Fortran
21 Any copies of a list item associated with the reduction are initialized with the initializer value of the
22 reduction identifier. Any copies are combined using the combiner associated with the reduction
23 identifier.
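
A reduction over an array section can be sketched in C as follows. The example is non-normative; the macro NBINS and the function histogram are hypothetical. Because hist is a pointer inside the function, an array section names the contiguous elements that take part in the reduction, and each element is reduced separately.

```c
#include <assert.h>
#include <stddef.h>

#define NBINS 4   /* hypothetical number of bins for this sketch */

/* Hypothetical helper: counts how many values of v fall into each bin. */
void histogram(const int *v, int n, long hist[NBINS])
{
    assert(v != NULL && hist != NULL && n >= 0);
    for (int b = 0; b < NBINS; ++b)
        hist[b] = 0;

    /* Each element of hist[0:NBINS] is reduced with the + identifier. */
    #pragma omp parallel for reduction(+ : hist[0:NBINS])
    for (int i = 0; i < n; ++i) {
        int b = v[i] % NBINS;
        if (b < 0)
            b += NBINS;    /* keep the bin index inside the array section */
        hist[b] += 1;
    }
}
```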

24 Execution Model Events


25 The reduction-begin event occurs before a task begins to perform loads and stores that belong to the
26 implementation of a reduction and the reduction-end event occurs after the task has completed
27 loads and stores associated with the reduction. If a task participates in multiple reductions, each
28 reduction may be bracketed by its own pair of reduction-begin/reduction-end events or multiple
29 reductions may be bracketed by a single pair of events. The interval defined by a pair of
30 reduction-begin/reduction-end events may not contain a task scheduling point.

1 Tool Callbacks
A thread dispatches a registered ompt_callback_reduction with
ompt_sync_region_reduction in its kind argument and ompt_scope_begin as its
endpoint argument for each occurrence of a reduction-begin event in that thread. Similarly, a thread
dispatches a registered ompt_callback_reduction with
ompt_sync_region_reduction in its kind argument and ompt_scope_end as its
endpoint argument for each occurrence of a reduction-end event in that thread. These callbacks
occur in the context of the task that performs the reduction and have the type signature
ompt_callback_sync_region_t.

10 Restrictions
11 Restrictions common to reduction clauses are as follows:
12 • Any array element must be specified at most once in all list items on a directive.
13 • For a reduction identifier declared in a declare reduction directive, the directive must
14 appear before its use in a reduction clause.
15 • If a list item is an array section, it must specify contiguous storage, it cannot be a zero-length
16 array section and its base expression must be a base language identifier.
17 • If a list item is an array section or an array element, accesses to the elements of the array outside
18 the specified array section or array element result in unspecified behavior.
C / C++
19 • The type of a list item that appears in a reduction clause must be valid for the reduction identifier.
20 For a max or min reduction in C, the type of the list item must be an allowed arithmetic data
21 type: char, int, float, double, or _Bool, possibly modified with long, short,
22 signed, or unsigned. For a max or min reduction in C++, the type of the list item must be
23 an allowed arithmetic data type: char, wchar_t, int, float, double, or bool, possibly
24 modified with long, short, signed, or unsigned.
25 • A list item that appears in a reduction clause must not be const-qualified.
26 • The reduction identifier for any list item must be unambiguous and accessible.
C / C++
Fortran
27 • The type, type parameters and rank of a list item that appears in a reduction clause must be valid
28 for the combiner expression and the initializer expression.
29 • A list item that appears in a reduction clause must be definable.
30 • A procedure pointer must not appear in a reduction clause.
• A pointer with the INTENT(IN) attribute must not appear in a reduction clause.
1 • An original list item with the POINTER attribute or any pointer component of an original list
2 item that is referenced in a combiner expression must be associated at entry to the construct that
3 contains the reduction clause. Additionally, the list item or the pointer component of the list item
4 must not be deallocated, allocated, or pointer assigned within the region.
5 • An original list item with the ALLOCATABLE attribute or any allocatable component of an
6 original list item that corresponds to a special variable identifier in the combiner expression or
7 the initializer expression must be in the allocated state at entry to the construct that contains the
8 reduction clause. Additionally, the list item or the allocatable component of the list item must be
9 neither deallocated nor allocated, explicitly or implicitly, within the region.
10 • If the reduction identifier is defined in a declare reduction directive, the declare
11 reduction directive must be in the same subprogram, or accessible by host or use association.
12 • If the reduction identifier is a user-defined operator, the same explicit interface for that operator
13 must be accessible at the location of the declare reduction directive that defines the
14 reduction identifier.
15 • If the reduction identifier is defined in a declare reduction directive, any procedure
16 referenced in the initializer clause or the combiner expression must be an intrinsic
17 function, or must have an explicit interface where the same explicit interface is accessible as at
18 the declare reduction directive.
Fortran
19 Cross References
20 • ompt_callback_sync_region_t, see Section 19.5.2.13
21 • ompt_scope_endpoint_t, see Section 19.4.4.11
22 • ompt_sync_region_t, see Section 19.4.4.14

23 5.5.6 Reduction Scoping Clauses


24 Reduction scoping clauses define the region in which a reduction is computed by tasks or SIMD
25 lanes. All properties common to all reduction clauses, which are defined in Section 5.5.5, apply to
26 reduction scoping clauses.
27 The number of copies created for each list item and the time at which those copies are initialized
28 are determined by the particular reduction scoping clause that appears on the construct. The time at
29 which the original list item contains the result of the reduction is determined by the particular
30 reduction scoping clause. To avoid data races, concurrent reads or updates of the original list item
31 must be synchronized with that update of the original list item, which may occur after the construct
32 on which the reduction scoping clause appears, for example, due to the use of the nowait clause.
33 The location in the OpenMP program at which values are combined and the order in which values
34 are combined are unspecified. Thus, when comparing sequential and parallel executions, or when
comparing one parallel execution to another (even if the number of threads used is the same),
bitwise-identical results are not guaranteed. Similarly, side effects (such as floating-point
2 exceptions) may not be identical and may not occur at the same location in the OpenMP program.
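
The effect of an unspecified combination order can be reproduced without any OpenMP construct. The non-normative C sketch below uses a hypothetical function sum_in_order that adds values strictly from left to right; calling it on the same values in two different orders stands in for two different combination orders. It assumes IEEE 754 double-precision arithmetic without value-changing optimizations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: adds v[0..n-1] strictly from left to right. */
double sum_in_order(const double *v, int n)
{
    assert(v != NULL && n >= 0);
    double s = 0.0;
    for (int i = 0; i < n; ++i)
        s += v[i];
    return s;
}
```

Under the stated assumption, summing the values 1e20, 1.0, -1e20 in that order gives 0.0 because 1.0 is absorbed by 1e20, whereas summing 1e20, -1e20, 1.0 gives 1.0. A floating-point reduction may therefore return different results from one execution to the next.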

3 5.5.7 Reduction Participating Clauses


4 A reduction participating clause specifies a task or a SIMD lane as a participant in a reduction
5 defined by a reduction scoping clause. All properties common to all reduction clauses, which are
6 defined in Section 5.5.5, apply to reduction participating clauses.
7 Accesses to the original list item may be replaced by accesses to copies of the original list item
8 created by a region that corresponds to a construct with a reduction scoping clause.
9 In any case, the final value of the reduction must be determined as if all tasks or SIMD lanes that
10 participate in the reduction are executed sequentially in some arbitrary order.

11 5.5.8 reduction Clause


Name: reduction
Properties: data-environment attribute, data-sharing attribute, privatization, reduction scoping, reduction participating

Arguments
Name    Type                               Properties
list    list of variable list item type    default

Modifiers
Name                    Modifies    Type                              Properties
reduction-identifier    list        An OpenMP reduction identifier    required, ultimate
reduction-modifier      list        Keyword: default, inscan, task    default

17 Directives
18 do, for, loop, parallel, scope, sections, simd, taskloop, teams

19 Semantics
20 The reduction clause is a reduction scoping clause and a reduction participating clause, as
21 described in Section 5.5.6 and Section 5.5.7. For each list item, a private copy is created for each
22 implicit task or SIMD lane and is initialized with the initializer value of the reduction-identifier.
23 After the end of the region, the original list item is updated with the values of the private copies
24 using the combiner associated with the reduction-identifier.
25 If reduction-modifier is not present or the default reduction-modifier is present, the behavior is
as follows. For parallel and worksharing constructs, one or more private copies of each list
item are created for each implicit task, as if the private clause had been used. For the simd
2 construct, one or more private copies of each list item are created for each SIMD lane, as if the
3 private clause had been used. For the taskloop construct, private copies are created
4 according to the rules of the reduction scoping clauses. For the teams construct, one or more
5 private copies of each list item are created for the initial task of each team in the league, as if the
6 private clause had been used. For the loop construct, private copies are created and used in the
7 construct according to the description and restrictions in Section 5.3. At the end of a region that
8 corresponds to a construct for which the reduction clause was specified, the original list item is
9 updated by combining its original value with the final value of each of the private copies, using the
10 combiner of the specified reduction-identifier.
11 If the inscan reduction-modifier is present, a scan computation is performed over updates to the
12 list item performed in each logical iteration of the loop associated with the worksharing-loop,
13 worksharing-loop SIMD, or simd construct (see Section 5.6). The list items are privatized in the
14 construct according to the description and restrictions in Section 5.3. At the end of the region, each
15 original list item is assigned the value described in Section 5.6.
16 If the task reduction-modifier is present for a parallel or worksharing construct, then each list
17 item is privatized according to the description and restrictions in Section 5.3, and an unspecified
18 number of additional private copies may be created to support task reductions. Any copies
19 associated with the reduction are initialized before they are accessed by the tasks that participate in
20 the reduction, which include all implicit tasks in the corresponding region and all participating
21 explicit tasks that specify an in_reduction clause (see Section 5.5.10). After the end of the
22 region, the original list item contains the result of the reduction.
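
The inscan reduction-modifier can be illustrated with the following non-normative C sketch, which computes an inclusive prefix sum. The function name inclusive_prefix_sum is hypothetical, and the sketch assumes a compiler that supports the scan directive. Statements before the scan directive update the list item; statements after it read the scanned value for the current iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: out[i] receives in[0] + ... + in[i]. */
void inclusive_prefix_sum(const int *in, long *out, int n)
{
    assert(in != NULL && out != NULL && n >= 0);
    long run = 0;

    #pragma omp parallel for reduction(inscan, + : run)
    for (int i = 0; i < n; ++i) {
        run += in[i];                    /* input phase: update the list item */
        #pragma omp scan inclusive(run)
        out[i] = run;                    /* scan phase: read the prefix value */
    }
}
```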

23 Restrictions
24 Restrictions to the reduction clause are as follows:
25 • All restrictions common to all reduction clauses, as listed in Section 5.5.5, apply to this clause.
26 • A list item that appears in a reduction clause on a worksharing construct must be shared in
27 the parallel region to which a corresponding worksharing region binds.
28 • If an array section or array element appears as a list item in a reduction clause on a
29 worksharing construct, all threads of the team must specify the same storage location.
30 • Each list item specified with the inscan reduction-modifier must appear as a list item in an
31 inclusive or exclusive clause on a scan directive enclosed by the construct.
32 • If the inscan reduction-modifier is specified, a reduction clause without the inscan
33 reduction-modifier must not appear on the same construct.
• A reduction clause with the task reduction-modifier may only appear on a parallel
construct, a worksharing construct or a combined or composite construct for which any of the
aforementioned constructs is a constituent construct and neither simd nor loop is a constituent
construct.
1 • A reduction clause with the inscan reduction-modifier may only appear on a
2 worksharing-loop construct, a simd construct or a combined or composite construct for which
3 any of the aforementioned constructs is a constituent construct and distribute is not a
4 constituent construct.
5 • The inscan reduction-modifier must not be specified on a construct for which the ordered or
6 schedule clause is specified.
7 • A list item that appears in a reduction clause of the innermost enclosing worksharing or
8 parallel construct must not be accessed in an explicit task generated by a construct for which
9 an in_reduction clause over the same list item does not appear.
10 • The task reduction-modifier must not appear in a reduction clause if the nowait clause is
11 specified on the same construct.
C / C++
12 • If a list item in a reduction clause on a worksharing construct has a reference type then it
13 must bind to the same object for all threads of the team.
14 • If a list item in a reduction clause on a worksharing construct is an array section or an array
15 element then the base pointer must point to the same variable for all threads of the team.
16 • A variable of class type (or array thereof) that appears in a reduction clause with the
17 inscan reduction-modifier requires an accessible, unambiguous default constructor for the
18 class type; the number of calls to it while performing the scan computation is unspecified.
19 • A variable of class type (or array thereof) that appears in a reduction clause with the
20 inscan reduction-modifier requires an accessible, unambiguous copy assignment operator for
21 the class type; the number of calls to it while performing the scan computation is unspecified.
C / C++
22 Cross References
23 • List Item Privatization, see Section 5.3
24 • do directive, see Section 11.5.2
25 • for directive, see Section 11.5.1
26 • loop directive, see Section 11.7
27 • ordered clause, see Section 4.4.4
28 • parallel directive, see Section 10.1
29 • private clause, see Section 5.4.3
30 • scan directive, see Section 5.6
31 • schedule clause, see Section 11.5.3
32 • scope directive, see Section 11.2



1 • sections directive, see Section 11.3
2 • simd directive, see Section 10.4
3 • taskloop directive, see Section 12.6
4 • teams directive, see Section 10.2

5 5.5.9 task_reduction Clause


Name: task_reduction
Properties: data-environment attribute, data-sharing attribute, privatization, reduction scoping

7 Arguments
Name    Type                               Properties
list    list of variable list item type    default

9 Modifiers
Name                    Modifies    Type                              Properties
reduction-identifier    list        An OpenMP reduction identifier    required, ultimate

11 Directives
12 taskgroup

13 Semantics
14 The task_reduction clause is a reduction scoping clause, as described in Section 5.5.6, that
15 specifies a reduction among tasks. For each list item, the number of copies is unspecified. Any
16 copies associated with the reduction are initialized before they are accessed by the tasks that
17 participate in the reduction. After the end of the region, the original list item contains the result of
18 the reduction.
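
The semantics above can be illustrated with a minimal sketch in C. The function name sum_with_tasks and its parameters are hypothetical and chosen only for illustration. A taskgroup construct carries the task_reduction clause, and each explicit task joins the reduction through an in_reduction clause (see Section 5.5.10).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums n values, generating one explicit task per element. */
long sum_with_tasks(const int *values, size_t n)
{
    assert(values != NULL || n == 0);
    long sum = 0;

    #pragma omp parallel
    #pragma omp single
    {
        /* The taskgroup region scopes the reduction over sum. */
        #pragma omp taskgroup task_reduction(+: sum)
        {
            for (size_t i = 0; i < n; ++i) {
                /* Each generated task participates in the reduction. */
                #pragma omp task in_reduction(+: sum) firstprivate(i)
                sum += values[i];
            }
        }
        /* After the end of the taskgroup region, sum holds the combined result. */
    }
    return sum;
}
```

Calling sum_with_tasks on an array that holds 1, 2, 3, 4 and 5 returns 15 regardless of how many copies of sum the implementation creates, since that number is unspecified.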

19 Restrictions
20 Restrictions to the task_reduction clause are as follows:
21 • All restrictions common to all reduction clauses, as listed in Section 5.5.5, apply to this clause.

22 Cross References
23 • taskgroup directive, see Section 15.4



1 5.5.10 in_reduction Clause
Name: in_reduction
Properties: data-environment attribute, data-sharing attribute, privatization, reduction participating

3 Arguments
Name    Type                               Properties
list    list of variable list item type    default

5 Modifiers
Name                    Modifies    Type                              Properties
reduction-identifier    list        An OpenMP reduction identifier    required, ultimate

7 Directives
8 target, task, taskloop

9 Semantics
10 The in_reduction clause is a reduction participating clause, as described in Section 5.5.7, that
11 specifies that a task participates in a reduction. For a given list item, the in_reduction clause
12 defines a task to be a participant in a task reduction that is defined by an enclosing region for a
13 matching list item that appears in a task_reduction clause or a reduction clause with
14 task as the reduction-modifier, where either:
15 1. The matching list item has the same storage location as the list item in the in_reduction
16 clause; or
17 2. A private copy, derived from the matching list item, that is used to perform the task reduction
18 has the same storage location as the list item in the in_reduction clause.
19 For the task construct, the generated task becomes the participating task. For each list item, a
20 private copy may be created as if the private clause had been used.
21 For the target construct, the target task becomes the participating task. For each list item, a
22 private copy may be created in the data environment of the target task as if the private clause
23 had been used. This private copy will be implicitly mapped into the device data environment of the
24 target device, if the target device is not the parent device.
25 At the end of the task region, if a private copy was created its value is combined with a copy created
26 by a reduction scoping clause or with the original list item.
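
As a minimal sketch of the second matching case above, the following C function uses a reduction clause with the task reduction-modifier on a parallel construct, and explicit tasks that name the same list item in an in_reduction clause. The function name count_bits and its parameters are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: counts the set bits in words[0..n-1]. */
int count_bits(const unsigned *words, int n)
{
    assert(n >= 0);
    int total = 0;

    /* The task modifier makes this reduction available to explicit tasks. */
    #pragma omp parallel reduction(task, +: total)
    {
        #pragma omp single
        {
            for (int i = 0; i < n; ++i) {
                /* The generated task becomes a participating task. */
                #pragma omp task in_reduction(+: total) firstprivate(i)
                {
                    unsigned w = words[i];
                    while (w) {
                        total += (int)(w & 1u);
                        w >>= 1;
                    }
                }
            }
        }
    }
    return total;
}
```

Inside the single region, total refers to the private copy of the encountering thread, and the in_reduction list item has the same storage location as that copy.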

27 Restrictions
28 Restrictions to the in_reduction clause are as follows:
29 • All restrictions common to all reduction clauses, as listed in Section 5.5.5, apply to this clause.



1 • A list item that appears in a task_reduction clause or a reduction clause with task as
2 the reduction-modifier that is specified on a construct that corresponds to a region in which the
3 region of the participating task is closely nested must match each list item. The construct that
4 corresponds to the innermost enclosing region that meets this condition must specify the same
5 reduction-identifier for the matching list item as the in_reduction clause.
6 Cross References
7 • target directive, see Section 13.8
8 • task directive, see Section 12.5
9 • taskloop directive, see Section 12.6

10 5.5.11 declare reduction Directive


Name: declare reduction    Association: none
Category: declarative      Properties: pure

12 Arguments
13 declare reduction(reduction-specifier)
Name                   Type                          Properties
reduction-specifier    OpenMP reduction specifier    default

15 Clauses
16 initializer
17 Semantics
18 The declare reduction directive declares a reduction-identifier that can be used in a
19 reduction clause as a user-defined reduction. The directive argument reduction-specifier uses the
20 following syntax:
21 reduction-identifier : typename-list : combiner

22 where reduction-identifier is a reduction identifier, typename-list is a type-name list, and combiner
23 is an OpenMP combiner expression.
24 The reduction-identifier and the type identify the declare reduction directive. The
25 reduction-identifier can later be used in a reduction clause that uses variables of the types specified
26 in the declare reduction directive. If the directive specifies several types then the behavior is
27 as if a declare reduction directive was specified for each type. The visibility and
28 accessibility of a user-defined reduction are the same as those of a variable declared at the same
29 location in the program.
C++
30 The declare reduction directive can also appear at the locations in a program where a static
31 data member could be declared. In this case, the visibility and accessibility of the declaration are
32 the same as those of a static data member declared at the same location in the program.
C++



1 The enclosing context of the combiner and of the initializer-expr that is specified by the
2 initializer clause is that of the declare reduction directive. The combiner and the
3 initializer-expr must be correct in the base language as if they were the body of a function defined
4 at the same location in the program.
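
A minimal sketch in C of a user-defined reduction follows. The type vec2, the reduction-identifier vadd, and the helper functions are hypothetical names introduced for illustration. The combiner uses the special identifiers omp_out and omp_in, and the initializer clause uses omp_priv.

```c
#include <assert.h>

typedef struct { double x, y; } vec2;

/* Hypothetical helper used by the combiner: component-wise addition. */
static vec2 vec2_add(vec2 a, vec2 b)
{
    vec2 r = { a.x + b.x, a.y + b.y };
    return r;
}

/* Declares the reduction-identifier vadd for the type vec2. */
#pragma omp declare reduction(vadd : vec2 : omp_out = vec2_add(omp_out, omp_in)) \
    initializer(omp_priv = {0.0, 0.0})

/* Sums n points with the user-defined reduction. */
vec2 sum_points(const vec2 *pts, int n)
{
    assert(n >= 0);
    vec2 total = {0.0, 0.0};

    #pragma omp parallel for reduction(vadd : total)
    for (int i = 0; i < n; ++i)
        total = vec2_add(total, pts[i]);

    return total;
}
```

The combiner is written as a call to a helper function so that it stays a single base-language expression; this is a design choice of the sketch, not a requirement.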
Fortran
5 If a type with deferred or assumed length type parameter is specified in a declare reduction
6 directive, the reduction-identifier of that directive can be used in a reduction clause with any
7 variable of the same type and the same kind parameter, regardless of the length type parameters
8 with which the variable is declared.
9 If the reduction-identifier is the same as the name of a user-defined operator or an extended
10 operator, or the same as a generic name that is one of the allowed intrinsic procedures, and if the
11 operator or procedure name appears in an accessibility statement in the same module, the
12 accessibility of the corresponding declare reduction directive is determined by the
13 accessibility attribute of the statement.
14 If the reduction-identifier is the same as a generic name that is one of the allowed intrinsic
15 procedures and is accessible, and if it has the same name as a derived type in the same module, the
16 accessibility of the corresponding declare reduction directive is determined by the
17 accessibility of the generic name according to the base language.
Fortran
18 Restrictions
19 Restrictions to the declare reduction directive are as follows:
20 • A reduction-identifier may not be re-declared in the current scope for the same type or for a type
21 that is compatible according to the base language rules.
22 • The typename-list must not declare new types.
C / C++
23 • A type name in a declare reduction directive cannot be a function type, an array type, a
24 reference type, or a type qualified with const, volatile or restrict.
C / C++
Fortran
25 • If the length type parameter is specified for a type, it must be a constant, a colon (:) or an
26 asterisk (*).
27 • If a type with deferred or assumed length parameter is specified in a declare reduction
28 directive, no other declare reduction directive with the same type, the same kind
29 parameters and the same reduction-identifier is allowed in the same scope.
Fortran



1 Cross References
2 • OpenMP Combiner Expressions, see Section 5.5.2.1
3 • OpenMP Initializer Expressions, see Section 5.5.2.2
4 • OpenMP Reduction Identifiers, see Section 5.5.1
5 • initializer clause, see Section 5.5.4

6 5.6 scan Directive


Name: scan              Association: separating
Category: subsidiary    Properties: default

8 Separated directives
9 do, for, simd

10 Clauses
11 exclusive, inclusive

12 Clause set
13 Properties: unique, required, exclusive Members: exclusive, inclusive

14 Semantics
15 The scan directive separates the final-loop-body of an enclosing simd construct or
16 worksharing-loop construct (or a composite construct that combines them) into a structured block
17 sequence that serves as an input phase and a structured block sequence that serves as a scan phase.
18 The input phase contains all computations that update the list item in the iteration, and the scan
19 phase ensures that any statement that reads the list item uses the result of the scan computation for
20 that iteration. Thus, it specifies that a scan computation updates each list item on each logical
21 iteration of the enclosing loop nest that is associated with the separated directive.
22 If the inclusive clause is specified, the input phase includes the preceding structured block
23 sequence and the scan phase includes the following structured block sequence and, thus, the
24 directive specifies that an inclusive scan computation is performed for each list item of list. If the
25 exclusive clause is specified, the input phase excludes the preceding structured block sequence
26 and instead includes the following structured block sequence, while the scan phase includes the
27 preceding structured block sequence and, thus, the directive specifies that an exclusive scan
28 computation is performed for each list item of list.
29 The result of a scan computation for a given iteration is calculated according to the last generalized
30 prefix sum ($\mathrm{PRESUM}_{last}$) applied over the sequence of values given by the original value of the list
31 item prior to the loop and all preceding updates to the list item in the logical iteration space of the
32 loop. The operation $\mathrm{PRESUM}_{last}(op, a_1, \ldots, a_N)$ is defined for a given binary operator $op$ and a
33 sequence of $N$ values $a_1, \ldots, a_N$ as follows:
1 • if $N = 1$, $a_1$
2 • if $N > 1$, $op(\mathrm{PRESUM}_{last}(op, a_1, \ldots, a_j), \mathrm{PRESUM}_{last}(op, a_k, \ldots, a_N))$, $1 \le j + 1 = k \le N$.
3 At the beginning of the input phase of each iteration, the list item is initialized with the value of the
4 initializer expression of the reduction-identifier specified by the reduction clause on the
5 separated construct. The update value of a list item is, for a given iteration, the value of the list item
6 on completion of its input phase.
7 Let orig-val be the value of the original list item on entry to the separated construct. Let combiner
8 be the combiner expression for the reduction-identifier specified by the reduction clause on the
9 construct. Let $u_i$ be the update value of a list item for iteration $i$. For list items that appear in an
10 inclusive clause on the scan directive, at the beginning of the scan phase for iteration $i$ the list
11 item is assigned the result of the operation $\mathrm{PRESUM}_{last}(\textit{combiner}, \textit{orig-val}, u_0, \ldots, u_i)$. For list
12 items that appear in an exclusive clause on the scan directive, at the beginning of the scan
13 phase for iteration $i = 0$ the list item is assigned the value orig-val, and at the beginning of the scan
14 phase for iteration $i > 0$ the list item is assigned the result of the operation
15 $\mathrm{PRESUM}_{last}(\textit{combiner}, \textit{orig-val}, u_0, \ldots, u_{i-1})$.
16 For list items that appear in an inclusive clause, at the end of the separated construct, the
17 original list item is assigned the private copy from the last logical iteration of the loops associated
18 with the separated construct. For list items that appear in an exclusive clause, let $k$ be the last
19 logical iteration of the loops associated with the separated construct. At the end of the separated
20 construct, the original list item is assigned the result of the operation
21 $\mathrm{PRESUM}_{last}(\textit{combiner}, \textit{orig-val}, u_0, \ldots, u_k)$.
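
The two forms of the scan computation can be sketched in C as prefix sums. The function names and parameters below are hypothetical, and the arrays in and out are assumed not to overlap. With an original value of 0 and update values 1, 2, 3, 4, the inclusive form produces 1, 3, 6, 10 and the exclusive form produces 0, 1, 3, 6.

```c
#include <assert.h>

/* Inclusive prefix sum: out[i] = in[0] + ... + in[i]. */
void prefix_sum_inclusive(const int *in, int *out, int n)
{
    assert(n >= 0);
    int acc = 0;

    #pragma omp parallel for reduction(inscan, +: acc)
    for (int i = 0; i < n; ++i) {
        acc += in[i];            /* input phase: updates the list item */
        #pragma omp scan inclusive(acc)
        out[i] = acc;            /* scan phase: reads the scan result */
    }
}

/* Exclusive prefix sum: out[0] = 0 and out[i] = in[0] + ... + in[i-1]. */
void prefix_sum_exclusive(const int *in, int *out, int n)
{
    assert(n >= 0);
    int acc = 0;

    #pragma omp parallel for reduction(inscan, +: acc)
    for (int i = 0; i < n; ++i) {
        out[i] = acc;            /* scan phase: precedes the directive */
        #pragma omp scan exclusive(acc)
        acc += in[i];            /* input phase: follows the directive */
    }
}
```

In both functions the only dependences between the two structured block sequences involve acc, the list item named on the scan directive.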
22 Restrictions
23 Restrictions to the scan directive are as follows:
24 • A separated construct must have at most one scan directive as a separating directive.
25 • The loops that are associated with the directive with which the scan directive is associated must
26 all be perfectly nested.
27 • Each list item that appears in the inclusive or exclusive clause must appear in a
28 reduction clause with the inscan modifier on the separated construct.
29 • Each list item that appears in a reduction clause with the inscan modifier on the separated
30 construct must appear in a clause on the separating scan directive.
31 • Cross-iteration dependences across different logical iterations must not exist, except for
32 dependences for the list items specified in an inclusive or exclusive clause.
33 • Intra-iteration dependences from a statement in the structured block sequence that precedes a
34 scan directive to a statement in the structured block sequence that follows a scan directive
35 must not exist, except for dependences for the list items specified in an inclusive or
36 exclusive clause.
37 • The private copy of list items that appear in the inclusive or exclusive clause must not be
38 modified in the scan phase.



1 Cross References
2 • do directive, see Section 11.5.2
3 • exclusive clause, see Section 5.6.2
4 • for directive, see Section 11.5.1
5 • inclusive clause, see Section 5.6.1
6 • reduction clause, see Section 5.5.8
7 • simd directive, see Section 10.4

8 5.6.1 inclusive Clause


9 Name: inclusive Properties: unique

10 Arguments
Name    Type                               Properties
list    list of variable list item type    default

12 Directives
13 scan

14 Semantics
15 The inclusive clause is used on a separating directive that separates a structured block into two
16 structured block sequences. The clause determines the association of the structured block sequence
17 that precedes the directive on which the clause appears to a phase of that directive.
18 The list items that appear in an inclusive clause may include array sections.

19 Cross References
20 • scan directive, see Section 5.6

21 5.6.2 exclusive Clause


22 Name: exclusive Properties: unique

23 Arguments
Name    Type                               Properties
list    list of variable list item type    default

25 Directives
26 scan



1 Semantics
2 The exclusive clause is used on a separating directive that separates a structured block into two
3 structured block sequences. The clause determines the association of the structured block sequence
4 that precedes the directive on which the clause appears to a phase of that directive.
5 The list items that appear in an exclusive clause may include array sections.

6 Cross References
7 • scan directive, see Section 5.6

8 5.7 Data Copying Clauses


9 This section describes the copyin clause and the copyprivate clause. These two clauses
10 support copying data values from private or threadprivate variables of an implicit task or thread to
11 the corresponding variables of other implicit tasks or threads in the team.

12 5.7.1 copyin Clause


13 Name: copyin Properties: data copying

14 Arguments
Name    Type                               Properties
list    list of variable list item type    default

16 Directives
17 parallel

18 Semantics
19 The copyin clause provides a mechanism to copy the value of a threadprivate variable of the
20 primary thread to the threadprivate variable of each other member of the team that is executing the
21 parallel region.
C / C++
22 The copy is performed after the team is formed and prior to the execution of the associated
23 structured block. For variables of non-array type, the copy is by copy assignment. For an array of
24 elements of non-array type, each element is copied as if by assignment from an element of the array
25 of the primary thread to the corresponding element of the array of all other threads.
C / C++
C++
26 For class types, the copy assignment operator is invoked. The order in which copy assignment
27 operators for different variables of the same class type are invoked is unspecified.
C++
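
A minimal sketch in C of the copyin clause follows. The threadprivate variable seed and the function all_threads_see_seed are hypothetical names. The function checks that each thread's copy holds the value that the primary thread assigned before the parallel region.

```c
#include <assert.h>

/* Hypothetical threadprivate variable: each thread has its own copy. */
static int seed = 0;
#pragma omp threadprivate(seed)

/* Returns 1 if every thread of the team observed the primary thread's value. */
int all_threads_see_seed(int value)
{
    int ok = 1;
    seed = value;    /* assigns the copy of the encountering (primary) thread */

    /* copyin copies the primary thread's seed to every other thread's copy
       before the structured block executes. */
    #pragma omp parallel copyin(seed) reduction(&&: ok)
    {
        ok = ok && (seed == value);
    }
    return ok;
}
```

Without the copyin clause, the copies of seed in the other threads would not be guaranteed to hold value on entry to the region.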



Fortran
1 The copy is performed, as if by assignment, after the team is formed and prior to the execution of
2 the associated structured block.
3 Named variables that appear in a threadprivate common block may be specified. The whole
4 common block does not need to be specified.
5 On entry to any parallel region, each thread’s copy of a variable that is affected by a copyin
6 clause for the parallel region will acquire the type parameters, allocation, association, and
7 definition status of the copy of the primary thread, according to the following rules:
8 • If the original list item has the POINTER attribute, each copy receives the same association
9 status as that of the copy of the primary thread as if by pointer assignment.
10 • If the original list item does not have the POINTER attribute, each copy becomes defined with
11 the value of the copy of the primary thread as if by intrinsic assignment unless the list item has a
12 type-bound procedure as a defined assignment. If the original list item that does not have the
13 POINTER attribute has the allocation status of unallocated, each copy will have the same status.
14 • If the original list item is unallocated or unassociated, each copy inherits the declared type
15 parameters and the default type parameter values from the original list item.
Fortran
16 Restrictions
17 Restrictions to the copyin clause are as follows:
18 • A list item that appears in a copyin clause must be threadprivate.
C++
19 • A variable of class type (or array thereof) that appears in a copyin clause requires an
20 accessible, unambiguous copy assignment operator for the class type.
C++
Fortran
21 • A common block name that appears in a copyin clause must be declared to be a common block
22 in the same scoping unit in which the copyin clause appears.
23 • A polymorphic variable with the ALLOCATABLE attribute must not be a list item.
Fortran
24 Cross References
25 • parallel directive, see Section 10.1
26 • threadprivate directive, see Section 5.2



1 5.7.2 copyprivate Clause
2 Name: copyprivate Properties: end-clause, data copying

3 Arguments
Name    Type                               Properties
list    list of variable list item type    default

5 Directives
6 single

7 Semantics
8 The copyprivate clause provides a mechanism to use a private variable to broadcast a value
9 from the data environment of one implicit task to the data environments of the other implicit tasks
10 that belong to the parallel region. The effect of the copyprivate clause on the specified list
11 items occurs after the execution of the structured block associated with the associated construct,
12 and before any of the threads in the team have left the barrier at the end of the construct. To avoid
13 data races, concurrent reads or updates of the list item must be synchronized with the update of the
14 list item that occurs as a result of the copyprivate clause if, for example, the nowait clause is
15 used to remove the barrier.
C / C++
16 In all other implicit tasks that belong to the parallel region, each specified list item becomes defined
17 with the value of the corresponding list item in the implicit task associated with the thread that
18 executed the structured block. For variables of non-array type, the definition occurs by copy
19 assignment. For an array of elements of non-array type, each element is copied by copy assignment
20 from an element of the array in the data environment of the implicit task that is associated with the
21 thread that executed the structured block to the corresponding element of the array in the data
22 environment of the other implicit tasks.
C / C++
C++
23 For class types, a copy assignment operator is invoked. The order in which copy assignment
24 operators for different variables of class type are called is unspecified.
C++
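
The broadcast described above can be sketched in C as follows. The function broadcast_check and the variable token are hypothetical names. One thread assigns token inside a single construct, and the copyprivate clause copies that value to the private token of every other implicit task in the region.

```c
#include <assert.h>

/* Returns the number of threads whose private copy received the broadcast
   value and stores the team size in *team_size. */
int broadcast_check(int value, int *team_size)
{
    assert(team_size != 0);
    int matches = 0;
    int threads = 0;

    #pragma omp parallel reduction(+: matches, threads)
    {
        int token = -1;    /* private: declared inside the parallel region */

        #pragma omp single copyprivate(token)
        {
            token = value; /* executed by exactly one thread */
        }

        /* Past the barrier at the end of single, every token holds value. */
        matches += (token == value);
        threads += 1;
    }
    *team_size = threads;
    return matches;
}
```

The returned count is expected to equal the team size, since the copy takes effect before any thread leaves the barrier at the end of the single construct.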
Fortran
25 If a list item does not have the POINTER attribute, then in all other implicit tasks that belong to the
26 parallel region, the list item becomes defined as if by intrinsic assignment with the value of the
27 corresponding list item in the implicit task that is associated with the thread that executed the
28 structured block. If the list item has a type-bound procedure as a defined assignment, the
29 assignment is performed by the defined assignment.



1 If the list item has the POINTER attribute then in all other implicit tasks that belong to the parallel
2 region the list item receives, as if by pointer assignment, the same association status as the
3 corresponding list item in the implicit task that is associated with the thread that executed the
4 structured block.
5 The order in which any final subroutines for different variables of a finalizable type are called is
6 unspecified.
Fortran
7 Restrictions
8 Restrictions to the copyprivate clause are as follows:
9 • All list items that appear in a copyprivate clause must be either threadprivate or private in
10 the enclosing context.
C++
11 • A variable of class type (or array thereof) that appears in a copyprivate clause requires an
12 accessible, unambiguous copy assignment operator for the class type.
C++
Fortran
13 • A common block that appears in a copyprivate clause must be threadprivate.
14 • Pointers with the INTENT(IN) attribute must not appear in a copyprivate clause.
15 • Any list item with the ALLOCATABLE attribute must have the allocation status of allocated when
16 the intrinsic assignment is performed.
17 • If a list item is a polymorphic variable with the ALLOCATABLE attribute, the behavior is
18 unspecified.
Fortran
19 Cross References
20 • firstprivate clause, see Section 5.4.4
21 • private clause, see Section 5.4.3
22 • single directive, see Section 11.1

23 5.8 Data-Mapping Control


24 This section describes the available mechanisms for controlling how data are mapped to device data
25 environments. It covers implicit data-mapping attribute rules for variables referenced in target
26 constructs, explicit clauses for specifying how data should be mapped, and clauses for making
27 available variables with static lifetimes and procedures on other devices. It also describes how
28 mappers may be defined and referenced to control the mapping of data with user-defined types.



1 5.8.1 Implicit Data-Mapping Attribute Rules
2 When specified, explicit data-environment attribute clauses on target directives determine the
3 attributes for variables referenced in a target construct. Otherwise, the first matching rule from
4 the following list determines the implicit data-mapping (or data-sharing) attribute for variables
5 referenced in a target construct that do not have a predetermined data-sharing attribute
6 according to Section 5.1.1. References to structure elements or array elements are treated as
7 references to the structure or array, respectively, for the purposes of determining implicit
8 data-mapping or data-sharing attributes of variables in a target construct.
9 • If a variable appears in an enter or link clause on a declare target directive that does not have
10 a device_type clause with the nohost device-type-description then it is treated as if it had
11 appeared in a map clause with a map-type of tofrom.
12 • If a variable is the base variable of a list item in a reduction, lastprivate or linear
13 clause on a combined target construct then the list item is treated as if it had appeared in a map
14 clause with a map-type of tofrom if Section 17.2 specifies this behavior.
15 • If a variable is the base variable of a list item in an in_reduction clause on a target
16 construct then it is treated as if the list item had appeared in a map clause with a map-type of
17 tofrom and a map-type-modifier of always.
18 • If a defaultmap clause is present for the category of the variable and specifies an implicit
19 behavior other than default, the data-mapping or data-sharing attribute is determined by that
20 clause.
C++
21 • If the target construct is within a class non-static member function, and a variable is an
22 accessible data member of the object for which the non-static member function is invoked,
23 the variable is treated as if the this[:1] expression had appeared in a map clause with a
24 map-type of tofrom. Additionally, if the variable is of type pointer or reference to pointer, it is
25 also treated as if it had appeared in a map clause as a zero-length array section.
26 • If the this keyword is referenced inside a target construct within a class non-static member
27 function, it is treated as if the this[:1] expression had appeared in a map clause with a
28 map-type of tofrom.
C++
C / C++
29 • A variable that is of type pointer, but is neither a pointer to function nor (for C++) a pointer to a
30 member function, is treated as if it is the base pointer of a zero-length array section that had
31 appeared as a list item in a map clause.
C / C++



C++
1 • A variable that is of type reference to pointer, but is neither a reference to a pointer to function nor a
2 reference to a pointer to a member function is treated as if it had appeared in a map clause as a
3 zero-length array section.
C++
4 • If a variable is not a scalar then it is treated as if it had appeared in a map clause with a map-type
5 of tofrom.
Fortran
6 • If a scalar variable has the TARGET, ALLOCATABLE or POINTER attribute then it is treated as
7 if it had appeared in a map clause with a map-type of tofrom.
Fortran
8 • If the above rules do not apply then a scalar variable is not mapped but instead has an implicit
9 data-sharing attribute of firstprivate (see Section 5.1.1).
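
Two of the rules above can be sketched in C with a target construct that has no explicit data-environment clauses. The names scale_in_place, buf and factor are hypothetical. The sketch assumes that no defaultmap clause and no declare target directive applies to the variables, so the array falls under the non-scalar rule and the scalar under the final rule.

```c
#include <assert.h>

#define N 8

/* Scales the N elements of data by factor on the default device. */
void scale_in_place(double *data, double factor)
{
    assert(data != 0);
    double buf[N];           /* not a scalar: treated as map(tofrom: buf) */

    for (int i = 0; i < N; ++i)
        buf[i] = data[i];

    /* factor is a scalar with no map clause: implicitly firstprivate. */
    #pragma omp target
    for (int i = 0; i < N; ++i)
        buf[i] *= factor;

    for (int i = 0; i < N; ++i)
        data[i] = buf[i];    /* buf was copied back at the end of the region */
}
```

The sketch only reads factor inside the region; an assignment to it there would modify the private copy and would not be visible after the construct.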

10 5.8.2 Mapper Identifiers and mapper Modifiers


11 Modifiers
Name      Modifies        Type                                              Properties
mapper    locator-list    Complex, name: mapper                             unique
                          Arguments:
                            mapper-identifier  OpenMP identifier (default)

13 Clauses
14 from, map, to
15 Mapper identifiers can be used to uniquely identify the mapper used in a map or data-motion clause
16 through a mapper modifier, which is a unique, complex modifier. A declare mapper directive
17 defines a mapper identifier that can later be specified in a mapper modifier as its
18 modifier-parameter-specification. Each mapper identifier is a base-language identifier or default
19 where default is the default mapper for all types.
20 A non-structure type T has a predefined default mapper that is defined as if by the following
21 declare mapper directive:
C / C++
22 #pragma omp declare mapper(T v) map(tofrom: v)
C / C++



Fortran
1 !$omp declare mapper(T :: v) map(tofrom: v)
Fortran
2 A structure type T has a predefined default mapper that is defined as if by a declare mapper
3 directive that specifies v in a map clause with the alloc map-type and each structure element of v
4 in a map clause with the tofrom map-type.
5 A declare mapper directive that uses the default mapper identifier overrides the predefined
6 default mapper for the given type, making it the default mapper for variables of that type.

7 Cross References
8 • from clause, see Section 5.9.2
9 • map clause, see Section 5.8.3
10 • to clause, see Section 5.9.1

11 5.8.3 map Clause


Name: map
Properties: data-environment attribute, data-mapping attribute

13 Arguments
Name            Type                              Properties
locator-list    list of locator list item type    default

15 Modifiers
Name                 Modifies        Type                                                   Properties
map-type-modifier    locator-list    Keyword: always, close, present                        default
mapper               locator-list    Complex, name: mapper                                  unique
                                     Arguments:
                                       mapper-identifier  OpenMP identifier (default)
iterator             locator-list    Complex, name: iterator                                unique
                                     Arguments:
                                       iterator-specifier  OpenMP expression (repeatable)
map-type             locator-list    Keyword: alloc, delete, from, release, to, tofrom      ultimate



1 Directives
2 declare mapper, target, target data, target enter data, target exit
3 data

4 Additional information
5 The commas that separate modifiers in a map clause are optional. The specification of modifiers
6 without comma separators for the map clause has been deprecated.

7 Semantics
8 The map clause specifies how an original list item is mapped from the current task’s data
9 environment to a corresponding list item in the device data environment of the device identified by
10 the construct. If a map-type is not specified, the map-type defaults to tofrom. The map clause is
11 map-entering if the map-type is to, tofrom or alloc. The map clause is map-exiting if the
12 map-type is from, tofrom, release or delete.
13 The list items that appear in a map clause may include array sections and structure elements. A list
14 item in a map clause may reference any iterator-identifier defined in its iterator modifier. A list
15 item may appear more than once in the map clauses that are specified on the same directive.
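
A minimal sketch in C of explicit map types on array sections follows. The function saxpy_target and its parameters are hypothetical, and x and y are assumed not to overlap. The section of x uses the to map-type, which is map-entering only; the section of y uses tofrom, which is both map-entering and map-exiting.

```c
#include <assert.h>

/* Computes y[i] += a * x[i] for i in [0, n) on the default device. */
void saxpy_target(int n, float a, const float *x, float *y)
{
    assert(n >= 0);

    /* x is only read in the region, so it does not need to be copied back.
       y is read and written, so it is copied in both directions. */
    #pragma omp target map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] += a * x[i];
}
```

Omitting the map-type on the second clause would give the same behavior for y, because the map-type defaults to tofrom.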
16 If a mapper modifier is not present, the behavior is as if a mapper modifier was specified with the
17 default parameter. The map behavior of a list item in a map clause is modified by a visible
18 user-defined mapper (see Section 5.8.8) if the mapper-identifier of the mapper modifier is defined
19 for a base-language type that matches the type of the list item. Otherwise, the predefined default
20 mapper for the type of the list item applies. The effect of the mapper is to remove the list item from
21 the map clause, if the present modifier does not also appear, and to apply the clauses specified in
22 the declared mapper to the construct on which the map clause appears. In the clauses applied by the
23 mapper, references to var are replaced with references to the list item and the map-type is replaced
24 with a final map type that is determined according to the rules of map-type decay (see
25 Section 5.8.8).
26 A list item that is an array or array section of a type for which a user-defined mapper exists is
27 mapped as if the map type decays to alloc, release, or delete, and then each array element
28 is mapped with the original map type, as if by a separate construct, according to the mapper.
Fortran
29 If a component of a derived type list item is a map clause list item that results from the predefined
30 default mapper for that derived type, and if the derived type component is not an explicit list item or
31 the base expression of an explicit list item in a map clause on the construct, then:
32 • If it has the POINTER attribute, the map clause treats its association status as if it is undefined;
33 and
34 • If it has the ALLOCATABLE attribute and an allocated allocation status, and it is present in the
35 device data environment when the construct is encountered, the map clause may treat its
36 allocation status as if it is unallocated if the corresponding component does not have allocated
37 storage.



1 If a list item in a map clause is an associated pointer and the pointer is not the base pointer of
2 another list item in a map clause on the same construct, then it is treated as if its pointer target is
3 implicitly mapped in the same clause. For the purposes of the map clause, the mapped pointer
4 target is treated as if its base pointer is the associated pointer.
Fortran
5 For map clauses on map-entering constructs, if any list item has a base pointer for which a
6 corresponding pointer exists in the data environment upon entry to the region and either a new list
7 item or the corresponding pointer is created in the device data environment on entry to the region,
8 then:
C / C++
9 1. The corresponding pointer variable is assigned an address such that the corresponding list item
10 can be accessed through the pointer in a target region.
C / C++
Fortran
11 1. The corresponding pointer variable is associated with a pointer target that has the same rank and
12 bounds as the pointer target of the original pointer, such that the corresponding list item can be
13 accessed through the pointer in a target region.
Fortran
14 2. The corresponding pointer variable becomes an attached pointer for the corresponding list item.
15 3. If the original base pointer and the corresponding attached pointer share storage, then the
16 original list item and the corresponding list item must share storage.
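Non-normative illustration: a minimal sketch of pointer attachment, assuming a C compiler with or without OpenMP offloading support. The type vec_t and the function double_elements are hypothetical names introduced only for this example. The list item v.data[0:n] has v.data as its base pointer, so on entry the corresponding v.data is assigned a device address and becomes an attached pointer.

```c
#include <stddef.h>

/* Hypothetical type for illustration: a length plus a pointer to storage. */
typedef struct {
    size_t  n;
    double *data;
} vec_t;

/* Doubles every element reachable through v.data inside a target region. */
void double_elements(vec_t v)
{
    /* Only the base pointer v.data is mapped from the structure, so the
       length is copied to a scalar, which is firstprivate in the region. */
    const size_t n = v.n;

    /* v.data is the base pointer of the list item v.data[0:n]; the
       corresponding pointer is attached to the corresponding array section. */
    #pragma omp target map(tofrom: v.data[0:n])
    for (size_t i = 0; i < n; ++i)
        v.data[i] *= 2.0;
}
```

A caller would pass a vec_t whose data member points to at least n elements; after the call the caller's array holds the doubled values, whether the region executed on a device or on the host.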
C++
17 If a lambda is mapped explicitly or implicitly, variables that are captured by the lambda behave as
18 follows:
19 • The variables that are of pointer type are treated as if they had appeared in a map clause as
20 zero-length array sections; and
21 • The variables that are of reference type are treated as if they had appeared in a map clause.
22 If a member variable is captured by a lambda in class scope, and the lambda is later mapped
23 explicitly or implicitly with its full static type, the this pointer is treated as if it had appeared on a
24 map clause.
C++
25 If a map clause with a present map-type-modifier appears on a construct and on entry to the
26 region the corresponding list item is not present in the device data environment, runtime error
27 termination is performed.
28 The map clauses on a construct collectively determine the set of mappable storage blocks for that
29 construct. All map clause list items that have the same containing structure or share storage result
30 in a single mappable storage block that contains the storage of the list items. The storage for each
31 other map clause list item becomes a distinct mappable storage block.

1 For each mappable storage block that is determined by the map clauses on a map-entering
2 construct, on entry to the region the following sequence of steps occurs as if they are performed as a
3 single atomic operation:
4 1. If a corresponding storage block is not present in the device data environment then:
5 a) A corresponding storage block, which may share storage with the original storage block, is
6 created in the device data environment of the device;
7 b) The corresponding storage block receives a reference count that is initialized to zero. This
8 reference count also applies to any part of the corresponding storage block.
9 2. The reference count of the corresponding storage block is incremented by one.
10 3. For each map clause list item on the construct that is contained by the mappable storage block:
11 a) If the reference count of the corresponding storage block is one, a new list item with
12 language-specific attributes derived from the original list item is created in the
13 corresponding storage block. The reference count of the new list item is always equal to the
14 reference count of its storage.
15 b) If the reference count of the corresponding list item is one or if the always
16 map-type-modifier is specified, and if the map-type is to or tofrom, the corresponding list
17 item is updated as if the list item appeared in a to clause on a target update directive.
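Non-normative illustration: a minimal sketch of how the reference count and the always map-type-modifier interact in step 3b. The function name always_to_demo is hypothetical. The sketch is written so that the result is the same whether the target region executes on a non-host device, falls back to the host, or the directives are ignored by a compiler without OpenMP support.

```c
/* Returns the final host value of x. */
int always_to_demo(void)
{
    int x = 1;

    /* Entry: corresponding storage is created, its reference count becomes
       one, and x is copied to the device because the map type is tofrom. */
    #pragma omp target data map(tofrom: x)
    {
        x = 2;  /* host update; a separate device copy would still hold 1 */

        /* Entry: the reference count becomes two, so the copy to the
           device happens only because always is specified. */
        #pragma omp target map(always, to: x)
        {
            x += 10;
        }
    }   /* Exit: the reference count is one and the map type is tofrom,
           so the corresponding value is copied back to the host. */

    return x;
}
```

Without the always modifier the inner map clause would not refresh a separate device copy, and the result would then depend on whether the original and corresponding list items share storage.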

19 Note – If the effect of the map clauses on a construct would assign the value of an original list
20 item to a corresponding list item more than once, then an implementation is allowed to ignore
21 additional assignments of the same value to the corresponding list item.

23 In all cases on entry to the region, concurrent reads or updates of any part of the corresponding list
24 item must be synchronized with any update of the corresponding list item that occurs as a result of
25 the map clause to avoid data races.
26 The original and corresponding list items may share storage such that writes to either item by one
27 task followed by a read or write of the other item by another task without intervening
28 synchronization can result in data races. They are guaranteed to share storage if the map clause
29 appears on a target construct that corresponds to an inactive target region, or if it appears on
30 a mapping-only construct that applies to the device data environment of the host device.
31 If corresponding storage for a mappable storage block derived from map clauses on a map-exiting
32 construct is not present in the device data environment on exit from the region, the mappable
33 storage block is ignored. For each mappable storage block that is determined by the map clauses on
34 a map-exiting construct, on exit from the region the following sequence of steps occurs as if
35 performed as a single atomic operation:

1 1. For each map clause list item that is contained by the mappable storage block:
2 a) If the reference count of the corresponding list item is one or if the always
3 map-type-modifier is specified, and if the map-type is from or tofrom, the original list
4 item is updated as if the list item appeared in a from clause on a target update
5 directive.
6 2. If the map-type is not delete and the reference count of the corresponding storage block is
7 finite then the reference count is decremented by one.
8 3. If the map-type is delete and the reference count of the corresponding storage block is finite
9 then the reference count is set to zero.
10 4. If the reference count of the corresponding storage block is zero, all storage to which that
11 reference count applies is removed from the device data environment.
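Non-normative illustration: the entry and exit sequences for a single mappable storage block can be modeled in plain C as below. This is a sketch under stated assumptions, not an implementation of any runtime: block_t, map_enter and map_exit are hypothetical names, the block holds one int-valued list item, original and corresponding storage are assumed to be distinct, and infinite reference counts are not modeled.

```c
#include <stdbool.h>

typedef enum {
    MAP_ALLOC, MAP_TO, MAP_FROM, MAP_TOFROM, MAP_RELEASE, MAP_DELETE
} map_type_t;

/* Hypothetical bookkeeping for one corresponding storage block. */
typedef struct {
    bool present;       /* corresponding storage exists */
    int  ref_count;     /* finite reference count */
    int  host_value;    /* stands in for the original list item */
    int  device_value;  /* stands in for the corresponding list item */
} block_t;

/* Models the sequence performed on entry to a map-entering region. */
void map_enter(block_t *b, map_type_t type, bool always)
{
    if (!b->present) {          /* entry step 1: create storage, count is zero */
        b->present = true;
        b->ref_count = 0;
    }
    b->ref_count += 1;          /* entry step 2 */
    if ((b->ref_count == 1 || always) &&
        (type == MAP_TO || type == MAP_TOFROM))
        b->device_value = b->host_value;    /* entry step 3b */
}

/* Models the sequence performed on exit from a map-exiting region. */
void map_exit(block_t *b, map_type_t type, bool always)
{
    if (!b->present)            /* no corresponding storage: block is ignored */
        return;
    if ((b->ref_count == 1 || always) &&
        (type == MAP_FROM || type == MAP_TOFROM))
        b->host_value = b->device_value;    /* exit step 1a */
    if (type == MAP_DELETE)
        b->ref_count = 0;       /* exit step 3 */
    else
        b->ref_count -= 1;      /* exit step 2 */
    if (b->ref_count == 0)
        b->present = false;     /* exit step 4: storage is removed */
}
```

In this model a nested map of an already present block changes only the reference count, and the copy back to the host happens on the exit that finds the count equal to one.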
12 If the effect of the map clauses on a construct would assign the value of a corresponding list item to
13 an original list item more than once, then an implementation is allowed to ignore additional
14 assignments of the same value to the original list item.
15 In all cases on exit from the region, concurrent reads or updates of any part of the original list item
16 must be synchronized with any update of the original list item that occurs as a result of the map
17 clause to avoid data races.
18 If a single contiguous part of the original storage of a list item with an implicit data-mapping
19 attribute has corresponding storage in the device data environment prior to a task encountering the
20 construct on which the map clause appears, only that part of the original storage will have
21 corresponding storage in the device data environment as a result of the map clause.
22 If a list item with an implicit data-mapping attribute does not have any corresponding storage in the
23 device data environment prior to a task encountering the construct associated with the map clause,
24 and one or more contiguous parts of the original storage are either list items or base pointers to list
25 items that are explicitly mapped on the construct, only those parts of the original storage will have
26 corresponding storage in the device data environment as a result of the map clauses on the
27 construct.
C / C++
28 If a new list item is created then the new list item will have the same static type as the original list
29 item, and language-specific attributes of the new list item, including size and alignment, are
30 determined by that type.
C / C++
C++
31 If corresponding storage that differs from the original mappable storage block is created in a device
32 data environment, all new list items that are created in that corresponding storage are default
33 initialized. Default initialization for new list items of class type, including their data members, is
34 performed as if with an implicitly-declared default constructor and as if non-static data member
35 initializers are ignored.

1 If the type of a new list item is a reference to a type T then it is initialized to refer to the object in
2 the device data environment that corresponds to the object referenced by the original list item. The
3 effect is as if the object were mapped through a pointer with an array section of length one and
4 elements of type T.
C++
Fortran
5 If a new list item is created then the new list item will have the same type, type parameter, and rank
6 as the original list item. The new list item inherits all default values for the type parameters from
7 the original list item.
8 If the allocation status of an original list item that has the ALLOCATABLE attribute is changed
9 while a corresponding list item is present in the device data environment, the allocation status of the
10 corresponding list item is unspecified until the list item is again mapped with an always modifier
11 on entry to a map-entering region.
Fortran
12 The close map-type-modifier is a hint to the runtime to allocate memory close to the target device.

13 Execution Model Events


14 The target-map event occurs in a thread that executes the outermost region that corresponds to an
15 encountered device construct with a map clause, after the target-task-begin event for the device
16 construct and before any mapping operations are performed.
17 The target-data-op-begin event occurs before a thread initiates a data operation on the target device
18 that is associated with a map clause, in the outermost region that corresponds to the encountered
19 construct.
20 The target-data-op-end event occurs after a thread initiates a data operation on the target device
21 that is associated with a map clause, in the outermost region that corresponds to the encountered
22 construct.

23 Tool Callbacks
24 A thread dispatches one or more registered ompt_callback_target_map or
25 ompt_callback_target_map_emi callbacks for each occurrence of a target-map event in
26 that thread. The callback occurs in the context of the target task and has type signature
27 ompt_callback_target_map_t or ompt_callback_target_map_emi_t,
28 respectively.
29 A thread dispatches a registered ompt_callback_target_data_op_emi callback with
30 ompt_scope_begin as its endpoint argument for each occurrence of a target-data-op-begin
31 event in that thread. Similarly, a thread dispatches a registered
32 ompt_callback_target_data_op_emi callback with ompt_scope_end as its endpoint
33 argument for each occurrence of a target-data-op-end event in that thread. These callbacks have
34 type signature ompt_callback_target_data_op_emi_t.

1 A thread dispatches a registered ompt_callback_target_data_op callback for each
2 occurrence of a target-data-op-end event in that thread. The callback occurs in the context of the
3 target task and has type signature ompt_callback_target_data_op_t.

4 Restrictions
5 Restrictions to the map clause are as follows:
6 • Two list items of the map clauses on the same construct must not share original storage unless
7 they are the same list item or unless one is the containing structure of the other.
8 • If the same list item appears more than once in map clauses on the same construct, the map
9 clauses must specify the same mapper modifier.
10 • If a list item is an array section, it must specify contiguous storage.
11 • If an expression that is used to form a list item in a map clause contains an iterator identifier, the
12 list item instances that would result from different values of the iterator must not have the same
13 containing array and must not have base pointers that share original storage.
14 • If multiple list items are explicitly mapped on the same construct and have the same containing
15 array or have base pointers that share original storage, and if any of the list items do not have
16 corresponding list items that are present in the device data environment prior to a task
17 encountering the construct, then the list items must refer to the same array elements of either the
18 containing array or the implicit array of the base pointers.
19 • If any part of the original storage of a list item with an explicit data-mapping attribute has
20 corresponding storage in the device data environment prior to a task encountering the construct
21 associated with the map clause, all of the original storage must have corresponding storage in the
22 device data environment prior to the task encountering the construct.
23 • If an array appears as a list item in a map clause, multiple parts of the array have corresponding
24 storage in the device data environment prior to a task encountering the construct associated with
25 the map clause, and the corresponding storage for those parts was created by maps from more
26 than one earlier construct, the behavior is unspecified.
27 • If a list item is an element of a structure, and a different element of the structure has a
28 corresponding list item in the device data environment prior to a task encountering the construct
29 associated with the map clause, then the list item must also have a corresponding list item in the
30 device data environment prior to the task encountering the construct.
31 • A list item must have a mappable type.
32 • If a mapper modifier appears in a map clause, the type on which the specified mapper operates
33 must match the type of the list items in the clause.
34 • Memory spaces and memory allocators must not appear as a list item in a map clause.

C++
1 • If a list item has a polymorphic class type and its static type does not match its dynamic type, the
2 behavior is unspecified if the map clause is specified on a map-entering construct and a
3 corresponding list item is not present in the device data environment prior to a task encountering
4 the construct.
5 • No type mapped through a reference may contain a reference to its own type, or any references to
6 types that could produce a cycle of references.
7 • If a list item is a lambda, any pointers and references captured by the lambda must point or refer
8 to storage that has corresponding storage in the device data environment prior to the task
9 encountering the construct.
C++
C / C++
10 • A list item cannot be a variable that is a member of a structure of a union type.
11 • A bit-field cannot appear in a map clause.
12 • A pointer that has a corresponding attached pointer must not be modified for the duration of the
13 lifetime of the list item to which the corresponding pointer is attached in the device data
14 environment.
C / C++
Fortran
15 • If a list item of a map clause is an allocatable variable or is the subobject of an allocatable
16 variable, the original allocatable variable may not be allocated, deallocated or reshaped while the
17 corresponding allocatable variable has allocated storage.
18 • A pointer that has a corresponding attached pointer and is associated with a given pointer target
19 must not become associated with a different pointer target for the duration of the lifetime of the
20 list item to which the corresponding pointer is attached in the device data environment.
21 • If an array section is mapped and the size of the section is smaller than that of the whole array,
22 the behavior of referencing the whole array in the target region is unspecified.
23 • A list item must not be a whole array of an assumed-size array.
24 • A list item must not be a complex part designator.
Fortran

1 Cross References
2 • Array Sections, see Section 3.2.5
3 • ompt_callback_target_data_op_emi_t and
4 ompt_callback_target_data_op_t, see Section 19.5.2.25
5 • ompt_callback_target_map_emi_t and ompt_callback_target_map_t, see
6 Section 19.5.2.27
7 • declare mapper directive, see Section 5.8.8
8 • iterator modifier, see Section 3.2.6
9 • mapper modifier, see Section 5.8.2
10 • target data directive, see Section 13.5
11 • target directive, see Section 13.8
12 • target enter data directive, see Section 13.6
13 • target exit data directive, see Section 13.7
14 • target update directive, see Section 13.9

15 5.8.4 enter Clause


16 Name: enter    Properties: data-environment attribute, data-mapping attribute

17 Arguments
18 Name    Type                              Properties
   list    list of extended list item type   default

19 Directives
20 declare target

21 Additional information
22 The clause-name to may be used as a synonym for the clause-name enter. This use has been
23 deprecated.

24 Semantics
25 The enter clause is a data-mapping clause.
C / C++
26 If a function appears in an enter clause in the same compilation unit in which the definition of the
27 function occurs then a device-specific version of the function is created for all devices to which the
28 directive of the clause applies.

1 If a variable appears in an enter clause in the same compilation unit in which the definition of the
2 variable occurs then the original list item is allocated a corresponding list item in the device data
3 environment of all devices to which the directive of the clause applies.
C / C++
Fortran
4 If a procedure appears in an enter clause in the same compilation unit in which the definition of
5 the procedure occurs then a device-specific version of the procedure is created for all devices to
6 which the directive of the clause applies.
7 If a variable that is host associated appears in an enter clause then the original list item is
8 allocated a corresponding list item in the device data environment of all devices to which the
9 directive of the clause applies.
Fortran
10 If a variable appears in an enter clause then the corresponding list item in the device data
11 environment of each device to which the directive of the clause applies is initialized once, in the
12 manner specified by the program, but at an unspecified point in the program prior to the first
13 reference to that list item. The list item is never removed from those device data environments as if
14 its reference count was initialized to positive infinity.

15 Cross References
16 • declare target directive, see Section 7.8.1

17 5.8.5 link Clause


18 Name: link Properties: data-environment attribute

19 Arguments
20 Name    Type                              Properties
   list    list of variable list item type   default

21 Directives
22 declare target

23 Semantics
24 The link clause supports compilation of device routines that refer to variables with static storage
25 duration that appear as list items in the clause. The declare target directive on which the
26 clause appears does not map the list items. Instead, they are mapped according to the data-mapping
27 rules described in Section 5.8.

28 Cross References
29 • Data-Mapping Control, see Section 5.8
30 • declare target directive, see Section 7.8.1

C / C++

1 5.8.6 Pointer Initialization for Device Data Environments


2 This section describes how a pointer that is predetermined firstprivate for a target construct may
3 be assigned an initial value that is the address of an object that exists in a device data environment
4 and corresponds to a matching mapped list item.
5 All previously mapped list items that have corresponding storage in a given device data
6 environment constitute the set of currently mapped list items. If a currently mapped list item has a
7 base pointer, the base address of the currently mapped list item is the value of its base pointer.
8 Otherwise, the base address is determined by the following steps:
9 1. Let X refer to the currently mapped list item.
10 2. If X refers to an array section or array element, let X refer to its base array.
11 3. If X refers to a structure element, let X refer to its containing structure and return to step 2.
12 4. The base address for the currently mapped list item is the address of X.
13 Additionally, each currently mapped list item has a starting address and an ending address. The
14 starting address is the address of the first storage location associated with the list item, and the
15 ending address is the address of the storage location that immediately follows the last storage
16 location associated with the list item.
17 The mapped address range of the currently mapped list item is the range of addresses that starts
18 from the starting address and ends with the ending address. The extended address range of the
19 currently mapped list item is the range of addresses that starts from the minimum of the starting
20 address and the base address and that ends with the maximum of the ending address and the base
21 address.
22 If the value of a given pointer is in the mapped address range of a currently mapped list item then
23 that currently mapped list item is a matching mapped list item. Otherwise, if the value of the
24 pointer is in the extended address range of a currently mapped list item then that currently mapped
25 list item is a matching mapped list item.
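Non-normative illustration: the range tests above can be sketched in plain C. All names here (mapped_item_t, classify, find_match) are hypothetical, addresses are modeled as integers, and both range endpoints are treated as inclusive, which is an assumption of this sketch based on the wording "ends with the ending address".

```c
#include <stdint.h>

/* Hypothetical description of one currently mapped list item. */
typedef struct {
    uintptr_t base;   /* base address */
    uintptr_t start;  /* starting address */
    uintptr_t end;    /* ending address (just past the last storage location) */
} mapped_item_t;

typedef enum { NO_MATCH, MAPPED_RANGE, EXTENDED_RANGE } match_t;

/* Reports which range of the item, if any, contains the pointer value p. */
match_t classify(const mapped_item_t *it, uintptr_t p)
{
    if (p >= it->start && p <= it->end)
        return MAPPED_RANGE;

    /* The extended range also covers the span between the base address
       and the mapped range. */
    uintptr_t lo = it->start < it->base ? it->start : it->base;
    uintptr_t hi = it->end   > it->base ? it->end   : it->base;
    if (p >= lo && p <= hi)
        return EXTENDED_RANGE;

    return NO_MATCH;
}

/* Two-pass search: a mapped-range match is preferred over an
   extended-range match. Returns the index of the first match or -1.
   Choosing among several matches is not modeled here. */
int find_match(const mapped_item_t *items, int n, uintptr_t p)
{
    for (int want = MAPPED_RANGE; want <= EXTENDED_RANGE; ++want)
        for (int i = 0; i < n; ++i)
            if (classify(&items[i], p) == (match_t)want)
                return i;
    return -1;
}
```

For an array section such as a[10:10] of a 4-byte type whose array starts at address 1000, the item would be described by base 1000, start 1040 and end 1080, so a pointer value of 1010 matches only through the extended address range.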
26 If multiple matching mapped list items are found and they all appear as part of the same containing
27 structure, the one that has the lowest starting address is treated as the sole matching mapped list
28 item. Otherwise, if multiple matching mapped list items are found then the behavior is unspecified.
29 If a matching mapped list item is found, the initial value that is assigned to the pointer is a device
30 address such that the corresponding list item in the device data environment can be accessed
31 through the pointer in a target region.
32 If a matching mapped list item is not found, the pointer retains its original value as per the
33 firstprivate semantics described in Section 5.4.4.
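Non-normative illustration: a minimal sketch of pointer initialization, assuming a C compiler with or without OpenMP offloading support. The function name write_through_pointer is hypothetical. The pointer p is not mapped explicitly; its value lies in the mapped address range of the array a, so inside the target region it refers to the corresponding element.

```c
/* Returns the value of a[10] after a write through p in a target region. */
int write_through_pointer(void)
{
    int a[100] = {0};
    int *p = &a[10];    /* points into the storage that is mapped below */

    #pragma omp target data map(tofrom: a)
    {
        /* a is a currently mapped list item and the value of p falls in
           its mapped address range, so p is given a device address that
           refers to the corresponding a[10]. */
        #pragma omp target
        {
            p[0] = 42;
        }
    }   /* the tofrom map type copies the array back on exit */

    return a[10];
}
```

If p pointed to storage that was never mapped, it would keep its original host value in the region, and dereferencing it on a device with separate memory would not be meaningful.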

1 Cross References
2 • map clause, see Section 5.8.3
3 • requires directive, see Section 8.2
4 • target directive, see Section 13.8
C / C++

5 5.8.7 defaultmap Clause


6 Name: defaultmap Properties: unique, post-modified

7 Arguments
8 Name                Type                                     Properties
  implicit-behavior   Keyword: alloc, default, firstprivate,   default
                      from, none, present, to, tofrom

9 Modifiers
10 Name                Modifies            Type                           Properties
   variable-category   implicit-behavior   Keyword: aggregate, all,       default
                                           allocatable, pointer, scalar

11 Directives
12 target

13 Semantics
14 The defaultmap clause determines the implicit data-mapping or data-sharing attribute of certain
15 variables that are referenced in a target construct, in accordance with the rules given in
16 Section 5.8.1. The variable-category specifies the variables for which the attribute may be set, and
17 the attribute is specified by implicit-behavior. If no variable-category is specified in the clause then
18 the effect is as if all was specified for the variable-category.
C / C++
19 The scalar variable-category specifies non-pointer variables of scalar type.
C / C++
Fortran
20 The scalar variable-category specifies non-pointer and non-allocatable variables of scalar type.
21 The allocatable variable-category specifies variables with the ALLOCATABLE attribute.
Fortran

1 The pointer variable-category specifies variables of pointer type. The aggregate
2 variable-category specifies variables of aggregate type (arrays or structures). Finally, the all
3 variable-category specifies all variables.
4 If implicit-behavior is the name of a map type, the attribute is a data-mapping attribute determined
5 by an implicit map clause with the specified map type. If implicit-behavior is firstprivate,
6 the attribute is a data-sharing attribute of firstprivate. If implicit-behavior is present, the
7 attribute is a data-mapping attribute determined by an implicit map clause with the map-type of
8 alloc and map-type-modifier of present. If implicit-behavior is none then no implicit
9 data-mapping or data-sharing attributes are defined for variables in variable-category, except for
10 variables that appear in the enter or link clause of a declare target directive. If
11 implicit-behavior is default then the clause has no effect.
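Non-normative illustration: a minimal sketch of the defaultmap clause applied to the scalar variable-category, assuming a C compiler with or without OpenMP offloading support. The function name count_positive is hypothetical.

```c
/* Counts the positive elements of v[0..n-1] inside a target region. */
int count_positive(const int *v, int n)
{
    int count = 0;

    /* With OpenMP enabled and no defaultmap clause, the scalar count would
       be firstprivate in the target region and the increments would not
       reach the host variable. defaultmap(tofrom: scalar) gives scalars an
       implicit map clause with the tofrom map type instead. */
    #pragma omp target defaultmap(tofrom: scalar) map(to: v[0:n])
    {
        for (int i = 0; i < n; ++i)
            if (v[i] > 0)
                ++count;
    }

    return count;
}
```

The same effect could be obtained for this one variable with an explicit map(tofrom: count) clause; defaultmap changes the implicit behavior for every variable in the named category.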

12 Restrictions
13 Restrictions to the defaultmap clause are as follows:
14 • A given variable-category may be specified in at most one defaultmap clause on a construct.
15 • If a defaultmap clause specifies the all variable-category, no other defaultmap clause
16 may appear on the construct.
17 • If implicit-behavior is none, each variable that is specified by variable-category and is
18 referenced in the construct but does not have a predetermined data-sharing and does not appear
19 in an enter or link clause on a declare target directive must be explicitly listed in a
20 data-environment attribute clause on the construct.
C / C++
21 • The specified variable-category must not be allocatable.
C / C++
22 Cross References
23 • Implicit Data-Mapping Attribute Rules, see Section 5.8.1
24 • target directive, see Section 13.8

25 5.8.8 declare mapper Directive


26 Name: declare mapper     Association: none
   Category: declarative    Properties: default

27 Arguments
28 declare mapper(mapper-specifier)
29 Name               Type                      Properties
   mapper-specifier   OpenMP mapper specifier   default

30 Clauses
31 map

1 Semantics
2 User-defined mappers can be defined using the declare mapper directive. The mapper-specifier
3 directive argument declares the mapper using the following syntax:
C / C++
4 [ mapper-identifier : ] type var
C / C++
Fortran
5 [ mapper-identifier : ] type :: var
Fortran
6 where mapper-identifier is a mapper identifier, type is a type that is permitted in a type-name list,
7 and var is a base-language identifier.
8 The type and an optional mapper-identifier uniquely identify the mapper for use in a map clause or
9 motion clause later in the program. The visibility and accessibility of this declaration are the same
10 as those of a variable declared at the same location in the program.
11 If mapper-identifier is not specified, the behavior is as if mapper-identifier is default.
12 The variable declared by var is available for use in all map clauses on the directive, and no part of
13 the variable to be mapped is mapped by default.
14 The effect that a user-defined mapper has on either a map clause that maps a list item of the given
15 base language type or a motion clause that invokes the mapper and updates a list item of the given
16 base language type is to replace the map or update with a set of map clauses or updates derived
17 from the map clauses specified by the mapper, as described in Section 5.8.3 and Section 5.9.
18 The final map types that a mapper applies for a map clause that maps a list item of the given type
19 are determined according to the rules of map-type decay, defined according to Table 5.3. Table 5.3
20 shows the final map type that is determined by the combination of two map types, where the rows
21 represent the map type specified by the mapper and the columns represent the map type specified
22 by a map clause that invokes the mapper. For a target exit data construct that invokes a
23 mapper with a map clause that has the from map type, if a map clause in the mapper specifies an
24 alloc or to map type then the result is a release map type.

TABLE 5.3: Map-Type Decay of Map Type Combinations

          alloc   to      from              tofrom   release   delete

alloc     alloc   alloc   alloc (release)   alloc    release   delete
to        alloc   to      alloc (release)   to       release   delete
from      alloc   alloc   from              from     release   delete
tofrom    alloc   to      from              tofrom   release   delete

25 A list item in a map clause that appears on a declare mapper directive may include array
26 sections.
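Non-normative illustration: Table 5.3 and the target exit data exception can be expressed as a lookup in plain C. This is a sketch only; map_type_t, decay_table and decay are hypothetical names, and the first argument is assumed to be one of the four map types that a mapper may specify.

```c
typedef enum {
    MT_ALLOC, MT_TO, MT_FROM, MT_TOFROM, MT_RELEASE, MT_DELETE
} map_type_t;

/* Rows: map type specified by the mapper (alloc, to, from, tofrom).
   Columns: map type specified by the map clause that invokes the mapper. */
static const map_type_t decay_table[4][6] = {
    /* alloc  */ { MT_ALLOC, MT_ALLOC, MT_ALLOC, MT_ALLOC,  MT_RELEASE, MT_DELETE },
    /* to     */ { MT_ALLOC, MT_TO,    MT_ALLOC, MT_TO,     MT_RELEASE, MT_DELETE },
    /* from   */ { MT_ALLOC, MT_ALLOC, MT_FROM,  MT_FROM,   MT_RELEASE, MT_DELETE },
    /* tofrom */ { MT_ALLOC, MT_TO,    MT_FROM,  MT_TOFROM, MT_RELEASE, MT_DELETE },
};

/* Returns the final map type. in_mapper must not be MT_RELEASE or
   MT_DELETE. on_target_exit_data is nonzero when the invoking map clause
   appears on a target exit data construct. */
map_type_t decay(map_type_t in_mapper, map_type_t on_clause,
                 int on_target_exit_data)
{
    /* The parenthesized (release) entries of the table. */
    if (on_target_exit_data && on_clause == MT_FROM &&
        (in_mapper == MT_ALLOC || in_mapper == MT_TO))
        return MT_RELEASE;

    return decay_table[in_mapper][on_clause];
}
```

For example, a mapper clause with the to map type that is invoked by a from map clause decays to alloc, or to release when the invoking construct is target exit data.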

1 All map clauses that are introduced by a mapper are further subject to mappers that are in scope,
2 except a map clause with list item var maps var without invoking a mapper.
C++
3 The declare mapper directive can also appear at locations in the program at which a static data
4 member could be declared. In this case, the visibility and accessibility of the declaration are the
5 same as those of a static data member declared at the same location in the program.
C++
6 Restrictions
7 Restrictions to the declare mapper directive are as follows:
8 • No instance of type can be mapped as part of the mapper, either directly or indirectly through
9 another base language type, except the instance var that is passed as the list item. If a set of
10 declare mapper directives results in a cyclic definition then the behavior is unspecified.
11 • The type must not declare a new base language type.
12 • At least one map clause that maps var or at least one element of var is required.
13 • List items in map clauses on the declare mapper directive may only refer to the declared
14 variable var and entities that could be referenced by a procedure defined at the same location.
15 • Neither the release nor the delete map-type may be specified on any map clause.
16 • If a mapper-modifier is specified for a map clause, its parameter must be default.
17 • Multiple declare mapper directives that specify the same mapper-identifier for the same
18 base language type or for compatible base language types, according to the base language rules,
19 may not appear in the same scope.
C
20 • type must be a struct or union type.
C
C++
21 • type must be a struct, union, or class type.
C++
Fortran
22 • type must not be an intrinsic type or an abstract type.
Fortran
23 Cross References
24 • map clause, see Section 5.8.3

1 5.9 Data-Motion Clauses
2 Data-motion clauses specify data movement among the devices of a device set that is specified by the construct
3 on which they appear. One member of that device set is always the encountering device, which is
4 the device on which the encountering task for that construct executes. How the other devices, which
5 are the targeted devices, are determined is defined by the construct specification. Each data-motion
6 clause specifies the direction of the data movement relative to the targeted devices.
7 A data-motion clause specifies an OpenMP locator list as its argument. A corresponding list item
8 and an original list item exist for each list item. If the corresponding list item is not present in the
9 device data environment then no assignment occurs between the corresponding and original list
10 items. Otherwise, each corresponding list item in the device data environment has an original list
11 item in the data environment of the encountering task. Assignment is performed to either the
12 original or corresponding list item as specified with the specific data-motion clauses. A list item
13 may reference any iterator-identifier defined in the iterator modifier of its clause. The list items may include
14 array sections with stride expressions.
C / C++
15 The list items may use shape-operators.
C / C++
16 If a list item is an array or array section then it is treated as if it is replaced by each of its array
17 elements in the clause.
18 If the mapper modifier is not specified, the behavior is as if the modifier was specified with the
19 default mapper-identifier. The effect of a data-motion clause on a list item is modified by a
20 visible user-defined mapper if mapper-identifier is specified for a type that matches the type of the
21 list item. Otherwise, the predefined default mapper for the type of the list item applies. Each list
22 item is replaced with the list items that the given mapper specifies are to be mapped with a map
23 type that is compatible with the data movement direction associated with the clause.
24 If a present expectation is specified and the corresponding list item is not present in the device
25 data environment then runtime error termination is performed. For a list item that is replaced with a
26 set of list items as a result of a user-defined mapper, the expectation only applies to those mapper
27 list items that share storage with the original list item.
Fortran
28 If a list item or a subobject of a list item has the ALLOCATABLE attribute, its assignment is
29 performed only if its allocation status is allocated and only with respect to the allocated storage. If a
30 list item has the POINTER attribute and its association status is associated, the effect is as if the
31 assignment is performed with respect to the pointer target.

1 On exit from the associated region, if the corresponding list item is an attached pointer, the original
2 list item, if associated, will be associated with the same pointer target with which it was associated
3 on entry to the region and the corresponding list item, if associated, will be associated with the
4 same pointer target with which it was associated on entry to the region.
Fortran
C / C++
5 On exit from the associated region, if the corresponding list item is an attached pointer, the original
6 list item will have the value it had on entry to the region and the corresponding list item will have
7 the value it had on entry to the region.
C / C++
8 For each list item that is not an attached pointer, the assigned list item is assigned the
9 value of the other list item. To avoid data races, concurrent reads or updates of the assigned list
10 item must be synchronized with the update of an assigned list item that occurs as a result of a
11 data-motion clause.

12 Restrictions
13 Restrictions to data-motion clauses are as follows:
14 • Each list item in the clause must have a mappable type.

15 Cross References
16 • Array Sections, see Section 3.2.5
17 • Array Shaping, see Section 3.2.4
18 • declare mapper directive, see Section 5.8.8
19 • device clause, see Section 13.2
20 • from clause, see Section 5.9.2
21 • iterator modifier, see Section 3.2.6
22 • target update directive, see Section 13.9
23 • to clause, see Section 5.9.1

24 5.9.1 to Clause
25 Name: to Properties: data-motion attribute

26 Arguments
Name            Type                              Properties
locator-list    list of locator list item type    default

1 Modifiers
Name           Modifies        Type                                 Properties
expectation    Generic         Keyword: present                     default
mapper         locator-list    Complex, name: mapper                unique
                               Arguments: mapper-identifier,
                               OpenMP identifier (default)
iterator       locator-list    Complex, name: iterator              unique
                               Arguments: iterator-specifier,
                               OpenMP expression (repeatable)

3 Directives
4 target update

5 Semantics
6 The to clause is a data-motion clause that specifies movement to the targeted devices from the
7 encountering device. Consequently, the corresponding list items are the assigned list items and the
8 compatible map types are to and tofrom.
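
As an illustration of the to clause on a target update directive, the following C sketch refreshes the device copy of an array after the host has changed it. The function name refreshed_sum and the reduction kernel are assumptions made for this example only and do not come from the specification. The code is written so that it gives the same result when the directives are ignored or when the target region executes on the host.

```c
/* Hypothetical helper: adds 1.0 to each element on the host, pushes the new
 * values to the device with "target update to", then sums them on the device. */
double refreshed_sum(double *a, int n)
{
    double total = 0.0;

    /* Create corresponding list items for a[0:n] in the device data environment. */
    #pragma omp target data map(to: a[0:n])
    {
        /* Host-side change: the corresponding (device) items are now stale. */
        for (int i = 0; i < n; i++)
            a[i] += 1.0;

        /* Movement from the encountering device to the target device:
         * the corresponding list items are the assigned list items. */
        #pragma omp target update to(a[0:n])

        #pragma omp target teams distribute parallel for \
                map(tofrom: total) reduction(+: total)
        for (int i = 0; i < n; i++)
            total += a[i];
    }
    return total;
}
```

Without the target update directive, a device kernel could still observe the values that were copied when the target data region was entered.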

9 Cross References
10 • iterator modifier, see Section 3.2.6
11 • target update directive, see Section 13.9

12 5.9.2 from Clause


13 Name: from Properties: data-motion attribute

14 Arguments
Name            Type                              Properties
locator-list    list of locator list item type    default

1 Modifiers
Name           Modifies        Type                                 Properties
expectation    Generic         Keyword: present                     default
mapper         locator-list    Complex, name: mapper                unique
                               Arguments: mapper-identifier,
                               OpenMP identifier (default)
iterator       locator-list    Complex, name: iterator              unique
                               Arguments: iterator-specifier,
                               OpenMP expression (repeatable)

3 Directives
4 target update

5 Semantics
6 The from clause is a data-motion clause that specifies movement from the targeted devices to the
7 encountering device. Consequently, the original list items are the assigned list items and the
8 compatible map types are from and tofrom.
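
The opposite direction can be sketched in the same way. In the following C example, which is illustrative only, the array is mapped with map(to: ...), so nothing is copied back when the target data region ends; the target update directive with a from clause is what assigns the original list items. The function name square_on_device is a hypothetical choice for this sketch.

```c
/* Hypothetical helper: squares each element on the device and copies the
 * results back to the host with "target update from". */
void square_on_device(double *a, int n)
{
    /* map(to: ...) does not copy data back at the end of the region, so the
     * explicit update below is what brings the results to the host. */
    #pragma omp target data map(to: a[0:n])
    {
        #pragma omp target teams distribute parallel for
        for (int i = 0; i < n; i++)
            a[i] = a[i] * a[i];

        /* Movement from the target device to the encountering device:
         * the original list items are the assigned list items. */
        #pragma omp target update from(a[0:n])
    }
}
```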

9 Cross References
10 • iterator modifier, see Section 3.2.6
11 • target update directive, see Section 13.9

12 5.10 uniform Clause


13 Name: uniform Properties: data-environment attribute

14 Arguments
Name              Type                                Properties
parameter-list    list of parameter list item type    default

16 Directives
17 declare simd

18 Semantics
19 The uniform clause declares one or more arguments to have an invariant value for all concurrent
20 invocations of the function in the execution of a single SIMD loop.
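
A minimal C sketch of the uniform clause follows. The function names scale_one and scale_all are assumptions for illustration. The scale argument has the same value in every concurrent invocation made from one SIMD loop, so it is listed in uniform, while x differs per lane.

```c
/* The second argument is invariant across all SIMD lanes, so it is declared
 * uniform; x varies from lane to lane. */
#pragma omp declare simd uniform(scale) notinbranch
float scale_one(float x, float scale)
{
    return x * scale;
}

/* Hypothetical driver: applies scale_one to n elements inside a SIMD loop. */
void scale_all(float *out, const float *in, float scale, int n)
{
    #pragma omp simd
    for (int i = 0; i < n; i++)
        out[i] = scale_one(in[i], scale);
}
```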

21 Cross References
22 • declare simd directive, see Section 7.7

1 5.11 aligned Clause
Name: aligned    Properties: data-environment attribute, post-modified

3 Arguments
Name    Type                               Properties
list    list of variable list item type    default

5 Modifiers
Name         Modifies    Type                         Properties
alignment    list        OpenMP integer expression    positive, region invariant, ultimate, unique

7 Directives
8 declare simd, simd

9 Semantics
C / C++
10 The aligned clause declares that the object to which each list item points is aligned to the
11 number of bytes expressed in alignment.
C / C++
Fortran
12 The aligned clause declares that the target of each list item is aligned to the number of bytes
13 expressed in alignment.
Fortran
14 The alignment modifier specifies the alignment that the program guarantees for the list items. If
15 the alignment modifier is not specified, implementation-defined default alignments for SIMD
16 instructions on the target platforms are assumed.
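
The following C sketch shows one way a program might keep the promise that the aligned clause makes. The function names and the use of C11 aligned_alloc to obtain 64-byte-aligned storage are assumptions for this example. The clause itself does not align anything; it only states what the program already ensures.

```c
#include <stdlib.h>

/* Sums n floats; the caller promises that x points to 64-byte-aligned storage. */
float sum_aligned(const float *x, int n)
{
    float s = 0.0f;
    #pragma omp simd aligned(x: 64) reduction(+: s)
    for (int i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Hypothetical demo: obtains 64-byte-aligned storage with C11 aligned_alloc
 * (the size, 64 floats = 256 bytes, is a multiple of the alignment). */
float sum_of_ones(void)
{
    enum { N = 64 };
    float *x = aligned_alloc(64, N * sizeof *x);
    if (x == NULL)
        return -1.0f;
    for (int i = 0; i < N; i++)
        x[i] = 1.0f;
    float s = sum_aligned(x, N);
    free(x);
    return s;
}
```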

17 Restrictions
18 Restrictions to the aligned clause are as follows:
C
19 • The type of list items must be array or pointer.
C
C++
20 • The type of list items must be array, pointer, reference to array, or reference to pointer.
C++

Fortran
1 • Each list item must have C_PTR or Cray pointer type or have the POINTER or ALLOCATABLE
2 attribute. Cray pointer support has been deprecated.
3 • If a list item has the ALLOCATABLE attribute, the allocation status must be allocated.
4 • If a list item has the POINTER attribute, the association status must be associated.
5 • If the type of a list item is either C_PTR or Cray pointer, it must be defined. Cray pointer support
6 has been deprecated.
Fortran
7 Cross References
8 • declare simd directive, see Section 7.7
9 • simd directive, see Section 10.4

1 6 Memory Management
2 This chapter defines directives, clauses and related concepts for managing memory used by
3 OpenMP programs.

4 6.1 Memory Spaces


5 OpenMP memory spaces represent storage resources where variables can be stored and retrieved.
6 Table 6.1 shows the list of predefined memory spaces. The selection of a given memory space
7 expresses an intent to use storage with certain traits for the allocations. The actual storage resources
8 that each memory space represents are implementation defined.

TABLE 6.1: Predefined Memory Spaces

Memory space name          Storage selection intent
omp_default_mem_space      Represents the system default storage
omp_large_cap_mem_space    Represents storage with large capacity
omp_const_mem_space        Represents storage optimized for variables with constant values
omp_high_bw_mem_space      Represents storage with high bandwidth
omp_low_lat_mem_space      Represents storage with low latency
9 Variables allocated in the omp_const_mem_space memory space may be initialized through
10 the firstprivate clause or with compile time constants for static and constant variables.
11 Implementation-defined mechanisms to provide the constant value of these variables may also be
12 supported.

13 Restrictions
14 Restrictions to OpenMP memory spaces are as follows:
15 • Variables in the omp_const_mem_space memory space may not be written.

1 6.2 Memory Allocators
2 OpenMP memory allocators can be used by a program to make allocation requests. When a
3 memory allocator receives a request to allocate storage of a certain size, an allocation of logically
4 consecutive memory in the resources of its associated memory space of at least the size that was
5 requested will be returned if possible. This allocation will not overlap with any other existing
6 allocation from an OpenMP memory allocator.
7 The behavior of the allocation process can be affected by the allocator traits that the user specifies.
8 Table 6.2 shows the allowed allocator traits, their possible values and the default value of each trait.

TABLE 6.2: Allocator Traits

Allocator trait    Allowed values                                     Default value
sync_hint          contended, uncontended, serialized, private        contended
alignment          Positive integer powers of 2                       1 byte
access             all, cgroup, pteam, thread                         all
pool_size          Any positive integer                               Implementation defined
fallback           default_mem_fb, null_fb, abort_fb, allocator_fb    default_mem_fb
fb_data            an allocator handle                                (none)
pinned             true, false                                        false
partition          environment, nearest, blocked, interleaved         environment
9 The sync_hint trait describes the expected manner in which multiple threads may use the
10 allocator. The values and their descriptions are:
11 • contended: high contention is expected on the allocator; that is, many threads are expected to
12 request allocations simultaneously;
13 • uncontended: low contention is expected on the allocator; that is, few threads are expected to
14 request allocations simultaneously;
15 • serialized: one thread at a time will request allocations with the allocator. Requesting two
16 allocations simultaneously when specifying serialized results in unspecified behavior; and
17 • private: the same thread will request allocations with the allocator every time. Requesting an
18 allocation from different threads, simultaneously or not, when specifying private results in
19 unspecified behavior.

1 Allocated memory will be byte aligned to at least the value specified for the alignment trait of
2 the allocator. Some directives and API routines can specify additional requirements on alignment
3 beyond those described in this section.
4 Memory allocated by allocators with the access trait defined to be all must be accessible by all
5 threads in the device where the allocation was requested. Memory allocated by allocators with the
6 access trait defined to be cgroup will be memory accessible by all threads in the same
7 contention group as the thread that requested the allocation; attempts to access it by threads that are
8 not part of the same contention group as the allocating thread result in unspecified behavior.
9 Memory allocated by allocators with the access trait defined to be pteam will be memory
10 accessible by all threads that bind to the same parallel region of the thread that requested the
11 allocation; attempts to access it by threads that do not bind to the same parallel region as the
12 allocating thread result in unspecified behavior. Memory allocated by allocators with the access
13 trait defined to be thread will be memory accessible by the thread that requested the allocation;
14 attempts to access it by threads other than the allocating thread result in unspecified behavior.
15 The total amount of storage in bytes that an allocator can use is limited by the pool_size trait.
16 For allocators with the access trait defined to be all, this limit refers to allocations from all
17 threads that access the allocator. For allocators with the access trait defined to be cgroup, this
18 limit refers to allocations from threads that access the allocator from the same contention group. For
19 allocators with the access trait defined to be pteam, this limit refers to allocations from threads
20 that access the allocator from the same parallel team. For allocators with the access trait defined
21 to be thread, this limit refers to allocations from each thread that accesses the allocator. Requests
22 that would result in using more storage than pool_size will not be fulfilled by the allocator.
23 The fallback trait specifies how the allocator behaves when it cannot fulfill an allocation
24 request. If the fallback trait is set to null_fb, the allocator returns the value zero if it fails to
25 allocate the memory. If the fallback trait is set to abort_fb, the behavior is as if an error
26 directive for which sev-level is fatal and action-time is execution is encountered if the
27 allocation fails. If the fallback trait is set to allocator_fb then when an allocation fails the
28 request will be delegated to the allocator specified in the fb_data trait. If the fallback trait is
29 set to default_mem_fb then when an allocation fails another allocation will be tried in
30 omp_default_mem_space, which assumes all allocator traits to be set to their default values
31 except for the fallback trait, which will be set to null_fb.
32 Allocators with the pinned trait defined to be true ensure that their allocations remain in the
33 same storage resource at the same location for their entire lifetime.
34 The partition trait describes the partitioning of allocated memory over the storage resources
35 represented by the memory space associated with the allocator. The partitioning will be done in
36 parts with a minimum size that is implementation defined. The values are:
37 • environment: the placement of allocated memory is determined by the execution
38 environment;
39 • nearest: allocated memory is placed in the storage resource that is nearest to the thread that
40 requests the allocation;

1 • blocked: allocated memory is partitioned into parts of approximately the same size with at
2 most one part per storage resource; and
3 • interleaved: allocated memory parts are distributed in a round-robin fashion across the
4 storage resources.
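
To make the traits concrete, the following C sketch creates an allocator with a non-default alignment trait and a null_fb fallback, using the omp_init_allocator, omp_alloc, omp_free and omp_destroy_allocator routines. The function name aligned_buffer_demo, the buffer size and the serial fallback branch (taken when the code is compiled without OpenMP) are assumptions for this example.

```c
#include <stdint.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical demo: returns 1 if a 64-byte-aligned buffer could be obtained
 * and used, 0 otherwise. */
int aligned_buffer_demo(void)
{
    enum { N = 128 };
    int ok = 0;

#ifdef _OPENMP
    /* Request 64-byte alignment and a zero return value on failure instead
     * of the default default_mem_fb fallback. */
    omp_alloctrait_t traits[] = {
        { omp_atk_alignment, 64 },
        { omp_atk_fallback,  omp_atv_null_fb }
    };
    omp_allocator_handle_t al =
        omp_init_allocator(omp_default_mem_space, 2, traits);
    if (al == omp_null_allocator)
        return 0;
    double *buf = omp_alloc(N * sizeof *buf, al);
#else
    /* Build without OpenMP: C11 aligned_alloc stands in for the allocator. */
    double *buf = aligned_alloc(64, N * sizeof *buf);
#endif

    if (buf != NULL) {
        for (int i = 0; i < N; i++)
            buf[i] = (double)i;
        ok = ((uintptr_t)buf % 64 == 0) && (buf[N - 1] == (double)(N - 1));
    }

#ifdef _OPENMP
    omp_free(buf, al);
    omp_destroy_allocator(al);
#else
    free(buf);
#endif
    return ok;
}
```

Because the fallback trait is null_fb, the caller has to check the returned pointer; with the default default_mem_fb value a failed request would instead be retried in omp_default_mem_space.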
5 Table 6.3 shows the list of predefined memory allocators and their associated memory spaces. The
6 predefined memory allocators have default values for their allocator traits unless otherwise
7 specified.

TABLE 6.3: Predefined Allocators

Allocator name             Associated memory space    Non-default trait values
omp_default_mem_alloc      omp_default_mem_space      fallback:null_fb
omp_large_cap_mem_alloc    omp_large_cap_mem_space    (none)
omp_const_mem_alloc        omp_const_mem_space        (none)
omp_high_bw_mem_alloc      omp_high_bw_mem_space      (none)
omp_low_lat_mem_alloc      omp_low_lat_mem_space      (none)
omp_cgroup_mem_alloc       Implementation defined     access:cgroup
omp_pteam_mem_alloc        Implementation defined     access:pteam
omp_thread_mem_alloc       Implementation defined     access:thread
Fortran
8 If any operation of the base language causes a reallocation of a variable that is allocated with a
9 memory allocator then that memory allocator will be used to deallocate the current memory and to
10 allocate the new memory. For allocated allocatable components of such variables, the allocator that
11 will be used for the deallocation and allocation is unspecified.
Fortran

12 6.3 align Clause


13 Name: align Properties: unique

14 Arguments
Name         Type                           Properties
alignment    expression of integer type     constant, positive

16 Directives
17 allocate

1 Semantics
2 The align clause is used to specify the byte alignment to use for allocations associated with the
3 construct on which the clause appears. Specifically, each allocation is byte aligned to at least the
4 maximum of the value to which alignment evaluates, the alignment trait of the allocator being
5 used for the allocation, and the alignment required by the base language for the type of the variable
6 that is allocated. On constructs on which the clause may appear, if it is not specified then the effect
7 is as if it was specified with the alignment trait of the allocator being used for the allocation.
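
The rule above amounts to taking a maximum of three values. The following C function is a toy model of that arithmetic, not an OpenMP routine; its name and the convention of passing 0 for an absent align clause are assumptions made for this sketch.

```c
#include <stddef.h>

/* Toy model, not an OpenMP API: returns the minimum byte alignment that the
 * align clause semantics guarantee. Pass align_clause = 0 when no align
 * clause is specified; the allocator's alignment trait is then used in its
 * place. */
size_t effective_alignment(size_t align_clause, size_t allocator_trait,
                           size_t type_alignment)
{
    size_t a = (align_clause != 0) ? align_clause : allocator_trait;
    if (allocator_trait > a)
        a = allocator_trait;
    if (type_alignment > a)
        a = type_alignment;
    return a;
}
```

For example, align(64) with an allocator whose alignment trait is 16 and a variable type that requires 8-byte alignment yields at least 64-byte alignment; with no align clause the same allocation is aligned to at least 16 bytes.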

8 Restrictions
9 Restrictions to the align clause are as follows:
10 • alignment must evaluate to a power of two.

11 Cross References
12 • Memory Allocators, see Section 6.2
13 • allocate directive, see Section 6.5

14 6.4 allocator Clause


15 Name: allocator Properties: unique

16 Arguments
Name         Type                                    Properties
allocator    expression of allocator_handle type     default

18 Directives
19 allocate

20 Semantics
21 The allocator clause specifies the memory allocator to be used for allocations associated with
22 the construct on which the clause appears. Specifically, the allocator to which allocator evaluates is
23 used for the allocations. On constructs on which the clause may appear, if it is not specified then the
24 effect is as if it was specified with the value of the def-allocator-var ICV.

25 Cross References
26 • Memory Allocators, see Section 6.2
27 • allocate directive, see Section 6.5
28 • def-allocator-var ICV, see Table 2.1

1 6.5 allocate Directive
Name: allocate           Association: none
Category: declarative    Properties: default

3 Arguments
Name    Type                               Properties
list    list of variable list item type    default

5 Clauses
6 align, allocator

7 Semantics
8 The storage for each list item that appears in the allocate directive is provided an allocation
9 through the memory allocator as determined by the allocator clause with an alignment as
10 determined by the align clause. The scope of this allocation is that of the list item in the base
11 language. At the end of the scope for a given list item the memory allocator used to allocate that list
12 item deallocates the storage.
13 For allocations that arise from this directive the null_fb value of the fallback allocator trait
14 behaves as if abort_fb had been specified.
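
A minimal C sketch of the allocate directive applied to a local variable follows. The function name scratch_sum and the array size are assumptions for illustration, and compiler support for this directive varies, so the example should be read as a sketch of the syntax rather than as portable code. When the directive is ignored, the array is an ordinary automatic variable and the result is unchanged.

```c
#ifdef _OPENMP
#include <omp.h>   /* declares omp_default_mem_alloc */
#endif

/* Hypothetical helper: fills a local scratch array and returns its sum. */
int scratch_sum(void)
{
    int scratch[16];
    /* The directive is in the same scope as the declaration and follows it.
     * Storage for scratch is requested from the named allocator with at
     * least 64-byte alignment and is released when scratch goes out of
     * scope. */
    #pragma omp allocate(scratch) allocator(omp_default_mem_alloc) align(64)

    int total = 0;
    for (int i = 0; i < 16; i++)
        scratch[i] = i;
    for (int i = 0; i < 16; i++)
        total += scratch[i];
    return total;
}
```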

15 Restrictions
16 Restrictions to the allocate directive are as follows:
17 • A variable that is part of another variable (as an array element or a structure element) cannot
18 appear in an allocate directive.
19 • An allocate directive must appear in the same scope as the declarations of each of its list
20 items and must follow all such declarations.
21 • A declared variable may appear as a list item in at most one allocate directive in a given
22 compilation unit.
23 • allocate directives that appear in a target region must specify an allocator clause
24 unless a requires directive with the dynamic_allocators clause is present in the same
25 compilation unit.
C / C++
26 • If a list item has static storage duration, the allocator clause must be specified and the
27 allocator expression in the clause must be a constant expression that evaluates to one of the
28 predefined memory allocator values.
29 • A variable that is declared in a namespace or global scope may only appear as a list item in an
30 allocate directive if an allocate directive that lists the variable follows a declaration that
31 defines the variable and if all allocate directives that list it specify the same allocator.
C / C++

C
1 • After a list item has been allocated, the scope that contains the allocate directive must not
2 end abnormally, such as through a call to the longjmp function.
C
C++
3 • After a list item has been allocated, the scope that contains the allocate directive must not end
4 abnormally, such as through a call to the longjmp function, other than through C++ exceptions.
5 • A variable that has a reference type may not appear as a list item in an allocate directive.
C++
Fortran
6 • A list item that is specified in an allocate directive must not have the ALLOCATABLE or
7 POINTER attribute.
8 • If a list item has the SAVE attribute, either explicitly or implicitly, or is a common block name
9 then the allocator clause must be specified and only predefined memory allocator
10 parameters can be used in the clause.
11 • A variable that is part of a common block may not be specified as a list item in an allocate
12 directive, except implicitly via the named common block.
13 • A named common block may appear as a list item in at most one allocate directive in a given
14 compilation unit.
15 • If a named common block appears as a list item in an allocate directive, it must appear as a
16 list item in an allocate directive that specifies the same allocator in every compilation unit in
17 which the common block is used.
18 • An associate name may not appear as a list item in an allocate directive.
Fortran
19 Cross References
20 • Memory Allocators, see Section 6.2
21 • align clause, see Section 6.3
22 • allocator clause, see Section 6.4

1 6.6 allocate Clause
2 Name: allocate Properties: default

3 Arguments
Name    Type                               Properties
list    list of variable list item type    default

5 Modifiers
Name                          Modifies    Type                                          Properties
allocator-simple-modifier     list        expression of OpenMP allocator_handle type    exclusive, unique
allocator-complex-modifier    list        Complex, name: allocator                      unique
                                          Arguments: allocator, expression of
                                          allocator_handle type (default)
align-modifier                list        Complex, name: align                          unique
                                          Arguments: alignment, expression of
                                          integer type (constant, positive)

7 Directives
8 allocators, distribute, do, for, parallel, scope, sections, single, target,
9 task, taskgroup, taskloop, teams

10 Semantics
11 The allocate clause specifies the memory allocator to be used to obtain storage for a list of
12 variables. If a list item in the clause also appears in a data-sharing attribute clause on the same
13 directive that privatizes the list item, allocations that arise from that list item in the clause will be
14 provided by the memory allocator. If the allocator-simple-modifier is specified, the behavior is as if
15 the allocator-complex-modifier is instead specified with allocator-simple-modifier as its allocator
16 argument. The allocator-complex-modifier and align-modifier have the same syntax and semantics
17 for the allocate clause as the allocator and align clauses have for the allocate
18 directive.
19 For allocations that arise from this clause the null_fb value of the fallback allocator trait behaves
20 as if abort_fb had been specified.
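
The following C sketch pairs an allocate clause with the private clause that privatizes the same list item. The function name sum_of_squares and the choice of omp_default_mem_alloc are assumptions for illustration. The simple-modifier spelling is used in the directive; the equivalent complex-modifier form is shown in a comment.

```c
#ifdef _OPENMP
#include <omp.h>   /* declares omp_default_mem_alloc */
#endif

/* Hypothetical helper: returns 0*0 + 1*1 + ... + (n-1)*(n-1). */
int sum_of_squares(int n)
{
    int total = 0;
    int sq = 0;

    /* sq is privatized on this directive, so it may also appear in the
     * allocate clause; storage for each private copy is requested from the
     * named allocator. The complex-modifier spelling would be
     * allocate(allocator(omp_default_mem_alloc): sq). */
    #pragma omp parallel for private(sq) \
            allocate(omp_default_mem_alloc: sq) reduction(+: total)
    for (int i = 0; i < n; i++) {
        sq = i * i;
        total += sq;
    }
    return total;
}
```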

1 Restrictions
2 Restrictions to the allocate clause are as follows:
3 • For any list item that is specified in the allocate clause on a directive other than the
4 allocators directive, a data-sharing attribute clause that may create a private copy of that list
5 item must be specified on the same directive.
6 • For task, taskloop or target directives, allocation requests to memory allocators with the
7 trait access set to thread result in unspecified behavior.
8 • allocate clauses that appear on a target construct or on constructs in a target region
9 must specify an allocator-simple-modifier or allocator-complex-modifier unless a requires
10 directive with the dynamic_allocators clause is present in the same compilation unit.

11 Cross References
12 • Memory Allocators, see Section 6.2
13 • align clause, see Section 6.3
14 • allocator clause, see Section 6.4
15 • allocators directive, see Section 6.7
16 • distribute directive, see Section 11.6
17 • do directive, see Section 11.5.2
18 • for directive, see Section 11.5.1
19 • parallel directive, see Section 10.1
20 • scope directive, see Section 11.2
21 • sections directive, see Section 11.3
22 • single directive, see Section 11.1
23 • target directive, see Section 13.8
24 • task directive, see Section 12.5
25 • taskgroup directive, see Section 15.4
26 • taskloop directive, see Section 12.6
27 • teams directive, see Section 10.2

Fortran

1 6.7 allocators Construct


Name: allocators        Association: block (allocator structured block)
Category: executable    Properties: default

3 Clauses
4 allocate

5 Additional information
6 The allocators construct may alternatively be expressed as one or more allocate directives
7 that precede the allocator structured block. The syntax of these directives is as described in
8 Section 6.5, except that the list directive argument is optional. If a list argument is not specified, the
9 effect is as if there is an implicit list consisting of the names of each variable to be allocated in the
10 associated allocate-stmt that is not explicitly listed in another allocate directive associated with
11 the statement. allocate directives are semantically equivalent to an allocators directive that
12 specifies OpenMP allocators and the variables to which they apply in one or more allocate
13 clauses, and restricted uses of the allocators directive imply that equivalent uses of
14 allocate directives are also restricted. If the allocate directive is used, an allocator will be
15 used to allocate all variables even if they are not explicitly listed. This alternate syntax has been
16 deprecated.

17 Semantics
18 The allocators construct specifies that OpenMP memory allocators are used for certain
19 variables that are allocated by the associated allocate-stmt. If a variable that is to be allocated
20 appears as a list item in an allocate clause on the directive, an OpenMP allocator is used to
21 allocate storage for the variable according to the semantics of the allocate clause. If a variable
22 that is to be allocated does not appear as a list item in an allocate clause, the allocation is
23 performed according to the base language implementation.

24 Restrictions
25 Restrictions to the allocators construct are as follows:
26 • A list item that appears in an allocate clause must appear as one of the variables that is
27 allocated by the allocate-stmt in the associated allocator structured block.
28 Additional restrictions to the (deprecated) allocate directive when it is associated with an
29 allocator structured block are as follows:
30 • If a list is specified, the directive must be preceded by an executable statement or OpenMP
31 construct.
32 • If multiple allocate directives are associated with an allocator structured block, at most one
33 directive may specify no list items.

1 Cross References
2 • Memory Allocators, see Section 6.2
3 • OpenMP Allocator Structured Blocks, see Section 4.3.1.1
4 • allocate clause, see Section 6.6
5 • allocate directive, see Section 6.5
Fortran

6 6.8 uses_allocators Clause


Name: uses_allocators    Properties: data-environment attribute, data-sharing attribute

8 Arguments
Name         Type                                   Properties
allocator    expression of allocator_handle type    default

10 Modifiers
Name            Modifies    Type                                          Properties
mem-space       Generic     Complex, name: memspace                       default
                            Arguments: memspace-handle, expression of
                            memspace_handle type (default)
traits-array    Generic     Complex, name: traits                         default
                            Arguments: traits, variable of alloctrait
                            array type (default)

12 Directives
13 target

14 Additional information
15 The comma-separated list syntax, in which each list item is a clause-argument-specification of the
16 form allocator[(traits)] may also be used for the uses_allocators clause arguments. With
17 this syntax, traits must be a constant array with constant values. This syntax has been deprecated.

1 Semantics
2 The uses_allocators clause enables the use of the specified allocator in the region associated
3 with the directive on which the clause appears. If allocator refers to a predefined allocator, that
4 predefined allocator will be available for use in the region. If allocator does not refer to a
5 predefined allocator, the effect is as if allocator is specified on a private clause. The resulting
6 corresponding item is assigned the result of a call to omp_init_allocator at the beginning of
7 the associated region with arguments memspace-handle, the number of traits in the traits array, and
8 traits. If mem-space is not specified, the effect is as if memspace-handle is specified as
9 omp_default_mem_space. If traits-array is not specified, the effect is as if traits is specified
10 as an empty array. Further, at the end of the associated region, this allocator is destroyed as if by
11 a call to omp_destroy_allocator.

12 Restrictions
13 • The allocator expression must be a base language identifier.
14 • If allocator is a predefined allocator, no modifiers may be specified.
15 • If allocator is not a predefined allocator, it must be a variable.
16 • The allocator argument must not appear in other data-sharing attribute clauses or data-mapping
17 attribute clauses on the same construct.
18 • The traits argument for the traits-array modifier must be a constant array, have constant values
19 and be defined in the same scope as the construct on which the clause appears.
20 • The memspace-handle argument for the mem-space modifier must be an identifier that matches
21 one of the predefined memory space names.

22 Cross References
23 • Memory Allocators, see Section 6.2
24 • Memory Spaces, see Section 6.1
25 • omp_destroy_allocator, see Section 18.13.3
26 • omp_init_allocator, see Section 18.13.2
27 • target directive, see Section 13.8

1 7 Variant Directives
2 This chapter defines directives and related concepts to support the seamless adaptation of programs
3 to OpenMP contexts.

4 7.1 OpenMP Contexts


5 At any point in a program, an OpenMP context exists that defines traits that describe the active
6 OpenMP constructs, the execution devices, functionality supported by the implementation and
7 available dynamic values. The traits are grouped into trait sets. The following trait sets exist:
8 construct, device, target_device, implementation and dynamic. Traits are categorized as name-list
9 traits, clause-list traits, non-property traits and extension traits. This categorization determines the
10 syntax that is used to match the trait, as defined in Section 7.2.
11 The construct set is composed of the directive names, each being a trait, of all enclosing constructs
12 at that point in the program up to a target construct. Combined and composite constructs are
13 added to the set as distinct constructs in the same nesting order specified by the original construct.
14 Whether the dispatch construct is added to the construct set is implementation defined. If it is
15 added, it will only be added for the target-call of the associated code. The set is ordered by nesting
16 level in ascending order. Specifically, the ordering of the set of constructs is c1 , . . . , cN , where c1 is
17 the construct at the outermost nesting level and cN is the construct at the innermost nesting level. In
18 addition, if the point in the program is not enclosed by a target construct, the following rules are
19 applied in order:
20 1. For procedures with a declare simd directive, the simd trait is added to the beginning of the
21 set as c1 for any generated SIMD versions so the total size of the set is increased by one.
22 2. For procedures that are determined to be function variants by a declare variant directive, the
23 selectors c1 , . . . , cM of the construct selector set are added in the same order to the
24 beginning of the set as c1 , . . . , cM so the total size of the set is increased by M .
25 3. For procedures that are determined to be target function variants by a declare target directive, the
26 target trait is added to the beginning of the set as c1 so the total size of the set is increased by one.
27 The simd trait is a clause-list trait that is defined with properties that match the clauses accepted by
28 the declare simd directive with the same name and semantics. The simd trait defines at least the
29 simdlen property and one of the inbranch or notinbranch properties. Traits in the construct set
30 other than simd are non-property traits.

1 The device set includes traits that define the characteristics of the device being targeted by the
2 compiler at that point in the program. For each target device that the implementation supports, a
3 target_device set exists that defines the characteristics of that device. At least the following traits
4 must be defined for the device and all target_device sets:
5 • The kind(kind-name-list) trait specifies the general kind of the device. The following kind-name
6 values are defined:
7 – host, which specifies that the device is the host device;
8 – nohost, which specifies that the device is not the host device; and
9 – the values defined in the OpenMP Additional Definitions document.
10 • The isa(isa-name-list) trait specifies the Instruction Set Architectures supported by the device.
11 The accepted isa-name values are implementation defined.
12 • The arch(arch-name-list) trait specifies the architectures supported by the device. The accepted
13 arch-name values are implementation defined.
14 The kind, isa and arch traits in the device and target_device sets are name-list traits.
15 Additionally, the target_device set defines the following trait:
16 • The device_num trait specifies the device number of the device.
17 The implementation set includes traits that describe the functionality supported by the OpenMP
18 implementation at that point in the program. At least the following traits can be defined:
19 • The vendor(vendor-name-list) trait, which specifies the vendor identifiers of the implementation.
20 OpenMP defined values for vendor-name are defined in the OpenMP Additional Definitions
21 document.
22 • The extension(extension-name-list) trait, which specifies vendor specific extensions to the
23 OpenMP specification. The accepted extension-name values are implementation defined.
24 • A trait with a name that is identical to the name of any clause that was supplied to the requires
25 directive prior to the program point. Such traits other than the atomic_default_mem_order trait
26 are non-property traits. The presence of these traits has been deprecated.
27 • A requires(requires-clause-list) trait, which is a clause-list trait for which the properties are the
28 clauses that have been supplied to the requires directive prior to the program point as well as
29 implementation-defined implicit requirements.
30 The vendor and extension traits in the implementation set are name-list traits.
31 Implementations can define additional traits in the device, target_device and implementation sets;
32 these traits are extension traits.
33 The dynamic trait set includes traits that define the dynamic properties of a program at a point in its
34 execution. The data state trait in the dynamic trait set refers to the complete data state of the
35 program that may be accessed at runtime.



1 7.2 Context Selectors
2 Context selectors are used to define the properties that can match an OpenMP context. OpenMP
3 defines different sets of selectors, each containing different selectors.
4 The syntax for a context selector is context-selector-specification as described in the following
5 grammar:
context-selector-specification:
    trait-set-selector[,trait-set-selector[,...]]

trait-set-selector:
    trait-set-selector-name={trait-selector[, trait-selector[, ...]]}

trait-selector:
    trait-selector-name[([trait-score: ] trait-property[, trait-property[, ...]])]

trait-property:
    trait-property-name
    trait-property-clause
    trait-property-expression
    trait-property-extension

trait-property-clause:
    clause

trait-property-name:
    identifier
    string-literal

trait-property-expression:
    scalar-expression (for C/C++)
    scalar-logical-expression (for Fortran)
    scalar-integer-expression (for Fortran)

trait-score:
    score(score-expression)

trait-property-extension:
    trait-property-name
    identifier(trait-property-extension[, trait-property-extension[, ...]])
    constant integer expression

For trait selectors that correspond to name-list traits, each trait-property should be
trait-property-name. For any value that is a valid identifier, the identifier and the
corresponding string literal (for C/C++) or char-literal-constant (for Fortran) are considered
representations of the same value.
3 For trait selectors that correspond to clause-list traits, each trait-property should be
4 trait-property-clause. The syntax is the same as for the matching OpenMP clause.
5 The construct selector set defines the construct traits that should be active in the OpenMP
6 context. Each selector that can be defined in the construct set is the directive-name of a
7 context-matching construct. Each trait-property of the simd selector is a trait-property-clause.
8 The syntax is the same as for a valid clause of the declare simd directive and the restrictions on
the clauses from that directive apply. The construct selector is an ordered list $c_1, \ldots, c_N$.
10 The device and implementation selector sets define the traits that should be active in the
11 corresponding trait set of the OpenMP context. The target_device selector set defines the
12 traits that should be active in the target_device trait set for the device that the specified
13 device_num selector identifies. The same traits that are defined in the corresponding traits sets
14 can be used as selectors with the same properties. The kind selector of the device and
15 target_device selector sets can also specify the value any, which is as if no kind selector
16 was specified. If a device_num selector does not appear in the target_device selector set
17 then a device_num selector that specifies the value of the default-device-var ICV is implied. For
18 the device_num selector of the target_device selector set, a single
19 trait-property-expression must be specified. For the atomic_default_mem_order selector of
20 the implementation set, a single trait-property must be specified as an identifier equal to one
21 of the valid arguments to the atomic_default_mem_order clause on the requires
22 directive. For the requires selector of the implementation set, each trait-property is a
23 trait-property-clause. The syntax is the same as for a valid clause of the requires directive and
24 the restrictions on the clauses from that directive apply.
25 The user selector set defines the condition selector that provides additional user-defined
26 conditions.
27 The condition selector contains a single trait-property-expression that must evaluate to true for
28 the selector to be true.
29 Any non-constant expression that is evaluated to determine the suitability of a variant is evaluated
30 according to the data state trait in the dynamic trait set of the OpenMP context.
31 The user selector set is dynamic if the condition selector is present and the expression in the
32 condition selector is not a constant expression; otherwise, it is static.
33 All parts of a context selector define the static part of the context selector except the following
34 parts, which define the dynamic part of a context selector:
35 • Its user selector set if it is dynamic; and
36 • Its target_device selector set.
For the match clause of a declare variant directive, any argument of the base function that
is referenced in an expression that appears in the context selector is treated as a reference to the
expression that is passed into that argument at the call to the base function. Otherwise, a variable or
procedure reference in an expression that appears in a context selector is a reference to the variable
or procedure of that name that is visible at the location of the directive on which the selector
appears.
C++
5 Each occurrence of the this pointer in an expression in a context selector that appears in the
6 match clause of a declare variant directive is treated as an expression that is the address of
7 the object on which the associated base function is invoked.
C++
8 Implementations can allow further selectors to be specified. Each specified trait-property for these
9 implementation-defined selectors should be trait-property-extension. Implementations can ignore
10 specified selectors that are not those described in this section.

11 Restrictions
12 Restrictions to context selectors are as follows:
13 • Each trait-property can only be specified once in a trait-selector other than the construct
14 selector set.
15 • Each trait-set-selector-name can only be specified once.
16 • Each trait-selector-name can only be specified once.
17 • A trait-score cannot be specified in traits from the construct, device or
18 target_device trait-selector-sets.
19 • A score-expression must be a non-negative constant integer expression.
20 • The expression of a device_num trait must evaluate to a non-negative integer value that is less
21 than or equal to the value of omp_get_num_devices().
22 • A variable or procedure that is referenced in an expression that appears in a context selector must
23 be visible at the location of the directive on which the selector appears unless the directive is a
24 declare variant directive and the variable is an argument of the associated base function.
25 • If trait-property any is specified in the kind trait-selector of the device or
26 target_device selector set, no other trait-property may be specified in the same selector.
27 • For a trait-selector that corresponds to a name-list trait, at least one trait-property must be
28 specified.
29 • For a trait-selector that corresponds to a non-property trait, no trait-property may be specified.
30 • For the requires selector of the implementation selector set, at least one trait-property
31 must be specified.



1 7.3 Matching and Scoring Context Selectors
2 A given context selector is compatible with a given OpenMP context if the following conditions are
3 satisfied:
4 • All selectors in the user set of the context selector are true;
5 • All traits and trait properties that are defined by selectors in the target_device set of the
6 context selector are active in the target_device trait set for the device that is identified by the
7 device_num selector;
8 • All traits and trait properties that are defined by selectors in the construct, device and
9 implementation sets of the context selector are active in the corresponding trait sets of the
10 OpenMP context;
11 • For each selector in the context selector, its properties are a subset of the properties of the
12 corresponding trait of the OpenMP context;
13 • Selectors in the construct set of the context selector appear in the same relative order as their
14 corresponding traits in the construct trait set of the OpenMP context; and
15 • No specified implementation-defined selector is ignored by the implementation.
16 Some properties of the simd selector have special rules to match the properties of the simd trait:
17 • The simdlen(N) property of the selector matches the simdlen(M) trait of the OpenMP context
18 if M is a multiple of N ; and
19 • The aligned(list:N) property of the selector matches the aligned(list:M) trait of the OpenMP
20 context if N is a multiple of M .
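
The two special simd matching rules above reduce to divisibility checks. The following is a minimal sketch in C, assuming the simdlen and alignment values have already been extracted as positive integers. The function names are illustrative and are not part of the OpenMP API.

```c
#include <stdbool.h>

/* simdlen(N) on the selector matches simdlen(M) in the OpenMP context
   when M is a multiple of N. Both values are assumed to be positive. */
bool simdlen_matches(unsigned selector_n, unsigned context_m)
{
    return selector_n != 0 && context_m % selector_n == 0;
}

/* aligned(list:N) on the selector matches aligned(list:M) in the OpenMP
   context when N is a multiple of M. Both values are assumed positive. */
bool aligned_matches(unsigned selector_n, unsigned context_m)
{
    return context_m != 0 && selector_n % context_m == 0;
}
```

Under this sketch, a selector that specifies simdlen(4) matches a context trait with simdlen(8), but a selector that specifies simdlen(8) does not match a context trait with simdlen(4).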
21 Among compatible context selectors, a score is computed using the following algorithm:
1. Each trait selector for which the corresponding trait appears in the construct trait set in the
OpenMP context is given the value $2^{p-1}$ where $p$ is the position of the corresponding trait, $c_p$, in
the context construct trait set; if the traits that correspond to the construct selector set
appear multiple times in the OpenMP context, the highest valued subset of context traits that
contains all selectors in the same order is used;
2. The kind, arch, and isa selectors, if specified, are given the values $2^l$, $2^{l+1}$ and $2^{l+2}$,
respectively, where $l$ is the number of traits in the construct set;
29 3. Trait selectors for which a trait-score is specified are given the value specified by the trait-score
30 score-expression;
31 4. The values given to any additional selectors allowed by the implementation are implementation
32 defined;
33 5. Other selectors are given a value of zero; and



1 6. A context selector that is a strict subset of another context selector has a score of zero. For other
2 context selectors, the final score is the sum of the values of all specified selectors plus 1.
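
The scoring steps above can be sketched in C as follows. This is an illustrative model only: construct traits are represented as strings, the function names are hypothetical, and neither the strict-subset rule of step 6 nor implementation-defined selector values are modeled. Matching the selector list from the innermost trait outward yields the highest valued subset required by step 1, because each position outweighs all lower positions combined.

```c
#include <stddef.h>
#include <string.h>

/* Value contributed by the construct selector set (step 1).
   ctx[0..n_ctx-1] models the context construct traits c1..cN and
   sel[0..n_sel-1] models the construct selectors, both in order.
   Returns -1 if the selectors do not appear in the same relative order. */
long construct_value(const char *const ctx[], size_t n_ctx,
                     const char *const sel[], size_t n_sel)
{
    long value = 0;
    size_t c = n_ctx;               /* one past the last unmatched trait */
    for (size_t s = n_sel; s > 0; s--) {
        int found = 0;
        while (c > 0) {
            c--;
            if (strcmp(ctx[c], sel[s - 1]) == 0) {
                value += 1L << c;   /* position p = c + 1 gives 2^(p-1) */
                found = 1;
                break;
            }
        }
        if (!found)
            return -1;
    }
    return value;
}

/* Final score (steps 2, 3, 5 and 6) for a selector that is not a strict
   subset of another one. l is the number of traits in the construct set;
   explicit_scores is the sum of all trait-score values that were specified. */
long selector_score(long construct_val, size_t l,
                    int has_kind, int has_arch, int has_isa,
                    long explicit_scores)
{
    long total = construct_val + explicit_scores;
    if (has_kind) total += 1L << l;
    if (has_arch) total += 1L << (l + 1);
    if (has_isa)  total += 1L << (l + 2);
    return total + 1;
}
```

For example, with the context construct traits (target, teams, parallel, for), $l = 4$, a selector with construct={parallel} and device={kind(host)} receives $2^2 + 2^4 + 1 = 21$ under this model.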

3 7.4 Metadirectives
4 A metadirective is a directive that can specify multiple directive variants of which one may be
5 conditionally selected to replace the metadirective based on the enclosing OpenMP context. A
6 metadirective is replaced by a nothing directive or one of the directive variants specified by the
7 when clauses or the otherwise clause. If no otherwise clause is specified the effect is as if
8 one was specified without an associated directive variant.
9 The OpenMP context for a given metadirective is defined according to Section 7.1. The order of
10 clauses that appear on a metadirective is significant and otherwise must be the last clause
11 specified on a metadirective.
12 Replacement candidates are ordered according to the following rules in decreasing precedence:
13 • A candidate is before another one if the score associated with the context selector of the
14 corresponding when clause is higher.
15 • A candidate that was explicitly specified is before one that was implicitly specified.
16 • Candidates are ordered according to the order in which they lexically appear on the metadirective.
17 The list of dynamic replacement candidates is the prefix of the sorted list of replacement candidates
18 up to and including the first candidate for which the corresponding when clause has a static context
19 selector. The first dynamic replacement candidate for which the corresponding when clause has a
20 compatible context selector, according to the matching rules defined in Section 7.3, replaces the
21 metadirective.
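
The ordering and selection rules above can be modeled in plain C. This is a sketch under stated assumptions, not how any implementation is required to work: the candidate structure and function names are hypothetical, every entry is assumed to already be a replacement candidate (its static part is compatible), and the otherwise clause is represented by a score below all others (here -1).

```c
#include <stdlib.h>

typedef struct {
    long score;        /* score of the context selector; -1 models otherwise */
    int  is_explicit;  /* 1 if explicitly specified, 0 if implicitly specified */
    int  lexical_pos;  /* order of appearance on the metadirective */
    int  is_static;    /* 1 if the context selector has no dynamic part */
    int  compatible;   /* 1 if the whole selector matches the OpenMP context */
} candidate;

/* Higher score first, then explicit before implicit, then lexical order. */
static int by_precedence(const void *a, const void *b)
{
    const candidate *x = a, *y = b;
    if (x->score != y->score)
        return x->score > y->score ? -1 : 1;
    if (x->is_explicit != y->is_explicit)
        return x->is_explicit ? -1 : 1;
    return (x->lexical_pos > y->lexical_pos) - (x->lexical_pos < y->lexical_pos);
}

/* Sorts the candidates and returns the lexical position of the one that
   replaces the metadirective, or -1 if none is found. */
int select_replacement(candidate *c, size_t n)
{
    qsort(c, n, sizeof *c, by_precedence);
    for (size_t i = 0; i < n; i++) {
        if (c[i].compatible)
            return c[i].lexical_pos;
        if (c[i].is_static)
            break;  /* end of the list of dynamic replacement candidates */
    }
    return -1;
}
```

The loop stops at the first static candidate because the list of dynamic replacement candidates is the prefix up to and including that candidate; later entries can never be selected.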

22 Restrictions
23 Restrictions to metadirectives are as follows:
24 • Replacement of the metadirective with the directive variant associated with any of the dynamic
25 replacement candidates must result in a conforming OpenMP program.
26 • Insertion of user code at the location of a metadirective must be allowed if the first dynamic
27 replacement candidate does not have a static context selector.
28 • All items must be executable directives if the first dynamic replacement candidate does not have
29 a static context selector.
Fortran
30 • A metadirective that appears in the specification part of a subprogram must follow all
31 variant-generating declarative directives that appear in the same specification part.
• All directive variants of a metadirective must be pure; otherwise, the metadirective is not pure.
Fortran



1 7.4.1 when Clause
Name: when    Properties: default

Arguments
Name                 Type                       Properties
directive-variant    directive-specification    optional, unique

Modifiers
Name                Modifies             Type                                        Properties
context-selector    directive-variant    An OpenMP context-selector-specification    required, unique

7 Directives
8 begin metadirective, metadirective

9 Semantics
10 The directive variant specified by a when clause is a candidate to replace the metadirective on
11 which the clause is specified if the static part of the corresponding context selector is compatible
12 with the OpenMP context according to the matching rules defined in Section 7.3. If a when clause
13 does not explicitly specify a directive variant it implicitly specifies a nothing directive as the
14 directive variant.
15 Expressions that appear in the context selector of a when clause are evaluated if no prior dynamic
16 replacement candidate has a compatible context selector, and the number of times each expression
17 is evaluated is implementation defined. All variables referenced by these expressions are
18 considered to be referenced by the metadirective.
19 A directive variant that is associated with a when clause can only affect the program if the directive
20 variant is a dynamic replacement candidate.
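
As an illustration, the following C sketch uses a single when clause with a user condition. The names sum_array and USE_THREADS are hypothetical. USE_THREADS is an integer constant expression, so the context selector is static. Because no otherwise clause is specified, the metadirective is replaced by a nothing directive when the condition is false. The sketch assumes a compiler that supports metadirectives; a compiler without OpenMP enabled would typically ignore the pragma and run the loop serially, which produces the same result.

```c
#include <stddef.h>

/* Hypothetical build-time switch (assumption, not part of OpenMP). */
enum { USE_THREADS = 1 };

long sum_array(const int *a, size_t n)
{
    long total = 0;

    /* If the condition is true, the metadirective is replaced by
       "parallel for reduction(+:total)"; otherwise by "nothing". */
    #pragma omp metadirective \
        when(user={condition(USE_THREADS)}: parallel for reduction(+:total))
    for (size_t i = 0; i < n; i++)
        total += a[i];

    return total;
}
```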

21 Restrictions
22 Restrictions to the when clause are as follows:
23 • directive-variant must not specify a metadirective.
24 • context-selector must not specify any properties for the simd selector.
C / C++
25 • directive-variant must not specify a begin declare variant directive.
C / C++



1 Cross References
2 • Context Selectors, see Section 7.2
3 • begin metadirective directive, see Section 7.4.4
4 • metadirective directive, see Section 7.4.3
5 • nothing directive, see Section 8.4

6 7.4.2 otherwise Clause


Name: otherwise    Properties: unique, ultimate

Arguments
Name                 Type                       Properties
directive-variant    directive-specification    optional, unique

10 Directives
11 begin metadirective, metadirective

12 Additional information
13 The clause-name default may be used as a synonym for the clause-name otherwise. This use
14 has been deprecated.

15 Semantics
16 The otherwise clause is treated as a when clause with the specified directive variant, if any, and
17 an always compatible static context selector that has a score lower than the scores associated with
18 any other clause.

19 Restrictions
20 Restrictions to the otherwise clause are as follows:
21 • directive-variant must not specify a metadirective.
C / C++
22 • directive-variant must not specify a begin declare variant directive.
C / C++
23 Cross References
24 • begin metadirective directive, see Section 7.4.4
25 • metadirective directive, see Section 7.4.3
26 • when clause, see Section 7.4.1



1 7.4.3 metadirective
Name: metadirective    Association: none
Category: meta         Properties: pure

3 Clauses
4 otherwise, when

5 Semantics
6 The metadirective specifies metadirective semantics.

7 Cross References
8 • Metadirectives, see Section 7.4
9 • otherwise clause, see Section 7.4.2
10 • when clause, see Section 7.4.1

11 7.4.4 begin metadirective


Name: begin metadirective    Association: delimited
Category: meta               Properties: pure

13 Clauses
14 otherwise, when

15 Semantics
16 The begin metadirective is a metadirective for which the specified directive variants other
17 than the nothing directive must accept a paired end directive. For any directive variant that is
18 selected to replace the begin metadirective directive, the end metadirective directive
19 is implicitly replaced by its paired end directive to demarcate the statements that are affected by or
20 are associated with the directive variant. If the nothing directive is selected to replace the
21 begin metadirective directive, the paired end metadirective is ignored.

22 Restrictions
23 The restrictions to begin metadirective are as follows:
24 • Any directive-variant that is specified by a when or otherwise clause must be an OpenMP
25 directive that has a paired end directive or must be the nothing directive.

26 Cross References
27 • Metadirectives, see Section 7.4
28 • nothing directive, see Section 8.4
29 • otherwise clause, see Section 7.4.2
30 • when clause, see Section 7.4.1



1 7.5 Declare Variant Directives
2 Declare variant directives declare base functions to have the specified function variant. The context
3 selector in the match clause is associated with the variant.
4 The OpenMP context for a direct call to a given base function is defined according to Section 7.1. If
5 a declare variant directive for the base function is visible at the call site and the static part of the
6 context selector that is associated with the declared function variant is compatible with the
7 OpenMP context of the call according to the matching rules defined in Section 7.3 then the variant
8 is a replacement candidate to be called instead of the base function. Replacement candidates are
9 ordered in decreasing order of the score associated with the context selector. If two replacement
10 candidates have the same score then their order is implementation defined.
11 The list of dynamic replacement candidates is the prefix of the sorted list of replacement candidates
12 up to and including the first candidate for which the corresponding context selector is static.
13 The first dynamic replacement candidate for which the corresponding context selector is
14 compatible, according to the matching rules defined in Section 7.3, is called instead of the base
15 function. If no compatible candidate exists then the base function is called.
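
A minimal C sketch of this replacement follows. The function names are hypothetical, and the two functions deliberately return different values only so that the effect of the substitution is observable. With an OpenMP-enabled compiler that supports declare variant, the call nested in the parallel construct is expected to resolve to the variant; without OpenMP the pragmas are typically ignored and the base function is called in both places.

```c
/* Function variant: a candidate only for calls nested in a parallel construct. */
int impl_id_in_parallel(void) { return 2; }

/* Base function: called wherever no variant has a compatible context selector. */
#pragma omp declare variant(impl_id_in_parallel) match(construct={parallel})
int impl_id(void) { return 1; }

/* The OpenMP context of this call has no parallel trait, so the base
   function is called. */
int impl_id_outside_parallel(void)
{
    return impl_id();
}

/* The call is nested in a parallel construct, so the variant is a
   replacement candidate. One thread is requested to keep the write
   to result free of races. */
int impl_id_inside_parallel(void)
{
    int result = 0;
    #pragma omp parallel num_threads(1)
    {
        result = impl_id();
    }
    return result;
}
```

The variant is never called directly here, which keeps the sketch clear of the restriction on calling function variants in a context that differs from their construct selector set.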
16 Expressions that appear in the context selector of a match clause are evaluated if no prior dynamic
17 replacement candidate has a compatible context selector, and the number of times each expression
18 is evaluated is implementation defined. All variables referenced by these expressions are
19 considered to be referenced at the call site.
C++
20 For calls to constexpr base functions that are evaluated in constant expressions, whether any
21 variant replacement occurs is implementation defined.
C++
22 For indirect function calls that can be determined to call a particular base function, whether any
23 variant replacement occurs is unspecified.
24 Any differences that the specific OpenMP context requires in the prototype of the variant from the
25 base function prototype are implementation defined.
26 Different declare variant directives may be specified for different declarations of the same base
27 function.

28 Restrictions
29 Restrictions to declare variant directives are as follows:
• Directly calling a function that a declare variant directive determined to be a function variant,
in an OpenMP context that differs from the one that the construct selector set of the context
selector specifies, is non-conforming.
33 • If a function is determined to be a function variant through more than one declare variant
34 directive then the construct selector set of their context selectors must be the same.



1 • A function determined to be a function variant may not be specified as a base function in another
2 declare variant directive.
3 • An adjust_args clause or append_args clause can only be specified if the dispatch
4 selector of the construct selector set appears in the match clause.
C / C++
5 • The type of the function variant must be compatible with the type of the base function after the
6 implementation-defined transformation for its OpenMP context.
C / C++
C++
7 • Declare variant directives cannot be specified for virtual, defaulted or deleted functions.
8 • Declare variant directives cannot be specified for constructors or destructors.
9 • Declare variant directives cannot be specified for immediate functions.
10 • The function that a declare variant directive determined to be a function variant may not be an
11 immediate function.
C++
12 Cross References
13 • Context Selectors, see Section 7.2
14 • OpenMP Contexts, see Section 7.1
15 • begin declare variant directive, see Section 7.5.5
16 • declare variant directive, see Section 7.5.4

17 7.5.1 match Clause


Name: match    Properties: unique, required

Arguments
Name                Type                                        Properties
context-selector    An OpenMP context-selector-specification    default

21 Directives
22 begin declare variant, declare variant

23 Semantics
24 The match clause specifies the context-selector to use to determine if a specified variant function
25 is a replacement candidate for the specified base function in a given context.



1 Restrictions
2 Restrictions to the match clause are as follows:
3 • All variables that are referenced in an expression that appears in the context selector of a match
4 clause must be accessible at a call site to the base function according to the base language rules.

5 Cross References
6 • Context Selectors, see Section 7.2
7 • begin declare variant directive, see Section 7.5.5
8 • declare variant directive, see Section 7.5.4

9 7.5.2 adjust_args Clause


Name: adjust_args    Properties: default

Arguments
Name              Type                                Properties
parameter-list    list of parameter list item type    default

Modifiers
Name         Modifies          Type                                 Properties
adjust-op    parameter-list    Keyword: need_device_ptr, nothing    required

15 Directives
16 declare variant

17 Semantics
18 The adjust_args clause specifies how to adjust the arguments of the base function when a
19 specified variant function is selected for replacement. For each adjust_args clause that is
20 present on the selected variant the adjustment operation specified by adjust-op is applied to each
21 argument specified in the clause before being passed to the selected variant. If the adjust-op
22 modifier is nothing, the argument is passed to the selected variant without being modified.
23 If the adjust-op modifier is need_device_ptr, the arguments are converted to corresponding
24 device pointers of the default device. If an argument has the is_device_ptr property in its
25 interoperability requirement set then the argument is not adjusted. Otherwise, the argument is
26 converted in the same manner that a use_device_ptr clause on a target data construct
27 converts its pointer list items into device pointers. If the argument cannot be converted into a device
28 pointer then NULL is passed as the argument.



1 Restrictions
Fortran
2 • Each argument that appears in a need_device_ptr adjust-op must be of type C_PTR in the
3 dummy argument declaration of the variant function.
Fortran
4 Cross References
5 • declare variant directive, see Section 7.5.4

6 7.5.3 append_args Clause


Name: append_args    Properties: unique

Arguments
Name              Type                                       Properties
append-op-list    list of OpenMP operation list item type    default

10 Directives
11 declare variant

12 Semantics
13 The append_args clause specifies additional arguments to pass in the call when a specified
14 variant function is selected for replacement. The arguments are constructed according to each
15 specified list item in append-op-list and are passed in the same order in which they are specified in
16 the list.
The supported OpenMP operations in append-op-list are:
    interop
The interop operation accepts a comma-separated list of operands, each of which is an
interop-type that is supported by the init clause on the interop construct.
21 Each interop operation constructs an argument of interop OpenMP type using the
22 interoperability requirement set of the encountering task. The argument is constructed as if by an
23 interop construct with an init clause that specifies each interop-type operand in the interop
24 operation. If the interoperability requirement set contains one or more properties that could be used
25 as clauses for an interop construct of interop-type, the behavior is as if the corresponding
26 clauses would also be part of the interop construct and those properties are removed from the
27 interoperability requirement set.
28 This argument is destroyed after the call to the selected variant returns, as if an interop construct
29 with a destroy clause was used with the same clauses that were used to initialize the argument.



1 Cross References
2 • Interoperability Requirement Set, see Section 14.2
3 • OpenMP Operations, see Section 3.2.3
4 • declare variant directive, see Section 7.5.4
5 • interop directive, see Section 14.1

6 7.5.4 declare variant Directive


Name: declare variant    Association: declaration
Category: declarative    Properties: pure

Arguments
declare variant([base-name:]variant-name)
Name            Type                           Properties
base-name       identifier of function type    optional
variant-name    identifier of function type    default

11 Clauses
12 adjust_args, append_args, match

13 Semantics
14 The declare variant specifies declare variant semantics for a single replacement candidate.
15 variant-name identifies the function variant while base-name identifies the base function.
C
16 Any expressions in the match clause are interpreted as if they appeared in the scope of arguments
17 of the base function.
C
C++
18 variant-name and any expressions in the match clause are interpreted as if they appeared at the
19 scope of the trailing return type of the base function.
20 The function variant is determined by base language standard name lookup rules ([basic.lookup])
21 of variant-name using the argument types at the call site after implementation-defined changes have
22 been made according to the OpenMP context.
C++
Fortran
23 The procedure to which base-name refers is resolved at the location of the directive according to the
24 establishment rules for procedure names in the base language.
Fortran



1 Restrictions
2 • If base-name is specified, it must match the name used in the associated declaration, if any
3 declaration is associated.
Fortran
4 • base-name must not be a generic name, an entry name, the name of a procedure pointer, a
5 dummy procedure or a statement function.
6 • If base-name is omitted then the declare variant directive must appear in an interface
7 block or the specification part of a procedure.
8 • Any declare variant directive must appear in the specification part of a subroutine
9 subprogram, function subprogram, or interface body to which it applies.
10 • If the directive is specified for a procedure that is declared via a procedure declaration statement,
11 the base-name must be specified.
12 • The procedure base-name must have an accessible explicit interface at the location of the
13 directive.
Fortran
14 Cross References
15 • Declare Variant Directives, see Section 7.5
16 • adjust_args clause, see Section 7.5.2
17 • append_args clause, see Section 7.5.3
18 • match clause, see Section 7.5.1

C / C++

19 7.5.5 begin declare variant Directive


Name: begin declare variant    Association: delimited (declaration-definition-seq)
Category: declarative          Properties: default

21 Clauses
22 match

23 Semantics
24 The begin declare variant directive associates the context selector in the match clause
25 with each function definition in declaration-definition-seq. For the purpose of call resolution, each
26 function definition that appears between a begin declare variant directive and its paired
27 end directive is a function variant for an assumed base function, with the same name and a
28 compatible prototype, that is declared elsewhere without an associated declare variant directive.



1 If a declare variant directive appears between a begin declare variant directive and its
2 paired end directive, the effective context selectors of the outer directive are appended to the
3 context selector of the inner directive to form the effective context selector of the inner directive. If
4 a trait-set-selector is present on both directives, the trait-selector list of the outer directive is
5 appended to the trait-selector list of the inner directive after equivalent trait-selectors have been
6 removed from the outer list. Restrictions that apply to explicitly specified context selectors also
7 apply to effective context selectors constructed through this process.
8 The symbol name of a function definition that appears between a begin declare variant
9 directive and its paired end directive is determined through the base language rules after the name
10 of the function has been augmented with a string that is determined according to the effective
11 context selector of the begin declare variant directive. The symbol names of two definitions
12 of a function are considered to be equal if and only if their effective context selectors are equivalent.
13 If the context selector of a begin declare variant directive contains traits in the device or
14 implementation set that are known never to be compatible with an OpenMP context during the
15 current compilation, the preprocessed code that follows the begin declare variant directive
16 up to its paired end directive is elided.
17 Any expressions in the match clause are interpreted at the location of the directive.

18 Restrictions
The restrictions to the begin declare variant directive are as follows:
• The match clause must not contain a simd trait-selector-name.
• Two begin declare variant directives and their paired end directives must either
encompass disjoint source ranges or be perfectly nested.
• The match clause must not contain a dynamic context selector that references the this pointer.
• If an expression in the context selector that appears in the match clause references the this
pointer, the base function must be a non-static member function.

26 Cross References
27 • Declare Variant Directives, see Section 7.5
28 • match clause, see Section 7.5.1
C / C++



1 7.6 dispatch Construct
Name: dispatch          Association: block (function dispatch structured block)
Category: executable    Properties: context-matching

3 Clauses
4 depend, device, is_device_ptr, nocontext, novariants, nowait

5 Binding
6 The binding task set for a dispatch region is the generating task. The dispatch region binds
7 to the region of the generating task.

8 Semantics
9 The dispatch construct controls whether variant substitution occurs for target-call in the
10 associated function dispatch structured block.
11 Properties added to the interoperability requirement set can be removed by the effect of other
12 directives (see Section 14.2) before the dispatch region is executed. If one or more depend
13 clauses are present on the dispatch construct, they are added as depend properties of the
14 interoperability requirement set. If a nowait clause is present on the dispatch construct, the
15 nowait property is added to the interoperability requirement set. For each list item specified in an
16 is_device_ptr clause, an is_device_ptr property for that list item is added to the
17 interoperability requirement set.
18 If the interoperability requirement set contains one or more depend properties, the behavior is as if
19 those properties were applied as depend clauses to a taskwait construct that is executed before
20 the dispatch region is executed.
21 The presence of the nowait property in the interoperability requirement set has no effect on the
22 dispatch construct.
23 If the device clause is present, the value of the default-device-var ICV is set to the value of the
24 expression in the clause on entry to the dispatch region and is restored to its previous value at
25 the end of the region.
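The following C fragment is a minimal sketch of how a dispatch construct enables variant substitution for a target-call. It assumes a compiler that implements OpenMP 5.1 or later; a compiler without OpenMP support ignores the directives and always calls the base function. The names scale, scale_variant and scale_twice are hypothetical and do not come from this specification. Both functions compute the same result, so the outcome does not depend on which one is selected.

```c
#include <stddef.h>

/* Hypothetical function variant: produces the same result as the base function. */
void scale_variant(double *x, size_t n, double a)
{
    for (size_t i = 0; i < n; ++i)
        x[i] = a * x[i];
}

/* Base function. The context selector makes the variant eligible only
   for calls that appear in a dispatch construct. */
#pragma omp declare variant(scale_variant) match(construct={dispatch})
void scale(double *x, size_t n, double a)
{
    for (size_t i = 0; i < n; ++i)
        x[i] *= a;
}

void scale_twice(double *x, size_t n, double a)
{
    /* Target-call of the dispatch region: variant substitution may occur. */
    #pragma omp dispatch
    scale(x, n, a);

    /* Ordinary call outside any dispatch construct: the base function runs. */
    scale(x, n, a);
}
```

Calling scale_twice on an array multiplies every element by a twice, regardless of whether the implementation substitutes the variant for the first call.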

26 Cross References
27 • Interoperability Requirement Set, see Section 14.2
28 • OpenMP Function Dispatch Structured Blocks, see Section 4.3.1.2
29 • depend clause, see Section 15.9.5
30 • device clause, see Section 13.2
31 • is_device_ptr clause, see Section 5.4.7
32 • nocontext clause, see Section 7.6.2

200 OpenMP API – Version 5.2 November 2021


1 • novariants clause, see Section 7.6.1
2 • nowait clause, see Section 15.6

3 7.6.1 novariants Clause


4 Name: novariants Properties: unique

5 Arguments
6 Name                  Type                         Properties
  do-not-use-variant    expression of logical type   default

7 Directives
8 dispatch

9 Semantics
10 If do-not-use-variant evaluates to true, no function variant is selected for the target-call of the
11 dispatch region associated with the novariants clause even if one would be selected
12 normally. The use of a variable in do-not-use-variant causes an implicit reference to the variable in
13 all enclosing constructs. do-not-use-variant is evaluated in the enclosing context.
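As an illustration of the novariants clause, the sketch below passes a runtime flag as do-not-use-variant. It assumes an OpenMP 5.1 or later compiler (otherwise the directives are ignored), and the names sum, sum_variant, checked_sum and force_base are hypothetical. The variant returns the same value as the base function, so the result is identical whichever function runs.

```c
/* Hypothetical function variant that accumulates in reverse order. */
int sum_variant(const int *v, int n)
{
    int s = 0;
    for (int i = n - 1; i >= 0; --i)
        s += v[i];
    return s;
}

/* Base function; the variant is eligible only inside a dispatch construct. */
#pragma omp declare variant(sum_variant) match(construct={dispatch})
int sum(const int *v, int n)
{
    int s = 0;
    for (int i = 0; i < n; ++i)
        s += v[i];
    return s;
}

int checked_sum(const int *v, int n, int force_base)
{
    int result;
    /* If force_base is nonzero, no function variant is selected for the
       target-call, even if one would be selected normally. */
    #pragma omp dispatch novariants(force_base)
    result = sum(v, n);
    return result;
}
```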

14 Cross References
15 • dispatch directive, see Section 7.6

16 7.6.2 nocontext Clause


17 Name: nocontext Properties: unique

18 Arguments
19 Name                     Type                         Properties
   do-not-update-context    expression of logical type   default

20 Directives
21 dispatch

22 Semantics
23 If do-not-update-context evaluates to true, the construct on which the nocontext clause appears
24 is not added to the construct set of the OpenMP context. The use of a variable in
25 do-not-update-context causes an implicit reference to the variable in all enclosing constructs.
26 do-not-update-context is evaluated in the enclosing context.

27 Cross References
28 • dispatch directive, see Section 7.6



1 7.7 declare simd Directive
2 Name: declare simd       Association: declaration
  Category: declarative    Properties: pure

3 Arguments
4 declare simd[(proc-name)]
5 Name         Type                           Properties
  proc-name    identifier of function type    optional

6 Clause groups
7 branch

8 Clauses
9 aligned, linear, simdlen, uniform

10 Semantics
11 The association of one or more declare simd directives with a function declaration or definition
12 enables the creation of corresponding SIMD versions of the associated function that can be used to
13 process multiple arguments from a single invocation in a SIMD loop concurrently.
14 If a SIMD version is created and the simdlen clause is not specified, the number of concurrent
15 arguments for the function is implementation defined.
16 For purposes of the linear clause, any integer-typed parameter that is specified in a uniform
17 clause on the directive is considered to be constant and so may be used in linear-step.
C / C++
18 The expressions that appear in the clauses of each directive are evaluated in the scope of the
19 arguments of the function declaration or definition.
C / C++
C++
20 The special this pointer can be used as if it were one of the arguments to the function in any of the
21 linear, aligned, or uniform clauses.
C++
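A minimal C sketch of the declare simd directive follows. It assumes that the compiler may create a SIMD version of the function and use it in the SIMD loop; without OpenMP support the directives are ignored and the scalar code runs unchanged. The function names are hypothetical. The uniform clause marks x and a as invariant across concurrent arguments, and the linear clause states that i advances by 1 between consecutive SIMD lanes.

```c
/* x and a have the same value for all concurrent arguments (uniform);
   i increases by one from lane to lane (linear with step 1). */
#pragma omp declare simd uniform(x, a) linear(i:1)
static double scaled_elem(const double *x, double a, int i)
{
    return a * x[i];
}

void scale_into(double *y, const double *x, double a, int n)
{
    /* A SIMD version of scaled_elem, if one was created, can process
       several iterations of this loop in a single invocation. */
    #pragma omp simd
    for (int i = 0; i < n; ++i)
        y[i] = scaled_elem(x, a, i);
}
```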
22 Restrictions
23 Restrictions to the declare simd directive are as follows:
24 • The function or subroutine body must be a structured block.
25 • The execution of the function or subroutine, when called from a SIMD loop, cannot result in the
26 execution of an OpenMP construct except for an ordered construct with the simd clause or an
27 atomic construct.
28 • The execution of the function or subroutine cannot have any side effects that would alter its
29 execution for concurrent iterations of a SIMD chunk.

C / C++
1 • If the function has any declarations, then the declare simd directive for any declaration that
2 has one must be equivalent to the one specified for the definition.
3 • The function cannot contain calls to the longjmp or setjmp functions.
C / C++
C++
4 • The function cannot contain throw statements.
C++
Fortran
5 • proc-name must not be a generic name, procedure pointer, or entry name.
6 • If proc-name is omitted, the declare simd directive must appear in the specification part of a
7 subroutine subprogram or a function subprogram for which creation of the SIMD versions is
8 enabled.
9 • Any declare simd directive must appear in the specification part of a subroutine subprogram,
10 function subprogram, or interface body to which it applies.
11 • If a declare simd directive is specified in an interface block for a procedure, it must match a
12 declare simd directive in the definition of the procedure.
13 • If a procedure is declared via a procedure declaration statement, the procedure proc-name should
14 appear in the same specification.
15 • If a declare simd directive is specified for a procedure name with explicit interface and a
16 declare simd directive is also specified for the definition of the procedure then the two
17 declare simd directives must match.
18 • Procedure pointers may not be used to access versions created by the declare simd directive.
Fortran
19 Cross References
20 • aligned clause, see Section 5.11
21 • linear clause, see Section 5.4.6
22 • reduction clause, see Section 5.5.8
23 • simdlen clause, see Section 10.4.3
24 • uniform clause, see Section 5.10



1 7.7.1 branch Clauses
2 Clause groups
3 Properties: unique, exclusive, inarguable Members: inbranch, notinbranch

4 Directives
5 declare simd

6 Semantics
7 The branch clause grouping defines a set of clauses that indicate whether a function can be assumed
8 to be, or not to be, encountered in a branch. The inbranch clause specifies that the function will always
9 be called from inside a conditional statement of the calling context. The notinbranch clause
10 specifies that the function will never be called from inside a conditional statement of the calling
11 context. If neither clause is specified, then the function may or may not be called from inside a
12 conditional statement of the calling context.
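The sketch below illustrates the inbranch clause under the assumption that every call from a SIMD loop is guarded by a condition. The names clamp_to_limit and clamp_positive are hypothetical. Without OpenMP support the directives are ignored and the result is the same.

```c
/* inbranch: calls from a SIMD loop always occur inside a conditional
   statement, so only a masked SIMD version is needed. */
#pragma omp declare simd uniform(limit) inbranch
static int clamp_to_limit(int v, int limit)
{
    return v > limit ? limit : v;
}

void clamp_positive(int *v, int n, int limit)
{
    #pragma omp simd
    for (int i = 0; i < n; ++i) {
        if (v[i] > 0)   /* the call below is always conditional */
            v[i] = clamp_to_limit(v[i], limit);
    }
}
```

Had the call appeared unconditionally in the loop body, notinbranch would be the matching clause instead.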

13 Cross References
14 • declare simd directive, see Section 7.7

15 7.8 Declare Target Directives


16 Declare target directives apply to procedures and/or variables to ensure that they can be executed or
17 accessed on a device. Variables are mapped for all device executions, or for specific device
18 executions through a link clause. An implementation may generate different versions of a
19 procedure to be used for target regions that execute on different devices. Whether the same
20 version is generated for different devices, or whether a version that is called in a target region
21 differs from the version that is called outside a target region, is implementation defined.
22 To facilitate device usage, OpenMP defines rules that implicitly specify declare target directives for
23 procedures and variables. The remainder of this section defines those rules as well as restrictions
24 that apply to all declare target directives.
25 If a variable with static storage duration is declared in a device routine then the named variable is
26 treated as if it had appeared in an enter clause on a declare target directive.
27 In the following, a non-host declare target directive is one that does not specify a device_type
28 clause with host. Further, a reverse-offload region is a region that is associated with a target
29 construct that specifies a device clause with the ancestor device-modifier.
C / C++
30 If a function is referenced outside of any reverse-offload region in a function that appears as a list
31 item in an enter clause on a non-host declare target directive then the name of the referenced
32 function is treated as if it had appeared in an enter clause on a declare target directive.
33 If a variable with static storage duration or a function (except lambda for C++) is referenced in the
34 initializer expression list of a variable with static storage duration that appears as a list item in an
1 enter clause on a declare target directive then the name of the referenced variable or function is
2 treated as if it had appeared in an enter clause on a declare target directive.
C / C++
Fortran
3 If a procedure is referenced outside of any reverse-offload region in a procedure that appears as a
4 list item in an enter clause on a non-host declare target directive then the name of the
5 referenced procedure is treated as if it had appeared in an enter clause on a declare target
6 directive.
7 If a declare target directive has a device_type clause then any enclosed internal
8 procedures cannot contain any declare target directives. The enclosing device_type
9 clause implicitly applies to internal procedures.
Fortran
10 Execution Model Events
11 The target-global-data-op event occurs when an original variable is associated with a
12 corresponding variable on a device as a result of a declare target directive; the event occurs before
13 the first access to the corresponding variable.
14 Tool Callbacks
15 A thread dispatches a registered ompt_callback_target_data_op callback, or a registered
16 ompt_callback_target_data_op_emi callback with ompt_scope_beginend as its
17 endpoint argument for each occurrence of a target-global-data-op event in that thread. These
18 callbacks have type signature ompt_callback_target_data_op_t or
19 ompt_callback_target_data_op_emi_t, respectively.
20 Restrictions
21 Restrictions to any declare target directive are as follows:
22 • A variable declared in the directive must have a mappable type.
23 • A variable declared in the directive must have static storage duration.
24 • The same list item must not explicitly appear in both an enter clause on one declare target
25 directive and a link clause on another declare target directive.
26 • If a variable appears in an enter clause on the declare target directive, its initializer must not
27 refer to a variable that appears in a link clause on a declare target directive.
28 Cross References
29 • ompt_callback_target_data_op_emi_t and
30 ompt_callback_target_data_op_t, see Section 19.5.2.25
31 • begin declare target directive, see Section 7.8.2
32 • declare target directive, see Section 7.8.1
33 • enter clause, see Section 5.8.4
34 • link clause, see Section 5.8.5
35 • target directive, see Section 13.8



1 7.8.1 declare target Directive
2 Name: declare target     Association: none
  Category: declarative    Properties: device, declare target, pure

3 Arguments
4 declare target(extended-list)
5 Name             Type                               Properties
  extended-list    list of extended list item type    optional

6 Clauses
7 device_type, enter, indirect, link

8 Semantics
9 The declare target directive is a declare target directive. If the extended-list argument is
10 specified, the effect is as if an enter clause was specified with the extended-list as its argument.
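As a minimal C sketch, the fragment below names a variable and a function in an enter clause so that both are available on a device. It assumes a compiler that accepts the enter clause introduced in version 5.2 (earlier versions spell the clause to); without OpenMP support, or without an available non-host device, the code runs on the host with the same result. The names table, lookup and offload_lookup are hypothetical.

```c
int table[4] = {1, 2, 3, 4};   /* static storage duration, mappable type */
int lookup(int i);

/* Make the variable and the function available for device execution. */
#pragma omp declare target enter(table, lookup)

int lookup(int i)
{
    return table[i & 3];
}

int offload_lookup(int i)
{
    int r = 0;
    /* r is mapped back explicitly because scalars are firstprivate by
       default in a target region. */
    #pragma omp target map(from: r)
    r = lookup(i);
    return r;
}
```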
Fortran
11 If a declare target directive does not have any clauses and does not have an extended-list then
12 an implicit enter clause with one item is formed from the name of the enclosing subroutine
13 subprogram, function subprogram or interface body to which it applies.
Fortran
14 Restrictions
15 Restrictions to the declare target directive are as follows:
16 • If the extended-list argument is specified, no clauses may be specified.
17 • If the directive has a clause, it must contain at least one enter clause or at least one link
18 clause.
19 • A variable for which nohost is specified may not appear in a link clause.
Fortran
20 • If a list item is a procedure name, it must not be a generic name, procedure pointer, entry name,
21 or statement function name.
22 • If no clauses are specified or if a device_type clause is specified, the directive must appear in
23 a specification part of a subroutine subprogram, function subprogram or interface body.
24 • If a list item is a procedure name, the directive must be in the specification part of that subroutine
25 or function subprogram or in the specification part of that subroutine or function in an interface
26 body.
27 • If an extended list item is a variable name, the directive must appear in the specification part of a
28 subroutine subprogram, function subprogram, program or module.
1 • If the directive is specified in an interface block for a procedure, it must match a
2 declare target directive in the definition of the procedure, including the device_type
3 clause if present.
4 • If an external procedure is a type-bound procedure of a derived type and the directive is specified
5 in the definition of the external procedure, it must appear in the interface block that is accessible
6 to the derived-type definition.
7 • If any procedure is declared via a procedure declaration statement that is not in the type-bound
8 procedure part of a derived-type definition, any declare target with the procedure name
9 must appear in the same specification part.
10 • The directive must appear in the declaration section of a scoping unit in which the common block
11 or variable is declared.
12 • If a declare target directive that specifies a common block name appears in one program
13 unit, then such a directive must also appear in every other program unit that contains a COMMON
14 statement that specifies the same name, after the last such COMMON statement in the program unit.
15 • If a list item is declared with the BIND attribute, the corresponding C entities must also be
16 specified in a declare target directive in the C program.
17 • A variable can only appear in a declare target directive in the scope in which it is declared.
18 It must not be an element of a common block or appear in an EQUIVALENCE statement.
19 • A variable that appears in a declare target directive must be declared in the Fortran scope
20 of a module or have the SAVE attribute, either explicitly or implicitly.
Fortran
21 Cross References
22 • Declare Target Directives, see Section 7.8
23 • device_type clause, see Section 13.1
24 • enter clause, see Section 5.8.4
25 • indirect clause, see Section 7.8.3
26 • link clause, see Section 5.8.5
C / C++

27 7.8.2 begin declare target Directive


28 Name: begin declare target    Association: delimited (declaration-definition-seq)
   Category: declarative         Properties: device, declare target

29 Clauses
30 device_type, indirect



1 Additional information
2 The directive name declare target may be used as a synonym to begin declare target
3 if no clauses are specified. This syntax has been deprecated.

4 Semantics
5 The begin declare target directive is a declare target directive. The directive and its paired
6 end directive form a delimited code region that defines an implicit extended-list. The implicit
7 extended-list consists of the variable names of any variable declarations at file or namespace scope
8 that appear in the delimited code region and of the function names of any function declarations at
9 file, namespace or class scope that appear in the delimited code region. The implicit extended-list is
10 converted to an implicit enter clause.
11 The delimited code region may contain declare target directives. If a device_type clause is
12 present on the contained declare target directive, then its argument determines which versions are
13 made available. If a list item appears both in an implicit and explicit list, the explicit list determines
14 which versions are made available.
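The delimited form can be sketched as follows, assuming a compiler that implements the begin declare target directive (OpenMP 5.1 or later); otherwise the directives are ignored and the code runs on the host. The names scale_factor, times_factor and offload_times are hypothetical. Both declarations in the delimited code region enter the implicit extended-list, so neither needs to be named explicitly.

```c
#pragma omp begin declare target
/* File-scope variable and function declared in the delimited region:
   both become members of the implicit extended-list. */
int scale_factor = 3;

int times_factor(int v)
{
    return v * scale_factor;
}
#pragma omp end declare target

int offload_times(int v)
{
    int r = 0;
    #pragma omp target map(from: r)
    r = times_factor(v);
    return r;
}
```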

15 Restrictions
16 Restrictions to the begin declare target directive are as follows:
C++
17 • The function names of overloaded functions or template functions may only be specified within
18 an implicit extended-list.
19 • If a lambda declaration and definition appears between a begin declare target directive
20 and the paired end directive, all variables that are captured by the lambda expression must also
21 appear in an enter clause.
22 • A module export or import statement cannot appear between a declare target directive and the
23 paired end directive.
C++
24 Cross References
25 • Declare Target Directives, see Section 7.8
26 • device_type clause, see Section 13.1
27 • enter clause, see Section 5.8.4
28 • indirect clause, see Section 7.8.3
C / C++



1 7.8.3 indirect Clause
2 Name: indirect Properties: unique

3 Arguments
4 Name               Type                          Properties
  invoked-by-fptr    expression of logical type    constant, optional

5 Directives
6 begin declare target, declare target

7 Semantics
8 If invoked-by-fptr evaluates to true, any procedures that appear in an enter clause on the directive
9 on which the indirect clause is specified may be called with an indirect device invocation. If the
10 invoked-by-fptr does not evaluate to true, any procedures that appear in an enter clause on the
11 directive may not be called with an indirect device invocation. Unless otherwise specified by an
12 indirect clause, procedures may not be called with an indirect device invocation. If the
13 indirect clause is specified and invoked-by-fptr is not specified, the effect of the clause is as if
14 invoked-by-fptr evaluates to true.
C / C++
15 If a function appears in the implicit enter clause of a begin declare target directive and in
16 the enter clause of a declare target directive that is contained in the delimited code region of the
17 begin declare target directive, and if an indirect clause appears on both directives, then
18 the indirect clause on the begin declare target directive has no effect for that function.
C / C++
19 Restrictions
20 Restrictions to the indirect clause are as follows:
21 • If invoked-by-fptr evaluates to true, a device_type clause must not appear on the same
22 directive unless it specifies any for its device-type-description.

23 Cross References
24 • begin declare target directive, see Section 7.8.2
25 • declare target directive, see Section 7.8.1



1 8 Informational and Utility Directives
2 An informational directive conveys information about code properties to the compiler while a
3 utility directive facilitates interactions with the compiler or supports code readability. A utility
4 directive is informational unless the at clause implies it to be executable.

5 8.1 at Clause
6 Name: at Properties: unique

7 Arguments
8 Name           Type                               Properties
  action-time    Keyword: compilation, execution    default

9 Directives
10 error

11 Semantics
12 The at clause determines when the implementation performs an action that is associated with a
13 utility directive. If action-time is compilation, the action is performed during compilation if the
14 directive appears in a declarative context or in an executable context that is reachable at runtime. If
15 action-time is compilation and the directive appears in an executable context that is not
16 reachable at runtime, the action may or may not be performed. If action-time is execution, the
17 action is performed during program execution when a thread encounters the directive and the
18 directive is considered to be an executable directive. If the at clause is not specified, the effect is as
19 if action-time is compilation.
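A compile-time use of the at clause can be sketched as below, assuming a compiler that implements the error directive (OpenMP 5.1 or later). The macro RING_SLOTS and the function ring_index are hypothetical. The directive sits in a declarative context at file scope, so with action-time compilation the action would be performed during compilation; here the preprocessor condition is false, so the directive is never seen.

```c
#define RING_SLOTS 64

/* If RING_SLOTS were not a power of two, the directive below would be
   part of the translation unit and compilation would be aborted. */
#if (RING_SLOTS & (RING_SLOTS - 1)) != 0
#pragma omp error at(compilation) severity(fatal) message("RING_SLOTS must be a power of two")
#endif

unsigned ring_index(unsigned counter)
{
    /* Valid only because RING_SLOTS is a power of two. */
    return counter & (RING_SLOTS - 1u);
}
```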

20 Cross References
21 • error directive, see Section 8.5

22 8.2 requires Directive


23 Name: requires             Association: none
   Category: informational    Properties: default

1 Clause groups
2 requirement

3 Semantics
4 The requires directive specifies features that an implementation must support for correct
5 execution and requirements for the execution of all code in the current compilation unit. The
6 behavior that a requirement clause specifies may override the normal behavior specified elsewhere
7 in this document. Whether an implementation supports the feature that a given requirement clause
8 specifies is implementation defined.
9 The clauses of a requires directive are added to the requires trait in the OpenMP context for all
10 program points that follow the directive.

11 Restrictions
12 The restrictions to the requires directive are as follows:
13 • All requires directives in the same compilation unit that specify the
14 atomic_default_mem_order requirement must specify the same argument.
15 • Any requires directive that specifies a reverse_offload, unified_address, or
16 unified_shared_memory requirement must appear lexically before any device constructs
17 or device routines.
18 • A requires directive may not appear lexically after a context selector in which any clause of
19 the requires directive is used.
20 • Either all compilation units of a program that contain declare target directives, device constructs
21 or device routines or none of them must specify a requires directive that specifies the
22 reverse_offload, unified_address or unified_shared_memory requirement.
23 • A requires directive that specifies the atomic_default_mem_order requirement must
24 not appear lexically after any atomic construct on which memory-order-clause is not specified.
C
25 • The requires directive may only appear at file scope.
C
C++
26 • The requires directive may only appear at file or namespace scope.
C++
Fortran
27 • The requires directive must appear in the specification part of a program unit, after any USE
28 statement, any IMPORT statement, and any IMPLICIT statement, unless the directive appears
29 by referencing a module and each clause already appeared with the same arguments in the
30 specification part of the program unit.
Fortran

CHAPTER 8. INFORMATIONAL AND UTILITY DIRECTIVES 211


1 8.2.1 requirement Clauses
2 Clause groups
3 Properties: unique    Members: atomic_default_mem_order, dynamic_allocators,
                                 reverse_offload, unified_address, unified_shared_memory

4 Directives
5 requires

6 Semantics
7 The requirement clause grouping defines a set of clauses that indicate the requirement that a
8 program requires the implementation to support. Other than atomic_default_mem_order,
9 the members of the set are inarguable.
10 If an implementation supports a given requirement clause then the use of that clause on a
11 requires directive will cause the implementation to ensure the enforcement of a guarantee
12 represented by the specific member of the clause grouping. If the implementation does not support
13 the requirement then it must perform compile-time error termination.
14 The reverse_offload clause requires an implementation to guarantee that if a target
15 construct specifies a device clause in which the ancestor modifier appears, the target
16 region can execute on the parent device of an enclosing target region.
17 The unified_address clause requires an implementation to guarantee that all devices
18 accessible through OpenMP API routines and directives use a unified address space. In this address
19 space, a pointer will always refer to the same location in memory from all devices accessible
20 through OpenMP. Any OpenMP mechanism that returns a device pointer is guaranteed to return a
21 device address that supports pointer arithmetic, and the is_device_ptr clause is not necessary
22 to obtain device addresses from device pointers for use inside target regions. Host pointers may
23 be passed as device pointer arguments to device memory routines and device pointers may be
24 passed as host pointer arguments to device memory routines. Non-host devices may still have
25 discrete memories and dereferencing a device pointer on the host device or a host pointer on a
26 non-host device remains unspecified behavior. Memory local to a specific execution context may be
27 exempt from the unified_address requirement, following the restrictions of locality to a given
28 execution context, thread or contention group.
29 The unified_shared_memory clause implies the unified_address requirement,
30 inheriting all of its behaviors. The implementation must also guarantee that storage locations in
31 memory are accessible to threads on all available devices that the implementation supports, except
32 for memory that is local to a specific execution context as defined in the description of
33 unified_address above. Every device address that refers to storage allocated through
34 OpenMP device memory routines is a valid host pointer that may be dereferenced.
35 The unified_shared_memory clause makes map clauses optional on target constructs and
36 declare target directives optional for variables with static storage duration that are accessed inside
1 functions to which a declare target directive is applied. Scalar variables are still firstprivate by
2 default when referenced inside target constructs. Values stored into memory by one device may
3 not be visible to another device until those two devices synchronize with each other or both devices
4 synchronize with the host.
5 The dynamic_allocators clause removes certain restrictions on the use of memory allocators
6 in target regions. Specifically, allocators may be used in a target region without specifying
7 the uses_allocators clause on the corresponding target construct. The implementation
8 must support calls to the omp_init_allocator and omp_destroy_allocator API
9 routines in target regions. Finally, default allocators may be used on allocate directives and
10 allocate clauses, and in omp_alloc API routines in target regions.
11 The atomic_default_mem_order clause specifies the default memory ordering behavior for
12 atomic constructs that an implementation must provide. The effect is as if its argument appears as
13 a clause on any atomic construct that does not specify a memory order clause.
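The effect of the atomic_default_mem_order clause can be sketched in C as follows. The function count_even is hypothetical. The requires directive appears at file scope, lexically before any atomic construct, as the restrictions demand. Without OpenMP support the loop runs serially and yields the same count.

```c
/* Default memory ordering for atomic constructs in this compilation unit. */
#pragma omp requires atomic_default_mem_order(seq_cst)

int count_even(const int *v, int n)
{
    int count = 0;
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        if (v[i] % 2 == 0) {
            /* No memory-order clause is specified, so the effect is
               as if seq_cst appeared on this atomic construct. */
            #pragma omp atomic update
            count++;
        }
    }
    return count;
}
```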

14 Cross References
15 • requires directive, see Section 8.2

16 8.3 Assumption Directives


17 Assumption directives provide invariants that specify additional information about the expected
18 properties of the program that can optionally be used for optimization. An implementation may
19 ignore this information without altering the behavior of the program. Different assumption
20 directive formats facilitate definition of assumptions for a scope that is appropriate to each base
21 language. The scope of a particular format is its assumption scope and is defined in the section that
22 defines that format. If the invariants do not hold at runtime, the behavior is unspecified.

23 8.3.1 assumption Clauses


24 Clause groups
25 Properties:    Members: absent, contains, holds, no_openmp,
                           no_openmp_routines, no_parallelism

26 Directives
27 assume, assumes, begin assumes

28 Semantics
29 The assumption clause grouping defines a set of clauses that indicate the assumptions that a
30 program ensures the implementation can exploit. Other than absent, contains and holds,
31 the members of the set are inarguable and unique.



1 The no_openmp clause guarantees that no OpenMP related code is executed in the assumption
2 scope. The no_openmp_routines clause guarantees that no explicit OpenMP runtime library
3 calls are executed in the assumption scope. The no_parallelism clause guarantees that no
4 OpenMP tasks (explicit or implicit) will be generated and that no SIMD constructs will be executed
5 in the assumption scope.
C++
6 The no_openmp clause also guarantees that no thread will throw an exception in the assumption
7 scope if it is contained in a region that arises from an exception-aborting directive.
C++
8 The absent and contains clauses accept a directive-name list that may match a construct that
9 is encountered within the assumption scope. An encountered construct matches the directive name
10 if it or (if it is a combined or composite construct) one of its leaf constructs has the same
11 directive-name as one of the members of the list. The absent clause specifies that the program
12 guarantees that no constructs that match a listed directive name are encountered in the assumption
13 scope. The contains clause specifies that constructs that match the listed directive names are
14 likely to be encountered in the assumption scope.
15 When the holds clause appears on an assumption directive, the program guarantees that the listed
16 expression evaluates to true in the assumption scope. The effect of the clause does not include an
17 observable evaluation of the expression.
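The holds and no_openmp_routines clauses can be sketched with the block-associated assume directive, assuming an OpenMP 5.1 or later compiler; otherwise the directives are ignored. The function name sum_blocks_of_four is hypothetical. The caller is responsible for the invariant: if n is not a multiple of 4, the behavior is unspecified.

```c
int sum_blocks_of_four(const int *x, int n)
{
    int s = 0;
    /* The program guarantees that n is a multiple of 4 and that no
       OpenMP runtime library routine is called in this scope. The
       implementation may use or ignore this information. */
    #pragma omp assume holds(n % 4 == 0) no_openmp_routines
    {
        #pragma omp simd reduction(+:s)
        for (int i = 0; i < n; ++i)
            s += x[i];
    }
    return s;
}
```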

18 Restrictions
19 The restrictions to assumption clauses are as follows:
20 • A directive-name list member must not specify a combined or composite directive.
21 • A directive-name list member must not specify a directive that is a declarative directive, an
22 informational directive other than the error directive, or a metadirective.

23 Cross References
24 • assume directive, see Section 8.3.3
25 • assumes directive, see Section 8.3.2
26 • begin assumes directive, see Section 8.3.4

27 8.3.2 assumes Directive


28 Name: assumes              Association: none
   Category: informational    Properties: pure

29 Clause groups
30 assumption



1 Semantics
2 The assumption scope of the assumes directive is the code executed and reached from the current
3 compilation unit.

4 Restrictions
5 The restrictions to the assumes directive are as follows:
C
6 • The assumes directive may only appear at file scope.
C
C++
7 • The assumes directive may only appear at file or namespace scope.
C++
Fortran
8 • The assumes directive may only appear in the specification part of a module or subprogram,
9 after any USE statement, any IMPORT statement, and any IMPLICIT statement.
Fortran

10 8.3.3 assume Directive


11 Name: assume               Association: block
   Category: informational    Properties: pure

12 Clause groups
13 assumption

14 Semantics
15 The assumption scope of the assume directive is the code executed in the corresponding region or
16 in any region that is nested in the corresponding region.

C / C++

17 8.3.4 begin assumes Directive


18 Name: begin assumes        Association: delimited (declaration-definition-seq)
   Category: informational    Properties: default

19 Clause groups
20 assumption



1 Semantics
2 The assumption scope of the begin assumes directive is the code that is executed and reached
3 from any of the declared functions in the delimited code region.
C / C++

4 8.4 nothing Directive


5 Name: nothing        Association: none
  Category: utility    Properties: pure

6 Semantics
7 The nothing directive has no effect on the execution of the OpenMP program.

8 Cross References
9 • Metadirectives, see Section 7.4

10 8.5 error Directive


11 Name: error          Association: none
   Category: utility    Properties: pure

12 Clauses
13 at, message, severity

14 Semantics
15 The error directive instructs the compiler or runtime to perform an error action. The error action
16 displays an implementation-defined message. The severity clause determines whether the error
17 action is abortive following the display of the message. If sev-level is fatal and action-time is
18 compilation, the message is displayed and compilation of the current compilation unit is
19 aborted. If sev-level is fatal and action-time is execution, the message is displayed and
20 program execution is aborted.

21 Execution Model Events


22 The runtime-error event occurs when a thread encounters an error directive for which the at
23 clause specifies execution.

24 Tool Callbacks
25 A thread dispatches a registered ompt_callback_error callback for each occurrence of a
26 runtime-error event in the context of the encountering task. This callback has the type signature
27 ompt_callback_error_t.



1 Restrictions
2 Restrictions to the error directive are as follows:
3 • The directive is pure only if action-time is compilation.

4 Cross References
5 • ompt_callback_error_t, see Section 19.5.2.30
6 • at clause, see Section 8.1
7 • message clause, see Section 8.5.2
8 • severity clause, see Section 8.5.1

9 8.5.1 severity Clause


10 Name: severity Properties: unique

11 Arguments
Name Type Properties
12
sev-level Keyword: fatal, warning default

13 Directives
14 error

15 Semantics
16 The severity clause determines the action that the implementation performs. If sev-level is
17 warning, the implementation takes no action besides displaying the message that is associated
18 with the directive. If sev-level is fatal, the implementation performs the abortive action
19 associated with the directive on which the clause appears. If no severity clause is specified then
20 the effect is as if sev-level is fatal.
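As an illustration of how the severity and at clauses combine, the following minimal sketch issues a non-fatal diagnostic at run time. The helper name round_up_to_multiple and the message text are hypothetical and not part of this specification. The directive is guarded by a version check (202011 is the _OPENMP value for OpenMP 5.1, the version that introduced the error directive) so that the sketch also builds with compilers that do not recognize it.

```c
/* Hypothetical helper: rounds n (n >= 0) up to the next multiple of m
 * (m > 0) and reports a warning when rounding was necessary. */
int round_up_to_multiple(int n, int m)
{
    int rem = n % m;
    if (rem != 0) {
#if defined(_OPENMP) && _OPENMP >= 202011
        /* severity(warning): the message is displayed and execution continues.
         * at(execution): the error action is performed when a thread
         * encounters the directive, not when the code is compiled. */
        #pragma omp error at(execution) severity(warning) message("n was rounded up")
#endif
        n += m - rem;
    }
    return n;
}
```

If the severity clause were omitted, sev-level would default to fatal and program execution would be aborted after the message is displayed.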

21 Cross References
22 • error directive, see Section 8.5

23 8.5.2 message Clause


24 Name: message Properties: unique

25 Arguments
Name Type Properties
26
msg-string expression of string type default

27 Directives
28 error



1 Semantics
2 The message clause specifies that msg-string is included in the implementation-defined message
3 that is associated with the directive on which the clause appears.

4 Restrictions
C / C++
5 • If the action-time is compilation, msg-string must be a constant string literal.
C / C++
Fortran
6 • If the action-time is compilation, msg-string must be a constant character expression.
Fortran
7 Cross References
8 • error directive, see Section 8.5



1 9 Loop Transformation Constructs
2 A loop transformation construct replaces itself, including its associated loop nest, with a structured
3 block that may be another loop nest. If the loop transformation construct is nested inside another
4 loop nest, its replacement becomes part of that loop nest and therefore its generated loops may
5 become associated with another loop-associated directive that forms an enclosing construct. A loop
6 transformation construct that is closely nested within another loop transformation construct applies
7 before the enclosing loop transformation construct.
8 The associated loop nest of a loop transformation construct must have canonical loop nest form (see
9 Section 4.4.1). All generated loops have canonical loop nest form, unless otherwise specified. Loop
10 iteration variables of generated loops are always private in the enclosing parallelism-generating
11 construct.

12 Cross References
13 • Canonical Loop Nest Form, see Section 4.4.1

14 9.1 tile Construct


Name: tile Association: loop
15
Category: executable Properties: pure

16 Clauses
17 sizes

18 Semantics
The tile construct tiles the outer $n$ loops of the associated loop nest, where $n$ is the number of
items in the sizes clause, which consists of items $s_1, \ldots, s_n$. Let $\ell_1, \ldots, \ell_n$ be the associated
loops, from outermost to innermost, which the construct replaces with a loop nest that consists of
$2n$ perfectly nested loops. Let $f_1, \ldots, f_n, t_1, \ldots, t_n$ be the generated loops, from outermost to
innermost. The loops $f_1, \ldots, f_n$ are the floor loops and the loops $t_1, \ldots, t_n$ are the tile loops. The
tile loops do not have canonical loop nest form.
Let $\Omega$ be the logical iteration vector space of the associated loops. For any $(\alpha_1, \ldots, \alpha_n) \in \mathbb{N}^n$,
define the set of iterations $\{(i_1, \ldots, i_n) \in \Omega \mid \forall k \in \{1, \ldots, n\} : s_k \alpha_k \le i_k < s_k \alpha_k + s_k\}$ to be
tile $T_{\alpha_1, \ldots, \alpha_n}$ and $F = \{T_{\alpha_1, \ldots, \alpha_n} \mid T_{\alpha_1, \ldots, \alpha_n} \ne \emptyset\}$ to be the set of tiles with at least one iteration.
Tiles that contain $\prod_{k=1}^{n} s_k$ iterations are complete tiles. Otherwise, they are partial tiles.
The floor loops iterate over all tiles $\{T_{\alpha_1, \ldots, \alpha_n} \in F\}$ in lexicographic order with respect to their
indices $(\alpha_1, \ldots, \alpha_n)$ and the tile loops iterate over the iterations in $T_{\alpha_1, \ldots, \alpha_n}$ in the lexicographic
order of the corresponding iteration vectors. An implementation may reorder the sequential
execution of two iterations if at least one is from a partial tile and if their respective logical iteration
vectors in loop-nest do not have a product order relation.

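The iteration order that the tile construct produces can be sketched by writing the floor and tile loops out by hand. The function below is a hypothetical illustration, not part of this specification. It assumes a rectangular two-deep loop nest with zero-based, unit-stride loops, roughly what a tile directive with two sizes items s1 and s2 would generate, and it records the order in which iterations are visited. An implementation is allowed to reorder iterations that involve partial tiles, so the recorded order is one permitted order rather than the only one.

```c
/* Records the order in which a tiled rows x cols loop nest visits its
 * iterations. order[k] receives the flattened index i * cols + j of the
 * k-th visited iteration. Returns the number of iterations visited. */
int tiled_visit_order(int rows, int cols, int s1, int s2, int *order)
{
    int k = 0;
    for (int f1 = 0; f1 < rows; f1 += s1) {          /* floor loop f1 */
        for (int f2 = 0; f2 < cols; f2 += s2) {      /* floor loop f2 */
            /* Clamp the tile bounds so that partial tiles stay inside
             * the iteration space. */
            int i_end = (f1 + s1 < rows) ? f1 + s1 : rows;
            int j_end = (f2 + s2 < cols) ? f2 + s2 : cols;
            for (int i = f1; i < i_end; i++) {       /* tile loop t1 */
                for (int j = f2; j < j_end; j++) {   /* tile loop t2 */
                    order[k++] = i * cols + j;
                }
            }
        }
    }
    return k;
}
```

For a 3 x 3 iteration space with tile sizes 2 and 2, the sketch visits one complete tile of four iterations followed by three partial tiles.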
6 Restrictions
7 Restrictions to the tile construct are as follows:
8 • The depth of the associated loop nest must be greater than or equal to n.
9 • All loops that are associated with the construct must be perfectly nested.
10 • No loop that is associated with the construct may be a non-rectangular loop.

11 Cross References
12 • sizes clause, see Section 9.1.1

13 9.1.1 sizes Clause


14 Name: sizes Properties: unique, required

15 Arguments
Name Type Properties
16
size-list list of expression of integer type constant, positive

17 Directives
18 tile

19 Semantics
20 The sizes clause specifies a list of n compile-time constant, positive OpenMP integer expressions.

21 Cross References
22 • tile directive, see Section 9.1

23 9.2 unroll Construct


Name: unroll Association: loop
24
Category: executable Properties: pure

25 Clauses
26 full, partial

27 Clause set
28 Properties: exclusive Members: full, partial



1 Semantics
2 The unroll construct unrolls the outermost loop of the loop nest according to its specified clause.
3 If no clauses are specified, if and how the loop is unrolled is implementation defined. The unroll
4 construct results in a generated loop that has canonical loop nest form if and only if the partial
5 clause is specified.

6 Cross References
7 • full clause, see Section 9.2.1
8 • partial clause, see Section 9.2.2

9 9.2.1 full Clause


10 Name: full Properties: unique

11 Directives
12 unroll

13 Semantics
14 The full clause specifies that the associated loop is fully unrolled. The construct is replaced by a
15 structured block that only contains n instances of its loop body, one for each of the n logical
16 iterations of the associated loop and in their logical iteration order.

17 Restrictions
18 Restrictions to the full clause are as follows:
19 • The iteration count of the associated loop must be a compile-time constant.

20 Cross References
21 • unroll directive, see Section 9.2

22 9.2.2 partial Clause


23 Name: partial Properties: unique

24 Arguments
Name            Type                          Properties
unroll-factor   expression of integer type    optional, constant, positive

26 Directives
27 unroll



1 Semantics
2 The partial clause specifies that the associated loop is first tiled with a tile size of unroll-factor.
3 Then, the generated tile loop is fully unrolled. If the partial clause is used without an
4 unroll-factor argument then the unroll factor is a positive integer that is implementation defined.
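The two-step description (tile, then fully unroll the tile loop) can be sketched by hand for an unroll factor of 4. The function name sum_unrolled_by_4 is hypothetical. The separate remainder loop is one possible way to handle a final partial tile and is an assumption of this sketch, since the way an implementation handles the remainder is not prescribed here.

```c
/* Sums a[0..n-1] with the loop unrolled by a factor of 4, written out
 * by hand. */
long sum_unrolled_by_4(const int *a, int n)
{
    long sum = 0;
    int i = 0;
    /* Floor loop over complete tiles of 4 iterations; the tile loop is
     * fully unrolled into four copies of the loop body. */
    for (; i + 4 <= n; i += 4) {
        sum += a[i];
        sum += a[i + 1];
        sum += a[i + 2];
        sum += a[i + 3];
    }
    /* Remainder: the final partial tile, if n is not a multiple of 4. */
    for (; i < n; i++) {
        sum += a[i];
    }
    return sum;
}
```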

5 Cross References
6 • unroll directive, see Section 9.2



1 10 Parallelism Generation and Control
2 This chapter defines constructs for generating and controlling parallelism.

3 10.1 parallel Construct


Name: parallel    Association: block
Category: executable    Properties: parallelism-generating, cancellable, thread-limiting, context-matching

5 Clauses
6 allocate, copyin, default, firstprivate, if, num_threads, private,
7 proc_bind, reduction, shared

8 Binding
9 The binding thread set for a parallel region is the encountering thread. The encountering thread
10 becomes the primary thread of the new team.

11 Semantics
12 When a thread encounters a parallel construct, a team of threads is created to execute the
13 parallel region (see Section 10.1.1 for more information about how the number of threads in
14 the team is determined, including the evaluation of the if and num_threads clauses). The
15 thread that encountered the parallel construct becomes the primary thread of the new team,
16 with a thread number of zero for the duration of the new parallel region. All threads in the new
17 team, including the primary thread, execute the region. Once the team is created, the number of
18 threads in the team remains constant for the duration of that parallel region.
19 Within a parallel region, thread numbers uniquely identify each thread. Thread numbers are
20 consecutive whole numbers ranging from zero for the primary thread up to one less than the
21 number of threads in the team. A thread may obtain its own thread number by a call to the
22 omp_get_thread_num library routine.
23 A set of implicit tasks, equal in number to the number of threads in the team, is generated by the
24 encountering thread. The structured block of the parallel construct determines the code that
25 will be executed in each implicit task. Each task is assigned to a different thread in the team and
26 becomes tied. The task region of the task that the encountering thread is executing is suspended and
27 each thread in the team executes its implicit task. Each thread can execute a path of statements that
28 is different from that of the other threads.

1 The implementation may cause any thread to suspend execution of its implicit task at a task
2 scheduling point, and to switch to execution of any explicit task generated by any of the threads in
3 the team, before eventually resuming execution of the implicit task (for more details see
4 Chapter 12).
5 An implicit barrier occurs at the end of a parallel region. After the end of a parallel region,
6 only the primary thread of the team resumes execution of the enclosing task region.
7 If a thread in a team that is executing a parallel region encounters another parallel
8 directive, it creates a new team, according to the rules in Section 10.1.1, and it becomes the primary
9 thread of that new team.
10 If execution of a thread terminates while inside a parallel region, execution of all threads in all
11 teams terminates. The order of termination of threads is unspecified. All work done by a team prior
12 to any barrier that the team has passed in the program is guaranteed to be complete. The amount of
13 work done by each thread after the last barrier that it passed and before it terminates is unspecified.
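A minimal sketch of these semantics follows. The helper name record_thread_numbers is hypothetical. Each thread of the team executes the structured block as its implicit task and stores its own thread number. After the implicit barrier at the end of the region, only the encountering (primary) thread continues. The _OPENMP guards are an assumption of this sketch so that it also builds, and runs with a single thread, when OpenMP support is disabled.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: each thread of the team stores its thread number in
 * ids[thread number]. Returns the number of threads in the team.
 * Requires capacity >= 1. */
int record_thread_numbers(int *ids, int capacity)
{
    int team_size = 1;
    #pragma omp parallel num_threads(capacity) shared(ids, team_size)
    {
#ifdef _OPENMP
        int tid = omp_get_thread_num();
        int nthreads = omp_get_num_threads();
#else
        int tid = 0;        /* serial build: a single thread numbered 0 */
        int nthreads = 1;
#endif
        if (tid < capacity)
            ids[tid] = tid;
        if (tid == 0)
            team_size = nthreads;   /* written by the primary thread only */
    }   /* implicit barrier: all writes are complete past this point */
    return team_size;
}
```

Because thread numbers are consecutive whole numbers starting at zero, the first team_size entries of ids hold 0, 1, and so on after the call.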

14 Execution Model Events


15 The parallel-begin event occurs in a thread that encounters a parallel construct before any
16 implicit task is created for the corresponding parallel region.
17 Upon creation of each implicit task, an implicit-task-begin event occurs in the thread that executes
18 the implicit task after the implicit task is fully initialized but before the thread begins to execute the
19 structured block of the parallel construct.
20 If the parallel region creates a native thread, a native-thread-begin event occurs as the first
21 event in the context of the new thread prior to the implicit-task-begin event.
22 Events associated with implicit barriers occur at the end of a parallel region. Section 15.3.2
23 describes events associated with implicit barriers.
24 When a thread finishes an implicit task, an implicit-task-end event occurs in the thread after events
25 associated with implicit barrier synchronization in the implicit task.
26 The parallel-end event occurs in the thread that encounters the parallel construct after the
27 thread executes its implicit-task-end event but before the thread resumes execution of the
28 encountering task.
29 If a native thread is destroyed at the end of a parallel region, a native-thread-end event occurs
30 in the thread as the last event prior to destruction of the thread.

31 Tool Callbacks
32 A thread dispatches a registered ompt_callback_parallel_begin callback for each
33 occurrence of a parallel-begin event in that thread. The callback occurs in the task that encounters
34 the parallel construct. This callback has the type signature
35 ompt_callback_parallel_begin_t. In the dispatched callback,
36 (flags & ompt_parallel_team) evaluates to true.



1 A thread dispatches a registered ompt_callback_implicit_task callback with
2 ompt_scope_begin as its endpoint argument for each occurrence of an implicit-task-begin
3 event in that thread. Similarly, a thread dispatches a registered
4 ompt_callback_implicit_task callback with ompt_scope_end as its endpoint
5 argument for each occurrence of an implicit-task-end event in that thread. The callbacks occur in
6 the context of the implicit task and have type signature ompt_callback_implicit_task_t.
7 In the dispatched callback, (flags & ompt_task_implicit) evaluates to true.
8 A thread dispatches a registered ompt_callback_parallel_end callback for each
9 occurrence of a parallel-end event in that thread. The callback occurs in the task that encounters
10 the parallel construct. This callback has the type signature
11 ompt_callback_parallel_end_t.
12 A thread dispatches a registered ompt_callback_thread_begin callback for the
13 native-thread-begin event in that thread. The callback occurs in the context of the thread. The
14 callback has type signature ompt_callback_thread_begin_t.
15 A thread dispatches a registered ompt_callback_thread_end callback for the
16 native-thread-end event in that thread. The callback occurs in the context of the thread. The
17 callback has type signature ompt_callback_thread_end_t.

18 Cross References
19 • Determining the Number of Threads for a parallel Region, see Section 10.1.1
20 • omp_get_thread_num, see Section 18.2.4
21 • ompt_callback_implicit_task_t, see Section 19.5.2.11
22 • ompt_callback_parallel_begin_t, see Section 19.5.2.3
23 • ompt_callback_parallel_end_t, see Section 19.5.2.4
24 • ompt_callback_thread_begin_t, see Section 19.5.2.1
25 • ompt_callback_thread_end_t, see Section 19.5.2.2
26 • ompt_scope_endpoint_t, see Section 19.4.4.11
27 • allocate clause, see Section 6.6
28 • copyin clause, see Section 5.7.1
29 • default clause, see Section 5.4.1
30 • firstprivate clause, see Section 5.4.4
31 • if clause, see Section 3.4
32 • num_threads clause, see Section 10.1.2
33 • private clause, see Section 5.4.3
34 • proc_bind clause, see Section 10.1.4



1 • reduction clause, see Section 5.5.8
2 • shared clause, see Section 5.4.2

3 10.1.1 Determining the Number of Threads for a parallel Region
5 When execution encounters a parallel directive, the value of the if clause or num_threads
6 clause (if any) on the directive, the current parallel context, and the values of the nthreads-var,
7 dyn-var, thread-limit-var, and max-active-levels-var ICVs are used to determine the number of
8 threads to use in the region.
9 Using a variable in an if or num_threads clause expression of a parallel construct causes
10 an implicit reference to the variable in all enclosing constructs. The if clause expression and the
11 num_threads clause expression are evaluated in the context outside of the parallel construct,
12 and no ordering of those evaluations is specified. In what order or how many times any side effects
13 of the evaluation of the num_threads or if clause expressions occur is also unspecified.
14 When a thread encounters a parallel construct, the number of threads is determined according
15 to Algorithm 2.1.

Algorithm 2.1
let ThreadsBusy be the number of OpenMP threads currently executing in this contention group;
if an if clause exists
    then let IfClauseValue be the value of the if clause expression;
    else let IfClauseValue = true;
if a num_threads clause exists
    then let ThreadsRequested be the value of the num_threads clause expression;
    else let ThreadsRequested = value of the first element of nthreads-var;
let ThreadsAvailable = (thread-limit-var - ThreadsBusy + 1);
if (IfClauseValue = false)
    then number of threads = 1;
else if (active-levels-var ≥ max-active-levels-var)
    then number of threads = 1;
else if (dyn-var = true) and (ThreadsRequested ≤ ThreadsAvailable)
    then 1 ≤ number of threads ≤ ThreadsRequested;
else if (dyn-var = true) and (ThreadsRequested > ThreadsAvailable)
    then 1 ≤ number of threads ≤ ThreadsAvailable;
else if (dyn-var = false) and (ThreadsRequested ≤ ThreadsAvailable)
    then number of threads = ThreadsRequested;
else if (dyn-var = false) and (ThreadsRequested > ThreadsAvailable)
    then behavior is implementation defined;

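Algorithm 2.1 can be transcribed almost line for line into C. The sketch below is illustrative only. The struct team_size, the function name algorithm_2_1, and the convention of returning max equal to 0 for the implementation-defined case are assumptions of this example. The ICV values are passed in as plain parameters rather than read from a runtime.

```c
#include <stdbool.h>

/* Outcome of Algorithm 2.1. When exact is true the team has exactly max
 * threads; otherwise the implementation may choose any count in [1, max].
 * max == 0 marks the implementation-defined case (a convention of this
 * sketch only). */
struct team_size {
    bool exact;
    int  max;
};

struct team_size algorithm_2_1(bool has_if, bool if_value,
                               bool has_num_threads, int num_threads_value,
                               int nthreads_var_first, bool dyn_var,
                               int thread_limit_var, int threads_busy,
                               int active_levels_var, int max_active_levels_var)
{
    bool if_clause_value = has_if ? if_value : true;
    int requested = has_num_threads ? num_threads_value : nthreads_var_first;
    int available = thread_limit_var - threads_busy + 1;

    if (!if_clause_value)
        return (struct team_size){ true, 1 };
    if (active_levels_var >= max_active_levels_var)
        return (struct team_size){ true, 1 };
    if (dyn_var)    /* upper bound is the smaller of requested and available */
        return (struct team_size){ false,
                                   requested <= available ? requested : available };
    if (requested <= available)
        return (struct team_size){ true, requested };
    return (struct team_size){ true, 0 };   /* implementation defined */
}
```

For example, with no clauses, nthreads-var starting with 4, dyn-var false, thread-limit-var 16, and one busy thread, the sketch reports exactly 4 threads.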
8 Cross References
9 • dyn-var ICV, see Table 2.1
10 • if clause, see Section 3.4
11 • max-active-levels-var ICV, see Table 2.1
12 • nthreads-var ICV, see Table 2.1
13 • num_threads clause, see Section 10.1.2
14 • parallel directive, see Section 10.1
15 • thread-limit-var ICV, see Table 2.1

16 10.1.2 num_threads Clause


17 Name: num_threads Properties: unique

18 Arguments
Name Type Properties
19
nthreads expression of integer type positive

20 Directives
21 parallel

22 Semantics
23 The num_threads clause specifies the desired number of threads to execute a parallel region.

24 Cross References
25 • parallel directive, see Section 10.1



1 10.1.3 Controlling OpenMP Thread Affinity
2 When a thread encounters a parallel directive without a proc_bind clause, the bind-var ICV
3 is used to determine the policy for assigning OpenMP threads to places within the current place
4 partition, that is, within the places listed in the place-partition-var ICV for the implicit task of the
5 encountering thread. If the parallel directive has a proc_bind clause then the binding policy
6 specified by the proc_bind clause overrides the policy specified by the first element of the
7 bind-var ICV. Once a thread in the team is assigned to a place, the OpenMP implementation should
8 not move it to another place.
9 The primary thread affinity policy instructs the execution environment to assign every thread in
10 the team to the same place as the primary thread. The place partition is not changed by this policy,
11 and each implicit task inherits the place-partition-var ICV of the parent implicit task. The master
12 thread-affinity policy, which has been deprecated, has identical semantics to the primary thread
13 affinity policy.
14 The close thread affinity policy instructs the execution environment to assign the threads in the
15 team to places close to the place of the parent thread. The place partition is not changed by this
16 policy, and each implicit task inherits the place-partition-var ICV of the parent implicit task. If T
17 is the number of threads in the team, and P is the number of places in the parent’s place partition,
18 then the assignment of threads in the team to places is as follows:
19 • T ≤ P : The primary thread executes on the place of the parent thread. The thread with the next
20 smallest thread number executes on the next place in the place partition, and so on, with wrap
21 around with respect to the place partition of the primary thread.
22 • T > P : Each place p will contain $S_p$ threads with consecutive thread numbers, where
23 $\lfloor T/P \rfloor \le S_p \le \lceil T/P \rceil$. The first $S_0$ threads (including the primary thread) are assigned to the
24 place of the parent thread. The next $S_1$ threads are assigned to the next place in the place
25 partition, and so on, with wrap around with respect to the place partition of the primary thread.
26 When P does not divide T evenly, the exact number of threads in a particular place is
27 implementation defined.
28 The purpose of the spread thread affinity policy is to create a sparse distribution for a team of T
29 threads among the P places of the parent’s place partition. A sparse distribution is achieved by first
30 subdividing the parent partition into T subpartitions if T ≤ P , or P subpartitions if T > P . Then
31 one thread (T ≤ P ) or a set of threads (T > P ) is assigned to each subpartition. The
32 place-partition-var ICV of each implicit task is set to its subpartition. The subpartitioning is not
33 only a mechanism for achieving a sparse distribution, it also defines a subset of places for a thread
34 to use when creating a nested parallel region. The assignment of threads to places is as follows:
35 • T ≤ P : The parent thread’s place partition is split into T subpartitions, where each subpartition
36 contains $\lfloor P/T \rfloor$ or $\lceil P/T \rceil$ consecutive places. A single thread is assigned to each subpartition.
37 The primary thread executes on the place of the parent thread and is assigned to the subpartition
38 that includes that place. The thread with the next smallest thread number is assigned to the first
39 place in the next subpartition, and so on, with wrap around with respect to the original place
40 partition of the primary thread.



1 • T > P : The parent thread’s place partition is split into P subpartitions, each consisting of a
2 single place. Each subpartition is assigned $S_p$ threads with consecutive thread numbers, where
3 $\lfloor T/P \rfloor \le S_p \le \lceil T/P \rceil$. The first $S_0$ threads (including the primary thread) are assigned to the
4 subpartition that contains the place of the parent thread. The next $S_1$ threads are assigned to the
5 next subpartition, and so on, with wrap around with respect to the original place partition of the
6 primary thread. When P does not divide T evenly, the exact number of threads in a particular
7 subpartition is implementation defined.
8 The determination of whether the affinity request can be fulfilled is implementation defined. If the
9 affinity request cannot be fulfilled, then the affinity of threads in the team is implementation defined.
10

11 Note – Wrap around is needed if the end of a place partition is reached before all thread
12 assignments are done. For example, wrap around may be needed in the case of close and T ≤ P ,
13 if the primary thread is assigned to a place other than the first place in the place partition. In this
14 case, thread 1 is assigned to the place after the place of the primary thread, thread 2 is assigned to
15 the place after that, and so on. The end of the place partition may be reached before all threads are
16 assigned. In this case, assignment of threads is resumed with the first place in the place partition.
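The close policy, including the wrap around described in the note, can be sketched as a small function. The name close_assignment and the array-based interface are hypothetical. When P does not divide T evenly, the exact number of threads per place is implementation defined, so the choice made here (the first T mod P places visited receive one extra thread) is only one admissible possibility and is an assumption of this sketch.

```c
/* Sketch of the close policy. place_of[t] receives the index (within the
 * parent's place partition of P places) assigned to thread t of a team of
 * T threads; parent_place is the index of the parent thread's place.
 * Assumption: when P does not divide T, the first T % P places visited
 * receive the extra thread. */
void close_assignment(int T, int P, int parent_place, int *place_of)
{
    if (T <= P) {
        for (int t = 0; t < T; t++)
            place_of[t] = (parent_place + t) % P;   /* wrap around */
        return;
    }
    int base = T / P;     /* floor(T/P) */
    int extra = T % P;    /* number of places that hold ceil(T/P) threads */
    int t = 0;
    for (int k = 0; k < P; k++) {
        int place = (parent_place + k) % P;         /* wrap around */
        int count = base + (k < extra ? 1 : 0);     /* threads for this place */
        for (int c = 0; c < count; c++)
            place_of[t++] = place;
    }
}
```

With T = 4, P = 8, and the parent thread on place 6, the threads land on places 6, 7, 0, and 1, which is the wrap-around case that the note describes.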
17

18 Cross References
19 • bind-var ICV, see Table 2.1
20 • parallel directive, see Section 10.1
21 • place-partition-var ICV, see Table 2.1
22 • proc_bind clause, see Section 10.1.4

23 10.1.4 proc_bind Clause


24 Name: proc_bind Properties: unique

25 Arguments
Name              Type                                                    Properties
affinity-policy   Keyword: close, master (deprecated), primary, spread    default

27 Directives
28 parallel

29 Semantics
30 The proc_bind clause specifies the mapping of OpenMP threads to places within the current
31 place partition, that is, within the places listed in the place-partition-var ICV for the implicit task of
32 the encountering thread. The effects of the possible values for affinity-policy are described in
33 Section 10.1.3.



1 Cross References
2 • Controlling OpenMP Thread Affinity, see Section 10.1.3
3 • parallel directive, see Section 10.1

4 10.2 teams Construct


Name: teams    Association: block
Category: executable    Properties: parallelism-generating, thread-limiting, context-matching
6 Clauses
7 allocate, default, firstprivate, num_teams, private, reduction, shared,
8 thread_limit
9 Binding
10 The binding thread set for a teams region is the encountering thread.
11 Semantics
12 When a thread encounters a teams construct, a league of teams is created. Each team is an initial
13 team, and the initial thread in each team executes the teams region. The number of teams created
14 is determined by evaluating the if and num_teams clauses. Once the teams are created, the
15 number of initial teams remains constant for the duration of the teams region. Within a teams
16 region, initial team numbers uniquely identify each initial team. Initial team numbers are
17 consecutive whole numbers ranging from zero to one less than the number of initial teams.
18 When an if clause is present on a teams construct and the if clause expression evaluates to
19 false, the number of created teams is one. The use of a variable in an if clause expression of a
20 teams construct causes an implicit reference to the variable in all enclosing constructs. The if
21 clause expression is evaluated in the context outside of the teams construct.
22 If a thread_limit clause is not present on the teams construct, but the construct is closely
23 nested inside a target construct on which the thread_limit clause is specified, the behavior
24 is as if that thread_limit clause is also specified for the teams construct.
25 On a combined or composite construct that includes target and teams constructs, the
26 expressions in num_teams and thread_limit clauses are evaluated on the host device on
27 entry to the target construct.
28 The place list, given by the place-partition-var ICV of the encountering thread, is split into
29 subpartitions in an implementation-defined manner, and each team is assigned to a subpartition by
30 setting the place-partition-var of its initial thread to the subpartition.
31 The teams construct sets the default-device-var ICV for each initial thread to an
32 implementation-defined value.
33 After the teams have completed execution of the teams region, the encountering task resumes
34 execution of the enclosing task region.
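The following minimal sketch shows a teams region that is strictly nested within the implicit parallel region of the program (that is, not inside a target region). The helper name mark_teams is hypothetical. The initial thread of each team records its initial team number, and the only OpenMP regions nested in the teams region are calls to omp_get_team_num and omp_get_num_teams. The _OPENMP guards are an assumption of this sketch so that it also builds, with a single team, when OpenMP support is disabled.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: the initial thread of each team sets
 * seen[team number] to 1. Returns the number of teams in the league.
 * Requires max_teams >= 1. */
int mark_teams(int *seen, int max_teams)
{
    int league_size = 1;
    #pragma omp teams num_teams(max_teams) shared(seen, league_size)
    {
#ifdef _OPENMP
        int team = omp_get_team_num();
        int nteams = omp_get_num_teams();
#else
        int team = 0;       /* serial build: a single team numbered 0 */
        int nteams = 1;
#endif
        if (team < max_teams)
            seen[team] = 1;
        if (team == 0)
            league_size = nteams;   /* written by team 0 only */
    }   /* the encountering task resumes after all teams have completed */
    return league_size;
}
```

Because initial team numbers are consecutive whole numbers starting at zero, the first league_size entries of seen are set after the call.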



1 Execution Model Events
2 The teams-begin event occurs in a thread that encounters a teams construct before any initial task
3 is created for the corresponding teams region.
4 Upon creation of each initial task, an initial-task-begin event occurs in the thread that executes the
5 initial task after the initial task is fully initialized but before the thread begins to execute the
6 structured block of the teams construct.
7 If the teams region creates a native thread, a native-thread-begin event occurs as the first event in
8 the context of the new thread prior to the initial-task-begin event.
9 When a thread finishes an initial task, an initial-task-end event occurs in the thread.
10 The teams-end event occurs in the thread that encounters the teams construct after the thread
11 executes its initial-task-end event but before it resumes execution of the encountering task.
12 If a native thread is destroyed at the end of a teams region, a native-thread-end event occurs in the
13 thread as the last event prior to destruction of the thread.

14 Tool Callbacks
15 A thread dispatches a registered ompt_callback_parallel_begin callback for each
16 occurrence of a teams-begin event in that thread. The callback occurs in the task that encounters the
17 teams construct. This callback has the type signature
18 ompt_callback_parallel_begin_t. In the dispatched callback,
19 (flags & ompt_parallel_league) evaluates to true.
20 A thread dispatches a registered ompt_callback_implicit_task callback with
21 ompt_scope_begin as its endpoint argument for each occurrence of an initial-task-begin event in
22 that thread. Similarly, a thread dispatches a registered ompt_callback_implicit_task
23 callback with ompt_scope_end as its endpoint argument for each occurrence of an
24 initial-task-end event in that thread. The callbacks occur in the context of the initial task and have
25 type signature ompt_callback_implicit_task_t. In the dispatched callback,
26 (flags & ompt_task_initial) evaluates to true.
27 A thread dispatches a registered ompt_callback_parallel_end callback for each
28 occurrence of a teams-end event in that thread. The callback occurs in the task that encounters the
29 teams construct. This callback has the type signature ompt_callback_parallel_end_t.
30 A thread dispatches a registered ompt_callback_thread_begin callback for the
31 native-thread-begin event in that thread. The callback occurs in the context of the thread. The
32 callback has type signature ompt_callback_thread_begin_t.
33 A thread dispatches a registered ompt_callback_thread_end callback for the
34 native-thread-end event in that thread. The callback occurs in the context of the thread. The
35 callback has type signature ompt_callback_thread_end_t.



1 Restrictions
2 Restrictions to the teams construct are as follows:
3 • If a reduction-modifier is specified in a reduction clause that appears on the directive then the
4 reduction modifier must be default.
5 • A teams region must be strictly nested within the implicit parallel region that surrounds the
6 whole OpenMP program or a target region. If a teams region is nested inside a target
7 region, the corresponding target construct must not contain any statements, declarations or
8 directives outside of the corresponding teams construct.
9 • distribute regions, including any distribute regions arising from composite constructs,
10 parallel regions, including any parallel regions arising from combined constructs, loop
11 regions, omp_get_num_teams() regions, and omp_get_team_num() regions are the
12 only OpenMP regions that may be strictly nested inside the teams region.

13 Cross References
14 • omp_get_num_teams, see Section 18.4.1
15 • omp_get_team_num, see Section 18.4.2
16 • ompt_callback_implicit_task_t, see Section 19.5.2.11
17 • ompt_callback_parallel_begin_t, see Section 19.5.2.3
18 • ompt_callback_parallel_end_t, see Section 19.5.2.4
19 • ompt_callback_thread_begin_t, see Section 19.5.2.1
20 • ompt_callback_thread_end_t, see Section 19.5.2.2
21 • allocate clause, see Section 6.6
22 • default clause, see Section 5.4.1
23 • distribute directive, see Section 11.6
24 • firstprivate clause, see Section 5.4.4
25 • num_teams clause, see Section 10.2.1
26 • parallel directive, see Section 10.1
27 • private clause, see Section 5.4.3
28 • reduction clause, see Section 5.5.8
29 • shared clause, see Section 5.4.2
30 • target directive, see Section 13.8
31 • thread_limit clause, see Section 13.3



1 10.2.1 num_teams Clause
2 Name: num_teams Properties: unique

3 Arguments
Name Type Properties
4
upper-bound expression of integer type positive

5 Modifiers
Name          Modifies   Type                        Properties
lower-bound   Generic    OpenMP integer expression   positive, ultimate, unique

7 Directives
8 teams

9 Semantics
10 The num_teams clause specifies the bounds on the number of teams created by the construct on
11 which it appears. lower-bound specifies the lower bound and upper-bound specifies the upper
12 bound on the number of teams requested. If lower-bound is not specified, the effect is as if
13 lower-bound is specified as equal to upper-bound. The number of teams created is implementation
14 defined, but it will be greater than or equal to the lower bound and less than or equal to the upper
15 bound.
16 If the num_teams clause is not specified on a construct then the effect is as if upper-bound was
17 specified as follows. If the value of the nteams-var ICV is greater than zero, the effect is as if
18 upper-bound was specified to an implementation-defined value greater than zero but less than or
19 equal to the value of the nteams-var ICV. Otherwise, the effect is as if upper-bound was specified as
20 an implementation defined value greater than or equal to one.

21 Restrictions
22 • lower-bound must be less than or equal to upper-bound.

23 Cross References
24 • teams directive, see Section 10.2

25 10.3 order Clause


26 Name: order Properties: unique

27 Arguments
Name       Type                  Properties
ordering   Keyword: concurrent   default

1 Modifiers
Name             Modifies   Type                                   Properties
order-modifier   ordering   Keyword: reproducible, unconstrained   default

3 Directives
4 distribute, do, for, loop, simd

5 Semantics
6 The order clause specifies an ordering of execution for the iterations of the associated loops of a
7 loop-associated directive. If ordering is concurrent, the logical iterations of the associated
8 loops may execute in any order, including concurrently.
9 The order-modifier on the order clause affects the schedule specification for the purpose of
10 determining its consistency with other schedules (see Section 4.4.5). If order-modifier is
11 reproducible, the loop schedule for the construct on which the clause appears is reproducible,
12 whereas if order-modifier is unconstrained, the loop schedule is not reproducible.
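
A minimal sketch of a loop that is suitable for order(concurrent) follows. The function name saxpy_concurrent is an assumption made for illustration. Each iteration touches only its own element, so the iterations may run in any order, and the loop body contains no OpenMP runtime API calls and no threadprivate references.

```c
#include <assert.h>

/* Hypothetical example: y[i] += a * x[i] for independent iterations. */
void saxpy_concurrent(int n, float a, const float *x, float *y)
{
    assert(n >= 0 && x != 0 && y != 0);

    /* order(concurrent) asserts that the iterations may execute in any
       order, including concurrently. */
    #pragma omp parallel for order(concurrent)
    for (int i = 0; i < n; i++) {
        y[i] += a * x[i];
    }
}
```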

13 Restrictions
14 Restrictions to the order clause are as follows:
15 • The only constructs that may be encountered inside a region that corresponds to a construct with
16 an order clause that specifies concurrent are the loop construct, the parallel
17 construct, the simd construct, and combined constructs for which the first construct is a
18 parallel construct.
19 • A region that corresponds to a construct with an order clause that specifies concurrent may
20 not contain calls to procedures that contain OpenMP directives.
21 • A region that corresponds to a construct with an order clause that specifies concurrent may
22 not contain OpenMP runtime API calls.
23 • If a threadprivate variable is referenced inside a region that corresponds to a construct with an
24 order clause that specifies concurrent, the behavior is unspecified.

25 Cross References
26 • distribute directive, see Section 11.6
27 • do directive, see Section 11.5.2
28 • for directive, see Section 11.5.1
29 • loop directive, see Section 11.7
30 • simd directive, see Section 10.4

1 10.4 simd Construct
Name: simd              Association: loop
Category: executable    Properties: parallelism-generating, context-matching, simdizable, pure
3 Separating directives
4 scan
5 Clauses
6 aligned, collapse, if, lastprivate, linear, nontemporal, order, private,
7 reduction, safelen, simdlen
8 Binding
9 A simd region binds to the current task region. The binding thread set of the simd region is the
10 current team.
11 Semantics
12 The simd construct enables the execution of multiple iterations of the associated loops
13 concurrently by using SIMD instructions. At the beginning of each logical iteration, the loop
14 iteration variable or the variable declared by range-decl of each associated loop has the value that it
15 would have if the set of the associated loops was executed sequentially. The number of iterations
16 that are executed concurrently at any given time is implementation defined. Each concurrent
17 iteration will be executed by a different SIMD lane. Each set of concurrent iterations is a SIMD
18 chunk. Lexical forward dependences in the iterations of the original loop must be preserved within
19 each SIMD chunk, unless an order clause that specifies concurrent is present.
20 When an if clause is present and evaluates to false, the preferred number of iterations to be
21 executed concurrently is one, regardless of whether a simdlen clause is specified.
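
The sketch below shows one common use of the construct, a dot product with a reduction. The function name dot_simd is hypothetical. How many iterations are grouped into a SIMD chunk is left to the implementation, as described above.

```c
#include <assert.h>

/* Hypothetical example: dot product vectorized with the simd construct. */
float dot_simd(int n, const float *a, const float *b)
{
    assert(n >= 0 && a != 0 && b != 0);
    float sum = 0.0f;

    /* Each SIMD lane accumulates a private partial sum; the reduction
       clause combines the partial sums into sum after the loop. */
    #pragma omp simd reduction(+ : sum)
    for (int i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}
```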
22 Restrictions
23 Restrictions to the simd construct are as follows:
24 • If both simdlen and safelen clauses are specified, the value of the simdlen length must
25 be less than or equal to the value of the safelen length.
26 • Only simdizable constructs can be encountered during execution of a simd region.
27 • If an order clause that specifies concurrent appears on a simd directive, the safelen
28 clause may not also appear.
C / C++
29 • The simd region cannot contain calls to the longjmp or setjmp functions.
C / C++
C++
30 • No exception can be raised in the simd region.
31 • The only random access iterator types that are allowed for the associated loops are pointer types.
C++

1 Cross References
2 • aligned clause, see Section 5.11
3 • collapse clause, see Section 4.4.3
4 • if clause, see Section 3.4
5 • lastprivate clause, see Section 5.4.5
6 • linear clause, see Section 5.4.6
7 • nontemporal clause, see Section 10.4.1
8 • order clause, see Section 10.3
9 • private clause, see Section 5.4.3
10 • reduction clause, see Section 5.5.8
11 • safelen clause, see Section 10.4.2
12 • scan directive, see Section 5.6
13 • simdlen clause, see Section 10.4.3

14 10.4.1 nontemporal Clause


15 Name: nontemporal Properties: default

16 Arguments
Name   Type                             Properties
list   list of variable list item type  default

18 Directives
19 simd

20 Semantics
21 The nontemporal clause specifies that accesses to the storage locations to which the list items
22 refer have low temporal locality across the iterations in which those storage locations are accessed.
23 The list items of the nontemporal clause may also appear as list items of data-environment
24 attribute clauses.

25 Cross References
26 • simd directive, see Section 10.4

1 10.4.2 safelen Clause
2 Name: safelen Properties: unique

3 Arguments
Name     Type                         Properties
length   expression of integer type   positive, constant

5 Directives
6 simd

7 Semantics
8 The safelen clause specifies that no two concurrent iterations within a SIMD chunk can have a
9 distance in the logical iteration space that is greater than or equal to the value given in the clause.
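
To make the distance rule concrete, the following sketch uses a loop whose iteration i reads a value written by iteration i - 4. The function name shift_add is an assumption. With safelen(4), any two concurrent iterations are fewer than 4 apart, so an iteration is never executed concurrently with the one it depends on.

```c
#include <assert.h>

/* Hypothetical example: loop-carried dependence at distance 4. */
void shift_add(int n, int *a, const int *b)
{
    assert(a != 0 && b != 0);

    /* a[i] depends on a[i - 4]; safelen(4) keeps iterations that are
       4 or more apart out of the same SIMD chunk. */
    #pragma omp simd safelen(4)
    for (int i = 4; i < n; i++) {
        a[i] = a[i - 4] + b[i];
    }
}
```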

10 Cross References
11 • simd directive, see Section 10.4

12 10.4.3 simdlen Clause


13 Name: simdlen Properties: unique

14 Arguments
Name     Type                         Properties
length   expression of integer type   positive, constant

16 Directives
17 declare simd, simd

18 Semantics
19 When the simdlen clause appears on a simd construct, length is treated as a hint that specifies
20 the preferred number of iterations to be executed concurrently. When the simdlen clause appears
21 on a declare simd construct, if a SIMD version of the associated function is created, length
22 corresponds to the number of concurrent arguments of the function.

23 Cross References
24 • declare simd directive, see Section 7.7
25 • simd directive, see Section 10.4

1 10.5 masked Construct
Name: masked            Association: block
Category: executable    Properties: thread-limiting

3 Clauses
4 filter

5 Additional information
6 The directive-name master may be used as a synonym to masked if no clauses are specified.
7 This syntax has been deprecated.

8 Binding
9 The binding thread set for a masked region is the current team. A masked region binds to the
10 innermost enclosing parallel region.

11 Semantics
12 The masked construct specifies a structured block that is executed by a subset of the threads of the
13 current team. The filter clause selects a subset of the threads of the team that executes the
14 binding parallel region to execute the structured block of the masked region. Other threads in the
15 team do not execute the associated structured block. No implied barrier occurs either on entry to or
16 exit from the masked construct. The result of evaluating the thread_num parameter of the
17 filter clause may vary across threads.
18 If more than one thread in the team executes the structured block of a masked region, the
19 structured block must include any synchronization required to ensure that data races do not occur.

20 Execution Model Events


21 The masked-begin event occurs in any thread of a team that executes the masked region on entry
22 to the region.
23 The masked-end event occurs in any thread of a team that executes the masked region on exit from
24 the region.

25 Tool Callbacks
26 A thread dispatches a registered ompt_callback_masked callback with
27 ompt_scope_begin as its endpoint argument for each occurrence of a masked-begin event in
28 that thread. Similarly, a thread dispatches a registered ompt_callback_masked callback with
29 ompt_scope_end as its endpoint argument for each occurrence of a masked-end event in that
30 thread. These callbacks occur in the context of the task executed by the current thread and have the
31 type signature ompt_callback_masked_t.

32 Cross References
33 • ompt_callback_masked_t, see Section 19.5.2.12
34 • ompt_scope_endpoint_t, see Section 19.4.4.11
35 • filter clause, see Section 10.5.1

1 10.5.1 filter Clause
2 Name: filter Properties: unique

3 Arguments
Name         Type                         Properties
thread_num   expression of integer type   default

5 Directives
6 masked

7 Semantics
8 If thread_num specifies the thread number of the current thread in the current team then the
9 filter clause selects the current thread. If the filter clause is not specified, the effect is as if
10 the clause is specified with thread_num equal to zero, so that the filter clause selects the
11 primary thread. The use of a variable in a thread_num clause expression causes an implicit
12 reference to the variable in all enclosing constructs.
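
The sketch below counts how many threads execute a masked block for a given filter value. The function name masked_runs and the team size of 4 are assumptions, and the example needs a compiler that implements the masked construct. If the code is compiled without OpenMP support, the pragmas are ignored and the block runs once.

```c
#include <assert.h>

/* Hypothetical example: returns how many threads executed the masked
   block when filter(who) is specified. */
int masked_runs(int who)
{
    assert(who >= 0);
    int runs = 0;

    #pragma omp parallel num_threads(4) shared(runs)
    {
        /* Only the thread whose thread number equals who enters. */
        #pragma omp masked filter(who)
        {
            #pragma omp atomic
            runs++;
        }
        /* No implied barrier: the other threads continue immediately. */
    }
    return runs;
}
```

With who equal to zero the block is executed by the primary thread, which matches the behavior when the filter clause is omitted.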

13 Cross References
14 • masked directive, see Section 10.5

1 11 Work-Distribution Constructs
2 A work-distribution construct distributes the execution of the corresponding region among the
3 threads in its binding thread set. Threads execute portions of the region in the context of the
4 implicit tasks that each one is executing.
5 A work-distribution construct is worksharing if the binding thread set is a thread team. A
6 worksharing region has no barrier on entry; however, an implied barrier exists at the end of the
7 worksharing region, unless a nowait clause is specified. If a nowait clause is present, an
8 implementation may omit the barrier at the end of the worksharing region. In this case, threads that
9 finish early may proceed straight to the instructions that follow the worksharing region without
10 waiting for the other members of the team to finish the worksharing region, and without performing
11 a flush operation.
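
A small sketch of the nowait behavior follows, assuming two loops that touch disjoint arrays. The function name scale_two is hypothetical. Because the second loop neither reads nor writes anything produced by the first, omitting the barrier between them is safe.

```c
#include <assert.h>

/* Hypothetical example: two independent worksharing loops. */
void scale_two(int n, double *a, double *b)
{
    assert(n >= 0 && a != 0 && b != 0);

    #pragma omp parallel
    {
        /* nowait: threads that finish early move on to the next loop. */
        #pragma omp for nowait
        for (int i = 0; i < n; i++) {
            a[i] *= 2.0;
        }

        /* This loop uses only b, so it does not depend on the loop above.
           Its implied barrier still applies at the end. */
        #pragma omp for
        for (int i = 0; i < n; i++) {
            b[i] += 1.0;
        }
    }
}
```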

12 Restrictions
13 The following restrictions apply to work-distribution constructs:
14 • Each work-distribution region must be encountered by all threads in the binding thread set or by
15 none at all unless cancellation has been requested for the innermost enclosing parallel region.
16 • The sequence of encountered work-distribution regions that have the same binding thread set
17 must be the same for every thread in the binding thread set.
18 • The sequence of encountered worksharing regions and barrier regions that bind to the same
19 thread team must be the same for every thread in the team.

20 11.1 single Construct


Name: single            Association: block
Category: executable    Properties: work-distribution, worksharing, thread-limiting

22 Clauses
23 allocate, copyprivate, firstprivate, nowait, private

24 Binding
25 The binding thread set for a single region is the current team. A single region binds to the
26 innermost enclosing parallel region. Only the threads of the team that executes the binding
27 parallel region participate in the execution of the structured block and the implied barrier of the
28 single region if the barrier is not eliminated by a nowait clause.

1 Semantics
2 The single construct specifies that the associated structured block is executed by only one of the
3 threads in the team (not necessarily the primary thread), in the context of its implicit task. The
4 method of choosing a thread to execute the structured block each time the team encounters the
5 construct is implementation defined. An implicit barrier occurs at the end of a single region if
6 the nowait clause is not specified.
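
As an illustration, the sketch below lets one thread initialize a shared value and relies on the implicit barrier at the end of the single region before the other threads read it. The function name single_demo and the team size of 4 are assumptions.

```c
#include <assert.h>

/* Hypothetical example: returns the number of threads that failed to see
   the initialized value; *runs_out receives how many times the single
   block was executed. */
int single_demo(int *runs_out)
{
    assert(runs_out != 0);
    int value = 0, runs = 0, mismatches = 0;

    #pragma omp parallel num_threads(4) shared(value, runs, mismatches)
    {
        #pragma omp single
        {
            value = 42;   /* executed by exactly one thread of the team */
            runs++;       /* no race: only that thread reaches this line */
        }
        /* The implicit barrier of single has completed at this point. */
        if (value != 42) {
            #pragma omp atomic
            mismatches++;
        }
    }
    *runs_out = runs;
    return mismatches;
}
```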

7 Execution Model Events


8 The single-begin event occurs after an implicit task encounters a single construct but before the
9 task starts to execute the structured block of the single region.
10 The single-end event occurs after an implicit task finishes execution of a single region but before
11 it resumes execution of the enclosing region.

12 Tool Callbacks
13 A thread dispatches a registered ompt_callback_work callback with ompt_scope_begin
14 as its endpoint argument for each occurrence of a single-begin event in that thread. Similarly, a
15 thread dispatches a registered ompt_callback_work callback with ompt_scope_end as its
16 endpoint argument for each occurrence of a single-end event in that thread. For each of these
17 callbacks, the wstype argument is ompt_work_single_executor if the thread executes the
18 structured block associated with the single region; otherwise, the wstype argument is
19 ompt_work_single_other. The callback has type signature ompt_callback_work_t.

20 Restrictions
21 Restrictions to the single construct are as follows:
22 • The copyprivate clause must not be used with the nowait clause.

23 Cross References
24 • ompt_callback_work_t, see Section 19.5.2.5
25 • ompt_scope_endpoint_t, see Section 19.4.4.11
26 • ompt_work_t, see Section 19.4.4.16
27 • allocate clause, see Section 6.6
28 • copyprivate clause, see Section 5.7.2
29 • firstprivate clause, see Section 5.4.4
30 • nowait clause, see Section 15.6
31 • private clause, see Section 5.4.3

CHAPTER 11. WORK-DISTRIBUTION CONSTRUCTS 241


1 11.2 scope Construct
Name: scope             Association: block
Category: executable    Properties: work-distribution, worksharing, thread-limiting

3 Clauses
4 allocate, firstprivate, nowait, private, reduction

5 Binding
6 The binding thread set for a scope region is the current team. A scope region binds to the
7 innermost enclosing parallel region. Only the threads of the team that executes the binding parallel
8 region participate in the execution of the structured block and the implied barrier of the scope
9 region if the barrier is not eliminated by a nowait clause.

10 Semantics
11 The scope construct specifies that all threads in a team execute the associated structured block and
12 any additionally specified OpenMP operations. An implicit barrier occurs at the end of a scope
13 region if the nowait clause is not specified.
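
One use of the construct is to perform a reduction in the middle of a parallel region. The sketch below is a hedged illustration: the function name scope_demo and the team size of 4 are assumptions, and the example needs a compiler that implements the scope construct. Each thread contributes 1 inside the scope region, so the reduced total should equal the number of threads in the team.

```c
#include <assert.h>

/* Hypothetical example: returns 1 if the scope reduction total equals
   the number of threads that took part, 0 otherwise. */
int scope_demo(void)
{
    int threads = 0, total = 0;

    #pragma omp parallel num_threads(4) shared(threads, total)
    {
        /* Count the threads of the team. */
        #pragma omp atomic
        threads++;

        /* All threads execute the block; the reduction clause combines
           their private copies of total at the end of the region. */
        #pragma omp scope reduction(+ : total)
        {
            total += 1;
        }
    }
    return total == threads;
}
```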

14 Execution Model Events


15 The scope-begin event occurs after an implicit task encounters a scope construct but before the
16 task starts to execute the structured block of the scope region.
17 The scope-end event occurs after an implicit task finishes execution of a scope region but before it
18 resumes execution of the enclosing region.

19 Tool Callbacks
20 A thread dispatches a registered ompt_callback_work callback with ompt_scope_begin
21 as its endpoint argument and ompt_work_scope as its work_type argument for each occurrence
22 of a scope-begin event in that thread. Similarly, a thread dispatches a registered
23 ompt_callback_work callback with ompt_scope_end as its endpoint argument and
24 ompt_work_scope as its work_type argument for each occurrence of a scope-end event in that
25 thread. The callbacks occur in the context of the implicit task. The callbacks have type signature
26 ompt_callback_work_t.

27 Cross References
28 • ompt_callback_work_t, see Section 19.5.2.5
29 • ompt_scope_endpoint_t, see Section 19.4.4.11
30 • ompt_work_t, see Section 19.4.4.16
31 • allocate clause, see Section 6.6
32 • firstprivate clause, see Section 5.4.4
33 • nowait clause, see Section 15.6

1 • private clause, see Section 5.4.3
2 • reduction clause, see Section 5.5.8

3 11.3 sections Construct


Name: sections          Association: block
Category: executable    Properties: work-distribution, worksharing, thread-limiting, cancellable

5 Separating directives
6 section
7 Clauses
8 allocate, firstprivate, lastprivate, nowait, private, reduction
9 Binding
10 The binding thread set for a sections region is the current team. A sections region binds to
11 the innermost enclosing parallel region. Only the threads of the team that executes the binding
12 parallel region participate in the execution of the structured block sequences and the implied
13 barrier of the sections region if the barrier is not eliminated by a nowait clause.
14 Semantics
15 The sections construct is a non-iterative worksharing construct that contains a structured block
16 that consists of a set of structured block sequences that are to be distributed among and executed by
17 the threads in a team. Each structured block sequence is executed by one of the threads in the team
18 in the context of its implicit task. An implicit barrier occurs at the end of a sections region if
19 the nowait clause is not specified.
20 Each structured block sequence in the sections construct is preceded by a section directive
21 except possibly the first sequence, for which a preceding section directive is optional. The
22 method of scheduling the structured block sequences among the threads in the team is
23 implementation defined.
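
The following sketch shows two structured block sequences that may be executed by different threads. The function name sum_and_max is hypothetical. The two sections write different variables, so no additional synchronization is needed between them.

```c
#include <assert.h>

/* Hypothetical example: computes the sum and the maximum of v[0..n-1]
   in two separate sections. */
void sum_and_max(int n, const int *v, long *sum, int *max)
{
    assert(n > 0 && v != 0 && sum != 0 && max != 0);
    long s = 0;
    int m = v[0];

    #pragma omp parallel sections shared(s, m)
    {
        #pragma omp section
        {
            for (int i = 0; i < n; i++) {
                s += v[i];          /* only this section writes s */
            }
        }
        #pragma omp section
        {
            for (int i = 1; i < n; i++) {
                if (v[i] > m) {
                    m = v[i];       /* only this section writes m */
                }
            }
        }
    }
    *sum = s;
    *max = m;
}
```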
24 Execution Model Events
25 The sections-begin event occurs after an implicit task encounters a sections construct but before
26 the task executes any structured block sequences of the sections region.
27 The sections-end event occurs after an implicit task finishes execution of a sections region but
28 before it resumes execution of the enclosing context.
29 Tool Callbacks
30 A thread dispatches a registered ompt_callback_work callback with ompt_scope_begin
31 as its endpoint argument and ompt_work_sections as its work_type argument for each
32 occurrence of a sections-begin event in that thread. Similarly, a thread dispatches a registered
33 ompt_callback_work callback with ompt_scope_end as its endpoint argument and
34 ompt_work_sections as its work_type argument for each occurrence of a sections-end event
35 in that thread. The callbacks occur in the context of the implicit task. The callbacks have type
36 signature ompt_callback_work_t.

1 Cross References
2 • ompt_callback_dispatch_t, see Section 19.5.2.6
3 • ompt_callback_work_t, see Section 19.5.2.5
4 • ompt_scope_endpoint_t, see Section 19.4.4.11
5 • ompt_work_t, see Section 19.4.4.16
6 • allocate clause, see Section 6.6
7 • firstprivate clause, see Section 5.4.4
8 • lastprivate clause, see Section 5.4.5
9 • nowait clause, see Section 15.6
10 • private clause, see Section 5.4.3
11 • reduction clause, see Section 5.5.8
12 • section directive, see Section 11.3.1

13 11.3.1 section Directive


Name: section           Association: separating
Category: subsidiary    Properties: default

15 Separated directives
16 sections

17 Semantics
18 The section directive may be used to separate the structured block that is associated with a
19 sections construct into multiple sections, each of which is a structured block sequence.

20 Execution Model Events


21 The section-begin event occurs before an implicit task starts to execute a structured block sequence
22 in the sections construct for each of those structured block sequences that the task executes.

23 Tool Callbacks
24 A thread dispatches a registered ompt_callback_dispatch callback for each occurrence of a
25 section-begin event in that thread. The callback occurs in the context of the implicit task. The
26 callback has type signature ompt_callback_dispatch_t.

27 Cross References
28 • sections directive, see Section 11.3

Fortran

1 11.4 workshare Construct


Name: workshare         Association: block
Category: executable    Properties: work-distribution, worksharing

3 Clauses
4 nowait

5 Binding
6 The binding thread set for a workshare region is the current team. A workshare region binds
7 to the innermost enclosing parallel region. Only the threads of the team that executes the
8 binding parallel region participate in the execution of the units of work and the implied barrier
9 of the workshare region if the barrier is not eliminated by a nowait clause.

10 Semantics
11 The workshare construct divides the execution of the associated structured block into separate
12 units of work and causes the threads of the team to share the work such that each unit is executed
13 only once by one thread, in the context of its implicit task. An implicit barrier occurs at the end of a
14 workshare region if a nowait clause is not specified.
15 An implementation of the workshare construct must insert any synchronization that is required
16 to maintain standard Fortran semantics. For example, the effects of one statement within the
17 structured block must appear to occur before the execution of succeeding statements, and the
18 evaluation of the right hand side of an assignment must appear to complete prior to the effects of
19 assigning to the left hand side.
20 The statements in the workshare construct are divided into units of work as follows:
21 • For array expressions within each statement, including transformational array intrinsic functions
22 that compute scalar values from arrays:
23 – Evaluation of each element of the array expression, including any references to elemental
24 functions, is a unit of work.
25 – Evaluation of transformational array intrinsic functions may be freely subdivided into any
26 number of units of work.
27 • For array assignment statements, assignment of each element is a unit of work.
28 • For scalar assignment statements, each assignment operation is a unit of work.
29 • For WHERE statements or constructs, evaluation of the mask expression and the masked
30 assignments are each a unit of work.
31 • For FORALL statements or constructs, evaluation of the mask expression, expressions occurring
32 in the specification of the iteration space, and the masked assignments are each a unit of work.

1 • For atomic constructs, critical constructs, and parallel constructs, the construct is a
2 unit of work. A new thread team executes the statements contained in a parallel construct.
3 • If none of the rules above apply to a portion of a statement in the structured block, then that
4 portion is a unit of work.
5 The transformational array intrinsic functions are MATMUL, DOT_PRODUCT, SUM, PRODUCT,
6 MAXVAL, MINVAL, COUNT, ANY, ALL, SPREAD, PACK, UNPACK, RESHAPE, TRANSPOSE,
7 EOSHIFT, CSHIFT, MINLOC, and MAXLOC.
8 How units of work are assigned to the threads that execute a workshare region is unspecified.
9 If an array expression in the block references the value, association status, or allocation status of
10 private variables, the value of the expression is undefined, unless the same value would be
11 computed by every thread.
12 If an array assignment, a scalar assignment, a masked array assignment, or a FORALL assignment
13 assigns to a private variable in the block, the result is unspecified.
14 The workshare directive causes the sharing of work to occur only in the workshare construct,
15 and not in the remainder of the workshare region.

16 Execution Model Events


17 The workshare-begin event occurs after an implicit task encounters a workshare construct but
18 before the task starts to execute the structured block of the workshare region.
19 The workshare-end event occurs after an implicit task finishes execution of a workshare region
20 but before it resumes execution of the enclosing context.

21 Tool Callbacks
22 A thread dispatches a registered ompt_callback_work callback with ompt_scope_begin
23 as its endpoint argument and ompt_work_workshare as its work_type argument for each
24 occurrence of a workshare-begin event in that thread. Similarly, a thread dispatches a registered
25 ompt_callback_work callback with ompt_scope_end as its endpoint argument and
26 ompt_work_workshare as its work_type argument for each occurrence of a workshare-end
27 event in that thread. The callbacks occur in the context of the implicit task. The callbacks have type
28 signature ompt_callback_work_t.

29 Restrictions
30 Restrictions to the workshare construct are as follows:
31 • The only OpenMP constructs that may be closely nested inside a workshare construct are the
32 atomic, critical, and parallel constructs.
33 • Base language statements that are encountered inside a workshare construct but that are not
34 enclosed within a parallel or atomic construct that is nested inside the workshare
35 construct must consist of only the following:
36 – array assignments;

1 – scalar assignments;
2 – FORALL statements;
3 – FORALL constructs;
4 – WHERE statements;
5 – WHERE constructs; and
6 – BLOCK constructs that are strictly structured blocks associated with OpenMP directives.
7 • All array assignments, scalar assignments, and masked array assignments that are encountered
8 inside a workshare construct but are not nested inside a parallel construct that is nested
9 inside the workshare construct must be intrinsic assignments.
10 • The construct must not contain any user-defined function calls unless either the function is pure
11 and elemental or the function call is contained inside a parallel construct that is nested inside
12 the workshare construct.
13 Cross References
14 • ompt_callback_work_t, see Section 19.5.2.5
15 • ompt_scope_endpoint_t, see Section 19.4.4.11
16 • ompt_work_t, see Section 19.4.4.16
17 • atomic directive, see Section 15.8.4
18 • critical directive, see Section 15.2
19 • nowait clause, see Section 15.6
20 • parallel directive, see Section 10.1
Fortran

21 11.5 Worksharing-Loop Constructs


22 Binding
23 The binding thread set for a worksharing-loop region is the current team. A worksharing-loop
24 region binds to the innermost enclosing parallel region. Only those threads participate in
25 execution of the loop iterations and the implied barrier of the worksharing-loop region when that
26 barrier is not eliminated by a nowait clause.
27 Semantics
28 The worksharing-loop construct is a worksharing construct that specifies that the iterations of one
29 or more associated loops will be executed in parallel by threads in the team in the context of their
30 implicit tasks. The iterations are distributed across threads that already exist in the team that is
31 executing the parallel region to which the worksharing-loop region binds. Each thread executes
32 its assigned chunks in the context of its implicit task. The iterations of a given chunk are executed
33 in sequential order.

1 If specified, the schedule clause determines the schedule of the logical iterations associated with
2 the construct. That is, it determines the division of iterations into chunks and how those chunks are
3 assigned to the threads. If the schedule clause is not specified then the schedule is
4 implementation defined.
5 At the beginning of each logical iteration, the loop iteration variable or the variable declared by
6 range-decl of each associated loop has the value that it would have if the set of the associated loops
7 was executed sequentially.
8 The schedule is reproducible if one of the following conditions is true:
9 • The order clause is specified with the reproducible order-modifier; or
• The schedule clause is specified with static as the kind argument but not the simd
chunk-modifier, and the order clause is not specified with the unconstrained
order-modifier.
13 Programs can only depend on which thread executes a particular iteration if the schedule is
14 reproducible. Schedule reproducibility also determines the consistency with the execution of
15 constructs with the same schedule.
16 Execution Model Events
17 The ws-loop-begin event occurs after an implicit task encounters a worksharing-loop construct but
18 before the task starts execution of the structured block of the worksharing-loop region.
19 The ws-loop-end event occurs after a worksharing-loop region finishes execution but before
20 resuming execution of the encountering task.
21 The ws-loop-iteration-begin event occurs at the beginning of each iteration of a worksharing-loop
22 region. The ws-loop-chunk-begin event occurs for each scheduled chunk of a worksharing-loop
23 region before the implicit task executes any of the associated iterations.
24 Tool Callbacks
25 A thread dispatches a registered ompt_callback_work callback with ompt_scope_begin
26 as its endpoint argument for each occurrence of a ws-loop-begin event in that thread. Similarly, a
27 thread dispatches a registered ompt_callback_work callback with ompt_scope_end as its
28 endpoint argument for each occurrence of a ws-loop-end event in that thread. The callbacks occur
29 in the context of the implicit task. The callbacks have type signature ompt_callback_work_t
30 and the work_type argument indicates the schedule as shown in Table 11.1.
31 A thread dispatches a registered ompt_callback_dispatch callback for each occurrence of a
32 ws-loop-iteration-begin or ws-loop-chunk-begin event in that thread. The callback occurs in the
33 context of the implicit task. The callback has type signature ompt_callback_dispatch_t.

TABLE 11.1: ompt_callback_work Callback Work Types for Worksharing-Loop

Value of work_type        If determined schedule is
ompt_work_loop            unknown at runtime
ompt_work_loop_static     static
ompt_work_loop_dynamic    dynamic
ompt_work_loop_guided     guided
ompt_work_loop_other      implementation specific

1 Restrictions
2 Restrictions to the worksharing-loop construct are as follows:
3 • The logical iteration space of the loops associated with the worksharing-loop construct must be
4 the same for all threads in the team.
5 • The value of the run-sched-var ICV must be the same for all threads in the team.

6 Cross References
7 • Consistent Loop Schedules, see Section 4.4.5
8 • OMP_SCHEDULE, see Section 21.2.1
9 • ompt_callback_work_t, see Section 19.5.2.5
10 • ompt_scope_endpoint_t, see Section 19.4.4.11
11 • ompt_work_t, see Section 19.4.4.16
12 • do directive, see Section 11.5.2
13 • for directive, see Section 11.5.1
14 • nowait clause, see Section 15.6
15 • order clause, see Section 10.3
16 • schedule clause, see Section 11.5.3

C / C++

1 11.5.1 for Construct


Name: for               Association: loop-associated
Category: executable    Properties: work-distribution, worksharing, worksharing-loop, cancellable, context-matching

3 Separating directives
4 scan

5 Clauses
6 allocate, collapse, firstprivate, lastprivate, linear, nowait, order,
7 ordered, private, reduction, schedule

8 Semantics
9 The for construct is a worksharing-loop construct.
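
A minimal sketch of the for construct follows, combined with a parallel construct and a reduction clause. The function name sum_squares is an assumption made for illustration.

```c
#include <assert.h>

/* Hypothetical example: sum of i*i for i in [0, n). */
long sum_squares(int n)
{
    assert(n >= 0);
    long sum = 0;

    /* The iterations are divided among the threads of the team; each
       thread accumulates into a private copy of sum. */
    #pragma omp parallel for schedule(static) reduction(+ : sum)
    for (int i = 0; i < n; i++) {
        sum += (long)i * i;
    }
    return sum;
}
```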

10 Cross References
11 • Worksharing-Loop Constructs, see Section 11.5
12 • allocate clause, see Section 6.6
13 • collapse clause, see Section 4.4.3
14 • firstprivate clause, see Section 5.4.4
15 • lastprivate clause, see Section 5.4.5
16 • linear clause, see Section 5.4.6
17 • nowait clause, see Section 15.6
18 • order clause, see Section 10.3
19 • ordered clause, see Section 4.4.4
20 • private clause, see Section 5.4.3
21 • reduction clause, see Section 5.5.8
22 • scan directive, see Section 5.6
23 • schedule clause, see Section 11.5.3
C / C++

Fortran

1 11.5.2 do Construct
Name: do                Association: loop
Category: executable    Properties: work-distribution, worksharing, worksharing-loop, cancellable, context-matching

3 Separating directives
4 scan

5 Clauses
6 allocate, collapse, firstprivate, lastprivate, linear, nowait, order,
7 ordered, private, reduction, schedule

8 Semantics
9 The do construct is a worksharing-loop construct.

10 Cross References
11 • Worksharing-Loop Constructs, see Section 11.5
12 • allocate clause, see Section 6.6
13 • collapse clause, see Section 4.4.3
14 • firstprivate clause, see Section 5.4.4
15 • lastprivate clause, see Section 5.4.5
16 • linear clause, see Section 5.4.6
17 • nowait clause, see Section 15.6
18 • order clause, see Section 10.3
19 • ordered clause, see Section 4.4.4
20 • private clause, see Section 5.4.3
21 • reduction clause, see Section 5.5.8
22 • scan directive, see Section 5.6
23 • schedule clause, see Section 11.5.3
Fortran

1 11.5.3 schedule Clause
2 Name: schedule Properties: unique

3 Arguments
4 Name        Type                                               Properties
  kind        Keyword: auto, dynamic, guided, runtime, static    default
  chunk_size  expression of integer type                         ultimate, optional, positive, region-invariant

5 Modifiers
6 Name               Modifies  Type                                Properties
  ordering-modifier  kind      Keyword: monotonic, nonmonotonic    unique
  chunk-modifier     kind      Keyword: simd                       unique

7 Directives
8 do, for

9 Semantics
10 The schedule clause specifies how iterations of associated loops of a worksharing-loop construct
11 are divided into contiguous non-empty subsets, called chunks, and how these chunks are distributed
12 among threads of the team. The chunk_size expression is evaluated using the original list items of
13 any variables that are made private in the worksharing-loop construct. Whether, in what order, or
14 how many times, any side effects of the evaluation of this expression occur is unspecified. The use
15 of a variable in a schedule clause expression of a worksharing-loop construct causes an implicit
16 reference to the variable in all enclosing constructs.
17 If the kind argument is static, iterations are divided into chunks of size chunk_size, and the
18 chunks are assigned to the threads in the team in a round-robin fashion in the order of the thread
19 number. Each chunk contains chunk_size iterations, except for the chunk that contains the
20 sequentially last iteration, which may have fewer iterations. If chunk_size is not specified, the
21 logical iteration space is divided into chunks that are approximately equal in size, and at most one
22 chunk is distributed to each thread.
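
As a non-normative illustration of the round-robin rule for a static schedule with a specified chunk_size, the sketch below computes which thread number executes a given logical iteration. The function static_owner and its parameter names are hypothetical; an implementation is not required to compute the assignment this way.

```c
#include <assert.h>

/* Hypothetical helper: thread number that executes logical iteration
   iter (numbered from 0) under schedule(static, chunk_size) in a team
   of num_threads threads. */
int static_owner(long iter, long chunk_size, int num_threads)
{
    assert(iter >= 0 && chunk_size > 0 && num_threads > 0);

    /* Chunks are numbered from 0 in logical iteration order ... */
    long chunk_index = iter / chunk_size;

    /* ... and dealt out round-robin in order of thread number. */
    return (int)(chunk_index % num_threads);
}
```

For example, with chunk_size 2 and three threads, iterations 0 and 1 go to thread 0, iterations 2 and 3 to thread 1, iterations 4 and 5 to thread 2, and iterations 6 and 7 to thread 0 again.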
23 If the kind argument is dynamic, each thread executes a chunk, then requests another chunk, until
24 no chunks remain to be assigned. Each chunk contains chunk_size iterations, except for the chunk
25 that contains the sequentially last iteration, which may have fewer iterations. If chunk_size is not
26 specified, it defaults to 1.
27 If the kind argument is guided, each thread executes a chunk, then requests another chunk, until
28 no chunks remain to be assigned. For a chunk_size of 1, the size of each chunk is proportional to
29 the number of unassigned iterations divided by the number of threads in the team, decreasing to 1.
30 For a chunk_size with value k > 1, the size of each chunk is determined in the same way, with the

1 restriction that the chunks do not contain fewer than k iterations (except for the chunk that contains
2 the sequentially last iteration, which may have fewer than k iterations). If chunk_size is not
3 specified, it defaults to 1.
4 If the kind argument is auto, the decision regarding scheduling is implementation defined.
5 If the kind argument is runtime, the decision regarding scheduling is deferred until runtime, and
6 the behavior is as if the clause specifies kind, chunk-size and ordering-modifier as set in the
7 run-sched-var ICV. If the schedule clause explicitly specifies any modifiers then they override
8 any corresponding modifiers that are specified in the run-sched-var ICV.
9 If the simd chunk-modifier is specified and the loop is associated with a SIMD construct,
10 $new\_chunk\_size = \lceil chunk\_size / simd\_width \rceil * simd\_width$ is the chunk_size for all chunks
11 except the first and last chunks, where simd_width is an implementation-defined value. The first
12 chunk will have at least new_chunk_size iterations except if it is also the last chunk. The last chunk
13 may have fewer iterations than new_chunk_size. If the simd modifier is specified and the loop is
14 not associated with a SIMD construct, the modifier is ignored.
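
The rounding applied by the simd chunk-modifier can be sketched as follows. This is a non-normative illustration; the function name simd_chunk_size is hypothetical, and simd_width is passed as a parameter only because its actual value is implementation defined.

```c
#include <assert.h>

/* Hypothetical helper: rounds chunk_size up to the next multiple of
   simd_width, i.e. ceil(chunk_size / simd_width) * simd_width. */
long simd_chunk_size(long chunk_size, long simd_width)
{
    assert(chunk_size > 0 && simd_width > 0);

    /* Integer ceiling division followed by multiplication. */
    return (chunk_size + simd_width - 1) / simd_width * simd_width;
}
```

With an assumed simd_width of 4, a chunk_size of 10 is rounded up to 12, while a chunk_size of 8 is left unchanged.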
15

16 Note – For a team of p threads and a loop of n iterations, let $\lceil n/p \rceil$ be the integer q that satisfies
17 n = p ∗ q − r, with 0 <= r < p. One compliant implementation of the static schedule (with no
18 specified chunk_size) would behave as though chunk_size had been specified with value q. Another
19 compliant implementation would assign q iterations to the first p − r threads, and q − 1 iterations to
20 the remaining r threads. This illustrates why a conforming program must not rely on the details of a
21 particular implementation.
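
The two compliant static schedules described above can be compared with the following non-normative sketch, which returns the number of iterations that thread t receives under each approach. The function names static_count_a and static_count_b are hypothetical.

```c
#include <assert.h>

/* Approach A: behave as though chunk_size had been specified as
   q = ceil(n / p). Thread t receives the t-th chunk of q iterations,
   truncated at the end of the iteration space. */
long static_count_a(long n, int p, int t)
{
    assert(n >= 0 && p > 0 && t >= 0 && t < p);
    long q = (n + p - 1) / p;
    long start = (long)t * q;
    if (start >= n) {
        return 0;
    }
    return (n - start < q) ? n - start : q;
}

/* Approach B: with n = p * q - r and 0 <= r < p, the first p - r
   threads receive q iterations and the remaining r threads q - 1. */
long static_count_b(long n, int p, int t)
{
    assert(n >= 0 && p > 0 && t >= 0 && t < p);
    long q = (n + p - 1) / p;
    long r = (long)p * q - n;
    return (t < p - r) ? q : q - 1;
}
```

For n = 10 and p = 4 (so q = 3 and r = 2), approach A yields per-thread counts of 3, 3, 3, 1 and approach B yields 3, 3, 2, 2. Both distribute all 10 iterations, which is why a program cannot assume either one.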
22 A compliant implementation of the guided schedule with a chunk_size value of k would assign
23 $q = \lceil n/p \rceil$ iterations to the first available thread and set n to the larger of n − q and p ∗ k. It would
24 then repeat this process until q is greater than or equal to the number of remaining iterations, at
25 which time the remaining iterations form the final chunk. Another compliant implementation could
26 use the same method, except with $q = \lceil n/(2p) \rceil$, and set n to the larger of n − q and 2 ∗ p ∗ k.
27
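
The first guided schedule described in the note above can be simulated with the non-normative sketch below, which lists the chunk sizes in the order in which they would be handed out. The function guided_chunks, its output array and the max_chunks bound are hypothetical; real implementations may choose chunk sizes differently.

```c
#include <assert.h>

/* Hypothetical helper: writes successive guided chunk sizes for n
   iterations, p threads and chunk_size k into sizes[] and returns how
   many were written (at most max_chunks). */
int guided_chunks(long n, int p, long k, long *sizes, int max_chunks)
{
    assert(n > 0 && p > 0 && k > 0);
    long remaining = n;   /* iterations not yet assigned */
    int count = 0;

    while (count < max_chunks) {
        long q = (n + p - 1) / p;          /* q = ceil(n / p) */
        if (q >= remaining) {
            sizes[count++] = remaining;    /* final chunk */
            break;
        }
        sizes[count++] = q;
        remaining -= q;

        /* Set n to the larger of n - q and p * k. */
        long reduced = n - q;
        long lower = (long)p * k;
        n = (reduced > lower) ? reduced : lower;
    }
    return count;
}
```

For n = 10, p = 2 and k = 1 the sketch produces chunks of 5, 3, 1 and 1 iterations; with k = 3 it produces 5, 3 and 2, where only the final chunk is smaller than k.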
28 If the monotonic ordering-modifier is specified then each thread executes the chunks that it is
29 assigned in increasing logical iteration order. When the nonmonotonic ordering-modifier is
30 specified then chunks may be assigned to threads in any order and the behavior of an application
31 that depends on any execution order of the chunks is unspecified. If an ordering-modifier is not
32 specified, the effect is as if the monotonic modifier is specified if the kind argument is static
33 or an ordered clause is specified on the construct; otherwise, the effect is as if the
34 nonmonotonic modifier is specified.

35 Restrictions
36 Restrictions to the schedule clause are as follows:
37 • The schedule clause cannot be specified if any of the associated loops are non-rectangular.
38 • The value of the chunk_size expression must be the same for all threads in the team.

1 • If runtime or auto is specified for kind, chunk_size must not be specified.
2 • The nonmonotonic ordering-modifier cannot be specified if an ordered clause is specified
3 on the same construct.

4 Cross References
5 • do directive, see Section 11.5.2
6 • for directive, see Section 11.5.1
7 • ordered clause, see Section 4.4.4
8 • run-sched-var ICV, see Table 2.1

9 11.6 distribute Construct


10 Name: distribute         Association: loop
   Category: executable     Properties: work-distribution

11 Clauses
12 allocate, collapse, dist_schedule, firstprivate, lastprivate, order,
13 private

14 Binding
15 The binding thread set for a distribute region is the set of initial threads executing an
16 enclosing teams region. A distribute region binds to this teams region.
17 Semantics
18 The distribute construct specifies that the iterations of one or more loops will be executed by
19 the initial teams in the context of their implicit tasks. The iterations are distributed across the initial
20 threads of all initial teams that execute the teams region to which the distribute region binds.
21 No implicit barrier occurs at the end of a distribute region. To avoid data races the original list
22 items that are modified due to lastprivate clauses should not be accessed between the end of
23 the distribute construct and the end of the teams region to which the distribute binds.
24 If the dist_schedule clause is not specified, the schedule is implementation defined.
25 At the beginning of each logical iteration, the loop iteration variable or the variable declared by
26 range-decl of each associated loop has the value that it would have if the set of the associated loops
27 was executed sequentially.
28 The schedule is reproducible if one of the following conditions is true:
29 • The order clause is specified with the reproducible modifier; or
30 • The dist_schedule clause is specified with static as the kind parameter and the order
31 clause is not specified with the unconstrained order-modifier.

1 Programs can only depend on which team executes a particular iteration if the schedule is
2 reproducible. Schedule reproducibility also determines the consistency with the execution of
3 constructs with the same schedule.
4 Execution Model Events
5 The distribute-begin event occurs after an initial task encounters a distribute construct but
6 before the task starts to execute the structured block of the distribute region.
7 The distribute-end event occurs after an initial task finishes execution of a distribute region
8 but before it resumes execution of the enclosing context.
9 The distribute-chunk-begin event occurs for each scheduled chunk of a distribute region
10 before execution of any associated iteration.
11 Tool Callbacks
12 A thread dispatches a registered ompt_callback_work callback with ompt_scope_begin
13 as its endpoint argument and ompt_work_distribute as its work_type argument for each
14 occurrence of a distribute-begin event in that thread. Similarly, a thread dispatches a registered
15 ompt_callback_work callback with ompt_scope_end as its endpoint argument and
16 ompt_work_distribute as its work_type argument for each occurrence of a distribute-end
17 event in that thread. The callbacks occur in the context of the implicit task. The callbacks have type
18 signature ompt_callback_work_t.
19 A thread dispatches a registered ompt_callback_dispatch callback for each occurrence of a
20 distribute-chunk-begin event in that thread. The callback occurs in the context of the initial task.
21 The callback has type signature ompt_callback_dispatch_t.
22 Restrictions
23 Restrictions to the distribute construct are as follows:
24 • The logical iteration space of the loops associated with the distribute construct must be the
25 same for all teams in the league.
26 • The region that corresponds to the distribute construct must be strictly nested inside a
27 teams region.
28 • A list item may appear in a firstprivate or lastprivate clause, but not in both.
29 • The conditional lastprivate-modifier must not be specified.
30 Cross References
31 • Consistent Loop Schedules, see Section 4.4.5
32 • ompt_callback_work_t, see Section 19.5.2.5
33 • ompt_work_t, see Section 19.4.4.16
34 • allocate clause, see Section 6.6
35 • collapse clause, see Section 4.4.3
36 • dist_schedule clause, see Section 11.6.1

1 • firstprivate clause, see Section 5.4.4
2 • lastprivate clause, see Section 5.4.5
3 • order clause, see Section 10.3
4 • private clause, see Section 5.4.3
5 • teams directive, see Section 10.2

6 11.6.1 dist_schedule Clause


7 Name: dist_schedule Properties: unique

8 Arguments
9 Name        Type                          Properties
  kind        Keyword: static               default
  chunk_size  expression of integer type    ultimate, optional, positive, region-invariant

10 Directives
11 distribute
12 Semantics
13 The dist_schedule clause specifies how iterations of associated loops of a distribute
14 construct are divided into contiguous non-empty subsets, called chunks, and how these chunks are
15 distributed among the teams of the league. If chunk_size is not specified, the iteration space is
16 divided into chunks that are approximately equal in size, and at most one chunk is distributed to
17 each initial team of the league.
18 If the chunk_size argument is specified, iterations are divided into chunks of size chunk_size. The
19 chunk_size expression is evaluated using the original list items of any variables that are made
20 private in the distribute construct. Whether, in what order, or how many times, any side
21 effects of the evaluation of this expression occur is unspecified. The use of a variable in a
22 dist_schedule clause expression of a distribute construct causes an implicit reference to
23 the variable in all enclosing constructs. These chunks are assigned to the initial teams of the league
24 in a round-robin fashion in the order of the initial team number.
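
A non-normative sketch of a distribute loop with an explicit dist_schedule clause follows. The function name scale_on_device, the chunk_size of 64 and the map clause are illustrative assumptions; the combined target teams distribute construct is used here only so that the distribute region is strictly nested inside a teams region.

```c
#include <assert.h>

/* Hypothetical helper: multiplies the first n elements of x by factor. */
void scale_on_device(double *x, int n, double factor)
{
    assert(n >= 0);

    /* Iterations are divided into chunks of 64 and the chunks are
       assigned to the initial teams of the league in round-robin order
       of the initial team number. */
    #pragma omp target teams distribute dist_schedule(static, 64) map(tofrom : x[0:n])
    for (int i = 0; i < n; ++i) {
        x[i] *= factor;
    }
}
```

Whether the region actually executes on a non-host device depends on the implementation and the execution environment; the result stored in x is the same either way.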
25 Restrictions
26 Restrictions to the dist_schedule clause are as follows:
27 • The value of the chunk_size expression must be the same for all teams in the league.
28 • The dist_schedule clause cannot be specified if any of the associated loops are
29 non-rectangular.

30 Cross References
31 • distribute directive, see Section 11.6

1 11.7 loop Construct
2 Name: loop              Association: loop-associated
  Category: executable    Properties: work-distribution, worksharing, simdizable

3 Clauses
4 bind, collapse, lastprivate, order, private, reduction

5 Binding
6 The bind clause determines the binding region, which determines the binding thread set.

7 Semantics
8 A loop construct specifies that the logical iterations of the associated loops may execute
9 concurrently and permits the encountering threads to execute the loop accordingly. A loop
10 construct is a worksharing construct if its binding region is the innermost enclosing parallel region.
11 Otherwise it is not a worksharing region. The directive asserts that the iterations of the associated
12 loops may execute in any order, including concurrently. Each logical iteration is executed once per
13 instance of the loop region that is encountered by exactly one thread that is a member of the
14 binding thread set.
15 At the beginning of each logical iteration, the loop iteration variable or the variable declared by
16 range-decl of each associated loop has the value that it would have if the set of the associated loops
17 was executed sequentially.
18 If the order clause is not present, the behavior is as if an order clause that specifies
19 concurrent appeared on the construct. The loop schedule for a loop construct is reproducible
20 unless the order clause is present with the unconstrained order-modifier.
21 If the loop region binds to a teams region, the threads in the binding thread set may continue
22 execution after the loop region without waiting for all logical iterations of the associated loops to
23 complete. The iterations are guaranteed to complete before the end of the teams region. If the
24 loop region does not bind to a teams region, all logical iterations of the associated loops must
25 complete before the encountering threads continue execution after the loop region.
26 For the purpose of determining its consistency with other schedules, the schedule is defined by the
27 implicit order clause. The schedule is reproducible if the schedule specified through the implicit
28 order clause is reproducible.

29 Restrictions
30 Restrictions to the loop construct are as follows:
31 • A list item may not appear in a lastprivate clause unless it is the loop iteration variable of a
32 loop that is associated with the construct.
33 • If a reduction-modifier is specified in a reduction clause that appears on the directive then the
34 reduction modifier must be default.

1 • If a loop construct is not nested inside another OpenMP construct then the bind clause must
2 be present.
3 • If a loop region binds to a teams or parallel region, it must be encountered by all threads in
4 the binding thread set or by none of them.
5 Cross References
6 • Consistent Loop Schedules, see Section 4.4.5
7 • bind clause, see Section 11.7.1
8 • collapse clause, see Section 4.4.3
9 • lastprivate clause, see Section 5.4.5
10 • order clause, see Section 10.3
11 • private clause, see Section 5.4.3
12 • reduction clause, see Section 5.5.8
13 • teams directive, see Section 10.2

14 11.7.1 bind Clause


15 Name: bind Properties: unique

16 Arguments
17 Name     Type                               Properties
   binding  Keyword: parallel, teams, thread   default

18 Directives
19 loop
20 Semantics
21 The bind clause specifies the binding region of the construct on which it appears. Specifically, if
22 binding is teams and an innermost enclosing teams region exists then the binding region is that
23 teams region; if binding is parallel then the binding region is the innermost enclosing parallel
24 region, which may be an implicit parallel region; and if binding is thread then the binding region
25 is not defined. If the bind clause is not specified on a construct for which it may be specified and
26 the construct is closely nested inside a teams or parallel construct, the effect is as if binding is
27 teams or parallel. If none of those conditions hold, the binding region is not defined.
28 The specified binding region determines the binding thread set. Specifically, if the binding region is
29 a teams region, then the binding thread set is the set of initial threads that are executing that
30 region while if the binding region is a parallel region, then the binding thread set is the team of
31 threads that are executing that region. If the binding region is not defined, then the binding thread
32 set is the encountering thread.
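
The following non-normative sketch shows a loop construct whose binding region is selected explicitly with bind(parallel). The function name saxpy and its parameters are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: y[i] = a * x[i] + y[i] for 0 <= i < n. */
void saxpy(int n, float a, const float *x, float *y)
{
    assert(n >= 0);

    #pragma omp parallel
    {
        /* bind(parallel): the binding region is the innermost enclosing
           parallel region, so the binding thread set is the team that
           executes that region. The iterations may run in any order,
           including concurrently. */
        #pragma omp loop bind(parallel)
        for (int i = 0; i < n; ++i) {
            y[i] = a * x[i] + y[i];
        }
    }
}
```

Because the loop construct here is closely nested inside a parallel construct, omitting the bind clause would have the same effect in this sketch.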

1 Restrictions
2 Restrictions to the bind clause are as follows:
3 • If teams is specified as binding then the corresponding loop region must be strictly nested
4 inside a teams region.
5 • If teams is specified as binding and the corresponding loop region executes on a non-host
6 device then the behavior of a reduction clause that appears on the corresponding loop
7 construct is unspecified if the construct is not nested inside a teams construct.
8 • If parallel is specified as binding, the behavior is unspecified if the corresponding loop
9 region is closely nested inside a simd region.

10 Cross References
11 • loop directive, see Section 11.7
12 • parallel construct, see Section 10.1
13 • teams construct, see Section 10.2



1 12 Tasking Constructs
2 This chapter defines directives and concepts related to explicit tasks.

3 12.1 untied Clause


4 Name: untied Properties: unique, inarguable

5 Directives
6 task, taskloop

7 Semantics
8 The untied clause specifies that tasks generated by the construct on which it appears are untied,
9 which means that any thread in the team can resume the task region after a suspension. If the
10 untied clause is not specified on a construct on which it may appear, generated tasks are tied; if a
11 tied task is suspended, its task region can only be resumed by the thread that started its execution.
12 If a generated task is a final or an included task, the untied clause is ignored and the task is tied.

13 Cross References
14 • task directive, see Section 12.5
15 • taskloop directive, see Section 12.6

16 12.2 mergeable Clause


17 Name: mergeable Properties: unique, inarguable

18 Directives
19 task, taskloop

20 Semantics
21 The mergeable clause specifies that tasks generated by the construct on which it appears are
22 mergeable tasks.

23 Cross References
24 • task directive, see Section 12.5
25 • taskloop directive, see Section 12.6

1 12.3 final Clause
2 Name: final Properties: unique

3 Arguments
4 Name      Type                          Properties
  finalize  expression of logical type    default

5 Directives
6 task, taskloop

7 Semantics
8 The final clause specifies that tasks generated by the construct on which it appears are final tasks
9 if the finalize expression evaluates to true. All task constructs that are encountered during
10 execution of a final task generate final and included tasks. The use of a variable in a finalize
11 expression causes an implicit reference to the variable in all enclosing constructs. The finalize
12 expression is evaluated in the context outside of the construct on which the clause appears.
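
A typical use of the final clause is to stop generating deferred tasks below a problem-size cutoff. The recursive function below is a non-normative sketch; the name fib and the cutoff value of 20 are arbitrary assumptions chosen for illustration.

```c
#include <assert.h>

/* Hypothetical helper: n-th Fibonacci number computed with tasks. */
long fib(int n)
{
    assert(n >= 0);
    if (n < 2) {
        return n;
    }

    long a, b;

    /* When n <= 20 the generated task is final, so every task
       encountered while it executes is a final and included task. */
    #pragma omp task shared(a) final(n <= 20)
    a = fib(n - 1);

    #pragma omp task shared(b) final(n <= 20)
    b = fib(n - 2);

    /* a and b live in this stack frame: wait for both child tasks
       before reading them and returning. */
    #pragma omp taskwait
    return a + b;
}
```

To let several threads execute the tasks, fib would usually be called from a single region inside a parallel region; called outside any parallel region, the tasks are executed by the encountering thread.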

13 Cross References
14 • task directive, see Section 12.5
15 • taskloop directive, see Section 12.6

16 12.4 priority Clause


17 Name: priority Properties: unique

18 Arguments
19 Name            Type                          Properties
   priority-value  expression of integer type    constant, non-negative

20 Directives
21 task, taskloop

22 Semantics
23 The priority clause specifies a hint for the task execution order of tasks generated by the
24 construct on which it appears in the priority-value argument. Among all tasks ready to be executed,
25 higher priority tasks (those with a higher numerical priority-value) are recommended to execute
26 before lower priority ones. The default priority-value when no priority clause is specified is
27 zero (the lowest priority). If a specified priority-value is higher than the max-task-priority-var ICV
28 then the implementation will use the value of that ICV. A program that relies on the task execution
29 order being determined by the priority-value may have unspecified behavior.

CHAPTER 12. TASKING CONSTRUCTS 261


1 Cross References
2 • max-task-priority-var ICV, see Table 2.1
3 • task directive, see Section 12.5
4 • taskloop directive, see Section 12.6

5 12.5 task Construct


6 Name: task              Association: block
  Category: executable    Properties: parallelism-generating, thread-limiting, task-generating

7 Clauses
8 affinity, allocate, default, depend, detach, final, firstprivate, if,
9 in_reduction, mergeable, priority, private, shared, untied
10 Clause set
11 Properties: exclusive Members: detach, mergeable

12 Binding
13 The binding thread set of the task region is the current team. A task region binds to the
14 innermost enclosing parallel region.
15 Semantics
16 When a thread encounters a task construct, an explicit task is generated from the code for the
17 associated structured block. The data environment of the task is created according to the
18 data-sharing attribute clauses on the task construct, per-data environment ICVs, and any defaults
19 that apply. The data environment of the task is destroyed when the execution code of the associated
20 structured block is completed.
21 The encountering thread may immediately execute the task, or defer its execution. In the latter case,
22 any thread in the team may be assigned the task. Completion of the task can be guaranteed using
23 task synchronization constructs and clauses. If a task construct is encountered during execution
24 of an outer task, the generated task region that corresponds to this construct is not a part of the
25 outer task region unless the generated task is an included task.
26 A detachable task is completed when the execution of its associated structured block is completed
27 and the allow-completion event is fulfilled. If no detach clause is present on a task construct,
28 the generated task is completed when the execution of its associated structured block is completed.
29 A thread that encounters a task scheduling point within the task region may temporarily suspend
30 the task region.
31 The task construct includes a task scheduling point in the task region of its generating task,
32 immediately following the generation of the explicit task. Each explicit task region includes a
33 task scheduling point at the end of its associated structured block.

1
2 Note – When storage is shared by an explicit task region, the programmer must ensure, by
3 adding proper synchronization, that the storage does not reach the end of its lifetime before the
4 explicit task region completes its execution.
5

6 When an if clause is present on a task construct and the if clause expression evaluates to false,
7 an undeferred task is generated, and the encountering thread must suspend the current task region,
8 for which execution cannot be resumed until execution of the structured block that is associated
9 with the generated task is completed. The use of a variable in an if clause expression of a task
10 construct causes an implicit reference to the variable in all enclosing constructs. The if clause
11 expression is evaluated in the context outside of the task construct.
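
The non-normative sketch below combines the if clause with the storage-lifetime requirement from the preceding note. The function name range_sum, the serial cutoff of 4 elements and the if threshold of 1024 are hypothetical choices.

```c
#include <assert.h>

/* Hypothetical helper: sums v[lo] .. v[hi - 1] by recursive splitting. */
long range_sum(const int *v, int lo, int hi)
{
    assert(lo <= hi);

    /* Small ranges are summed directly without generating tasks. */
    if (hi - lo <= 4) {
        long s = 0;
        for (int i = lo; i < hi; ++i) {
            s += v[i];
        }
        return s;
    }

    int mid = lo + (hi - lo) / 2;
    long left = 0, right = 0;

    /* When the if expression is false the task is undeferred: the
       encountering thread suspends the current task region until the
       generated task's structured block has completed. */
    #pragma omp task shared(left) if(hi - lo > 1024)
    left = range_sum(v, lo, mid);

    #pragma omp task shared(right) if(hi - lo > 1024)
    right = range_sum(v, mid, hi);

    /* left and right are shared with the child tasks and live in this
       stack frame, so the function must not return before the tasks
       complete. */
    #pragma omp taskwait
    return left + right;
}
```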

12 Execution Model Events


13 The task-create event occurs when a thread encounters a construct that causes a new task to be
14 created. The event occurs after the task is initialized but before it begins execution or is deferred.

15 Tool Callbacks
16 A thread dispatches a registered ompt_callback_task_create callback for each occurrence
17 of a task-create event in the context of the encountering task. This callback has the type signature
18 ompt_callback_task_create_t and the flags argument indicates the task types shown in
19 Table 12.1.

TABLE 12.1: ompt_callback_task_create Callback Flags Evaluation

Operation                         Evaluates to true
(flags & ompt_task_explicit)      Always in the dispatched callback
(flags & ompt_task_undeferred)    If the task is an undeferred task
(flags & ompt_task_final)         If the task is a final task
(flags & ompt_task_untied)        If the task is an untied task
(flags & ompt_task_mergeable)     If the task is a mergeable task
(flags & ompt_task_merged)        If the task is a merged task

20 Cross References
21 • Task Scheduling, see Section 12.9
22 • omp_fulfill_event, see Section 18.11.1
23 • ompt_callback_task_create_t, see Section 19.5.2.7
24 • affinity clause, see Section 12.5.1
25 • allocate clause, see Section 6.6

1 • default clause, see Section 5.4.1
2 • depend clause, see Section 15.9.5
3 • detach clause, see Section 12.5.2
4 • final clause, see Section 12.3
5 • firstprivate clause, see Section 5.4.4
6 • if clause, see Section 3.4
7 • in_reduction clause, see Section 5.5.10
8 • mergeable clause, see Section 12.2
9 • priority clause, see Section 12.4
10 • private clause, see Section 5.4.3
11 • shared clause, see Section 5.4.2
12 • untied clause, see Section 12.1

13 12.5.1 affinity Clause


14 Name: affinity Properties: unique

15 Arguments
16 Name          Type                              Properties
   locator-list  list of locator list item type    default

17 Modifiers
18 Name      Modifies      Type                                                            Properties
   iterator  locator-list  Complex, name: iterator                                         unique
                           Arguments: iterator-specifier OpenMP expression (repeatable)

19 Directives
20 task

21 Semantics
22 The affinity clause specifies a hint to indicate data affinity of tasks generated by the construct
23 on which it appears. The hint recommends executing generated tasks close to the location of the
24 original list items. A program that relies on the task execution location being determined by this list
25 may have unspecified behavior.

1 The list items that appear in the affinity clause may also appear in data-environment clauses.
2 The list items may reference any iterators-identifier that is defined in the same clause and may
3 include array sections.
C / C++
4 The list items that appear in the affinity clause may use shape-operators.
C / C++
5 Cross References
6 • iterator modifier, see Section 3.2.6
7 • task directive, see Section 12.5

8 12.5.2 detach Clause


9 Name: detach Properties: unique

10 Arguments
11 Name          Type                             Properties
   event-handle  variable of event_handle type    default

12 Directives
13 task

14 Semantics
15 The detach clause specifies that the task generated by the construct on which it appears is a
16 detachable task. A new allow-completion event is created and connected to the completion of the
17 associated task region. The original event-handle is updated to represent that allow-completion
18 event before the task data environment is created. The event-handle is considered as if it was
19 specified on a firstprivate clause. The use of a variable in a detach clause expression of a
20 task construct causes an implicit reference to the variable in all enclosing constructs.

21 Restrictions
22 Restrictions to the detach clause are as follows:
23 • If a detach clause appears on a directive, then the encountering task must not be a final task.
24 • A variable that appears in a detach clause cannot appear as a list item on a data-environment
25 attribute clause on the same construct.
26 • A variable that is part of another variable (as an array element or a structure element) cannot
27 appear in a detach clause.

Fortran
1 • event-handle must not have the POINTER attribute.
2 • If event-handle has the ALLOCATABLE attribute, the allocation status must be allocated when
3 the task construct is encountered, and the allocation status must not be changed, either
4 explicitly or implicitly, in the task region.
Fortran
5 Cross References
6 • firstprivate clause, see Section 5.4.4
7 • task directive, see Section 12.5

8 12.6 taskloop Construct


9 Name: taskloop          Association: loop
  Category: executable    Properties: parallelism-generating, task-generating

10 Clauses
11 allocate, collapse, default, final, firstprivate, grainsize, if,
12 in_reduction, lastprivate, mergeable, nogroup, num_tasks, priority,
13 private, reduction, shared, untied

14 Clause set synchronization-clause


15 Properties: exclusive Members: nogroup, reduction

16 Clause set granularity-clause


17 Properties: exclusive Members: grainsize, num_tasks

18 Binding
19 The binding thread set of the taskloop region is the current team. A taskloop region binds to
20 the innermost enclosing parallel region.

21 Semantics
22 When a thread encounters a taskloop construct, the construct partitions the iterations of the
23 associated loops into chunks, each of which is assigned to an explicit task for parallel execution.
24 The iteration count for each associated loop is computed before entry to the outermost loop. The
25 data environment of each generated task is created according to the data-sharing attribute clauses
26 on the taskloop construct, per-data environment ICVs, and any defaults that apply. The order of
27 the creation of the loop tasks is unspecified. Programs that rely on any execution order of the
28 logical iterations are non-conforming.

1 If the nogroup clause is not present, the taskloop construct executes as if it was enclosed in a
2 taskgroup construct with no statements or directives outside of the taskloop construct. Thus,
3 the taskloop construct creates an implicit taskgroup region. If the nogroup clause is
4 present, no implicit taskgroup region is created.
5 If a reduction clause is present, the behavior is as if a task_reduction clause with the
6 same reduction operator and list items was applied to the implicit taskgroup construct that
7 encloses the taskloop construct. The taskloop construct executes as if each generated task
8 was defined by a task construct on which an in_reduction clause with the same reduction
9 operator and list items is present. Thus, the generated tasks are participants of the reduction defined
10 by the task_reduction clause that was applied to the implicit taskgroup construct.
11 If an in_reduction clause is present, the behavior is as if each generated task was defined by a
12 task construct on which an in_reduction clause with the same reduction operator and list
13 items is present. Thus, the generated tasks are participants of a reduction previously defined by a
14 reduction scoping clause.
15 If no clause from the granularity-clause set is present, the number of loop tasks generated and the
16 number of logical iterations assigned to these tasks is implementation defined.
17 At the beginning of each logical iteration, the loop iteration variable or the variable declared by
18 range-decl of each associated loop has the value that it would have if the set of the associated loops
19 was executed sequentially.
20 When an if clause is present and the if clause expression evaluates to false, undeferred tasks are
21 generated. The use of a variable in an if clause expression causes an implicit reference to the
22 variable in all enclosing constructs.
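
A non-normative sketch of a taskloop construct with a grainsize clause and a reduction clause follows. The function name sum_of_squares and the grainsize value of 100 are hypothetical. Because nogroup is not specified, the construct behaves as if enclosed in a taskgroup, so all generated tasks have completed when the taskloop region ends.

```c
#include <assert.h>

/* Hypothetical helper: returns 1*1 + 2*2 + ... + n*n. */
long sum_of_squares(int n)
{
    assert(n >= 0);
    long total = 0;

    #pragma omp parallel
    #pragma omp single
    {
        /* One thread encounters the taskloop; the generated tasks may
           be executed by any thread in the team. Each generated task
           participates in the reduction on total. */
        #pragma omp taskloop grainsize(100) reduction(+ : total)
        for (int i = 1; i <= n; ++i) {
            total += (long)i * i;
        }
    }
    return total;
}
```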
C++
23 For firstprivate variables of class type, the number of invocations of copy constructors that
24 perform the initialization is implementation defined.
C++
25
26 Note – When storage is shared by a taskloop region, the programmer must ensure, by adding
27 proper synchronization, that the storage does not reach the end of its lifetime before the taskloop
28 region and its descendent tasks complete their execution.
29

30 Execution Model Events


31 The taskloop-begin event occurs upon entering the taskloop region. A taskloop-begin will
32 precede any task-create events for the generated tasks. The taskloop-end event occurs upon
33 completion of the taskloop region.
34 Events for an implicit taskgroup region that surrounds the taskloop region are the same as for
35 the taskgroup construct.

1 The taskloop-iteration-begin event occurs at the beginning of each iteration of a taskloop region
2 before an explicit task executes the iteration. The taskloop-chunk-begin event occurs before an
3 explicit task executes any of its associated iterations in a taskloop region.
4 Tool Callbacks
5 A thread dispatches a registered ompt_callback_work callback for each occurrence of a
6 taskloop-begin and taskloop-end event in that thread. The callback occurs in the context of the
7 encountering task. The callback has type signature ompt_callback_work_t. The callback
8 receives ompt_scope_begin or ompt_scope_end as its endpoint argument, as appropriate,
9 and ompt_work_taskloop as its work_type argument.
10 A thread dispatches a registered ompt_callback_dispatch callback for each occurrence of a
11 taskloop-iteration-begin or taskloop-chunk-begin event in that thread.
12 The callback binds to the explicit task executing the iterations. The callback has type signature
13 ompt_callback_dispatch_t.
14 Restrictions
15 Restrictions to the taskloop construct are as follows:
16 • The reduction-modifier must be default.
17 • The conditional lastprivate-modifier must not be specified.
18 Cross References
19 • Canonical Loop Nest Form, see Section 4.4.1
20 • ompt_callback_dispatch_t, see Section 19.5.2.6
21 • ompt_callback_work_t, see Section 19.5.2.5
22 • ompt_scope_endpoint_t, see Section 19.4.4.11
23 • ompt_work_t, see Section 19.4.4.16
24 • allocate clause, see Section 6.6
25 • collapse clause, see Section 4.4.3
26 • default clause, see Section 5.4.1
27 • final clause, see Section 12.3
28 • firstprivate clause, see Section 5.4.4
29 • grainsize clause, see Section 12.6.1
30 • if clause, see Section 3.4
31 • in_reduction clause, see Section 5.5.10
32 • lastprivate clause, see Section 5.4.5
33 • mergeable clause, see Section 12.2

1 • nogroup clause, see Section 15.7
2 • num_tasks clause, see Section 12.6.2
3 • priority clause, see Section 12.4
4 • private clause, see Section 5.4.3
5 • reduction clause, see Section 5.5.8
6 • shared clause, see Section 5.4.2
7 • task directive, see Section 12.5
8 • taskgroup directive, see Section 15.4
9 • untied clause, see Section 12.1

10 12.6.1 grainsize Clause


Name: grainsize    Properties: unique

Arguments
Name          Type                          Properties
grain-size    expression of integer type    positive

Modifiers
Name                Modifies      Type               Properties
prescriptiveness    grain-size    Keyword: strict    unique

16 Directives
17 taskloop
18 Semantics
The grainsize clause specifies the number of logical iterations, $L_t$, that are assigned to each
generated task $t$. If prescriptiveness is not specified as strict, other than possibly for the
generated task that contains the sequentially last iteration, $L_t$ is greater than or equal to the
minimum of the value of the grain-size expression and the number of logical iterations, but less
than two times the value of the grain-size expression. If prescriptiveness is specified as strict,
other than possibly for the generated task that contains the sequentially last iteration, $L_t$ is equal to
the value of the grain-size expression. In both cases, the generated task that contains the
sequentially last iteration may have fewer iterations than the value of the grain-size expression.
27 Restrictions
28 Restrictions to the grainsize clause are as follows:
29 • None of the associated loops may be non-rectangular loops.
30 Cross References
31 • taskloop directive, see Section 12.6

1 12.6.2 num_tasks Clause
Name: num_tasks    Properties: unique

Arguments
Name         Type                          Properties
num-tasks    expression of integer type    positive

Modifiers
Name                Modifies     Type               Properties
prescriptiveness    num-tasks    Keyword: strict    unique

7 Directives
8 taskloop
9 Semantics
The num_tasks clause specifies that the taskloop construct create as many tasks as the
minimum of the num-tasks expression and the number of logical iterations. Each task must have at
least one logical iteration. If prescriptiveness is specified as strict for a task loop with N logical
iterations, the logical iterations are partitioned in a balanced manner and each partition is assigned,
in order, to a generated task. The partition size is $\lceil N / \text{num-tasks} \rceil$ until the number of
remaining iterations is evenly divisible by the number of remaining tasks, at which point the
partition size becomes $\lfloor N / \text{num-tasks} \rfloor$.
17 Restrictions
18 Restrictions to the num_tasks clause are as follows:
19 • None of the associated loops may be non-rectangular loops.
20 Cross References
21 • taskloop directive, see Section 12.6

22 12.7 taskyield Construct


Name: taskyield         Association: none
Category: executable    Properties: default

24 Binding
25 A taskyield region binds to the current task region. The binding thread set of the taskyield
26 region is the current team.

27 Semantics
28 The taskyield region includes an explicit task scheduling point in the current task region.

29 Cross References
30 • Task Scheduling, see Section 12.9

1 12.8 Initial Task
2 Execution Model Events
3 No events are associated with the implicit parallel region in each initial thread.
4 The initial-thread-begin event occurs in an initial thread after the OpenMP runtime invokes the tool
5 initializer but before the initial thread begins to execute the first OpenMP region in the initial task.
6 The initial-task-begin event occurs after an initial-thread-begin event but before the first OpenMP
7 region in the initial task begins to execute.
8 The initial-task-end event occurs before an initial-thread-end event but after the last OpenMP
9 region in the initial task finishes execution.
10 The initial-thread-end event occurs as the final event in an initial thread at the end of an initial task
11 immediately prior to invocation of the tool finalizer.
12 Tool Callbacks
13 A thread dispatches a registered ompt_callback_thread_begin callback for the
14 initial-thread-begin event in an initial thread. The callback occurs in the context of the initial
15 thread. The callback has type signature ompt_callback_thread_begin_t. The callback
16 receives ompt_thread_initial as its thread_type argument.
17 A thread dispatches a registered ompt_callback_implicit_task callback with
18 ompt_scope_begin as its endpoint argument for each occurrence of an initial-task-begin event
19 in that thread. Similarly, a thread dispatches a registered ompt_callback_implicit_task
20 callback with ompt_scope_end as its endpoint argument for each occurrence of an
21 initial-task-end event in that thread. The callbacks occur in the context of the initial task and have
22 type signature ompt_callback_implicit_task_t. In the dispatched callback,
23 (flag & ompt_task_initial) always evaluates to true.
A thread dispatches a registered ompt_callback_thread_end callback for the
initial-thread-end event in that thread. The callback occurs in the context of the thread. The
callback has type signature ompt_callback_thread_end_t. The implicit parallel region
does not dispatch an ompt_callback_parallel_end callback; however, the implicit parallel
region can be finalized within this ompt_callback_thread_end callback.
29 Cross References
30 • ompt_callback_implicit_task_t, see Section 19.5.2.11
31 • ompt_callback_parallel_begin_t, see Section 19.5.2.3
32 • ompt_callback_parallel_end_t, see Section 19.5.2.4
33 • ompt_callback_thread_begin_t, see Section 19.5.2.1
34 • ompt_callback_thread_end_t, see Section 19.5.2.2
35 • ompt_task_flag_t, see Section 19.4.4.19
36 • ompt_thread_t, see Section 19.4.4.10

1 12.9 Task Scheduling
2 Whenever a thread reaches a task scheduling point, the implementation may cause it to perform a
3 task switch, beginning or resuming execution of a different task bound to the current team. Task
4 scheduling points are implied at the following locations:
5 • during the generation of an explicit task;
6 • the point immediately following the generation of an explicit task;
7 • after the point of completion of the structured block associated with a task;
8 • in a taskyield region;
9 • in a taskwait region;
10 • at the end of a taskgroup region;
11 • in an implicit barrier region;
12 • in an explicit barrier region;
13 • during the generation of a target region;
14 • the point immediately following the generation of a target region;
15 • at the beginning and end of a target data region;
16 • in a target update region;
17 • in a target enter data region;
18 • in a target exit data region;
19 • in the omp_target_memcpy routine;
20 • in the omp_target_memcpy_async routine;
21 • in the omp_target_memcpy_rect routine; and
22 • in the omp_target_memcpy_rect_async routine.
23 When a thread encounters a task scheduling point it may do one of the following, subject to the
24 Task Scheduling Constraints (below):
25 • begin execution of a tied task bound to the current team;
26 • resume any suspended task region, bound to the current team, to which it is tied;
27 • begin execution of an untied task bound to the current team; or
28 • resume any suspended untied task region bound to the current team.
29 If more than one of the above choices is available, which one is chosen is unspecified.

1 Task Scheduling Constraints are as follows:
2 1. Scheduling of new tied tasks is constrained by the set of task regions that are currently tied to the
3 thread and that are not suspended in a barrier region. If this set is empty, any new tied task may
4 be scheduled. Otherwise, a new tied task may be scheduled only if it is a descendent task of
5 every task in the set.
6 2. A dependent task shall not start its execution until its task dependences are fulfilled.
7 3. A task shall not be scheduled while any task with which it is mutually exclusive has been
8 scheduled but has not yet completed.
9 4. When an explicit task is generated by a construct that contains an if clause for which the
10 expression evaluated to false, and the previous constraints are already met, the task is executed
11 immediately after generation of the task.
12 A program that relies on any other assumption about task scheduling is non-conforming.
13

14 Note – Task scheduling points dynamically divide task regions into parts. Each part is executed
15 uninterrupted from start to end. Different parts of the same task region are executed in the order in
16 which they are encountered. In the absence of task synchronization constructs, the order in which a
17 thread executes parts of different schedulable tasks is unspecified.
18 A program must behave correctly and consistently with all conceivable scheduling sequences that
19 are compatible with the rules above.
20 For example, if threadprivate storage is accessed (explicitly in the source code or implicitly
21 in calls to library routines) in one part of a task region, its value cannot be assumed to be preserved
22 into the next part of the same task region if another schedulable task exists that modifies it.
23 As another example, if a lock acquire and release happen in different parts of a task region, no
24 attempt should be made to acquire the same lock in any part of another task that the executing
25 thread may schedule. Otherwise, a deadlock is possible. A similar situation can occur when a
26 critical region spans multiple parts of a task and another schedulable task contains a
27 critical region with the same name.
28 The use of threadprivate variables and the use of locks or critical sections in an explicit task with an
29 if clause must take into account that when the if clause evaluates to false, the task is executed
30 immediately, without regard to Task Scheduling Constraint 2.
31

32 Execution Model Events


33 The task-schedule event occurs in a thread when the thread switches tasks at a task scheduling
34 point; no event occurs when switching to or from a merged task.

1 Tool Callbacks
2 A thread dispatches a registered ompt_callback_task_schedule callback for each
3 occurrence of a task-schedule event in the context of the task that begins or resumes. This callback
4 has the type signature ompt_callback_task_schedule_t. The argument prior_task_status
5 is used to indicate the cause for suspending the prior task. This cause may be the completion of the
6 prior task region, the encountering of a taskyield construct, or the encountering of an active
7 cancellation point.

8 Cross References
9 • ompt_callback_task_schedule_t, see Section 19.5.2.10

1 13 Device Directives and Clauses
2 This chapter defines constructs and concepts related to device execution.

3 13.1 device_type Clause


Name: device_type    Properties: unique

Arguments
Name                       Type                          Properties
device-type-description    Keyword: any, host, nohost    default

7 Directives
8 begin declare target, declare target

9 Semantics
10 The device_type clause specifies if a version of the procedure or variable should be made
11 available on the host device, non-host devices or both the host device and non-host devices. If
12 host is specified then only a host device version of the procedure or variable is made available. If
13 any is specified then both host device and non-host device versions of the procedure or variable are
14 made available. If nohost is specified for a procedure then only non-host device versions of the
15 procedure are made available. If nohost is specified for a variable then that variable is not
16 available on the host device. If the device_type clause is not specified, the behavior is as if the
17 device_type clause appears with any specified.

18 Cross References
19 • begin declare target directive, see Section 7.8.2
20 • declare target directive, see Section 7.8.1

1 13.2 device Clause
Name: device    Properties: unique

Arguments
Name                  Type                          Properties
device-description    expression of integer type    default

Modifiers
Name               Modifies              Type                             Properties
device-modifier    device-description    Keyword: ancestor, device_num    default

7 Directives
8 dispatch, interop, target, target data, target enter data, target exit
9 data, target update

10 Semantics
11 The device clause identifies the target device that is associated with a device construct.
12 If device_num is specified as the device-modifier, the device-description specifies the device
13 number of the target device. If device-modifier does not appear in the clause, the behavior of the
14 clause is as if device-modifier is device_num. If the device-description evaluates to
15 omp_invalid_device, runtime error termination is performed.
If ancestor is specified as the device-modifier, the device-description specifies the number of
target nesting levels of the target device. Specifically, if the device-description evaluates to 1, the
target device is the parent device of the enclosing target region. If the construct on which the
device clause appears is not encountered in a target region, the current device is treated as the
parent device.
21 Unless otherwise specified, for directives that accept the device clause, if no device clause is
22 present, the behavior is as if the device clause appears without a device-modifier and with a
23 device-description that evaluates to the value of the default-device-var ICV.

24 Restrictions
• The ancestor device-modifier must not appear on the device clause on any directive other
than the target construct.
• If the ancestor device-modifier is specified, the device-description must evaluate to 1
and a requires directive with the reverse_offload clause must be specified.
• If the device_num device-modifier is specified and target-offload-var is not mandatory,
device-description must evaluate to a conforming device number.

1 Cross References
2 • dispatch directive, see Section 7.6
3 • interop directive, see Section 14.1
4 • target data directive, see Section 13.5
5 • target directive, see Section 13.8
6 • target enter data directive, see Section 13.6
7 • target exit data directive, see Section 13.7
8 • target update directive, see Section 13.9
9 • target-offload-var ICV, see Table 2.1

10 13.3 thread_limit Clause


Name: thread_limit    Properties: unique

Arguments
Name         Type                          Properties
threadlim    expression of integer type    positive

14 Directives
15 target, teams

16 Semantics
17 As described in Section 2.4, some constructs limit the number of threads that may participate in a
18 contention group initiated by each team by setting the value of the thread-limit-var ICV for the
19 initial task to an implementation-defined value greater than zero. If the thread_limit clause is
20 specified, the number of threads will be less than or equal to threadlim. Otherwise, if the
21 teams-thread-limit-var ICV is greater than zero, the effect is as if the thread_limit clause was
22 specified with a threadlim that evaluates to an implementation defined value less than or equal to
23 the teams-thread-limit-var ICV.

24 Cross References
25 • target directive, see Section 13.8
26 • teams directive, see Section 10.2

1 13.4 Device Initialization
2 Execution Model Events
3 The device-initialize event occurs in a thread that begins initialization of OpenMP on the device,
4 after the device’s OpenMP initialization, which may include device-side tool initialization,
5 completes.
6 The device-load event for a code block for a target device occurs in some thread before any thread
7 executes code from that code block on that target device.
8 The device-unload event for a target device occurs in some thread whenever a code block is
9 unloaded from the device.
10 The device-finalize event for a target device that has been initialized occurs in some thread before
11 an OpenMP implementation shuts down.

12 Tool Callbacks
13 A thread dispatches a registered ompt_callback_device_initialize callback for each
14 occurrence of a device-initialize event in that thread. This callback has type signature
15 ompt_callback_device_initialize_t.
16 A thread dispatches a registered ompt_callback_device_load callback for each occurrence
17 of a device-load event in that thread. This callback has type signature
18 ompt_callback_device_load_t.
19 A thread dispatches a registered ompt_callback_device_unload callback for each
20 occurrence of a device-unload event in that thread. This callback has type signature
21 ompt_callback_device_unload_t.
22 A thread dispatches a registered ompt_callback_device_finalize callback for each
23 occurrence of a device-finalize event in that thread. This callback has type signature
24 ompt_callback_device_finalize_t.

25 Restrictions
26 Restrictions to OpenMP device initialization are as follows:
27 • No thread may offload execution of an OpenMP construct to a device until a dispatched
28 ompt_callback_device_initialize callback completes.
29 • No thread may offload execution of an OpenMP construct to a device after a dispatched
30 ompt_callback_device_finalize callback occurs.

31 Cross References
32 • ompt_callback_device_finalize_t, see Section 19.5.2.20
33 • ompt_callback_device_initialize_t, see Section 19.5.2.19
34 • ompt_callback_device_load_t, see Section 19.5.2.21
35 • ompt_callback_device_unload_t, see Section 19.5.2.22

1 13.5 target data Construct
Name: target data       Association: block
Category: executable    Properties: device, device-affecting, data-mapping,
                        map-entering, map-exiting, mapping-only

3 Clauses
4 device, if, map, use_device_addr, use_device_ptr

5 Clause set data-environment-clause


Properties: required    Members: map, use_device_addr, use_device_ptr

7 Binding
8 The binding task set for a target data region is the generating task. The target data region
9 binds to the region of the generating task.

10 Semantics
11 The target data construct maps variables to a device data environment. When a
12 target data construct is encountered, the encountering task executes the region. When an if
13 clause is present and the if clause expression evaluates to false, the target device is the host.
14 Variables are mapped for the extent of the region, according to any data-mapping attribute clauses,
15 from the data environment of the encountering task to the device data environment.
16 A list item that appears in a map clause may also appear in a use_device_ptr clause or a
17 use_device_addr clause. If one or more map clauses are present, the list item conversions that
18 are performed for any use_device_ptr or use_device_addr clause occur after all
19 variables are mapped on entry to the region according to those map clauses.

20 Execution Model Events


21 The events associated with entering a target data region are the same events as associated with
22 a target enter data construct, as described in Section 13.6.
23 The events associated with exiting a target data region are the same events as associated with a
24 target exit data construct, as described in Section 13.7.

25 Tool Callbacks
26 The tool callbacks dispatched when entering a target data region are the same as the tool
27 callbacks dispatched when encountering a target enter data construct, as described in
28 Section 13.6.
29 The tool callbacks dispatched when exiting a target data region are the same as the tool
30 callbacks dispatched when encountering a target exit data construct, as described in
31 Section 13.7.

1 Restrictions
2 Restrictions to the target data construct are as follows:
3 • A map-type in a map clause must be to, from, tofrom or alloc.

4 Cross References
5 • device clause, see Section 13.2
6 • if clause, see Section 3.4
7 • map clause, see Section 5.8.3
8 • use_device_addr clause, see Section 5.4.10
9 • use_device_ptr clause, see Section 5.4.8

10 13.6 target enter data Construct


Name: target enter data    Association: none
Category: executable       Properties: parallelism-generating, task-generating,
                           device, device-affecting, data-mapping,
                           map-entering, mapping-only

12 Clauses
13 depend, device, if, map, nowait

14 Binding
15 The binding task set for a target enter data region is the generating task, which is the target
16 task generated by the target enter data construct. The target enter data region binds
17 to the corresponding target task region.

18 Semantics
19 When a target enter data construct is encountered, the list items are mapped to the device
20 data environment according to the map clause semantics. The target enter data construct
21 generates a target task. The generated task region encloses the target enter data region. If a
22 depend clause is present, it is associated with the target task. If the nowait clause is present,
23 execution of the target task may be deferred. If the nowait clause is not present, the target task is
24 an included task.
25 All clauses are evaluated when the target enter data construct is encountered. The data
26 environment of the target task is created according to the data-mapping attribute clauses on the
27 target enter data construct, per-data environment ICVs, and any default data-sharing
28 attribute rules that apply to the target enter data construct. If a variable or part of a variable
29 is mapped by the target enter data construct, the variable has a default data-sharing attribute
30 of shared in the data environment of the target task.

1 Assignment operations associated with mapping a variable (see Section 5.8.3) occur when the
2 target task executes.
3 When an if clause is present and the if clause expression evaluates to false, the target device is
4 the host.

5 Execution Model Events


6 Events associated with a target task are the same as for the task construct defined in Section 12.5.
7 The target-enter-data-begin event occurs after creation of the target task and completion of all
8 predecessor tasks that are not target tasks for the same device. The target-enter-data-begin event is
9 a target-task-begin event.
10 The target-enter-data-end event occurs after all other events associated with the
11 target enter data construct.

12 Tool Callbacks
13 Callbacks associated with events for target tasks are the same as for the task construct defined in
14 Section 12.5; (flags & ompt_task_target) always evaluates to true in the dispatched callback.
A thread dispatches a registered ompt_callback_target or ompt_callback_target_emi
callback for each occurrence of a target-enter-data-begin event in that thread, in the context of
the target task on the host. The callback receives ompt_scope_begin as its endpoint argument
and, as its kind argument, ompt_target_enter_data or, if the nowait clause is present,
ompt_target_enter_data_nowait. Similarly, a thread dispatches a registered
ompt_callback_target or ompt_callback_target_emi callback for each occurrence of a
target-enter-data-end event in that thread, in the context of the target task on the host. That
callback receives ompt_scope_end as its endpoint argument and the same kind argument.
These callbacks have type signature ompt_callback_target_t or
ompt_callback_target_emi_t, respectively.

26 Restrictions
27 Restrictions to the target enter data construct are as follows:
28 • At least one map clause must appear on the directive.
29 • All map clauses must be map-entering.

30 Cross References
31 • ompt_callback_target_emi_t and ompt_callback_target_t, see
32 Section 19.5.2.26
33 • depend clause, see Section 15.9.5
34 • device clause, see Section 13.2
35 • if clause, see Section 3.4

1 • map clause, see Section 5.8.3
2 • nowait clause, see Section 15.6
3 • task directive, see Section 12.5

4 13.7 target exit data Construct


Name: target exit data    Association: none
Category: executable      Properties: parallelism-generating, task-generating,
                          device, device-affecting, data-mapping,
                          map-exiting, mapping-only

6 Clauses
7 depend, device, if, map, nowait

8 Binding
9 The binding task set for a target exit data region is the generating task, which is the target
10 task generated by the target exit data construct. The target exit data region binds to
11 the corresponding target task region.

12 Semantics
13 When a target exit data construct is encountered, the list items in the map clauses are
14 unmapped from the device data environment according to the map clause semantics. The
15 target exit data construct generates a target task. The generated task region encloses the
16 target exit data region. If a depend clause is present, it is associated with the target task. If
17 the nowait clause is present, execution of the target task may be deferred. If the nowait clause
18 is not present, the target task is an included task.
19 All clauses are evaluated when the target exit data construct is encountered. The data
20 environment of the target task is created according to the data-mapping attribute clauses on the
21 target exit data construct, per-data environment ICVs, and any default data-sharing attribute
22 rules that apply to the target exit data construct. If a variable or part of a variable is mapped
23 by the target exit data construct, the variable has a default data-sharing attribute of shared in
24 the data environment of the target task.
25 Assignment operations associated with mapping a variable (see Section 5.8.3) occur when the
26 target task executes.
27 When an if clause is present and the if clause expression evaluates to false, the target device is
28 the host.

1 Execution Model Events
2 Events associated with a target task are the same as for the task construct defined in Section 12.5.
3 The target-exit-data-begin event occurs after creation of the target task and completion of all
4 predecessor tasks that are not target tasks for the same device. The target-exit-data-begin event is a
5 target-task-begin event.
6 The target-exit-data-end event occurs after all other events associated with the
7 target exit data construct.

8 Tool Callbacks
9 Callbacks associated with events for target tasks are the same as for the task construct defined in
10 Section 12.5; (flags & ompt_task_target) always evaluates to true in the dispatched callback.
A thread dispatches a registered ompt_callback_target or ompt_callback_target_emi
callback for each occurrence of a target-exit-data-begin event in that thread, in the context of
the target task on the host. The callback receives ompt_scope_begin as its endpoint argument
and, as its kind argument, ompt_target_exit_data or, if the nowait clause is present,
ompt_target_exit_data_nowait. Similarly, a thread dispatches a registered
ompt_callback_target or ompt_callback_target_emi callback for each occurrence of a
target-exit-data-end event in that thread, in the context of the target task on the host. That
callback receives ompt_scope_end as its endpoint argument and the same kind argument.
These callbacks have type signature ompt_callback_target_t or
ompt_callback_target_emi_t, respectively.

22 Restrictions
Restrictions to the target exit data construct are as follows:
• At least one map clause must appear on the directive.
• All map clauses must be map-exiting.

26 Cross References
27 • ompt_callback_target_emi_t and ompt_callback_target_t, see
28 Section 19.5.2.26
29 • depend clause, see Section 15.9.5
30 • device clause, see Section 13.2
31 • if clause, see Section 3.4
32 • map clause, see Section 5.8.3
33 • nowait clause, see Section 15.6
34 • task directive, see Section 12.5

1 13.8 target Construct
Name: target            Association: block
Category: executable    Properties: parallelism-generating, thread-limiting,
                        exception-aborting, task-generating, device,
                        device-affecting, data-mapping, map-entering,
                        map-exiting, context-matching

3 Clauses
4 allocate, defaultmap, depend, device, firstprivate, has_device_addr, if,
5 in_reduction, is_device_ptr, map, nowait, private, thread_limit,
6 uses_allocators

7 Binding
8 The binding task set for a target region is the generating task, which is the target task generated
9 by the target construct. The target region binds to the corresponding target task region.

10 Semantics
11 The target construct provides a superset of the functionality provided by the target data
12 directive, except for the use_device_ptr and use_device_addr clauses. The functionality
13 added to the target directive is the inclusion of an executable region to be executed on a device.
14 The target construct generates a target task. The generated task region encloses the target
15 region. If a depend clause is present, it is associated with the target task. The device clause
16 determines the device on which the target region executes. If the nowait clause is present,
17 execution of the target task may be deferred. If the nowait clause is not present, the target task is
18 an included task.
19 All clauses are evaluated when the target construct is encountered. The data environment of the
20 target task is created according to the data-sharing and data-mapping attribute clauses on the
21 target construct, per-data environment ICVs, and any default data-sharing attribute rules that
22 apply to the target construct. If a variable or part of a variable is mapped by the target
23 construct and does not appear as a list item in an in_reduction clause on the construct, the
24 variable has a default data-sharing attribute of shared in the data environment of the target task.
25 Assignment operations associated with mapping a variable (see Section 5.8.3) occur when the
26 target task executes.
27 If the device clause is specified with the ancestor device-modifier, the encountering thread
28 waits for completion of the target region on the parent device before resuming. For any list item
29 that appears in a map clause on the same construct, if the corresponding list item exists in the device
30 data environment of the parent device, it is treated as if it has a reference count of positive infinity.
31 When an if clause is present and the if clause expression evaluates to false, the effect is as if a
32 device clause that specifies omp_initial_device as the device number is present,
33 regardless of any other device clause on the directive.
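As a non-normative illustration of the if clause, the following C sketch uses a hypothetical helper named scale whose use_device parameter selects between offloading and host execution. The names are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: multiplies each element of a[0:n] by s.
 * When use_device is zero the if clause expression evaluates to false and
 * the effect is as if device(omp_initial_device) had been specified, so the
 * region executes on the host. */
void scale(double *a, int n, double s, int use_device)
{
    assert(a != NULL && n >= 0);

    #pragma omp target map(tofrom: a[0:n]) if(use_device)
    for (int i = 0; i < n; ++i)
        a[i] *= s;
}
```

Either value of use_device is expected to produce the same contents of a[0:n]; only the device on which the region executes differs.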



1 If a procedure is explicitly or implicitly referenced in a target construct that does not specify a
2 device clause in which the ancestor device-modifier appears, then that procedure is treated as
3 if its name had appeared in an enter clause on a declare target directive.
4 If a variable with static storage duration is declared in a target construct that does not specify a
5 device clause in which the ancestor device-modifier appears, then the named variable is
6 treated as if it had appeared in an enter clause on a declare target directive.
C / C++
7 If a list item in a map clause has a base pointer and it is a scalar variable with a predetermined
8 data-sharing attribute of firstprivate (see Section 5.1.1), then on entry to the target region:
9 • If the list item is not a zero-length array section, the corresponding private variable is initialized
10 such that the corresponding list item in the device data environment can be accessed through the
11 pointer in the target region.
12 • If the list item is a zero-length array section, the corresponding private variable is initialized
13 according to Section 5.8.6.
C / C++
Fortran
14 When an internal procedure is called in a target region, any references to variables that are host
15 associated in the procedure have unspecified behavior.
Fortran
16 Execution Model Events
17 Events associated with a target task are the same as for the task construct defined in Section 12.5.
18 Events associated with the initial task that executes the target region are defined in Section 12.8.
19 The target-submit-begin event occurs prior to initiating creation of an initial task on a target device
20 for a target region.
21 The target-submit-end event occurs after initiating creation of an initial task on a target device for a
22 target region.
23 The target-begin event occurs after creation of the target task and completion of all predecessor
24 tasks that are not target tasks for the same device. The target-begin event is a target-task-begin
25 event.
26 The target-end event occurs after all other events associated with the target construct.

27 Tool Callbacks
28 Callbacks associated with events for target tasks are the same as for the task construct defined in
29 Section 12.5; (flags & ompt_task_target) always evaluates to true in the dispatched callback.



1 A thread dispatches a registered ompt_callback_target or
2 ompt_callback_target_emi callback with ompt_scope_begin as its endpoint
3 argument and ompt_target (or ompt_target_nowait if the nowait clause is present) as its
4 kind argument for each occurrence of a target-begin event in that thread in the context of the target
5 task on the host. Similarly, a thread dispatches a registered ompt_callback_target or
6 ompt_callback_target_emi callback with ompt_scope_end as its endpoint argument
7 and ompt_target (or ompt_target_nowait if the nowait clause is present) as its kind
8 argument for each occurrence of a target-end event in that thread in the context of the target task on
9 the host. These callbacks have type signature ompt_callback_target_t or
10 ompt_callback_target_emi_t, respectively.
11 A thread dispatches a registered ompt_callback_target_submit_emi callback with
12 ompt_scope_begin as its endpoint argument for each occurrence of a target-submit-begin
13 event in that thread. Similarly, a thread dispatches a registered
14 ompt_callback_target_submit_emi callback with ompt_scope_end as its endpoint
15 argument for each occurrence of a target-submit-end event in that thread. These callbacks have type
16 signature ompt_callback_target_submit_emi_t.
17 A thread dispatches a registered ompt_callback_target_submit callback for each
18 occurrence of a target-submit-begin event in that thread. The callback occurs in the context of the
19 target task and has type signature ompt_callback_target_submit_t.
20 Restrictions
21 Restrictions to the target construct are as follows:
22 • Device-affecting constructs, other than target constructs for which the ancestor
23 device-modifier is specified, must not be encountered during execution of a target region.
24 • The result of an omp_set_default_device, omp_get_default_device, or
25 omp_get_num_devices routine called within a target region is unspecified.
26 • The effect of an access to a threadprivate variable in a target region is unspecified.
27 • If a list item in a map clause is a structure element, any other element of that structure that is
28 referenced in the target construct must also appear as a list item in a map clause.
29 • A list item in a data-sharing attribute clause that is specified on a target construct must not
30 have the same base variable as a list item in a map clause on the construct.
31 • A variable referenced in a target region but not the target construct that is not declared in
32 the target region must appear in a declare target directive.
33 • A map-type in a map clause must be to, from, tofrom or alloc.
34 • If a device clause is specified with the ancestor device-modifier, only the device,
35 firstprivate, private, defaultmap, and map clauses may appear on the construct and
36 no OpenMP constructs or calls to OpenMP API runtime routines are allowed inside the
37 corresponding target region.
38 • Memory allocators that do not appear in a uses_allocators clause cannot appear as an
39 allocator in an allocate clause or be used in the target region unless a requires
40 directive with the dynamic_allocators clause is present in the same compilation unit.



1 • Any IEEE floating-point exception status flag, halting mode, or rounding mode set prior to a
2 target region is unspecified in the region.
3 • Any IEEE floating-point exception status flag, halting mode, or rounding mode set in a target
4 region is unspecified upon exiting the region.
5 • A program must not rely on the value of a function address in a target region except for
6 assignments, comparisons to zero and indirect calls.
C / C++
7 • An attached pointer must not be modified in a target region.
C / C++
C++
8 • The run-time type information (RTTI) of an object can only be accessed from the device on
9 which it was constructed.
10 • Invoking a virtual member function of an object on a device other than the device on which the
11 object was constructed results in unspecified behavior, unless the object is accessible and was
12 constructed on the host device.
13 • If an object of polymorphic class type is destructed, virtual member functions of any previously
14 existing corresponding objects in other device data environments must not be invoked.
C++
Fortran
15 • An attached pointer that is associated with a given pointer target must not become associated
16 with a different pointer target in a target region.
17 • If a list item in a map clause is an array section, and the array section is derived from a variable
18 with a POINTER or ALLOCATABLE attribute then the behavior is unspecified if the
19 corresponding list item’s variable is modified in the region.
20 • A reference to a coarray that is encountered on a non-host device must not be coindexed or appear
21 as an actual argument to a procedure where the corresponding dummy argument is a coarray.
22 • If the allocation status of a mapped variable that has the ALLOCATABLE attribute is unallocated
23 on entry to a target region, the allocation status of the corresponding variable in the device
24 data environment must be unallocated upon exiting the region.
25 • If the allocation status of a mapped variable that has the ALLOCATABLE attribute is allocated on
26 entry to a target region, the allocation status and shape of the corresponding variable in the
27 device data environment may not be changed, either explicitly or implicitly, in the region after
28 entry to it.
29 • If the association status of a list item with the POINTER attribute that appears in a map clause
30 on the construct is associated upon entry to the target region, the list item must be associated
31 with the same pointer target upon exit from the region.



1 • If the association status of a list item with the POINTER attribute that appears in a map clause
2 on the construct is disassociated upon entry to the target region, the list item must be
3 disassociated upon exit from the region.
4 • If the association status of a list item with the POINTER attribute that appears in a map clause
5 on the construct is undefined on entry to the target region, the association status of the list
6 item must not be associated upon exit from the region.
7 • A program must not rely on the association status of a procedure pointer in a target region
8 except for calls to the ASSOCIATED inquiry function without the optional proc-target argument,
9 pointer assignments and indirect calls.
Fortran
10 Cross References
11 • ompt_callback_target_emi_t and ompt_callback_target_t, see
12 Section 19.5.2.26
13 • ompt_callback_target_submit_emi_t and
14 ompt_callback_target_submit_t, see Section 19.5.2.28
15 • allocate clause, see Section 6.6
16 • defaultmap clause, see Section 5.8.7
17 • depend clause, see Section 15.9.5
18 • device clause, see Section 13.2
19 • firstprivate clause, see Section 5.4.4
20 • has_device_addr clause, see Section 5.4.9
21 • if clause, see Section 3.4
22 • in_reduction clause, see Section 5.5.10
23 • is_device_ptr clause, see Section 5.4.7
24 • map clause, see Section 5.8.3
25 • nowait clause, see Section 15.6
26 • private clause, see Section 5.4.3
27 • target data directive, see Section 13.5
28 • task directive, see Section 12.5
29 • thread_limit clause, see Section 13.3
30 • uses_allocators clause, see Section 6.8



1 13.9 target update Construct
Name: target update     Association: none
Category: executable    Properties: parallelism-generating, task-generating,
                        device, device-affecting
3 Clauses
4 depend, device, from, if, nowait, to
5 Clause set
6 Properties: required Members: from, to
7 Binding
8 The binding task set for a target update region is the generating task, which is the target task
9 generated by the target update construct. The target update region binds to the
10 corresponding target task region.
11 Semantics
12 The target update directive makes the corresponding list items in the device data environment
13 consistent with their original list items, according to the specified data-motion-clauses. The
14 target update construct generates a target task. The generated task region encloses the
15 target update region. If a depend clause is present, it is associated with the target task. If the
16 nowait clause is present, execution of the target task may be deferred. If the nowait clause is
17 not present, the target task is an included task.
18 All clauses are evaluated when the target update construct is encountered. The data
19 environment of the target task is created according to data-motion-clauses on the
20 target update construct, per-data environment ICVs, and any default data-sharing attribute
21 rules that apply to the target update construct. If a variable or part of a variable is a list item in
22 a data-motion-clause on the target update construct, the variable has a default data-sharing
23 attribute of shared in the data environment of the target task.
24 Assignment operations associated with any motion clauses occur when the target task executes.
25 When an if clause is present and the if clause expression evaluates to false, no assignments occur.
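The following non-normative C sketch shows one way the target update directive might be used inside a target data region. The helper name sum_after_host_change is hypothetical. The sketch assumes that the region falls back to the host when no non-host device is available, in which case the update has no visible effect.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: maps a[0:n], changes it on the host, refreshes the
 * corresponding device copy with target update, and sums it in a target
 * region. */
double sum_after_host_change(double *a, int n)
{
    assert(a != NULL && n >= 0);
    double sum = 0.0;

    #pragma omp target data map(to: a[0:n])
    {
        /* Host-side change: a distinct device copy would now be stale. */
        for (int i = 0; i < n; ++i)
            a[i] += 1.0;

        /* Make the corresponding list item consistent with the original. */
        #pragma omp target update to(a[0:n])

        /* a[0:n] is already present, so this map does not copy it again. */
        #pragma omp target map(to: a[0:n]) map(tofrom: sum)
        for (int i = 0; i < n; ++i)
            sum += a[i];
    }
    return sum;
}
```

If the target update directive were omitted and the device had its own copy of a[0:n], the sum would be computed from the values that were mapped on entry to the target data region.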
26 Execution Model Events
27 Events associated with a target task are the same as for the task construct defined in Section 12.5.
28 The target-update-begin event occurs after creation of the target task and completion of all
29 predecessor tasks that are not target tasks for the same device.
30 The target-update-end event occurs after all other events associated with the target update
31 construct.
32 The target-data-op-begin event occurs in the target update region before a thread initiates a
33 data operation on the target device.
34 The target-data-op-end event occurs in the target update region after a thread initiates a data
35 operation on the target device.



1 Tool Callbacks
2 Callbacks associated with events for target tasks are the same as for the task construct defined in
3 Section 12.5; (flags & ompt_task_target) always evaluates to true in the dispatched callback.
4 A thread dispatches a registered ompt_callback_target or
5 ompt_callback_target_emi callback with ompt_scope_begin as its endpoint
6 argument and ompt_target_update (or ompt_target_update_nowait if the nowait
7 clause is present) as its kind argument for each occurrence of a target-update-begin event in that
8 thread in the context of the target task on the host. Similarly, a thread dispatches a registered
9 ompt_callback_target or ompt_callback_target_emi callback with
10 ompt_scope_end as its endpoint argument and ompt_target_update (or
11 ompt_target_update_nowait if the nowait clause is present) as its kind argument for each
12 occurrence of a target-update-end event in that thread in the context of the target task on the host.
13 These callbacks have type signature ompt_callback_target_t or
14 ompt_callback_target_emi_t, respectively.
15 A thread dispatches a registered ompt_callback_target_data_op_emi callback with
16 ompt_scope_begin as its endpoint argument for each occurrence of a target-data-op-begin
17 event in that thread. Similarly, a thread dispatches a registered
18 ompt_callback_target_data_op_emi callback with ompt_scope_end as its endpoint
19 argument for each occurrence of a target-data-op-end event in that thread. These callbacks have
20 type signature ompt_callback_target_data_op_emi_t.
21 A thread dispatches a registered ompt_callback_target_data_op callback for each
22 occurrence of a target-data-op-end event in that thread. The callback occurs in the context of the
23 target task and has type signature ompt_callback_target_data_op_t.

24 Cross References
25 • ompt_callback_target_emi_t and ompt_callback_target_t, see
26 Section 19.5.2.26
27 • ompt_callback_task_create_t, see Section 19.5.2.7
28 • depend clause, see Section 15.9.5
29 • device clause, see Section 13.2
30 • from clause, see Section 5.9.2
31 • if clause, see Section 3.4
32 • nowait clause, see Section 15.6
33 • task directive, see Section 12.5
34 • to clause, see Section 5.9.1



1 14 Interoperability
2 An OpenMP implementation may interoperate with one or more foreign runtime environments
3 through the use of the interop construct that is described in this chapter, the interop operation
4 for a declared variant function and the interoperability routines that are available through the
5 OpenMP Runtime API.
C / C++
6 The implementation must provide foreign-runtime-id values that are enumerators of type
7 omp_interop_fr_t and that correspond to the supported foreign runtime environments.
C / C++
Fortran
8 The implementation must provide foreign-runtime-id values that are named integer constants with
9 kind omp_interop_fr_kind and that correspond to the supported foreign runtime
10 environments.
Fortran
11 Each foreign-runtime-id value provided by an implementation will be available as
12 omp_ifr_name, where name is the name of the foreign runtime environment. Available names
13 include those that are listed in the OpenMP Additional Definitions document;
14 implementation-defined names may also be supported. The value of omp_ifr_last is defined as
15 one greater than the value of the highest supported foreign-runtime-id value that is listed in the
16 aforementioned document.

17 Cross References
18 • Interoperability Routines, see Section 18.12

19 14.1 interop Construct


Name: interop           Association: none
Category: executable    Properties: device

21 Clauses
22 depend, destroy, device, init, nowait, use

23 Clause set action-clause


24 Properties: required Members: destroy, init, use



1 Binding
2 The binding task set for an interop region is the generating task. The interop region binds to
3 the region of the generating task.

4 Semantics
5 The interop construct retrieves interoperability properties from the OpenMP implementation to
6 enable interoperability with foreign execution contexts. When an interop construct is
7 encountered, the encountering task executes the region.
8 For each action-clause, the interop-type set is the set of interop-type modifiers specified for the
9 clause if the clause is init or for the init clause that initialized the interop-var that is specified for
10 the clause if the clause is not init.
11 If the interop-type set includes targetsync, an empty mergeable task is generated. If the
12 nowait clause is not present on the construct then the task is also an included task. Any depend
13 clauses that are present on the construct apply to the generated task.
14 The interop construct ensures an ordered execution of the generated task relative to foreign tasks
15 executed in the foreign execution context through the foreign synchronization object that is
16 accessible through the targetsync property. When the creation of the foreign task precedes the
17 encountering of an interop construct in happens before order (see Section 1.4.5), the foreign
18 task must complete execution before the generated task begins execution. Similarly, when the
19 creation of a foreign task follows the encountering of an interop construct in happens before
20 order, the foreign task must not begin execution until the generated task completes execution. No
21 ordering is imposed between the encountering thread and either foreign tasks or OpenMP tasks by
22 the interop construct.
23 If the interop-type set does not include targetsync, the nowait clause has no effect.

24 Restrictions
25 Restrictions to the interop construct are as follows:
26 • A depend clause can only appear on the directive if the interop-type includes targetsync.
27 • Each interop-var may be specified for at most one action-clause of each interop construct.

28 Cross References
29 • Interoperability Routines, see Section 18.12
30 • depend clause, see Section 15.9.5
31 • destroy clause, see Section 3.5
32 • device clause, see Section 13.2
33 • init clause, see Section 14.1.2
34 • nowait clause, see Section 15.6
35 • use clause, see Section 14.1.3



1 14.1.1 OpenMP Foreign Runtime Identifiers
2 An OpenMP foreign runtime identifier, foreign-runtime-id, is a base language string literal or a
3 compile-time constant OpenMP integer expression. Allowed values for foreign-runtime-id include
4 the names (as string literals) and integer values that the OpenMP Additional Definitions document
5 specifies and the corresponding omp_ifr_name constants of OpenMP interop_fr type.
6 Implementation-defined values for foreign-runtime-id may also be supported.

7 14.1.2 init Clause


8 Name: init Properties: default

9 Arguments
Name         Type                            Properties
interop-var  variable of omp_interop_t type  default

11 Modifiers
Name                Modifies  Type                                        Properties
interop-preference  Generic   Complex, name: prefer_type                  complex, unique
                              Arguments: preference_list, OpenMP foreign
                              runtime preference list (default)
interop-type        Generic   Keyword: target, targetsync                 repeatable, required

13 Directives
14 interop

15 Semantics
16 The init clause specifies that interop-var is initialized to refer to the list of properties associated
17 with any interop-type. For any interop-type, the properties type, type_name, vendor,
18 vendor_name and device_num will be available. If the implementation cannot initialize
19 interop-var, it is initialized to the value of omp_interop_none, which is defined to be zero.
20 The targetsync interop-type will additionally provide the targetsync property, which is the
21 handle to a foreign synchronization object for enabling synchronization between OpenMP tasks and
22 foreign tasks that execute in the foreign execution context.
23 The target interop-type will additionally provide the following properties:
24 • device, which will be a foreign device handle;
25 • device_context, which will be a foreign device context handle; and
26 • platform, which will be a handle to a foreign platform of the device.



1 If the prefer_type modifier is specified, the first supported foreign-runtime-id in
2 preference-list in left-to-right order is used. The foreign-runtime-id that is used if the
3 implementation does not support any of the items in preference-list is implementation defined.

4 Restrictions
5 Restrictions to the init clause are as follows:
6 • Each interop-type may be specified at most once.
7 • interop-var must be non-const.

8 Cross References
9 • OpenMP Foreign Runtime Identifiers, see Section 14.1.1
10 • interop directive, see Section 14.1

11 14.1.3 use Clause


12 Name: use Properties: default

13 Arguments
Name         Type                            Properties
interop-var  variable of omp_interop_t type  default

15 Directives
16 interop

17 Semantics
18 The use clause specifies the interop-var that is used for the effects of the directive on which the
19 clause appears. However, interop-var is not initialized, destroyed or otherwise modified. The
20 interop-type is inferred based on the interop-type used to initialize interop-var.

21 Cross References
22 • interop directive, see Section 14.1

23 14.2 Interoperability Requirement Set


24 The interoperability requirement set of each task is a logical set of properties that can be added or
25 removed by different directives. These properties can be queried by other constructs that have
26 interoperability semantics.
27 A construct can add the following properties to the set:
28 • depend, which specifies that the construct requires enforcement of the synchronization
29 relationship expressed by the depend clause;



1 • nowait, which specifies that the construct is asynchronous; and
2 • is_device_ptr(list-item), which specifies that the list-item is a device pointer in the construct.
3 The following directives may add properties to the set:
4 • dispatch.
5 The following directives may remove properties from the set:
6 • declare variant.

7 Cross References
8 • Declare Variant Directives, see Section 7.5
9 • dispatch directive, see Section 7.6



1 15 Synchronization Constructs and Clauses
3 A synchronization construct orders the completion of code executed by different threads. This
4 ordering is imposed by synchronizing flush operations that are executed as part of the region that
5 corresponds to the construct.
6 Synchronization through the use of synchronizing flush operations and atomic operations is
7 described in Section 1.4.4 and Section 1.4.6. Section 15.8.6 defines the behavior of synchronizing
8 flush operations that are implied at various other locations in an OpenMP program.

9 15.1 Synchronization Hints


10 The programmer can provide hints about the expected dynamic behavior or suggested
11 implementation of a lock by using omp_init_lock_with_hint or
12 omp_init_nest_lock_with_hint to initialize it. Synchronization hints may also be
13 provided for atomic and critical directives by using the hint clause. The effect of a hint
14 does not change the semantics of the associated construct; if ignoring the hint changes the program
15 semantics, the result is unspecified.

16 Cross References
17 • hint clause, see Section 15.1.2
18 • omp_init_lock_with_hint and omp_init_nest_lock_with_hint, see
19 Section 18.9.2

20 15.1.1 Synchronization Hint Type


21 Synchronization hints are specified with an OpenMP sync_hint type. The C/C++ header file
22 (omp.h) and the Fortran include file (omp_lib.h) and/or Fortran module file (omp_lib) define
23 the valid hint constants. The valid constants must include the following, which can be extended
24 with implementation-defined values:
C / C++
typedef enum omp_sync_hint_t {
  omp_sync_hint_none = 0x0,
  omp_lock_hint_none = omp_sync_hint_none,
  omp_sync_hint_uncontended = 0x1,
  omp_lock_hint_uncontended = omp_sync_hint_uncontended,
  omp_sync_hint_contended = 0x2,
  omp_lock_hint_contended = omp_sync_hint_contended,
  omp_sync_hint_nonspeculative = 0x4,
  omp_lock_hint_nonspeculative = omp_sync_hint_nonspeculative,
  omp_sync_hint_speculative = 0x8,
  omp_lock_hint_speculative = omp_sync_hint_speculative
} omp_sync_hint_t;

typedef omp_sync_hint_t omp_lock_hint_t;
C / C++
Fortran
integer, parameter :: omp_lock_hint_kind = omp_sync_hint_kind

integer (kind=omp_sync_hint_kind), &
  parameter :: omp_sync_hint_none = &
  int(Z'0', kind=omp_sync_hint_kind)
integer (kind=omp_lock_hint_kind), &
  parameter :: omp_lock_hint_none = omp_sync_hint_none
integer (kind=omp_sync_hint_kind), &
  parameter :: omp_sync_hint_uncontended = &
  int(Z'1', kind=omp_sync_hint_kind)
integer (kind=omp_lock_hint_kind), &
  parameter :: omp_lock_hint_uncontended = &
  omp_sync_hint_uncontended
integer (kind=omp_sync_hint_kind), &
  parameter :: omp_sync_hint_contended = &
  int(Z'2', kind=omp_sync_hint_kind)
integer (kind=omp_lock_hint_kind), &
  parameter :: omp_lock_hint_contended = &
  omp_sync_hint_contended
integer (kind=omp_sync_hint_kind), &
  parameter :: omp_sync_hint_nonspeculative = &
  int(Z'4', kind=omp_sync_hint_kind)
integer (kind=omp_lock_hint_kind), &
  parameter :: omp_lock_hint_nonspeculative = &
  omp_sync_hint_nonspeculative
integer (kind=omp_sync_hint_kind), &
  parameter :: omp_sync_hint_speculative = &
  int(Z'8', kind=omp_sync_hint_kind)
integer (kind=omp_lock_hint_kind), &
  parameter :: omp_lock_hint_speculative = &
  omp_sync_hint_speculative
Fortran



1 The hints can be combined by using the + or | operators in C/C++ or the + operator in Fortran.
2 Combining omp_sync_hint_none with any other hint is equivalent to specifying the other hint.
3 The intended meaning of each hint is:
4 • omp_sync_hint_uncontended: low contention is expected in this operation, that is, few
5 threads are expected to perform the operation simultaneously in a manner that requires
6 synchronization;
7 • omp_sync_hint_contended: high contention is expected in this operation, that is, many
8 threads are expected to perform the operation simultaneously in a manner that requires
9 synchronization;
10 • omp_sync_hint_speculative: the programmer suggests that the operation should be
11 implemented using speculative techniques such as transactional memory; and
12 • omp_sync_hint_nonspeculative: the programmer suggests that the operation should
13 not be implemented using speculative techniques such as transactional memory.

15 Note – Future OpenMP specifications may add additional hints to the sync_hint type.
16 Implementers are advised to add implementation-defined hints starting from the most significant bit
17 of the type and to include the name of the implementation in the name of the added hint to avoid
18 name conflicts with other OpenMP implementations.

20 The OpenMP sync_hint and lock_hint types are synonyms for each other. The OpenMP
21 lock_hint type has been deprecated.

22 Restrictions
23 Restrictions to the synchronization hints are as follows:
24 • The hints omp_sync_hint_uncontended and omp_sync_hint_contended cannot
25 be combined.
26 • The hints omp_sync_hint_nonspeculative and omp_sync_hint_speculative
27 cannot be combined.
28 The restrictions for combining multiple values of the OpenMP sync_hint type apply equally to
29 the corresponding values of the OpenMP lock_hint type, and expressions that mix the two
30 types.
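The two combination restrictions can be expressed as a small, non-normative C check. The HINT_* constants below are local mirrors of the values listed in this section, and the function names are hypothetical. A real program would use the omp_sync_hint_* enumerators from omp.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Local mirrors of the standard hint values (assumption: same bit values
 * as the omp_sync_hint_* enumerators shown in this section). */
enum {
    HINT_NONE           = 0x0,
    HINT_UNCONTENDED    = 0x1,
    HINT_CONTENDED      = 0x2,
    HINT_NONSPECULATIVE = 0x4,
    HINT_SPECULATIVE    = 0x8
};

/* Hypothetical checker: true if the standard hint bits in hint obey both
 * restrictions. Implementation-defined bits are ignored by this sketch. */
bool hint_combination_ok(unsigned hint)
{
    bool both_contention =
        (hint & HINT_UNCONTENDED) && (hint & HINT_CONTENDED);
    bool both_speculation =
        (hint & HINT_NONSPECULATIVE) && (hint & HINT_SPECULATIVE);
    return !both_contention && !both_speculation;
}

/* Hypothetical helper: combines two hints with | and asserts validity.
 * Combining HINT_NONE with another hint yields that other hint. */
unsigned combine_hints(unsigned a, unsigned b)
{
    unsigned combined = a | b;
    assert(hint_combination_ok(combined));
    return combined;
}
```

The check is a bitwise test only. It does not decide whether an implementation honors or ignores a given hint.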



1 15.1.2 hint Clause
2 Name: hint Properties: unique

3 Arguments
Name       Type                          Properties
hint-expr  expression of sync_hint type  default

5 Directives
6 atomic, critical

7 Semantics
8 The hint clause gives the implementation additional information about the expected runtime
9 properties of the region that corresponds to the construct on which it appears and that can
10 optionally be used to optimize the implementation. The presence of a hint clause does not affect
11 the semantics of the construct. If no hint clause is specified for a construct that accepts it, the
12 effect is as if hint(omp_sync_hint_none) had been specified.
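As a non-normative illustration, the following C sketch places a hint clause on an atomic construct. The helper name histogram and its parameters are hypothetical. The choice of omp_sync_hint_uncontended is only an assumption about the workload, namely that concurrent updates to the same bin are rare.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>   /* provides omp_sync_hint_uncontended */
#endif

/* Hypothetical helper: counts v[0:n] into bins[0:nbins] by value modulo
 * nbins. The hint does not change the semantics of the atomic update; an
 * implementation may use it or ignore it. */
void histogram(const int *v, int n, int *bins, int nbins)
{
    assert(v != NULL && bins != NULL && nbins > 0);

    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        int b = v[i] % nbins;
        if (b < 0)
            b += nbins;   /* keep the index non-negative */

        #pragma omp atomic update hint(omp_sync_hint_uncontended)
        bins[b] += 1;
    }
}
```

Removing the hint clause leaves the result unchanged, which matches the rule that omitting the clause is equivalent to specifying omp_sync_hint_none.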

13 Restrictions
14 • hint-expr must evaluate to a valid synchronization hint.

15 Cross References
16 • Synchronization Hint Type, see Section 15.1.1
17 • atomic directive, see Section 15.8.4
18 • critical directive, see Section 15.2

19 15.2 critical Construct


Name: critical          Association: block
Category: executable    Properties: thread-limiting

21 Arguments
22 critical(name)
Name  Type                      Properties
name  base language identifier  optional

24 Clauses
25 hint

26 Binding
27 The binding thread set for a critical region is all threads in the contention group.



1 Semantics
2 The name argument is used to identify the critical construct. For any critical construct for
3 which name is not specified, the effect is as if an identical (unspecified) name was specified. The
4 region that corresponds to a critical construct of a given name is executed as if only a single
5 thread at a time among all threads in the contention group executes the region, without regard to the
6 teams to which the threads belong.
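A named critical construct with a hint might look like the following non-normative C sketch. The helper name max_element and the critical name max_update are hypothetical. A reduction clause would normally be the better tool for this computation; the critical construct is used here only to show mutual exclusion.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>   /* provides omp_sync_hint_contended */
#endif

/* Hypothetical helper: returns the largest element of v[0:n], n > 0. */
int max_element(const int *v, int n)
{
    assert(v != NULL && n > 0);
    int best = v[0];

    #pragma omp parallel for
    for (int i = 1; i < n; ++i) {
        /* Only one thread at a time executes a critical region with this
         * name. The name is given because a hint other than
         * omp_sync_hint_none is specified. */
        #pragma omp critical (max_update) hint(omp_sync_hint_contended)
        {
            if (v[i] > best)
                best = v[i];
        }
    }
    return best;
}
```

Every critical construct that uses the name max_update elsewhere in the program would need to specify the same hint value.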
C / C++
7 Identifiers used to identify a critical construct have external linkage and are in a name space
8 that is separate from the name spaces used by labels, tags, members, and ordinary identifiers.
C / C++
Fortran
9 The names of critical constructs are global entities of the program. If a name conflicts with
10 any other entity, the behavior of the program is unspecified.
Fortran
11 Execution Model Events
12 The critical-acquiring event occurs in a thread that encounters the critical construct on entry
13 to the critical region before initiating synchronization for the region.
14 The critical-acquired event occurs in a thread that encounters the critical construct after it
15 enters the region, but before it executes the structured block of the critical region.
16 The critical-released event occurs in a thread that encounters the critical construct after it
17 completes any synchronization on exit from the critical region.

18 Tool Callbacks
19 A thread dispatches a registered ompt_callback_mutex_acquire callback for each
20 occurrence of a critical-acquiring event in that thread. This callback has the type signature
21 ompt_callback_mutex_acquire_t.
22 A thread dispatches a registered ompt_callback_mutex_acquired callback for each
23 occurrence of a critical-acquired event in that thread. This callback has the type signature
24 ompt_callback_mutex_t.
25 A thread dispatches a registered ompt_callback_mutex_released callback for each
26 occurrence of a critical-released event in that thread. This callback has the type signature
27 ompt_callback_mutex_t.
28 The callbacks occur in the task that encounters the critical construct. The callbacks should receive
29 ompt_mutex_critical as their kind argument if practical, but a less specific kind is
30 acceptable.



1 Restrictions
2 Restrictions to the critical construct are as follows:
3 • Unless omp_sync_hint_none is specified, the critical construct must specify a name.
4 • The hint-expr that is applied to each of the critical constructs with the same name must
5 evaluate to the same value.
Fortran
6 • If a name is specified on a critical directive, the same name must also be specified on the
7 end critical directive.
8 • If no name appears on the critical directive, no name can appear on the end critical
9 directive.
Fortran
10 Cross References
11 • ompt_callback_mutex_acquire_t, see Section 19.5.2.14
12 • ompt_callback_mutex_t, see Section 19.5.2.15
13 • ompt_mutex_t, see Section 19.4.4.17
14 • hint clause, see Section 15.1.2

15 15.3 Barriers
16 15.3.1 barrier Construct
17 Name: barrier Association: none Category: executable Properties: default

18 Binding
19 The binding thread set for a barrier region is the current team. A barrier region binds to the
20 innermost enclosing parallel region.

21 Semantics
22 The barrier construct specifies an explicit barrier at the point at which the construct appears.
23 Unless the binding region is canceled, all threads of the team that executes that binding region must
24 enter the barrier region and complete execution of all explicit tasks bound to that binding region
25 before any of the threads continue execution beyond the barrier.
26 The barrier region includes an implicit task scheduling point in the current task region.
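The two-phase pattern that an explicit barrier supports can be sketched in C as follows. This is a minimal example under the assumption that the arrays fit on the stack; the names neighbor_after_barrier and BAR_N are hypothetical.

```c
#include <assert.h>

enum { BAR_N = 16 };   /* hypothetical problem size */

/* Phase 1 fills a[]; phase 2 reads a neighbouring element of a[].
 * The explicit barrier separates the phases because the first loop
 * uses nowait and therefore has no implicit barrier of its own. */
int neighbor_after_barrier(int k)
{
    int a[BAR_N], b[BAR_N];
    assert(k >= 0 && k < BAR_N);

    #pragma omp parallel shared(a, b)
    {
        #pragma omp for nowait
        for (int i = 0; i < BAR_N; i++) {
            a[i] = 2 * i;
        }

        /* Every thread of the team must reach this point before any
         * thread continues, so all of a[] is written. */
        #pragma omp barrier

        #pragma omp for
        for (int i = 0; i < BAR_N; i++) {
            b[i] = a[(i + 1) % BAR_N];
        }
    }
    return b[k];
}
```

The barrier directive appears directly in the parallel region, not inside a worksharing region, so every thread of the team encounters it.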



1 Execution Model Events
2 The explicit-barrier-begin event occurs in each thread that encounters the barrier construct on
3 entry to the barrier region.
4 The explicit-barrier-wait-begin event occurs when a task begins an interval of active or passive
5 waiting in a barrier region.
6 The explicit-barrier-wait-end event occurs when a task ends an interval of active or passive waiting
7 and resumes execution in a barrier region.
8 The explicit-barrier-end event occurs in each thread that encounters the barrier construct after
9 the barrier synchronization on exit from the barrier region.
10 A cancellation event occurs if cancellation is activated at an implicit cancellation point in a
11 barrier region.

12 Tool Callbacks
13 A thread dispatches a registered ompt_callback_sync_region callback with
14 ompt_sync_region_barrier_explicit as its kind argument and ompt_scope_begin
15 as its endpoint argument for each occurrence of an explicit-barrier-begin event. Similarly, a thread
16 dispatches a registered ompt_callback_sync_region callback with
17 ompt_sync_region_barrier_explicit as its kind argument and ompt_scope_end as
18 its endpoint argument for each occurrence of an explicit-barrier-end event. These callbacks occur
19 in the context of the task that encountered the barrier construct and have type signature
20 ompt_callback_sync_region_t.
21 A thread dispatches a registered ompt_callback_sync_region_wait callback with
22 ompt_sync_region_barrier_explicit as its kind argument and ompt_scope_begin
23 as its endpoint argument for each occurrence of an explicit-barrier-wait-begin event. Similarly, a
24 thread dispatches a registered ompt_callback_sync_region_wait callback with
25 ompt_sync_region_barrier_explicit as its kind argument and ompt_scope_end as
26 its endpoint argument for each occurrence of an explicit-barrier-wait-end event. These callbacks
27 occur in the context of the task that encountered the barrier construct and have type signature
28 ompt_callback_sync_region_t.
29 A thread dispatches a registered ompt_callback_cancel callback with
30 ompt_cancel_detected as its flags argument for each occurrence of a cancellation event in
31 that thread. The callback occurs in the context of the encountering task. The callback has type
32 signature ompt_callback_cancel_t.

33 Restrictions
34 Restrictions to the barrier construct are as follows:
35 • Each barrier region must be encountered by all threads in a team or by none at all, unless
36 cancellation has been requested for the innermost enclosing parallel region.
37 • The sequence of worksharing regions and barrier regions encountered must be the same for
38 every thread in a team.



1 Cross References
2 • ompt_callback_cancel_t, see Section 19.5.2.18
3 • ompt_callback_sync_region_t, see Section 19.5.2.13
4 • ompt_scope_endpoint_t, see Section 19.4.4.11
5 • ompt_sync_region_t, see Section 19.4.4.14

6 15.3.2 Implicit Barriers


7 This section describes the OMPT events and tool callbacks associated with implicit barriers, which
8 occur at the end of various regions as defined in the description of the constructs to which they
9 correspond. Implicit barriers are task scheduling points. For a description of task scheduling
10 points, associated events, and tool callbacks, see Section 12.9.

11 Execution Model Events


12 The implicit-barrier-begin event occurs in each implicit task at the beginning of an implicit barrier
13 region.
14 The implicit-barrier-wait-begin event occurs when a task begins an interval of active or passive
15 waiting in an implicit barrier region.
16 The implicit-barrier-wait-end event occurs when a task ends an interval of active or passive waiting and
17 resumes execution of an implicit barrier region.
18 The implicit-barrier-end event occurs in each implicit task after the barrier synchronization on exit
19 from an implicit barrier region.
20 A cancellation event occurs if cancellation is activated at an implicit cancellation point in an
21 implicit barrier region.

22 Tool Callbacks
23 A thread dispatches a registered ompt_callback_sync_region callback for each implicit
24 barrier begin and end event. Similarly, a thread dispatches a registered
25 ompt_callback_sync_region_wait callback for each implicit barrier wait-begin and
26 wait-end event. All callbacks for implicit barrier events execute in the context of the encountering
27 task and have type signature ompt_callback_sync_region_t.
28 For the implicit barrier at the end of a worksharing construct, the kind argument is
29 ompt_sync_region_barrier_implicit_workshare. For the implicit barrier at the end
30 of a parallel region, the kind argument is
31 ompt_sync_region_barrier_implicit_parallel. For an extra barrier added by an
32 OpenMP implementation, the kind argument is
33 ompt_sync_region_barrier_implementation. For a barrier at the end of a teams
34 region, the kind argument is ompt_sync_region_barrier_teams.

1 A thread dispatches a registered ompt_callback_cancel callback with
2 ompt_cancel_detected as its flags argument for each occurrence of a cancellation event in
3 that thread. The callback occurs in the context of the encountering task. The callback has type
4 signature ompt_callback_cancel_t.

5 Restrictions
6 Restrictions to implicit barriers are as follows:
7 • If a thread is in the state ompt_state_wait_barrier_implicit_parallel, a call to
8 ompt_get_parallel_info may return a pointer to a copy of the data object associated
9 with the parallel region rather than a pointer to the associated data object itself. Writing to the
10 data object returned by ompt_get_parallel_info when a thread is in the
11 ompt_state_wait_barrier_implicit_parallel state results in unspecified behavior.

12 Cross References
13 • ompt_callback_cancel_t, see Section 19.5.2.18
14 • ompt_callback_sync_region_t, see Section 19.5.2.13
15 • ompt_cancel_flag_t, see Section 19.4.4.26
16 • ompt_scope_endpoint_t, see Section 19.4.4.11
17 • ompt_sync_region_t, see Section 19.4.4.14

18 15.3.3 Implementation-Specific Barriers


19 An OpenMP implementation can execute implementation-specific barriers that the OpenMP
20 specification does not imply; therefore, no execution model events are bound to them. The
21 implementation can handle these barriers like implicit barriers and dispatch all events as for
22 implicit barriers. These callbacks use ompt_sync_region_barrier_implementation
23 — or ompt_sync_region_barrier, if the implementation cannot make a distinction — as
24 the kind argument when they are dispatched.

25 15.4 taskgroup Construct


26 Name: taskgroup Association: block Category: executable Properties: cancellable

27 Clauses
28 allocate, task_reduction

29 Binding
30 The binding task set of a taskgroup region is all tasks of the current team that are generated in
31 the region. A taskgroup region binds to the innermost enclosing parallel region.



1 Semantics
2 The taskgroup construct specifies a wait on completion of the taskgroup set associated with the
3 taskgroup region. When a thread encounters a taskgroup construct, it starts executing the
4 region.
5 An implicit task scheduling point occurs at the end of the taskgroup region. The current task is
6 suspended at the task scheduling point until all tasks in the taskgroup set complete execution.
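A minimal C sketch of the wait described above: one thread generates several tasks inside a taskgroup region and reads their results only after the region ends. The function name taskgroup_sum and the array parts are hypothetical.

```c
#include <assert.h>

/* Generates four tasks inside a taskgroup and sums their results
 * after the taskgroup region has completed. */
int taskgroup_sum(void)
{
    int parts[4] = {0, 0, 0, 0};
    int total = 0;

    #pragma omp parallel shared(parts, total)
    #pragma omp single
    {
        #pragma omp taskgroup
        {
            for (int i = 0; i < 4; i++) {
                #pragma omp task firstprivate(i) shared(parts)
                parts[i] = (i + 1) * 10;
            }
        }   /* the encountering task waits here for the taskgroup set */

        assert(parts[3] == 40);   /* all generated tasks have completed */
        for (int i = 0; i < 4; i++) {
            total += parts[i];
        }
    }
    return total;
}
```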

7 Execution Model Events


8 The taskgroup-begin event occurs in each thread that encounters the taskgroup construct on
9 entry to the taskgroup region.
10 The taskgroup-wait-begin event occurs when a task begins an interval of active or passive waiting
11 in a taskgroup region.
12 The taskgroup-wait-end event occurs when a task ends an interval of active or passive waiting and
13 resumes execution in a taskgroup region.
14 The taskgroup-end event occurs in each thread that encounters the taskgroup construct after the
15 taskgroup synchronization on exit from the taskgroup region.

16 Tool Callbacks
17 A thread dispatches a registered ompt_callback_sync_region callback with
18 ompt_sync_region_taskgroup as its kind argument and ompt_scope_begin as its
19 endpoint argument for each occurrence of a taskgroup-begin event in the task that encounters the
20 taskgroup construct. Similarly, a thread dispatches a registered
21 ompt_callback_sync_region callback with ompt_sync_region_taskgroup as its
22 kind argument and ompt_scope_end as its endpoint argument for each occurrence of a
23 taskgroup-end event in the task that encounters the taskgroup construct. These callbacks occur
24 in the task that encounters the taskgroup construct and have the type signature
25 ompt_callback_sync_region_t.
26 A thread dispatches a registered ompt_callback_sync_region_wait callback with
27 ompt_sync_region_taskgroup as its kind argument and ompt_scope_begin as its
28 endpoint argument for each occurrence of a taskgroup-wait-begin event. Similarly, a thread
29 dispatches a registered ompt_callback_sync_region_wait callback with
30 ompt_sync_region_taskgroup as its kind argument and ompt_scope_end as its
31 endpoint argument for each occurrence of a taskgroup-wait-end event. These callbacks occur in the
32 context of the task that encounters the taskgroup construct and have type signature
33 ompt_callback_sync_region_t.

34 Cross References
35 • Task Scheduling, see Section 12.9
36 • ompt_callback_sync_region_t, see Section 19.5.2.13
37 • ompt_scope_endpoint_t, see Section 19.4.4.11

1 • ompt_sync_region_t, see Section 19.4.4.14
2 • allocate clause, see Section 6.6
3 • task_reduction clause, see Section 5.5.9

4 15.5 taskwait Construct


5 Name: taskwait Association: none Category: executable Properties: default

6 Clauses
7 depend, nowait

8 Binding
9 The taskwait region binds to the current task region. The binding thread set of the taskwait
10 region is the current team.

11 Semantics
12 The taskwait construct specifies a wait on the completion of child tasks of the current task.
13 If no depend clause is present on the taskwait construct, the current task region is suspended
14 at an implicit task scheduling point associated with the construct. The current task region remains
15 suspended until all child tasks that it generated before the taskwait region complete execution.
16 If one or more depend clauses are present on the taskwait construct and the nowait clause is
17 not also present, the behavior is as if these clauses were applied to a task construct with an empty
18 associated structured block that generates a mergeable and included task. Thus, the current task
19 region is suspended until the predecessor tasks of this task complete execution.
20 If one or more depend clauses are present on the taskwait construct and the nowait clause is
21 also present, the behavior is as if these clauses were applied to a task construct with an empty
22 associated structured block that generates a task for which execution may be deferred. Thus, all
23 predecessor tasks of this task must complete execution before any subsequently generated task that
24 depends on this task starts its execution.
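The first two cases above can be sketched in C as follows. This is a minimal example that assumes a compiler accepting the depend clause on taskwait (OpenMP 5.0 or later); the function names taskwait_sum and taskwait_depend_value are hypothetical.

```c
#include <assert.h>

/* taskwait without depend: waits for all child tasks generated so far. */
int taskwait_sum(void)
{
    int a = 0, b = 0, result = 0;

    #pragma omp parallel shared(a, b, result)
    #pragma omp single
    {
        #pragma omp task shared(a)
        a = 20;

        #pragma omp task shared(b)
        b = 22;

        #pragma omp taskwait
        assert(a == 20 && b == 22);   /* both child tasks have completed */
        result = a + b;
    }
    return result;
}

/* taskwait with depend: waits only for the predecessor task that
 * writes x; the task that writes y may still be pending. */
int taskwait_depend_value(void)
{
    int x = 0, y = 0, seen = 0;

    #pragma omp parallel shared(x, y, seen)
    #pragma omp single
    {
        #pragma omp task depend(out: x) shared(x)
        x = 7;

        #pragma omp task depend(out: y) shared(y)
        y = 9;

        #pragma omp taskwait depend(in: x)
        seen = x;   /* only x is guaranteed to be written here */
    }
    return seen;
}
```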

25 Execution Model Events


26 The taskwait-begin event occurs in a thread when it encounters a taskwait construct with no
27 depend clause on entry to the taskwait region.
28 The taskwait-wait-begin event occurs when a task begins an interval of active or passive waiting in
29 a region corresponding to a taskwait construct with no depend clause.
30 The taskwait-wait-end event occurs when a task ends an interval of active or passive waiting and
31 resumes execution from a region corresponding to a taskwait construct with no depend clause.
32 The taskwait-end event occurs in a thread when it encounters a taskwait construct with no
33 depend clause after the taskwait synchronization on exit from the taskwait region.

1 The taskwait-init event occurs in a thread when it encounters a taskwait construct with one or
2 more depend clauses on entry to the taskwait region.
3 The taskwait-complete event occurs on completion of the dependent task that results from a
4 taskwait construct with one or more depend clauses, in the context of the thread that executes
5 the dependent task and before any subsequently generated task that depends on the dependent task
6 starts its execution.

7 Tool Callbacks
8 A thread dispatches a registered ompt_callback_sync_region callback with
9 ompt_sync_region_taskwait as its kind argument and ompt_scope_begin as its
10 endpoint argument for each occurrence of a taskwait-begin event in the task that encounters the
11 taskwait construct. Similarly, a thread dispatches a registered
12 ompt_callback_sync_region callback with ompt_sync_region_taskwait as its
13 kind argument and ompt_scope_end as its endpoint argument for each occurrence of a
14 taskwait-end event in the task that encounters the taskwait construct. These callbacks occur in
15 the task that encounters the taskwait construct and have the type signature
16 ompt_callback_sync_region_t.
17 A thread dispatches a registered ompt_callback_sync_region_wait callback with
18 ompt_sync_region_taskwait as its kind argument and ompt_scope_begin as its
19 endpoint argument for each occurrence of a taskwait-wait-begin event. Similarly, a thread
20 dispatches a registered ompt_callback_sync_region_wait callback with
21 ompt_sync_region_taskwait as its kind argument and ompt_scope_end as its endpoint
22 argument for each occurrence of a taskwait-wait-end event. These callbacks occur in the context of
23 the task that encounters the taskwait construct and have type signature
24 ompt_callback_sync_region_t.
25 A thread dispatches a registered ompt_callback_task_create callback for each occurrence
26 of a taskwait-init event in the context of the encountering task. This callback has the type signature
27 ompt_callback_task_create_t. In the dispatched callback, (flags &
28 ompt_task_taskwait) always evaluates to true. If the nowait clause is not present,
29 (flags & ompt_task_undeferred) also evaluates to true.
30 A thread dispatches a registered ompt_callback_task_schedule callback for each
31 occurrence of a taskwait-complete event. This callback has the type signature
32 ompt_callback_task_schedule_t with ompt_taskwait_complete as its
33 prior_task_status argument.

34 Restrictions
35 Restrictions to the taskwait construct are as follows:
36 • The mutexinoutset dependence-type may not appear in a depend clause on a taskwait
37 construct.
38 • If the dependence-type of a depend clause is depobj then the dependence objects cannot
39 represent dependences of the mutexinoutset dependence type.

1 • The nowait clause may only appear on a taskwait directive if the depend clause is present.

2 Cross References
3 • ompt_callback_sync_region_t, see Section 19.5.2.13
4 • ompt_scope_endpoint_t, see Section 19.4.4.11
5 • ompt_sync_region_t, see Section 19.4.4.14
6 • depend clause, see Section 15.9.5
7 • nowait clause, see Section 15.6
8 • task directive, see Section 12.5

9 15.6 nowait Clause


10 Name: nowait Properties: unique, end-clause

11 Directives
12 dispatch, do, for, interop, scope, sections, single, target, target enter
13 data, target exit data, target update, taskwait, workshare

14 Semantics
15 The nowait clause overrides any synchronization that would otherwise occur at the end of a
16 construct. It can also specify that an interoperability requirement set includes the nowait property.
17 If the construct includes an implicit barrier, the nowait clause specifies that the barrier will not
18 occur. For constructs that generate a task, the nowait clause specifies that the generated task may
19 be deferred. If the nowait clause is not present on the directive then the generated task is an
20 included task (so it executes synchronously in the context of the encountering task). For constructs
21 that generate an interoperability requirement set, the nowait clause adds the nowait property to
22 the set.
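For the implicit-barrier case, a minimal C sketch is shown here. Two independent worksharing loops run in one parallel region, and the first carries nowait so that threads proceed to the second loop without waiting. The names nowait_pair_sum and NW_N are hypothetical.

```c
#include <assert.h>

enum { NW_N = 8 };   /* hypothetical problem size */

/* The two loops touch different arrays, so no barrier is needed
 * between them. The second loop keeps its implicit barrier. */
int nowait_pair_sum(int k)
{
    int x[NW_N], y[NW_N];
    assert(k >= 0 && k < NW_N);

    #pragma omp parallel shared(x, y)
    {
        #pragma omp for nowait          /* implicit barrier removed */
        for (int i = 0; i < NW_N; i++) {
            x[i] = i;
        }

        #pragma omp for                 /* implicit barrier retained */
        for (int i = 0; i < NW_N; i++) {
            y[i] = 10 * i;
        }
    }
    /* The end of the parallel region guarantees both loops are done. */
    return x[k] + y[k];
}
```

Removing the barrier is safe in this sketch only because the second loop never reads x; a loop that consumed x would need the barrier.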

23 Cross References
24 • dispatch directive, see Section 7.6
25 • do directive, see Section 11.5.2
26 • for directive, see Section 11.5.1
27 • interop directive, see Section 14.1
28 • scope directive, see Section 11.2
29 • sections directive, see Section 11.3
30 • single directive, see Section 11.1
31 • target directive, see Section 13.8

1 • target enter data directive, see Section 13.6
2 • target exit data directive, see Section 13.7
3 • target update directive, see Section 13.9
4 • taskwait directive, see Section 15.5
5 • workshare directive, see Section 11.4

6 15.7 nogroup Clause


7 Name: nogroup Properties: unique

8 Directives
9 taskloop

10 Semantics
11 The nogroup clause overrides any implicit taskgroup that would otherwise enclose the
12 construct.
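A minimal C sketch of the effect: with nogroup, the encountering task does not wait at the end of the taskloop construct, so an explicit taskwait is used before the results are read. The names taskloop_nogroup_total and TL_N are hypothetical.

```c
#include <assert.h>

enum { TL_N = 32 };   /* hypothetical problem size */

/* Fills v[] with a taskloop that has no implicit taskgroup, then
 * waits explicitly for the generated child tasks. */
int taskloop_nogroup_total(void)
{
    int v[TL_N];
    int total = 0;

    #pragma omp parallel shared(v, total)
    #pragma omp single
    {
        #pragma omp taskloop nogroup shared(v)
        for (int i = 0; i < TL_N; i++) {
            v[i] = i;
        }

        /* Unrelated work could overlap with the taskloop tasks here. */

        #pragma omp taskwait            /* wait for the taskloop tasks */
        for (int i = 0; i < TL_N; i++) {
            total += v[i];
        }
    }
    return total;
}
```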

13 Cross References
14 • taskloop directive, see Section 12.6

15 15.8 OpenMP Memory Ordering


16 This section describes constructs and clauses in OpenMP that support ordering of memory
17 operations.

18 15.8.1 memory-order Clauses


19 Clause groups
20 Properties: unique, exclusive, inarguable Members: acq_rel, acquire, relaxed, release, seq_cst

21 Directives
22 atomic, flush

23 Semantics
24 The memory-order clause grouping defines a set of clauses that indicate the memory ordering
25 requirements for the visibility of the effects of the constructs on which they may be specified.



1 Cross References
2 • atomic directive, see Section 15.8.4
3 • flush directive, see Section 15.8.5

4 15.8.2 atomic Clauses


5 Clause groups
6 Properties: unique, exclusive, inarguable Members: read, update, write

7 Directives
8 atomic

9 Semantics
10 The atomic clause grouping defines a set of clauses that defines the semantics for which a directive
11 enforces atomicity. If a construct accepts the atomic clause grouping and no member of the
12 grouping is specified, the effect is as if the update clause is specified.
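The default described above can be sketched in C with an atomic directive that names no member of the atomic clause grouping and therefore behaves as an atomic update. The function name count_multiples_of_three is hypothetical.

```c
#include <assert.h>

/* Counts the multiples of three in [0, n) with a shared counter. */
int count_multiples_of_three(int n)
{
    assert(n >= 0);
    int hits = 0;

    #pragma omp parallel for shared(hits)
    for (int i = 0; i < n; i++) {
        if (i % 3 == 0) {
            /* No read, write, or update clause: treated as update. */
            #pragma omp atomic
            hits++;
        }
    }
    return hits;
}
```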

13 Cross References
14 • atomic directive, see Section 15.8.4

15 15.8.3 extended-atomic Clauses


16 Clause groups
17 Properties: unique Members: capture, compare, fail, weak

18 Directives
19 atomic

20 Semantics
21 The extended-atomic clause grouping defines a set of clauses that extend the atomicity semantics
22 specified by members of the atomic clause grouping. Other than the fail clause, they are
23 inarguable; the fail clause takes a member of the memory-order clause grouping as an argument.
24 The capture clause extends the semantics to capture the value of the variable being updated
25 atomically. The compare clause extends the semantics to perform the atomic update conditionally.
26 The fail clause extends the semantics to specify the memory ordering requirements for any
27 comparison performed by any atomic conditional update that fails. Its argument overrides any other
28 specified memory ordering. If the fail clause is not specified on an atomic conditional update the
29 effect is as if the fail clause is specified with a default argument that depends on the effective
30 memory ordering. If the effective memory ordering is acq_rel, the default argument is
31 acquire. If the effective memory ordering is release, the default argument is relaxed. For
32 any other effective memory ordering, the default argument is equal to that effective memory
33 ordering. The weak clause specifies that the comparison performed by a conditional atomic update
34 may spuriously fail, evaluating to not equal even when the values are equal.



2 Note – Allowing for spurious failure by specifying a weak clause can result in performance gains
3 on some systems when using compare-and-swap in a loop. For cases where a single
4 compare-and-swap would otherwise be sufficient, using a loop over a weak compare-and-swap is
5 unlikely to improve performance.
7 Restrictions
8 Restrictions to the atomic construct are as follows:
9 • acq_rel and release cannot be specified as arguments to the fail clause.
10 Cross References
11 • atomic Clauses, see Section 15.8.2
12 • atomic directive, see Section 15.8.4
13 • memory-order Clauses, see Section 15.8.1

14 15.8.4 atomic Construct


15 Name: atomic Association: block (atomic structured block) Category: executable Properties: simdizable
16 Clause groups
17 atomic, extended-atomic, memory-order
18 Clauses
19 hint
20 This section uses the terminology and symbols defined for OpenMP Atomic Structured Blocks (see
21 Section 4.3.1.3).
22 Binding
23 If the size of x is 8, 16, 32, or 64 bits and x is aligned to a multiple of its size, the binding thread set
24 for the atomic region is all threads on the device. Otherwise, the binding thread set for the
25 atomic region is all threads in the contention group. atomic regions enforce exclusive access
26 with respect to other atomic regions that access the same storage location x among all threads in
27 the binding thread set without regard to the teams to which the threads belong.
28 Semantics
29 The atomic construct ensures that a specific storage location is accessed atomically so that
30 possible simultaneous reads and writes by multiple threads do not result in indeterminate values.
31 The atomic construct with the read clause results in an atomic read of the location designated
32 by x. The atomic construct with the write clause results in an atomic write of the location
33 designated by x. The atomic construct with the update clause results in an atomic update of the
34 location designated by x using the designated operator or intrinsic. Only the read and write of the
35 location designated by x are performed mutually atomically. The evaluation of expr or expr-list
36 need not be atomic with respect to the read or write of the location designated by x. No task
37 scheduling points are allowed between the read and the write of the location designated by x.

1 If the capture clause is present, the atomic update is an atomic captured update — an atomic
2 update to the location designated by x using the designated operator or intrinsic while also
3 capturing the original or final value of the location designated by x with respect to the atomic
4 update. The original or final value of the location designated by x is written in the location
5 designated by v based on the base language semantics of structured block or statements of the
6 atomic construct. Only the read and write of the location designated by x are performed mutually
7 atomically. Neither the evaluation of expr or expr-list, nor the write to the location designated by v,
8 need be atomic with respect to the read or write of the location designated by x.
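An atomic captured update can be sketched in C as a ticket dispenser: each iteration captures the original value of a shared counter (the role of v) while incrementing it (the role of x). The names distinct_tickets and TICKETS are hypothetical.

```c
#include <assert.h>

enum { TICKETS = 64 };   /* hypothetical number of tickets */

/* Every iteration obtains a unique ticket; the function returns how
 * many distinct tickets were handed out. */
int distinct_tickets(void)
{
    int next = 0;
    int seen[TICKETS] = {0};

    #pragma omp parallel for shared(next, seen)
    for (int i = 0; i < TICKETS; i++) {
        int ticket;

        /* Captures the original value of next, then updates it. */
        #pragma omp atomic capture
        { ticket = next; next++; }

        assert(ticket >= 0 && ticket < TICKETS);
        seen[ticket] = 1;   /* distinct tickets, so distinct elements */
    }

    int distinct = 0;
    for (int t = 0; t < TICKETS; t++) {
        distinct += seen[t];
    }
    return distinct;
}
```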
9 If the compare clause is present, the atomic update is an atomic conditional update. For forms
10 that use an equality comparison, the operation is an atomic compare-and-swap. It atomically
11 compares the value of x to e and writes the value of d into the location designated by x if they are
12 equal. Based on the base language semantics of the associated structured block, the original or final
13 value of the location designated by x is written to the location designated by v, which is allowed to
14 be the same location as designated by e, or the result of the comparison is written to the location
15 designated by r. Only the read and write of the location designated by x are performed mutually
16 atomically. Neither the evaluation of either e or d nor writes to the locations designated by v and r
17 need be atomic with respect to the read or write of the location designated by x.
C / C++
18 If the compare clause is present, forms that use ordop are logically an atomic maximum or
19 minimum, but they may be implemented with a compare-and-swap loop with short-circuiting. For
20 forms where statement is cond-expr-stmt, if the result of the condition implies that the value of x
21 does not change then the update may not occur.
C / C++
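A minimal C sketch of an atomic maximum written with the compare clause and an ordop form. It assumes a compiler that implements the compare clause (introduced in OpenMP 5.1); the function name atomic_max is hypothetical.

```c
#include <assert.h>

/* Returns the largest element of values[0..n-1], updating a shared
 * maximum with an atomic conditional update. */
int atomic_max(const int *values, int n)
{
    assert(values != 0 && n > 0);
    int best = values[0];

    #pragma omp parallel for shared(best)
    for (int i = 1; i < n; i++) {
        int v = values[i];   /* evaluated outside the atomic operation */

        /* Writes v to best only if best < v. */
        #pragma omp atomic compare
        if (best < v) { best = v; }
    }
    return best;
}
```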
22 If a memory-order clause is present, or implicitly provided by a requires directive, it specifies
23 the effective memory ordering. Otherwise the effect is as if the relaxed memory ordering clause
24 is specified.
25 The atomic construct may be used to enforce memory consistency between threads, based on the
26 guarantees provided by Section 1.4.6. A strong flush on the location designated by x is performed
27 on entry to and exit from the atomic operation, ensuring that the set of all atomic operations applied
28 to the same location in a race-free program has a total completion order. If the write or update
29 clause is specified, the atomic operation is not an atomic conditional update for which the
30 comparison fails, and the effective memory ordering is release, acq_rel, or seq_cst, the
31 strong flush on entry to the atomic operation is also a release flush. If the read or update clause
32 is specified and the effective memory ordering is acquire, acq_rel, or seq_cst then the
33 strong flush on exit from the atomic operation is also an acquire flush. Therefore, if the effective
34 memory ordering is not relaxed, release and/or acquire flush operations are implied and permit
35 synchronization between the threads without the use of explicit flush directives.
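The implied release and acquire flushes can be sketched in C as a single, non-blocking hand-off. It assumes a compiler that accepts memory-order clauses on atomic (OpenMP 5.0 or later); the function name try_receive is hypothetical. The consumer checks the flag once and does not spin, so the function returns either the payload or -1 when the flag was not yet set.

```c
#include <assert.h>

/* Returns 42 if the consumer observed the flag, otherwise -1. */
int try_receive(void)
{
    int payload = 0;
    int ready = 0;
    int result = -1;

    #pragma omp parallel sections shared(payload, ready, result)
    {
        #pragma omp section
        {
            payload = 42;   /* ordinary write, published by the flag */

            /* The strong flush on entry is also a release flush. */
            #pragma omp atomic write release
            ready = 1;
        }

        #pragma omp section
        {
            int seen;

            /* The strong flush on exit is also an acquire flush. */
            #pragma omp atomic read acquire
            seen = ready;

            if (seen) {
                result = payload;
                assert(result == 42);   /* ordered after the payload write */
            }
        }
    }
    return result;
}
```

Which of the two values is returned depends on scheduling; the sketch only shows that a consumer that observes the flag also observes the payload.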
36 For all forms of the atomic construct, any combination of two or more of these atomic
37 constructs enforces mutually exclusive access to the locations designated by x among threads in the
38 binding thread set. To avoid data races, all accesses of the locations designated by x that could
39 potentially occur in parallel must be protected with an atomic construct.

1 atomic regions do not guarantee exclusive access with respect to any accesses outside of
2 atomic regions to the same storage location x even if those accesses occur during a critical
3 or ordered region, while an OpenMP lock is owned by the executing task, or during the
4 execution of a reduction clause.
5 However, other OpenMP synchronization can ensure the desired exclusive access. For example, a
6 barrier that follows a series of atomic updates to x guarantees that subsequent accesses do not form
7 a race with the atomic accesses.
8 A compliant implementation may enforce exclusive access between atomic regions that update
9 different storage locations. The circumstances under which this occurs are implementation defined.
10 If the storage location designated by x is not size-aligned (that is, if the byte alignment of x is not a
11 multiple of the size of x), then the behavior of the atomic region is implementation defined.

12 Execution Model Events


13 The atomic-acquiring event occurs in the thread that encounters the atomic construct on entry to
14 the atomic region before initiating synchronization for the region.
15 The atomic-acquired event occurs in the thread that encounters the atomic construct after it
16 enters the region, but before it executes the structured block of the atomic region.
17 The atomic-released event occurs in the thread that encounters the atomic construct after it
18 completes any synchronization on exit from the atomic region.

19 Tool Callbacks
20 A thread dispatches a registered ompt_callback_mutex_acquire callback for each
21 occurrence of an atomic-acquiring event in that thread. This callback has the type signature
22 ompt_callback_mutex_acquire_t.
23 A thread dispatches a registered ompt_callback_mutex_acquired callback for each
24 occurrence of an atomic-acquired event in that thread. This callback has the type signature
25 ompt_callback_mutex_t.
26 A thread dispatches a registered ompt_callback_mutex_released callback with
27 ompt_mutex_atomic as the kind argument if practical, although a less specific kind may be
28 used, for each occurrence of an atomic-released event in that thread. This callback has the type
29 signature ompt_callback_mutex_t and occurs in the task that encounters the atomic
30 construct.

31 Restrictions
32 Restrictions to the atomic construct are as follows:
33 • OpenMP constructs may not be encountered during execution of an atomic region.
34 • If a capture or compare clause is specified, the atomic clause must be update.
35 • If a capture clause is specified but the compare clause is not specified, an
36 update-capture-atomic structured block must be associated with the construct.

1 • If both capture and compare clauses are specified, a conditional-update-capture-atomic
2 structured block must be associated with the construct.
3 • If a compare clause is specified but the capture clause is not specified, a
4 conditional-update-atomic structured block must be associated with the construct.
5 • If a write clause is specified, a write-atomic structured block must be associated with the
6 construct.
7 • If a read clause is specified, a read-atomic structured block must be associated with the
8 construct.
9 • If the atomic clause is read then the memory-order clause must not be release.
10 • If the atomic clause is write then the memory-order clause must not be acquire.
11 • The weak clause may only appear if the resulting atomic operation is an atomic conditional
12 update for which the comparison tests for equality.
C / C++
13 • All atomic accesses to the storage locations designated by x throughout the program are required
14 to have a compatible type.
15 • The fail clause may only appear if the resulting atomic operation is an atomic conditional
16 update.
C / C++
Fortran
17 • All atomic accesses to the storage locations designated by x throughout the program are required
18 to have the same type and type parameters.
19 • The fail clause may only appear if the resulting atomic operation is an atomic conditional
20 update or an atomic update where intrinsic-procedure-name is either MAX or MIN.
Fortran
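As an informal illustration of the capture restriction above (a sketch, not normative text), the following C function uses an atomic construct with the capture clause and no compare clause, so the associated block is an update-capture-atomic structured block. The names tickets_are_unique, MAX_TICKETS, and seen are hypothetical and chosen only for this example.

```c
#include <assert.h>

enum { MAX_TICKETS = 1024 };   /* illustrative upper bound, not from the spec */

/* Issues n tickets from a shared counter and reports whether every value
   0..n-1 was handed out exactly once. */
int tickets_are_unique(int n)
{
    int seen[MAX_TICKETS] = {0};
    int next = 0;
    int unique = 1;

    assert(n >= 0 && n <= MAX_TICKETS);

    #pragma omp parallel for shared(next, seen)
    for (int i = 0; i < n; i++) {
        int mine;
        /* capture without compare: update-capture-atomic structured block */
        #pragma omp atomic capture
        { mine = next; next += 1; }
        seen[mine] += 1;   /* no race: each ticket value is captured once */
    }

    for (int t = 0; t < n; t++)
        if (seen[t] != 1)
            unique = 0;
    return unique;
}
```

Because the read of next and its increment form one atomic operation, no two iterations can capture the same value, which is why the unsynchronized update of seen[mine] is safe in this sketch.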
21 Cross References
22 • Lock Routines, see Section 18.9
23 • OpenMP Atomic Structured Blocks, see Section 4.3.1.3
24 • Synchronization Hints, see Section 15.1
25 • ompt_callback_mutex_acquire_t, see Section 19.5.2.14
26 • ompt_callback_mutex_t, see Section 19.5.2.15
27 • ompt_mutex_t, see Section 19.4.4.17
28 • ordered Construct, see Section 15.10
29 • barrier directive, see Section 15.3.1

1 • critical directive, see Section 15.2
2 • flush directive, see Section 15.8.5
3 • hint clause, see Section 15.1.2
4 • requires directive, see Section 8.2

5 15.8.5 flush Construct


6 Name: flush            Association: none
  Category: executable   Properties: default

7 Arguments
8 flush(list)
9 Name   Type                              Properties
  list   list of variable list item type   optional

10 Clause groups
11 memory-order

12 Binding
13 The binding thread set for a flush region is all threads in the device-set of its flush operation.

14 Semantics
15 The flush construct executes the OpenMP flush operation. This operation makes a thread’s
16 temporary view of memory consistent with memory and enforces an order on the memory
17 operations of the variables explicitly specified or implied. Execution of a flush region affects the
18 memory and it affects the temporary view of memory of the encountering thread. It does not affect
19 the temporary view of other threads. Other threads on devices in the device-set must themselves
20 execute a flush operation in order to be guaranteed to observe the effects of the flush operation of
21 the encountering thread. See the memory model description in Section 1.4 for more details.
22 If neither a memory-order clause nor a list argument appears on a flush construct then the
23 behavior is as if the memory-order clause is seq_cst.
24 A flush construct with the seq_cst clause, executed on a given thread, operates as if all data
25 storage blocks that are accessible to the thread are flushed by a strong flush operation. A flush
26 construct with a list applies a strong flush operation to the items in the list, and the flush operation
27 does not complete until the operation is complete for all specified list items. An implementation
28 may implement a flush construct with a list by ignoring the list and treating it the same as a
29 flush construct with the seq_cst clause.
30 If no list items are specified, the flush operation has the release and/or acquire flush properties:
31 • If the memory-order clause is seq_cst or acq_rel, the flush operation is both a release flush
32 and an acquire flush.

1 • If the memory-order clause is release, the flush operation is a release flush.
2 • If the memory-order clause is acquire, the flush operation is an acquire flush.
C / C++
3 If a pointer is present in the list, the pointer itself is flushed, not the memory block to which the
4 pointer refers.
5 A flush construct without a list corresponds to a call to atomic_thread_fence, where the
6 argument is given by the identifier that results from prefixing memory_order_ to the
7 memory-order clause name.
8 For a flush construct without a list, the generated flush region implicitly performs the
9 corresponding call to atomic_thread_fence. The behavior of an explicit call to
10 atomic_thread_fence that occurs in the program and does not have the argument
11 memory_order_consume is as if the call is replaced by its corresponding flush construct.
C / C++
Fortran
12 If the list item or a subobject of the list item has the POINTER attribute, the allocation or
13 association status of the POINTER item is flushed, but the pointer target is not. If the list item is a
14 Cray pointer, the pointer is flushed, but the object to which it points is not. Cray pointer support has
15 been deprecated. If the list item is of type C_PTR, the variable is flushed, but the storage that
16 corresponds to that address is not flushed. If the list item or the subobject of the list item has the
17 ALLOCATABLE attribute and has an allocation status of allocated, the allocated variable is flushed;
18 otherwise the allocation status is flushed.
Fortran
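The semantics above can be sketched informally with a producer/consumer hand-off in C. The example uses flush constructs with neither a list nor a memory-order clause, which behave as if seq_cst were specified and are therefore both release and acquire flushes. The function name handoff_is_consistent and the max_spins bound are hypothetical; the bound exists only so that the sketch cannot hang if a single thread happens to execute the consumer section first.

```c
#include <assert.h>

/* Returns 1 unless the consumer observed the flag but read stale data. */
int handoff_is_consistent(long max_spins)
{
    int data = 0;
    int flag = 0;
    int ok = 1;

    assert(max_spins > 0);

    #pragma omp parallel sections shared(data, flag, ok)
    {
        #pragma omp section
        {
            data = 42;                 /* ordinary write to shared data */
            #pragma omp flush          /* acts as a release flush here */
            #pragma omp atomic write
            flag = 1;                  /* publish through an atomic write */
        }
        #pragma omp section
        {
            int seen = 0;
            for (long tries = 0; tries < max_spins && !seen; tries++) {
                #pragma omp atomic read
                seen = flag;
            }
            if (seen) {
                #pragma omp flush      /* acts as an acquire flush here */
                if (data != 42)
                    ok = 0;
            }
        }
    }
    return ok;
}
```

The flag itself is accessed only through atomic constructs; the flushes order the non-atomic accesses to data around those atomic operations.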
19 Execution Model Events
20 The flush event occurs in a thread that encounters the flush construct.

21 Tool Callbacks
22 A thread dispatches a registered ompt_callback_flush callback for each occurrence of a
23 flush event in that thread. This callback has the type signature ompt_callback_flush_t.

24 Restrictions
25 Restrictions to the flush construct are as follows:
26 • If a memory-order clause is specified, the list argument must not be specified.
27 • The memory-order clause must not be relaxed.

28 Cross References
29 • ompt_callback_flush_t, see Section 19.5.2.17

1 15.8.6 Implicit Flushes
2 Flush operations implied when executing an atomic region are described in Section 15.8.4.
3 A flush region that corresponds to a flush directive with the release clause present is
4 implied at the following locations:
5 • During a barrier region;
6 • At entry to a parallel region;
7 • At entry to a teams region;
8 • At exit from a critical region;
9 • During an omp_unset_lock region;
10 • During an omp_unset_nest_lock region;
11 • During an omp_fulfill_event region;
12 • Immediately before every task scheduling point;
13 • At exit from the task region of each implicit task;
14 • At exit from an ordered region, if a threads clause or a doacross clause with a source
15 dependence type is present, or if no clauses are present; and
16 • During a cancel region, if the cancel-var ICV is true.
17 For a target construct, the device-set of an implicit release flush that is performed in a target task
18 during the generation of the target region and that is performed on exit from the initial task
19 region that implicitly encloses the target region consists of the devices that execute the target
20 task and the target region.
21 A flush region that corresponds to a flush directive with the acquire clause present is
22 implied at the following locations:
23 • During a barrier region;
24 • At exit from a teams region;
25 • At entry to a critical region;
26 • If the region causes the lock to be set, during:
27 – an omp_set_lock region;
28 – an omp_test_lock region;
29 – an omp_set_nest_lock region; and
30 – an omp_test_nest_lock region;
31 • Immediately after every task scheduling point;

1 • At entry to the task region of each implicit task;
2 • At entry to an ordered region, if a threads clause or a doacross clause with a sink
3 dependence type is present, or if no clauses are present; and
4 • Immediately before a cancellation point, if the cancel-var ICV is true and cancellation has been
5 activated.
6 For a target construct, the device-set of an implicit acquire flush that is performed in a target
7 task following the generation of the target region or that is performed on entry to the initial task
8 region that implicitly encloses the target region consists of the devices that execute the target
9 task and the target region.
10

11 Note – A flush region is not implied at the following locations:
12 • At entry to worksharing regions; and
13 • At entry to or exit from masked regions.
14
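The barrier entry that appears in both lists above can be illustrated informally in C. In this sketch one thread writes a shared variable inside a single region with nowait, so no barrier (and hence no flush) is implied by the single construct itself; the explicit barrier then supplies the release and acquire flushes. The function name barrier_publishes_value and the variable names are hypothetical.

```c
/* Returns 1 if no thread read a stale value after the barrier. */
int barrier_publishes_value(void)
{
    int value = 0;
    int stale_reads = 0;

    #pragma omp parallel shared(value, stale_reads)
    {
        /* One thread writes; nowait removes the barrier that would
           otherwise end the single region. */
        #pragma omp single nowait
        value = 7;

        /* Explicit barrier: release and acquire flushes are implied here. */
        #pragma omp barrier

        if (value != 7) {
            #pragma omp atomic update
            stale_reads += 1;
        }
    }
    return stale_reads == 0;
}
```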
15 The synchronization behavior of implicit flushes is as follows:
16 • When a thread executes an atomic region for which the corresponding construct has the
17 release, acq_rel, or seq_cst clause and specifies an atomic operation that starts a given
18 release sequence, the release flush that is performed on entry to the atomic operation
19 synchronizes with an acquire flush that is performed by a different thread and has an associated
20 atomic operation that reads a value written by a modification in the release sequence.
21 • When a thread executes an atomic region for which the corresponding construct has the
22 acquire, acq_rel, or seq_cst clause and specifies an atomic operation that reads a value
23 written by a given modification, a release flush that is performed by a different thread and has an
24 associated release sequence that contains that modification synchronizes with the acquire flush
25 that is performed on exit from the atomic operation.
26 • When a thread executes a critical region that has a given name, the behavior is as if the
27 release flush performed on exit from the region synchronizes with the acquire flush performed on
28 entry to the next critical region with the same name that is performed by a different thread,
29 if it exists.
30 • When a thread team executes a barrier region, the behavior is as if the release flush performed
31 by each thread within the region, and the release flush performed by any other thread upon
32 fulfilling the allow-completion event for a detachable task bound to the binding parallel region of
33 the region, synchronizes with the acquire flush performed by all other threads within the region.
34 • When a thread executes a taskwait region that does not result in the creation of a dependent
35 task and the task that encounters the corresponding taskwait construct has at least one child
36 task, the behavior is as if each thread that executes a child task that is generated before the
1 taskwait region performs a release flush upon completion of the associated structured block
2 of the child task that synchronizes with an acquire flush performed in the taskwait region. If
3 the child task is detachable, the thread that fulfills its allow-completion event performs a release
4 flush upon fulfilling the event that synchronizes with the acquire flush performed in the
5 taskwait region.
6 • When a thread executes a taskgroup region, the behavior is as if each thread that executes a
7 remaining descendent task performs a release flush upon completion of the associated structured
8 block of the descendent task that synchronizes with an acquire flush performed on exit from the
9 taskgroup region. If the descendent task is detachable, the thread that fulfills its
10 allow-completion event performs a release flush upon fulfilling the event that synchronizes with
11 the acquire flush performed in the taskgroup region.
12 • When a thread executes an ordered region that does not arise from a stand-alone ordered
13 directive, the behavior is as if the release flush performed on exit from the region synchronizes
14 with the acquire flush performed on entry to an ordered region encountered in the next logical
15 iteration to be executed by a different thread, if it exists.
16 • When a thread executes an ordered region that arises from a stand-alone ordered directive,
17 the behavior is as if the release flush performed in the ordered region from a given source
18 iteration synchronizes with the acquire flush performed in all ordered regions executed by a
19 different thread that are waiting for dependences on that iteration to be satisfied.
20 • When a thread team begins execution of a parallel region, the behavior is as if the release
21 flush performed by the primary thread on entry to the parallel region synchronizes with the
22 acquire flush performed on entry to each implicit task that is assigned to a different thread.
23 • When an initial thread begins execution of a target region that is generated by a different
24 thread from a target task, the behavior is as if the release flush performed by the generating
25 thread in the target task synchronizes with the acquire flush performed by the initial thread on
26 entry to its initial task region.
27 • When an initial thread completes execution of a target region that is generated by a different
28 thread from a target task, the behavior is as if the release flush performed by the initial thread on
29 exit from its initial task region synchronizes with the acquire flush performed by the generating
30 thread in the target task.
31 • When a thread encounters a teams construct, the behavior is as if the release flush performed by
32 the thread on entry to the teams region synchronizes with the acquire flush performed on entry
33 to each initial task that is executed by a different initial thread that participates in the execution of
34 the teams region.
35 • When a thread that encounters a teams construct reaches the end of the teams region, the
36 behavior is as if the release flush performed by each different participating initial thread at exit
37 from its initial task synchronizes with the acquire flush performed by the thread at exit from the
38 teams region.
39 • When a task generates an explicit task that begins execution on a different thread, the behavior is
1 as if the thread that is executing the generating task performs a release flush that synchronizes
2 with the acquire flush performed by the thread that begins to execute the explicit task.
3 • When an undeferred task completes execution on a given thread that is different from the thread
4 on which its generating task is suspended, the behavior is as if a release flush performed by the
5 thread that completes execution of the associated structured block of the undeferred task
6 synchronizes with an acquire flush performed by the thread that resumes execution of the
7 generating task.
8 • When a dependent task with one or more predecessor tasks begins execution on a given thread,
9 the behavior is as if each release flush performed by a different thread on completion of the
10 associated structured block of a predecessor task synchronizes with the acquire flush performed
11 by the thread that begins to execute the dependent task. If the predecessor task is detachable, the
12 thread that fulfills its allow-completion event performs a release flush upon fulfilling the event
13 that synchronizes with the acquire flush performed when the dependent task begins to execute.
14 • When a task begins execution on a given thread and it is mutually exclusive with respect to
15 another sibling task that is executed by a different thread, the behavior is as if each release flush
16 performed on completion of the sibling task synchronizes with the acquire flush performed by
17 the thread that begins to execute the task.
18 • When a thread executes a cancel region, the cancel-var ICV is true, and cancellation is not
19 already activated for the specified region, the behavior is as if the release flush performed during
20 the cancel region synchronizes with the acquire flush performed by a different thread
21 immediately before a cancellation point in which that thread observes cancellation was activated
22 for the region.
23 • When a thread executes an omp_unset_lock region that causes the specified lock to be unset,
24 the behavior is as if a release flush is performed during the omp_unset_lock region that
25 synchronizes with an acquire flush that is performed during the next omp_set_lock or
26 omp_test_lock region to be executed by a different thread that causes the specified lock to be
27 set.
28 • When a thread executes an omp_unset_nest_lock region that causes the specified nested
29 lock to be unset, the behavior is as if a release flush is performed during the
30 omp_unset_nest_lock region that synchronizes with an acquire flush that is performed
31 during the next omp_set_nest_lock or omp_test_nest_lock region to be executed by
32 a different thread that causes the specified nested lock to be set.
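As an informal sketch of the critical-region rule in the list above, the following C function accumulates into an ordinary shared variable inside a named critical region. The release flush on exit from one region synchronizes with the acquire flush on entry to the next region of the same name, so each thread observes the latest total without any explicit flush. The names sum_with_critical and accumulate are hypothetical.

```c
#include <assert.h>

/* Sums 1..n using a named critical region to protect the shared total. */
long sum_with_critical(int n)
{
    long total = 0;

    assert(n >= 0);

    #pragma omp parallel for shared(total)
    for (int i = 1; i <= n; i++) {
        /* Entry implies an acquire flush, exit implies a release flush. */
        #pragma omp critical(accumulate)
        total += i;
    }
    return total;
}
```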

33 15.9 OpenMP Dependences


34 This section describes constructs and clauses in OpenMP that support the specification and
35 enforcement of dependences. OpenMP supports two kinds of dependences: task dependences,
36 which enforce orderings between tasks; and cross-iteration dependences, which enforce orderings
37 between loop iterations.

1 15.9.1 task-dependence-type Modifier
2 Modifiers
3 Name                  Modifies      Type                                    Properties
  task-dependence-type  locator-list  Keyword: depobj, in, inout, inoutset,   required, ultimate
                                      mutexinoutset, out

4 Clauses
5 depend, update

6 Semantics
7 OpenMP clauses that are related to task dependences use the task-dependence-type modifier to
8 identify the type of dependence relevant to that clause. The effect of the type of dependence is
9 associated with locator list items as described with the depend clause, see Section 15.9.5.

10 Cross References
11 • depend clause, see Section 15.9.5
12 • update clause, see Section 15.9.3

13 15.9.2 Depend Objects


14 OpenMP depend objects can be used to supply user-computed dependences to depend clauses.
15 OpenMP depend objects must be accessed only through the depobj construct or through the
16 depend clause; programs that otherwise access OpenMP depend objects are non-conforming.
17 An OpenMP depend object can be in one of the following states: uninitialized or initialized.
18 Initially, OpenMP depend objects are in the uninitialized state.

19 15.9.3 update Clause


20 Name: update Properties: unique

21 Arguments
22 Name                  Type                                    Properties
   task-dependence-type  Keyword: depobj, in, inout, inoutset,   default
                         mutexinoutset, out

23 Directives
24 depobj

25 Semantics
26 The update clause sets the dependence type of an OpenMP depend object to
27 task-dependence-type.

1 Restrictions
2 Restrictions to the update clause are as follows:
3 • task-dependence-type must not be depobj.

4 Cross References
5 • depobj directive, see Section 15.9.4
6 • task-dependence-type modifier, see Section 15.9.1

7 15.9.4 depobj Construct


8 Name: depobj           Association: none
  Category: executable   Properties: default

9 Arguments
10 depobj(depend-object)
11 Name            Type                      Properties
   depend-object   variable of depend type   default

12 Clauses
13 depend, destroy, update

14 Clause set
15 Properties: unique, required, exclusive Members: depend, destroy, update

16 Binding
17 The binding thread set for a depobj region is the encountering thread.

18 Semantics
19 The depobj construct initializes, updates or destroys an OpenMP depend object. If a depend
20 clause is specified, the state of depend-object is set to initialized and depend-object is set to
21 represent the dependence that the depend clause specifies. If an update clause is specified,
22 depend-object is updated to represent the new dependence type. If a destroy clause is specified,
23 the state of depend-object is set to uninitialized.

24 Restrictions
25 Restrictions to the depobj construct are as follows:
26 • A depend clause on a depobj construct must only specify one locator.
27 • The state of depend-object must be uninitialized if a depend clause is specified.
28 • The state of depend-object must be initialized if a destroy clause or update clause is
29 specified.

1 Cross References
2 • depend clause, see Section 15.9.5
3 • destroy clause, see Section 3.5
4 • task-dependence-type modifier, see Section 15.9.1
5 • update clause, see Section 15.9.3

6 15.9.5 depend Clause


7 Name: depend Properties: default

8 Arguments
9 Name           Type                             Properties
  locator-list   list of locator list item type   default

10 Modifiers
11 Name                  Modifies      Type                                    Properties
   task-dependence-type  locator-list  Keyword: depobj, in, inout, inoutset,   required, ultimate
                                       mutexinoutset, out
   iterator              locator-list  Complex, name: iterator                 unique
                                       Arguments: iterator-specifier OpenMP
                                       expression (repeatable)

12 Directives
13 depobj, interop, target, target enter data, target exit data, target
14 update, task, taskwait

15 Semantics
16 The depend clause enforces additional constraints on the scheduling of tasks. These constraints
17 establish dependences only between sibling tasks. Task dependences are derived from the
18 task-dependence-type and the list items.
19 The storage location of a list item matches the storage location of another list item if they have the
20 same storage location, or if any of the list items is omp_all_memory.
21 For the in task-dependence-type, if the storage location of at least one of the list items matches the
22 storage location of a list item appearing in a depend clause with an out, inout,
23 mutexinoutset, or inoutset task-dependence-type on a construct from which a sibling task
24 was previously generated, then the generated task will be a dependent task of that sibling task.

1 For the out and inout task-dependence-types, if the storage location of at least one of the list
2 items matches the storage location of a list item appearing in a depend clause with an in, out,
3 inout, mutexinoutset, or inoutset task-dependence-type on a construct from which a
4 sibling task was previously generated, then the generated task will be a dependent task of that
5 sibling task.
6 For the mutexinoutset task-dependence-type, if the storage location of at least one of the list
7 items matches the storage location of a list item appearing in a depend clause with an in, out,
8 inout, or inoutset task-dependence-type on a construct from which a sibling task was
9 previously generated, then the generated task will be a dependent task of that sibling task.
10 If a list item appearing in a depend clause with a mutexinoutset task-dependence-type on a
11 task generating construct matches a list item appearing in a depend clause with a
12 mutexinoutset task-dependence-type on a different task generating construct, and both
13 constructs generate sibling tasks, the sibling tasks will be mutually exclusive tasks.
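The mutual-exclusion rule above can be sketched informally in C. Every task below names the same list item with the mutexinoutset task-dependence-type, so the sibling tasks are mutually exclusive and may run in any order, but never concurrently; the plain update of total therefore needs no atomic or critical construct. The function name sum_with_mutexinoutset is hypothetical, and the sketch assumes a compiler that accepts the mutexinoutset keyword.

```c
#include <assert.h>

/* Sums 1..n with one task per term; the tasks exclude one another. */
long sum_with_mutexinoutset(int n)
{
    long total = 0;

    assert(n >= 0);

    #pragma omp parallel shared(total)
    #pragma omp single
    {
        for (int i = 1; i <= n; i++) {
            /* Sibling tasks with matching mutexinoutset items never overlap. */
            #pragma omp task depend(mutexinoutset: total) shared(total) firstprivate(i)
            total += i;
        }
        #pragma omp taskwait   /* wait for all child tasks before returning */
    }
    return total;
}
```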
14 For the inoutset task-dependence-type, if the storage location of at least one of the list items
15 matches the storage location of a list item appearing in a depend clause with an in, out, inout,
16 or mutexinoutset task-dependence-type on a construct from which a sibling task was
17 previously generated, then the generated task will be a dependent task of that sibling task.
18 When the task-dependence-type is depobj, the task dependences are derived from the
19 dependences represented by the depend objects specified in the depend clause as if the depend
20 clauses of the depobj constructs were specified in the current construct.
21 The list items that appear in the depend clause may reference any iterators-identifier defined in its
22 iterator modifier.
23 The list items that appear in the depend clause may include array sections or the
24 omp_all_memory reserved locator.
Fortran
25 If a list item has the ALLOCATABLE attribute and its allocation status is unallocated, the behavior
26 is unspecified. If a list item has the POINTER attribute and its association status is disassociated or
27 undefined, the behavior is unspecified.
Fortran
C / C++
28 The list items that appear in a depend clause may use shape-operators.
C / C++
29

30 Note – The enforced task dependence establishes a synchronization of memory accesses
31 performed by a dependent task with respect to accesses performed by the predecessor tasks.
32 However, the programmer must properly synchronize with respect to other concurrent accesses that
33 occur outside of those tasks.
34
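The in and out rules described in the semantics above can be illustrated informally with three sibling tasks in C. The second task is a dependent task of the first because its in item matches the first task's out item; the third task depends on both. The function name three_stage_pipeline and the variables a, b, and c are hypothetical.

```c
/* Runs three dependent sibling tasks and returns the final value. */
int three_stage_pipeline(void)
{
    int a = 0, b = 0, c = 0;

    #pragma omp parallel shared(a, b, c)
    #pragma omp single
    {
        /* Stage 1 produces a. */
        #pragma omp task depend(out: a) shared(a)
        a = 1;

        /* Stage 2 reads a (in) and produces b (out). */
        #pragma omp task depend(in: a) depend(out: b) shared(a, b)
        b = a + 1;

        /* Stage 3 reads a and b and produces c. */
        #pragma omp task depend(in: a, b) depend(out: c) shared(a, b, c)
        c = a + b;

        #pragma omp taskwait   /* all three child tasks complete here */
    }
    return c;
}
```

All three tasks are generated by the same task region, so they are siblings, which is the only relationship between which the depend clause establishes dependences.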

1 Execution Model Events
2 The task-dependences event occurs in a thread that encounters a task generating construct or a
3 taskwait construct with a depend clause immediately after the task-create event for the new
4 task or the taskwait-init event.
5 The task-dependence event indicates an unfulfilled dependence for the generated task. This event
6 occurs in a thread that observes the unfulfilled dependence before it is satisfied.

7 Tool Callbacks
8 A thread dispatches the ompt_callback_dependences callback for each occurrence of the
9 task-dependences event to announce its dependences with respect to the list items in the depend
10 clause. This callback has type signature ompt_callback_dependences_t.
11 A thread dispatches the ompt_callback_task_dependence callback for a task-dependence
12 event to report a dependence between a predecessor task (src_task_data) and a dependent task
13 (sink_task_data). This callback has type signature ompt_callback_task_dependence_t.

14 Restrictions
15 Restrictions to the depend clause are as follows:
16 • List items, other than reserved locators, used in depend clauses of the same task or sibling tasks
17 must indicate identical storage locations or disjoint storage locations.
18 • List items used in depend clauses cannot be zero-length array sections.
19 • The omp_all_memory reserved locator can only be used in a depend clause with an out or
20 inout task-dependence-type.
21 • Array sections cannot be specified in depend clauses with the depobj task-dependence-type.
22 • List items used in depend clauses with the depobj task-dependence-type must be expressions
23 of the OpenMP depend type that correspond to depend objects in the initialized state.
24 • List items that are expressions of the OpenMP depend type can only be used in depend
25 clauses with the depobj task-dependence-type.
Fortran
26 • A common block name cannot appear in a depend clause.
Fortran
C / C++
27 • A bit-field cannot appear in a depend clause.
C / C++

1 Cross References
2 • Array Sections, see Section 3.2.5
3 • Array Shaping, see Section 3.2.4
4 • ompt_callback_dependences_t, see Section 19.5.2.8
5 • ompt_callback_task_dependence_t, see Section 19.5.2.9
6 • depobj directive, see Section 15.9.4
7 • interop directive, see Section 14.1
8 • iterator modifier, see Section 3.2.6
9 • target directive, see Section 13.8
10 • target enter data directive, see Section 13.6
11 • target exit data directive, see Section 13.7
12 • target update directive, see Section 13.9
13 • task directive, see Section 12.5
14 • task-dependence-type modifier, see Section 15.9.1
15 • taskwait directive, see Section 15.5

16 15.9.6 doacross Clause


17 Name: doacross Properties: required

18 Arguments
19 Name     Type                    Properties
   vector   loop-iteration vector   default

20 Modifiers
21 Name              Modifies   Type                    Properties
   dependence-type   vector     Keyword: sink, source   required

22 Directives
23 ordered

24 Additional information
25 The clause-name depend may be used as a synonym for the clause-name doacross. This use
26 has been deprecated.

1 Semantics
2 The doacross clause identifies cross-iteration dependences that imply additional constraints on
3 the scheduling of loop iterations. These constraints establish dependences only between loop
4 iterations.
5 The source dependence-type specifies the satisfaction of cross-iteration dependences that arise
6 from the current iteration. If the source dependence-type is specified then the vector argument is
7 optional; if vector is omitted, it is assumed to be omp_cur_iteration.
8 The sink dependence-type specifies a cross-iteration dependence, where vector indicates the
9 iteration that satisfies the dependence. If vector does not occur in the iteration space, the
10 doacross clause is ignored. If all doacross clauses on an ordered construct are ignored
11 then the construct is ignored.
12

13 Note – If the sink dependence-type is specified for a vector that does not indicate an earlier
14 iteration of the logical iteration space, deadlock may occur.
15

16 Restrictions
17 Restrictions to the doacross clause are as follows:
18 • If vector is specified without the omp_cur_iteration keyword and it has n dimensions, the
19 innermost loop-associated construct that encloses the construct on which the clause appears must
20 specify an ordered clause for which the parameter value equals n.
21 • If vector is specified with the omp_cur_iteration keyword and with sink as the
22 dependence-type then it must be omp_cur_iteration - 1.
23 • If vector is specified with source as the dependence-type then it must be
24 omp_cur_iteration.
25 • For each element of vector for which the sink dependence-type is specified, if the loop iteration
26 variable var_i has an integral or pointer type, the i-th expression of vector must be computable
27 without overflow in that type for any value of var_i that can encounter the construct on which the
28 doacross clause appears.
C++
29 • For each element of vector for which the sink dependence-type is specified, if the loop iteration
30 variable var_i is of a random access iterator type other than pointer type, the i-th expression of
31 vector must be computable without overflow in the type that would be used by
32 std::distance applied to variables of the type of var_i for any value of var_i that can
33 encounter the construct on which the doacross clause appears.
C++
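As an informal sketch of a two-dimensional vector, the following C function fills a grid in wavefront order: each cell depends on the cell above it and the cell to its left. The vector has two dimensions, so the enclosing loop construct specifies ordered(2). The sketch deliberately uses the deprecated depend synonym so that compilers predating this version can build it; in the syntax of this section the clauses would read doacross(sink: i - 1, j), doacross(sink: i, j - 1), and doacross(source: omp_cur_iteration). The names wavefront_corner, MAX_GRID, and g are hypothetical.

```c
#include <assert.h>

enum { MAX_GRID = 16 };   /* illustrative bound, not from the spec */

/* Fills an n-by-n grid where each interior cell is the sum of its upper
   and left neighbours, and returns the bottom-right cell. */
int wavefront_corner(int n)
{
    int g[MAX_GRID][MAX_GRID];

    assert(n >= 1 && n <= MAX_GRID);

    for (int k = 0; k < n; k++) {
        g[0][k] = 1;   /* top row */
        g[k][0] = 1;   /* left column */
    }

    #pragma omp parallel for ordered(2) shared(g)
    for (int i = 1; i < n; i++)
        for (int j = 1; j < n; j++) {
            /* Sinks outside the iteration space (i - 1 == 0 or j - 1 == 0)
               are ignored, as the semantics above describe. */
            #pragma omp ordered depend(sink: i - 1, j) depend(sink: i, j - 1)
            g[i][j] = g[i - 1][j] + g[i][j - 1];
            #pragma omp ordered depend(source)
        }

    return g[n - 1][n - 1];
}
```

Both sink vectors name lexicographically earlier iterations, which avoids the deadlock that the note above warns about.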

1 Cross References
2 • OpenMP Loop-Iteration Spaces and Vectors, see Section 4.4.2
3 • ordered clause, see Section 4.4.4
4 • ordered directive, see Section 15.10.1

5 15.10 ordered Construct


6 This section describes two forms for the ordered construct, the stand-alone ordered construct
7 and the block-associated ordered construct. Both forms include the execution model events, tool
8 callbacks, and restrictions listed in this section.

9 Execution Model Events


10 The ordered-acquiring event occurs in the task that encounters the ordered construct on entry to
11 the ordered region before it initiates synchronization for the region.
12 The ordered-acquired event occurs in the task that encounters the ordered construct after it
13 enters the region, but before it executes the structured block of the ordered region.
14 The ordered-released event occurs in the task that encounters the ordered construct after it
15 completes any synchronization on exit from the ordered region.

16 Tool Callbacks
17 A thread dispatches a registered ompt_callback_mutex_acquire callback for each
18 occurrence of an ordered-acquiring event in that thread. This callback has the type signature
19 ompt_callback_mutex_acquire_t.
20 A thread dispatches a registered ompt_callback_mutex_acquired callback for each
21 occurrence of an ordered-acquired event in that thread. This callback has the type signature
22 ompt_callback_mutex_t.
23 A thread dispatches a registered ompt_callback_mutex_released callback with
24 ompt_mutex_ordered as the kind argument if practical, although a less specific kind may be
25 used, for each occurrence of an ordered-released event in that thread. This callback has the type
26 signature ompt_callback_mutex_t and occurs in the task that encounters the construct.

27 Restrictions
28 • The construct that corresponds to the binding region of an ordered region must specify an
29 ordered clause.
30 • The construct that corresponds to the binding region of an ordered region must not specify a
31 reduction clause with the inscan modifier.
32 • The regions of a stand-alone ordered construct and a block-associated ordered construct
33 must not have the same binding region.

1 Cross References
2 • ompt_callback_mutex_acquire_t, see Section 19.5.2.14
3 • ompt_callback_mutex_t, see Section 19.5.2.15

4 15.10.1 Stand-alone ordered Construct


5 Name: ordered          Association: none
  Category: executable   Properties: default

6 Clauses
7 doacross

8 Binding
9 The binding thread set for a stand-alone ordered region is the current team. A stand-alone
10 ordered region binds to the innermost enclosing worksharing-loop region.

11 Semantics
12 The stand-alone ordered construct specifies that execution must not violate cross-iteration
13 dependences as specified in the doacross clauses that appear on the construct. When a thread
14 that is executing an iteration encounters an ordered construct with one or more doacross
15 clauses for which the sink dependence-type is specified, the thread waits until its dependences on
16 all valid iterations specified by the doacross clauses are satisfied before it continues execution. A
17 specific dependence is satisfied when a thread that is executing the corresponding iteration
18 encounters an ordered construct with a doacross clause for which the source
19 dependence-type is specified.
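The sink/source pairing described above can be sketched in C as a loop-carried prefix sum. This is an illustrative sketch, not normative text: the function name prefix_sum_doacross is hypothetical, and the code assumes a compiler that accepts the doacross clause spelling of this version (older implementations may accept only the deprecated depend(sink: ...) and depend(source) forms). Iteration i waits on iteration i - 1; the sink that names iteration 0 refers to an iteration outside the iteration space and is therefore not a valid iteration to wait on.

```c
/* Hypothetical helper: in-place prefix sum with a cross-iteration
 * dependence of distance one, expressed with stand-alone ordered
 * constructs. If OpenMP is not enabled, the pragmas are ignored and
 * the loop runs serially with the same result. */
void prefix_sum_doacross(int *a, int n)
{
    /* ordered(1): one loop in the nest carries doacross dependences. */
    #pragma omp parallel for ordered(1)
    for (int i = 1; i < n; i++) {
        /* Wait until iteration i - 1 has signaled its source. */
        #pragma omp ordered doacross(sink: i - 1)
        a[i] += a[i - 1];
        /* Signal that the dependence on this iteration is satisfied. */
        #pragma omp ordered doacross(source:)
    }
}
```

For an input of {1, 2, 3, 4, 5} the array becomes {1, 3, 6, 10, 15}, whether the loop runs in parallel or serially.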

20 Execution Model Events


21 The doacross-sink event occurs in the task that encounters an ordered construct for each
22 doacross clause for which the sink dependence-type is specified after the dependence is
23 fulfilled.
24 The doacross-source event occurs in the task that encounters an ordered construct with a
25 doacross clause for which the source dependence-type is specified before signaling that the
26 dependence has been fulfilled.

27 Tool Callbacks
28 A thread dispatches a registered ompt_callback_dependences callback with all vector
29 entries listed as ompt_dependence_type_sink in the deps argument for each occurrence of a
30 doacross-sink event in that thread. A thread dispatches a registered
31 ompt_callback_dependences callback with all vector entries listed as
32 ompt_dependence_type_source in the deps argument for each occurrence of a
33 doacross-source event in that thread. These callbacks have the type signature
34 ompt_callback_dependences_t.

CHAPTER 15. SYNCHRONIZATION CONSTRUCTS AND CLAUSES 329


1 Restrictions
2 Additional restrictions to the stand-alone ordered construct are as follows:
3 • At most one doacross clause may appear on the construct with source as the
4 dependence-type.
5 • All doacross clauses that appear on the construct must specify the same dependence-type.
6 • The construct must not be an orphaned construct.

7 Cross References
8 • Worksharing-Loop Constructs, see Section 11.5
9 • ompt_callback_dependences_t, see Section 19.5.2.8
10 • doacross clause, see Section 15.9.6

11 15.10.2 Block-associated ordered Construct


Name: ordered Association: block
12
Category: executable Properties: simdizable, thread-limiting

13 Clause groups
14 parallelization-level

15 Binding
16 The binding thread set for a block-associated ordered region is the current team. A
17 block-associated ordered region binds to the innermost enclosing worksharing-loop, simd or
18 worksharing-loop SIMD region.

19 Semantics
20 If no clauses are specified, the effect is as if the threads parallelization-level clause was
21 specified. If the threads clause is specified, the threads in the team that is executing the
22 worksharing-loop region execute ordered regions sequentially in the order of the loop iterations.
23 If the simd parallelization-level clause is specified, the ordered regions encountered by any
24 thread will execute one at a time in the order of the loop iterations. With either
25 parallelization-level, execution of code outside the region for different iterations can run in parallel;
26 execution of that code within the same iteration must observe any constraints imposed by the
27 base-language semantics.
28 When the thread that is executing the first iteration of the loop encounters an ordered construct,
29 it can enter the ordered region without waiting. When a thread that is executing any subsequent
30 iteration encounters a block-associated ordered construct, it waits at the beginning of the
31 ordered region until execution of all ordered regions that belong to all previous iterations has
32 completed. ordered regions that bind to different regions execute independently of each other.
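As an illustration of these semantics, the following C sketch computes values concurrently and appends them to an output array inside a block-associated ordered region, so that the appends occur in iteration order. The function name collect_in_order and the schedule(static, 1) choice are assumptions made for the example only.

```c
/* Hypothetical helper: squares each input element and stores the
 * results in iteration order. Returns the number of stored values.
 * If OpenMP is not enabled, the pragmas are ignored and the loop
 * runs serially with the same result. */
int collect_in_order(const int *in, int *out, int n)
{
    int count = 0;
    /* The ordered clause is required on the binding worksharing-loop. */
    #pragma omp parallel for ordered schedule(static, 1)
    for (int i = 0; i < n; i++) {
        int sq = in[i] * in[i];   /* outside the region: may run in parallel */
        #pragma omp ordered
        {
            /* Executed one iteration at a time, in loop order. */
            out[count++] = sq;
        }
    }
    return count;
}
```

Each iteration executes at most one ordered region, which matches the restriction that a thread must not execute more than one block-associated ordered region per logical iteration.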



1 Restrictions
2 Additional restrictions to the block-associated ordered construct are as follows:
3 • The construct is simdizable only if the simd parallelization-level is specified.
4 • If the simd parallelization-level is specified, the binding region must be a simd region or one
5 that corresponds to a combined or composite construct for which the simd construct is a leaf
6 construct.
7 • If the threads parallelization-level is specified, the binding region must be a
8 worksharing-loop region or one that corresponds to a combined or composite construct for which
9 the worksharing-loop is a leaf construct.
10 • If the threads parallelization-level is specified and the binding region corresponds to a
11 combined or composite construct, then the simd construct must not be a leaf construct unless the
12 simd parallelization-level is also specified.
13 • During execution of the logical iteration of a loop-associated construct, a thread must not execute
14 more than one block-associated ordered region that binds to the corresponding region of the
15 loop-associated construct.
16 • An ordered clause with a parameter value equal to one must appear on the construct that
17 corresponds to the binding region.

18 Cross References
19 • Worksharing-Loop Constructs, see Section 11.5
20 • ordered clause, see Section 4.4.4
21 • parallelization-level Clauses, see Section 15.10.3
22 • simd directive, see Section 10.4

23 15.10.3 parallelization-level Clauses


24 Clause groups
25 Properties: unique, inarguable Members: simd, threads

26 Directives
27 ordered

28 Semantics
29 The parallelization-level clause grouping defines a set of clauses that indicate the level of
30 parallelization with which to associate a construct.

31 Cross References
32 • ordered directive, see Section 15.10.2



1 16 Cancellation Constructs
2 This chapter defines constructs related to cancellation of OpenMP regions.

3 16.1 cancel Construct


Name: cancel Association: none
4
Category: executable Properties: default

5 Clauses
6 if, do, for, parallel, sections, taskgroup

7 Additional information
8 The cancel-directive-name clause set consists of the directive-name of each directive that has the
9 cancellable property (i.e., directive-name for the worksharing-loop construct, parallel,
10 sections and taskgroup). This clause set has the required, unique and exclusive properties.

11 Binding
12 The binding thread set of the cancel region is the current team. The binding region of the
13 cancel region is the innermost enclosing region of the type that corresponds to
14 cancel-directive-name.

15 Semantics
16 The cancel construct activates cancellation of the innermost enclosing region of the type
17 specified by cancel-directive-name, which must be the directive-name of a cancellable construct.
18 Cancellation of the binding region is activated only if the cancel-var ICV is true, in which case the
19 cancel construct causes the encountering task to continue execution at the end of the binding
20 region if cancel-directive-name is not taskgroup. If the cancel-var ICV is true and
21 cancel-directive-name is taskgroup, the encountering task continues execution at the end of the
22 current task region. If the cancel-var ICV is false, the cancel construct is ignored.
23 Threads check for active cancellation only at cancellation points that are implied at the following
24 locations:
25 • cancel regions;
26 • cancellation point regions;
27 • barrier regions;

1 • at the end of a worksharing-loop construct with a nowait clause and for which the same list
2 item appears in both firstprivate and lastprivate clauses; and
3 • implicit barrier regions.
4 When a thread reaches one of the above cancellation points and if the cancel-var ICV is true, then:
5 • If the thread is at a cancel or cancellation point region and cancel-directive-name is
6 not taskgroup, the thread continues execution at the end of the canceled region if cancellation
7 has been activated for the innermost enclosing region of the type specified.
8 • If the thread is at a cancel or cancellation point region and cancel-directive-name is
9 taskgroup, the encountering task checks for active cancellation of all of the taskgroup sets to
10 which the encountering task belongs, and continues execution at the end of the current task
11 region if cancellation has been activated for any of the taskgroup sets.
12 • If the encountering task is at a barrier region or at the end of a worksharing-loop construct with a
13 nowait clause and for which the same list item appears in both firstprivate and
14 lastprivate clauses, the encountering task checks for active cancellation of the innermost
15 enclosing parallel region. If cancellation has been activated, then the encountering task
16 continues execution at the end of the canceled region.
17 When cancellation of tasks is activated through a cancel construct with taskgroup for
18 cancel-directive-name, the tasks that belong to the taskgroup set of the innermost enclosing
19 taskgroup region will be canceled. The task that encountered that construct continues execution
20 at the end of its task region, which implies completion of that task. Any task that belongs to the
21 innermost enclosing taskgroup and has already begun execution must run to completion or until
22 a cancellation point is reached. Upon reaching a cancellation point and if cancellation is active, the
23 task continues execution at the end of its task region, which implies the completion of the task. Any
24 task that belongs to the innermost enclosing taskgroup and that has not begun execution may be
25 discarded, which implies its completion.
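The taskgroup cancellation semantics above can be sketched in C as follows. The function name any_negative is hypothetical. The result is the same whether or not cancellation is enabled: if the cancel-var ICV is false, the cancel construct is ignored and every task runs to completion; if it is true, tasks of the taskgroup set that have not yet begun execution may be discarded.

```c
/* Hypothetical helper: returns 1 if any element is negative.
 * If OpenMP is not enabled, the pragmas are ignored and the code
 * runs serially with the same result. */
int any_negative(const int *data, int n)
{
    int found = 0;
    #pragma omp parallel
    #pragma omp single
    #pragma omp taskgroup
    {
        for (int i = 0; i < n; i++) {
            #pragma omp task shared(found) firstprivate(i)
            {
                if (data[i] < 0) {
                    #pragma omp atomic write
                    found = 1;
                    /* Closely nested in a task; the cancel region is
                     * closely nested in the taskgroup region. */
                    #pragma omp cancel taskgroup
                }
            }
        }
    }
    return found;
}
```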
26 When cancellation of tasks is activated through a cancel construct with cancel-directive-name
27 other than taskgroup, each thread of the binding thread set resumes execution at the end of the
28 canceled region if a cancellation point is encountered. If the canceled region is a parallel region,
29 any tasks that have been created by a task or a taskloop construct and their descendent tasks
30 are canceled according to the above taskgroup cancellation semantics. If the canceled region is
31 not a parallel region, no task cancellation occurs.
C++
32 The usual C++ rules for object destruction are followed when cancellation is performed.
C++
Fortran
33 All private objects or subobjects with ALLOCATABLE attribute that are allocated inside the
34 canceled construct are deallocated.
Fortran

CHAPTER 16. CANCELLATION CONSTRUCTS 333


1 If the canceled construct contains a reduction-scoping or lastprivate clause, the final values of
2 the list items that appeared in those clauses are undefined.
3 When an if clause is present on a cancel construct and the if expression evaluates to false, the
4 cancel construct does not activate cancellation. The cancellation point associated with the
5 cancel construct is always encountered regardless of the value of the if expression.
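A minimal C sketch of the if clause behavior, assuming a hypothetical function find_match and an input in which the searched value occurs at most once: the cancel construct is encountered in every iteration, so it always acts as a cancellation point, but it activates cancellation only in the iteration in which the if expression is true.

```c
/* Hypothetical helper: returns the index of needle in data, or -1.
 * Assumes needle occurs at most once, so the result is deterministic.
 * Cancellation takes effect only if the cancel-var ICV is true;
 * otherwise all iterations run and the result is unchanged. */
int find_match(const int *data, int n, int needle)
{
    int found = -1;
    #pragma omp parallel for shared(found)
    for (int i = 0; i < n; i++) {
        int hit = (data[i] == needle);
        if (hit) {
            #pragma omp atomic write
            found = i;
        }
        /* Activates cancellation only when hit is nonzero; the
         * cancellation point itself is encountered in every iteration. */
        #pragma omp cancel for if(hit)
    }
    return found;
}
```

The worksharing-loop carries no nowait, ordered or inscan reduction clause, in line with the restrictions on canceled worksharing constructs.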
6
7 Note – The programmer is responsible for releasing locks and other synchronization data
8 structures that might cause a deadlock when a cancel construct is encountered and blocked
9 threads cannot be canceled. The programmer is also responsible for ensuring proper
10 synchronizations to avoid deadlocks that might arise from cancellation of OpenMP regions that
11 contain OpenMP synchronization constructs.
12

13 Execution Model Events


14 If a task encounters a cancel construct that will activate cancellation then a cancel event occurs.
15 A discarded-task event occurs for any discarded tasks.

16 Tool Callbacks
17 A thread dispatches a registered ompt_callback_cancel callback for each occurrence of a
18 cancel event in the context of the encountering task. This callback has type signature
19 ompt_callback_cancel_t; (flags & ompt_cancel_activated) always evaluates to
20 true in the dispatched callback; (flags & ompt_cancel_parallel) evaluates to true in the
21 dispatched callback if cancel-directive-name is parallel;
22 (flags & ompt_cancel_sections) evaluates to true in the dispatched callback if
23 cancel-directive-name is sections; (flags & ompt_cancel_loop) evaluates to true in the
24 dispatched callback if cancel-directive-name is for or do; and
25 (flags & ompt_cancel_taskgroup) evaluates to true in the dispatched callback if
26 cancel-directive-name is taskgroup.
27 A thread dispatches a registered ompt_callback_cancel callback with the ompt_data_t
28 associated with the discarded task as its task_data argument and
29 ompt_cancel_discarded_task as its flags argument for each occurrence of a
30 discarded-task event. The callback occurs in the context of the task that discards the task and has
31 type signature ompt_callback_cancel_t.

32 Restrictions
33 Restrictions to the cancel construct are as follows:
34 • The behavior for concurrent cancellation of a region and a region nested within it is unspecified.
35 • If cancel-directive-name is taskgroup, the cancel construct must be closely nested inside a
36 task or a taskloop construct and the cancel region must be closely nested inside a
37 taskgroup region.



1 • If cancel-directive-name is not taskgroup, the cancel construct must be closely nested
2 inside an OpenMP construct that matches cancel-directive-name.
3 • A worksharing construct that is canceled must not have a nowait clause or a reduction
4 clause with a user-defined reduction that uses omp_orig in the initializer-expr of the
5 corresponding declare reduction directive.
6 • A worksharing-loop construct that is canceled must not have an ordered clause or a
7 reduction clause with the inscan modifier.
8 • When cancellation is active for a parallel region, a thread in the team that binds to that
9 region may not be executing or encounter a worksharing construct with an ordered clause, a
10 reduction clause with the inscan modifier or a reduction clause with a user-defined
11 reduction that uses omp_orig in the initializer-expr of the corresponding
12 declare reduction directive.
13 • When cancellation is active for a parallel region, a thread in the team that binds to that
14 region may not be executing or encounter a scope construct with a reduction clause with a
15 user-defined reduction that uses omp_orig in the initializer-expr of the corresponding
16 declare reduction directive.
17 • During execution of a construct that may be subject to cancellation, a thread must not encounter
18 an orphaned cancellation point. That is, a cancellation point must only be encountered within
19 that construct and must not be encountered elsewhere in its region.

20 Cross References
21 • omp_get_cancellation, see Section 18.2.8
22 • ompt_callback_cancel_t, see Section 19.5.2.18
23 • ompt_cancel_flag_t, see Section 19.4.4.26
24 • barrier directive, see Section 15.3.1
25 • cancel-var ICV, see Table 2.1
26 • cancellation point directive, see Section 16.2
27 • declare reduction directive, see Section 5.5.11
28 • do directive, see Section 11.5.2
29 • firstprivate clause, see Section 5.4.4
30 • for directive, see Section 11.5.1
31 • if clause, see Section 3.4
32 • nowait clause, see Section 15.6
33 • ordered clause, see Section 4.4.4
34 • parallel directive, see Section 10.1



1 • private clause, see Section 5.4.3
2 • reduction clause, see Section 5.5.8
3 • sections directive, see Section 11.3
4 • task directive, see Section 12.5
5 • taskgroup directive, see Section 15.4

6 16.2 cancellation point Construct


Name: cancellation point Association: none
7
Category: executable Properties: default

8 Clauses
9 do, for, parallel, sections, taskgroup

10 Additional information
11 The cancel-directive-name clause set consists of the directive-name of each directive that has the
12 cancellable property (i.e., directive-name for the worksharing-loop construct, parallel,
13 sections and taskgroup). This clause set has the required, unique and exclusive properties.

14 Binding
15 The binding thread set of the cancellation point construct is the current team. The binding
16 region of the cancellation point region is the innermost enclosing region of the type that
17 corresponds to cancel-directive-name.

18 Semantics
19 The cancellation point construct introduces a user-defined cancellation point at which an
20 implicit or explicit task must check if cancellation of the innermost enclosing region of the type
21 specified by cancel-directive-name, which must be the directive-name of a cancellable construct,
22 has been activated. This construct does not implement any synchronization between threads or
23 tasks. When an implicit or explicit task reaches a user-defined cancellation point and if the
24 cancel-var ICV is true, then:
25 • If the cancel-directive-name of the encountered cancellation point construct is not
26 taskgroup, the thread continues execution at the end of the canceled region if cancellation has
27 been activated for the innermost enclosing region of the type specified.
28 • If the cancel-directive-name of the encountered cancellation point construct is
29 taskgroup, the encountering task checks for active cancellation of all taskgroup sets to which
30 the encountering task belongs and continues execution at the end of the current task region if
31 cancellation has been activated for any of them.
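As a sketch of a user-defined cancellation point, the following C function checks for active cancellation at the top of each iteration, so that threads stop taking new iterations once any thread has activated cancellation of the worksharing-loop. The function name all_valid is hypothetical, and the validity test (non-negative values) is an assumption chosen for brevity.

```c
/* Hypothetical helper: returns 1 if every element is non-negative,
 * otherwise 0. The result does not depend on whether cancellation
 * is enabled; cancellation only avoids unnecessary iterations. */
int all_valid(const int *data, int n)
{
    int ok = 1;
    #pragma omp parallel for shared(ok)
    for (int i = 0; i < n; i++) {
        /* Continue at the end of the loop region if cancellation of
         * the innermost enclosing worksharing-loop is active. */
        #pragma omp cancellation point for
        if (data[i] < 0) {
            #pragma omp atomic write
            ok = 0;
            #pragma omp cancel for
        }
    }
    return ok;
}
```

The cancellation point performs no synchronization between threads; it only lets the encountering thread observe cancellation that another thread has already activated.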



1 Execution Model Events
2 The cancellation event occurs if a task encounters a cancellation point and detects the activation of
3 cancellation.

4 Tool Callbacks
5 A thread dispatches a registered ompt_callback_cancel callback for each occurrence of a
6 cancellation event in the context of the encountering task. This callback has type signature
7 ompt_callback_cancel_t; (flags & ompt_cancel_detected) always evaluates to true
8 in the dispatched callback; (flags & ompt_cancel_parallel) evaluates to true in the
9 dispatched callback if cancel-directive-name of the encountered cancellation point
10 construct is parallel; (flags & ompt_cancel_sections) evaluates to true in the
11 dispatched callback if cancel-directive-name of the encountered cancellation point
12 construct is sections; (flags & ompt_cancel_loop) evaluates to true in the dispatched
13 callback if cancel-directive-name of the encountered cancellation point construct is for
14 or do; and (flags & ompt_cancel_taskgroup) evaluates to true in the dispatched callback if
15 cancel-directive-name of the encountered cancellation point construct is taskgroup.

16 Restrictions
17 Restrictions to the cancellation point construct are as follows:
18 • A cancellation point construct for which cancel-directive-name is taskgroup must be
19 closely nested inside a task or taskloop construct, and the cancellation point region
20 must be closely nested inside a taskgroup region.
21 • A cancellation point construct for which cancel-directive-name is not taskgroup must
22 be closely nested inside an OpenMP construct that matches cancel-directive-name.

23 Cross References
24 • omp_get_cancellation, see Section 18.2.8
25 • ompt_callback_cancel_t, see Section 19.5.2.18
26 • cancel-var ICV, see Table 2.1
27 • do directive, see Section 11.5.2
28 • for directive, see Section 11.5.1
29 • parallel directive, see Section 10.1
30 • sections directive, see Section 11.3
31 • taskgroup directive, see Section 15.4



1 17 Composition of Constructs
2 This chapter defines rules and mechanisms for nesting regions and for combining constructs.

3 17.1 Nesting of Regions


4 This section describes a set of restrictions on the nesting of regions. The restrictions on nesting are
5 as follows:
6 • A worksharing region may not be closely nested inside a worksharing, task, taskloop,
7 critical, ordered, atomic, or masked region.
8 • A barrier region may not be closely nested inside a worksharing, task, taskloop,
9 critical, ordered, atomic, or masked region.
10 • A masked region may not be closely nested inside a worksharing, atomic, task, or
11 taskloop region.
12 • An ordered region that corresponds to an ordered construct without any clause or with the
13 threads or depend clause may not be closely nested inside a critical, ordered, loop,
14 atomic, task, or taskloop region.
15 • An ordered region that corresponds to an ordered construct without the simd clause
16 specified must be closely nested inside a worksharing-loop region.
17 • An ordered region that corresponds to an ordered construct with the simd clause specified
18 must be closely nested inside a simd or worksharing-loop SIMD region.
19 • An ordered region that corresponds to an ordered construct with both the simd and
20 threads clauses must be closely nested inside a worksharing-loop SIMD region or closely
21 nested inside a worksharing-loop and simd region.
22 • A critical region may not be nested (closely or otherwise) inside a critical region with
23 the same name. This restriction is not sufficient to prevent deadlock.
24 • OpenMP constructs may not be encountered during execution of an atomic region.
25 • The only OpenMP constructs that can be encountered during execution of a simd (or
26 worksharing-loop SIMD) region are the atomic construct, the loop construct without a
27 defined binding region, the simd construct and the ordered construct with the simd clause.
28 • If a target update, target data, target enter data, or target exit data
29 construct is encountered during execution of a target region, the behavior is unspecified.

1 • If a target construct is encountered during execution of a target region and a device
2 clause in which the ancestor device-modifier appears is not present on the construct, the
3 behavior is unspecified.
4 • A teams region must be strictly nested either within the implicit parallel region that surrounds
5 the whole OpenMP program or within a target region. If a teams construct is nested within
6 a target construct, that target construct must contain no statements, declarations or
7 directives outside of the teams construct.
8 • distribute regions, including any distribute regions arising from composite constructs,
9 parallel regions, including any parallel regions arising from combined constructs, loop
10 regions, omp_get_num_teams() regions, and omp_get_team_num() regions are the
11 only OpenMP regions that may be strictly nested inside the teams region.
12 • A loop region that binds to a teams region must be strictly nested inside a teams region.
13 • A distribute region must be strictly nested inside a teams region.
14 • If cancel-directive-name is taskgroup, the cancel construct must be closely nested inside a
15 task or a taskloop construct and the cancel region must be closely nested inside a taskgroup region.
16 Otherwise, the cancel construct must be closely nested inside an OpenMP construct for which
17 directive-name is cancel-directive-name.
18 • A cancellation point construct for which cancel-directive-name is taskgroup must be
19 closely nested inside a task or taskloop construct, and the cancellation point region must be closely
20 nested inside a taskgroup region. Otherwise, a cancellation point construct must be
21 closely nested inside an OpenMP construct for which directive-name is cancel-directive-name.
22 • The only constructs that may be encountered inside a region that corresponds to a construct with
23 an order clause that specifies concurrent are the loop, parallel and simd constructs,
24 and combined constructs for which directive-name-A is parallel.
25 • A region that corresponds to a construct with an order clause that specifies concurrent may
26 not contain calls to the OpenMP Runtime API or to procedures that contain OpenMP directives.

27 17.2 Clauses on Combined and Composite Constructs
29 This section specifies the handling of clauses on combined or composite constructs and the
30 handling of implicit clauses from variables with predetermined data sharing if they are not
31 predetermined only on a particular construct. Some clauses are permitted only on a single leaf
32 construct of the combined or composite construct, in which case the effect is as if the clause is
33 applied to that specific construct. Other clauses that are permitted on more than one leaf construct
34 have the effect as if they are applied to a subset of those constructs, as detailed in this section.
35 The collapse clause is applied once to the combined or composite construct.

CHAPTER 17. COMPOSITION OF CONSTRUCTS 339


1 The effect of the private clause is as if it is applied only to the innermost leaf construct that
2 permits it.
3 The effect of the firstprivate clause is as if it is applied to one or more leaf constructs as
4 follows:
5 • To the distribute construct if it is among the constituent constructs;
6 • To the teams construct if it is among the constituent constructs and the distribute
7 construct is not;
8 • To a worksharing construct that accepts the clause if one is among the constituent constructs;
9 • To the taskloop construct if it is among the constituent constructs;
10 • To the parallel construct if it is among the constituent constructs and neither a taskloop
11 construct nor a worksharing construct that accepts the clause is among them;
12 • To the target construct if it is among the constituent constructs and the same list item neither
13 appears in a lastprivate clause nor is the base variable or base pointer of a list item that
14 appears in a map clause.
15 If the parallel construct is among the constituent constructs and the effect is not as if the
16 firstprivate clause is applied to it by the above rules, then the effect is as if the shared
17 clause with the same list item is applied to the parallel construct. If the teams construct is
18 among the constituent constructs and the effect is not as if the firstprivate clause is applied to
19 it by the above rules, then the effect is as if the shared clause with the same list item is applied to
20 the teams construct.
21 The effect of the lastprivate clause is as if it is applied to all leaf constructs that permit the
22 clause. If the parallel construct is among the constituent constructs and the list item is not also
23 specified in the firstprivate clause, then the effect of the lastprivate clause is as if the
24 shared clause with the same list item is applied to the parallel construct. If the teams
25 construct is among the constituent constructs and the list item is not also specified in the
26 firstprivate clause, then the effect of the lastprivate clause is as if the shared clause
27 with the same list item is applied to the teams construct. If the target construct is among the
28 constituent constructs and the list item is not the base variable or base pointer of a list item that
29 appears in a map clause, the effect of the lastprivate clause is as if the same list item appears
30 in a map clause with a map-type of tofrom.
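The lastprivate rule for a combined construct can be illustrated by writing the same loop twice in C: once with the combined parallel for construct, and once with the clauses split by hand as described above (lastprivate on the worksharing-loop, shared on the parallel construct). The function names are hypothetical, and the sketch assumes n is at least 1 so that a sequentially last iteration exists.

```c
/* Combined form: lastprivate appears on the combined construct. */
int last_square_combined(int n)
{
    int last = -1;
    #pragma omp parallel for lastprivate(last)
    for (int i = 0; i < n; i++) {
        last = i * i;
    }
    return last;
}

/* Explicit form: the effect is as if lastprivate is applied to the
 * worksharing-loop and shared is applied to the parallel construct. */
int last_square_nested(int n)
{
    int last = -1;
    #pragma omp parallel shared(last)
    {
        #pragma omp for lastprivate(last)
        for (int i = 0; i < n; i++) {
            last = i * i;
        }
    }
    return last;
}
```

Both functions return the value assigned in the sequentially last iteration, (n - 1) * (n - 1).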
31 The effect of the shared, default, thread_limit, or order clause is as if it is applied to
32 all leaf constructs that permit the clause.
33 The effect of the allocate clause is as if it is applied to all leaf constructs that permit the clause
34 and to which a data-sharing attribute clause that may create a private copy of the same list item is
35 applied.
36 The effect of the reduction clause is as if it is applied to all leaf constructs that permit the
37 clause, except for the following constructs:



1 • The parallel construct, when combined with the sections, worksharing-loop, loop, or
2 taskloop construct; and
3 • The teams construct, when combined with the loop construct.
4 For the parallel and teams constructs above, the effect of the reduction clause instead is as
5 if each list item or, for any list item that is an array item, its corresponding base array or base
6 pointer appears in a shared clause for the construct. If the task reduction-modifier is specified,
7 the effect is as if it only modifies the behavior of the reduction clause on the innermost leaf
8 construct that accepts the modifier (see Section 5.5.8). If the inscan reduction-modifier is
9 specified, the effect is as if it modifies the behavior of the reduction clause on all constructs of
10 the combined construct to which the clause is applied and that accept the modifier. If a list item in a
11 reduction clause on a combined target construct does not have the same base variable or base
12 pointer as a list item in a map clause on the construct, then the effect is as if the list item in the
13 reduction clause appears as a list item in a map clause with a map-type of tofrom.
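A short C sketch of the reduction rule for a parallel construct combined with a worksharing-loop: the reduction clause takes effect on the worksharing-loop, while the list item is treated as shared on the parallel construct. The function names sum_combined and sum_nested are hypothetical; the second function spells out the split by hand.

```c
/* Combined form: reduction appears on the combined construct. */
long sum_combined(const int *v, int n)
{
    long sum = 0;
    #pragma omp parallel for reduction(+ : sum)
    for (int i = 0; i < n; i++) {
        sum += v[i];
    }
    return sum;
}

/* Explicit form: shared on parallel, reduction on the
 * worksharing-loop, mirroring the rule stated above. */
long sum_nested(const int *v, int n)
{
    long sum = 0;
    #pragma omp parallel shared(sum)
    {
        #pragma omp for reduction(+ : sum)
        for (int i = 0; i < n; i++) {
            sum += v[i];
        }
    }
    return sum;
}
```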
14 The effect of the if clause is described in Section 3.4.
15 The effect of the linear clause is as if it is applied to the innermost leaf construct. Additionally,
16 if the list item is not the iteration variable of a simd or worksharing-loop SIMD construct, the
17 effect on the outer leaf constructs is as if the list item was specified in firstprivate and
18 lastprivate clauses on the combined or composite construct, with the rules specified above
19 applied. If a list item of the linear clause is the iteration variable of a simd or worksharing-loop
20 SIMD construct and it is not declared in the construct, the effect on the outer leaf constructs is as if
21 the list item was specified in a lastprivate clause on the combined or composite construct with
22 the rules specified above applied.
23 The effect of the nowait clause is as if it is applied to the outermost leaf construct that permits it.
24 If the clauses have expressions on them, such as for various clauses where the argument of the
25 clause is an expression, or lower-bound, length, or stride expressions inside array sections (or
26 subscript and stride expressions in subscript-triplet for Fortran), or linear-step or alignment
27 expressions, the expressions are evaluated immediately before the construct to which the clause has
28 been split or duplicated per the above rules (therefore inside of the outer leaf constructs). However,
29 the expressions inside the num_teams and thread_limit clauses are always evaluated before
30 the outermost leaf construct.
31 The restriction that a list item may not appear in more than one data sharing clause with the
32 exception of specifying a variable in both firstprivate and lastprivate clauses applies
33 after the clauses are split or duplicated per the above rules.

34 Restrictions
35 Restrictions to clauses on combined and composite constructs are as follows:
36 • A clause that appears on a combined or composite construct must apply to at least one of the leaf
37 constructs per the rules defined in this section.



17.3 Combined and Composite Directive Names
Combined constructs are shortcuts for specifying one construct immediately nested inside another
construct. Composite constructs are also shortcuts for specifying the effect of one construct
immediately following the effect of another construct. However, composite constructs define
semantics to combine constructs that cannot otherwise be immediately nested.

For all combined and composite constructs, directive-name concatenates directive-name-A, the
directive name of the enclosing construct, with an intervening space followed by directive-name-B,
the directive name of the nested construct. If directive-name-A and directive-name-B both
correspond to loop-associated constructs then directive-name is a composite construct. Otherwise
directive-name is a combined construct.

If directive-name-A is taskloop, for or do then directive-name-B may be simd.

If directive-name-A is masked then directive-name-B may be taskloop or the directive name of
a combined or composite construct for which directive-name-A is taskloop.

If directive-name-A is parallel then directive-name-B may be loop, sections,
workshare, masked, for, do or the directive name of a combined or composite construct for
which directive-name-A is masked, for or do.

If directive-name-A is distribute then directive-name-B may be simd or the directive name of
a combined or composite construct for which directive-name-A is parallel and for or do is a
leaf construct.

If directive-name-A is teams then directive-name-B may be loop, distribute or the directive
name of a combined or composite construct for which directive-name-A is distribute.

If directive-name-A is target then directive-name-B may be simd, parallel, teams, the
directive name of a combined or composite construct for which directive-name-A is teams or the
directive name of a combined or composite construct for which directive-name-A is parallel
and loop, for or do is a leaf construct.

For all combined or composite constructs for which the masked construct is a leaf construct, the
directive name master may be substituted for the directive name masked. The use of the
directive name master has been deprecated.

Cross References
• distribute directive, see Section 11.6
• do directive, see Section 11.5.2
• for directive, see Section 11.5.1
• loop directive, see Section 11.7
• masked directive, see Section 10.5
• parallel directive, see Section 10.1
• sections directive, see Section 11.3
• target directive, see Section 13.8
• taskloop directive, see Section 12.6
• teams directive, see Section 10.2
• workshare directive, see Section 11.4

17.4 Combined Construct Semantics

The semantics of the combined constructs are identical to those of explicitly specifying the first
construct containing one instance of the second construct and no other statements. All combined
and composite directives for which a loop-associated construct is a leaf construct are themselves
loop-associated constructs. For combined constructs, tool callbacks are invoked as if the constructs
were explicitly nested.
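To make the equivalence concrete, the following C sketch computes the same sum twice: once with the combined parallel for construct and once with a for construct explicitly nested inside a parallel construct. The function names sum_combined and sum_nested are hypothetical and serve only this illustration; the reduction clause is placed on the for construct in the nested form on the assumption that this mirrors how the clause applies to the leaf constructs of the combined form.

```c
#include <assert.h>

/* Combined construct: one directive names both parallel and for. */
long sum_combined(const int *a, int n) {
    long total = 0;
    assert(n >= 0);
    #pragma omp parallel for reduction(+ : total)
    for (int i = 0; i < n; ++i)
        total += a[i];
    return total;
}

/* Explicit nesting: the parallel construct contains exactly one
   worksharing-loop construct and no other statements. */
long sum_nested(const int *a, int n) {
    long total = 0;
    assert(n >= 0);
    #pragma omp parallel
    {
        #pragma omp for reduction(+ : total)
        for (int i = 0; i < n; ++i)
            total += a[i];
    }
    return total;
}
```

Both functions are expected to return the same value for the same input. If the code is compiled without OpenMP support, the directives are ignored and both loops run sequentially.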

Restrictions
Restrictions to combined constructs are as follows:
• The restrictions of directive-name-A and directive-name-B apply.
• If directive-name-A is parallel, the nowait and in_reduction clauses must not be
specified.
• If directive-name-A is target, the copyin clause must not be specified.

Cross References
• copyin clause, see Section 5.7.1
• in_reduction clause, see Section 5.5.10
• nowait clause, see Section 15.6
• parallel directive, see Section 10.1
• target directive, see Section 13.8

17.5 Composite Construct Semantics

Composite constructs combine constructs that otherwise cannot be immediately nested.
Specifically, composite constructs apply multiple loop-associated constructs to the same canonical
loop nest. The semantics of each composite construct first apply the semantics of the enclosing
construct as specified by directive-name-A and any clauses that apply to it. For each task (possibly
implicit, possibly initial) as appropriate for the semantics of directive-name-A, the application of its
semantics yields a nested loop of depth two in which the outer loop iterates over the chunks
assigned to that task and the inner loop iterates over the logical iterations of each chunk. The
semantics of directive-name-B and any clauses that apply to it are then applied to that inner loop.
For composite constructs, tool callbacks are invoked as if the constructs were explicitly nested.

If directive-name-A is taskloop and directive-name-B is simd then for the application of the
simd construct, the effect of any in_reduction clause is as if a reduction clause with the
same reduction operator and list items is present.
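As a rough illustration, the C sketch below applies the composite for simd construct to a single loop: the for leaf assigns chunks of iterations to the threads of the team, and simd is then applied to the iterations within each chunk. The function name scaled_add and the choice of a static schedule are assumptions made only for this example.

```c
#include <assert.h>

/* Computes y[i] += alpha * x[i] for 0 <= i < n. */
void scaled_add(int n, float alpha, const float *x, float *y) {
    assert(n >= 0);
    #pragma omp parallel
    {
        /* Composite construct: both leaf constructs are loop-associated
           and are applied to the same canonical loop nest. */
        #pragma omp for simd schedule(static)
        for (int i = 0; i < n; ++i)
            y[i] += alpha * x[i];
    }
}
```

Writing the two directives as separate nested constructs would not be possible here, because only one loop is available for both of them to associate with.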

Restrictions
Restrictions to composite constructs are as follows:
• The restrictions of directive-name-A and directive-name-B apply.
• If directive-name-A is distribute, the linear clause may only be specified for loop
iteration variables of loops that are associated with the construct.
• If directive-name-A is distribute, the ordered clause must not be specified.

Cross References
• distribute directive, see Section 11.6
• in_reduction clause, see Section 5.5.10
• linear clause, see Section 5.4.6
• ordered clause, see Section 4.4.4
• reduction clause, see Section 5.5.8
• simd directive, see Section 10.4
• taskloop directive, see Section 12.6

18 Runtime Library Routines
This chapter describes the OpenMP API runtime library routines and queryable runtime states. All
OpenMP Runtime API names have an omp_ prefix. Names that begin with the ompx_ prefix are
reserved for implementation-defined extensions to the OpenMP Runtime API. In this chapter, true
and false are used as generic terms to simplify the description of the routines.
C / C++
true means a non-zero integer value and false means an integer value of zero.
C / C++

Fortran
true means a logical value of .TRUE. and false means a logical value of .FALSE..
Fortran

Fortran
Restrictions
The following restrictions apply to all OpenMP runtime library routines:
• OpenMP runtime library routines may not be called from PURE or ELEMENTAL procedures.
• OpenMP runtime library routines may not be called in DO CONCURRENT constructs.
Fortran

18.1 Runtime Library Definitions

For each base language, a compliant implementation must supply a set of definitions for the
OpenMP API runtime library routines and the special data types of their parameters. The set of
definitions must contain a declaration for each OpenMP API runtime library routine and variable
and a definition of each required data type listed below. In addition, each set of definitions may
specify other implementation specific values.

C / C++
The library routines are external functions with “C” linkage.
Prototypes for the C/C++ runtime library routines described in this chapter shall be provided in a
header file named omp.h. This file also defines the following:
• The type omp_allocator_handle_t, which must be an implementation-defined (for C++
possibly scoped) enum type with at least the omp_null_allocator enumerator with the
value zero and an enumerator for each predefined memory allocator in Table 6.3;
• omp_atv_default, which is an instance of a type compatible with omp_uintptr_t with
the value -1;
• The type omp_control_tool_result_t;
• The type omp_control_tool_t;
• The type omp_depend_t;
• The type omp_event_handle_t, which must be an implementation-defined (for C++
possibly scoped) enum type;
• The enumerator omp_initial_device with value -1;
• The type omp_interop_t, which must be an implementation-defined integral or pointer type;
• The type omp_interop_fr_t, which must be an implementation-defined enum type with
enumerators named omp_ifr_name where name is a foreign runtime name that is defined in
the OpenMP Additional Definitions document;
• The type omp_intptr_t, which is a signed integer type that is at least the size of a pointer on
any device;
• The enumerator omp_invalid_device with an implementation-defined value less than -1;
• The type omp_lock_hint_t (deprecated);
• The type omp_lock_t;
• The type omp_memspace_handle_t, which must be an implementation-defined (for C++
possibly scoped) enum type with an enumerator for at least each predefined memory space in
Table 6.1;
• The type omp_nest_lock_t;
• The type omp_pause_resource_t;
• The type omp_proc_bind_t;
• The type omp_sched_t;
• The type omp_sync_hint_t; and
• The type omp_uintptr_t, which is an unsigned integer type capable of holding a pointer on
any device.
C / C++
C++
The OpenMP enumeration types provided in the omp.h header file shall not be scoped
enumeration types unless explicitly allowed.
The omp.h header file also defines a class template that models the Allocator concept in the
omp::allocator namespace for each predefined memory allocator in Table 6.3 for which the
name includes neither the omp_ prefix nor the _alloc suffix.
C++
Fortran
The OpenMP Fortran API runtime library routines are external procedures. The return values of
these routines are of default kind, unless otherwise specified.
Interface declarations for the OpenMP Fortran runtime library routines described in this chapter
shall be provided in the form of a Fortran module named omp_lib or a Fortran include file
named omp_lib.h. Whether the omp_lib.h file provides derived-type definitions or those
routines that require an explicit interface is implementation defined. Whether the include file or
the module file (or both) is provided is also implementation defined.
These files also define the following:
• The default integer named constant omp_allocator_handle_kind;
• An integer named constant of kind omp_allocator_handle_kind for each predefined
memory allocator in Table 6.3;
• The default integer named constant omp_alloctrait_key_kind;
• The default integer named constant omp_alloctrait_val_kind;
• The default integer named constant omp_control_tool_kind;
• The default integer named constant omp_control_tool_result_kind;
• The default integer named constant omp_depend_kind;
• The default integer named constant omp_event_handle_kind;
• The default integer named constant omp_initial_device with value -1;
• The default integer named constant omp_interop_kind;
• The default integer named constant omp_interop_fr_kind;
• An integer named constant omp_ifr_name of kind omp_interop_fr_kind for each name
that is a foreign runtime name that is defined in the OpenMP Additional Definitions document;
• The default integer named constant omp_invalid_device with an implementation-defined
value less than -1;
• The default integer named constant omp_lock_hint_kind (deprecated);
• The default integer named constant omp_lock_kind;
• The default integer named constant omp_memspace_handle_kind;
• An integer named constant of kind omp_memspace_handle_kind for each predefined
memory space in Table 6.1;
• The default integer named constant omp_nest_lock_kind;
• The default integer named constant omp_pause_resource_kind;
• The default integer named constant omp_proc_bind_kind;
• The default integer named constant omp_sched_kind;
• The default integer named constant omp_sync_hint_kind; and
• The default integer named constant openmp_version with a value yyyymm where yyyy and
mm are the year and month designations of the version of the OpenMP Fortran API that the
implementation supports; this value matches that of the C preprocessor macro _OPENMP, when
a macro preprocessor is supported (see Section 3.3).
Whether any of the OpenMP runtime library routines that take an argument are extended with a
generic interface so arguments of different KIND type can be accommodated is implementation
defined.
Fortran

18.2 Thread Team Routines

This section describes routines that affect and monitor thread teams in the current contention group.

18.2.1 omp_set_num_threads
Summary
The omp_set_num_threads routine affects the number of threads to be used for subsequent
parallel regions that do not specify a num_threads clause, by setting the value of the first
element of the nthreads-var ICV of the current task.

Format
C / C++
void omp_set_num_threads(int num_threads);
C / C++

Fortran
subroutine omp_set_num_threads(num_threads)
integer num_threads
Fortran

Constraints on Arguments
The value of the argument passed to this routine must evaluate to a positive integer, or else the
behavior of this routine is implementation defined.

Binding
The binding task set for an omp_set_num_threads region is the generating task.

Effect
The effect of this routine is to set the value of the first element of the nthreads-var ICV of the
current task to the value specified in the argument.
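The following C sketch shows one way the routine might be used: it requests a team size and then reports the size of the team that actually executes the next parallel region. The function name team_size_after_request is hypothetical, and the fallback stubs in the #else branch are an assumption of this example so that it also builds when OpenMP is not enabled.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback stubs, an assumption of this sketch: without OpenMP the
   program runs with a single thread. */
static void omp_set_num_threads(int num_threads) { (void)num_threads; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Requests `requested` threads for subsequent parallel regions and
   returns the team size observed in the next such region. */
int team_size_after_request(int requested) {
    int observed = 0;
    assert(requested > 0);            /* the argument must be positive */
    omp_set_num_threads(requested);   /* first element of nthreads-var */
    #pragma omp parallel
    {
        #pragma omp single
        observed = omp_get_num_threads();
    }
    return observed;
}
```

The observed team size can be smaller than the request, for example when dynamic adjustment of the number of threads is enabled or when the thread limit is lower than the requested value.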

Cross References
• Determining the Number of Threads for a parallel Region, see Section 10.1.1
• nthreads-var ICV, see Table 2.1
• num_threads clause, see Section 10.1.2
• parallel directive, see Section 10.1

18.2.2 omp_get_num_threads
Summary
The omp_get_num_threads routine returns the number of threads in the current team.

Format
C / C++
int omp_get_num_threads(void);
C / C++
Fortran
integer function omp_get_num_threads()
Fortran

Binding
The binding region for an omp_get_num_threads region is the innermost enclosing parallel
region.

Effect
The omp_get_num_threads routine returns the number of threads in the team that is executing
the parallel region to which the routine region binds.

18.2.3 omp_get_max_threads
Summary
The omp_get_max_threads routine returns an upper bound on the number of threads that
could be used to form a new team if a parallel construct without a num_threads clause is
encountered after execution returns from this routine.

Format
C / C++
int omp_get_max_threads(void);
C / C++
Fortran
integer function omp_get_max_threads()
Fortran

Binding
The binding task set for an omp_get_max_threads region is the generating task.

Effect
The value returned by omp_get_max_threads is the value of the first element of the
nthreads-var ICV of the current task. This value is also an upper bound on the number of threads
that could be used to form a new team if a parallel region without a num_threads clause is
encountered after execution returns from this routine.
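Because the returned value is an upper bound on the size of the next team, it can be used to size per-thread storage before a parallel region is entered. The C sketch below illustrates this under stated assumptions: the function name count_even is hypothetical, and the stubs in the #else branch exist only so that the example builds without OpenMP.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback stubs, an assumption of this sketch. */
static int omp_get_max_threads(void) { return 1; }
static int omp_get_thread_num(void) { return 0; }
#endif

/* Counts the even values in a[0..n-1] using one counter per thread.
   Returns -1 if the scratch buffer cannot be allocated. */
long count_even(const int *a, int n) {
    assert(n >= 0);
    /* Upper bound on the team size of the parallel region below,
       which has no num_threads clause. */
    int slots = omp_get_max_threads();
    long *partial = calloc((size_t)slots, sizeof *partial);
    if (partial == NULL)
        return -1;

    #pragma omp parallel
    {
        int tid = omp_get_thread_num();   /* 0 <= tid < slots */
        #pragma omp for
        for (int i = 0; i < n; ++i)
            if (a[i] % 2 == 0)
                partial[tid]++;
    }

    long total = 0;
    for (int t = 0; t < slots; ++t)
        total += partial[t];
    free(partial);
    return total;
}
```

A reduction clause would be the more idiomatic way to compute this particular count; the per-thread array is used here only to show why an upper bound on the team size is useful.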

Cross References
• Determining the Number of Threads for a parallel Region, see Section 10.1.1
• nthreads-var ICV, see Table 2.1
• num_threads clause, see Section 10.1.2
• parallel directive, see Section 10.1

18.2.4 omp_get_thread_num
Summary
The omp_get_thread_num routine returns the thread number, within the current team, of the
calling thread.

Format
C / C++
int omp_get_thread_num(void);
C / C++
Fortran
integer function omp_get_thread_num()
Fortran

Binding
The binding thread set for an omp_get_thread_num region is the current team. The binding
region for an omp_get_thread_num region is the innermost enclosing parallel region.

Effect
The omp_get_thread_num routine returns the thread number of the calling thread, within the
team that is executing the parallel region to which the routine region binds. The thread number is
an integer between 0 and one less than the value returned by omp_get_num_threads,
inclusive. The thread number of the primary thread of the team is 0.
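The range stated above can be checked with a short C sketch. The function names thread_numbers_in_range and serial_thread_num are hypothetical, and the stubs in the #else branch are an assumption that lets the example build without OpenMP.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback stubs, an assumption of this sketch. */
static int omp_get_num_threads(void) { return 1; }
static int omp_get_thread_num(void) { return 0; }
#endif

/* Runs a parallel region that requests `requested` threads and returns 1
   if every thread saw a thread number in [0, team size - 1], else 0. */
int thread_numbers_in_range(int requested) {
    int ok = 1;
    assert(requested > 0);
    #pragma omp parallel num_threads(requested)
    {
        int tid = omp_get_thread_num();
        int team = omp_get_num_threads();
        if (tid < 0 || tid >= team) {
            #pragma omp atomic write
            ok = 0;
        }
    }
    return ok;
}

/* Called outside any parallel construct, the routine is executed by
   the initial thread, whose thread number is 0. */
int serial_thread_num(void) {
    return omp_get_thread_num();
}
```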

Cross References
• omp_get_num_threads, see Section 18.2.2

18.2.5 omp_in_parallel
Summary
The omp_in_parallel routine returns true if the active-levels-var ICV is greater than zero;
otherwise, it returns false.

Format
C / C++
int omp_in_parallel(void);
C / C++
Fortran
logical function omp_in_parallel()
Fortran

Binding
The binding task set for an omp_in_parallel region is the generating task.

Effect
The effect of the omp_in_parallel routine is to return true if the current task is enclosed by an
active parallel region, and the parallel region is enclosed by the outermost initial task
region on the device; otherwise it returns false.
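The distinction between active and inactive parallel regions matters for this routine. In the C sketch below, a region whose if clause evaluates to false, or whose team consists of a single thread, is inactive, so the routine is expected to return false inside it. The function name in_parallel_inside is hypothetical and the stub in the #else branch is an assumption of the example.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback stub, an assumption of this sketch. */
static int omp_in_parallel(void) { return 0; }
#endif

/* Returns the value of omp_in_parallel() observed inside a parallel
   region that requests `requested` threads and whose if clause
   evaluates to `enable`. */
int in_parallel_inside(int enable, int requested) {
    int result = 0;
    assert(requested > 0);
    #pragma omp parallel if(enable) num_threads(requested)
    {
        #pragma omp single
        result = omp_in_parallel();
    }
    return result;
}
```

When enable is non-zero and more than one thread is requested, the result is true only if the team that executes the region actually contains more than one thread.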

Cross References
• active-levels-var ICV, see Table 2.1
• parallel directive, see Section 10.1

18.2.6 omp_set_dynamic
Summary
The omp_set_dynamic routine enables or disables dynamic adjustment of the number of
threads available for the execution of subsequent parallel regions by setting the value of the
dyn-var ICV.

Format
C / C++
void omp_set_dynamic(int dynamic_threads);
C / C++

Fortran
subroutine omp_set_dynamic(dynamic_threads)
logical dynamic_threads
Fortran

Binding
The binding task set for an omp_set_dynamic region is the generating task.

Effect
For implementations that support dynamic adjustment of the number of threads, if the argument to
omp_set_dynamic evaluates to true, dynamic adjustment is enabled for the current task;
otherwise, dynamic adjustment is disabled for the current task. For implementations that do not
support dynamic adjustment of the number of threads, this routine has no effect: the value of
dyn-var remains false.
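Since a request to enable dynamic adjustment may have no effect, a program that depends on the setting can read it back with omp_get_dynamic (Section 18.2.7). The C sketch below illustrates this; the function names disable_dynamic and try_enable_dynamic are hypothetical and the stubs in the #else branch are an assumption of the example.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback stubs, an assumption of this sketch: no dynamic adjustment. */
static void omp_set_dynamic(int dynamic_threads) { (void)dynamic_threads; }
static int omp_get_dynamic(void) { return 0; }
#endif

/* Disables dynamic adjustment for the current task. */
void disable_dynamic(void) {
    omp_set_dynamic(0);
    /* Holds whether or not the implementation supports the feature. */
    assert(!omp_get_dynamic());
}

/* Requests dynamic adjustment and returns non-zero only if the
   implementation honored the request. */
int try_enable_dynamic(void) {
    omp_set_dynamic(1);
    return omp_get_dynamic();
}
```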

Cross References
• dyn-var ICV, see Table 2.1

18.2.7 omp_get_dynamic
Summary
The omp_get_dynamic routine returns the value of the dyn-var ICV, which determines whether
dynamic adjustment of the number of threads is enabled or disabled.

Format
C / C++
int omp_get_dynamic(void);
C / C++
Fortran
logical function omp_get_dynamic()
Fortran

Binding
The binding task set for an omp_get_dynamic region is the generating task.

Effect
This routine returns true if dynamic adjustment of the number of threads is enabled for the current
task; otherwise, it returns false. If an implementation does not support dynamic adjustment of the
number of threads, then this routine always returns false.

Cross References
• dyn-var ICV, see Table 2.1

18.2.8 omp_get_cancellation
Summary
The omp_get_cancellation routine returns the value of the cancel-var ICV, which
determines if cancellation is enabled or disabled.

Format
C / C++
int omp_get_cancellation(void);
C / C++
Fortran
logical function omp_get_cancellation()
Fortran

Binding
The binding task set for an omp_get_cancellation region is the whole program.

Effect
This routine returns true if cancellation is enabled. It returns false otherwise.

Cross References
• cancel-var ICV, see Table 2.1

18.2.9 omp_set_nested (Deprecated)

Summary
The deprecated omp_set_nested routine enables or disables nested parallelism by setting the
max-active-levels-var ICV.

Format
C / C++
void omp_set_nested(int nested);
C / C++

Fortran
subroutine omp_set_nested(nested)
logical nested
Fortran

Binding
The binding task set for an omp_set_nested region is the generating task.

Effect
If the argument to omp_set_nested evaluates to true, the value of the max-active-levels-var
ICV is set to the number of active levels of parallelism that the implementation supports; otherwise,
if the value of max-active-levels-var is greater than 1 then it is set to 1. This routine has been
deprecated.

Cross References
• max-active-levels-var ICV, see Table 2.1

18.2.10 omp_get_nested (Deprecated)

Summary
The deprecated omp_get_nested routine returns whether nested parallelism is enabled or
disabled, according to the value of the max-active-levels-var ICV.

Format
C / C++
int omp_get_nested(void);
C / C++
Fortran
logical function omp_get_nested()
Fortran

Binding
The binding task set for an omp_get_nested region is the generating task.

Effect
This routine returns true if max-active-levels-var is greater than 1 and greater than active-levels-var
for the current task; it returns false otherwise. If an implementation does not support nested
parallelism, this routine always returns false. This routine has been deprecated.

Cross References
• max-active-levels-var ICV, see Table 2.1

18.2.11 omp_set_schedule
Summary
The omp_set_schedule routine affects the schedule that is applied when runtime is used as
schedule kind, by setting the value of the run-sched-var ICV.

Format
C / C++
void omp_set_schedule(omp_sched_t kind, int chunk_size);
C / C++
Fortran
subroutine omp_set_schedule(kind, chunk_size)
integer (kind=omp_sched_kind) kind
integer chunk_size
Fortran

Constraints on Arguments
The first argument passed to this routine can be one of the valid OpenMP schedule kinds (except for
runtime) or any implementation-specific schedule. The C/C++ header file (omp.h) and the
Fortran include file (omp_lib.h) and/or Fortran module file (omp_lib) define the valid
constants. The valid constants must include the following, which can be extended with
implementation-specific values:
C / C++
typedef enum omp_sched_t {
    // schedule kinds
    omp_sched_static = 0x1,
    omp_sched_dynamic = 0x2,
    omp_sched_guided = 0x3,
    omp_sched_auto = 0x4,

    // schedule modifier
    omp_sched_monotonic = 0x80000000u
} omp_sched_t;
C / C++
Fortran
! schedule kinds
integer(kind=omp_sched_kind), &
    parameter :: omp_sched_static = &
    int(Z'1', kind=omp_sched_kind)
integer(kind=omp_sched_kind), &
    parameter :: omp_sched_dynamic = &
    int(Z'2', kind=omp_sched_kind)
integer(kind=omp_sched_kind), &
    parameter :: omp_sched_guided = &
    int(Z'3', kind=omp_sched_kind)
integer(kind=omp_sched_kind), &
    parameter :: omp_sched_auto = &
    int(Z'4', kind=omp_sched_kind)

! schedule modifier
integer(kind=omp_sched_kind), &
    parameter :: omp_sched_monotonic = &
    int(Z'80000000', kind=omp_sched_kind)
Fortran

Binding
The binding task set for an omp_set_schedule region is the generating task.

Effect
The effect of this routine is to set the value of the run-sched-var ICV of the current task to the
values specified in the two arguments. The schedule is set to the schedule kind that is specified by
the first argument kind. It can be any of the standard schedule kinds or any other
implementation-specific one. For the schedule kinds static, dynamic, and guided, the
chunk_size is set to the value of the second argument, or to the default chunk_size if the value of the
second argument is less than 1; for the schedule kind auto, the second argument has no meaning;
for implementation-specific schedule kinds, the values and associated meanings of the second
argument are implementation defined.

Each of the schedule kinds can be combined with the omp_sched_monotonic modifier by
using the + or | operators in C/C++ or the + operator in Fortran. If the schedule kind is combined
with the omp_sched_monotonic modifier, the schedule is modified as if the monotonic
schedule modifier was specified. Otherwise, the schedule modifier is nonmonotonic.

Cross References
• run-sched-var ICV, see Table 2.1

18.2.12 omp_get_schedule
Summary
The omp_get_schedule routine returns the schedule that is applied when the runtime schedule
is used.

Format
C / C++
void omp_get_schedule(omp_sched_t *kind, int *chunk_size);
C / C++

Fortran
subroutine omp_get_schedule(kind, chunk_size)
integer (kind=omp_sched_kind) kind
integer chunk_size
Fortran

Binding
The binding task set for an omp_get_schedule region is the generating task.

Effect
This routine returns the run-sched-var ICV in the task to which the routine binds. The first
argument kind returns the schedule to be used. It can be any of the standard schedule kinds as
defined in Section 18.2.11, or any implementation-specific schedule kind. If the returned schedule
kind is static, dynamic, or guided, the second argument chunk_size returns the chunk size to
be used, or a value less than 1 if the default chunk size is to be used. The value returned by the
second argument is implementation defined for any other schedule kinds.
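The C sketch below illustrates how omp_set_schedule (Section 18.2.11) and omp_get_schedule might be used together, and how the schedule kind can be separated from the omp_sched_monotonic modifier bit. The helper names use_monotonic_dynamic, base_kind and is_monotonic are hypothetical. The #else branch is an assumption of this example: it repeats the constants listed in Section 18.2.11 and stores a single schedule so that the code also builds without OpenMP.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback definitions, an assumption of this sketch. */
typedef enum omp_sched_t {
    omp_sched_static = 0x1,
    omp_sched_dynamic = 0x2,
    omp_sched_guided = 0x3,
    omp_sched_auto = 0x4,
    omp_sched_monotonic = 0x80000000u
} omp_sched_t;
static omp_sched_t stub_kind = omp_sched_static;
static int stub_chunk = 0;
static void omp_set_schedule(omp_sched_t kind, int chunk_size) {
    stub_kind = kind;
    stub_chunk = chunk_size;
}
static void omp_get_schedule(omp_sched_t *kind, int *chunk_size) {
    *kind = stub_kind;
    *chunk_size = stub_chunk;
}
#endif

/* Selects a monotonic dynamic schedule with the given chunk size for
   loops that use schedule(runtime), then reads run-sched-var back. */
void use_monotonic_dynamic(int chunk, omp_sched_t *kind_out, int *chunk_out) {
    assert(kind_out && chunk_out);
    omp_set_schedule((omp_sched_t)(omp_sched_dynamic | omp_sched_monotonic),
                     chunk);
    omp_get_schedule(kind_out, chunk_out);
}

/* Strips the modifier bit, leaving only the schedule kind. */
omp_sched_t base_kind(omp_sched_t kind) {
    return (omp_sched_t)((unsigned)kind & ~(unsigned)omp_sched_monotonic);
}

/* Returns non-zero if the monotonic modifier is present. */
int is_monotonic(omp_sched_t kind) {
    return ((unsigned)kind & (unsigned)omp_sched_monotonic) != 0;
}
```

Comparing the value returned through kind directly against omp_sched_dynamic would fail when the modifier bit is set, which is why the kind is masked first.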

Cross References
• run-sched-var ICV, see Table 2.1

18.2.13 omp_get_thread_limit
Summary
The omp_get_thread_limit routine returns the maximum number of OpenMP threads
available to participate in the current contention group.

Format
C / C++
int omp_get_thread_limit(void);
C / C++
Fortran
integer function omp_get_thread_limit()
Fortran

Binding
The binding task set for an omp_get_thread_limit region is the generating task.

Effect
The omp_get_thread_limit routine returns the value of the thread-limit-var ICV.

Cross References
• thread-limit-var ICV, see Table 2.1

18.2.14 omp_get_supported_active_levels
Summary
The omp_get_supported_active_levels routine returns the number of active levels of
parallelism supported by the implementation.

Format
C / C++
int omp_get_supported_active_levels(void);
C / C++
Fortran
integer function omp_get_supported_active_levels()
Fortran

Binding
The binding task set for an omp_get_supported_active_levels region is the generating
task.

Effect
The omp_get_supported_active_levels routine returns the number of active levels of
parallelism supported by the implementation. The max-active-levels-var ICV cannot have a value
that is greater than this number. The value that the omp_get_supported_active_levels
routine returns is implementation defined, but it must be greater than 0.

Cross References
• max-active-levels-var ICV, see Table 2.1

18.2.15 omp_set_max_active_levels
Summary
The omp_set_max_active_levels routine limits the number of nested active parallel
regions when a new nested parallel region is generated by the current task by setting the
max-active-levels-var ICV.

Format
C / C++
void omp_set_max_active_levels(int max_levels);
C / C++
Fortran
subroutine omp_set_max_active_levels(max_levels)
integer max_levels
Fortran

Constraints on Arguments
The value of the argument passed to this routine must evaluate to a non-negative integer; otherwise,
the behavior of this routine is implementation defined.

Binding
The binding task set for an omp_set_max_active_levels region is the generating task.

Effect
The effect of this routine is to set the value of the max-active-levels-var ICV to the value specified
in the argument.

If the number of active levels requested exceeds the number of active levels of parallelism supported
by the implementation, the value of the max-active-levels-var ICV will be set to the number of
active levels supported by the implementation. If the number of active levels requested is less than
the value of the active-levels-var ICV, the value of the max-active-levels-var ICV will be set to an
implementation-defined value between the requested number and active-levels-var, inclusive.
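Because the value that is stored can differ from the value that was requested, a caller may want to read the ICV back with omp_get_max_active_levels (Section 18.2.16). A minimal C sketch follows; the function name request_active_levels is hypothetical, and the stubs in the #else branch (which assume a single supported active level) exist only so that the example builds without OpenMP.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback stubs, an assumption of this sketch: one supported level. */
static int stub_max_levels = 1;
static void omp_set_max_active_levels(int max_levels) {
    if (max_levels >= 0)
        stub_max_levels = max_levels > 1 ? 1 : max_levels;
}
static int omp_get_max_active_levels(void) { return stub_max_levels; }
#endif

/* Requests `levels` active levels and returns the value that
   max-active-levels-var holds afterwards. */
int request_active_levels(int levels) {
    assert(levels >= 0);   /* the argument must be non-negative */
    omp_set_max_active_levels(levels);
    return omp_get_max_active_levels();
}
```

When called outside any parallel region, a request of 0 or 1 is expected to be stored unchanged, since the number of supported active levels must be greater than 0. A larger request may be reduced to the supported number of levels.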

Cross References
• max-active-levels-var ICV, see Table 2.1

18.2.16 omp_get_max_active_levels
Summary
The omp_get_max_active_levels routine returns the value of the max-active-levels-var
ICV, which determines the maximum number of nested active parallel regions when the innermost
parallel region is generated by the current task.

Format
C / C++
int omp_get_max_active_levels(void);
C / C++
Fortran
integer function omp_get_max_active_levels()
Fortran

Binding
The binding task set for an omp_get_max_active_levels region is the generating task.

Effect
The omp_get_max_active_levels routine returns the value of the max-active-levels-var
ICV. The current task may only generate an active parallel region if the returned value is greater
than the value of the active-levels-var ICV.

Cross References
• max-active-levels-var ICV, see Table 2.1

18.2.17 omp_get_level
Summary
The omp_get_level routine returns the value of the levels-var ICV.

Format
C / C++
int omp_get_level(void);
C / C++
Fortran
integer function omp_get_level()
Fortran

Binding
The binding task set for an omp_get_level region is the generating task.

Effect
The effect of the omp_get_level routine is to return the number of nested parallel regions
(whether active or inactive) that enclose the current task such that all of the parallel regions are
enclosed by the outermost initial task region on the current device.

Cross References
• levels-var ICV, see Table 2.1
• parallel directive, see Section 10.1

18.2.18 omp_get_ancestor_thread_num
Summary
The omp_get_ancestor_thread_num routine returns, for a given nested level of the current
thread, the thread number of the ancestor of the current thread.

Format
C / C++
int omp_get_ancestor_thread_num(int level);
C / C++
Fortran
integer function omp_get_ancestor_thread_num(level)
integer level
Fortran

Binding
The binding thread set for an omp_get_ancestor_thread_num region is the encountering
thread. The binding region for an omp_get_ancestor_thread_num region is the innermost
enclosing parallel region.

Effect
The omp_get_ancestor_thread_num routine returns the thread number of the ancestor at a
given nest level of the current thread or the thread number of the current thread. If the requested
nest level is outside the range of 0 and the nest level of the current thread, as returned by the
omp_get_level routine, the routine returns -1.

Note – When the omp_get_ancestor_thread_num routine is called with a value of
level=0, the routine always returns 0. If level=omp_get_level(), the routine has the
same effect as the omp_get_thread_num routine.

Cross References
• omp_get_level, see Section 18.2.17
• omp_get_thread_num, see Section 18.2.4
• parallel directive, see Section 10.1

18.2.19 omp_get_team_size
Summary
The omp_get_team_size routine returns, for a given nested level of the current thread, the size
of the thread team to which the ancestor or the current thread belongs.

Format
C / C++
int omp_get_team_size(int level);
C / C++
Fortran
integer function omp_get_team_size(level)
integer level
Fortran

Binding
The binding thread set for an omp_get_team_size region is the encountering thread. The
binding region for an omp_get_team_size region is the innermost enclosing parallel
region.

Effect
The omp_get_team_size routine returns the size of the thread team to which the ancestor or
the current thread belongs. If the requested nested level is outside the range of 0 and the nested
level of the current thread, as returned by the omp_get_level routine, the routine returns -1.
Inactive parallel regions are regarded as active parallel regions executed with one thread.

Note – When the omp_get_team_size routine is called with a value of level=0, the routine
always returns 1. If level=omp_get_level(), the routine has the same effect as the
omp_get_num_threads routine.
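The identities stated in the notes for omp_get_ancestor_thread_num (Section 18.2.18) and omp_get_team_size, together with the out-of-range result of -1, can be checked with a short C sketch. The function name check_level_identities is hypothetical, and the stubs in the #else branch are an assumption that models a program consisting only of the initial thread at nesting level 0.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback stubs, an assumption of this sketch. */
static int omp_get_level(void) { return 0; }
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
static int omp_get_ancestor_thread_num(int level) { return level == 0 ? 0 : -1; }
static int omp_get_team_size(int level) { return level == 0 ? 1 : -1; }
#endif

/* Runs a parallel region that requests `requested` threads and returns 1
   if the identities hold on every thread of the team, else 0. */
int check_level_identities(int requested) {
    int ok = 1;
    assert(requested > 0);
    #pragma omp parallel num_threads(requested)
    {
        int level = omp_get_level();
        int good =
            omp_get_ancestor_thread_num(0) == 0 &&
            omp_get_ancestor_thread_num(level) == omp_get_thread_num() &&
            omp_get_team_size(0) == 1 &&
            omp_get_team_size(level) == omp_get_num_threads() &&
            /* Levels outside [0, omp_get_level()] yield -1. */
            omp_get_ancestor_thread_num(level + 1) == -1 &&
            omp_get_team_size(-1) == -1;
        if (!good) {
            #pragma omp atomic write
            ok = 0;
        }
    }
    return ok;
}
```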

Cross References
• omp_get_level, see Section 18.2.17
• omp_get_num_threads, see Section 18.2.2
• parallel directive, see Section 10.1

18.2.20 omp_get_active_level
Summary
The omp_get_active_level routine returns the value of the active-levels-var ICV.

Format
C / C++
int omp_get_active_level(void);
C / C++
Fortran
integer function omp_get_active_level()
Fortran

Binding
The binding task set for an omp_get_active_level region is the generating task.

Effect
The effect of the omp_get_active_level routine is to return the number of nested active
parallel regions enclosing the current task such that all of the parallel regions are enclosed
by the outermost initial task region on the current device.
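The difference between omp_get_level (Section 18.2.17) and omp_get_active_level shows up in an inactive parallel region, which counts toward the former but not the latter. The C sketch below uses an if clause that evaluates to false to obtain such a region. The function name levels_in_inactive_region is hypothetical and the stubs in the #else branch are an assumption of the example.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback stubs, an assumption of this sketch: no parallel regions. */
static int omp_get_level(void) { return 0; }
static int omp_get_active_level(void) { return 0; }
#endif

/* Stores the values of levels-var and active-levels-var as seen inside
   a parallel region that is inactive because its if clause is false. */
void levels_in_inactive_region(int *level, int *active_level) {
    assert(level && active_level);
    #pragma omp parallel if(0)
    {
        *level = omp_get_level();
        *active_level = omp_get_active_level();
    }
}
```

When the function is called outside any parallel region in a program compiled with OpenMP enabled, it is expected to store 1 in level and 0 in active_level. With the fallback stubs both values are 0.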

Cross References
• active-levels-var ICV, see Table 2.1
• parallel directive, see Section 10.1

18.3 Thread Affinity Routines
This section describes routines that affect and access thread affinity policies that are in effect.

18.3.1 omp_get_proc_bind
Summary
The omp_get_proc_bind routine returns the thread affinity policy to be used for the
subsequent nested parallel regions that do not specify a proc_bind clause.

Format
C / C++
omp_proc_bind_t omp_get_proc_bind(void);
C / C++
Fortran
integer (kind=omp_proc_bind_kind) function omp_get_proc_bind()
Fortran

Constraints on Arguments
The value returned by this routine must be one of the valid affinity policy kinds. The C/C++ header
file (omp.h) and the Fortran include file (omp_lib.h) and/or Fortran module file (omp_lib)
define the valid constants. The valid constants must include the following:
C / C++
14 typedef enum omp_proc_bind_t {
15 omp_proc_bind_false = 0,
16 omp_proc_bind_true = 1,
17 omp_proc_bind_primary = 2,
18 omp_proc_bind_master = omp_proc_bind_primary, // (deprecated)
19 omp_proc_bind_close = 3,
20 omp_proc_bind_spread = 4
21 } omp_proc_bind_t;
C / C++
Fortran
22 integer (kind=omp_proc_bind_kind), &
23 parameter :: omp_proc_bind_false = 0
24 integer (kind=omp_proc_bind_kind), &
25 parameter :: omp_proc_bind_true = 1
26 integer (kind=omp_proc_bind_kind), &
27 parameter :: omp_proc_bind_primary = 2

1 integer (kind=omp_proc_bind_kind), &
2 parameter :: omp_proc_bind_master = &
3 omp_proc_bind_primary ! (deprecated)
4 integer (kind=omp_proc_bind_kind), &
5 parameter :: omp_proc_bind_close = 3
6 integer (kind=omp_proc_bind_kind), &
7 parameter :: omp_proc_bind_spread = 4
Fortran
8 Binding
9 The binding task set for an omp_get_proc_bind region is the generating task.

10 Effect
11 The effect of this routine is to return the value of the first element of the bind-var ICV of the current
12 task. See Section 10.1.3 for the rules that govern the thread affinity policy.

13 Cross References
14 • Controlling OpenMP Thread Affinity, see Section 10.1.3
15 • bind-var ICV, see Table 2.1
16 • parallel directive, see Section 10.1
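
The following C sketch, offered only as an illustration, maps the value returned by omp_get_proc_bind to a readable name using the numeric values listed under Constraints on Arguments. The helper names proc_bind_name and current_proc_bind_name are hypothetical, and the fallback stub is an assumption that lets the sketch build without OpenMP support.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback stub (an assumption of this sketch): without OpenMP support,
   report the value of omp_proc_bind_false. */
static int omp_get_proc_bind(void) { return 0; }
#endif

/* Hypothetical helper: translate a policy value into a name. Numeric
   values are used instead of the enumerators so that the sketch does not
   depend on which enumerator names a given omp.h provides. */
const char *proc_bind_name(int policy)
{
    switch (policy) {
    case 0:  return "false";
    case 1:  return "true";
    case 2:  return "primary";
    case 3:  return "close";
    case 4:  return "spread";
    default: return "other";   /* implementation-specific value */
    }
}

/* Hypothetical helper: name of the policy that applies to subsequent
   nested parallel regions without a proc_bind clause. */
const char *current_proc_bind_name(void)
{
    return proc_bind_name((int)omp_get_proc_bind());
}
```

The default branch exists because the list of constants is a minimum; an implementation may define additional values.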

17 18.3.2 omp_get_num_places
18 Summary
19 The omp_get_num_places routine returns the number of places available to the execution
20 environment in the place list.

21 Format
C / C++
22 int omp_get_num_places(void);
C / C++
Fortran
23 integer function omp_get_num_places()
Fortran
24 Binding
25 The binding thread set for an omp_get_num_places region is all threads on a device. The
26 effect of executing this routine is not related to any specific region corresponding to any construct
27 or API routine.

1 Effect
2 The omp_get_num_places routine returns the number of places in the place list. This value is
3 equivalent to the number of places in the place-partition-var ICV in the execution environment of
4 the initial task.

5 Cross References
6 • place-partition-var ICV, see Table 2.1

7 18.3.3 omp_get_place_num_procs
8 Summary
9 The omp_get_place_num_procs routine returns the number of processors available to the
10 execution environment in the specified place.

11 Format
C / C++
12 int omp_get_place_num_procs(int place_num);
C / C++
Fortran
13 integer function omp_get_place_num_procs(place_num)
14 integer place_num
Fortran
15 Binding
16 The binding thread set for an omp_get_place_num_procs region is all threads on a device.
17 The effect of executing this routine is not related to any specific region corresponding to any
18 construct or API routine.

19 Effect
20 The omp_get_place_num_procs routine returns the number of processors associated with
21 the place numbered place_num. The routine returns zero when place_num is negative or is greater
22 than or equal to the value returned by omp_get_num_places().

23 Cross References
24 • omp_get_num_places, see Section 18.3.2

25 18.3.4 omp_get_place_proc_ids
26 Summary
27 The omp_get_place_proc_ids routine returns the numerical identifiers of the processors
28 available to the execution environment in the specified place.

1 Format
C / C++
2 void omp_get_place_proc_ids(int place_num, int *ids);
C / C++
Fortran
3 subroutine omp_get_place_proc_ids(place_num, ids)
4 integer place_num
5 integer ids(*)
Fortran
6 Binding
7 The binding thread set for an omp_get_place_proc_ids region is all threads on a device.
8 The effect of executing this routine is not related to any specific region corresponding to any
9 construct or API routine.

10 Effect
11 The omp_get_place_proc_ids routine returns the numerical identifiers of each processor
12 associated with the place numbered place_num. The numerical identifiers are non-negative and
13 their meaning is implementation defined. The numerical identifiers are returned in the array ids and
14 their order in the array is implementation defined. The array must be sufficiently large to contain
15 omp_get_place_num_procs(place_num) integers; otherwise, the behavior is unspecified.
16 The routine has no effect when place_num has a negative value or a value greater than or equal to
17 omp_get_num_places().

18 Cross References
19 • OMP_PLACES, see Section 21.1.6
20 • omp_get_num_places, see Section 18.3.2
21 • omp_get_place_num_procs, see Section 18.3.3
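
The three place-list routines above are typically combined: omp_get_num_places bounds the loop, omp_get_place_num_procs sizes the array, and omp_get_place_proc_ids fills it. The C sketch below shows one way to do this. The helper name print_place_list is hypothetical, and the fallback stubs (an empty place list) are an assumption for builds without OpenMP support.

```c
#include <stdio.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback stubs (an assumption of this sketch): an empty place list. */
static int omp_get_num_places(void) { return 0; }
static int omp_get_place_num_procs(int place_num) { (void)place_num; return 0; }
static void omp_get_place_proc_ids(int place_num, int *ids)
{
    (void)place_num;
    (void)ids;
}
#endif

/* Hypothetical helper: print every place with its processor identifiers.
   Returns the total number of processors listed, or -1 if an allocation
   fails. */
int print_place_list(void)
{
    int total = 0;
    int num_places = omp_get_num_places();

    for (int p = 0; p < num_places; ++p) {
        int n = omp_get_place_num_procs(p);
        if (n <= 0) {
            printf("place %d: no processors\n", p);
            continue;
        }
        /* The array must hold omp_get_place_num_procs(p) integers. */
        int *ids = (int *)malloc((size_t)n * sizeof *ids);
        if (ids == NULL)
            return -1;
        omp_get_place_proc_ids(p, ids);
        printf("place %d:", p);
        for (int i = 0; i < n; ++i)
            printf(" %d", ids[i]);
        printf("\n");
        total += n;
        free(ids);
    }
    return total;
}
```

Because the meaning and order of the identifiers are implementation defined, the sketch only prints them and does not interpret them.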

22 18.3.5 omp_get_place_num
23 Summary
24 The omp_get_place_num routine returns the place number of the place to which the
25 encountering thread is bound.

26 Format
C / C++
27 int omp_get_place_num(void);
C / C++
Fortran
28 integer function omp_get_place_num()
Fortran

1 Binding
2 The binding thread set for an omp_get_place_num region is the encountering thread.

3 Effect
4 When the encountering thread is bound to a place, the omp_get_place_num routine returns the
5 place number associated with the thread. The returned value is between 0 and one less than the
6 value returned by omp_get_num_places(), inclusive. When the encountering thread is not
7 bound to a place, the routine returns -1.

8 Cross References
9 • omp_get_num_places, see Section 18.3.2

10 18.3.6 omp_get_partition_num_places
11 Summary
12 The omp_get_partition_num_places routine returns the number of places in the place
13 partition of the innermost implicit task.

14 Format
C / C++
15 int omp_get_partition_num_places(void);
C / C++
Fortran
16 integer function omp_get_partition_num_places()
Fortran
17 Binding
18 The binding task set for an omp_get_partition_num_places region is the encountering
19 implicit task.

20 Effect
21 The omp_get_partition_num_places routine returns the number of places in the
22 place-partition-var ICV.

23 Cross References
24 • place-partition-var ICV, see Table 2.1

1 18.3.7 omp_get_partition_place_nums
2 Summary
3 The omp_get_partition_place_nums routine returns the list of place numbers
4 corresponding to the places in the place-partition-var ICV of the innermost implicit task.

5 Format
C / C++
6 void omp_get_partition_place_nums(int *place_nums);
C / C++
Fortran
7 subroutine omp_get_partition_place_nums(place_nums)
8 integer place_nums(*)
Fortran
9 Binding
10 The binding task set for an omp_get_partition_place_nums region is the encountering
11 implicit task.

12 Effect
13 The omp_get_partition_place_nums routine returns the list of place numbers that
14 correspond to the places in the place-partition-var ICV of the innermost implicit task. The array
15 must be sufficiently large to contain omp_get_partition_num_places() integers;
16 otherwise, the behavior is unspecified.

17 Cross References
18 • omp_get_partition_num_places, see Section 18.3.6
19 • place-partition-var ICV, see Table 2.1
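
A caller is expected to size the array with omp_get_partition_num_places before calling omp_get_partition_place_nums. The following C sketch illustrates that pairing. The helper name copy_partition_place_nums is hypothetical, and the fallback stubs (an empty partition) are an assumption for builds without OpenMP support.

```c
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback stubs (an assumption of this sketch): an empty partition. */
static int omp_get_partition_num_places(void) { return 0; }
static void omp_get_partition_place_nums(int *place_nums) { (void)place_nums; }
#endif

/* Hypothetical helper: return a heap-allocated copy of the place numbers
   in the place partition of the innermost implicit task and store their
   count in *count. Returns NULL, with *count set to 0, if the partition
   is empty or the allocation fails. The caller frees the result. */
int *copy_partition_place_nums(int *count)
{
    int n = omp_get_partition_num_places();
    int *nums = NULL;

    *count = 0;
    if (n > 0) {
        nums = (int *)malloc((size_t)n * sizeof *nums);
        if (nums != NULL) {
            omp_get_partition_place_nums(nums);
            *count = n;
        }
    }
    return nums;
}
```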

20 18.3.8 omp_set_affinity_format
21 Summary
22 The omp_set_affinity_format routine sets the affinity format to be used on the device by
23 setting the value of the affinity-format-var ICV.

24 Format
C / C++
25 void omp_set_affinity_format(const char *format);
C / C++
Fortran
26 subroutine omp_set_affinity_format(format)
27 character(len=*),intent(in) :: format
Fortran

1 Binding
2 When called from a sequential part of the program, the binding thread set for an
3 omp_set_affinity_format region is the encountering thread. When called from within any
4 parallel or teams region, the binding thread set (and binding region, if required) for the
5 omp_set_affinity_format region is implementation defined.

6 Effect
7 The effect of the omp_set_affinity_format routine is to copy the character string specified by
8 the format argument into the affinity-format-var ICV on the current device.
9 This routine has the described effect only when called from a sequential part of the program. When
10 called from within a parallel or teams region, the effect of this routine is implementation
11 defined.

12 Cross References
13 • Controlling OpenMP Thread Affinity, see Section 10.1.3
14 • OMP_AFFINITY_FORMAT, see Section 21.2.5
15 • OMP_DISPLAY_AFFINITY, see Section 21.2.4
16 • omp_capture_affinity, see Section 18.3.11
17 • omp_display_affinity, see Section 18.3.10
18 • omp_get_affinity_format, see Section 18.3.9

19 18.3.9 omp_get_affinity_format
20 Summary
21 The omp_get_affinity_format routine returns the value of the affinity-format-var ICV on
22 the device.

23 Format
C / C++
24 size_t omp_get_affinity_format(char *buffer, size_t size);
C / C++
Fortran
25 integer function omp_get_affinity_format(buffer)
26 character(len=*),intent(out) :: buffer
Fortran
27 Binding
28 When called from a sequential part of the program, the binding thread set for an
29 omp_get_affinity_format region is the encountering thread. When called from within any
30 parallel or teams region, the binding thread set (and binding region, if required) for the
31 omp_get_affinity_format region is implementation defined.

1 Effect
C / C++
2 The omp_get_affinity_format routine returns the number of characters in the
3 affinity-format-var ICV on the current device, excluding the terminating null byte ('\0'). If
4 size is non-zero, it writes the value of the affinity-format-var ICV on the current device to buffer
5 followed by a null byte. If the return value is larger than or equal to size, the affinity format specification
6 is truncated, with the terminating null byte stored to buffer[size-1]. If size is zero, nothing is
7 stored and buffer may be NULL.
C / C++
Fortran
8 The omp_get_affinity_format routine returns the number of characters that are required to
9 hold the affinity-format-var ICV on the current device and writes the value of the
10 affinity-format-var ICV on the current device to buffer. If the return value is larger than
11 len(buffer), the affinity format specification is truncated.
Fortran
12 If the buffer argument does not conform to the specified format then the result is implementation
13 defined.

14 Cross References
15 • affinity-format-var ICV, see Table 2.1
16 • parallel directive, see Section 10.1
17 • teams directive, see Section 10.2
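
Because the C/C++ routine returns the required length even when size is zero, a caller can query the length first and then allocate an exact buffer. The C sketch below illustrates this two-call pattern together with omp_set_affinity_format (Section 18.3.8). The helper name set_and_read_affinity_format is hypothetical, and the fallback stubs are an assumption for builds without OpenMP support.

```c
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#else
#include <string.h>
/* Fallback stubs (an assumption of this sketch): keep the format in a
   static buffer and mimic the truncation rule described above. */
static char stub_format[256] = "";
static void omp_set_affinity_format(const char *format)
{
    strncpy(stub_format, format, sizeof stub_format - 1);
    stub_format[sizeof stub_format - 1] = '\0';
}
static size_t omp_get_affinity_format(char *buffer, size_t size)
{
    size_t len = strlen(stub_format);
    if (size > 0) {
        size_t n = len < size - 1 ? len : size - 1;
        memcpy(buffer, stub_format, n);
        buffer[n] = '\0';
    }
    return len;
}
#endif

/* Hypothetical helper: set the affinity format, then read it back into a
   heap-allocated string. Intended to be called from a sequential part of
   the program. Returns NULL if the allocation fails; the caller frees. */
char *set_and_read_affinity_format(const char *format)
{
    omp_set_affinity_format(format);

    /* First call: size is zero, so nothing is stored and NULL is allowed. */
    size_t len = omp_get_affinity_format(NULL, 0);

    char *copy = (char *)malloc(len + 1);
    if (copy == NULL)
        return NULL;

    /* Second call: len + 1 bytes leave room for the terminating null byte. */
    omp_get_affinity_format(copy, len + 1);
    return copy;
}
```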

18 18.3.10 omp_display_affinity
19 Summary
20 The omp_display_affinity routine prints the OpenMP thread affinity information using the
21 format specification provided.
22 Format
C / C++
23 void omp_display_affinity(const char *format);
C / C++
Fortran
24 subroutine omp_display_affinity(format)
25 character(len=*),intent(in) :: format
Fortran
26 Binding
27 The binding thread set for an omp_display_affinity region is the encountering thread.

1 Effect
2 The omp_display_affinity routine prints the thread affinity information of the current
3 thread in the format specified by the format argument, followed by a new-line. If the format is
4 NULL (for C/C++) or a zero-length string (for Fortran and C/C++), the value of the
5 affinity-format-var ICV is used. If the format argument does not conform to the specified format
6 then the result is implementation defined.

7 Cross References
8 • affinity-format-var ICV, see Table 2.1

9 18.3.11 omp_capture_affinity
10 Summary
11 The omp_capture_affinity routine prints the OpenMP thread affinity information into a
12 buffer using the format specification provided.

13 Format
C / C++
14 size_t omp_capture_affinity(
15 char *buffer,
16 size_t size,
17 const char *format
18 );
C / C++
Fortran
19 integer function omp_capture_affinity(buffer,format)
20 character(len=*),intent(out) :: buffer
21 character(len=*),intent(in) :: format
Fortran
22 Binding
23 The binding thread set for an omp_capture_affinity region is the encountering thread.

24 Effect
C / C++
25 The omp_capture_affinity routine returns the number of characters in the entire thread
26 affinity information string excluding the terminating null byte ('\0'). If size is non-zero, it writes
27 the thread affinity information of the current thread in the format specified by the format argument
28 into the character string buffer followed by a null byte. If the return value is larger than or equal to
29 size, the thread affinity information string is truncated, with the terminating null byte stored to
30 buffer[size-1]. If size is zero, nothing is stored and buffer may be NULL. If the format is NULL
31 or a zero-length string, the value of the affinity-format-var ICV is used.
C / C++

Fortran
1 The omp_capture_affinity routine returns the number of characters required to hold the
2 entire thread affinity information string and prints the thread affinity information of the current
3 thread into the character string buffer with the size of len(buffer) in the format specified by
4 the format argument. If the format is a zero-length string, the value of the affinity-format-var ICV
5 is used. If the return value is larger than len(buffer), the thread affinity information string is
6 truncated.
Fortran
7 If the format argument does not conform to the specified format then the result is implementation
8 defined.

9 Cross References
10 • affinity-format-var ICV, see Table 2.1
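
In C/C++, truncation can be detected by comparing the return value with size. The C sketch below tries a small stack buffer first and repeats the call with an exact-size heap buffer only if the first result was truncated. The helper name capture_affinity_string and the 64-byte initial size are hypothetical choices, and the fallback stub is an assumption for builds without OpenMP support.

```c
#include <stdlib.h>
#include <string.h>
#ifdef _OPENMP
#include <omp.h>
#else
#include <stdio.h>
/* Fallback stub (an assumption of this sketch): produce a fixed string
   with the same return-value and truncation behavior as snprintf. */
static size_t omp_capture_affinity(char *buffer, size_t size,
                                   const char *format)
{
    int n;
    (void)format;
    n = snprintf(buffer, size, "OpenMP disabled");
    return n < 0 ? 0 : (size_t)n;
}
#endif

/* Hypothetical helper: return the affinity information of the calling
   thread as a heap-allocated string, or NULL if the allocation fails.
   A NULL format selects the value of the affinity-format-var ICV. */
char *capture_affinity_string(const char *format)
{
    char small[64];
    size_t needed = omp_capture_affinity(small, sizeof small, format);
    char *out = (char *)malloc(needed + 1);

    if (out == NULL)
        return NULL;
    if (needed < sizeof small)
        memcpy(out, small, needed + 1);                 /* fit on the first call */
    else
        omp_capture_affinity(out, needed + 1, format);  /* first call truncated */
    return out;
}
```

The sketch assumes that two consecutive calls by the same thread with the same format yield strings of the same length.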

11 18.4 Teams Region Routines


12 This section describes routines that affect and monitor the league of teams that may execute a
13 teams region.

14 18.4.1 omp_get_num_teams
15 Summary
16 The omp_get_num_teams routine returns the number of initial teams in the current teams
17 region.

18 Format
C / C++
19 int omp_get_num_teams(void);
C / C++
Fortran
20 integer function omp_get_num_teams()
Fortran
21 Binding
22 The binding task set for an omp_get_num_teams region is the generating task.

23 Effect
24 The effect of this routine is to return the number of initial teams in the current teams region. The
25 routine returns 1 if it is called from outside of a teams region.

1 Cross References
2 • teams directive, see Section 10.2

3 18.4.2 omp_get_team_num
4 Summary
5 The omp_get_team_num routine returns the initial team number of the calling thread.
6 Format
C / C++
7 int omp_get_team_num(void);
C / C++
Fortran
8 integer function omp_get_team_num()
Fortran
9 Binding
10 The binding task set for an omp_get_team_num region is the generating task.
11 Effect
12 The omp_get_team_num routine returns the initial team number of the calling thread. The
13 initial team number is an integer between 0 and one less than the value returned by
14 omp_get_num_teams(), inclusive. The routine returns 0 if it is called outside of a teams
15 region.
16 Cross References
17 • omp_get_num_teams, see Section 18.4.1
18 • teams directive, see Section 10.2
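
The range guarantee stated above can be checked with a small C sketch. It is illustrative only: the helper name count_out_of_range_teams is hypothetical, the teams construct on the host device is assumed to be supported by the compiler, and the fallback stubs are an assumption for builds without OpenMP support.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback stubs (an assumption of this sketch): the values that the
   routines return outside of a teams region. */
static int omp_get_num_teams(void) { return 1; }
static int omp_get_team_num(void) { return 0; }
#endif

/* Hypothetical helper: run a teams region and count the initial teams
   whose team number falls outside 0 .. omp_get_num_teams() - 1. */
int count_out_of_range_teams(void)
{
    int bad = 0;

    #pragma omp teams reduction(+: bad)
    {
        int team = omp_get_team_num();
        if (team < 0 || team >= omp_get_num_teams())
            bad += 1;
    }
    return bad;
}
```

A conforming implementation is expected to yield a count of zero. Outside of a teams region, omp_get_num_teams() returns 1 and omp_get_team_num() returns 0.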

19 18.4.3 omp_set_num_teams
20 Summary
21 The omp_set_num_teams routine affects the number of teams to be used for subsequent
22 teams regions that do not specify a num_teams clause, by setting the value of the nteams-var
23 ICV of the current device.
24 Format
C / C++
25 void omp_set_num_teams(int num_teams);
C / C++
Fortran
26 subroutine omp_set_num_teams(num_teams)
27 integer num_teams
Fortran

1 Constraints on Arguments
2 The value of the argument passed to this routine must evaluate to a positive integer, or else the
3 behavior of this routine is implementation defined.

4 Binding
5 The binding task set for an omp_set_num_teams region is the generating task.

6 Effect
7 The effect of this routine is to set the value of the nteams-var ICV of the current device to the value
8 specified in the argument.

9 Restrictions
10 Restrictions to the omp_set_num_teams routine are as follows:
11 • The routine may not be called from within a parallel region that is not the implicit parallel region
12 that surrounds the whole OpenMP program.

13 Cross References
14 • nteams-var ICV, see Table 2.1
15 • num_teams clause, see Section 10.2.1
16 • teams directive, see Section 10.2

17 18.4.4 omp_get_max_teams
18 Summary
19 The omp_get_max_teams routine returns an upper bound on the number of teams that could be
20 created by a teams construct without a num_teams clause that is encountered after execution
21 returns from this routine.

22 Format
C / C++
23 int omp_get_max_teams(void);
C / C++
Fortran
24 integer function omp_get_max_teams()
Fortran
25 Binding
26 The binding task set for an omp_get_max_teams region is the generating task.

1 Effect
2 The value returned by omp_get_max_teams is the value of the nteams-var ICV of the current
3 device. This value is also an upper bound on the number of teams that can be created by a teams
4 construct without a num_teams clause that is encountered after execution returns from this
5 routine.

6 Cross References
7 • nteams-var ICV, see Table 2.1
8 • num_teams clause, see Section 10.2.1
9 • teams directive, see Section 10.2

10 18.4.5 omp_set_teams_thread_limit
11 Summary
12 The omp_set_teams_thread_limit routine defines the maximum number of OpenMP
13 threads that can participate in each contention group created by a teams construct.

14 Format
C / C++
15 void omp_set_teams_thread_limit(int thread_limit);
C / C++
Fortran
16 subroutine omp_set_teams_thread_limit(thread_limit)
17 integer thread_limit
Fortran
18 Constraints on Arguments
19 The value of the argument passed to this routine must evaluate to a positive integer, or else the
20 behavior of this routine is implementation defined.

21 Binding
22 The binding task set for an omp_set_teams_thread_limit region is the generating task.

23 Effect
24 The omp_set_teams_thread_limit routine sets the value of the teams-thread-limit-var
25 ICV to the value of the thread_limit argument. If the value of thread_limit exceeds the number of
26 OpenMP threads that an implementation supports for each contention group created by a teams
27 construct, the value of the teams-thread-limit-var ICV will be set to the number that is supported by
28 the implementation.

1 Restrictions
2 Restrictions to the omp_set_teams_thread_limit routine are as follows:
3 • The routine may not be called from within a parallel region other than the implicit parallel region
4 that surrounds the whole OpenMP program.

5 Cross References
6 • teams directive, see Section 10.2
7 • teams-thread-limit-var ICV, see Table 2.1
8 • thread_limit clause, see Section 13.3

9 18.4.6 omp_get_teams_thread_limit
10 Summary
11 The omp_get_teams_thread_limit routine returns the maximum number of OpenMP
12 threads available to participate in each contention group created by a teams construct.

13 Format
C / C++
14 int omp_get_teams_thread_limit(void);
C / C++
Fortran
15 integer function omp_get_teams_thread_limit()
Fortran
16 Binding
17 The binding task set for an omp_get_teams_thread_limit region is the generating task.

18 Effect
19 The omp_get_teams_thread_limit routine returns the value of the teams-thread-limit-var
20 ICV.

21 Cross References
22 • teams directive, see Section 10.2
23 • teams-thread-limit-var ICV, see Table 2.1

1 18.5 Tasking Routines
2 This section describes routines that pertain to OpenMP explicit tasks.

3 18.5.1 omp_get_max_task_priority
4 Summary
5 The omp_get_max_task_priority routine returns the maximum value that can be specified
6 in the priority clause.

7 Format
C / C++
8 int omp_get_max_task_priority(void);
C / C++
Fortran
9 integer function omp_get_max_task_priority()
Fortran
10 Binding
11 The binding thread set for an omp_get_max_task_priority region is all threads on the
12 device. The effect of executing this routine is not related to any specific region that corresponds to
13 any construct or API routine.
14 Effect
15 The omp_get_max_task_priority routine returns the value of the max-task-priority-var
16 ICV, which determines the maximum value that can be specified in the priority clause.

17 Cross References
18 • max-task-priority-var ICV, see Table 2.1
19 • priority clause, see Section 12.4

20 18.5.2 omp_in_explicit_task
21 Summary
22 The omp_in_explicit_task routine returns the value of the explicit-task-var ICV.

23 Format
C / C++
24 int omp_in_explicit_task(void);
C / C++
Fortran
25 logical function omp_in_explicit_task()
Fortran

1 Binding
2 The binding task set for an omp_in_explicit_task region is the generating task.
3 Effect
4 The omp_in_explicit_task routine returns the value of the explicit-task-var ICV, which
5 indicates whether the encountering region is an explicit task region.
6 Cross References
7 • explicit-task-var ICV, see Table 2.1
8 • task directive, see Section 12.5

9 18.5.3 omp_in_final
10 Summary
11 The omp_in_final routine returns true if the routine is executed in a final task region;
12 otherwise, it returns false.
13 Format
C / C++
14 int omp_in_final(void);
C / C++
Fortran
15 logical function omp_in_final()
Fortran
16 Binding
17 The binding task set for an omp_in_final region is the generating task.
18 Effect
19 omp_in_final returns true if the enclosing task region is final. Otherwise, it returns false.
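
As an illustration, the C sketch below generates one explicit task and records what omp_in_final returns inside it. The helper name in_final_inside_task is hypothetical, and the fallback stub is an assumption for builds without OpenMP support, in which case the pragmas are ignored and the result is always 0.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback stub (an assumption of this sketch): without OpenMP support
   no task region is ever final. */
static int omp_in_final(void) { return 0; }
#endif

/* Hypothetical helper: generate a task whose final clause evaluates to
   make_final and return 1 if omp_in_final reported true inside it. */
int in_final_inside_task(int make_final)
{
    int result = 0;

    #pragma omp task final(make_final) shared(result)
    result = omp_in_final();

    /* Wait for the child task before reading the shared result. */
    #pragma omp taskwait
    return result != 0;
}
```

With OpenMP enabled, in_final_inside_task(1) is expected to return 1, while in_final_inside_task(0) returns 0 when the generating task is not itself final.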

20 18.6 Resource Relinquishing Routines


21 This section describes routines that relinquish resources used by the OpenMP runtime.

22 18.6.1 omp_pause_resource
23 Summary
24 The omp_pause_resource routine allows the runtime to relinquish resources used by OpenMP
25 on the specified device.
26 Format
C / C++
27 int omp_pause_resource(omp_pause_resource_t kind, int device_num);
C / C++

Fortran
1 integer function omp_pause_resource(kind, device_num)
2 integer (kind=omp_pause_resource_kind) kind
3 integer device_num
Fortran
4 Constraints on Arguments
5 The first argument passed to this routine can be one of the valid OpenMP pause kinds, or any
6 implementation-specific pause kind. The C/C++ header file (omp.h) and the Fortran include file
7 (omp_lib.h) and/or Fortran module file (omp_lib) define the valid constants. The valid
8 constants must include the following, which can be extended with implementation-specific values:
C / C++
9 typedef enum omp_pause_resource_t {
10 omp_pause_soft = 1,
11 omp_pause_hard = 2
12 } omp_pause_resource_t;
C / C++
Fortran
13 integer (kind=omp_pause_resource_kind), parameter :: &
14 omp_pause_soft = 1
15 integer (kind=omp_pause_resource_kind), parameter :: &
16 omp_pause_hard = 2
Fortran
17 The second argument passed to this routine indicates the device that will be paused. The
18 device_num parameter must be a conforming device number. If the device number has the value
19 omp_invalid_device, runtime error termination is performed.

20 Binding
21 The binding task set for an omp_pause_resource region is the whole program.

22 Effect
23 The omp_pause_resource routine allows the runtime to relinquish resources used by OpenMP
24 on the specified device.
25 If successful, the omp_pause_hard value results in a hard pause for which the OpenMP state is
26 not guaranteed to persist across the omp_pause_resource call. A hard pause may relinquish
27 any data allocated by OpenMP on a given device, including data allocated by memory routines for
28 that device as well as data present on the device as a result of a declare target directive or
29 target data construct. A hard pause may also relinquish any data associated with a
30 threadprivate directive. When relinquished and when applicable, base language appropriate
31 deallocation/finalization is performed. When relinquished and when applicable, mapped data on a
32 device will not be copied back from the device to the host.

1 If successful, the omp_pause_soft value results in a soft pause for which the OpenMP state is
2 guaranteed to persist across the call, with the exception of any data associated with a
3 threadprivate directive, which may be relinquished across the call. When relinquished and
4 when applicable, base language appropriate deallocation/finalization is performed.
5

6 Note – A hard pause may relinquish more resources, but may resume processing OpenMP regions
7 more slowly. A soft pause allows OpenMP regions to restart more quickly, but may relinquish fewer
8 resources. An OpenMP implementation will reclaim resources as needed for OpenMP regions
9 encountered after the omp_pause_resource region. Since a hard pause may unmap data on the
10 specified device, appropriate data mapping is required before using data on the specified device
11 after the omp_pause_resource region.
12
13 The routine returns zero in case of success, and non-zero otherwise.
14 Tool Callbacks
15 If the tool is not allowed to interact with the specified device after encountering this call, then the
16 runtime must call the tool finalizer for that device.
17 Restrictions
18 Restrictions to the omp_pause_resource routine are as follows:
19 • The omp_pause_resource region may not be nested in any explicit OpenMP region.
20 • The routine may only be called when all explicit tasks have finalized execution.
21 Cross References
22 • Declare Target Directives, see Section 7.8
23 • target data directive, see Section 13.5
24 • threadprivate directive, see Section 5.2

25 18.6.2 omp_pause_resource_all
26 Summary
27 The omp_pause_resource_all routine allows the runtime to relinquish resources used by
28 OpenMP on all devices.
29 Format
C / C++
30 int omp_pause_resource_all(omp_pause_resource_t kind);
C / C++
Fortran
31 integer function omp_pause_resource_all(kind)
32 integer (kind=omp_pause_resource_kind) kind
Fortran

1 Binding
2 The binding task set for an omp_pause_resource_all region is the whole program.
3 Effect
4 The omp_pause_resource_all routine allows the runtime to relinquish resources used by
5 OpenMP on all devices. It is equivalent to calling the omp_pause_resource routine once for
6 each available device, including the host device.
7 The argument kind passed to this routine can be one of the valid OpenMP pause kinds as defined in
8 Section 18.6.1, or any implementation-specific pause kind.
9 Tool Callbacks
10 If the tool is not allowed to interact with a given device after encountering this call, then the
11 runtime must call the tool finalizer for that device.
12 Restrictions
13 Restrictions to the omp_pause_resource_all routine are as follows:
14 • The omp_pause_resource_all region may not be nested in any explicit OpenMP region.
15 • The routine may only be called when all explicit tasks have finalized execution.
16 Cross References
17 • omp_pause_resource, see Section 18.6.1
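
The C sketch below illustrates a soft pause placed between two parallel regions, outside of any explicit OpenMP region as the restrictions require. The helper names count_threads and pause_between_regions are hypothetical, and the fallback definitions are an assumption for builds without OpenMP support. Whether the pause succeeds depends on the implementation, so the sketch reports the status instead of assuming zero.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback definitions (an assumption of this sketch): nothing needs to
   be relinquished, so the stub reports success. */
typedef enum omp_pause_resource_t {
    omp_pause_soft = 1,
    omp_pause_hard = 2
} omp_pause_resource_t;
static int omp_pause_resource_all(omp_pause_resource_t kind)
{
    (void)kind;
    return 0;
}
#endif

/* Hypothetical helper: number of threads that execute a parallel region. */
static int count_threads(void)
{
    int count = 0;

    #pragma omp parallel reduction(+: count)
    count += 1;
    return count;
}

/* Hypothetical helper: run a region, request a soft pause on all devices,
   then run another region. Stores the pause status (zero on success) in
   *pause_status and returns 1 if both regions executed. */
int pause_between_regions(int *pause_status)
{
    int before = count_threads();

    *pause_status = omp_pause_resource_all(omp_pause_soft);

    /* The implementation reclaims resources as needed for this region. */
    int after = count_threads();
    return before > 0 && after > 0;
}
```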

18 18.7 Device Information Routines


19 This section describes routines that pertain to the set of devices that are accessible to an OpenMP
20 program.

21 18.7.1 omp_get_num_procs
22 Summary
23 The omp_get_num_procs routine returns the number of processors available to the device.
24 Format
C / C++
25 int omp_get_num_procs(void);
C / C++
Fortran
26 integer function omp_get_num_procs()
Fortran
27 Binding
28 The binding thread set for an omp_get_num_procs region is all threads on a device. The effect
29 of executing this routine is not related to any specific region corresponding to any construct or API
30 routine.

1 Effect
2 The omp_get_num_procs routine returns the number of processors that are available to the
3 device at the time the routine is called. This value may change between the time that it is
4 determined by the omp_get_num_procs routine and the time that it is read in the calling
5 context due to system actions outside the control of the OpenMP implementation.

6 18.7.2 omp_set_default_device
7 Summary
8 The omp_set_default_device routine controls the default target device by assigning the
9 value of the default-device-var ICV.

10 Format
C / C++
11 void omp_set_default_device(int device_num);
C / C++
Fortran
12 subroutine omp_set_default_device(device_num)
13 integer device_num
Fortran
14 Binding
15 The binding task set for an omp_set_default_device region is the generating task.

16 Effect
17 The effect of this routine is to set the value of the default-device-var ICV of the current task to the
18 value specified in the argument. When called from within a target region the effect of this
19 routine is unspecified.

20 Cross References
21 • default-device-var ICV, see Table 2.1
22 • target directive, see Section 13.8

23 18.7.3 omp_get_default_device
24 Summary
25 The omp_get_default_device routine returns the default target device.

26 Format
C / C++
27 int omp_get_default_device(void);
C / C++

Fortran
1 integer function omp_get_default_device()
Fortran
2 Binding
3 The binding task set for an omp_get_default_device region is the generating task.

4 Effect
5 The omp_get_default_device routine returns the value of the default-device-var ICV of the
6 current task. When called from within a target region the effect of this routine is unspecified.

7 Cross References
8 • default-device-var ICV, see Table 2.1
9 • target directive, see Section 13.8
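
Because default-device-var belongs to the current task, code that changes it temporarily may want to restore the previous value. The C sketch below shows one such save-and-restore pattern, called from host code outside of any target region. The helper name run_with_default_device and its callback parameter are hypothetical, and the fallback stubs are an assumption for builds without OpenMP support.

```c
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback stubs (an assumption of this sketch): a single stored value. */
static int stub_default_device = 0;
static void omp_set_default_device(int device_num)
{
    stub_default_device = device_num;
}
static int omp_get_default_device(void) { return stub_default_device; }
#endif

/* Hypothetical helper: make device_num the default device while work()
   runs, then restore the previous default. Returns the default device
   that was in effect during the call. */
int run_with_default_device(int device_num, void (*work)(void))
{
    int saved = omp_get_default_device();
    int seen;

    omp_set_default_device(device_num);
    seen = omp_get_default_device();
    if (work != NULL)
        work();   /* device constructs without a device clause use the default */
    omp_set_default_device(saved);
    return seen;
}
```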

10 18.7.4 omp_get_num_devices
11 Summary
12 The omp_get_num_devices routine returns the number of non-host devices available for
13 offloading code or data.

14 Format
C / C++
15 int omp_get_num_devices(void);
C / C++
Fortran
16 integer function omp_get_num_devices()
Fortran
17 Binding
18 The binding task set for an omp_get_num_devices region is the generating task.

19 Effect
20 The omp_get_num_devices routine returns the number of available non-host devices onto
21 which code or data may be offloaded. When called from within a target region the effect of this
22 routine is unspecified.

23 Cross References
24 • target directive, see Section 13.8

1 18.7.5 omp_get_device_num
2 Summary
3 The omp_get_device_num routine returns the device number of the device on which the
4 calling thread is executing.

5 Format
C / C++
6 int omp_get_device_num(void);
C / C++
Fortran
7 integer function omp_get_device_num()
Fortran
8 Binding
9 The binding task set for an omp_get_device_num region is the generating task.

10 Effect
11 The omp_get_device_num routine returns the device number of the device on which the
12 calling thread is executing. When called on the host device, it will return the same value as the
13 omp_get_initial_device routine.

14 18.7.6 omp_is_initial_device
15 Summary
16 The omp_is_initial_device routine returns true if the current task is executing on the host
17 device; otherwise, it returns false.

18 Format
C / C++
19 int omp_is_initial_device(void);
C / C++
Fortran
20 logical function omp_is_initial_device()
Fortran
21 Binding
22 The binding task set for an omp_is_initial_device region is the generating task.

23 Effect
24 The effect of this routine is to return true if the current task is executing on the host device;
25 otherwise, it returns false.

1 18.7.7 omp_get_initial_device
2 Summary
3 The omp_get_initial_device routine returns a device number that represents the host
4 device.

5 Format
C / C++
6 int omp_get_initial_device(void);
C / C++
Fortran
7 integer function omp_get_initial_device()
Fortran
8 Binding
9 The binding task set for an omp_get_initial_device region is the generating task.

10 Effect
11 The effect of this routine is to return the device number of the host device. The value of the device
12 number is the value returned by the omp_get_num_devices routine. When called from within
13 a target region the effect of this routine is unspecified.

14 Cross References
15 • target directive, see Section 13.8
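
The relationship between omp_get_num_devices and omp_get_initial_device can be used to classify a device number from host code. The C sketch below is illustrative only: the helper name device_kind and its three labels are hypothetical, and the fallback stubs (no non-host devices) are an assumption for builds without OpenMP support.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback stubs (an assumption of this sketch): no non-host devices, so
   the host device number equals the number of devices, which is 0. */
static int omp_get_num_devices(void) { return 0; }
static int omp_get_initial_device(void) { return 0; }
#endif

/* Hypothetical helper: classify a device number from host code. */
const char *device_kind(int device_num)
{
    if (device_num == omp_get_initial_device())
        return "host";
    if (device_num >= 0 && device_num < omp_get_num_devices())
        return "non-host";
    return "other";
}
```

The host test comes first so that the helper does not depend on the exact value that an implementation returns from omp_get_initial_device.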

16 18.8 Device Memory Routines


17 This section describes routines that support allocation of memory and management of pointers in
18 the data environments of target devices.
19 If the device_num, src_device_num, or dst_device_num argument of a device memory routine has
20 the value omp_invalid_device, runtime error termination is performed.

21 18.8.1 omp_target_alloc
22 Summary
23 The omp_target_alloc routine allocates memory in a device data environment and returns a
24 device pointer to that memory.

25 Format
C / C++
void *omp_target_alloc(size_t size, int device_num);
C / C++

Fortran
type(c_ptr) function omp_target_alloc(size, device_num) bind(c)
use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t, c_int
integer(c_size_t), value :: size
integer(c_int), value :: device_num
Fortran
5 Constraints on Arguments
6 The device_num argument must be a conforming device number.

7 Binding
8 The binding task set for an omp_target_alloc region is the generating task, which is the target
9 task generated by the call to the omp_target_alloc routine.

10 Effect
11 The omp_target_alloc routine returns a device pointer that references the device address of a
12 storage location of size bytes. The storage location is dynamically allocated in the device data
13 environment of the device specified by device_num. The omp_target_alloc routine executes
14 as if part of a target task that is generated by the call to the routine and that is an included task. The
15 omp_target_alloc routine returns NULL if it cannot dynamically allocate the memory in the
16 device data environment. The device pointer returned by omp_target_alloc can be used in an
17 is_device_ptr clause (see Section 5.4.7).
Fortran
18 The omp_target_alloc routine requires an explicit interface and so might not be provided in
19 omp_lib.h.
Fortran
20 Execution Model Events
21 The target-data-allocation-begin event occurs before a thread initiates a data allocation on a target
22 device.
23 The target-data-allocation-end event occurs after a thread initiates a data allocation on a target
24 device.

25 Tool Callbacks
26 A thread dispatches a registered ompt_callback_target_data_op_emi callback with
27 ompt_scope_begin as its endpoint argument for each occurrence of a
28 target-data-allocation-begin event in that thread. Similarly, a thread dispatches a registered
29 ompt_callback_target_data_op_emi callback with ompt_scope_end as its endpoint
30 argument for each occurrence of a target-data-allocation-end event in that thread. These callbacks
31 have type signature ompt_callback_target_data_op_emi_t.
32 A thread dispatches a registered ompt_callback_target_data_op callback for each
33 occurrence of a target-data-allocation-end event in that thread. The callback occurs in the context
34 of the target task and has type signature ompt_callback_target_data_op_t.

1 Restrictions
2 Restrictions to the omp_target_alloc routine are as follows.
3 • Freeing the storage returned by omp_target_alloc with any routine other than
4 omp_target_free results in unspecified behavior.
5 • When called from within a target region the effect is unspecified.
C / C++
6 • Unless the unified_address clause appears on a requires directive in the compilation
7 unit, pointer arithmetic is not supported on the device pointer returned by
8 omp_target_alloc.
C / C++
9 Cross References
10 • omp_target_free, see Section 18.8.2
11 • ompt_callback_target_data_op_emi_t and
12 ompt_callback_target_data_op_t, see Section 19.5.2.25
13 • is_device_ptr clause, see Section 5.4.7
14 • target directive, see Section 13.8

15 18.8.2 omp_target_free
16 Summary
17 The omp_target_free routine frees the device memory allocated by the
18 omp_target_alloc routine.

19 Format
C / C++
void omp_target_free(void *device_ptr, int device_num);
C / C++
Fortran
subroutine omp_target_free(device_ptr, device_num) bind(c)
use, intrinsic :: iso_c_binding, only : c_ptr, c_int
type(c_ptr), value :: device_ptr
integer(c_int), value :: device_num
Fortran
25 Constraints on Arguments
26 A program that calls omp_target_free with a non-null pointer that does not have a value
27 returned from omp_target_alloc is non-conforming. The device_num argument must be a
28 conforming device number.

1 Binding
2 The binding task set for an omp_target_free region is the generating task, which is the target
3 task generated by the call to the omp_target_free routine.

4 Effect
5 The omp_target_free routine frees the memory in the device data environment associated
6 with device_ptr. If device_ptr is NULL, the operation is ignored. The omp_target_free
7 routine executes as if part of a target task that is generated by the call to the routine and that is an
8 included task. Synchronization must be inserted to ensure that all accesses to device_ptr are
9 completed before the call to omp_target_free.
Fortran
10 The omp_target_free routine requires an explicit interface and so might not be provided in
11 omp_lib.h.
Fortran
12 Execution Model Events
13 The target-data-free-begin event occurs before a thread initiates a data free on a target device.
14 The target-data-free-end event occurs after a thread initiates a data free on a target device.

15 Tool Callbacks
16 A thread dispatches a registered ompt_callback_target_data_op_emi callback with
17 ompt_scope_begin as its endpoint argument for each occurrence of a target-data-free-begin
18 event in that thread. Similarly, a thread dispatches a registered
19 ompt_callback_target_data_op_emi callback with ompt_scope_end as its endpoint
20 argument for each occurrence of a target-data-free-end event in that thread. These callbacks have
21 type signature ompt_callback_target_data_op_emi_t.
22 A thread dispatches a registered ompt_callback_target_data_op callback for each
23 occurrence of a target-data-free-begin event in that thread. The callback occurs in the context of the
24 target task and has type signature ompt_callback_target_data_op_t.

25 Restrictions
26 Restrictions to the omp_target_free routine are as follows.
27 • When called from within a target region the effect is unspecified.

28 Cross References
29 • omp_target_alloc, see Section 18.8.1
30 • ompt_callback_target_data_op_emi_t and
31 ompt_callback_target_data_op_t, see Section 19.5.2.25
32 • target directive, see Section 13.8

1 18.8.3 omp_target_is_present
2 Summary
3 The omp_target_is_present routine tests whether a host pointer refers to storage that is
4 mapped to a given device.

5 Format
C / C++
int omp_target_is_present(const void *ptr, int device_num);
C / C++
Fortran
integer(c_int) function omp_target_is_present(ptr, device_num) &
    bind(c)
use, intrinsic :: iso_c_binding, only : c_ptr, c_int
type(c_ptr), value :: ptr
integer(c_int), value :: device_num
Fortran
12 Constraints on Arguments
13 The value of ptr must be a valid host pointer or NULL. The device_num argument must be a
14 conforming device number.

15 Binding
16 The binding task set for an omp_target_is_present region is the encountering task.

17 Effect
18 The omp_target_is_present routine returns true if device_num refers to the host device or
19 if ptr refers to storage that has corresponding storage in the device data environment of device
20 device_num. Otherwise, the routine returns false.
Fortran
21 The omp_target_is_present routine requires an explicit interface and so might not be
22 provided in omp_lib.h.
Fortran
23 Restrictions
24 Restrictions to the omp_target_is_present routine are as follows.
25 • When called from within a target region the effect is unspecified.

26 Cross References
27 • target directive, see Section 13.8

1 18.8.4 omp_target_is_accessible
2 Summary
3 The omp_target_is_accessible routine tests whether host memory is accessible from a
4 given device.

5 Format
C / C++
int omp_target_is_accessible(const void *ptr, size_t size,
                             int device_num);
C / C++
Fortran
integer(c_int) function omp_target_is_accessible( &
    ptr, size, device_num) bind(c)
use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t, c_int
type(c_ptr), value :: ptr
integer(c_size_t), value :: size
integer(c_int), value :: device_num
Fortran
14 Constraints on Arguments
15 The value of ptr must be a valid host pointer or NULL. The device_num argument must be a
16 conforming device number.

17 Binding
18 The binding task set for an omp_target_is_accessible region is the encountering task.

19 Effect
20 This routine returns true if the storage of size bytes starting at the address given by ptr is accessible
21 from device device_num. Otherwise, it returns false.
Fortran
22 The omp_target_is_accessible routine requires an explicit interface and so might not be
23 provided in omp_lib.h.
Fortran
24 Restrictions
25 Restrictions to the omp_target_is_accessible routine are as follows.
26 • When called from within a target region the effect is unspecified.

27 Cross References
28 • target directive, see Section 13.8

1 18.8.5 omp_target_memcpy
2 Summary
3 The omp_target_memcpy routine copies memory between any combination of host and device
4 pointers.
5 Format
C / C++
int omp_target_memcpy(
  void *dst,
  const void *src,
  size_t length,
  size_t dst_offset,
  size_t src_offset,
  int dst_device_num,
  int src_device_num
);
C / C++
Fortran
integer(c_int) function omp_target_memcpy(dst, src, length, &
    dst_offset, src_offset, dst_device_num, src_device_num) bind(c)
use, intrinsic :: iso_c_binding, only : c_ptr, c_int, c_size_t
type(c_ptr), value :: dst, src
integer(c_size_t), value :: length, dst_offset, src_offset
integer(c_int), value :: dst_device_num, src_device_num
Fortran
21 Constraints on Arguments
22 Each device pointer specified must be valid for the device on the same side of the copy. The
23 dst_device_num and src_device_num arguments must be conforming device numbers.
24 Binding
25 The binding task set for an omp_target_memcpy region is the generating task, which is the
26 target task generated by the call to the omp_target_memcpy routine.
27 Effect
28 This routine copies length bytes of memory at offset src_offset from src in the device data
29 environment of device src_device_num to dst starting at offset dst_offset in the device data
30 environment of device dst_device_num. The omp_target_memcpy routine executes as if part of
31 a target task that is generated by the call to the routine and that is an included task. The return value
32 is zero on success and non-zero on failure. This routine contains a task scheduling point.
Fortran
33 The omp_target_memcpy routine requires an explicit interface and so might not be provided in
34 omp_lib.h.
Fortran

1 Execution Model Events
2 The target-data-op-begin event occurs before a thread initiates a data transfer in the
3 omp_target_memcpy region.
4 The target-data-op-end event occurs after a thread initiates a data transfer in the
5 omp_target_memcpy region.

6 Tool Callbacks
7 A thread dispatches a registered ompt_callback_target_data_op_emi callback with
8 ompt_scope_begin as its endpoint argument for each occurrence of a target-data-op-begin
9 event in that thread. Similarly, a thread dispatches a registered
10 ompt_callback_target_data_op_emi callback with ompt_scope_end as its endpoint
11 argument for each occurrence of a target-data-op-end event in that thread. These callbacks have
12 type signature ompt_callback_target_data_op_emi_t.
13 A thread dispatches a registered ompt_callback_target_data_op callback for each
14 occurrence of a target-data-op-end event in that thread. The callback occurs in the context of the
15 target task and has type signature ompt_callback_target_data_op_t.

16 Restrictions
17 Restrictions to the omp_target_memcpy routine are as follows.
18 • When called from within a target region the effect is unspecified.

19 Cross References
20 • ompt_callback_target_data_op_emi_t and
21 ompt_callback_target_data_op_t, see Section 19.5.2.25
22 • target directive, see Section 13.8

23 18.8.6 omp_target_memcpy_rect
24 Summary
25 The omp_target_memcpy_rect routine copies a rectangular subvolume from a
26 multi-dimensional array to another multi-dimensional array. The omp_target_memcpy_rect
27 routine performs a copy between any combination of host and device pointers.

28 Format
C / C++
int omp_target_memcpy_rect(
  void *dst,
  const void *src,
  size_t element_size,
  int num_dims,
  const size_t *volume,
  const size_t *dst_offsets,
  const size_t *src_offsets,
  const size_t *dst_dimensions,
  const size_t *src_dimensions,
  int dst_device_num,
  int src_device_num
);
C / C++
Fortran
integer(c_int) function omp_target_memcpy_rect(dst, src, element_size, &
    num_dims, volume, dst_offsets, src_offsets, dst_dimensions, &
    src_dimensions, dst_device_num, src_device_num) bind(c)
use, intrinsic :: iso_c_binding, only : c_ptr, c_int, c_size_t
type(c_ptr), value :: dst, src
integer(c_size_t), value :: element_size
integer(c_int), value :: num_dims, dst_device_num, src_device_num
integer(c_size_t), intent(in) :: volume(*), dst_offsets(*), &
    src_offsets(*), dst_dimensions(*), src_dimensions(*)
Fortran
16 Constraints on Arguments
17 Each device pointer specified must be valid for the device on the same side of the copy. The
18 dst_device_num and src_device_num arguments must be conforming device numbers. The length
19 of the offset and dimension arrays must be at least the value of num_dims. The value of num_dims
20 must be between 1 and the implementation-defined limit, which must be at least three.
Fortran
21 Because the interface binds directly to a C language function the function assumes C memory
22 ordering.
Fortran
23 Binding
24 The binding task set for an omp_target_memcpy_rect region is the generating task, which is
25 the target task generated by the call to the omp_target_memcpy_rect routine.

26 Effect
This routine copies a rectangular subvolume of src, in the device data environment of device
src_device_num, to dst, in the device data environment of device dst_device_num. The volume is
specified in terms of the size of an element, number of dimensions, and constant arrays of length
num_dims. The maximum number of dimensions supported is at least three; support for higher
dimensionality is implementation defined. The volume array specifies the length, in number of
elements, to copy in each dimension from src to dst. The dst_offsets (src_offsets) parameter
specifies the offset, in number of elements, from the origin of dst (src) in each dimension. The
dst_dimensions (src_dimensions) parameter specifies the length of each dimension of dst (src).

1 The omp_target_memcpy_rect routine executes as if part of a target task that is generated by
2 the call to the routine and that is an included task. The routine returns zero if successful.
3 Otherwise, it returns a non-zero value. The routine contains a task scheduling point.
An application can determine the maximum number of dimensions supported by an implementation
by passing NULL for both dst and src. In that case, the routine returns the number of dimensions
supported by the implementation for the specified device numbers and no copy operation is performed.
Fortran
7 The omp_target_memcpy_rect routine requires an explicit interface and so might not be
8 provided in omp_lib.h.
Fortran
9 Execution Model Events
10 The target-data-op-begin event occurs before a thread initiates a data transfer in the
11 omp_target_memcpy_rect region.
12 The target-data-op-end event occurs after a thread initiates a data transfer in the
13 omp_target_memcpy_rect region.
14 Tool Callbacks
15 A thread dispatches a registered ompt_callback_target_data_op_emi callback with
16 ompt_scope_begin as its endpoint argument for each occurrence of a target-data-op-begin
17 event in that thread. Similarly, a thread dispatches a registered
18 ompt_callback_target_data_op_emi callback with ompt_scope_end as its endpoint
19 argument for each occurrence of a target-data-op-end event in that thread. These callbacks have
20 type signature ompt_callback_target_data_op_emi_t.
21 A thread dispatches a registered ompt_callback_target_data_op callback for each
22 occurrence of a target-data-op-end event in that thread. The callback occurs in the context of the
23 target task and has type signature ompt_callback_target_data_op_t.
24 Restrictions
25 Restrictions to the omp_target_memcpy_rect routine are as follows.
26 • When called from within a target region the effect is unspecified.
27 Cross References
28 • ompt_callback_target_data_op_emi_t and
29 ompt_callback_target_data_op_t, see Section 19.5.2.25
30 • target directive, see Section 13.8
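
The following host-only sketch models the index arithmetic that the Effect section describes, assuming C (row-major) ordering with the last dimension contiguous. It does not call the OpenMP runtime. The names rect_copy_model, rect_copy_demo and MODEL_MAX_DIMS are hypothetical, and the bounds check is a safety guard of the sketch, not a behavior that this specification requires.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_DIMS 8   /* assumption of this sketch, not an OpenMP limit */

/* Hypothetical host-only model of the rectangular copy: every element at
   index tuple i inside volume is copied from src[src_offsets + i] to
   dst[dst_offsets + i]. Returns 0 on success and nonzero on bad input. */
int rect_copy_model(void *dst, const void *src, size_t element_size,
                    int num_dims, const size_t *volume,
                    const size_t *dst_offsets, const size_t *src_offsets,
                    const size_t *dst_dimensions, const size_t *src_dimensions)
{
    if (num_dims < 1 || num_dims > MODEL_MAX_DIMS)
        return 1;

    size_t idx[MODEL_MAX_DIMS] = {0};
    size_t total = 1;
    for (int d = 0; d < num_dims; ++d) {
        if (dst_offsets[d] + volume[d] > dst_dimensions[d] ||
            src_offsets[d] + volume[d] > src_dimensions[d])
            return 1;                       /* subvolume does not fit */
        total *= volume[d];
    }

    for (size_t n = 0; n < total; ++n) {
        /* Linearize the index tuple in row-major order on both sides. */
        size_t di = 0, si = 0;
        for (int d = 0; d < num_dims; ++d) {
            di = di * dst_dimensions[d] + dst_offsets[d] + idx[d];
            si = si * src_dimensions[d] + src_offsets[d] + idx[d];
        }
        memcpy((char *)dst + di * element_size,
               (const char *)src + si * element_size, element_size);

        /* Advance the index tuple; the last dimension varies fastest. */
        for (int d = num_dims - 1; d >= 0; --d) {
            if (++idx[d] < volume[d])
                break;
            idx[d] = 0;
        }
    }
    return 0;
}

/* Copies the 2x2 block at (1,1) of a 3x4 array to (0,1) of a 2x3 array.
   Returns 0 if the destination holds the expected values. */
int rect_copy_demo(void) {
    int src[3][4], dst[2][3] = {{0}};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 4; ++j)
            src[i][j] = 4 * i + j;

    const size_t volume[2] = {2, 2};
    const size_t dst_offsets[2] = {0, 1}, src_offsets[2] = {1, 1};
    const size_t dst_dims[2] = {2, 3}, src_dims[2] = {3, 4};

    if (rect_copy_model(dst, src, sizeof(int), 2, volume, dst_offsets,
                        src_offsets, dst_dims, src_dims) != 0)
        return 1;
    return !(dst[0][0] == 0 && dst[0][1] == 5 && dst[0][2] == 6 &&
             dst[1][0] == 0 && dst[1][1] == 9 && dst[1][2] == 10);
}
```

The argument arrays in rect_copy_demo have the same shape as those expected by omp_target_memcpy_rect, which additionally takes the destination and source device numbers.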

31 18.8.7 omp_target_memcpy_async
32 Summary
33 The omp_target_memcpy_async routine asynchronously performs a copy between any
34 combination of host and device pointers.

1 Format
C / C++
int omp_target_memcpy_async(
  void *dst,
  const void *src,
  size_t length,
  size_t dst_offset,
  size_t src_offset,
  int dst_device_num,
  int src_device_num,
  int depobj_count,
  omp_depend_t *depobj_list
);
C / C++
Fortran
integer(c_int) function omp_target_memcpy_async(dst, src, length, &
    dst_offset, src_offset, dst_device_num, src_device_num, &
    depobj_count, depobj_list) bind(c)
use, intrinsic :: iso_c_binding, only : c_ptr, c_int, c_size_t
type(c_ptr), value :: dst, src
integer(c_size_t), value :: length, dst_offset, src_offset
integer(c_int), value :: dst_device_num, src_device_num, depobj_count
integer(omp_depend_kind), optional :: depobj_list(*)
Fortran
21 Constraints on Arguments
22 Each device pointer specified must be valid for the device on the same side of the copy. The
23 dst_device_num and src_device_num arguments must be conforming device numbers.

24 Binding
25 The binding task set for an omp_target_memcpy_async region is the generating task, which
26 is the target task generated by the call to the omp_target_memcpy_async routine.

27 Effect
28 This routine performs an asynchronous memory copy where length bytes of memory at offset
29 src_offset from src in the device data environment of device src_device_num are copied to dst
30 starting at offset dst_offset in the device data environment of device dst_device_num. The
31 omp_target_memcpy_async routine executes as if part of a target task that is generated by the
32 call to the routine and for which execution may be deferred. Task dependences are expressed with
33 zero or more OpenMP depend objects. The dependences are specified by passing the number of
34 depend objects followed by an array of the objects. The generated target task is not a dependent task
35 if the program passes in a count of zero for depobj_count. depobj_list is ignored if the value of
36 depobj_count is zero.

1 The routine returns zero if successful. Otherwise, it returns a non-zero value. The routine contains
2 a task scheduling point.
Fortran
3 The omp_target_memcpy_async routine requires an explicit interface and so might not be
4 provided in omp_lib.h.
Fortran
5 Execution Model Events
6 The target-data-op-begin event occurs before a thread initiates a data transfer in the
7 omp_target_memcpy_async region.
8 The target-data-op-end event occurs after a thread initiates a data transfer in the
9 omp_target_memcpy_async region.

10 Tool Callbacks
11 A thread dispatches a registered ompt_callback_target_data_op_emi callback with
12 ompt_scope_begin as its endpoint argument for each occurrence of a target-data-op-begin
13 event in that thread. Similarly, a thread dispatches a registered
14 ompt_callback_target_data_op_emi callback with ompt_scope_end as its endpoint
15 argument for each occurrence of a target-data-op-end event in that thread. These callbacks have
16 type signature ompt_callback_target_data_op_emi_t.
17 A thread dispatches a registered ompt_callback_target_data_op callback for each
18 occurrence of a target-data-op-end event in that thread. The callback occurs in the context of the
19 target task and has type signature ompt_callback_target_data_op_t.

20 Restrictions
21 Restrictions to the omp_target_memcpy_async routine are as follows.
22 • When called from within a target region the effect is unspecified.

23 Cross References
24 • Depend Objects, see Section 15.9.2
25 • ompt_callback_target_data_op_emi_t and
26 ompt_callback_target_data_op_t, see Section 19.5.2.25
27 • target directive, see Section 13.8

28 18.8.8 omp_target_memcpy_rect_async
29 Summary
The omp_target_memcpy_rect_async routine asynchronously copies a rectangular subvolume
from a multi-dimensional array to another multi-dimensional array, between any combination of
host and device pointers.

1 Format
C / C++
int omp_target_memcpy_rect_async(
  void *dst,
  const void *src,
  size_t element_size,
  int num_dims,
  const size_t *volume,
  const size_t *dst_offsets,
  const size_t *src_offsets,
  const size_t *dst_dimensions,
  const size_t *src_dimensions,
  int dst_device_num,
  int src_device_num,
  int depobj_count,
  omp_depend_t *depobj_list
);
C / C++
Fortran
integer(c_int) function omp_target_memcpy_rect_async(dst, src, &
    element_size, num_dims, volume, dst_offsets, src_offsets, &
    dst_dimensions, src_dimensions, dst_device_num, src_device_num, &
    depobj_count, depobj_list) bind(c)
use, intrinsic :: iso_c_binding, only : c_ptr, c_int, c_size_t
type(c_ptr), value :: dst, src
integer(c_size_t), value :: element_size
integer(c_int), value :: num_dims, dst_device_num, src_device_num, &
    depobj_count
integer(c_size_t), intent(in) :: volume(*), dst_offsets(*), &
    src_offsets(*), dst_dimensions(*), src_dimensions(*)
integer(omp_depend_kind), optional :: depobj_list(*)
Fortran
29 Constraints on Arguments
30 Each device pointer specified must be valid for the device on the same side of the copy. The
31 dst_device_num and src_device_num arguments must be conforming device numbers. The length
32 of the offset and dimension arrays must be at least the value of num_dims. The value of num_dims
33 must be between 1 and the implementation-defined limit, which must be at least three.
Fortran
34 Because the interface binds directly to a C language function the function assumes C memory
35 ordering.
Fortran

1 Binding
2 The binding task set for an omp_target_memcpy_rect_async region is the generating task,
3 which is the target task generated by the call to the omp_target_memcpy_rect_async
4 routine.
5 Effect
This routine copies a rectangular subvolume of src, in the device data environment of device
src_device_num, to dst, in the device data environment of device dst_device_num. The volume is
specified in terms of the size of an element, number of dimensions, and constant arrays of length
num_dims. The maximum number of dimensions supported is at least three; support for higher
dimensionality is implementation defined. The volume array specifies the length, in number of
elements, to copy in each dimension from src to dst. The dst_offsets (src_offsets) parameter
specifies the offset, in number of elements, from the origin of dst (src) in each dimension. The
dst_dimensions (src_dimensions) parameter specifies the length of each dimension of dst (src).
14 The omp_target_memcpy_rect_async routine executes as if part of a target task that is
15 generated by the call to the routine and for which execution may be deferred. Task dependences are
16 expressed with zero or more OpenMP depend objects. The dependences are specified by passing
17 the number of depend objects followed by an array of the objects. The generated target task is not a
18 dependent task if the program passes in a count of zero for depobj_count. depobj_list is ignored if
19 the value of depobj_count is zero.
20 The routine returns zero if successful. Otherwise, it returns a non-zero value. The routine contains
21 a task scheduling point.
An application can determine the maximum number of dimensions supported by an implementation
by passing NULL for both dst and src. In that case, the routine returns the number of dimensions
supported by the implementation for the specified device numbers and no copy operation is performed.
Fortran
25 The omp_target_memcpy_rect_async routine requires an explicit interface and so might
26 not be provided in omp_lib.h.
Fortran
27 Execution Model Events
28 The target-data-op-begin event occurs before a thread initiates a data transfer in the
29 omp_target_memcpy_rect_async region.
30 The target-data-op-end event occurs after a thread initiates a data transfer in the
31 omp_target_memcpy_rect_async region.
32 Tool Callbacks
33 A thread dispatches a registered ompt_callback_target_data_op_emi callback with
34 ompt_scope_begin as its endpoint argument for each occurrence of a target-data-op-begin
35 event in that thread. Similarly, a thread dispatches a registered
36 ompt_callback_target_data_op_emi callback with ompt_scope_end as its endpoint
37 argument for each occurrence of a target-data-op-end event in that thread. These callbacks have
38 type signature ompt_callback_target_data_op_emi_t.

1 A thread dispatches a registered ompt_callback_target_data_op callback for each
2 occurrence of a target-data-op-end event in that thread. The callback occurs in the context of the
3 target task and has type signature ompt_callback_target_data_op_t.

4 Restrictions
5 Restrictions to the omp_target_memcpy_rect_async routine are as follows.
6 • When called from within a target region the effect is unspecified.

7 Cross References
8 • Depend Objects, see Section 15.9.2
9 • ompt_callback_target_data_op_emi_t and
10 ompt_callback_target_data_op_t, see Section 19.5.2.25
11 • target directive, see Section 13.8

12 18.8.9 omp_target_associate_ptr
13 Summary
14 The omp_target_associate_ptr routine maps a device pointer, which may be returned
15 from omp_target_alloc or implementation-defined runtime routines, to a host pointer.

16 Format
C / C++
int omp_target_associate_ptr(
  const void *host_ptr,
  const void *device_ptr,
  size_t size,
  size_t device_offset,
  int device_num
);
C / C++
Fortran
integer(c_int) function omp_target_associate_ptr(host_ptr, &
    device_ptr, size, device_offset, device_num) bind(c)
use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t, c_int
type(c_ptr), value :: host_ptr, device_ptr
integer(c_size_t), value :: size, device_offset
integer(c_int), value :: device_num
Fortran
30 Constraints on Arguments
The value of device_ptr must be a valid pointer to device memory for the device denoted by
the value of device_num. The device_num argument must be a conforming device number.

1 Binding
2 The binding task set for an omp_target_associate_ptr region is the generating task, which
3 is the target task generated by the call to the omp_target_associate_ptr routine.

4 Effect
The omp_target_associate_ptr routine associates a device pointer in the device data
environment of device device_num with a host pointer such that when the host pointer appears in a
subsequent map clause, the associated device pointer is used as the target for data motion
associated with that host pointer. The device_offset parameter specifies the offset into device_ptr
that is used as the base address for the device side of the mapping. The reference count of the
resulting mapping will be infinite. After being successfully associated, the buffer to which the
device pointer points is invalidated and accessing data directly through the device pointer results in
unspecified behavior. The pointer can be retrieved for other uses by using the
omp_target_disassociate_ptr routine to disassociate it.
14 The omp_target_associate_ptr routine executes as if part of a target task that is generated
15 by the call to the routine and that is an included task. The routine returns zero if successful.
16 Otherwise it returns a non-zero value.
17 Only one device buffer can be associated with a given host pointer value and device number pair.
18 Attempting to associate a second buffer will return non-zero. Associating the same pair of pointers
19 on the same device with the same offset has no effect and returns zero. Associating pointers that
20 share underlying storage will result in unspecified behavior. The omp_target_is_present
21 function can be used to test whether a given host pointer has a corresponding variable in the device
22 data environment.
Fortran
23 The omp_target_associate_ptr routine requires an explicit interface and so might not be
24 provided in omp_lib.h.
Fortran
25 Execution Model Events
26 The target-data-associate event occurs before a thread initiates a device pointer association on a
27 target device.

28 Tool Callbacks
29 A thread dispatches a registered ompt_callback_target_data_op callback, or a registered
30 ompt_callback_target_data_op_emi callback with ompt_scope_beginend as its
31 endpoint argument for each occurrence of a target-data-associate event in that thread. These
32 callbacks have type signature ompt_callback_target_data_op_t or
33 ompt_callback_target_data_op_emi_t, respectively.

34 Restrictions
35 Restrictions to the omp_target_associate_ptr routine are as follows.
36 • When called from within a target region the effect is unspecified.

1 Cross References
2 • omp_target_alloc, see Section 18.8.1
3 • omp_target_disassociate_ptr, see Section 18.8.10
4 • omp_target_is_present, see Section 18.8.3
5 • ompt_callback_target_data_op_emi_t and
6 ompt_callback_target_data_op_t, see Section 19.5.2.25
7 • target directive, see Section 13.8

8 18.8.10 omp_target_disassociate_ptr
9 Summary
The omp_target_disassociate_ptr routine removes the associated pointer for a given device
from a host pointer.

12 Format
C / C++
int omp_target_disassociate_ptr(const void *ptr, int device_num);
C / C++
Fortran
integer(c_int) function omp_target_disassociate_ptr(ptr, &
    device_num) bind(c)
use, intrinsic :: iso_c_binding, only : c_ptr, c_int
type(c_ptr), value :: ptr
integer(c_int), value :: device_num
Fortran
19 Constraints on Arguments
20 The device_num argument must be a conforming device number.

21 Binding
22 The binding task set for an omp_target_disassociate_ptr region is the generating task,
23 which is the target task generated by the call to the omp_target_disassociate_ptr routine.

24 Effect
The omp_target_disassociate_ptr routine removes the associated device data on device
device_num from the presence table for host pointer ptr. A call to this routine on a pointer that is
not NULL and does not have associated data on the given device results in unspecified behavior.
The reference count of the mapping is reduced to zero, regardless of its current value. The
omp_target_disassociate_ptr routine executes as if part of a target task that is generated
by the call to the routine and that is an included task. The routine returns zero if successful.
Otherwise it returns a non-zero value. After a call to omp_target_disassociate_ptr, the
contents of the device buffer are invalidated.

Fortran
1 The omp_target_disassociate_ptr routine requires an explicit interface and so might not
2 be provided in omp_lib.h.
Fortran
3 Execution Model Events
4 The target-data-disassociate event occurs before a thread initiates a device pointer disassociation
5 on a target device.

6 Tool Callbacks
A thread dispatches a registered ompt_callback_target_data_op callback, or a registered
ompt_callback_target_data_op_emi callback with ompt_scope_beginend as its
endpoint argument, for each occurrence of a target-data-disassociate event in that thread. These
callbacks have type signature ompt_callback_target_data_op_t or
ompt_callback_target_data_op_emi_t, respectively.

12 Restrictions
Restrictions to the omp_target_disassociate_ptr routine are as follows:
14 • When called from within a target region the effect is unspecified.

15 Cross References
16 • ompt_callback_target_data_op_emi_t and
17 ompt_callback_target_data_op_t, see Section 19.5.2.25
18 • target directive, see Section 13.8

18.8.11 omp_get_mapped_ptr
20 Summary
21 The omp_get_mapped_ptr routine returns the device pointer that is associated with a host
22 pointer for a given device.
23 Format
C / C++
void *omp_get_mapped_ptr(const void *ptr, int device_num);
C / C++
Fortran
type(c_ptr) function omp_get_mapped_ptr(ptr, &
    device_num) bind(c)
use, intrinsic :: iso_c_binding, only : c_ptr, c_int
type(c_ptr), value :: ptr
integer(c_int), value :: device_num
Fortran

1 Constraints on Arguments
2 The device_num argument must be a conforming device number.
3 Binding
4 The binding task set for an omp_get_mapped_ptr region is the encountering task.
5 Effect
6 The omp_get_mapped_ptr routine returns the associated device pointer on device device_num.
7 A call to this routine for a pointer that is not NULL and does not have an associated pointer on the
8 given device will return NULL. The routine returns NULL if unsuccessful. Otherwise it returns the
9 device pointer, which is ptr if device_num is the value returned by
10 omp_get_initial_device().
Fortran
11 The omp_get_mapped_ptr routine requires an explicit interface and so might not be provided
12 in omp_lib.h.
Fortran
13 Execution Model Events
14 No events are associated with this routine.
15 Restrictions
Restrictions to the omp_get_mapped_ptr routine are as follows:
17 • When called from within a target region the effect is unspecified.

18 Cross References
19 • omp_get_initial_device, see Section 18.7.7

18.9 Lock Routines


21 The OpenMP runtime library includes a set of general-purpose lock routines that can be used for
22 synchronization. These general-purpose lock routines operate on OpenMP locks that are
23 represented by OpenMP lock variables. OpenMP lock variables must be accessed only through the
24 routines described in this section; programs that otherwise access OpenMP lock variables are
25 non-conforming.
26 An OpenMP lock can be in one of the following states: uninitialized; unlocked; or locked. If a lock
27 is in the unlocked state, a task can set the lock, which changes its state to locked. The task that sets
28 the lock is then said to own the lock. A task that owns a lock can unset that lock, returning it to the
29 unlocked state. A program in which a task unsets a lock that is owned by another task is
30 non-conforming.
Two types of locks are supported: simple locks and nestable locks. A nestable lock can be set
multiple times by the same task before being unset; a simple lock cannot be set if it is already
owned by the task trying to set it. Simple lock variables are associated with simple locks and can
only be passed to simple lock routines. Nestable lock variables are associated with nestable locks
and can only be passed to nestable lock routines.
3 Each type of lock can also have a synchronization hint that contains information about the intended
4 usage of the lock by the application code. The effect of the hint is implementation defined. An
5 OpenMP implementation can use this hint to select a usage-specific lock, but hints do not change
6 the mutual exclusion semantics of locks. A conforming implementation can safely ignore the hint.
7 Constraints on the state and ownership of the lock accessed by each of the lock routines are
8 described with the routine. If these constraints are not met, the behavior of the routine is
9 unspecified.
10 The OpenMP lock routines access a lock variable such that they always read and update the most
11 current value of the lock variable. An OpenMP program does not need to include explicit flush
12 directives to ensure that the lock variable’s value is consistent among different tasks.

13 Binding
14 The binding thread set for all lock routine regions is all threads in the contention group. As a
15 consequence, for each OpenMP lock, the lock routine effects relate to all tasks that call the routines,
16 without regard to which teams in the contention group the threads that are executing the tasks
17 belong.

18 Simple Lock Routines


C / C++
19 The type omp_lock_t represents a simple lock. For the following routines, a simple lock variable
20 must be of omp_lock_t type. All simple lock routines require an argument that is a pointer to a
21 variable of type omp_lock_t.
C / C++
Fortran
22 For the following routines, a simple lock variable must be an integer variable of
23 kind=omp_lock_kind.
Fortran
24 The simple lock routines are as follows:
25 • The omp_init_lock routine initializes a simple lock;
26 • The omp_init_lock_with_hint routine initializes a simple lock and attaches a hint to it;
27 • The omp_destroy_lock routine uninitializes a simple lock;
28 • The omp_set_lock routine waits until a simple lock is available and then sets it;
29 • The omp_unset_lock routine unsets a simple lock; and
30 • The omp_test_lock routine tests a simple lock and sets it if it is available.

1 Nestable Lock Routines
C / C++
2 The type omp_nest_lock_t represents a nestable lock. For the following routines, a nestable
3 lock variable must be of omp_nest_lock_t type. All nestable lock routines require an
4 argument that is a pointer to a variable of type omp_nest_lock_t.
C / C++
Fortran
5 For the following routines, a nestable lock variable must be an integer variable of
6 kind=omp_nest_lock_kind.
Fortran
7 The nestable lock routines are as follows:
8 • The omp_init_nest_lock routine initializes a nestable lock;
9 • The omp_init_nest_lock_with_hint routine initializes a nestable lock and attaches a
10 hint to it;
11 • The omp_destroy_nest_lock routine uninitializes a nestable lock;
12 • The omp_set_nest_lock routine waits until a nestable lock is available and then sets it;
13 • The omp_unset_nest_lock routine unsets a nestable lock; and
14 • The omp_test_nest_lock routine tests a nestable lock and sets it if it is available.

15 Restrictions
16 Restrictions to OpenMP lock routines are as follows:
17 • The use of the same OpenMP lock in different contention groups results in unspecified behavior.

18.9.1 omp_init_lock and omp_init_nest_lock


19 Summary
20 These routines initialize an OpenMP lock without a hint.

21 Format
C / C++
void omp_init_lock(omp_lock_t *lock);
void omp_init_nest_lock(omp_nest_lock_t *lock);
C / C++
Fortran
subroutine omp_init_lock(svar)
integer (kind=omp_lock_kind) svar

subroutine omp_init_nest_lock(nvar)
integer (kind=omp_nest_lock_kind) nvar
Fortran

1 Constraints on Arguments
2 A program that accesses a lock that is not in the uninitialized state through either routine is
3 non-conforming.

4 Effect
5 The effect of these routines is to initialize the lock to the unlocked state; that is, no task owns the
6 lock. In addition, the nesting count for a nestable lock is set to zero.

7 Execution Model Events


8 The lock-init event occurs in a thread that executes an omp_init_lock region after initialization
9 of the lock, but before it finishes the region. The nest-lock-init event occurs in a thread that executes
10 an omp_init_nest_lock region after initialization of the lock, but before it finishes the region.

11 Tool Callbacks
12 A thread dispatches a registered ompt_callback_lock_init callback with
13 omp_sync_hint_none as the hint argument and ompt_mutex_lock as the kind argument
14 for each occurrence of a lock-init event in that thread. Similarly, a thread dispatches a registered
15 ompt_callback_lock_init callback with omp_sync_hint_none as the hint argument
16 and ompt_mutex_nest_lock as the kind argument for each occurrence of a nest-lock-init
17 event in that thread. These callbacks have the type signature
18 ompt_callback_mutex_acquire_t and occur in the task that encounters the routine.

19 Cross References
20 • ompt_callback_mutex_acquire_t, see Section 19.5.2.14

18.9.2 omp_init_lock_with_hint and omp_init_nest_lock_with_hint
23 Summary
24 These routines initialize an OpenMP lock with a hint. The effect of the hint is
25 implementation-defined. The OpenMP implementation can ignore the hint without changing
26 program semantics.

27 Format
C / C++
void omp_init_lock_with_hint(
  omp_lock_t *lock,
  omp_sync_hint_t hint
);
void omp_init_nest_lock_with_hint(
  omp_nest_lock_t *lock,
  omp_sync_hint_t hint
);
C / C++

Fortran
subroutine omp_init_lock_with_hint(svar, hint)
integer (kind=omp_lock_kind) svar
integer (kind=omp_sync_hint_kind) hint

subroutine omp_init_nest_lock_with_hint(nvar, hint)
integer (kind=omp_nest_lock_kind) nvar
integer (kind=omp_sync_hint_kind) hint
Fortran
8 Constraints on Arguments
9 A program that accesses a lock that is not in the uninitialized state through either routine is
10 non-conforming. The second argument passed to these routines (hint) is a hint as described in
11 Section 15.1.

12 Effect
13 The effect of these routines is to initialize the lock to the unlocked state and, optionally, to choose a
14 specific lock implementation based on the hint. After initialization no task owns the lock. In
15 addition, the nesting count for a nestable lock is set to zero.

16 Execution Model Events


The lock-init-with-hint event occurs in a thread that executes an omp_init_lock_with_hint
region after initialization of the lock, but before it finishes the region. The nest-lock-init-with-hint
event occurs in a thread that executes an omp_init_nest_lock_with_hint region after
initialization of the lock, but before it finishes the region.

21 Tool Callbacks
22 A thread dispatches a registered ompt_callback_lock_init callback with the same value
23 for its hint argument as the hint argument of the call to omp_init_lock_with_hint and
24 ompt_mutex_lock as the kind argument for each occurrence of a lock-init-with-hint event in
25 that thread. Similarly, a thread dispatches a registered ompt_callback_lock_init callback
26 with the same value for its hint argument as the hint argument of the call to
27 omp_init_nest_lock_with_hint and ompt_mutex_nest_lock as the kind argument
28 for each occurrence of a nest-lock-init-with-hint event in that thread. These callbacks have the type
29 signature ompt_callback_mutex_acquire_t and occur in the task that encounters the
30 routine.

31 Cross References
32 • Synchronization Hints, see Section 15.1
33 • ompt_callback_mutex_acquire_t, see Section 19.5.2.14

18.9.3 omp_destroy_lock and omp_destroy_nest_lock
2 Summary
3 These routines ensure that the OpenMP lock is uninitialized.

4 Format
C / C++
void omp_destroy_lock(omp_lock_t *lock);
void omp_destroy_nest_lock(omp_nest_lock_t *lock);
C / C++
Fortran
subroutine omp_destroy_lock(svar)
integer (kind=omp_lock_kind) svar

subroutine omp_destroy_nest_lock(nvar)
integer (kind=omp_nest_lock_kind) nvar
Fortran
12 Constraints on Arguments
13 A program that accesses a lock that is not in the unlocked state through either routine is
14 non-conforming.

15 Effect
16 The effect of these routines is to change the state of the lock to uninitialized.

17 Execution Model Events


18 The lock-destroy event occurs in a thread that executes an omp_destroy_lock region before it
19 finishes the region. The nest-lock-destroy event occurs in a thread that executes an
20 omp_destroy_nest_lock region before it finishes the region.

21 Tool Callbacks
22 A thread dispatches a registered ompt_callback_lock_destroy callback with
23 ompt_mutex_lock as the kind argument for each occurrence of a lock-destroy event in that
24 thread. Similarly, a thread dispatches a registered ompt_callback_lock_destroy callback
25 with ompt_mutex_nest_lock as the kind argument for each occurrence of a nest-lock-destroy
26 event in that thread. These callbacks have the type signature ompt_callback_mutex_t and
27 occur in the task that encounters the routine.

28 Cross References
29 • ompt_callback_mutex_t, see Section 19.5.2.15

18.9.4 omp_set_lock and omp_set_nest_lock
2 Summary
3 These routines provide a means of setting an OpenMP lock. The calling task region behaves as if it
4 was suspended until the lock can be set by this task.

5 Format
C / C++
void omp_set_lock(omp_lock_t *lock);
void omp_set_nest_lock(omp_nest_lock_t *lock);
C / C++
Fortran
subroutine omp_set_lock(svar)
integer (kind=omp_lock_kind) svar

subroutine omp_set_nest_lock(nvar)
integer (kind=omp_nest_lock_kind) nvar
Fortran
13 Constraints on Arguments
14 A program that accesses a lock that is in the uninitialized state through either routine is
15 non-conforming. A simple lock accessed by omp_set_lock that is in the locked state must not
16 be owned by the task that contains the call or deadlock will result.

17 Effect
18 Each of these routines has an effect equivalent to suspension of the task that is executing the routine
19 until the specified lock is available.

Note – The semantics of these routines is specified as if they serialize execution of the region
guarded by the lock. However, implementations may implement them in other ways provided that
the isolation properties are respected so that the actual execution delivers a result that could arise
from some serialization.

26 A simple lock is available if it is unlocked. Ownership of the lock is granted to the task that
27 executes the routine. A nestable lock is available if it is unlocked or if it is already owned by the
28 task that executes the routine. The task that executes the routine is granted, or retains, ownership of
29 the lock, and the nesting count for the lock is incremented.

1 Execution Model Events
2 The lock-acquire event occurs in a thread that executes an omp_set_lock region before the
3 associated lock is requested. The nest-lock-acquire event occurs in a thread that executes an
4 omp_set_nest_lock region before the associated lock is requested.
5 The lock-acquired event occurs in a thread that executes an omp_set_lock region after it
6 acquires the associated lock but before it finishes the region. The nest-lock-acquired event occurs in
7 a thread that executes an omp_set_nest_lock region if the thread did not already own the
8 lock, after it acquires the associated lock but before it finishes the region.
9 The nest-lock-owned event occurs in a thread when it already owns the lock and executes an
10 omp_set_nest_lock region. The event occurs after the nesting count is incremented but
11 before the thread finishes the region.
12 Tool Callbacks
13 A thread dispatches a registered ompt_callback_mutex_acquire callback for each
14 occurrence of a lock-acquire or nest-lock-acquire event in that thread. This callback has the type
15 signature ompt_callback_mutex_acquire_t.
16 A thread dispatches a registered ompt_callback_mutex_acquired callback for each
17 occurrence of a lock-acquired or nest-lock-acquired event in that thread. This callback has the type
18 signature ompt_callback_mutex_t.
19 A thread dispatches a registered ompt_callback_nest_lock callback with
20 ompt_scope_begin as its endpoint argument for each occurrence of a nest-lock-owned event in
21 that thread. This callback has the type signature ompt_callback_nest_lock_t.
22 The above callbacks occur in the task that encounters the lock function. The kind argument of these
23 callbacks is ompt_mutex_lock when the events arise from an omp_set_lock region while it
24 is ompt_mutex_nest_lock when the events arise from an omp_set_nest_lock region.
25 Cross References
26 • ompt_callback_mutex_acquire_t, see Section 19.5.2.14
27 • ompt_callback_mutex_t, see Section 19.5.2.15
28 • ompt_callback_nest_lock_t, see Section 19.5.2.16

18.9.5 omp_unset_lock and omp_unset_nest_lock


30 Summary
31 These routines provide the means of unsetting an OpenMP lock.
32 Format
C / C++
void omp_unset_lock(omp_lock_t *lock);
void omp_unset_nest_lock(omp_nest_lock_t *lock);
C / C++

Fortran
subroutine omp_unset_lock(svar)
integer (kind=omp_lock_kind) svar

subroutine omp_unset_nest_lock(nvar)
integer (kind=omp_nest_lock_kind) nvar
Fortran
6 Constraints on Arguments
7 A program that accesses a lock that is not in the locked state or that is not owned by the task that
8 contains the call through either routine is non-conforming.

9 Effect
10 For a simple lock, the omp_unset_lock routine causes the lock to become unlocked. For a
11 nestable lock, the omp_unset_nest_lock routine decrements the nesting count, and causes the
12 lock to become unlocked if the resulting nesting count is zero. For either routine, if the lock
13 becomes unlocked, and if one or more task regions were effectively suspended because the lock was
14 unavailable, the effect is that one task is chosen and given ownership of the lock.

15 Execution Model Events


16 The lock-release event occurs in a thread that executes an omp_unset_lock region after it
17 releases the associated lock but before it finishes the region. The nest-lock-release event occurs in a
18 thread that executes an omp_unset_nest_lock region after it releases the associated lock but
19 before it finishes the region.
20 The nest-lock-held event occurs in a thread that executes an omp_unset_nest_lock region
21 before it finishes the region when the thread still owns the lock after the nesting count is
22 decremented.

23 Tool Callbacks
24 A thread dispatches a registered ompt_callback_mutex_released callback with
25 ompt_mutex_lock as the kind argument for each occurrence of a lock-release event in that
26 thread. Similarly, a thread dispatches a registered ompt_callback_mutex_released
27 callback with ompt_mutex_nest_lock as the kind argument for each occurrence of a
28 nest-lock-release event in that thread. These callbacks have the type signature
29 ompt_callback_mutex_t and occur in the task that encounters the routine.
30 A thread dispatches a registered ompt_callback_nest_lock callback with
31 ompt_scope_end as its endpoint argument for each occurrence of a nest-lock-held event in that
32 thread. This callback has the type signature ompt_callback_nest_lock_t.

33 Cross References
34 • ompt_callback_mutex_t, see Section 19.5.2.15
35 • ompt_callback_nest_lock_t, see Section 19.5.2.16

18.9.6 omp_test_lock and omp_test_nest_lock
2 Summary
3 These routines attempt to set an OpenMP lock but do not suspend execution of the task that
4 executes the routine.

5 Format
C / C++
int omp_test_lock(omp_lock_t *lock);
int omp_test_nest_lock(omp_nest_lock_t *lock);
C / C++
Fortran
logical function omp_test_lock(svar)
integer (kind=omp_lock_kind) svar

integer function omp_test_nest_lock(nvar)
integer (kind=omp_nest_lock_kind) nvar
Fortran
13 Constraints on Arguments
14 A program that accesses a lock that is in the uninitialized state through either routine is
15 non-conforming. The behavior is unspecified if a simple lock accessed by omp_test_lock is in
16 the locked state and is owned by the task that contains the call.

17 Effect
18 These routines attempt to set a lock in the same manner as omp_set_lock and
19 omp_set_nest_lock, except that they do not suspend execution of the task that executes the
20 routine. For a simple lock, the omp_test_lock routine returns true if the lock is successfully
21 set; otherwise, it returns false. For a nestable lock, the omp_test_nest_lock routine returns
22 the new nesting count if the lock is successfully set; otherwise, it returns zero.

23 Execution Model Events


24 The lock-test event occurs in a thread that executes an omp_test_lock region before the
25 associated lock is tested. The nest-lock-test event occurs in a thread that executes an
26 omp_test_nest_lock region before the associated lock is tested.
27 The lock-test-acquired event occurs in a thread that executes an omp_test_lock region before it
28 finishes the region if the associated lock was acquired. The nest-lock-test-acquired event occurs in a
29 thread that executes an omp_test_nest_lock region before it finishes the region if the
30 associated lock was acquired and the thread did not already own the lock.
31 The nest-lock-owned event occurs in a thread that executes an omp_test_nest_lock region
32 before it finishes the region after the nesting count is incremented if the thread already owned the
33 lock.

1 Tool Callbacks
2 A thread dispatches a registered ompt_callback_mutex_acquire callback for each
3 occurrence of a lock-test or nest-lock-test event in that thread. This callback has the type signature
4 ompt_callback_mutex_acquire_t.
5 A thread dispatches a registered ompt_callback_mutex_acquired callback for each
6 occurrence of a lock-test-acquired or nest-lock-test-acquired event in that thread. This callback has
7 the type signature ompt_callback_mutex_t.
8 A thread dispatches a registered ompt_callback_nest_lock callback with
9 ompt_scope_begin as its endpoint argument for each occurrence of a nest-lock-owned event in
10 that thread. This callback has the type signature ompt_callback_nest_lock_t.
11 The above callbacks occur in the task that encounters the lock function. The kind argument of these
12 callbacks is ompt_mutex_test_lock when the events arise from an omp_test_lock
13 region while it is ompt_mutex_test_nest_lock when the events arise from an
14 omp_test_nest_lock region.

15 Cross References
16 • ompt_callback_mutex_acquire_t, see Section 19.5.2.14
17 • ompt_callback_mutex_t, see Section 19.5.2.15
18 • ompt_callback_nest_lock_t, see Section 19.5.2.16

18.10 Timing Routines


20 This section describes routines that support a portable wall clock timer.

18.10.1 omp_get_wtime
22 Summary
23 The omp_get_wtime routine returns elapsed wall clock time in seconds.

24 Format
C / C++
double omp_get_wtime(void);
C / C++
Fortran
double precision function omp_get_wtime()
Fortran
27 Binding
28 The binding thread set for an omp_get_wtime region is the encountering thread. The routine’s
29 return value is not guaranteed to be consistent across any set of threads.

1 Effect
2 The omp_get_wtime routine returns a value equal to the elapsed wall clock time in seconds
3 since some time-in-the-past. The actual time-in-the-past is arbitrary, but it is guaranteed not to
4 change during the execution of the application program. The time returned is a per-thread time, so
5 it is not required to be globally consistent across all threads that participate in an application.

18.10.2 omp_get_wtick
7 Summary
8 The omp_get_wtick routine returns the precision of the timer used by omp_get_wtime.

9 Format
C / C++
double omp_get_wtick(void);
C / C++
Fortran
double precision function omp_get_wtick()
Fortran
12 Binding
13 The binding thread set for an omp_get_wtick region is the encountering thread. The routine’s
14 return value is not guaranteed to be consistent across any set of threads.

15 Effect
16 The omp_get_wtick routine returns a value equal to the number of seconds between successive
17 clock ticks of the timer used by omp_get_wtime.

18.11 Event Routine


19 This section describes a routine that supports OpenMP event objects.

20 Binding
21 The binding thread set for all event routine regions is the encountering thread.

18.11.1 omp_fulfill_event
23 Summary
24 This routine fulfills and destroys an OpenMP event.

1 Format
C / C++
void omp_fulfill_event(omp_event_handle_t event);
C / C++
Fortran
subroutine omp_fulfill_event(event)
integer (kind=omp_event_handle_kind) event
Fortran
5 Constraints on Arguments
6 A program that calls this routine on an event that was already fulfilled is non-conforming. A
7 program that calls this routine with an event handle that was not created by the detach clause is
8 non-conforming.

9 Effect
10 The effect of this routine is to fulfill the event associated with the event handle argument. The effect
11 of fulfilling the event will depend on how the event was created. The event is destroyed and cannot
12 be accessed after calling this routine, and the event handle becomes unassociated with any event.

13 Execution Model Events


14 The task-fulfill event occurs in a thread that executes an omp_fulfill_event region before the
15 event is fulfilled if the OpenMP event object was created by a detach clause on a task.

16 Tool Callbacks
17 A thread dispatches a registered ompt_callback_task_schedule callback with NULL as its
18 next_task_data argument while the argument prior_task_data binds to the detachable task for each
19 occurrence of a task-fulfill event. If the task-fulfill event occurs before the detachable task finished
20 the execution of the associated structured-block, the callback has
21 ompt_task_early_fulfill as its prior_task_status argument; otherwise the callback has
22 ompt_task_late_fulfill as its prior_task_status argument. This callback has type
23 signature ompt_callback_task_schedule_t.

24 Restrictions
25 Restrictions to the omp_fulfill_event routine are as follows:
• The event handle passed to the routine must have been created by a thread in the same device as
the thread that invoked the routine.

28 Cross References
29 • ompt_callback_task_schedule_t, see Section 19.5.2.10
30 • detach clause, see Section 12.5.2

TABLE 18.1: Required Values of the omp_interop_property_t enum Type

Enum Name                    Contexts    Name            Property
omp_ipr_fr_id = -1           all         fr_id           An intptr_t value that represents the foreign runtime id of context
omp_ipr_fr_name = -2         all         fr_name         C string value that represents the foreign runtime name of context
omp_ipr_vendor = -3          all         vendor          An intptr_t that represents the vendor of context
omp_ipr_vendor_name = -4     all         vendor_name     C string value that represents the vendor of context
omp_ipr_device_num = -5      all         device_num      The OpenMP device ID for the device in the range 0 to omp_get_num_devices() inclusive
omp_ipr_platform = -6        target      platform        A foreign platform handle usually spanning multiple devices
omp_ipr_device = -7          target      device          A foreign device handle
omp_ipr_device_context = -8  target      device_context  A handle to an instance of a foreign device context
omp_ipr_targetsync = -9      targetsync  targetsync      A handle to a synchronization object of a foreign execution context
omp_ipr_first = -9

C / C++

18.12 Interoperability Routines


2 The interoperability routines provide mechanisms to inspect the properties associated with an
3 omp_interop_t object. Such objects may be initialized, destroyed or otherwise used by an
4 interop construct. Additionally, an omp_interop_t object can be initialized to
5 omp_interop_none, which is defined to be zero. An omp_interop_t object may only be
6 accessed or modified through OpenMP directives and API routines.
7 An omp_interop_t object can be copied without affecting, or copying, the underlying state.
8 Destruction of an omp_interop_t object destroys the state to which all copies of the object refer.
9 OpenMP reserves all negative values for properties, as listed in Table 18.1; implementation-defined
10 properties may use zero and positive values. The special property, omp_ipr_first, will always
11 have the lowest property value, which may change in future versions of this specification. Valid
12 values and types for the properties that Table 18.1 lists are specified in the OpenMP Additional
13 Definitions document or are implementation defined unless otherwise specified.

TABLE 18.2: Required Values for the omp_interop_rc_t enum Type

Enum Name Description


omp_irc_no_value = 1 Parameters valid, no meaningful value available
omp_irc_success = 0 Successful, value is usable
omp_irc_empty = -1 The object provided is equal to omp_interop_none
omp_irc_out_of_range = -2 Property ID is out of range, see Table 18.1
omp_irc_type_int = -3 Property type is int; use omp_get_interop_int
omp_irc_type_ptr = -4 Property type is pointer; use omp_get_interop_ptr
omp_irc_type_str = -5 Property type is string; use omp_get_interop_str
omp_irc_other = -6 Other error; use omp_get_interop_rc_desc

1 Table 18.2 lists the return codes used by routines that take an int* ret_code argument.

2 Binding
3 The binding task set for all interoperability routine regions is the generating task.
C / C++

C / C++

18.12.1 omp_get_num_interop_properties
5 Summary
6 The omp_get_num_interop_properties routine retrieves the number of
7 implementation-defined properties available for an omp_interop_t object.

8 Format
int omp_get_num_interop_properties(const omp_interop_t interop);

10 Effect
11 The omp_get_num_interop_properties routine returns the number of
12 implementation-defined properties available for interop. The total number of properties available
13 for interop is the returned value minus omp_ipr_first.
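
The arithmetic behind the total property count can be worked through in plain C. In this sketch the constant IPR_FIRST is a local copy of the omp_ipr_first value listed in Table 18.1 and the function names are hypothetical; since omp_ipr_first may change in future versions, real code would use the constant from omp.h instead. The range check mirrors the out of range condition stated for the property retrieval routines.

```c
/* Local copy of omp_ipr_first from Table 18.1 (OpenMP 5.2), used here only
 * so that the arithmetic can be shown without an OpenMP runtime. */
enum { IPR_FIRST = -9 };

/* Total number of properties, given the implementation-defined count that
 * omp_get_num_interop_properties returns. */
int total_interop_properties(int num_impl_defined)
{
    return num_impl_defined - IPR_FIRST;
}

/* Nonzero if property_id lies in the valid range
 * [omp_ipr_first, num_impl_defined). */
int interop_property_in_range(int property_id, int num_impl_defined)
{
    return property_id >= IPR_FIRST && property_id < num_impl_defined;
}
```

For example, an implementation that reports two implementation-defined properties exposes 2 - (-9) = 11 properties in total, with identifiers from -9 to 1.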
C / C++

C / C++

18.12.2 omp_get_interop_int
15 Summary
16 The omp_get_interop_int routine retrieves an integer property from an omp_interop_t
17 object.

1 Format
omp_intptr_t omp_get_interop_int(const omp_interop_t interop,
                                 omp_interop_property_t property_id,
                                 int *ret_code);

5 Effect
6 The omp_get_interop_int routine returns the requested integer property, if available, and
7 zero if an error occurs or no value is available. If the interop is omp_interop_none, an empty
8 error occurs. If the property_id is less than omp_ipr_first or greater than or equal to
9 omp_get_num_interop_properties(interop), an out of range error occurs. If the
10 requested property value is not convertible into an integer value, a type error occurs.
11 If a non-null pointer is passed to ret_code, an omp_interop_rc_t value that indicates the
12 return code is stored in the object to which ret_code points. If an error occurred, the stored value
13 will be negative and it will match the error as defined in Table 18.2. On success, zero will be stored.
14 If no error occurred but no meaningful value can be returned, omp_irc_no_value, which is
15 one, will be stored.

16 Restrictions
17 Restrictions to the omp_get_interop_int routine are as follows:
18 • The behavior of the routine is unspecified if an invalid omp_interop_t object is provided.

19 Cross References
20 • omp_get_num_interop_properties, see Section 18.12.1
C / C++
C / C++

18.12.3 omp_get_interop_ptr
22 Summary
23 The omp_get_interop_ptr routine retrieves a pointer property from an omp_interop_t
24 object.

25 Format
void* omp_get_interop_ptr(const omp_interop_t interop,
                          omp_interop_property_t property_id,
                          int *ret_code);

29 Effect
30 The omp_get_interop_ptr routine returns the requested pointer property, if available, and
31 NULL if an error occurs or no value is available. If the interop is omp_interop_none, an empty
32 error occurs. If the property_id is less than omp_ipr_first or greater than or equal to
33 omp_get_num_interop_properties(interop), an out of range error occurs. If the
34 requested property value is not convertible into a pointer value, a type error occurs.

If a non-null pointer is passed to ret_code, an omp_interop_rc_t value that indicates the
return code is stored in the object to which ret_code points. If an error occurred, the stored
value will be negative and it will match the error as defined in Table 18.2. On success, zero will be
stored. If no error occurred but no meaningful value can be returned, omp_irc_no_value,
which is one, will be stored.

6 Restrictions
7 Restrictions to the omp_get_interop_ptr routine are as follows:
8 • The behavior of the routine is unspecified if an invalid omp_interop_t object is provided.
9 • Memory referenced by the pointer returned by the omp_get_interop_ptr routine is
10 managed by the OpenMP implementation and should not be freed or modified.

11 Cross References
12 • omp_get_num_interop_properties, see Section 18.12.1
C / C++

C / C++

13 18.12.4 omp_get_interop_str
14 Summary
15 The omp_get_interop_str routine retrieves a string property from an omp_interop_t
16 object.

17 Format
18 const char* omp_get_interop_str(const omp_interop_t interop,
19 omp_interop_property_t property_id,
20 int *ret_code);

21 Effect
22 The omp_get_interop_str routine returns the requested string property as a C string, if
23 available, and NULL if an error occurs or no value is available. If the interop is
24 omp_interop_none, an empty error occurs. If the property_id is less than omp_ipr_first
25 or greater than or equal to omp_get_num_interop_properties(interop), an out of range
26 error occurs. If the requested property value is not convertible into a string value, a type error
27 occurs.
28 If a non-null pointer is passed to ret_code, an omp_interop_rc_t value that indicates the
29 return code is stored in the object to which ret_code points. If an error occurred, the stored
30 value will be negative and it will match the error as defined in Table 18.2. On success, zero will be
31 stored. If no error occurred but no meaningful value can be returned, omp_irc_no_value,
32 which is one, will be stored.

1 Restrictions
2 Restrictions to the omp_get_interop_str routine are as follows:
3 • The behavior of the routine is unspecified if an invalid omp_interop_t object is provided.
4 • Memory referenced by the pointer returned by the omp_get_interop_str routine is
5 managed by the OpenMP implementation and should not be freed or modified.

6 Cross References
7 • omp_get_num_interop_properties, see Section 18.12.1
C / C++
C / C++

8 18.12.5 omp_get_interop_name
9 Summary
10 The omp_get_interop_name routine retrieves a property name from an omp_interop_t
11 object.

12 Format
const char* omp_get_interop_name(const omp_interop_t interop,
                                 omp_interop_property_t property_id);

16 Effect
17 The omp_get_interop_name routine returns the name of the property identified by
18 property_id as a C string. Property names for non-implementation defined properties are listed in
19 Table 18.1. If the property_id is less than omp_ipr_first or greater than or equal to
20 omp_get_num_interop_properties(interop), NULL is returned.
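The range rule that the interoperability routines apply to property_id can be sketched as a small predicate. The function name property_id_in_range is hypothetical; its first parameter stands for omp_ipr_first and its count parameter stands for the value returned by omp_get_num_interop_properties(interop).

```c
#include <stdbool.h>

/* Hypothetical helper: 'first' stands for omp_ipr_first and 'count' for the
   value returned by omp_get_num_interop_properties(interop). */
static bool property_id_in_range(int property_id, int first, int count)
{
    /* Valid identifiers satisfy first <= property_id < count. */
    return property_id >= first && property_id < count;
}
```

Identifiers outside this range cause NULL to be returned by omp_get_interop_name, so a check of this kind lets a caller tell a missing name apart from an invalid identifier.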

21 Restrictions
22 Restrictions to the omp_get_interop_name routine are as follows:
23 • The behavior of the routine is unspecified if an invalid omp_interop_t object is provided.
24 • Memory referenced by the pointer returned by the omp_get_interop_name routine is
25 managed by the OpenMP implementation and should not be freed or modified.

26 Cross References
27 • omp_get_num_interop_properties, see Section 18.12.1
C / C++

C / C++

1 18.12.6 omp_get_interop_type_desc
2 Summary
3 The omp_get_interop_type_desc routine retrieves a description of the type of a property
4 associated with an omp_interop_t object.

5 Format
const char* omp_get_interop_type_desc(const omp_interop_t interop,
                                      omp_interop_property_t property_id);

9 Effect
10 The omp_get_interop_type_desc routine returns a C string that describes the type of the
11 property identified by property_id in human-readable form. The string may contain a valid C type
12 declaration possibly followed by a description or name of the type. If interop has the value
13 omp_interop_none, NULL is returned. If the property_id is less than omp_ipr_first or
14 greater than or equal to omp_get_num_interop_properties(interop), NULL is returned.

15 Restrictions
16 Restrictions to the omp_get_interop_type_desc routine are as follows:
17 • The behavior of the routine is unspecified if an invalid omp_interop_t object is provided.
18 • Memory referenced by the pointer returned from the omp_get_interop_type_desc
19 routine is managed by the OpenMP implementation and should not be freed or modified.

20 Cross References
21 • omp_get_num_interop_properties, see Section 18.12.1
C / C++
C / C++

22 18.12.7 omp_get_interop_rc_desc
23 Summary
24 The omp_get_interop_rc_desc routine retrieves a description of the return code associated
25 with an omp_interop_t object.

26 Format
27 const char* omp_get_interop_rc_desc(const omp_interop_t interop,
28 omp_interop_rc_t ret_code);

29 Effect
30 The omp_get_interop_rc_desc routine returns a C string that describes the return code
31 ret_code in human-readable form.

1 Restrictions
2 Restrictions to the omp_get_interop_rc_desc routine are as follows:
3 • The behavior of the routine is unspecified if an invalid omp_interop_t object is provided or if ret_code was not
4 last written by an interoperability routine invoked with the omp_interop_t object interop.
5 • Memory referenced by the pointer returned by the omp_get_interop_rc_desc routine is
6 managed by the OpenMP implementation and should not be freed or modified.
C / C++

7 18.13 Memory Management Routines


8 This section describes routines that support memory management on the current device. Instances
9 of memory management types must be accessed only through the routines described in this section;
10 programs that otherwise access instances of these types are non-conforming.

11 18.13.1 Memory Management Types


12 The following type definitions are used by the memory management routines:
C / C++
typedef enum omp_alloctrait_key_t {
  omp_atk_sync_hint = 1,
  omp_atk_alignment = 2,
  omp_atk_access = 3,
  omp_atk_pool_size = 4,
  omp_atk_fallback = 5,
  omp_atk_fb_data = 6,
  omp_atk_pinned = 7,
  omp_atk_partition = 8
} omp_alloctrait_key_t;

typedef enum omp_alloctrait_value_t {
  omp_atv_false = 0,
  omp_atv_true = 1,
  omp_atv_contended = 3,
  omp_atv_uncontended = 4,
  omp_atv_serialized = 5,
  omp_atv_sequential = omp_atv_serialized, // (deprecated)
  omp_atv_private = 6,
  omp_atv_all = 7,
  omp_atv_thread = 8,
  omp_atv_pteam = 9,
  omp_atv_cgroup = 10,
  omp_atv_default_mem_fb = 11,
  omp_atv_null_fb = 12,
  omp_atv_abort_fb = 13,
  omp_atv_allocator_fb = 14,
  omp_atv_environment = 15,
  omp_atv_nearest = 16,
  omp_atv_blocked = 17,
  omp_atv_interleaved = 18
} omp_alloctrait_value_t;

typedef struct omp_alloctrait_t {
  omp_alloctrait_key_t key;
  omp_uintptr_t value;
} omp_alloctrait_t;
C / C++
Fortran
15 integer(kind=omp_alloctrait_key_kind), &
16 parameter :: omp_atk_sync_hint = 1
17 integer(kind=omp_alloctrait_key_kind), &
18 parameter :: omp_atk_alignment = 2
19 integer(kind=omp_alloctrait_key_kind), &
20 parameter :: omp_atk_access = 3
21 integer(kind=omp_alloctrait_key_kind), &
22 parameter :: omp_atk_pool_size = 4
23 integer(kind=omp_alloctrait_key_kind), &
24 parameter :: omp_atk_fallback = 5
25 integer(kind=omp_alloctrait_key_kind), &
26 parameter :: omp_atk_fb_data = 6
27 integer(kind=omp_alloctrait_key_kind), &
28 parameter :: omp_atk_pinned = 7
29 integer(kind=omp_alloctrait_key_kind), &
30 parameter :: omp_atk_partition = 8
31
32 integer(kind=omp_alloctrait_val_kind), &
33 parameter :: omp_atv_default = -1
34 integer(kind=omp_alloctrait_val_kind), &
35 parameter :: omp_atv_false = 0
36 integer(kind=omp_alloctrait_val_kind), &
37 parameter :: omp_atv_true = 1
38 integer(kind=omp_alloctrait_val_kind), &
39 parameter :: omp_atv_contended = 3
40 integer(kind=omp_alloctrait_val_kind), &
41 parameter :: omp_atv_uncontended = 4

1 integer(kind=omp_alloctrait_val_kind), &
2 parameter :: omp_atv_serialized = 5
3 integer(kind=omp_alloctrait_val_kind), &
4 parameter :: omp_atv_sequential = &
5 omp_atv_serialized ! (deprecated)
6 integer(kind=omp_alloctrait_val_kind), &
7 parameter :: omp_atv_private = 6
8 integer(kind=omp_alloctrait_val_kind), &
9 parameter :: omp_atv_all = 7
10 integer(kind=omp_alloctrait_val_kind), &
11 parameter :: omp_atv_thread = 8
12 integer(kind=omp_alloctrait_val_kind), &
13 parameter :: omp_atv_pteam = 9
14 integer(kind=omp_alloctrait_val_kind), &
15 parameter :: omp_atv_cgroup = 10
16 integer(kind=omp_alloctrait_val_kind), &
17 parameter :: omp_atv_default_mem_fb = 11
18 integer(kind=omp_alloctrait_val_kind), &
19 parameter :: omp_atv_null_fb = 12
20 integer(kind=omp_alloctrait_val_kind), &
21 parameter :: omp_atv_abort_fb = 13
22 integer(kind=omp_alloctrait_val_kind), &
23 parameter :: omp_atv_allocator_fb = 14
24 integer(kind=omp_alloctrait_val_kind), &
25 parameter :: omp_atv_environment = 15
26 integer(kind=omp_alloctrait_val_kind), &
27 parameter :: omp_atv_nearest = 16
28 integer(kind=omp_alloctrait_val_kind), &
29 parameter :: omp_atv_blocked = 17
30 integer(kind=omp_alloctrait_val_kind), &
31 parameter :: omp_atv_interleaved = 18
32
33 ! omp_alloctrait might not be provided in omp_lib.h.
34 type omp_alloctrait
35 integer(kind=omp_alloctrait_key_kind) key
36 integer(kind=omp_alloctrait_val_kind) value
37 end type omp_alloctrait
38
39 integer(kind=omp_allocator_handle_kind), &
40 parameter :: omp_null_allocator = 0
Fortran

1 18.13.2 omp_init_allocator
2 Summary
3 The omp_init_allocator routine initializes an allocator and associates it with a memory
4 space.
5 Format
C / C++
6 omp_allocator_handle_t omp_init_allocator (
7 omp_memspace_handle_t memspace,
8 int ntraits,
9 const omp_alloctrait_t traits[]
10 );
C / C++
Fortran
11 integer(kind=omp_allocator_handle_kind) &
12 function omp_init_allocator ( memspace, ntraits, traits )
13 integer(kind=omp_memspace_handle_kind),intent(in) :: memspace
14 integer,intent(in) :: ntraits
15 type(omp_alloctrait),intent(in) :: traits(*)
Fortran
16 Constraints on Arguments
17 The memspace argument must be one of the predefined memory spaces defined in Table 6.1. If the
18 ntraits argument is greater than zero then the traits argument must specify at least that many traits.
19 If it specifies fewer than ntraits traits the behavior is unspecified.
20 Binding
21 The binding thread set for an omp_init_allocator region is all threads on a device. The
22 effect of executing this routine is not related to any specific region that corresponds to any construct
23 or API routine.
24 Effect
25 The omp_init_allocator routine creates a new allocator that is associated with the
26 memspace memory space and returns a handle to it. All allocations through the created allocator
27 will behave according to the allocator traits specified in the traits argument. The number of traits in
28 the traits argument is specified by the ntraits argument. Specifying the same allocator trait more
29 than once results in unspecified behavior. If
30 the special omp_atv_default value is used for a given trait, then its value will be the default
31 value specified in Table 6.2 for that given trait.
32 If memspace is omp_default_mem_space and the traits argument is an empty set this routine
33 will always return a handle to an allocator. Otherwise if an allocator based on the requirements
34 cannot be created then the special omp_null_allocator handle is returned.
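Because specifying the same allocator trait more than once results in unspecified behavior, a program may wish to check a traits array before passing it to omp_init_allocator. The following C sketch shows one way to do so. The type demo_alloctrait is a hypothetical stand-in that mirrors the layout of omp_alloctrait_t so that the sketch compiles without an OpenMP runtime, and has_duplicate_trait is likewise a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in that mirrors the layout of omp_alloctrait_t. */
typedef struct demo_alloctrait {
    int       key;    /* corresponds to an omp_alloctrait_key_t value */
    uintptr_t value;  /* corresponds to omp_uintptr_t */
} demo_alloctrait;

/* Return true if any trait key appears more than once in
   traits[0..ntraits-1]. */
static bool has_duplicate_trait(const demo_alloctrait traits[], int ntraits)
{
    for (int i = 0; i < ntraits; ++i)
        for (int j = i + 1; j < ntraits; ++j)
            if (traits[i].key == traits[j].key)
                return true;
    return false;
}
```

The quadratic scan is adequate here because only eight trait keys are defined, so a traits array without duplicates is short.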

1 Restrictions
2 The restrictions to the omp_init_allocator routine are as follows:
3 • The use of an allocator returned by this routine on a device other than the one on which it was
4 created results in unspecified behavior.
5 • Unless a requires directive with the dynamic_allocators clause is present in the same
6 compilation unit, using this routine in a target region results in unspecified behavior.

7 Cross References
8 • Memory Allocators, see Section 6.2
9 • Memory Spaces, see Section 6.1
10 • requires directive, see Section 8.2
11 • target directive, see Section 13.8

12 18.13.3 omp_destroy_allocator
13 Summary
14 The omp_destroy_allocator routine releases all resources used by the allocator handle.

15 Format
C / C++
16 void omp_destroy_allocator (omp_allocator_handle_t allocator);
C / C++
Fortran
17 subroutine omp_destroy_allocator ( allocator )
18 integer(kind=omp_allocator_handle_kind),intent(in) :: allocator
Fortran
19 Constraints on Arguments
20 The allocator argument must not represent a predefined memory allocator.

21 Binding
22 The binding thread set for an omp_destroy_allocator region is all threads on a device. The
23 effect of executing this routine is not related to any specific region that corresponds to any construct
24 or API routine.

25 Effect
26 The omp_destroy_allocator routine releases all resources used to implement the allocator
27 handle. If allocator is omp_null_allocator then this routine will have no effect.

1 Restrictions
2 The restrictions to the omp_destroy_allocator routine are as follows:
3 • Accessing any memory allocated by the allocator after this call results in unspecified behavior.
4 • Unless a requires directive with the dynamic_allocators clause is present in the same
5 compilation unit, using this routine in a target region results in unspecified behavior.

6 Cross References
7 • Memory Allocators, see Section 6.2
8 • requires directive, see Section 8.2
9 • target directive, see Section 13.8

10 18.13.4 omp_set_default_allocator
11 Summary
12 The omp_set_default_allocator routine sets the default memory allocator to be used by
13 allocation calls, allocate clauses and allocate and allocators directives that do not
14 specify an allocator.
15 Format
C / C++
16 void omp_set_default_allocator (omp_allocator_handle_t allocator);
C / C++
Fortran
17 subroutine omp_set_default_allocator ( allocator )
18 integer(kind=omp_allocator_handle_kind),intent(in) :: allocator
Fortran
19 Constraints on Arguments
20 The allocator argument must be a valid memory allocator handle.
21 Binding
22 The binding task set for an omp_set_default_allocator region is the binding implicit task.
23 Effect
24 The effect of this routine is to set the value of the def-allocator-var ICV of the binding implicit task
25 to the value specified in the allocator argument.
26 Cross References
27 • Memory Allocators, see Section 6.2
28 • allocate clause, see Section 6.6
29 • allocate directive, see Section 6.5
30 • allocators directive, see Section 6.7
31 • def-allocator-var ICV, see Table 2.1

1 18.13.5 omp_get_default_allocator
2 Summary
3 The omp_get_default_allocator routine returns a handle to the memory allocator to be
4 used by allocation calls, allocate clauses and allocate and allocators directives that do
5 not specify an allocator.
6 Format
C / C++
7 omp_allocator_handle_t omp_get_default_allocator (void);
C / C++
Fortran
8 integer(kind=omp_allocator_handle_kind)&
9 function omp_get_default_allocator ()
Fortran
10 Binding
11 The binding task set for an omp_get_default_allocator region is the binding implicit task.
12 Effect
13 The effect of this routine is to return the value of the def-allocator-var ICV of the binding implicit
14 task.
15 Cross References
16 • Memory Allocators, see Section 6.2
17 • allocate clause, see Section 6.6
18 • allocate directive, see Section 6.5
19 • allocators directive, see Section 6.7
20 • def-allocator-var ICV, see Table 2.1

21 18.13.6 omp_alloc and omp_aligned_alloc


22 Summary
23 The omp_alloc and omp_aligned_alloc routines request a memory allocation from a
24 memory allocator.
25 Format
C
26 void *omp_alloc(size_t size, omp_allocator_handle_t allocator);
27 void *omp_aligned_alloc(
28 size_t alignment,
29 size_t size,
30 omp_allocator_handle_t allocator);
C

C++
1 void *omp_alloc(
2 size_t size,
3 omp_allocator_handle_t allocator=omp_null_allocator
4 );
5 void *omp_aligned_alloc(
6 size_t alignment,
7 size_t size,
8 omp_allocator_handle_t allocator=omp_null_allocator
9 );
C++
Fortran
10 type(c_ptr) function omp_alloc(size, allocator) bind(c)
11 use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t
12 integer(c_size_t), value :: size
13 integer(omp_allocator_handle_kind), value :: allocator
14
15 type(c_ptr) function omp_aligned_alloc(alignment, &
16 size, allocator) bind(c)
17 use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t
18 integer(c_size_t), value :: alignment, size
19 integer(omp_allocator_handle_kind), value :: allocator
Fortran
20 Constraints on Arguments
21 Unless dynamic_allocators appears on a requires directive in the same compilation unit,
22 omp_alloc and omp_aligned_alloc invocations that appear in target regions must not
23 pass omp_null_allocator as the allocator argument, which must be a constant expression
24 that evaluates to one of the predefined memory allocator values. The alignment argument to
25 omp_aligned_alloc must be a power of two and the size argument must be a multiple of
26 alignment.
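The two argument constraints on omp_aligned_alloc can be checked before the call. The following C sketch is a hypothetical pre-check (the name aligned_alloc_args_ok is not part of the OpenMP API) that tests whether alignment is a power of two and whether size is a multiple of alignment.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pre-check for the omp_aligned_alloc argument constraints. */
static bool aligned_alloc_args_ok(size_t alignment, size_t size)
{
    /* A power of two has exactly one bit set; zero is rejected. */
    bool power_of_two = alignment != 0 && (alignment & (alignment - 1)) == 0;

    /* The && short-circuits, so the modulo is never evaluated with a
       zero alignment. */
    return power_of_two && size % alignment == 0;
}
```

A caller might round size up to the next multiple of alignment when the check fails only on the second condition.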

27 Binding
28 The binding task set for an omp_alloc or omp_aligned_alloc region is the generating task.

29 Effect
30 The omp_alloc and omp_aligned_alloc routines request a memory allocation of size bytes
31 from the specified memory allocator. If the allocator argument is omp_null_allocator the
32 memory allocator used by the routines will be the one specified by the def-allocator-var ICV of the
33 binding implicit task. Upon success they return a pointer to the allocated memory. Otherwise, the
34 behavior that the fallback trait of the allocator specifies will be followed. If size is 0,
35 omp_alloc and omp_aligned_alloc will return NULL.

1 Memory allocated by omp_alloc will be byte-aligned to at least the maximum of the alignment
2 required by malloc and the alignment trait of the allocator. Memory allocated by
3 omp_aligned_alloc will be byte-aligned to at least the maximum of the alignment required by
4 malloc, the alignment trait of the allocator and the alignment argument value.
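The alignment guarantee above is a maximum over up to three quantities, which the following C sketch computes. The helper names are hypothetical. How a program obtains the alignment required by malloc is outside this section; using _Alignof(max_align_t) from C11 for that input is an assumption of this note, not a statement of the specification.

```c
#include <stddef.h>

static size_t max_size(size_t a, size_t b) { return a > b ? a : b; }

/* Hypothetical helper: lower bound on the alignment of memory returned by
   omp_aligned_alloc. Pass 0 as requested_alignment to model omp_alloc,
   which has no alignment argument. */
static size_t min_guaranteed_alignment(size_t malloc_alignment,
                                       size_t trait_alignment,
                                       size_t requested_alignment)
{
    return max_size(malloc_alignment,
                    max_size(trait_alignment, requested_alignment));
}
```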
Fortran
5 The omp_alloc and omp_aligned_alloc routines require an explicit interface and so might
6 not be provided in omp_lib.h.
Fortran
7 Cross References
8 • Memory Allocators, see Section 6.2
9 • def-allocator-var ICV, see Table 2.1
10 • requires directive, see Section 8.2
11 • target directive, see Section 13.8

12 18.13.7 omp_free
13 Summary
14 The omp_free routine deallocates previously allocated memory.
15 Format
C
16 void omp_free (void *ptr, omp_allocator_handle_t allocator);
C
C++
17 void omp_free(
18 void *ptr,
19 omp_allocator_handle_t allocator=omp_null_allocator
20 );
C++
Fortran
21 subroutine omp_free(ptr, allocator) bind(c)
22 use, intrinsic :: iso_c_binding, only : c_ptr
23 type(c_ptr), value :: ptr
24 integer(omp_allocator_handle_kind), value :: allocator
Fortran
25 Binding
26 The binding task set for an omp_free region is the generating task.

1 Effect
2 The omp_free routine deallocates the memory to which ptr points. The ptr argument must have
3 been returned by an OpenMP allocation routine. If the allocator argument is specified it must be
4 the memory allocator to which the allocation request was made. If the allocator argument is
5 omp_null_allocator the implementation will determine that value automatically. If ptr is
6 NULL, no operation is performed.
Fortran
7 The omp_free routine requires an explicit interface and so might not be provided in
8 omp_lib.h.
Fortran
9 Restrictions
10 The restrictions to the omp_free routine are as follows:
11 • Using omp_free on memory that was already deallocated or that was allocated by an allocator
12 that has already been destroyed with omp_destroy_allocator results in unspecified
13 behavior.
14 Cross References
15 • Memory Allocators, see Section 6.2
16 • omp_destroy_allocator, see Section 18.13.3

17 18.13.8 omp_calloc and omp_aligned_calloc


18 Summary
19 The omp_calloc and omp_aligned_calloc routines request a zero initialized memory
20 allocation from a memory allocator.
21 Format
C
22 void *omp_calloc(
23 size_t nmemb,
24 size_t size,
25 omp_allocator_handle_t allocator
26 );
27 void *omp_aligned_calloc(
28 size_t alignment,
29 size_t nmemb,
30 size_t size,
31 omp_allocator_handle_t allocator
32 );
C

C++
1 void *omp_calloc(
2 size_t nmemb,
3 size_t size,
4 omp_allocator_handle_t allocator=omp_null_allocator
5 );
6 void *omp_aligned_calloc(
7 size_t alignment,
8 size_t nmemb,
9 size_t size,
10 omp_allocator_handle_t allocator=omp_null_allocator
11 );
C++
Fortran
12 type(c_ptr) function omp_calloc(nmemb, size, allocator) bind(c)
13 use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t
14 integer(c_size_t), value :: nmemb, size
15 integer(omp_allocator_handle_kind), value :: allocator
16
17 type(c_ptr) function omp_aligned_calloc(alignment, nmemb, size, &
18 allocator) bind(c)
19 use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t
20 integer(c_size_t), value :: alignment, nmemb, size
21 integer(omp_allocator_handle_kind), value :: allocator
Fortran
22 Constraints on Arguments
23 Unless dynamic_allocators appears on a requires directive in the same compilation unit,
24 omp_calloc and omp_aligned_calloc invocations that appear in target regions must
25 not pass omp_null_allocator as the allocator argument, which must be a constant expression
26 that evaluates to one of the predefined memory allocator values. The alignment argument to
27 omp_aligned_calloc must be a power of two and the size argument must be a multiple of
28 alignment.

29 Binding
30 The binding task set for an omp_calloc or omp_aligned_calloc region is the generating
31 task.

1 Effect
2 The omp_calloc and omp_aligned_calloc routines request a memory allocation from the
3 specified memory allocator for an array of nmemb elements each of which has a size of size bytes.
4 If the allocator argument is omp_null_allocator the memory allocator used by the routines
5 will be the one specified by the def-allocator-var ICV of the binding implicit task. Upon success
6 they return a pointer to the allocated memory. Otherwise, the behavior that the fallback trait of
7 the allocator specifies will be followed. Any memory allocated by these routines will be set to zero
8 before returning. If either nmemb or size is 0, omp_calloc will return NULL.
9 Memory allocated by omp_calloc will be byte-aligned to at least the maximum of the alignment
10 required by malloc and the alignment trait of the allocator. Memory allocated by
11 omp_aligned_calloc will be byte-aligned to at least the maximum of the alignment required
12 by malloc, the alignment trait of the allocator and the alignment argument value.
Fortran
13 The omp_calloc and omp_aligned_calloc routines require an explicit interface and so
14 might not be provided in omp_lib.h.
Fortran
15 Cross References
16 • Memory Allocators, see Section 6.2
17 • def-allocator-var ICV, see Table 2.1
18 • requires directive, see Section 8.2
19 • target directive, see Section 13.8

20 18.13.9 omp_realloc
21 Summary
22 The omp_realloc routine deallocates previously allocated memory and requests a memory
23 allocation from a memory allocator.

24 Format
C
25 void *omp_realloc(
26 void *ptr,
27 size_t size,
28 omp_allocator_handle_t allocator,
29 omp_allocator_handle_t free_allocator
30 );
C

C++
1 void *omp_realloc(
2 void *ptr,
3 size_t size,
4 omp_allocator_handle_t allocator=omp_null_allocator,
5 omp_allocator_handle_t free_allocator=omp_null_allocator
6 );
C++
Fortran
7 type(c_ptr) &
8 function omp_realloc(ptr, size, allocator, free_allocator) bind(c)
9 use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t
10 type(c_ptr), value :: ptr
11 integer(c_size_t), value :: size
12 integer(omp_allocator_handle_kind), value :: allocator, free_allocator
Fortran
13 Constraints on Arguments
14 Unless a dynamic_allocators clause appears on a requires directive in the same
15 compilation unit, omp_realloc invocations that appear in target regions must not pass
16 omp_null_allocator as the allocator or free_allocator argument, which must be constant
17 expressions that evaluate to one of the predefined memory allocator values.

18 Binding
19 The binding task set for an omp_realloc region is the generating task.

20 Effect
21 The omp_realloc routine deallocates the memory to which ptr points and requests a new
22 memory allocation of size bytes from the specified memory allocator. If the free_allocator
23 argument is specified, it must be the memory allocator to which the previous allocation request was
24 made. If the free_allocator argument is omp_null_allocator the implementation will
25 determine that value automatically. If the allocator argument is omp_null_allocator the
26 behavior is as if the memory allocator that allocated the memory to which the ptr argument points is
27 passed to the allocator argument. Upon success it returns a (possibly moved) pointer to the
28 allocated memory and the contents of the new object shall be the same as those of the old object
29 prior to deallocation, up to the lesser of the old allocated size and size. Any bytes in the new
30 object beyond the old allocated size will have unspecified values. If the allocation failed, the
31 behavior that the fallback trait of the allocator specifies will be followed. If ptr is NULL,
32 omp_realloc will behave the same as omp_alloc with the same size and allocator arguments.
33 If size is 0, omp_realloc will return NULL and the old allocation will be deallocated. If size is
34 not 0, the old allocation will be deallocated if and only if the function returns a non-null value.
35 Memory allocated by omp_realloc will be byte-aligned to at least the maximum of the
36 alignment required by malloc and the alignment trait of the allocator.
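The special cases of omp_realloc described above can be summarized in a short C sketch. The enumeration realloc_case and both functions are hypothetical helpers that only model the documented behavior; they do not call the OpenMP runtime. When ptr is NULL and size is 0, the sketch reports the omp_alloc case, for which NULL is returned.

```c
#include <stddef.h>

typedef enum realloc_case {
    REALLOC_ACTS_AS_ALLOC,  /* ptr is NULL: same as omp_alloc(size, allocator) */
    REALLOC_FREES_ONLY,     /* size is 0: old allocation freed, NULL returned */
    REALLOC_RESIZES         /* otherwise: new allocation, contents carried over */
} realloc_case;

/* Report which documented case applies to a given pair of arguments. */
static realloc_case classify_realloc(const void *ptr, size_t size)
{
    if (ptr == NULL) return REALLOC_ACTS_AS_ALLOC;
    if (size == 0)   return REALLOC_FREES_ONLY;
    return REALLOC_RESIZES;
}

/* Number of leading bytes whose contents are preserved on success. */
static size_t preserved_bytes(size_t old_size, size_t new_size)
{
    return old_size < new_size ? old_size : new_size;
}
```

In the resizing case the old allocation is released only if the routine returns a non-null value, so a caller should keep the old pointer until the result has been checked.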

Fortran
1 The omp_realloc routine requires an explicit interface and so might not be provided in
2 omp_lib.h.
Fortran
3 Restrictions
4 The restrictions to the omp_realloc routine are as follows:
5 • The ptr argument must have been returned by an OpenMP allocation routine.
6 • Using omp_realloc on memory that was already deallocated or that was allocated by an
7 allocator that has already been destroyed with omp_destroy_allocator results in
8 unspecified behavior.

9 Cross References
10 • Memory Allocators, see Section 6.2
11 • omp_alloc and omp_aligned_alloc, see Section 18.13.6
12 • omp_destroy_allocator, see Section 18.13.3
13 • requires directive, see Section 8.2
14 • target directive, see Section 13.8

15 18.14 Tool Control Routine


16 Summary
17 The omp_control_tool routine enables a program to pass commands to an active tool.

18 Format
C / C++
19 int omp_control_tool(int command, int modifier, void *arg);
C / C++
Fortran
20 integer function omp_control_tool(command, modifier)
21 integer (kind=omp_control_tool_kind) command
22 integer modifier
Fortran
23 Constraints on Arguments
24 The following enumeration type defines four standard commands. Table 18.3 describes the actions
25 that these commands request from a tool.

C / C++
1 typedef enum omp_control_tool_t {
2 omp_control_tool_start = 1,
3 omp_control_tool_pause = 2,
4 omp_control_tool_flush = 3,
5 omp_control_tool_end = 4
6 } omp_control_tool_t;
C / C++
Fortran
7 integer (kind=omp_control_tool_kind), &
8 parameter :: omp_control_tool_start = 1
9 integer (kind=omp_control_tool_kind), &
10 parameter :: omp_control_tool_pause = 2
11 integer (kind=omp_control_tool_kind), &
12 parameter :: omp_control_tool_flush = 3
13 integer (kind=omp_control_tool_kind), &
14 parameter :: omp_control_tool_end = 4
Fortran
15 Tool-specific values for command must be greater than or equal to 64. Tools must ignore command
16 values that they are not explicitly designed to handle. Other values accepted by a tool for command,
17 and any values for modifier and arg are tool-defined.
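The partition of command values described above can be sketched in C as follows. The enumeration command_kind and the function classify_tool_command are hypothetical; the numeric ranges come from the standard command values 1 through 4 and from the lower bound of 64 for tool-specific values.

```c
typedef enum command_kind {
    COMMAND_STANDARD,       /* 1..4: start, pause, flush, end */
    COMMAND_TOOL_SPECIFIC,  /* 64 or greater */
    COMMAND_OTHER           /* anything else: meaning is tool-defined */
} command_kind;

/* Classify a value intended for the command argument of omp_control_tool. */
static command_kind classify_tool_command(int command)
{
    if (command >= 1 && command <= 4) return COMMAND_STANDARD;
    if (command >= 64)                return COMMAND_TOOL_SPECIFIC;
    return COMMAND_OTHER;
}
```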

TABLE 18.3: Standard Tool Control Commands


Command                   Action

omp_control_tool_start    Start or restart monitoring if it is off. If monitoring
                          is already on, this command is idempotent. If
                          monitoring has already been turned off permanently,
                          this command will have no effect.
omp_control_tool_pause    Temporarily turn monitoring off. If monitoring is
                          already off, it is idempotent.
omp_control_tool_flush    Flush any data buffered by a tool. This command may
                          be applied whether monitoring is on or off.
omp_control_tool_end      Turn monitoring off permanently; the tool finalizes
                          itself and flushes all output.

18 Binding
19 The binding task set for an omp_control_tool region is the generating task.

1 Effect
2 An OpenMP program may use omp_control_tool to pass commands to a tool. An application
3 can use omp_control_tool to request that a tool starts or restarts data collection when a code
4 region of interest is encountered, that a tool pauses data collection when leaving the region of
5 interest, that a tool flushes any data that it has collected so far, or that a tool ends data collection.
6 Additionally, omp_control_tool can be used to pass tool-specific commands to a particular
7 tool. The following types correspond to return values from omp_control_tool:
C / C++
8 typedef enum omp_control_tool_result_t {
9 omp_control_tool_notool = -2,
10 omp_control_tool_nocallback = -1,
11 omp_control_tool_success = 0,
12 omp_control_tool_ignored = 1
13 } omp_control_tool_result_t;
C / C++
Fortran
14 integer (kind=omp_control_tool_result_kind), &
15 parameter :: omp_control_tool_notool = -2
16 integer (kind=omp_control_tool_result_kind), &
17 parameter :: omp_control_tool_nocallback = -1
18 integer (kind=omp_control_tool_result_kind), &
19 parameter :: omp_control_tool_success = 0
20 integer (kind=omp_control_tool_result_kind), &
21 parameter :: omp_control_tool_ignored = 1
Fortran
22 If the OMPT interface state is inactive, the OpenMP implementation returns
23 omp_control_tool_notool. If the OMPT interface state is active, but no callback is
24 registered for the tool-control event, the OpenMP implementation returns
25 omp_control_tool_nocallback. An OpenMP implementation may return other
26 implementation-defined negative values strictly smaller than -64; an application may assume that
27 any negative return value indicates that a tool has not received the command. A return value of
28 omp_control_tool_success indicates that the tool has performed the specified command. A
29 return value of omp_control_tool_ignored indicates that the tool has ignored the specified
30 command. A tool may return other positive values strictly greater than 64 that are tool-defined.
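An application can interpret the value returned by omp_control_tool along the lines of the following C sketch. The enumeration tool_result_kind and the function classify_tool_result are hypothetical. Positive values from 2 through 64 are not described by the text above, so the sketch reports them separately rather than guessing a meaning.

```c
typedef enum tool_result_kind {
    TOOL_NOT_REACHED,   /* any negative value: the tool did not receive the command */
    TOOL_SUCCESS,       /* 0: omp_control_tool_success */
    TOOL_IGNORED,       /* 1: omp_control_tool_ignored */
    TOOL_DEFINED,       /* greater than 64: tool-defined meaning */
    TOOL_UNDESCRIBED    /* 2..64: not described by the text */
} tool_result_kind;

/* Classify the integer returned by a call to omp_control_tool. */
static tool_result_kind classify_tool_result(int result)
{
    if (result < 0)  return TOOL_NOT_REACHED;
    if (result == 0) return TOOL_SUCCESS;
    if (result == 1) return TOOL_IGNORED;
    if (result > 64) return TOOL_DEFINED;
    return TOOL_UNDESCRIBED;
}
```

Testing for a negative value first covers omp_control_tool_notool, omp_control_tool_nocallback and any implementation-defined negative value with a single branch.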

31 Execution Model Events


32 The tool-control event occurs in the thread that encounters a call to omp_control_tool at a
33 point inside its corresponding OpenMP region.

1 Tool Callbacks
2 A thread dispatches a registered ompt_callback_control_tool callback for each
3 occurrence of a tool-control event. The callback executes in the context of the call that occurs in the
4 user program and has type signature ompt_callback_control_tool_t. The callback may
5 return any non-negative value, which will be returned to the application by the OpenMP
6 implementation as the return value of the omp_control_tool call that triggered the callback.
7 Arguments passed to the callback are those passed by the user to omp_control_tool. If the
8 call is made in Fortran, the tool will be passed NULL as the third argument to the callback. If any of
9 the four standard commands is presented to a tool, the tool will ignore the modifier and arg
10 argument values.

11 Restrictions
12 Restrictions on access to the state of an OpenMP first-party tool are as follows:
13 • An application may access the tool state modified by an OMPT callback only by using
14 omp_control_tool.

15 Cross References
16 • OMPT Interface, see Chapter 19
17 • ompt_callback_control_tool_t, see Section 19.5.2.29

18 18.15 Environment Display Routine


19 Summary
20 The omp_display_env routine displays the OpenMP version number and the initial values of
21 ICVs associated with the environment variables described in Chapter 21.

22 Format
C / C++
void omp_display_env(int verbose);
C / C++
Fortran
subroutine omp_display_env(verbose)
logical, intent(in) :: verbose
Fortran
26 Binding
27 The binding thread set for an omp_display_env region is the encountering thread.



1 Effect
2 Each time the omp_display_env routine is invoked, the runtime system prints the OpenMP
3 version number and the initial values of the ICVs associated with the environment variables
4 described in Chapter 21. The displayed values are the values of the ICVs after they have been
5 modified according to the environment variable settings and before the execution of any OpenMP
6 construct or API routine.
7 The display begins with "OPENMP DISPLAY ENVIRONMENT BEGIN", followed by the
8 _OPENMP version macro (or the openmp_version named constant for Fortran) and ICV values,
9 in the format NAME ’=’ VALUE. NAME corresponds to the macro or environment variable name,
10 optionally prepended with a bracketed DEVICE. VALUE corresponds to the value of the macro or
11 ICV associated with this environment variable. Values are enclosed in single quotes. DEVICE
12 corresponds to the device on which the value of the ICV is applied. The display is terminated with
13 "OPENMP DISPLAY ENVIRONMENT END".
14 For the OMP_NESTED environment variable, the printed value is true if the max-active-levels-var
15 ICV is initialized to a value greater than 1; otherwise the printed value is false. The OMP_NESTED
16 environment variable has been deprecated.
If the verbose argument evaluates to false, the runtime displays the OpenMP version number, as defined by the value of the _OPENMP version macro (or the openmp_version named constant for Fortran), and the initial ICV values for the environment variables listed in Chapter 21. If the verbose argument evaluates to true, the runtime may also display the values of vendor-specific ICVs that may be modified by vendor-specific environment variables.
Example output:
OPENMP DISPLAY ENVIRONMENT BEGIN
  _OPENMP='202111'
  [host] OMP_SCHEDULE='GUIDED,4'
  [host] OMP_NUM_THREADS='4,3,2'
  [device] OMP_NUM_THREADS='2'
  [host,device] OMP_DYNAMIC='TRUE'
  [host] OMP_PLACES='{0:4},{4:4},{8:4},{12:4}'
  ...
OPENMP DISPLAY ENVIRONMENT END
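A script or tool that consumes this display may need to split each line into its DEVICE, NAME and VALUE parts. The following is a minimal sketch, assuming ASCII single quotes around the value and the fixed buffer sizes shown; `sketch_env_line_t` and `sketch_parse_env_line` are hypothetical names and are not part of the OpenMP API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical container for one parsed display line. */
typedef struct sketch_env_line_t {
    char device[32];   /* empty when no bracketed DEVICE is present */
    char name[64];
    char value[128];
} sketch_env_line_t;

/* Parses "[DEVICE] NAME='VALUE'" or "NAME='VALUE'".
 * Returns 1 on success and 0 if the line does not match or a field is
 * too long for its buffer. */
int sketch_parse_env_line(const char *line, sketch_env_line_t *out)
{
    const char *p = line;
    size_t n;

    out->device[0] = out->name[0] = out->value[0] = '\0';
    while (*p == ' ') p++;

    if (*p == '[') {                        /* optional bracketed DEVICE */
        const char *close = strchr(p, ']');
        if (close == NULL) return 0;
        n = (size_t)(close - p - 1);
        if (n >= sizeof out->device) return 0;
        memcpy(out->device, p + 1, n);
        out->device[n] = '\0';
        p = close + 1;
        while (*p == ' ') p++;
    }

    const char *eq = strchr(p, '=');        /* NAME ends at the first '=' */
    if (eq == NULL || eq == p) return 0;
    n = (size_t)(eq - p);
    if (n >= sizeof out->name) return 0;
    memcpy(out->name, p, n);
    out->name[n] = '\0';

    const char *open = eq + 1;              /* VALUE is quoted */
    if (*open != '\'') return 0;
    const char *end = strrchr(open + 1, '\'');
    if (end == NULL) return 0;
    n = (size_t)(end - open - 1);
    if (n >= sizeof out->value) return 0;
    memcpy(out->value, open + 1, n);
    out->value[n] = '\0';
    return 1;
}
```

The BEGIN, END and "..." lines of the display contain no '=' character, so the sketch rejects them by returning 0.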

32 Restrictions
33 Restrictions to the omp_display_env routine are as follows.
34 • When called from within a target region the effect is unspecified.

35 Cross References
36 • OMP_DISPLAY_ENV, see Section 21.7



1 19 OMPT Interface
2 This chapter describes OMPT, which is an interface for first-party tools. First-party tools are linked
3 or loaded directly into the OpenMP program. OMPT defines mechanisms to initialize a tool, to
4 examine OpenMP state associated with an OpenMP thread, to interpret the call stack of an OpenMP
5 thread, to receive notification about OpenMP events, to trace activity on OpenMP target devices, to
6 assess implementation-dependent details of an OpenMP implementation (such as supported states
7 and mutual exclusion implementations), and to control a tool from an OpenMP application.

8 19.1 OMPT Interfaces Definitions


C / C++
9 A compliant implementation must supply a set of definitions for the OMPT runtime entry points,
10 OMPT callback signatures, and the special data types of their parameters and return values. These
11 definitions, which are listed throughout this chapter, and their associated declarations shall be
12 provided in a header file named omp-tools.h. In addition, the set of definitions may specify
13 other implementation-specific values.
14 The ompt_start_tool function is an external function with C linkage.
C / C++

15 19.2 Activating a First-Party Tool


16 To activate a tool, an OpenMP implementation first determines whether the tool should be
17 initialized. If so, the OpenMP implementation invokes the initializer of the tool, which enables the
18 tool to prepare to monitor execution on the host. The tool may then also arrange to monitor
19 computation that executes on target devices. This section explains how the tool and an OpenMP
20 implementation interact to accomplish these tasks.

21 19.2.1 ompt_start_tool
22 Summary
23 In order to use the OMPT interface provided by an OpenMP implementation, a tool must implement
24 the ompt_start_tool function, through which the OpenMP implementation initializes the tool.



1 Format
C
ompt_start_tool_result_t *ompt_start_tool(
  unsigned int omp_version,
  const char *runtime_version
);
C
6 Semantics
7 For a tool to use the OMPT interface that an OpenMP implementation provides, the tool must define
8 a globally-visible implementation of the function ompt_start_tool. The tool indicates that it
9 will use the OMPT interface that an OpenMP implementation provides by returning a non-null
10 pointer to an ompt_start_tool_result_t structure from the ompt_start_tool
11 implementation that it provides. The ompt_start_tool_result_t structure contains
12 pointers to tool initialization and finalization callbacks as well as a tool data word that an OpenMP
13 implementation must pass by reference to these callbacks. A tool may return NULL from
14 ompt_start_tool to indicate that it will not use the OMPT interface in a particular execution.
15 A tool may use the omp_version argument to determine if it is compatible with the OMPT interface
16 that the OpenMP implementation provides.

17 Description of Arguments
18 The argument omp_version is the value of the _OPENMP version macro associated with the
19 OpenMP API implementation. This value identifies the OpenMP API version that an OpenMP
20 implementation supports, which specifies the version of the OMPT interface that it supports.
21 The argument runtime_version is a version string that unambiguously identifies the OpenMP
22 implementation.

23 Constraints on Arguments
24 The argument runtime_version must be an immutable string that is defined for the lifetime of a
25 program execution.

26 Effect
27 If a tool returns a non-null pointer to an ompt_start_tool_result_t structure, an OpenMP
28 implementation will call the tool initializer specified by the initialize field in this structure before
29 beginning execution of any OpenMP construct or completing execution of any environment routine
30 invocation; the OpenMP implementation will call the tool finalizer specified by the finalize field in
31 this structure when the OpenMP implementation shuts down.
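A tool-side definition of ompt_start_tool might look like the following minimal sketch. The type definitions at the top are stand-ins that are assumed to match those in omp-tools.h and are repeated only so that the sketch compiles without an OpenMP runtime; a real tool would include omp-tools.h instead. The `sketch_` names and the minimum-version threshold are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins assumed to match <omp-tools.h>; check the installed header. */
typedef union ompt_data_t { uint64_t value; void *ptr; } ompt_data_t;
typedef void (*ompt_interface_fn_t)(void);
typedef ompt_interface_fn_t (*ompt_function_lookup_t)(const char *name);
typedef int (*ompt_initialize_t)(ompt_function_lookup_t lookup,
                                 int initial_device_num,
                                 ompt_data_t *tool_data);
typedef void (*ompt_finalize_t)(ompt_data_t *tool_data);
typedef struct ompt_start_tool_result_t {
    ompt_initialize_t initialize;
    ompt_finalize_t finalize;
    ompt_data_t tool_data;
} ompt_start_tool_result_t;

/* Assumption for this sketch: the tool requires at least the _OPENMP
 * value of OpenMP 5.0. */
#define SKETCH_MIN_OPENMP_VERSION 201811u

/* Hypothetical initializer: records in the tool data word that it ran.
 * A non-zero return value keeps the OMPT interface active. */
static int sketch_initialize(ompt_function_lookup_t lookup,
                             int initial_device_num,
                             ompt_data_t *tool_data)
{
    (void)lookup;
    (void)initial_device_num;
    tool_data->value = 1;
    return 1;
}

/* Hypothetical finalizer: clears the tool data word. */
static void sketch_finalize(ompt_data_t *tool_data)
{
    tool_data->value = 0;
}

ompt_start_tool_result_t *ompt_start_tool(unsigned int omp_version,
                                          const char *runtime_version)
{
    /* The structure must outlive this call, so it has static storage. */
    static ompt_start_tool_result_t result = {
        sketch_initialize, sketch_finalize, { 0 }
    };
    (void)runtime_version;
    if (omp_version < SKETCH_MIN_OPENMP_VERSION)
        return NULL;            /* decline to use OMPT in this execution */
    return &result;
}
```

Returning a pointer to a statically allocated structure keeps both callback pointers non-null and valid for the whole execution.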

32 Cross References
33 • Tool Initialization and Finalization, see Section 19.4.1



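FIGURE 19.1: First-Party Tool Activation Flow Chart
[Flow chart not reproduced. Recoverable labels: on runtime (re)start, tool-var selects between Inactive (disabled) and Pending (enabled); from Pending, the runtime finds the next tool; if none is found, the state becomes Inactive; if one is found, the runtime calls ompt_start_tool; a return value r=NULL leads back to finding the next tool, and r=non-null leads to a call to r->initialize; a return value of 1 leads to Active and 0 leads to Inactive; runtime shutdown or pause leads to Inactive.]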

19.2.2 Determining Whether a First-Party Tool Should be Initialized
3 An OpenMP implementation examines the tool-var ICV as one of its first initialization steps. If the
4 value of tool-var is disabled, the initialization continues without a check for the presence of a tool
5 and the functionality of the OMPT interface will be unavailable as the program executes. In this
6 case, the OMPT interface state remains inactive.
7 Otherwise, the OMPT interface state changes to pending and the OpenMP implementation activates
8 any first-party tool that it finds. A tool can provide a definition of ompt_start_tool to an
9 OpenMP implementation in three ways:
10 • By statically-linking its definition of ompt_start_tool into an OpenMP application;
11 • By introducing a dynamically-linked library that includes its definition of ompt_start_tool
12 into the application’s address space; or
• By providing, in the tool-libraries-var ICV, the name of a dynamically-linked library that is appropriate for the architecture and operating system used by the application and that includes a definition of ompt_start_tool.
2 If the value of tool-var is enabled, the OpenMP implementation must check if a tool has provided
3 an implementation of ompt_start_tool. The OpenMP implementation first checks if a
4 tool-provided implementation of ompt_start_tool is available in the address space, either
5 statically-linked into the application or in a dynamically-linked library loaded in the address space.
6 If multiple implementations of ompt_start_tool are available, the OpenMP implementation
7 will use the first tool-provided implementation of ompt_start_tool that it finds.
8 If the implementation does not find a tool-provided implementation of ompt_start_tool in the
9 address space, it consults the tool-libraries-var ICV, which contains a (possibly empty) list of
10 dynamically-linked libraries. As described in detail in Section 21.3.2, the libraries in
11 tool-libraries-var are then searched for the first usable implementation of ompt_start_tool
12 that one of the libraries in the list provides.
If the implementation finds a tool-provided definition of ompt_start_tool, it invokes that function. If the function returns a NULL pointer, the OMPT interface state remains pending and the implementation continues to look for implementations of ompt_start_tool. Otherwise, the function returns a non-null pointer to an ompt_start_tool_result_t structure, the OMPT interface state changes to active, and the OpenMP implementation makes the OMPT interface available as the program executes. In this case, as the OpenMP implementation completes its initialization, it initializes the OMPT interface.
20 If no tool can be found, the OMPT interface state changes to inactive.
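The search described in this section can be modeled as a small state machine. The following is a minimal sketch from the point of view of a runtime, not a description of any particular implementation: the state enumeration, the candidate list (which stands for the ordered sequence of ompt_start_tool definitions found first in the address space and then through tool-libraries-var) and all `sketch_` names are hypothetical.

```c
#include <stddef.h>

typedef enum {
    SKETCH_INACTIVE,
    SKETCH_PENDING,
    SKETCH_ACTIVE
} sketch_ompt_state_t;

/* Stand-in for one tool-provided ompt_start_tool definition; it returns
 * NULL when the tool declines to use the OMPT interface. */
typedef void *(*sketch_start_tool_fn)(unsigned int omp_version,
                                      const char *runtime_version);

/* Walks the candidates in order and stops at the first non-null result. */
sketch_ompt_state_t sketch_activate(int tool_var_enabled,
                                    const sketch_start_tool_fn *candidates,
                                    size_t count,
                                    void **result_out)
{
    sketch_ompt_state_t state = SKETCH_INACTIVE;
    size_t i;

    *result_out = NULL;
    if (!tool_var_enabled)
        return state;                  /* tool-var is disabled */

    state = SKETCH_PENDING;
    for (i = 0; i < count && state == SKETCH_PENDING; i++) {
        void *r = candidates[i](202111u, "sketch-runtime");
        if (r != NULL) {               /* tool accepted */
            *result_out = r;
            state = SKETCH_ACTIVE;
        }
    }
    if (state == SKETCH_PENDING)
        state = SKETCH_INACTIVE;       /* no tool could be found */
    return state;
}

/* Two hypothetical candidates used to exercise the sketch. */
void *sketch_declining_tool(unsigned int v, const char *rv)
{
    (void)v; (void)rv;
    return NULL;
}

void *sketch_accepting_tool(unsigned int v, const char *rv)
{
    static int result_placeholder;
    (void)v; (void)rv;
    return &result_placeholder;
}
```

With the candidate list { sketch_declining_tool, sketch_accepting_tool }, the first candidate leaves the state pending and the second moves it to active.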

21 Cross References
22 • Tool Initialization and Finalization, see Section 19.4.1
23 • ompt_start_tool, see Section 19.2.1
24 • tool-libraries-var ICV, see Table 2.1
25 • tool-var ICV, see Table 2.1

26 19.2.3 Initializing a First-Party Tool


27 To initialize the OMPT interface, the OpenMP implementation invokes the tool initializer that is
28 specified in the ompt_start_tool_result_t structure that is indicated by the non-null
29 pointer that ompt_start_tool returns. The initializer is invoked prior to the occurrence of any
30 OpenMP event.
31 A tool initializer, described in Section 19.5.1.1, uses the function specified in its lookup argument
32 to look up pointers to OMPT interface runtime entry points that the OpenMP implementation
33 provides; this process is described in Section 19.2.3.1. Typically, a tool initializer obtains a pointer
34 to the ompt_set_callback runtime entry point with type signature
35 ompt_set_callback_t and then uses this runtime entry point to register tool callbacks for
36 OpenMP events, as described in Section 19.2.4.



1 A tool initializer may use the ompt_enumerate_states runtime entry point, which has type
2 signature ompt_enumerate_states_t, to determine the thread states that an OpenMP
3 implementation employs. Similarly, it may use the ompt_enumerate_mutex_impls runtime
4 entry point, which has type signature ompt_enumerate_mutex_impls_t, to determine the
5 mutual exclusion implementations that the OpenMP implementation employs.
6 If a tool initializer returns a non-zero value, the OMPT interface state remains active for the
7 execution; otherwise, the OMPT interface state changes to inactive.

8 Cross References
9 • Tool Initialization and Finalization, see Section 19.4.1
10 • ompt_enumerate_mutex_impls_t, see Section 19.6.1.2
11 • ompt_enumerate_states_t, see Section 19.6.1.1
12 • ompt_set_callback_t, see Section 19.6.1.3
13 • ompt_start_tool, see Section 19.2.1

14 19.2.3.1 Binding Entry Points in the OMPT Callback Interface


15 Functions that an OpenMP implementation provides to support the OMPT interface are not defined
16 as global function symbols. Instead, they are defined as runtime entry points that a tool can only
17 identify through the lookup function that is provided as an argument with type signature
18 ompt_function_lookup_t to the tool initializer. A tool can use this function to obtain a
19 pointer to each of the runtime entry points that an OpenMP implementation provides to support the
20 OMPT interface. Once a tool has obtained a lookup function, it may employ it at any point in the
21 future.
22 For each runtime entry point in the OMPT interface for the host device, Table 19.1 provides the
23 string name by which it is known and its associated type signature. Implementations can provide
24 additional implementation-specific names and corresponding entry points. Any names that begin
25 with ompt_ are reserved names.
During initialization, a tool should look up each runtime entry point in the OMPT interface by name and bind a pointer maintained by the tool that can later be used to invoke the entry point. The entry points described in Table 19.1 enable a tool to assess the thread states and mutual exclusion implementations that an OpenMP implementation supports, to register tool callbacks, to inspect registered callbacks, to introspect OpenMP state associated with threads, and to use tracing to monitor computations that execute on target devices.
32 Detailed information about each runtime entry point listed in Table 19.1 is included as part of the
33 description of its type signature.
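The lookup-and-bind step can be sketched as follows. The two typedefs are stand-ins assumed to match omp-tools.h; the structure that holds the bound pointers, the binding function and the fake lookup function that plays the role of the runtime are hypothetical. A real tool would cast each returned pointer to the type signature given for that name in Table 19.1 before calling it.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins assumed to match <omp-tools.h>. */
typedef void (*ompt_interface_fn_t)(void);
typedef ompt_interface_fn_t (*ompt_function_lookup_t)(const char *name);

/* Hypothetical set of pointers maintained by the tool. */
typedef struct sketch_entry_points_t {
    ompt_interface_fn_t set_callback;
    ompt_interface_fn_t get_state;
    ompt_interface_fn_t finalize_tool;
} sketch_entry_points_t;

/* Looks up each entry point by its string name and returns how many of
 * them the lookup function provided. */
int sketch_bind_entry_points(ompt_function_lookup_t lookup,
                             sketch_entry_points_t *out)
{
    int bound = 0;

    out->set_callback  = lookup("ompt_set_callback");
    out->get_state     = lookup("ompt_get_state");
    out->finalize_tool = lookup("ompt_finalize_tool");

    if (out->set_callback != NULL)  bound++;
    if (out->get_state != NULL)     bound++;
    if (out->finalize_tool != NULL) bound++;
    return bound;
}

/* Fake runtime side, for demonstration only: it knows two names. */
static void sketch_fake_entry(void) { }

ompt_interface_fn_t sketch_fake_lookup(const char *name)
{
    if (strcmp(name, "ompt_set_callback") == 0 ||
        strcmp(name, "ompt_get_state") == 0)
        return sketch_fake_entry;
    return NULL;
}
```

Checking each pointer for NULL before use lets the tool degrade gracefully when a lookup does not return an entry point for a name.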



TABLE 19.1: OMPT Callback Interface Runtime Entry Point Names and Their Type Signatures

Entry Point String Name Type signature


“ompt_enumerate_states” ompt_enumerate_states_t
“ompt_enumerate_mutex_impls” ompt_enumerate_mutex_impls_t
“ompt_set_callback” ompt_set_callback_t
“ompt_get_callback” ompt_get_callback_t
“ompt_get_thread_data” ompt_get_thread_data_t
“ompt_get_num_places” ompt_get_num_places_t
“ompt_get_place_proc_ids” ompt_get_place_proc_ids_t
“ompt_get_place_num” ompt_get_place_num_t
“ompt_get_partition_place_nums” ompt_get_partition_place_nums_t
“ompt_get_proc_id” ompt_get_proc_id_t
“ompt_get_state” ompt_get_state_t
“ompt_get_parallel_info” ompt_get_parallel_info_t
“ompt_get_task_info” ompt_get_task_info_t
“ompt_get_task_memory” ompt_get_task_memory_t
“ompt_get_num_devices” ompt_get_num_devices_t
“ompt_get_num_procs” ompt_get_num_procs_t
“ompt_get_target_info” ompt_get_target_info_t
“ompt_get_unique_id” ompt_get_unique_id_t
“ompt_finalize_tool” ompt_finalize_tool_t

1 Cross References
2 • Lookup Entry Points: ompt_function_lookup_t, see Section 19.6.3
3 • ompt_enumerate_mutex_impls_t, see Section 19.6.1.2
4 • ompt_enumerate_states_t, see Section 19.6.1.1
5 • ompt_get_callback_t, see Section 19.6.1.4
6 • ompt_get_num_devices_t, see Section 19.6.1.17
7 • ompt_get_num_places_t, see Section 19.6.1.7
8 • ompt_get_num_procs_t, see Section 19.6.1.6
9 • ompt_get_parallel_info_t, see Section 19.6.1.13
10 • ompt_get_partition_place_nums_t, see Section 19.6.1.10
11 • ompt_get_place_num_t, see Section 19.6.1.9
12 • ompt_get_place_proc_ids_t, see Section 19.6.1.8
13 • ompt_get_proc_id_t, see Section 19.6.1.11
14 • ompt_get_state_t, see Section 19.6.1.12



1 • ompt_get_target_info_t, see Section 19.6.1.16
2 • ompt_get_task_info_t, see Section 19.6.1.14
3 • ompt_get_task_memory_t, see Section 19.6.1.15
4 • ompt_get_thread_data_t, see Section 19.6.1.5
5 • ompt_get_unique_id_t, see Section 19.6.1.18
6 • ompt_set_callback_t, see Section 19.6.1.3

7 19.2.4 Monitoring Activity on the Host with OMPT


8 To monitor the execution of an OpenMP program on the host device, a tool initializer must register
9 to receive notification of events that occur as an OpenMP program executes. A tool can use the
10 ompt_set_callback runtime entry point to register callbacks for OpenMP events. The return
11 codes for ompt_set_callback use the ompt_set_result_t enumeration type. If the
12 ompt_set_callback runtime entry point is called outside a tool initializer, registration of
13 supported callbacks may fail with a return value of ompt_set_error.
14 All callbacks registered with ompt_set_callback or returned by ompt_get_callback use
15 the dummy type signature ompt_callback_t.
16 For callbacks listed in Table 19.2, ompt_set_always is the only registration return code that is
17 allowed. An OpenMP implementation must guarantee that the callback will be invoked every time
18 that a runtime event that is associated with it occurs. Support for such callbacks is required in a
19 minimal implementation of the OMPT interface.
20 For callbacks listed in Table 19.3, the ompt_set_callback runtime entry may return any
21 non-error code. Whether an OpenMP implementation invokes a registered callback never,
22 sometimes, or always is implementation defined. If registration for a callback allows a return code
23 of ompt_set_never, support for invoking such a callback may not be present in a minimal
24 implementation of the OMPT interface. The return code from registering a callback indicates the
25 implementation-defined level of support for the callback.
26 Two techniques reduce the size of the OMPT interface. First, in cases where events are naturally
27 paired, for example, the beginning and end of a region, and the arguments needed by the callback at
28 each endpoint are identical, a tool registers a single callback for the pair of events, with
29 ompt_scope_begin or ompt_scope_end provided as an argument to identify for which
30 endpoint the callback is invoked. Second, when a class of events is amenable to uniform treatment,
31 OMPT provides a single callback for that class of events, for example, an
32 ompt_callback_sync_region_wait callback is used for multiple kinds of synchronization
33 regions, such as barrier, taskwait, and taskgroup regions. Some events, for example,
34 ompt_callback_sync_region_wait, use both techniques.
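The first technique, a single callback for a pair of events, can be illustrated with the following minimal sketch. The enumeration is a stand-in that contains only the two endpoint values named above, with numeric values assumed to match omp-tools.h. The callback is hypothetical and heavily simplified: real OMPT callbacks take further arguments, and a real tool would need thread-safe counters.

```c
/* Stand-in with the two endpoint values named above; the numeric values
 * are assumed to match <omp-tools.h>. */
typedef enum ompt_scope_endpoint_t {
    ompt_scope_begin = 1,
    ompt_scope_end   = 2
} ompt_scope_endpoint_t;

/* Hypothetical tool state (not thread safe in this sketch). */
static int sketch_open_regions = 0;
static int sketch_completed_regions = 0;

/* One callback serves both endpoints of a region; the endpoint argument
 * identifies which of the two events occurred. */
void sketch_on_region(ompt_scope_endpoint_t endpoint)
{
    switch (endpoint) {
    case ompt_scope_begin:
        sketch_open_regions++;
        break;
    case ompt_scope_end:
        sketch_open_regions--;
        sketch_completed_regions++;
        break;
    default:
        break;                  /* other endpoint values are ignored here */
    }
}
```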

35 Cross References
36 • ompt_get_callback_t, see Section 19.6.1.4



TABLE 19.2: Callbacks for which ompt_set_callback Must Return ompt_set_always

Callback Name
ompt_callback_thread_begin
ompt_callback_thread_end
ompt_callback_parallel_begin
ompt_callback_parallel_end
ompt_callback_task_create
ompt_callback_task_schedule
ompt_callback_implicit_task
ompt_callback_target
ompt_callback_target_emi
ompt_callback_target_data_op
ompt_callback_target_data_op_emi
ompt_callback_target_submit
ompt_callback_target_submit_emi
ompt_callback_control_tool
ompt_callback_device_initialize
ompt_callback_device_finalize
ompt_callback_device_load
ompt_callback_device_unload

1 • ompt_set_callback_t, see Section 19.6.1.3


2 • ompt_set_result_t, see Section 19.4.4.2

3 19.2.5 Tracing Activity on Target Devices with OMPT


4 A target device may or may not initialize a full OpenMP runtime system. Unless it does,
5 monitoring activity on a device using a tool interface based on callbacks may not be possible. To
6 accommodate such cases, the OMPT interface defines a monitoring interface for tracing activity on
7 target devices. Tracing activity on a target device involves the following steps:
8 • To prepare to trace activity on a target device, a tool must register for an
9 ompt_callback_device_initialize callback. A tool may also register for an
10 ompt_callback_device_load callback to be notified when code is loaded onto a target
11 device or an ompt_callback_device_unload callback to be notified when code is
12 unloaded from a target device. A tool may also optionally register an
13 ompt_callback_device_finalize callback.
14 • When an OpenMP implementation initializes a target device, the OpenMP implementation
15 dispatches the device initialization callback of the tool on the host device. If the OpenMP
16 implementation or target device does not support tracing, the OpenMP implementation passes



TABLE 19.3: Callbacks for which ompt_set_callback May Return Any Non-Error Code

Callback Name
ompt_callback_sync_region_wait
ompt_callback_mutex_released
ompt_callback_dependences
ompt_callback_task_dependence
ompt_callback_work
ompt_callback_master // (deprecated)
ompt_callback_masked
ompt_callback_target_map
ompt_callback_target_map_emi
ompt_callback_sync_region
ompt_callback_reduction
ompt_callback_lock_init
ompt_callback_lock_destroy
ompt_callback_mutex_acquire
ompt_callback_mutex_acquired
ompt_callback_nest_lock
ompt_callback_flush
ompt_callback_cancel
ompt_callback_dispatch

1 NULL to the device initializer of the tool for its lookup argument; otherwise, the OpenMP
2 implementation passes a pointer to a device-specific runtime entry point with type signature
3 ompt_function_lookup_t to the device initializer of the tool.
4 • If a non-null lookup pointer is provided to the device initializer of the tool, the tool may use it to
5 determine the runtime entry points in the tracing interface that are available for the device and
6 may bind the returned function pointers to tool variables. Table 19.4 indicates the names of
7 runtime entry points that may be available for a device; an implementation may provide
8 additional implementation-defined names and corresponding entry points. The driver for the
9 device provides the runtime entry points that enable a tool to control the trace collection interface
10 of the device. The native trace format that the interface uses may be device specific and the
11 available kinds of trace records are implementation defined. Some devices may allow a tool to
12 collect traces of records in a standard format known as OMPT trace records. Each OMPT trace
13 record serves as a substitute for an OMPT callback that cannot be made on the device. The fields
14 in each trace record type are defined in the description of the callback that the record represents.
15 If this type of record is provided then the lookup function returns values for the runtime entry
16 points ompt_set_trace_ompt and ompt_get_record_ompt, which support collecting
17 and decoding OMPT traces. If the native tracing format for a device is the OMPT format then
18 tracing can be controlled using the runtime entry points for native or OMPT tracing.



TABLE 19.4: OMPT Tracing Interface Runtime Entry Point Names and Their Type Signatures

Entry Point String Name Type Signature


“ompt_get_device_num_procs” ompt_get_device_num_procs_t
“ompt_get_device_time” ompt_get_device_time_t
“ompt_translate_time” ompt_translate_time_t
“ompt_set_trace_ompt” ompt_set_trace_ompt_t
“ompt_set_trace_native” ompt_set_trace_native_t
“ompt_start_trace” ompt_start_trace_t
“ompt_pause_trace” ompt_pause_trace_t
“ompt_flush_trace” ompt_flush_trace_t
“ompt_stop_trace” ompt_stop_trace_t
“ompt_advance_buffer_cursor” ompt_advance_buffer_cursor_t
“ompt_get_record_type” ompt_get_record_type_t
“ompt_get_record_ompt” ompt_get_record_ompt_t
“ompt_get_record_native” ompt_get_record_native_t
“ompt_get_record_abstract” ompt_get_record_abstract_t

• The tool uses the ompt_set_trace_native and/or the ompt_set_trace_ompt runtime entry point to specify what types of events or activities to monitor on the device. The return codes for ompt_set_trace_ompt and ompt_set_trace_native use the ompt_set_result_t enumeration type. If the ompt_set_trace_native or the ompt_set_trace_ompt runtime entry point is called outside a device initializer, registration of supported callbacks may fail with a return code of ompt_set_error.
7 • The tool initiates tracing on the device by invoking ompt_start_trace. Arguments to
8 ompt_start_trace include two tool callbacks through which the OpenMP implementation
9 can manage traces associated with the device. One callback allocates a buffer in which the device
10 can deposit trace events. The second callback processes a buffer of trace events from the device.
11 • If the device requires a trace buffer, the OpenMP implementation invokes the tool-supplied
12 callback function on the host device to request a new buffer.
• The OpenMP implementation monitors the execution of OpenMP constructs on the device and records a trace of events or activities into a trace buffer. If possible, device trace records are marked with a host_op_id, an identifier that associates device activities with the target operation that the host initiated to cause these activities. To correlate activities on the host with activities on a device, a tool can register an ompt_callback_target_submit_emi callback. Before and after the host initiates creation of an initial task on a device associated with a structured block for a target construct, the OpenMP implementation dispatches the ompt_callback_target_submit_emi callback on the host in the thread that is executing the task that encounters the target construct. This callback provides the tool with a pair of identifiers: one that identifies the target region and a second that uniquely identifies the initial task associated with that region. These identifiers help the tool correlate activities on the target device with their target region.



1 • When appropriate, for example, when a trace buffer fills or needs to be flushed, the OpenMP
2 implementation invokes the tool-supplied buffer completion callback to process a non-empty
3 sequence of records in a trace buffer that is associated with the device.
4 • The tool-supplied buffer completion callback may return immediately, ignoring records in the
5 trace buffer, or it may iterate through them using the ompt_advance_buffer_cursor
6 entry point to inspect each record. A tool may use the ompt_get_record_type runtime
7 entry point to inspect the type of the record at the current cursor position. Three runtime entry
8 points (ompt_get_record_ompt, ompt_get_record_native, and
9 ompt_get_record_abstract) allow tools to inspect the contents of some or all records in
10 a trace buffer. The ompt_get_record_native runtime entry point uses the native trace
11 format of the device. The ompt_get_record_abstract runtime entry point decodes the
12 contents of a native trace record and summarizes them as an ompt_record_abstract_t
13 record. The ompt_get_record_ompt runtime entry point can only be used to retrieve
14 records in OMPT format.
15 • Once tracing has been started on a device, a tool may pause or resume tracing on the device at
16 any time by invoking ompt_pause_trace with an appropriate flag value as an argument.
17 • A tool may invoke the ompt_flush_trace runtime entry point for a device at any time
18 between device initialization and finalization to cause the device to flush pending trace records.
19 • At any time, a tool may use the ompt_start_trace runtime entry point to start tracing or the
20 ompt_stop_trace runtime entry point to stop tracing on a device. When tracing is stopped
21 on a device, the OpenMP implementation eventually gathers all trace records already collected
22 on the device and presents them to the tool using the buffer completion callback.
23 • An OpenMP implementation can be shut down while device tracing is in progress.
24 • When an OpenMP implementation is shut down, it finalizes each device. Device finalization
25 occurs in three steps. First, the OpenMP implementation halts any tracing in progress for the
26 device. Second, the OpenMP implementation flushes all trace records collected for the device
27 and uses the buffer completion callback associated with that device to present them to the tool.
28 Finally, the OpenMP implementation dispatches any ompt_callback_device_finalize
29 callback registered for the device.
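The two tool callbacks that are passed to ompt_start_trace, and the cursor loop inside the buffer completion callback, can be sketched as follows. The two typedefs are stand-ins assumed to match omp-tools.h, and the callback parameter lists are assumed to follow the buffer request and buffer completion callback signatures. The advance function is a simplified, hypothetical stand-in for ompt_advance_buffer_cursor (the real entry point also takes a device pointer), and the fixed 16-byte record size exists only to make the sketch executable.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins assumed to match <omp-tools.h>. */
typedef void ompt_buffer_t;
typedef uint64_t ompt_buffer_cursor_t;

#define SKETCH_BUFFER_BYTES 4096u
#define SKETCH_RECORD_BYTES 16u     /* hypothetical fixed record size */

/* Simplified stand-in for the ompt_advance_buffer_cursor entry point. */
typedef int (*sketch_advance_fn)(ompt_buffer_t *buffer, size_t size,
                                 ompt_buffer_cursor_t current,
                                 ompt_buffer_cursor_t *next);

static sketch_advance_fn sketch_advance;    /* bound by the tool earlier */
static size_t sketch_records_seen = 0;

/* Buffer request callback: hands a new buffer to the implementation. */
void sketch_buffer_request(int device_num, ompt_buffer_t **buffer,
                           size_t *bytes)
{
    (void)device_num;
    *buffer = malloc(SKETCH_BUFFER_BYTES);
    *bytes = (*buffer != NULL) ? SKETCH_BUFFER_BYTES : 0;
}

/* Buffer completion callback: visits every record in a non-empty
 * sequence, then releases the buffer if the tool owns it. */
void sketch_buffer_complete(int device_num, ompt_buffer_t *buffer,
                            size_t bytes, ompt_buffer_cursor_t begin,
                            int buffer_owned)
{
    ompt_buffer_cursor_t current = begin;
    int more = (bytes > 0);

    (void)device_num;
    while (more) {
        sketch_records_seen++;              /* inspect the record here */
        more = sketch_advance(buffer, bytes, current, &current);
    }
    if (buffer_owned)
        free(buffer);
}

/* Fake advance function: the cursor is a byte offset into the buffer. */
int sketch_fixed_advance(ompt_buffer_t *buffer, size_t size,
                         ompt_buffer_cursor_t current,
                         ompt_buffer_cursor_t *next)
{
    (void)buffer;
    *next = current + SKETCH_RECORD_BYTES;
    return *next < size;
}
```

A completion callback that returns immediately without iterating is also permitted; the loop is needed only when the tool inspects the records.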

30 Restrictions
31 Restrictions on tracing activity on devices are as follows:
32 • Implementation-defined names must not start with the prefix ompt_, which is reserved for the
33 OpenMP specification.

34 Cross References
35 • ompt_advance_buffer_cursor_t, see Section 19.6.2.10
36 • ompt_callback_device_finalize_t, see Section 19.5.2.20
37 • ompt_callback_device_initialize_t, see Section 19.5.2.19



1 • ompt_flush_trace_t, see Section 19.6.2.8
2 • ompt_get_device_num_procs_t, see Section 19.6.2.1
3 • ompt_get_device_time_t, see Section 19.6.2.2
4 • ompt_get_record_abstract_t, see Section 19.6.2.14
5 • ompt_get_record_native_t, see Section 19.6.2.13
6 • ompt_get_record_ompt_t, see Section 19.6.2.12
7 • ompt_get_record_type_t, see Section 19.6.2.11
8 • ompt_pause_trace_t, see Section 19.6.2.7
9 • ompt_set_trace_native_t, see Section 19.6.2.5
10 • ompt_set_trace_ompt_t, see Section 19.6.2.4
11 • ompt_start_trace_t, see Section 19.6.2.6
12 • ompt_stop_trace_t, see Section 19.6.2.9
13 • ompt_translate_time_t, see Section 19.6.2.3

14 19.3 Finalizing a First-Party Tool


15 If the OMPT interface state is active, the tool finalizer, which has type signature
16 ompt_finalize_t and is specified by the finalize field in the
17 ompt_start_tool_result_t structure returned from the ompt_start_tool function, is
18 called when the OpenMP implementation shuts down.

19 Cross References
20 • ompt_finalize_t, see Section 19.5.1.2

21 19.4 OMPT Data Types


The C/C++ header file (omp-tools.h) provides the definitions of the types that are specified throughout this section.

24 19.4.1 Tool Initialization and Finalization


25 Summary
26 A tool’s implementation of ompt_start_tool returns a pointer to an
27 ompt_start_tool_result_t structure, which contains pointers to the tool’s initialization
28 and finalization callbacks as well as an ompt_data_t object for use by the tool.



1 Format
C / C++
typedef struct ompt_start_tool_result_t {
  ompt_initialize_t initialize;
  ompt_finalize_t finalize;
  ompt_data_t tool_data;
} ompt_start_tool_result_t;
C / C++
7 Restrictions
8 Restrictions to the ompt_start_tool_result_t type are as follows:
9 • The initialize and finalize callback pointer values in an ompt_start_tool_result_t
10 structure that ompt_start_tool returns must be non-null.
11 Cross References
12 • ompt_data_t, see Section 19.4.4.4
13 • ompt_finalize_t, see Section 19.5.1.2
14 • ompt_initialize_t, see Section 19.5.1.1
15 • ompt_start_tool, see Section 19.2.1

16 19.4.2 Callbacks
17 Summary
18 The ompt_callbacks_t enumeration type indicates the integer codes used to identify OpenMP
19 callbacks when registering or querying them.
20 Format
C / C++
typedef enum ompt_callbacks_t {
  ompt_callback_thread_begin = 1,
  ompt_callback_thread_end = 2,
  ompt_callback_parallel_begin = 3,
  ompt_callback_parallel_end = 4,
  ompt_callback_task_create = 5,
  ompt_callback_task_schedule = 6,
  ompt_callback_implicit_task = 7,
  ompt_callback_target = 8,
  ompt_callback_target_data_op = 9,
  ompt_callback_target_submit = 10,
  ompt_callback_control_tool = 11,
  ompt_callback_device_initialize = 12,
  ompt_callback_device_finalize = 13,
  ompt_callback_device_load = 14,
  ompt_callback_device_unload = 15,
  ompt_callback_sync_region_wait = 16,
  ompt_callback_mutex_released = 17,
  ompt_callback_dependences = 18,
  ompt_callback_task_dependence = 19,
  ompt_callback_work = 20,
  ompt_callback_masked = 21,
  ompt_callback_master /*(deprecated)*/ = ompt_callback_masked,
  ompt_callback_target_map = 22,
  ompt_callback_sync_region = 23,
  ompt_callback_lock_init = 24,
  ompt_callback_lock_destroy = 25,
  ompt_callback_mutex_acquire = 26,
  ompt_callback_mutex_acquired = 27,
  ompt_callback_nest_lock = 28,
  ompt_callback_flush = 29,
  ompt_callback_cancel = 30,
  ompt_callback_reduction = 31,
  ompt_callback_dispatch = 32,
  ompt_callback_target_emi = 33,
  ompt_callback_target_data_op_emi = 34,
  ompt_callback_target_submit_emi = 35,
  ompt_callback_target_map_emi = 36,
  ompt_callback_error = 37
} ompt_callbacks_t;
C / C++

25 19.4.3 Tracing
26 OpenMP provides type definitions that support tracing with OMPT.

27 19.4.3.1 Record Type


28 Summary
29 The ompt_record_t enumeration type indicates the integer codes used to identify OpenMP
30 trace record formats.
31 Format
C / C++
32 typedef enum ompt_record_t {
33 ompt_record_ompt = 1,
34 ompt_record_native = 2,
35 ompt_record_invalid = 3
36 } ompt_record_t;
C / C++

1 19.4.3.2 Native Record Kind
2 Summary
3 The ompt_record_native_t enumeration type indicates the integer codes used to identify
4 OpenMP native trace record contents.

5 Format
C / C++
6 typedef enum ompt_record_native_t {
7 ompt_record_native_info = 1,
8 ompt_record_native_event = 2
9 } ompt_record_native_t;
C / C++

10 19.4.3.3 Native Record Abstract Type


11 Summary
12 The ompt_record_abstract_t type provides an abstract trace record format that is used to
13 summarize native device trace records.

14 Format
C / C++
15 typedef struct ompt_record_abstract_t {
16 ompt_record_native_t rclass;
17 const char *type;
18 ompt_device_time_t start_time;
19 ompt_device_time_t end_time;
20 ompt_hwid_t hwid;
21 } ompt_record_abstract_t;
C / C++
22 Semantics
23 An ompt_record_abstract_t record contains information that a tool can use to process a
24 native record that it may not fully understand. The rclass field indicates that the record is
25 informational or that it represents an event; this information can help a tool determine how to
26 present the record. The record type field points to a statically-allocated, immutable character string
27 that provides a meaningful name that a tool can use to describe the event to a user. The start_time
28 and end_time fields are used to place an event in time. The times are relative to the device clock. If
29 an event does not have an associated start_time (end_time), the value of the start_time (end_time)
30 field is ompt_time_none. The hardware identifier field, hwid, indicates the location on the
31 device where the event occurred. A hwid may represent a hardware abstraction such as a core or a
32 hardware thread identifier. The meaning of a hwid value for a device is implementation defined. If
33 no hardware abstraction is associated with the record then the value of hwid is ompt_hwid_none.
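The handling of absent fields described above can be sketched as follows. The declarations are local stand-ins for those in omp-tools.h, with ompt_time_none and ompt_hwid_none defined as 0 in line with this chapter; the helper names are hypothetical.

```c
#include <stdint.h>

/* Local stand-ins for declarations that <omp-tools.h> normally provides. */
typedef uint64_t ompt_device_time_t;
typedef uint64_t ompt_hwid_t;
#define ompt_time_none ((ompt_device_time_t)0)
#define ompt_hwid_none ((ompt_hwid_t)0)

typedef enum ompt_record_native_t {
    ompt_record_native_info = 1,
    ompt_record_native_event = 2
} ompt_record_native_t;

typedef struct ompt_record_abstract_t {
    ompt_record_native_t rclass;
    const char *type;
    ompt_device_time_t start_time;
    ompt_device_time_t end_time;
    ompt_hwid_t hwid;
} ompt_record_abstract_t;

/* Hypothetical helper: stores end_time - start_time (in device clock units)
   in *duration and returns 1 when both timestamps are present; returns 0
   when either one is ompt_time_none. */
int abstract_record_duration(const ompt_record_abstract_t *rec,
                             ompt_device_time_t *duration)
{
    if (rec->start_time == ompt_time_none || rec->end_time == ompt_time_none)
        return 0;
    *duration = rec->end_time - rec->start_time;
    return 1;
}

/* Hypothetical helper: reports whether a hardware abstraction is attached. */
int abstract_record_has_hwid(const ompt_record_abstract_t *rec)
{
    return rec->hwid != ompt_hwid_none;
}
```

A tool that presents such records would typically check both helpers before printing a duration or a location, since either piece of information may be missing.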

1 19.4.3.4 Standard Trace Record Type
2 Summary
3 The ompt_record_ompt_t type provides a standard complete trace record format.

4 Format
C / C++
5 typedef struct ompt_record_ompt_t {
6 ompt_callbacks_t type;
7 ompt_device_time_t time;
8 ompt_id_t thread_id;
9 ompt_id_t target_id;
10 union {
11 ompt_record_thread_begin_t thread_begin;
12 ompt_record_parallel_begin_t parallel_begin;
13 ompt_record_parallel_end_t parallel_end;
14 ompt_record_work_t work;
15 ompt_record_dispatch_t dispatch;
16 ompt_record_task_create_t task_create;
17 ompt_record_dependences_t dependences;
18 ompt_record_task_dependence_t task_dependence;
19 ompt_record_task_schedule_t task_schedule;
20 ompt_record_implicit_task_t implicit_task;
21 ompt_record_masked_t masked;
22 ompt_record_sync_region_t sync_region;
23 ompt_record_mutex_acquire_t mutex_acquire;
24 ompt_record_mutex_t mutex;
25 ompt_record_nest_lock_t nest_lock;
26 ompt_record_flush_t flush;
27 ompt_record_cancel_t cancel;
28 ompt_record_target_t target;
29 ompt_record_target_data_op_t target_data_op;
30 ompt_record_target_map_t target_map;
31 ompt_record_target_kernel_t target_kernel;
32 ompt_record_control_tool_t control_tool;
33 ompt_record_error_t error;
34 } record;
35 } ompt_record_ompt_t;
C / C++
36 Semantics
The type field specifies the kind of record that this structure holds. Event-specific information
is stored in the member of the record union that matches the value of type.

1 Restrictions
2 Restrictions to the ompt_record_ompt_t type are as follows:
• If type is set to ompt_callback_thread_end then the value of record is undefined.

4 19.4.4 Miscellaneous Type Definitions


5 This section describes miscellaneous types and enumerations used by the tool interface.

6 19.4.4.1 ompt_callback_t
7 Summary
8 Pointers to tool callback functions with different type signatures are passed to the
9 ompt_set_callback runtime entry point and returned by the ompt_get_callback
10 runtime entry point. For convenience, these runtime entry points expect all type signatures to be
11 cast to a dummy type ompt_callback_t.
12 Format
C / C++
13 typedef void (*ompt_callback_t) (void);
C / C++
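The cast to and from the dummy type can be sketched as follows. The declarations are local stand-ins for those in omp-tools.h; on_thread_end, erase_signature and invoke_thread_end are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Local stand-ins for declarations that <omp-tools.h> normally provides. */
typedef union ompt_data_t { uint64_t value; void *ptr; } ompt_data_t;
typedef void (*ompt_callback_t)(void);
typedef void (*ompt_callback_thread_end_t)(ompt_data_t *thread_data);

/* Hypothetical tool callback with the thread-end signature. */
void on_thread_end(ompt_data_t *thread_data)
{
    thread_data->value += 1;
}

/* Erase the signature, as a tool does before registering a callback. */
ompt_callback_t erase_signature(ompt_callback_thread_end_t cb)
{
    return (ompt_callback_t)cb;
}

/* Restore the original signature before the call. Calling through the
   dummy type directly would be undefined behavior in C. */
void invoke_thread_end(ompt_callback_t cb, ompt_data_t *thread_data)
{
    ((ompt_callback_thread_end_t)cb)(thread_data);
}
```

C guarantees that a function pointer converted to another function pointer type and back compares equal to the original, which is what makes a single dummy type workable for all callback signatures.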
14 19.4.4.2 ompt_set_result_t
15 Summary
16 The ompt_set_result_t enumeration type corresponds to values that the
17 ompt_set_callback, ompt_set_trace_ompt and ompt_set_trace_native
18 runtime entry points return.
19 Format
C / C++
20 typedef enum ompt_set_result_t {
21 ompt_set_error = 0,
22 ompt_set_never = 1,
23 ompt_set_impossible = 2,
24 ompt_set_sometimes = 3,
25 ompt_set_sometimes_paired = 4,
26 ompt_set_always = 5
27 } ompt_set_result_t;
C / C++
28 Semantics
Values of ompt_set_result_t may indicate several possible outcomes. The ompt_set_error value
indicates that the associated call failed. Otherwise, the value indicates whether an event may
occur and, where applicable, whether its occurrence is traced or leads to the invocation of the
callback. The ompt_set_never value indicates that the event will never occur or that the callback
will never be invoked at runtime. The ompt_set_impossible value indicates that the event may occur
but that tracing of it is not possible. The ompt_set_sometimes value indicates that the event may
occur and, for an implementation-defined subset of associated event occurrences, will be traced or
the callback will be invoked at runtime. The ompt_set_sometimes_paired value indicates the same
result as ompt_set_sometimes and, in addition, that a callback with an endpoint value of
ompt_scope_begin will be invoked if and only if the same callback with an endpoint value of
ompt_scope_end will also be invoked sometime in the future. The ompt_set_always value indicates
that, whenever an associated event occurs, it will be traced or the callback will be invoked.

7 Cross References
8 • ompt_set_callback_t, see Section 19.6.1.3
9 • ompt_set_trace_native_t, see Section 19.6.2.5
10 • ompt_set_trace_ompt_t, see Section 19.6.2.4
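A tool usually needs to reduce these outcomes to a simpler decision, for example whether it can expect to see an event at all. The following sketch shows one way to do so; the enumeration is a local copy of the one above, and the helper names are hypothetical.

```c
/* Local copy of the enumeration that <omp-tools.h> normally provides. */
typedef enum ompt_set_result_t {
    ompt_set_error = 0,
    ompt_set_never = 1,
    ompt_set_impossible = 2,
    ompt_set_sometimes = 3,
    ompt_set_sometimes_paired = 4,
    ompt_set_always = 5
} ompt_set_result_t;

/* Hypothetical helper: returns 1 if at least some occurrences of the event
   will be traced or delivered to the callback, 0 otherwise. */
int tool_may_observe(ompt_set_result_t r)
{
    switch (r) {
    case ompt_set_sometimes:
    case ompt_set_sometimes_paired:
    case ompt_set_always:
        return 1;
    default:
        return 0; /* error, never, impossible, or an unknown value */
    }
}

/* Hypothetical helper: returns a printable name for diagnostics. */
const char *tool_set_result_name(ompt_set_result_t r)
{
    switch (r) {
    case ompt_set_error:            return "error";
    case ompt_set_never:            return "never";
    case ompt_set_impossible:       return "impossible";
    case ompt_set_sometimes:        return "sometimes";
    case ompt_set_sometimes_paired: return "sometimes_paired";
    case ompt_set_always:           return "always";
    }
    return "unknown";
}
```

Treating unknown values like failures is a conservative choice made for this sketch, not a requirement of the interface.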

11 19.4.4.3 ompt_id_t
12 Summary
13 The ompt_id_t type is used to provide various identifiers to tools.
14 Format
C / C++
15 typedef uint64_t ompt_id_t;
C / C++
16 Semantics
17 When tracing asynchronous activity on devices, identifiers enable tools to correlate target regions
18 and operations that the host initiates with associated activities on a target device. In addition,
19 OMPT provides identifiers to refer to parallel regions and tasks that execute on a device. These
20 various identifiers are of type ompt_id_t.
21 ompt_id_none is defined as an instance of type ompt_id_t with the value 0.

22 Restrictions
23 Restrictions to the ompt_id_t type are as follows:
24 • Identifiers created on each device must be unique from the time an OpenMP implementation is
25 initialized until it is shut down. Identifiers for each target region and target data operation
26 instance that the host device initiates must be unique over time on the host. Identifiers for parallel
27 and task region instances that execute on a device must be unique over time within that device.

28 19.4.4.4 ompt_data_t
29 Summary
30 The ompt_data_t type represents data associated with threads and with parallel and task regions.

1 Format
C / C++
2 typedef union ompt_data_t {
3 uint64_t value;
4 void *ptr;
5 } ompt_data_t;
C / C++
6 Semantics
The ompt_data_t type represents data that is reserved for tool use and that is related to a thread
or to a parallel or task region. When an OpenMP implementation creates a thread or an instance of
a parallel, teams, task, or target region, it initializes the associated ompt_data_t object with
the value ompt_data_none, which is an instance of the type with the value and ptr fields
equal to 0.
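One common use of an ompt_data_t object is to attach tool-private state through the ptr field, relying on the initial ompt_data_none value to detect that nothing is attached yet. The following sketch assumes a hypothetical tool_thread_info structure and hypothetical helper names; the union is a local copy of the definition above.

```c
#include <stdint.h>
#include <stdlib.h>

/* Local copy of the union that <omp-tools.h> normally provides. */
typedef union ompt_data_t { uint64_t value; void *ptr; } ompt_data_t;

/* Hypothetical per-thread bookkeeping kept by a tool. */
typedef struct tool_thread_info {
    uint64_t events_seen;
} tool_thread_info;

/* Attach a fresh record unless one is already attached.
   Returns 1 on success and 0 if allocation fails. */
int tool_attach_thread_info(ompt_data_t *thread_data)
{
    if (thread_data->ptr != NULL)
        return 1;
    tool_thread_info *info = calloc(1, sizeof *info);
    if (info == NULL)
        return 0;
    thread_data->ptr = info;
    return 1;
}

/* Count one event against the attached record, if any. */
void tool_count_event(ompt_data_t *thread_data)
{
    tool_thread_info *info = thread_data->ptr;
    if (info != NULL)
        info->events_seen++;
}

/* Release the record and return the number of events it counted. */
uint64_t tool_detach_thread_info(ompt_data_t *thread_data)
{
    tool_thread_info *info = thread_data->ptr;
    uint64_t seen = (info != NULL) ? info->events_seen : 0;
    free(info);
    thread_data->ptr = NULL;
    return seen;
}
```

A tool would typically call the attach helper from its thread-begin callback and the detach helper from its thread-end callback.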

12 19.4.4.5 ompt_device_t
13 Summary
14 The ompt_device_t opaque object type represents a device.
15 Format
C / C++
16 typedef void ompt_device_t;
C / C++

17 19.4.4.6 ompt_device_time_t
18 Summary
19 The ompt_device_time_t type represents raw device time values.
20 Format
C / C++
21 typedef uint64_t ompt_device_time_t;
C / C++
22 Semantics
23 The ompt_device_time_t opaque object type represents raw device time values.
24 ompt_time_none refers to an unknown or unspecified time and is defined as an instance of type
25 ompt_device_time_t with the value 0.

26 19.4.4.7 ompt_buffer_t
27 Summary
28 The ompt_buffer_t opaque object type is a handle for a target buffer.

1 Format
C / C++
2 typedef void ompt_buffer_t;
C / C++

3 19.4.4.8 ompt_buffer_cursor_t
4 Summary
5 The ompt_buffer_cursor_t opaque type is a handle for a position in a target buffer.
6 Format
C / C++
7 typedef uint64_t ompt_buffer_cursor_t;
C / C++

8 19.4.4.9 ompt_dependence_t
9 Summary
10 The ompt_dependence_t type represents a task dependence.
11 Format
C / C++
12 typedef struct ompt_dependence_t {
13 ompt_data_t variable;
14 ompt_dependence_type_t dependence_type;
15 } ompt_dependence_t;
C / C++
16 Semantics
17 The ompt_dependence_t type is a structure that holds information about a depend clause. For
18 task dependences, the variable field points to the storage location of the dependence. For doacross
19 dependences, the variable field contains the value of a vector element that describes the
20 dependence. The dependence_type field indicates the type of the dependence.

21 Cross References
22 • ompt_dependence_type_t, see Section 19.4.4.24

23 19.4.4.10 ompt_thread_t
24 Summary
25 The ompt_thread_t enumeration type defines the valid thread type values.

1 Format
C / C++
2 typedef enum ompt_thread_t {
3 ompt_thread_initial = 1,
4 ompt_thread_worker = 2,
5 ompt_thread_other = 3,
6 ompt_thread_unknown = 4
7 } ompt_thread_t;
C / C++
8 Semantics
9 Any initial thread has thread type ompt_thread_initial. All OpenMP threads that are not
10 initial threads have thread type ompt_thread_worker. A thread that an OpenMP
11 implementation uses but that does not execute user code has thread type ompt_thread_other.
12 Any thread that is created outside an OpenMP implementation and that is not an initial thread has
13 thread type ompt_thread_unknown.

14 19.4.4.11 ompt_scope_endpoint_t
15 Summary
16 The ompt_scope_endpoint_t enumeration type defines valid scope endpoint values.
17 Format
C / C++
18 typedef enum ompt_scope_endpoint_t {
19 ompt_scope_begin = 1,
20 ompt_scope_end = 2,
21 ompt_scope_beginend = 3
22 } ompt_scope_endpoint_t;
C / C++
23 19.4.4.12 ompt_dispatch_t
24 Summary
25 The ompt_dispatch_t enumeration type defines the valid dispatch kind values.
26 Format
C / C++
27 typedef enum ompt_dispatch_t {
28 ompt_dispatch_iteration = 1,
29 ompt_dispatch_section = 2,
30 ompt_dispatch_ws_loop_chunk = 3,
31 ompt_dispatch_taskloop_chunk = 4,
32 ompt_dispatch_distribute_chunk = 5
33 } ompt_dispatch_t;
C / C++

1 19.4.4.13 ompt_dispatch_chunk_t
2 Summary
The ompt_dispatch_chunk_t type represents the chunk information for a dispatched chunk.

4 Format
C / C++
5 typedef struct ompt_dispatch_chunk_t {
6 uint64_t start;
7 uint64_t iterations;
8 } ompt_dispatch_chunk_t;
C / C++
9 Semantics
10 The ompt_dispatch_chunk_t type is a structure that holds information about a chunk of
11 logical iterations of a loop nest. The start field specifies the first logical iteration of the chunk and
12 the iterations field specifies the number of iterations in the chunk. Whether the chunk of a taskloop
13 is contiguous is implementation defined.
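For a contiguous chunk, the start and iterations fields describe the half-open range of logical iterations from start to start + iterations. The sketch below uses that interpretation; it is an assumption for taskloop chunks, for which contiguity is implementation defined, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Local copy of the structure that <omp-tools.h> normally provides. */
typedef struct ompt_dispatch_chunk_t {
    uint64_t start;
    uint64_t iterations;
} ompt_dispatch_chunk_t;

/* Hypothetical helper: first logical iteration after the chunk,
   assuming that the chunk is contiguous. */
uint64_t chunk_end(const ompt_dispatch_chunk_t *chunk)
{
    return chunk->start + chunk->iterations;
}

/* Hypothetical helper: returns 1 if logical iteration 'it' lies within the
   chunk, again assuming contiguity. */
int chunk_contains(const ompt_dispatch_chunk_t *chunk, uint64_t it)
{
    return it >= chunk->start && it - chunk->start < chunk->iterations;
}
```

The containment test subtracts before comparing so that it remains correct even if start + iterations would wrap around in 64-bit arithmetic.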

14 19.4.4.14 ompt_sync_region_t
15 Summary
16 The ompt_sync_region_t enumeration type defines the valid synchronization region kind
17 values.
18 Format
C / C++
19 typedef enum ompt_sync_region_t {
20 ompt_sync_region_barrier = 1, // deprecated
21 ompt_sync_region_barrier_implicit = 2, // deprecated
22 ompt_sync_region_barrier_explicit = 3,
23 ompt_sync_region_barrier_implementation = 4,
24 ompt_sync_region_taskwait = 5,
25 ompt_sync_region_taskgroup = 6,
26 ompt_sync_region_reduction = 7,
27 ompt_sync_region_barrier_implicit_workshare = 8,
28 ompt_sync_region_barrier_implicit_parallel = 9,
29 ompt_sync_region_barrier_teams = 10
30 } ompt_sync_region_t;
C / C++

31 19.4.4.15 ompt_target_data_op_t
32 Summary
33 The ompt_target_data_op_t enumeration type defines the valid target data operation values.

1 Format
C / C++
2 typedef enum ompt_target_data_op_t {
3 ompt_target_data_alloc = 1,
4 ompt_target_data_transfer_to_device = 2,
5 ompt_target_data_transfer_from_device = 3,
6 ompt_target_data_delete = 4,
7 ompt_target_data_associate = 5,
8 ompt_target_data_disassociate = 6,
9 ompt_target_data_alloc_async = 17,
10 ompt_target_data_transfer_to_device_async = 18,
11 ompt_target_data_transfer_from_device_async = 19,
12 ompt_target_data_delete_async = 20
13 } ompt_target_data_op_t;
C / C++

14 19.4.4.16 ompt_work_t
15 Summary
16 The ompt_work_t enumeration type defines the valid work type values.
17 Format
C / C++
18 typedef enum ompt_work_t {
19 ompt_work_loop = 1,
20 ompt_work_sections = 2,
21 ompt_work_single_executor = 3,
22 ompt_work_single_other = 4,
23 ompt_work_workshare = 5,
24 ompt_work_distribute = 6,
25 ompt_work_taskloop = 7,
26 ompt_work_scope = 8,
27 ompt_work_loop_static = 10,
28 ompt_work_loop_dynamic = 11,
29 ompt_work_loop_guided = 12,
30 ompt_work_loop_other = 13
31 } ompt_work_t;
C / C++

32 19.4.4.17 ompt_mutex_t
33 Summary
34 The ompt_mutex_t enumeration type defines the valid mutex kind values.

1 Format
C / C++
2 typedef enum ompt_mutex_t {
3 ompt_mutex_lock = 1,
4 ompt_mutex_test_lock = 2,
5 ompt_mutex_nest_lock = 3,
6 ompt_mutex_test_nest_lock = 4,
7 ompt_mutex_critical = 5,
8 ompt_mutex_atomic = 6,
9 ompt_mutex_ordered = 7
10 } ompt_mutex_t;
C / C++

11 19.4.4.18 ompt_native_mon_flag_t
12 Summary
13 The ompt_native_mon_flag_t enumeration type defines the valid native monitoring flag
14 values.
15 Format
C / C++
16 typedef enum ompt_native_mon_flag_t {
17 ompt_native_data_motion_explicit = 0x01,
18 ompt_native_data_motion_implicit = 0x02,
19 ompt_native_kernel_invocation = 0x04,
20 ompt_native_kernel_execution = 0x08,
21 ompt_native_driver = 0x10,
22 ompt_native_runtime = 0x20,
23 ompt_native_overhead = 0x40,
24 ompt_native_idleness = 0x80
25 } ompt_native_mon_flag_t;
C / C++

26 19.4.4.19 ompt_task_flag_t
27 Summary
28 The ompt_task_flag_t enumeration type defines valid task types.
29 Format
C / C++
typedef enum ompt_task_flag_t {
    ompt_task_initial = 0x00000001,
    ompt_task_implicit = 0x00000002,
    ompt_task_explicit = 0x00000004,
    ompt_task_target = 0x00000008,
    ompt_task_taskwait = 0x00000010,
    ompt_task_undeferred = 0x08000000,
    ompt_task_untied = 0x10000000,
    ompt_task_final = 0x20000000,
    ompt_task_mergeable = 0x40000000,
    ompt_task_merged = 0x80000000
} ompt_task_flag_t;
C / C++
7 Semantics
8 The ompt_task_flag_t enumeration type defines valid task type values. The least significant
9 byte provides information about the general classification of the task. The other bits represent
10 properties of the task.
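The split between the classification byte and the property bits can be sketched as follows. The flag values mirror the enumeration above but are written as unsigned macros so that the sketch does not depend on how a compiler represents the 0x80000000 enumerator; TASK_CLASS_MASK and the helper names are hypothetical.

```c
/* Local mirrors of the ompt_task_flag_t values, as unsigned constants. */
#define ompt_task_initial    0x00000001u
#define ompt_task_implicit   0x00000002u
#define ompt_task_explicit   0x00000004u
#define ompt_task_target     0x00000008u
#define ompt_task_taskwait   0x00000010u
#define ompt_task_undeferred 0x08000000u
#define ompt_task_untied     0x10000000u
#define ompt_task_final      0x20000000u
#define ompt_task_mergeable  0x40000000u
#define ompt_task_merged     0x80000000u

/* Hypothetical mask for the least significant byte (task classification). */
#define TASK_CLASS_MASK 0x000000FFu

/* Returns the general classification of the task. */
unsigned task_class(unsigned flags)
{
    return flags & TASK_CLASS_MASK;
}

/* Returns 1 if the given property bit is set, 0 otherwise. */
int task_has_property(unsigned flags, unsigned property)
{
    return (flags & property) != 0;
}
```

For example, flags equal to ompt_task_explicit | ompt_task_untied classify the task as explicit and mark it as untied.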

11 19.4.4.20 ompt_task_status_t
12 Summary
13 The ompt_task_status_t enumeration type indicates the reason that a task was switched
14 when it reached a task scheduling point.

15 Format
C / C++
16 typedef enum ompt_task_status_t {
17 ompt_task_complete = 1,
18 ompt_task_yield = 2,
19 ompt_task_cancel = 3,
20 ompt_task_detach = 4,
21 ompt_task_early_fulfill = 5,
22 ompt_task_late_fulfill = 6,
23 ompt_task_switch = 7,
24 ompt_taskwait_complete = 8
25 } ompt_task_status_t;
C / C++
26 Semantics
The value ompt_task_complete of the ompt_task_status_t type indicates that the task that
encountered the task scheduling point completed execution of the associated structured block and
an associated allow-completion event was fulfilled. The value ompt_task_yield indicates that the
task encountered a taskyield construct. The value ompt_task_cancel indicates that the task was
canceled when it encountered an active cancellation point. The value ompt_task_detach indicates
that a task for which the detach clause was specified completed execution of the associated
structured block and is waiting for an allow-completion event to be fulfilled. The value
ompt_task_early_fulfill indicates that the allow-completion event of the task was fulfilled before
the task completed execution of the associated structured block. The value ompt_task_late_fulfill
indicates that the allow-completion event of the task was fulfilled after the task completed
execution of the associated structured block. The value ompt_taskwait_complete indicates
completion of the dependent task that results from a taskwait construct with one or more depend
clauses. The value ompt_task_switch is used for all other cases that a task was switched.

8 19.4.4.21 ompt_target_t
9 Summary
10 The ompt_target_t enumeration type defines the valid target type values.

11 Format
C / C++
typedef enum ompt_target_t {
    ompt_target = 1,
    ompt_target_enter_data = 2,
    ompt_target_exit_data = 3,
    ompt_target_update = 4,
    ompt_target_nowait = 9,
    ompt_target_enter_data_nowait = 10,
    ompt_target_exit_data_nowait = 11,
    ompt_target_update_nowait = 12
} ompt_target_t;
C / C++

22 19.4.4.22 ompt_parallel_flag_t
23 Summary
24 The ompt_parallel_flag_t enumeration type defines valid invoker values.

25 Format
C / C++
26 typedef enum ompt_parallel_flag_t {
27 ompt_parallel_invoker_program = 0x00000001,
28 ompt_parallel_invoker_runtime = 0x00000002,
29 ompt_parallel_league = 0x40000000,
30 ompt_parallel_team = 0x80000000
31 } ompt_parallel_flag_t;
C / C++

1 Semantics
2 The ompt_parallel_flag_t enumeration type defines valid invoker values, which indicate
3 how an outlined function is invoked. The value ompt_parallel_invoker_program
4 indicates that the outlined function associated with implicit tasks for the region is invoked directly
5 by the application on the primary thread for a parallel region. The value
6 ompt_parallel_invoker_runtime indicates that the outlined function associated with
7 implicit tasks for the region is invoked by the runtime on the primary thread for a parallel region.
8 The value ompt_parallel_league indicates that the callback is invoked due to the creation of
9 a league of teams by a teams construct. The value ompt_parallel_team indicates that the
10 callback is invoked due to the creation of a team of threads by a parallel construct.

11 19.4.4.23 ompt_target_map_flag_t
12 Summary
13 The ompt_target_map_flag_t enumeration type defines the valid target map flag values.

14 Format
C / C++
15 typedef enum ompt_target_map_flag_t {
16 ompt_target_map_flag_to = 0x01,
17 ompt_target_map_flag_from = 0x02,
18 ompt_target_map_flag_alloc = 0x04,
19 ompt_target_map_flag_release = 0x08,
20 ompt_target_map_flag_delete = 0x10,
21 ompt_target_map_flag_implicit = 0x20,
22 ompt_target_map_flag_always = 0x40,
23 ompt_target_map_flag_present = 0x80,
24 ompt_target_map_flag_close = 0x100,
25 ompt_target_map_flag_shared = 0x200
26 } ompt_target_map_flag_t;
C / C++
27 Semantics
The ompt_target_map_flag_<map-type> flag (for example, ompt_target_map_flag_to) is set if the
mapping operations have that map-type. If the map-type for the mapping operations is tofrom, both
the ompt_target_map_flag_to and ompt_target_map_flag_from flags are set. The
ompt_target_map_flag_implicit flag is set if the mapping operations result from implicit
data-mapping rules. The ompt_target_map_flag_<map-type-modifier> flag (for example,
ompt_target_map_flag_always) is set if the mapping operations are specified with that
map-type-modifier. The ompt_target_map_flag_shared flag is set if the original and corresponding
storage are shared in the mapping operation.
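Because several of these flags can be set at once, a tool typically tests them bit by bit. The sketch below decodes a flag word into a readable list; the enumeration is a local copy of the one above, and describe_map_flags and its name table are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Local copy of the enumeration that <omp-tools.h> normally provides. */
typedef enum ompt_target_map_flag_t {
    ompt_target_map_flag_to = 0x01,
    ompt_target_map_flag_from = 0x02,
    ompt_target_map_flag_alloc = 0x04,
    ompt_target_map_flag_release = 0x08,
    ompt_target_map_flag_delete = 0x10,
    ompt_target_map_flag_implicit = 0x20,
    ompt_target_map_flag_always = 0x40,
    ompt_target_map_flag_present = 0x80,
    ompt_target_map_flag_close = 0x100,
    ompt_target_map_flag_shared = 0x200
} ompt_target_map_flag_t;

/* Hypothetical table that pairs each flag bit with a short name. */
static const struct { unsigned bit; const char *name; } map_flag_names[] = {
    { ompt_target_map_flag_to, "to" },
    { ompt_target_map_flag_from, "from" },
    { ompt_target_map_flag_alloc, "alloc" },
    { ompt_target_map_flag_release, "release" },
    { ompt_target_map_flag_delete, "delete" },
    { ompt_target_map_flag_implicit, "implicit" },
    { ompt_target_map_flag_always, "always" },
    { ompt_target_map_flag_present, "present" },
    { ompt_target_map_flag_close, "close" },
    { ompt_target_map_flag_shared, "shared" }
};

/* Writes a '|'-separated list of the set flags into buf and returns the
   number of flag names written. Stops at the first name that does not fit. */
int describe_map_flags(unsigned flags, char *buf, size_t len)
{
    int count = 0;
    size_t used = 0;
    if (len == 0)
        return 0;
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof map_flag_names / sizeof map_flag_names[0]; i++) {
        if ((flags & map_flag_names[i].bit) == 0)
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         count ? "|" : "", map_flag_names[i].name);
        if (n < 0 || (size_t)n >= len - used) {
            buf[used] = '\0'; /* drop the partially written name */
            break;
        }
        used += (size_t)n;
        count++;
    }
    return count;
}
```

A tofrom mapping with the always modifier, for instance, has the to, from and always bits set and is rendered as "to|from|always".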

1 19.4.4.24 ompt_dependence_type_t
2 Summary
3 The ompt_dependence_type_t enumeration type defines the valid task dependence type
4 values.

5 Format
C / C++
6 typedef enum ompt_dependence_type_t {
7 ompt_dependence_type_in = 1,
8 ompt_dependence_type_out = 2,
9 ompt_dependence_type_inout = 3,
10 ompt_dependence_type_mutexinoutset = 4,
11 ompt_dependence_type_source = 5,
12 ompt_dependence_type_sink = 6,
13 ompt_dependence_type_inoutset = 7
14 } ompt_dependence_type_t;
C / C++

15 19.4.4.25 ompt_severity_t
16 Summary
17 The ompt_severity_t enumeration type defines the valid severity values.

18 Format
C / C++
19 typedef enum ompt_severity_t {
20 ompt_warning = 1,
21 ompt_fatal = 2
22 } ompt_severity_t;
C / C++

23 19.4.4.26 ompt_cancel_flag_t
24 Summary
25 The ompt_cancel_flag_t enumeration type defines the valid cancel flag values.

26 Format
C / C++
typedef enum ompt_cancel_flag_t {
    ompt_cancel_parallel = 0x01,
    ompt_cancel_sections = 0x02,
    ompt_cancel_loop = 0x04,
    ompt_cancel_taskgroup = 0x08,
    ompt_cancel_activated = 0x10,
    ompt_cancel_detected = 0x20,
    ompt_cancel_discarded_task = 0x40
} ompt_cancel_flag_t;
C / C++

5 19.4.4.27 ompt_hwid_t
6 Summary
7 The ompt_hwid_t opaque type is a handle for a hardware identifier for a target device.
8 Format
C / C++
9 typedef uint64_t ompt_hwid_t;
C / C++
10 Semantics
11 The ompt_hwid_t opaque type is a handle for a hardware identifier for a target device.
12 ompt_hwid_none is an instance of the type that refers to an unknown or unspecified hardware
13 identifier and that has the value 0. If no hwid is associated with an
14 ompt_record_abstract_t then the value of hwid is ompt_hwid_none.
15 Cross References
16 • Native Record Abstract Type, see Section 19.4.3.3

17 19.4.4.28 ompt_state_t
18 Summary
19 If the OMPT interface is in the active state then an OpenMP implementation must maintain thread
20 state information for each thread. The thread state maintained is an approximation of the
21 instantaneous state of a thread.
22 Format
C / C++
A thread state must be one of the values of the enumeration type ompt_state_t or an
implementation-defined state value of 512 or higher.
typedef enum ompt_state_t {
    ompt_state_work_serial = 0x000,
    ompt_state_work_parallel = 0x001,
    ompt_state_work_reduction = 0x002,

    ompt_state_wait_barrier = 0x010, // deprecated
    ompt_state_wait_barrier_implicit_parallel = 0x011,
    ompt_state_wait_barrier_implicit_workshare = 0x012,
    ompt_state_wait_barrier_implicit = 0x013, // deprecated
    ompt_state_wait_barrier_explicit = 0x014,
    ompt_state_wait_barrier_implementation = 0x015,
    ompt_state_wait_barrier_teams = 0x016,

    ompt_state_wait_taskwait = 0x020,
    ompt_state_wait_taskgroup = 0x021,

    ompt_state_wait_mutex = 0x040,
    ompt_state_wait_lock = 0x041,
    ompt_state_wait_critical = 0x042,
    ompt_state_wait_atomic = 0x043,
    ompt_state_wait_ordered = 0x044,

    ompt_state_wait_target = 0x080,
    ompt_state_wait_target_map = 0x081,
    ompt_state_wait_target_update = 0x082,

    ompt_state_idle = 0x100,
    ompt_state_overhead = 0x101,
    ompt_state_undefined = 0x102
} ompt_state_t;
C / C++
24 Semantics
25 A tool can query the OpenMP state of a thread at any time. If a tool queries the state of a thread that
26 is not associated with OpenMP then the implementation reports the state as
27 ompt_state_undefined.
The value ompt_state_work_serial indicates that the thread is executing code outside all parallel
regions. The value ompt_state_work_parallel indicates that the thread is executing code within the
scope of a parallel region. The value ompt_state_work_reduction indicates that the thread is
combining partial reduction results from threads in its team. An OpenMP implementation is
permitted never to report a thread in this state; a thread that is combining partial reduction
results may have its state reported as ompt_state_work_parallel or ompt_state_overhead.

The value ompt_state_wait_barrier_implicit_parallel indicates that the thread is waiting at the
implicit barrier at the end of a parallel region. The value
ompt_state_wait_barrier_implicit_workshare indicates that the thread is waiting at an implicit
barrier at the end of a worksharing construct. The value ompt_state_wait_barrier_explicit
indicates that the thread is waiting in an explicit barrier region. The value
ompt_state_wait_barrier_implementation indicates that the thread is waiting in a barrier not
required by the OpenMP standard but introduced by an OpenMP implementation. The value
ompt_state_wait_barrier_teams indicates that the thread is waiting at a barrier at the end of a
teams region.

The value ompt_state_wait_taskwait indicates that the thread is waiting at a taskwait construct.
The value ompt_state_wait_taskgroup indicates that the thread is waiting at the end of a taskgroup
construct.

The value ompt_state_wait_mutex indicates that the thread is waiting for a mutex of an unspecified
type. The value ompt_state_wait_lock indicates that the thread is waiting for a lock or nestable
lock. The value ompt_state_wait_critical indicates that the thread is waiting to enter a critical
region. The value ompt_state_wait_atomic indicates that the thread is waiting to enter an atomic
region. The value ompt_state_wait_ordered indicates that the thread is waiting to enter an ordered
region.

The value ompt_state_wait_target indicates that the thread is waiting for a target region to
complete. The value ompt_state_wait_target_map indicates that the thread is waiting for a target
data mapping operation to complete. An implementation may report ompt_state_wait_target for
target data constructs. The value ompt_state_wait_target_update indicates that the thread is
waiting for a target update operation to complete. An implementation may report
ompt_state_wait_target for target update constructs.

The value ompt_state_idle indicates that the thread is idle, that is, it is not part of an OpenMP
team. The value ompt_state_overhead indicates that the thread is in the overhead state at any
point while executing within the OpenMP runtime, except while waiting at a synchronization point.
The value ompt_state_undefined indicates that the native thread is not created by the OpenMP
implementation.
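A sampling tool often only needs to know whether a thread is waiting and whether a reported state is one that the enumeration defines. The following sketch shows such a classification; the enumeration is a local copy of the one in this section, and the helper names are hypothetical.

```c
/* Local copy of the enumeration that <omp-tools.h> normally provides. */
typedef enum ompt_state_t {
    ompt_state_work_serial = 0x000,
    ompt_state_work_parallel = 0x001,
    ompt_state_work_reduction = 0x002,
    ompt_state_wait_barrier = 0x010, /* deprecated */
    ompt_state_wait_barrier_implicit_parallel = 0x011,
    ompt_state_wait_barrier_implicit_workshare = 0x012,
    ompt_state_wait_barrier_implicit = 0x013, /* deprecated */
    ompt_state_wait_barrier_explicit = 0x014,
    ompt_state_wait_barrier_implementation = 0x015,
    ompt_state_wait_barrier_teams = 0x016,
    ompt_state_wait_taskwait = 0x020,
    ompt_state_wait_taskgroup = 0x021,
    ompt_state_wait_mutex = 0x040,
    ompt_state_wait_lock = 0x041,
    ompt_state_wait_critical = 0x042,
    ompt_state_wait_atomic = 0x043,
    ompt_state_wait_ordered = 0x044,
    ompt_state_wait_target = 0x080,
    ompt_state_wait_target_map = 0x081,
    ompt_state_wait_target_update = 0x082,
    ompt_state_idle = 0x100,
    ompt_state_overhead = 0x101,
    ompt_state_undefined = 0x102
} ompt_state_t;

/* Returns 1 for implementation-defined states (values of 512 or higher). */
int state_is_implementation_defined(int state)
{
    return state >= 512;
}

/* Returns 1 if the state is one of the wait states listed above. */
int state_is_waiting(int state)
{
    switch (state) {
    case ompt_state_wait_barrier:
    case ompt_state_wait_barrier_implicit_parallel:
    case ompt_state_wait_barrier_implicit_workshare:
    case ompt_state_wait_barrier_implicit:
    case ompt_state_wait_barrier_explicit:
    case ompt_state_wait_barrier_implementation:
    case ompt_state_wait_barrier_teams:
    case ompt_state_wait_taskwait:
    case ompt_state_wait_taskgroup:
    case ompt_state_wait_mutex:
    case ompt_state_wait_lock:
    case ompt_state_wait_critical:
    case ompt_state_wait_atomic:
    case ompt_state_wait_ordered:
    case ompt_state_wait_target:
    case ompt_state_wait_target_map:
    case ompt_state_wait_target_update:
        return 1;
    default:
        return 0;
    }
}
```

The helpers take an int rather than an ompt_state_t because an implementation-defined state need not correspond to any enumerator.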

23 19.4.4.29 ompt_frame_t
24 Summary
25 The ompt_frame_t type describes procedure frame information for an OpenMP task.

26 Format
C / C++
27 typedef struct ompt_frame_t {
28 ompt_data_t exit_frame;
29 ompt_data_t enter_frame;
30 int exit_frame_flags;
31 int enter_frame_flags;
32 } ompt_frame_t;
C / C++

1 Semantics
2 Each ompt_frame_t object is associated with the task to which the procedure frames belong.
3 Each non-merged initial, implicit, explicit, or target task with one or more frames on the stack of a
4 native thread has an associated ompt_frame_t object.
5 The exit_frame field of an ompt_frame_t object contains information to identify the first
6 procedure frame executing the task region. The exit_frame for the ompt_frame_t object
7 associated with the initial task that is not nested inside any OpenMP construct is
8 ompt_data_none.
9 The enter_frame field of an ompt_frame_t object contains information to identify the latest still
10 active procedure frame executing the task region before entering the OpenMP runtime
11 implementation or before executing a different task. If a task with frames on the stack is not
12 executing implementation code in the OpenMP runtime, the value of enter_frame for the
13 ompt_frame_t object associated with the task will be ompt_data_none.
The exit_frame_flags field (for exit_frame) and the enter_frame_flags field (for enter_frame)
indicate whether the provided frame information points to a runtime or an application frame
address. The same fields also specify the kind of information that is provided to identify the
frame. These fields are a disjunction of values in the ompt_frame_flag_t enumeration type.
18 The lifetime of an ompt_frame_t object begins when a task is created and ends when the task is
19 destroyed. Tools should not assume that a frame structure remains at a constant location in memory
20 throughout the lifetime of the task. A pointer to an ompt_frame_t object is passed to some
21 callbacks; a pointer to the ompt_frame_t object of a task can also be retrieved by a tool at any
22 time, including in a signal handler, by invoking the ompt_get_task_info runtime entry point
23 (described in Section 19.6.1.14). A pointer to an ompt_frame_t object that a tool retrieved is
24 valid as long as the tool does not pass back control to the OpenMP implementation.
Note – A monitoring tool that uses asynchronous sampling can observe values of exit_frame and
enter_frame at inconvenient times. Tools must be prepared to handle ompt_frame_t objects
observed just prior to when their field values will be set or cleared.

30 19.4.4.30 ompt_frame_flag_t
31 Summary
32 The ompt_frame_flag_t enumeration type defines valid frame information flags.

1 Format
C / C++
2 typedef enum ompt_frame_flag_t {
3 ompt_frame_runtime = 0x00,
4 ompt_frame_application = 0x01,
5 ompt_frame_cfa = 0x10,
6 ompt_frame_framepointer = 0x20,
7 ompt_frame_stackaddress = 0x30
8 } ompt_frame_flag_t;
C / C++
9 Semantics
10 The value ompt_frame_runtime of the ompt_frame_flag_t type indicates that a frame
11 address is a procedure frame in the OpenMP runtime implementation. The value
12 ompt_frame_application of the ompt_frame_flag_t type indicates that a frame
13 address is a procedure frame in the OpenMP application.
14 Higher order bits indicate the kind of provided information that is unique for the particular frame
15 pointer. The value ompt_frame_cfa indicates that a frame address specifies a canonical frame
16 address. The value ompt_frame_framepointer indicates that a frame address provides the
17 value of the frame pointer register. The value ompt_frame_stackaddress indicates that a
18 frame address specifies a pointer address that is contained in the current stack frame.
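Because ompt_frame_stackaddress (0x30) shares bits with ompt_frame_cfa and ompt_frame_framepointer, the kind of frame information is best compared after masking rather than tested bit by bit. The sketch below illustrates this; the two mask macros are assumptions made for this example and are not named by the interface, and the helper names are hypothetical.

```c
/* Local copy of the enumeration that <omp-tools.h> normally provides. */
typedef enum ompt_frame_flag_t {
    ompt_frame_runtime = 0x00,
    ompt_frame_application = 0x01,
    ompt_frame_cfa = 0x10,
    ompt_frame_framepointer = 0x20,
    ompt_frame_stackaddress = 0x30
} ompt_frame_flag_t;

/* Assumed masks: low bits select runtime or application, higher bits
   select the kind of frame information. */
#define FRAME_OWNER_MASK 0x0F
#define FRAME_KIND_MASK  0xF0

/* Returns 1 if the frame address belongs to the application. */
int frame_is_application(int flags)
{
    return (flags & FRAME_OWNER_MASK) == ompt_frame_application;
}

/* Returns the kind of frame information as an ompt_frame_flag_t value. */
int frame_kind(int flags)
{
    return flags & FRAME_KIND_MASK;
}
```

A test such as (flags & ompt_frame_cfa) != 0 would also succeed for ompt_frame_stackaddress, which is why the sketch compares the masked value for equality instead.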

19 19.4.4.31 ompt_wait_id_t
20 Summary
21 The ompt_wait_id_t type describes wait identifiers for an OpenMP thread.
22 Format
C / C++
23 typedef uint64_t ompt_wait_id_t;
C / C++
24 Semantics
25 Each thread maintains a wait identifier of type ompt_wait_id_t. When a task that a thread
26 executes is waiting for mutual exclusion, the wait identifier of the thread indicates the reason that
27 the thread is waiting. A wait identifier may represent a critical section name, a lock, a program
28 variable accessed in an atomic region, or a synchronization object that is internal to an OpenMP
29 implementation. When a thread is not in a wait state then the value of the wait identifier of the
30 thread is undefined. ompt_wait_id_none is defined as an instance of type
31 ompt_wait_id_t with the value 0.

19.5 OMPT Tool Callback Signatures and Trace Records
3 The C/C++ header file (omp-tools.h) provides the definitions of the types that are specified
4 throughout this subsection. Restrictions to the OpenMP tool callbacks are as follows:
5 Restrictions
6 • Tool callbacks may not use OpenMP directives or call any runtime library routines described in
7 Chapter 18.
8 • Tool callbacks must exit by either returning to the caller or aborting.

9 19.5.1 Initialization and Finalization Callback Signature


10 19.5.1.1 ompt_initialize_t
11 Summary
12 A callback with type signature ompt_initialize_t initializes the use of the OMPT interface.

13 Format
C / C++
typedef int (*ompt_initialize_t) (
    ompt_function_lookup_t lookup,
    int initial_device_num,
    ompt_data_t *tool_data
);
C / C++
19 Semantics
20 To use the OMPT interface, an implementation of ompt_start_tool must return a non-null
21 pointer to an ompt_start_tool_result_t structure that contains a pointer to a tool
22 initializer function with type signature ompt_initialize_t. An OpenMP implementation will
23 call the initializer after fully initializing itself but before beginning execution of any OpenMP
24 construct or runtime library routine. The initializer returns a non-zero value if it succeeds;
25 otherwise, the OMPT interface state changes to inactive as described in Section 19.2.3.

26 Description of Arguments
27 The lookup argument is a callback to an OpenMP runtime routine that must be used to obtain a
28 pointer to each runtime entry point in the OMPT interface. The initial_device_num argument
29 provides the value of omp_get_initial_device(). The tool_data argument is a pointer to
30 the tool_data field in the ompt_start_tool_result_t structure that ompt_start_tool
31 returned.

1 Cross References
2 • Tool Initialization and Finalization, see Section 19.4.1
3 • omp_get_initial_device, see Section 18.7.7
4 • ompt_data_t, see Section 19.4.4.4
5 • ompt_start_tool, see Section 19.2.1
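
As a minimal sketch of an initializer with this signature, the code below obtains one runtime entry point through lookup and reports success or failure through its return value. The names tool_initialize, tool_set_callback_fn, demo_lookup and demo_entry_point are hypothetical; demo_lookup merely stands in for the lookup routine that an OpenMP runtime would pass. The type definitions mirror omp-tools.h so that the sketch compiles without that header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors of OMPT types so that the sketch compiles without omp-tools.h. */
typedef union ompt_data_t { uint64_t value; void *ptr; } ompt_data_t;
typedef void (*ompt_interface_fn_t)(void);
typedef ompt_interface_fn_t (*ompt_function_lookup_t)(const char *name);

/* Hypothetical tool state: the entry point saved by the initializer. */
ompt_interface_fn_t tool_set_callback_fn = NULL;

/* Matches the ompt_initialize_t signature; returns non-zero on success. */
int tool_initialize(ompt_function_lookup_t lookup, int initial_device_num,
                    ompt_data_t *tool_data) {
  assert(tool_data != NULL);
  if (lookup == NULL)
    return 0;
  /* Every runtime entry point must be obtained through lookup. */
  tool_set_callback_fn = lookup("ompt_set_callback");
  if (tool_set_callback_fn == NULL)
    return 0; /* failure: the OMPT interface becomes inactive */
  tool_data->value = (uint64_t)initial_device_num; /* illustrative use only */
  return 1;
}

/* Stand-ins for the runtime, used only to exercise the sketch. */
void demo_entry_point(void) {}
ompt_interface_fn_t demo_lookup(const char *name) {
  return strcmp(name, "ompt_set_callback") == 0 ? demo_entry_point : NULL;
}
```

In a real tool the saved pointer would be cast to the ompt_set_callback_t type and used to register the event callbacks described in Section 19.5.2.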

6 19.5.1.2 ompt_finalize_t
7 Summary
8 A tool implements a finalizer with the type signature ompt_finalize_t to finalize its use of the
9 OMPT interface.
10 Format
C / C++
typedef void (*ompt_finalize_t) (
    ompt_data_t *tool_data
);
C / C++
14 Semantics
15 To use the OMPT interface, an implementation of ompt_start_tool must return a non-null
16 pointer to an ompt_start_tool_result_t structure that contains a non-null pointer to a tool
17 finalizer with type signature ompt_finalize_t. An OpenMP implementation must call the tool
18 finalizer after the last OMPT event as the OpenMP implementation shuts down.
19 Description of Arguments
20 The tool_data argument is a pointer to the tool_data field in the
21 ompt_start_tool_result_t structure returned by ompt_start_tool.
22 Cross References
23 • Tool Initialization and Finalization, see Section 19.4.1
24 • ompt_data_t, see Section 19.4.4.4
25 • ompt_start_tool, see Section 19.2.1

26 19.5.2 Event Callback Signatures and Trace Records


This section describes the signatures of tool callback functions that an OMPT tool may register and
that are called during the runtime of an OpenMP program. An implementation may also provide a
trace of events per device. In addition to the callback signatures, the following sections define the
standard trace records. For the trace records, tool data arguments are replaced by an ID, which must
be initialized by the OpenMP implementation. Each of parallel_id, task_id, and thread_id must be
unique per target region. Tool implementations of callbacks are not required to be async signal safe.

1 Cross References
2 • ompt_data_t, see Section 19.4.4.4
3 • ompt_id_t, see Section 19.4.4.3

4 19.5.2.1 ompt_callback_thread_begin_t
5 Summary
6 The ompt_callback_thread_begin_t type is used for callbacks that are dispatched when
7 native threads are created.

8 Format
C / C++
typedef void (*ompt_callback_thread_begin_t) (
    ompt_thread_t thread_type,
    ompt_data_t *thread_data
);
C / C++
13 Trace Record
C / C++
typedef struct ompt_record_thread_begin_t {
    ompt_thread_t thread_type;
} ompt_record_thread_begin_t;
C / C++
17 Description of Arguments
18 The thread_type argument indicates the type of the new thread: initial, worker, or other. The
19 binding of the thread_data argument is the new thread.

20 Cross References
21 • Initial Task, see Section 12.8
22 • ompt_data_t, see Section 19.4.4.4
23 • ompt_thread_t, see Section 19.4.4.10
24 • parallel directive, see Section 10.1
25 • teams directive, see Section 10.2

1 19.5.2.2 ompt_callback_thread_end_t
2 Summary
3 The ompt_callback_thread_end_t type is used for callbacks that are dispatched when
4 native threads are destroyed.

5 Format
C / C++
typedef void (*ompt_callback_thread_end_t) (
    ompt_data_t *thread_data
);
C / C++
9 Description of Arguments
10 The binding of the thread_data argument is the thread that will be destroyed.

11 Cross References
12 • Initial Task, see Section 12.8
13 • Standard Trace Record Type, see Section 19.4.3.4
14 • ompt_data_t, see Section 19.4.4.4
15 • parallel directive, see Section 10.1
16 • teams directive, see Section 10.2

17 19.5.2.3 ompt_callback_parallel_begin_t
18 Summary
19 The ompt_callback_parallel_begin_t type is used for callbacks that are dispatched
20 when a parallel or teams region starts.

21 Format
C / C++
typedef void (*ompt_callback_parallel_begin_t) (
    ompt_data_t *encountering_task_data,
    const ompt_frame_t *encountering_task_frame,
    ompt_data_t *parallel_data,
    unsigned int requested_parallelism,
    int flags,
    const void *codeptr_ra
);
C / C++

1 Trace Record
C / C++
typedef struct ompt_record_parallel_begin_t {
    ompt_id_t encountering_task_id;
    ompt_id_t parallel_id;
    unsigned int requested_parallelism;
    int flags;
    const void *codeptr_ra;
} ompt_record_parallel_begin_t;
C / C++
9 Description of Arguments
10 The binding of the encountering_task_data argument is the encountering task.
The encountering_task_frame argument points to the frame object that is associated with the
encountering task. The behavior for accessing the frame object after the callback returns is
unspecified.
14 The binding of the parallel_data argument is the parallel or teams region that is beginning.
15 The requested_parallelism argument indicates the number of threads or teams that the user
16 requested.
17 The flags argument indicates whether the code for the region is inlined into the application or
18 invoked by the runtime and also whether the region is a parallel or teams region. Valid values
19 for flags are a disjunction of elements in the enum ompt_parallel_flag_t.
20 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
21 runtime routine implements the region associated with a callback that has type signature
22 ompt_callback_parallel_begin_t then codeptr_ra contains the return address of the call
23 to that runtime routine. If the implementation of the region is inlined then codeptr_ra contains the
24 return address of the callback invocation. If attribution to source code is impossible or
25 inappropriate, codeptr_ra may be NULL.
26 Cross References
27 • ompt_data_t, see Section 19.4.4.4
28 • ompt_frame_t, see Section 19.4.4.29
29 • ompt_parallel_flag_t, see Section 19.4.4.22
30 • parallel directive, see Section 10.1
31 • teams directive, see Section 10.2
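
Because flags combines two independent pieces of information, a tool typically tests individual bits. The following sketch decodes them under the assumption that the constants carry the values of the ompt_parallel_flag_t enumerators from omp-tools.h; they are written here as unsigned macros so that the sketch compiles without that header. The helper names parallel_region_kind and parallel_invoked_by_runtime are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Values of the ompt_parallel_flag_t enumerators, written as unsigned
 * constants so that the sketch compiles without omp-tools.h. */
#define ompt_parallel_invoker_program 0x00000001u
#define ompt_parallel_invoker_runtime 0x00000002u
#define ompt_parallel_league          0x40000000u
#define ompt_parallel_team            0x80000000u

/* Hypothetical helper: names the kind of region encoded in flags. */
const char *parallel_region_kind(int flags) {
  unsigned int bits = (unsigned int)flags;
  if (bits & ompt_parallel_league) return "teams";    /* league of teams */
  if (bits & ompt_parallel_team)   return "parallel"; /* team of threads */
  return "unknown";
}

/* Hypothetical helper: non-zero if the runtime invokes the region code. */
int parallel_invoked_by_runtime(int flags) {
  return ((unsigned int)flags & ompt_parallel_invoker_runtime) != 0;
}
```

The same helpers apply to the flags argument of a parallel-end callback, which uses the same enumeration.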

32 19.5.2.4 ompt_callback_parallel_end_t
33 Summary
34 The ompt_callback_parallel_end_t type is used for callbacks that are dispatched when a
35 parallel or teams region ends.

1 Format
C / C++
typedef void (*ompt_callback_parallel_end_t) (
    ompt_data_t *parallel_data,
    ompt_data_t *encountering_task_data,
    int flags,
    const void *codeptr_ra
);
C / C++
8 Trace Record
C / C++
typedef struct ompt_record_parallel_end_t {
    ompt_id_t parallel_id;
    ompt_id_t encountering_task_id;
    int flags;
    const void *codeptr_ra;
} ompt_record_parallel_end_t;
C / C++
15 Description of Arguments
16 The binding of the parallel_data argument is the parallel or teams region that is ending.
17 The binding of the encountering_task_data argument is the encountering task.
18 The flags argument indicates whether the execution of the region is inlined into the application or
19 invoked by the runtime and also whether it is a parallel or teams region. Values for flags are a
20 disjunction of elements in the enum ompt_parallel_flag_t.
21 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
22 runtime routine implements the region associated with a callback that has type signature
23 ompt_callback_parallel_end_t then codeptr_ra contains the return address of the call to
24 that runtime routine. If the implementation of the region is inlined then codeptr_ra contains the
25 return address of the callback invocation. If attribution to source code is impossible or
26 inappropriate, codeptr_ra may be NULL.

27 Cross References
28 • ompt_data_t, see Section 19.4.4.4
29 • ompt_parallel_flag_t, see Section 19.4.4.22
30 • parallel directive, see Section 10.1
31 • teams directive, see Section 10.2

1 19.5.2.5 ompt_callback_work_t
2 Summary
3 The ompt_callback_work_t type is used for callbacks that are dispatched when worksharing
4 regions and taskloop regions begin and end.

5 Format
C / C++
typedef void (*ompt_callback_work_t) (
    ompt_work_t work_type,
    ompt_scope_endpoint_t endpoint,
    ompt_data_t *parallel_data,
    ompt_data_t *task_data,
    uint64_t count,
    const void *codeptr_ra
);
C / C++
14 Trace Record
C / C++
typedef struct ompt_record_work_t {
    ompt_work_t work_type;
    ompt_scope_endpoint_t endpoint;
    ompt_id_t parallel_id;
    ompt_id_t task_id;
    uint64_t count;
    const void *codeptr_ra;
} ompt_record_work_t;
C / C++
23 Description of Arguments
24 The work_type argument indicates the kind of region.
25 The endpoint argument indicates that the callback signals the beginning of a scope or the end of a
26 scope.
27 The binding of the parallel_data argument is the current parallel region.
28 The binding of the task_data argument is the current task.
The count argument is a measure of the quantity of work involved in the construct. For a
worksharing-loop or taskloop construct, count represents the number of iterations in the
iteration space, which may be the result of collapsing several associated loops. For a sections
construct, count represents the number of sections. For a workshare construct, count represents
the units of work, as defined by the workshare construct. For a single or scope construct,
count is always 1. When the endpoint argument signals the end of a scope, a count value of 0
indicates that the actual count value is not available.
3 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
4 runtime routine implements the region associated with a callback that has type signature
5 ompt_callback_work_t then codeptr_ra contains the return address of the call to that
6 runtime routine. If the implementation of the region is inlined then codeptr_ra contains the return
7 address of the callback invocation. If attribution to source code is impossible or inappropriate,
8 codeptr_ra may be NULL.

9 Cross References
10 • Work-Distribution Constructs, see Chapter 11
11 • ompt_data_t, see Section 19.4.4.4
12 • ompt_scope_endpoint_t, see Section 19.4.4.11
13 • ompt_work_t, see Section 19.4.4.16
14 • taskloop directive, see Section 12.6
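
Since a count of 0 at the end of a scope means only that the value is unavailable, a tool may want to fall back on the value that it saw at the beginning. The sketch below illustrates that policy; the fallback is a tool design choice rather than something the interface prescribes, and effective_work_count is a hypothetical helper. The enumeration mirrors ompt_scope_endpoint_t so that the sketch compiles without omp-tools.h.

```c
#include <assert.h>
#include <stdint.h>

/* Mirror of the OMPT type; a real tool would include omp-tools.h. */
typedef enum ompt_scope_endpoint_t {
  ompt_scope_begin    = 1,
  ompt_scope_end      = 2,
  ompt_scope_beginend = 3
} ompt_scope_endpoint_t;

/* Hypothetical helper: the count to report for a work region, given the count
 * saved at its begin event and the arguments of the current callback. */
uint64_t effective_work_count(uint64_t saved_begin_count,
                              ompt_scope_endpoint_t endpoint, uint64_t count) {
  if (endpoint == ompt_scope_end && count == 0)
    return saved_begin_count; /* actual count is not available at the end */
  return count;
}
```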

15 19.5.2.6 ompt_callback_dispatch_t
16 Summary
17 The ompt_callback_dispatch_t type is used for callbacks that are dispatched when a
18 thread begins to execute a section or loop iteration.
19 Format
C / C++
typedef void (*ompt_callback_dispatch_t) (
    ompt_data_t *parallel_data,
    ompt_data_t *task_data,
    ompt_dispatch_t kind,
    ompt_data_t instance
);
C / C++
26 Trace Record
C / C++
typedef struct ompt_record_dispatch_t {
    ompt_id_t parallel_id;
    ompt_id_t task_id;
    ompt_dispatch_t kind;
    ompt_data_t instance;
} ompt_record_dispatch_t;
C / C++

1 Description of Arguments
2 The binding of the parallel_data argument is the current parallel region.
3 The binding of the task_data argument is the implicit task that executes the structured block of the
4 parallel region.
5 The kind argument indicates whether a loop iteration or a section is being dispatched.
6 If the kind argument is ompt_dispatch_iteration, the value field of the instance argument
7 contains the logical iteration number. If the kind argument is ompt_dispatch_section, the
8 ptr field of the instance argument contains a code address that identifies the structured block. In
9 cases where a runtime routine implements the structured block associated with this callback, the ptr
10 field of the instance argument contains the return address of the call to the runtime routine. In cases
11 where the implementation of the structured block is inlined, the ptr field of the instance argument
12 contains the return address of the invocation of this callback. If the kind argument is
13 ompt_dispatch_ws_loop_chunk, ompt_dispatch_taskloop_chunk or
14 ompt_dispatch_distribute_chunk, the ptr field of the instance argument points to a
15 structure of type ompt_dispatch_chunk_t that contains the information for the chunk.
16 Cross References
17 • Worksharing-Loop Constructs, see Section 11.5
18 • ompt_data_t, see Section 19.4.4.4
19 • ompt_dispatch_chunk_t, see Section 19.4.4.13
20 • ompt_dispatch_t, see Section 19.4.4.12
21 • sections directive, see Section 11.3
22 • taskloop directive, see Section 12.6
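
The interpretation of instance therefore depends on kind. As a hedged illustration, the sketch below selects the value field, the ptr field, or the pointed-to chunk structure accordingly. describe_dispatch is a hypothetical helper, and the type definitions mirror omp-tools.h so that the sketch compiles without it.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Mirrors of OMPT types; a real tool would include omp-tools.h. */
typedef union ompt_data_t { uint64_t value; void *ptr; } ompt_data_t;
typedef enum ompt_dispatch_t {
  ompt_dispatch_iteration        = 1,
  ompt_dispatch_section          = 2,
  ompt_dispatch_ws_loop_chunk    = 3,
  ompt_dispatch_taskloop_chunk   = 4,
  ompt_dispatch_distribute_chunk = 5
} ompt_dispatch_t;
typedef struct ompt_dispatch_chunk_t {
  uint64_t start;
  uint64_t iterations;
} ompt_dispatch_chunk_t;

/* Hypothetical helper: writes a description of the dispatched work to buf. */
void describe_dispatch(ompt_dispatch_t kind, ompt_data_t instance,
                       char *buf, size_t len) {
  switch (kind) {
  case ompt_dispatch_iteration: /* value holds the logical iteration number */
    snprintf(buf, len, "iteration %" PRIu64, instance.value);
    break;
  case ompt_dispatch_section:   /* ptr holds a code address */
    snprintf(buf, len, "section at %p", instance.ptr);
    break;
  case ompt_dispatch_ws_loop_chunk:
  case ompt_dispatch_taskloop_chunk:
  case ompt_dispatch_distribute_chunk: { /* ptr points to a chunk structure */
    const ompt_dispatch_chunk_t *chunk = instance.ptr;
    snprintf(buf, len, "chunk start=%" PRIu64 " iterations=%" PRIu64,
             chunk->start, chunk->iterations);
    break;
  }
  default:
    snprintf(buf, len, "unknown");
    break;
  }
}
```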

23 19.5.2.7 ompt_callback_task_create_t
24 Summary
25 The ompt_callback_task_create_t type is used for callbacks that are dispatched when
26 task regions are generated.
27 Format
C / C++
typedef void (*ompt_callback_task_create_t) (
    ompt_data_t *encountering_task_data,
    const ompt_frame_t *encountering_task_frame,
    ompt_data_t *new_task_data,
    int flags,
    int has_dependences,
    const void *codeptr_ra
);
C / C++

1 Trace Record
C / C++
typedef struct ompt_record_task_create_t {
    ompt_id_t encountering_task_id;
    ompt_id_t new_task_id;
    int flags;
    int has_dependences;
    const void *codeptr_ra;
} ompt_record_task_create_t;
C / C++
9 Description of Arguments
10 The binding of the encountering_task_data argument is the encountering task.
The encountering_task_frame argument points to the frame object associated with the encountering
task. The behavior for accessing the frame object after the callback returns is unspecified.
13 The binding of the new_task_data argument is the generated task.
14 The flags argument indicates the kind of task (explicit or target) that is generated. Values for flags
15 are a disjunction of elements in the ompt_task_flag_t enumeration type.
16 The has_dependences argument is true if the generated task has dependences and false otherwise.
17 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
18 runtime routine implements the region associated with a callback that has type signature
19 ompt_callback_task_create_t then codeptr_ra contains the return address of the call to
20 that runtime routine. If the implementation of the region is inlined then codeptr_ra contains the
21 return address of the callback invocation. If attribution to source code is impossible or
22 inappropriate, codeptr_ra may be NULL.

23 Cross References
24 • Initial Task, see Section 12.8
25 • ompt_data_t, see Section 19.4.4.4
26 • ompt_frame_t, see Section 19.4.4.29
27 • ompt_task_flag_t, see Section 19.4.4.19
28 • task directive, see Section 12.5

29 19.5.2.8 ompt_callback_dependences_t
30 Summary
31 The ompt_callback_dependences_t type is used for callbacks that are related to
32 dependences and that are dispatched when new tasks are generated and when ordered constructs
33 are encountered.

1 Format
C / C++
typedef void (*ompt_callback_dependences_t) (
    ompt_data_t *task_data,
    const ompt_dependence_t *deps,
    int ndeps
);
C / C++
7 Trace Record
C / C++
typedef struct ompt_record_dependences_t {
    ompt_id_t task_id;
    ompt_dependence_t dep;
    int ndeps;
} ompt_record_dependences_t;
C / C++
13 Description of Arguments
The binding of the task_data argument is the generated task for a depend clause on a task construct;
the target task for a depend clause on a target construct or for a depend object in an
asynchronous runtime routine; or the encountering implicit task for a depend clause of the ordered
construct.
18 The deps argument lists dependences of the new task or the dependence vector of the ordered
19 construct. Dependences denoted with depend objects are described in terms of their dependence
20 semantics.
21 The ndeps argument specifies the length of the list passed by the deps argument. The memory for
22 deps is owned by the caller; the tool cannot rely on the data after the callback returns.
23 The performance monitor interface for tracing activity on target devices provides one record per
24 dependence.

25 Cross References
26 • ompt_data_t, see Section 19.4.4.4
27 • ompt_dependence_t, see Section 19.4.4.9
28 • depend clause, see Section 15.9.5
29 • ordered directive, see Section 15.10.1
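
Because the memory for deps is owned by the caller, a tool that needs the dependences after the callback returns has to copy them. A minimal sketch of such a copy follows; copy_dependences is a hypothetical helper, the caller of which is assumed to release the result with free. The type definitions mirror omp-tools.h so that the sketch compiles without it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Mirrors of OMPT types; a real tool would include omp-tools.h. */
typedef union ompt_data_t { uint64_t value; void *ptr; } ompt_data_t;
typedef enum ompt_dependence_type_t {
  ompt_dependence_type_in            = 1,
  ompt_dependence_type_out           = 2,
  ompt_dependence_type_inout         = 3,
  ompt_dependence_type_mutexinoutset = 4,
  ompt_dependence_type_source        = 5,
  ompt_dependence_type_sink          = 6,
  ompt_dependence_type_inoutset      = 7
} ompt_dependence_type_t;
typedef struct ompt_dependence_t {
  ompt_data_t variable;
  ompt_dependence_type_t dependence_type;
} ompt_dependence_t;

/* Hypothetical helper: returns a heap copy of the ndeps entries in deps, or
 * NULL if there is nothing to copy or the allocation fails. */
ompt_dependence_t *copy_dependences(const ompt_dependence_t *deps, int ndeps) {
  if (deps == NULL || ndeps <= 0)
    return NULL;
  size_t bytes = (size_t)ndeps * sizeof(ompt_dependence_t);
  ompt_dependence_t *copy = malloc(bytes);
  if (copy != NULL)
    memcpy(copy, deps, bytes);
  return copy;
}
```

A dependences callback could store the returned pointer in the ptr field of task_data, provided that the tool does not already use that field for another purpose.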

1 19.5.2.9 ompt_callback_task_dependence_t
2 Summary
3 The ompt_callback_task_dependence_t type is used for callbacks that are dispatched
4 when unfulfilled task dependences are encountered.

5 Format
C / C++
typedef void (*ompt_callback_task_dependence_t) (
    ompt_data_t *src_task_data,
    ompt_data_t *sink_task_data
);
C / C++
10 Trace Record
C / C++
typedef struct ompt_record_task_dependence_t {
    ompt_id_t src_task_id;
    ompt_id_t sink_task_id;
} ompt_record_task_dependence_t;
C / C++
15 Description of Arguments
16 The binding of the src_task_data argument is a running task with an outgoing dependence.
17 The binding of the sink_task_data argument is a task with an unsatisfied incoming dependence.

18 Cross References
19 • ompt_data_t, see Section 19.4.4.4
20 • depend clause, see Section 15.9.5

21 19.5.2.10 ompt_callback_task_schedule_t
22 Summary
23 The ompt_callback_task_schedule_t type is used for callbacks that are dispatched when
24 task scheduling decisions are made.

25 Format
C / C++
typedef void (*ompt_callback_task_schedule_t) (
    ompt_data_t *prior_task_data,
    ompt_task_status_t prior_task_status,
    ompt_data_t *next_task_data
);
C / C++

1 Trace Record
C / C++
typedef struct ompt_record_task_schedule_t {
    ompt_id_t prior_task_id;
    ompt_task_status_t prior_task_status;
    ompt_id_t next_task_id;
} ompt_record_task_schedule_t;
C / C++
7 Description of Arguments
8 The prior_task_status argument indicates the status of the task that arrived at a task scheduling
9 point.
10 The binding of the prior_task_data argument is the task that arrived at the scheduling point.
11 The binding of the next_task_data argument is the task that is resumed at the scheduling point.
12 This argument is NULL if the callback is dispatched for a task-fulfill event or if the callback signals
13 completion of a taskwait construct.

14 Cross References
15 • Task Scheduling, see Section 12.9
16 • ompt_data_t, see Section 19.4.4.4
17 • ompt_task_status_t, see Section 19.4.4.20
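
Since next_task_data may be NULL, a task-schedule callback should check it before dereferencing. The sketch below counts completed and resumed tasks under that assumption. The names on_task_schedule, tasks_completed and tasks_resumed are hypothetical, and the type definitions mirror omp-tools.h so that the sketch compiles without it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors of OMPT types; a real tool would include omp-tools.h. */
typedef union ompt_data_t { uint64_t value; void *ptr; } ompt_data_t;
typedef enum ompt_task_status_t {
  ompt_task_complete      = 1,
  ompt_task_yield         = 2,
  ompt_task_cancel        = 3,
  ompt_task_detach        = 4,
  ompt_task_early_fulfill = 5,
  ompt_task_late_fulfill  = 6,
  ompt_task_switch        = 7,
  ompt_taskwait_complete  = 8
} ompt_task_status_t;

/* Hypothetical tool statistics, atomic because threads schedule concurrently. */
atomic_uint tasks_completed = 0;
atomic_uint tasks_resumed = 0;

/* Matches the ompt_callback_task_schedule_t signature. */
void on_task_schedule(ompt_data_t *prior_task_data,
                      ompt_task_status_t prior_task_status,
                      ompt_data_t *next_task_data) {
  (void)prior_task_data; /* not needed for these statistics */
  if (prior_task_status == ompt_task_complete)
    atomic_fetch_add(&tasks_completed, 1);
  /* NULL for task-fulfill events and for completion of a taskwait. */
  if (next_task_data != NULL)
    atomic_fetch_add(&tasks_resumed, 1);
}
```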

18 19.5.2.11 ompt_callback_implicit_task_t
19 Summary
20 The ompt_callback_implicit_task_t type is used for callbacks that are dispatched when
21 initial tasks and implicit tasks are generated and completed.

22 Format
C / C++
typedef void (*ompt_callback_implicit_task_t) (
    ompt_scope_endpoint_t endpoint,
    ompt_data_t *parallel_data,
    ompt_data_t *task_data,
    unsigned int actual_parallelism,
    unsigned int index,
    int flags
);
C / C++

1 Trace Record
C / C++
typedef struct ompt_record_implicit_task_t {
    ompt_scope_endpoint_t endpoint;
    ompt_id_t parallel_id;
    ompt_id_t task_id;
    unsigned int actual_parallelism;
    unsigned int index;
    int flags;
} ompt_record_implicit_task_t;
C / C++
10 Description of Arguments
11 The endpoint argument indicates that the callback signals the beginning of a scope or the end of a
12 scope.
13 The binding of the parallel_data argument is the current parallel or teams region. For the
14 implicit-task-end and the initial-task-end events, this argument is NULL.
15 The binding of the task_data argument is the implicit task that executes the structured block of the
16 parallel or teams region.
17 The actual_parallelism argument indicates the number of threads in the parallel region or the
18 number of teams in the teams region. For initial tasks that are not closely nested in a teams
19 construct, this argument is 1. For the implicit-task-end and the initial-task-end events, this
20 argument is 0.
The index argument indicates the thread number or team number of the calling thread, within the
team or league that is executing the parallel or teams region to which the implicit task region
binds. For initial tasks that are not created by a teams construct, this argument is 1.
24 The flags argument indicates the kind of task (initial or implicit).

25 Cross References
26 • ompt_data_t, see Section 19.4.4.4
27 • ompt_scope_endpoint_t, see Section 19.4.4.11
28 • parallel directive, see Section 10.1
29 • teams directive, see Section 10.2

30 19.5.2.12 ompt_callback_masked_t
31 Summary
32 The ompt_callback_masked_t type is used for callbacks that are dispatched when masked
33 regions start and end.

1 Format
C / C++
typedef void (*ompt_callback_masked_t) (
    ompt_scope_endpoint_t endpoint,
    ompt_data_t *parallel_data,
    ompt_data_t *task_data,
    const void *codeptr_ra
);
C / C++
8 Trace Record
C / C++
typedef struct ompt_record_masked_t {
    ompt_scope_endpoint_t endpoint;
    ompt_id_t parallel_id;
    ompt_id_t task_id;
    const void *codeptr_ra;
} ompt_record_masked_t;
C / C++
15 Description of Arguments
16 The endpoint argument indicates that the callback signals the beginning of a scope or the end of a
17 scope.
18 The binding of the parallel_data argument is the current parallel region.
19 The binding of the task_data argument is the encountering task.
20 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
21 runtime routine implements the region associated with a callback that has type signature
22 ompt_callback_masked_t then codeptr_ra contains the return address of the call to that
23 runtime routine. If the implementation of the region is inlined then codeptr_ra contains the return
24 address of the callback invocation. If attribution to source code is impossible or inappropriate,
25 codeptr_ra may be NULL.
26 Cross References
27 • ompt_data_t, see Section 19.4.4.4
28 • ompt_scope_endpoint_t, see Section 19.4.4.11
29 • masked directive, see Section 10.5

30 19.5.2.13 ompt_callback_sync_region_t
31 Summary
The ompt_callback_sync_region_t type is used for callbacks that are dispatched when
barrier regions, taskwait regions, and taskgroup regions begin and end, when waiting
begins and ends for them, and when reductions are performed.

1 Format
C / C++
typedef void (*ompt_callback_sync_region_t) (
    ompt_sync_region_t kind,
    ompt_scope_endpoint_t endpoint,
    ompt_data_t *parallel_data,
    ompt_data_t *task_data,
    const void *codeptr_ra
);
C / C++
9 Trace Record
C / C++
typedef struct ompt_record_sync_region_t {
    ompt_sync_region_t kind;
    ompt_scope_endpoint_t endpoint;
    ompt_id_t parallel_id;
    ompt_id_t task_id;
    const void *codeptr_ra;
} ompt_record_sync_region_t;
C / C++
17 Description of Arguments
18 The kind argument indicates the kind of synchronization.
19 The endpoint argument indicates that the callback signals the beginning of a scope or the end of a
20 scope.
The binding of the parallel_data argument is the current parallel region. For the
implicit-barrier-end event at the end of a parallel region, this argument is NULL. For the
implicit-barrier-wait-begin and implicit-barrier-wait-end events at the end of a parallel region,
whether this argument is NULL or points to the parallel data of the current parallel region is
implementation defined.
26 The binding of the task_data argument is the current task.
27 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
28 runtime routine implements the region associated with a callback that has type signature
29 ompt_callback_sync_region_t then codeptr_ra contains the return address of the call to
30 that runtime routine. If the implementation of the region is inlined then codeptr_ra contains the
31 return address of the callback invocation. If attribution to source code is impossible or
32 inappropriate, codeptr_ra may be NULL.

1 Cross References
2 • Implicit Barriers, see Section 15.3.2
3 • Properties Common to All Reduction Clauses, see Section 5.5.5
4 • ompt_data_t, see Section 19.4.4.4
5 • ompt_scope_endpoint_t, see Section 19.4.4.11
6 • ompt_sync_region_t, see Section 19.4.4.14
7 • barrier directive, see Section 15.3.1
8 • taskgroup directive, see Section 15.4
9 • taskwait directive, see Section 15.5

10 19.5.2.14 ompt_callback_mutex_acquire_t
11 Summary
12 The ompt_callback_mutex_acquire_t type is used for callbacks that are dispatched when
13 locks are initialized, acquired and tested and when critical regions, atomic regions, and
14 ordered regions are begun.

15 Format
C / C++
typedef void (*ompt_callback_mutex_acquire_t) (
    ompt_mutex_t kind,
    unsigned int hint,
    unsigned int impl,
    ompt_wait_id_t wait_id,
    const void *codeptr_ra
);
C / C++
23 Trace Record
C / C++
typedef struct ompt_record_mutex_acquire_t {
    ompt_mutex_t kind;
    unsigned int hint;
    unsigned int impl;
    ompt_wait_id_t wait_id;
    const void *codeptr_ra;
} ompt_record_mutex_acquire_t;
C / C++

1 Description of Arguments
2 The kind argument indicates the kind of mutual exclusion event.
3 The hint argument indicates the hint that was provided when initializing an implementation of
4 mutual exclusion. If no hint is available when a thread initiates acquisition of mutual exclusion, the
5 runtime may supply omp_sync_hint_none as the value for hint.
6 The impl argument indicates the mechanism chosen by the runtime to implement the mutual
7 exclusion.
8 The wait_id argument indicates the object being awaited.
9 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
10 runtime routine implements the region associated with a callback that has type signature
11 ompt_callback_mutex_acquire_t then codeptr_ra contains the return address of the call
12 to that runtime routine. If the implementation of the region is inlined then codeptr_ra contains the
13 return address of the callback invocation. If attribution to source code is impossible or
14 inappropriate, codeptr_ra may be NULL.

15 Cross References
16 • omp_init_lock and omp_init_nest_lock, see Section 18.9.1
17 • ompt_mutex_t, see Section 19.4.4.17
18 • ordered Construct, see Section 15.10
19 • atomic directive, see Section 15.8.4
20 • critical directive, see Section 15.2
21 • ompt_wait_id_t, see Section 19.4.4.31

22 19.5.2.15 ompt_callback_mutex_t
23 Summary
24 The ompt_callback_mutex_t type is used for callbacks that indicate important
25 synchronization events.

26 Format
C / C++
typedef void (*ompt_callback_mutex_t) (
    ompt_mutex_t kind,
    ompt_wait_id_t wait_id,
    const void *codeptr_ra
);
C / C++

1 Trace Record
C / C++
typedef struct ompt_record_mutex_t {
    ompt_mutex_t kind;
    ompt_wait_id_t wait_id;
    const void *codeptr_ra;
} ompt_record_mutex_t;
C / C++
7 Description of Arguments
8 The kind argument indicates the kind of mutual exclusion event.
9 The wait_id argument indicates the object being awaited.
10 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
11 runtime routine implements the region associated with a callback that has type signature
12 ompt_callback_mutex_t then codeptr_ra contains the return address of the call to that
13 runtime routine. If the implementation of the region is inlined then codeptr_ra contains the return
14 address of the callback invocation. If attribution to source code is impossible or inappropriate,
15 codeptr_ra may be NULL.

16 Cross References
17 • omp_set_lock and omp_set_nest_lock, see Section 18.9.4
18 • omp_test_lock and omp_test_nest_lock, see Section 18.9.6
19 • omp_unset_lock and omp_unset_nest_lock, see Section 18.9.5
20 • ompt_mutex_t, see Section 19.4.4.17
21 • ordered Construct, see Section 15.10
22 • atomic directive, see Section 15.8.4
23 • critical directive, see Section 15.2
24 • omp_destroy_lock and omp_destroy_nest_lock, see Section 18.9.3
25 • ompt_wait_id_t, see Section 19.4.4.31

26 19.5.2.16 ompt_callback_nest_lock_t
27 Summary
28 The ompt_callback_nest_lock_t type is used for callbacks that indicate that a thread that
29 owns a nested lock has performed an action related to the lock but has not relinquished ownership.

1 Format
C / C++
typedef void (*ompt_callback_nest_lock_t) (
    ompt_scope_endpoint_t endpoint,
    ompt_wait_id_t wait_id,
    const void *codeptr_ra
);
C / C++
7 Trace Record
C / C++
typedef struct ompt_record_nest_lock_t {
    ompt_scope_endpoint_t endpoint;
    ompt_wait_id_t wait_id;
    const void *codeptr_ra;
} ompt_record_nest_lock_t;
C / C++
13 Description of Arguments
14 The endpoint argument indicates that the callback signals the beginning of a scope or the end of a
15 scope.
16 The wait_id argument indicates the object being awaited.
17 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
18 runtime routine implements the region associated with a callback that has type signature
19 ompt_callback_nest_lock_t then codeptr_ra contains the return address of the call to that
20 runtime routine. If the implementation of the region is inlined then codeptr_ra contains the return
21 address of the callback invocation. If attribution to source code is impossible or inappropriate,
22 codeptr_ra may be NULL.

23 Cross References
24 • omp_set_lock and omp_set_nest_lock, see Section 18.9.4
25 • omp_test_lock and omp_test_nest_lock, see Section 18.9.6
26 • omp_unset_lock and omp_unset_nest_lock, see Section 18.9.5
27 • ompt_scope_endpoint_t, see Section 19.4.4.11
28 • ompt_wait_id_t, see Section 19.4.4.31

29 19.5.2.17 ompt_callback_flush_t
30 Summary
31 The ompt_callback_flush_t type is used for callbacks that are dispatched when flush
32 constructs are encountered.

1 Format
C / C++
typedef void (*ompt_callback_flush_t) (
    ompt_data_t *thread_data,
    const void *codeptr_ra
);
C / C++
6 Trace Record
C / C++
typedef struct ompt_record_flush_t {
    const void *codeptr_ra;
} ompt_record_flush_t;
C / C++
10 Description of Arguments
11 The binding of the thread_data argument is the executing thread.
12 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
13 runtime routine implements the region associated with a callback that has type signature
14 ompt_callback_flush_t then codeptr_ra contains the return address of the call to that
15 runtime routine. If the implementation of the region is inlined then codeptr_ra contains the return
16 address of the callback invocation. If attribution to source code is impossible or inappropriate,
17 codeptr_ra may be NULL.
18 Cross References
19 • ompt_data_t, see Section 19.4.4.4
20 • flush directive, see Section 15.8.5

21 19.5.2.18 ompt_callback_cancel_t
22 Summary
23 The ompt_callback_cancel_t type is used for callbacks that are dispatched for cancellation,
24 cancel and discarded-task events.
25 Format
C / C++
typedef void (*ompt_callback_cancel_t) (
  ompt_data_t *task_data,
  int flags,
  const void *codeptr_ra
);
C / C++

1 Trace Record
C / C++
typedef struct ompt_record_cancel_t {
  ompt_id_t task_id;
  int flags;
  const void *codeptr_ra;
} ompt_record_cancel_t;
C / C++
7 Description of Arguments
8 The binding of the task_data argument is the task that encounters a cancel construct, a
9 cancellation point construct, or a construct defined as having an implicit cancellation
10 point.
11 The flags argument, defined by the ompt_cancel_flag_t enumeration type, indicates whether
12 cancellation is activated by the current task or detected as being activated by another task. The
13 construct that is being canceled is also described in the flags argument. When several constructs are
14 detected as being concurrently canceled, each corresponding bit in the argument will be set.
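As an illustration of how a tool might interpret the flags bit mask, the following is a minimal sketch rather than a definitive implementation. The enumerators are local stand-ins (prefixed sketch_) so that the example compiles without omp-tools.h; their values are assumed to match ompt_cancel_flag_t and should be checked against that header. The helper function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins; the values are ASSUMED to mirror ompt_cancel_flag_t. */
enum {
  sketch_cancel_parallel  = 0x01,
  sketch_cancel_sections  = 0x02,
  sketch_cancel_loop      = 0x04,
  sketch_cancel_taskgroup = 0x08,
  sketch_cancel_activated = 0x10, /* cancellation activated by this task */
  sketch_cancel_detected  = 0x20  /* cancellation activated by another task */
};

/* Returns 1 if the current task activated the cancellation, 0 otherwise. */
int sketch_cancel_is_activated(int flags) {
  return (flags & sketch_cancel_activated) != 0;
}

/* Counts how many construct kinds are marked as canceled in flags.
   More than one bit may be set when several constructs are detected
   as being canceled concurrently. */
int sketch_cancel_construct_count(int flags) {
  int constructs = flags & (sketch_cancel_parallel | sketch_cancel_sections |
                            sketch_cancel_loop | sketch_cancel_taskgroup);
  int count = 0;
  while (constructs != 0) {
    count += constructs & 1;
    constructs >>= 1;
  }
  return count;
}
```

A callback of type ompt_callback_cancel_t could pass its flags argument to helpers like these before deciding how to attribute the event.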
15 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
16 runtime routine implements the region associated with a callback that has type signature
17 ompt_callback_cancel_t then codeptr_ra contains the return address of the call to that
18 runtime routine. If the implementation of the region is inlined then codeptr_ra contains the return
19 address of the callback invocation. If attribution to source code is impossible or inappropriate,
20 codeptr_ra may be NULL.
21 Cross References
22 • ompt_cancel_flag_t, see Section 19.4.4.26

23 19.5.2.19 ompt_callback_device_initialize_t
24 Summary
25 The ompt_callback_device_initialize_t type is used for callbacks that initialize
26 device tracing interfaces.

27 Format
C / C++
typedef void (*ompt_callback_device_initialize_t) (
  int device_num,
  const char *type,
  ompt_device_t *device,
  ompt_function_lookup_t lookup,
  const char *documentation
);
C / C++

1 Semantics
2 Registration of a callback with type signature ompt_callback_device_initialize_t for
3 the ompt_callback_device_initialize event enables asynchronous collection of a trace
4 for a device. The OpenMP implementation invokes this callback after OpenMP is initialized for the
5 device but before execution of any OpenMP construct is started on the device.

6 Description of Arguments
7 The device_num argument identifies the logical device that is being initialized.
8 The type argument is a C string that indicates the type of the device. A device type string is a
9 semicolon-separated character string that includes, at a minimum, the vendor and model name of
10 the device. These names may be followed by a semicolon-separated sequence of properties that
11 describe the hardware or software of the device.
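Because the device type string is semicolon-separated, a tool may want to pull out individual fields such as the vendor or model name. The sketch below shows one possible way to do this; the function name and the sample strings used to exercise it are hypothetical and do not come from any real device.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies the index-th semicolon-separated field of type into out,
   truncating to out_size - 1 characters. Field 0 is expected to be the
   vendor and field 1 the model name.
   Returns 1 if the field exists and 0 otherwise. */
int sketch_device_type_field(const char *type, unsigned index,
                             char *out, size_t out_size) {
  assert(type != NULL && out != NULL && out_size > 0);
  const char *start = type;
  for (;;) {
    const char *end = strchr(start, ';');
    size_t len = end ? (size_t)(end - start) : strlen(start);
    if (index == 0) {
      if (len >= out_size)
        len = out_size - 1;
      memcpy(out, start, len);
      out[len] = '\0';
      return 1;
    }
    if (end == NULL)
      return 0; /* ran out of fields before reaching index */
    start = end + 1;
    --index;
  }
}
```

The type string itself is immutable, so the sketch copies the field into caller-provided storage instead of modifying the string in place.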
12 The device argument is a pointer to an opaque object that represents the target device instance.
13 Functions in the device tracing interface use this pointer to identify the device that is being
14 addressed.
15 The lookup argument points to a runtime callback that a tool must use to obtain pointers to runtime
16 entry points in the device’s OMPT tracing interface. If a device does not support tracing then
17 lookup is NULL.
18 The documentation argument is a C string that describes how to use any device-specific runtime
19 entry points that can be obtained through the lookup argument. This documentation string may be a
20 pointer to external documentation, or it may be inline descriptions that include names and type
21 signatures for any device-specific interfaces that are available through the lookup argument along
22 with descriptions of how to use these interface functions to control monitoring and analysis of
23 device traces.

24 Constraints on Arguments
25 The type and documentation arguments must be immutable strings that are defined for the lifetime
26 of program execution.

27 Effect
28 A device initializer must fulfill several duties. First, the type argument should be used to determine
29 if any special knowledge about the hardware and/or software of a device is employed. Second, the
30 lookup argument should be used to look up pointers to runtime entry points in the OMPT tracing
31 interface for the device. Finally, these runtime entry points should be used to set up tracing for the
32 device. Initialization of tracing for a target device is described in Section 19.2.5.

33 Cross References
34 • Lookup Entry Points: ompt_function_lookup_t, see Section 19.6.3

1 19.5.2.20 ompt_callback_device_finalize_t
2 Summary
The ompt_callback_device_finalize_t type is used for callbacks that finalize device
tracing interfaces.
5 Format
C / C++
typedef void (*ompt_callback_device_finalize_t) (
  int device_num
);
C / C++
9 Description of Arguments
10 The device_num argument identifies the logical device that is being finalized.
11 Semantics
12 A registered callback with type signature ompt_callback_device_finalize_t is
13 dispatched for a device immediately prior to finalizing the device. Prior to dispatching a finalization
14 callback for a device on which tracing is active, the OpenMP implementation stops tracing on the
15 device and synchronously flushes all trace records for the device that have not yet been reported.
16 These trace records are flushed through one or more buffer completion callbacks with type
17 signature ompt_callback_buffer_complete_t as needed prior to the dispatch of the
18 callback with type signature ompt_callback_device_finalize_t.
19 Cross References
20 • ompt_callback_buffer_complete_t, see Section 19.5.2.24

21 19.5.2.21 ompt_callback_device_load_t
22 Summary
23 The ompt_callback_device_load_t type is used for callbacks that the OpenMP runtime
24 invokes to indicate that it has just loaded code onto the specified device.
25 Format
C / C++
typedef void (*ompt_callback_device_load_t) (
  int device_num,
  const char *filename,
  int64_t offset_in_file,
  void *vma_in_file,
  size_t bytes,
  void *host_addr,
  void *device_addr,
  uint64_t module_id
);
C / C++

1 Description of Arguments
2 The device_num argument specifies the device.
3 The filename argument indicates the name of a file in which the device code can be found. A NULL
4 filename indicates that the code is not available in a file in the file system.
5 The offset_in_file argument indicates an offset into filename at which the code can be found. A
6 value of -1 indicates that no offset is provided.
7 ompt_addr_none is defined as a pointer with the value ~0.
8 The vma_in_file argument indicates a virtual address in filename at which the code can be found. A
9 value of ompt_addr_none indicates that a virtual address in the file is not available.
10 The bytes argument indicates the size of the device code object in bytes.
11 The host_addr argument indicates the address at which a copy of the device code is available in
12 host memory. A value of ompt_addr_none indicates that a host code address is not available.
13 The device_addr argument indicates the address at which the device code has been loaded in device
14 memory. A value of ompt_addr_none indicates that a device code address is not available.
15 The module_id argument is an identifier that is associated with the device code object.
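Several of these arguments use sentinel values to signal that information is unavailable. The following sketch shows how a callback with the same parameter list as ompt_callback_device_load_t might test for them. SKETCH_ADDR_NONE is a local stand-in for ompt_addr_none, written here as an all-bits-set pointer on the assumption that this matches the header definition; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for ompt_addr_none (ASSUMED form: pointer with value ~0). */
#define SKETCH_ADDR_NONE ((void *)~(uintptr_t)0)

/* Records which pieces of location information were supplied. */
typedef struct sketch_load_info_t {
  int has_file;        /* filename is not NULL */
  int has_offset;      /* offset_in_file is not -1 */
  int has_vma;         /* vma_in_file is not the "none" sentinel */
  int has_host_copy;   /* host_addr is not the "none" sentinel */
  int has_device_addr; /* device_addr is not the "none" sentinel */
} sketch_load_info_t;

static sketch_load_info_t sketch_last_load;

/* Same parameter list as ompt_callback_device_load_t. */
void sketch_on_device_load(int device_num, const char *filename,
                           int64_t offset_in_file, void *vma_in_file,
                           size_t bytes, void *host_addr,
                           void *device_addr, uint64_t module_id) {
  (void)device_num;
  (void)bytes;
  (void)module_id;
  sketch_last_load.has_file = (filename != NULL);
  sketch_last_load.has_offset = (offset_in_file != -1);
  sketch_last_load.has_vma = (vma_in_file != SKETCH_ADDR_NONE);
  sketch_last_load.has_host_copy = (host_addr != SKETCH_ADDR_NONE);
  sketch_last_load.has_device_addr = (device_addr != SKETCH_ADDR_NONE);
}
```

A tool would typically check these conditions before trying to read the device code from a file or from host memory.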

16 Cross References
17 • Device Directives and Clauses, see Chapter 13

18 19.5.2.22 ompt_callback_device_unload_t
19 Summary
20 The ompt_callback_device_unload_t type is used for callbacks that the OpenMP
21 runtime invokes to indicate that it is about to unload code from the specified device.

22 Format
C / C++
typedef void (*ompt_callback_device_unload_t) (
  int device_num,
  uint64_t module_id
);
C / C++
27 Description of Arguments
28 The device_num argument specifies the device.
29 The module_id argument is an identifier that is associated with the device code object.

30 Cross References
31 • Device Directives and Clauses, see Chapter 13

1 19.5.2.23 ompt_callback_buffer_request_t
2 Summary
3 The ompt_callback_buffer_request_t type is used for callbacks that are dispatched
4 when a buffer to store event records for a device is requested.
5 Format
C / C++
typedef void (*ompt_callback_buffer_request_t) (
  int device_num,
  ompt_buffer_t **buffer,
  size_t *bytes
);
C / C++
11 Semantics
12 A callback with type signature ompt_callback_buffer_request_t requests a buffer to
13 store trace records for the specified device. A buffer request callback may set *bytes to 0 if it does
14 not provide a buffer. If a callback sets *bytes to 0, further recording of events for the device is
15 disabled until the next invocation of ompt_start_trace. This action causes the device to drop
16 future trace records until recording is restarted.
17 Description of Arguments
18 The device_num argument specifies the device.
19 The *buffer argument points to a buffer where device events may be recorded. The *bytes argument
20 indicates the length of that buffer.
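A buffer request callback could be sketched as follows. The type sketch_buffer_t is a local stand-in for ompt_buffer_t, and the buffer size and function name are arbitrary choices made for illustration. If allocation fails, the sketch reports 0 bytes, which by the semantics above disables recording for the device until tracing is restarted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef void sketch_buffer_t; /* stand-in for ompt_buffer_t */

/* Arbitrary buffer size chosen for this sketch. */
#define SKETCH_BUFFER_BYTES ((size_t)(64 * 1024))

/* Same parameter list as ompt_callback_buffer_request_t. */
void sketch_on_buffer_request(int device_num, sketch_buffer_t **buffer,
                              size_t *bytes) {
  (void)device_num;
  assert(buffer != NULL && bytes != NULL);
  *buffer = malloc(SKETCH_BUFFER_BYTES);
  /* Report 0 bytes when no buffer can be provided. */
  *bytes = (*buffer != NULL) ? SKETCH_BUFFER_BYTES : 0;
}
```

Since the tool allocates the buffer, the tool also decides how it is eventually released.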
21 Cross References
22 • ompt_buffer_t, see Section 19.4.4.7

23 19.5.2.24 ompt_callback_buffer_complete_t
24 Summary
25 The ompt_callback_buffer_complete_t type is used for callbacks that are dispatched
26 when devices will not record any more trace records in an event buffer and all records written to the
27 buffer are valid.
28 Format
C / C++
typedef void (*ompt_callback_buffer_complete_t) (
  int device_num,
  ompt_buffer_t *buffer,
  size_t bytes,
  ompt_buffer_cursor_t begin,
  int buffer_owned
);
C / C++

1 Semantics
A callback with type signature ompt_callback_buffer_complete_t provides a buffer that
contains trace records for the specified device. Typically, a tool will iterate through the records in
the buffer and process them. The OpenMP implementation makes these callbacks on a thread that
is not an OpenMP primary or worker thread. The callee must not delete the buffer if the
buffer_owned argument is 0. The buffer completion callback is not required to be async signal safe.

7 Description of Arguments
8 The device_num argument indicates the device for which the buffer contains events.
9 The buffer argument is the address of a buffer that was previously allocated by a buffer request
10 callback.
11 The bytes argument indicates the full size of the buffer.
12 The begin argument is an opaque cursor that indicates the position of the beginning of the first
13 record in the buffer.
The buffer_owned argument is 1 if the data to which the buffer points can be deleted by the callback
and 0 otherwise. If multiple devices accumulate trace events into a single buffer, this callback may
be invoked with a pointer to one or more trace records in a shared buffer with buffer_owned = 0. In
this case, the callback must not delete the buffer.
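The ownership rule can be illustrated with a minimal sketch of a buffer completion callback. It assumes that the tool obtained its buffers with malloc, so that free is the matching release; the stand-in types and all sketch_ names are hypothetical. Walking the records is omitted because it depends on entry points of the device tracing interface.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef void sketch_buffer_t;            /* stand-in for ompt_buffer_t */
typedef uint64_t sketch_buffer_cursor_t; /* stand-in for ompt_buffer_cursor_t */

static int sketch_buffers_released;

/* Same parameter list as ompt_callback_buffer_complete_t. */
void sketch_on_buffer_complete(int device_num, sketch_buffer_t *buffer,
                               size_t bytes, sketch_buffer_cursor_t begin,
                               int buffer_owned) {
  (void)device_num;
  (void)bytes;
  (void)begin;
  assert(buffer_owned == 0 || buffer_owned == 1);
  /* A real tool would process the trace records here, starting at begin. */
  if (buffer_owned) {
    /* Only an owned buffer may be released by the callback. */
    free(buffer);
    ++sketch_buffers_released;
  }
}
```

When buffer_owned is 0 the sketch leaves the buffer untouched, since other records in a shared buffer may still be pending.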

18 Cross References
19 • ompt_buffer_cursor_t, see Section 19.4.4.8
20 • ompt_buffer_t, see Section 19.4.4.7

19.5.2.25 ompt_callback_target_data_op_emi_t and ompt_callback_target_data_op_t
23 Summary
24 The ompt_callback_target_data_op_emi_t and
25 ompt_callback_target_data_op_t types are used for callbacks that are dispatched when
26 a thread maps data to a device.

27 Format
C / C++
typedef void (*ompt_callback_target_data_op_emi_t) (
  ompt_scope_endpoint_t endpoint,
  ompt_data_t *target_task_data,
  ompt_data_t *target_data,
  ompt_id_t *host_op_id,
  ompt_target_data_op_t optype,
  void *src_addr,
  int src_device_num,
  void *dest_addr,
  int dest_device_num,
  size_t bytes,
  const void *codeptr_ra
);
typedef void (*ompt_callback_target_data_op_t) (
  ompt_id_t target_id,
  ompt_id_t host_op_id,
  ompt_target_data_op_t optype,
  void *src_addr,
  int src_device_num,
  void *dest_addr,
  int dest_device_num,
  size_t bytes,
  const void *codeptr_ra
);
C / C++
17 Trace Record
C / C++
typedef struct ompt_record_target_data_op_t {
  ompt_id_t host_op_id;
  ompt_target_data_op_t optype;
  void *src_addr;
  int src_device_num;
  void *dest_addr;
  int dest_device_num;
  size_t bytes;
  ompt_device_time_t end_time;
  const void *codeptr_ra;
} ompt_record_target_data_op_t;
C / C++
29 Semantics
30 A thread dispatches a registered ompt_callback_target_data_op_emi or
31 ompt_callback_target_data_op callback when device memory is allocated or freed, as
32 well as when data is copied to or from a device.
Note – An OpenMP implementation may aggregate program variables and data operations upon
them. For instance, an OpenMP implementation may synthesize a composite to represent multiple
scalars and then allocate, free, or copy this composite as a whole rather than performing data
operations on each scalar individually. Thus, callbacks may not be dispatched as separate data
operations on each variable.

1 Description of Arguments
2 The endpoint argument indicates that the callback signals the beginning or end of a scope.
3 The binding of the target_task_data argument is the target task region.
4 The binding of the target_data argument is the target region.
5 The host_op_id argument points to a tool-controlled integer value, which identifies a data operation
6 on a target device.
7 The optype argument indicates the kind of data operation.
8 The src_addr argument indicates the data address before the operation, where applicable.
9 The src_device_num argument indicates the source device number for the data operation, where
10 applicable.
11 The dest_addr argument indicates the data address after the operation.
12 The dest_device_num argument indicates the destination device number for the data operation.
Whether src_addr or dest_addr may point to an intermediate buffer in some operations is
implementation defined.
15 The bytes argument indicates the size of data.
16 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
17 runtime routine implements the region associated with a callback that has type signature
18 ompt_callback_target_data_op_emi_t or ompt_callback_target_data_op_t
19 then codeptr_ra contains the return address of the call to that runtime routine. If the
20 implementation of the region is inlined then codeptr_ra contains the return address of the callback
21 invocation. If attribution to source code is impossible or inappropriate, codeptr_ra may be NULL.
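The following sketch suggests how a tool might use the EMI form of this callback to assign its own operation identifiers through host_op_id and to tally transferred bytes. All types are local stand-ins so that the example compiles without omp-tools.h, and the enumerator values are assumed to match ompt_scope_endpoint_t and ompt_target_data_op_t. The counters are plain globals for brevity; a real tool would need thread-safe updates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins; the values are ASSUMED to mirror the OMPT definitions. */
typedef union sketch_data_t { uint64_t value; void *ptr; } sketch_data_t;
typedef uint64_t sketch_id_t;
typedef enum { sketch_scope_begin = 1, sketch_scope_end = 2 } sketch_endpoint_t;
typedef enum {
  sketch_op_alloc = 1,
  sketch_op_to_device = 2,
  sketch_op_from_device = 3,
  sketch_op_delete = 4
} sketch_data_op_t;

static sketch_id_t sketch_next_op_id = 1;
static size_t sketch_bytes_to_device;
static size_t sketch_bytes_from_device;

/* Same parameter list as ompt_callback_target_data_op_emi_t. */
void sketch_on_target_data_op_emi(sketch_endpoint_t endpoint,
                                  sketch_data_t *target_task_data,
                                  sketch_data_t *target_data,
                                  sketch_id_t *host_op_id,
                                  sketch_data_op_t optype,
                                  void *src_addr, int src_device_num,
                                  void *dest_addr, int dest_device_num,
                                  size_t bytes, const void *codeptr_ra) {
  (void)target_task_data; (void)target_data; (void)src_addr;
  (void)src_device_num; (void)dest_addr; (void)dest_device_num;
  (void)codeptr_ra;
  if (endpoint == sketch_scope_end)
    return; /* count each operation once, when it is first reported */
  assert(host_op_id != NULL);
  *host_op_id = sketch_next_op_id++; /* tool-controlled identifier */
  if (optype == sketch_op_to_device)
    sketch_bytes_to_device += bytes;
  else if (optype == sketch_op_from_device)
    sketch_bytes_from_device += bytes;
}
```

The identifier written through host_op_id can later be matched against the host_op_id field of a trace record for the same operation.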
22 Restrictions
23 Restrictions to the ompt_callback_target_data_op_emi and
24 ompt_callback_target_data_op callbacks are as follows:
25 • These callbacks must not be registered at the same time.
26 Cross References
27 • ompt_data_t, see Section 19.4.4.4
28 • ompt_id_t, see Section 19.4.4.3
29 • ompt_scope_endpoint_t, see Section 19.4.4.11
30 • ompt_target_data_op_t, see Section 19.4.4.15
31 • map clause, see Section 5.8.3

1 19.5.2.26 ompt_callback_target_emi_t and
2 ompt_callback_target_t
3 Summary
4 The ompt_callback_target_emi_t and ompt_callback_target_t types are used
5 for callbacks that are dispatched when a thread begins to execute a device construct.
6 Format
C / C++
typedef void (*ompt_callback_target_emi_t) (
  ompt_target_t kind,
  ompt_scope_endpoint_t endpoint,
  int device_num,
  ompt_data_t *task_data,
  ompt_data_t *target_task_data,
  ompt_data_t *target_data,
  const void *codeptr_ra
);
typedef void (*ompt_callback_target_t) (
  ompt_target_t kind,
  ompt_scope_endpoint_t endpoint,
  int device_num,
  ompt_data_t *task_data,
  ompt_id_t target_id,
  const void *codeptr_ra
);
C / C++
24 Trace Record
C / C++
typedef struct ompt_record_target_t {
  ompt_target_t kind;
  ompt_scope_endpoint_t endpoint;
  int device_num;
  ompt_id_t task_id;
  ompt_id_t target_id;
  const void *codeptr_ra;
} ompt_record_target_t;
C / C++

1 Description of Arguments
2 The kind argument indicates the kind of target region.
3 The endpoint argument indicates that the callback signals the beginning of a scope or the end of a
4 scope.
5 The device_num argument indicates the device number of the device that will execute the target
6 region.
7 The binding of the task_data argument is the encountering task.
8 The binding of the target_task_data argument is the target task region. If a target region has no
9 target task or if the target task is merged, this argument is NULL.
10 The binding of the target_data argument is the target region.
11 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
12 runtime routine implements the region associated with a callback that has type signature
13 ompt_callback_target_emi_t or ompt_callback_target_t then codeptr_ra
14 contains the return address of the call to that runtime routine. If the implementation of the region is
15 inlined then codeptr_ra contains the return address of the callback invocation. If attribution to
16 source code is impossible or inappropriate, codeptr_ra may be NULL.

17 Restrictions
18 Restrictions to the ompt_callback_target_emi and ompt_callback_target callbacks
19 are as follows:
20 • These callbacks must not be registered at the same time.

21 Cross References
22 • ompt_data_t, see Section 19.4.4.4
23 • ompt_id_t, see Section 19.4.4.3
24 • ompt_scope_endpoint_t, see Section 19.4.4.11
25 • ompt_target_t, see Section 19.4.4.21
26 • target data directive, see Section 13.5
27 • target directive, see Section 13.8
28 • target enter data directive, see Section 13.6
29 • target exit data directive, see Section 13.7
30 • target update directive, see Section 13.9

1 19.5.2.27 ompt_callback_target_map_emi_t and
2 ompt_callback_target_map_t
3 Summary
4 The ompt_callback_target_map_emi_t and ompt_callback_target_map_t types
5 are used for callbacks that are dispatched to indicate data mapping relationships.

6 Format
C / C++
typedef void (*ompt_callback_target_map_emi_t) (
  ompt_data_t *target_data,
  unsigned int nitems,
  void **host_addr,
  void **device_addr,
  size_t *bytes,
  unsigned int *mapping_flags,
  const void *codeptr_ra
);
typedef void (*ompt_callback_target_map_t) (
  ompt_id_t target_id,
  unsigned int nitems,
  void **host_addr,
  void **device_addr,
  size_t *bytes,
  unsigned int *mapping_flags,
  const void *codeptr_ra
);
C / C++
25 Trace Record
C / C++
typedef struct ompt_record_target_map_t {
  ompt_id_t target_id;
  unsigned int nitems;
  void **host_addr;
  void **device_addr;
  size_t *bytes;
  unsigned int *mapping_flags;
  const void *codeptr_ra;
} ompt_record_target_map_t;
C / C++

1 Semantics
2 An instance of a target, target data, target enter data, or target exit data
3 construct may contain one or more map clauses. An OpenMP implementation may report the set of
4 mappings associated with map clauses for a construct with a single
5 ompt_callback_target_map_emi or ompt_callback_target_map callback to report
6 the effect of all mappings or multiple ompt_callback_target_map_emi or
7 ompt_callback_target_map callbacks with each reporting a subset of the mappings.
8 Furthermore, an OpenMP implementation may omit mappings that it determines are unnecessary.
9 If an OpenMP implementation issues multiple ompt_callback_target_map_emi or
10 ompt_callback_target_map callbacks, these callbacks may be interleaved with
11 ompt_callback_target_data_op_emi or ompt_callback_target_data_op
12 callbacks used to report data operations associated with the mappings.
13 Description of Arguments
14 The binding of the target_data argument is the target region.
15 The nitems argument indicates the number of data mappings that this callback reports.
16 The host_addr argument indicates an array of host data addresses.
17 The device_addr argument indicates an array of device data addresses.
18 The bytes argument indicates an array of sizes of data.
19 The mapping_flags argument indicates the kind of mapping operations, which may result from
20 explicit map clauses or the implicit data-mapping rules defined in Section 5.8. Flags for the
21 mapping operations include one or more values specified by the ompt_target_map_flag_t
22 type.
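Since host_addr, device_addr, bytes, and mapping_flags are parallel arrays of length nitems, a tool processes them index by index. The sketch below sums the sizes of the mappings that carry a given flag. The two flag constants are local stand-ins whose values are assumed to match the corresponding ompt_target_map_flag_t enumerators, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins; the values are ASSUMED to mirror ompt_target_map_flag_t. */
enum {
  sketch_map_flag_to   = 0x01,
  sketch_map_flag_from = 0x02
};

/* Sums the sizes of the reported mappings whose flags include wanted_flag.
   bytes and mapping_flags are parallel arrays with nitems entries each. */
size_t sketch_mapped_bytes(unsigned int nitems, const size_t *bytes,
                           const unsigned int *mapping_flags,
                           unsigned int wanted_flag) {
  size_t total = 0;
  assert(nitems == 0 || (bytes != NULL && mapping_flags != NULL));
  for (unsigned int i = 0; i < nitems; ++i) {
    if (mapping_flags[i] & wanted_flag)
      total += bytes[i];
  }
  return total;
}
```

Because an implementation may split the mappings of one construct across several callbacks, totals computed this way would need to be accumulated per target region.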
23 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
24 runtime routine implements the region associated with a callback that has type signature
25 ompt_callback_target_map_t or ompt_callback_target_map_emi_t then
26 codeptr_ra contains the return address of the call to that runtime routine. If the implementation of
27 the region is inlined then codeptr_ra contains the return address of the callback invocation. If
28 attribution to source code is impossible or inappropriate, codeptr_ra may be NULL.

29 Restrictions
Restrictions to the ompt_callback_target_map_emi and
ompt_callback_target_map callbacks are as follows:
• These callbacks must not be registered at the same time.

33 Cross References
34 • ompt_callback_target_data_op_emi_t and
35 ompt_callback_target_data_op_t, see Section 19.5.2.25
36 • ompt_data_t, see Section 19.4.4.4
37 • ompt_id_t, see Section 19.4.4.3

1 • ompt_target_map_flag_t, see Section 19.4.4.23
2 • target data directive, see Section 13.5
3 • target directive, see Section 13.8
4 • target enter data directive, see Section 13.6
5 • target exit data directive, see Section 13.7

19.5.2.28 ompt_callback_target_submit_emi_t and ompt_callback_target_submit_t
8 Summary
9 The ompt_callback_target_submit_emi_t and
10 ompt_callback_target_submit_t types are used for callbacks that are dispatched before
11 and after the host initiates creation of an initial task on a device.

12 Format
C / C++
typedef void (*ompt_callback_target_submit_emi_t) (
  ompt_scope_endpoint_t endpoint,
  ompt_data_t *target_data,
  ompt_id_t *host_op_id,
  unsigned int requested_num_teams
);
typedef void (*ompt_callback_target_submit_t) (
  ompt_id_t target_id,
  ompt_id_t host_op_id,
  unsigned int requested_num_teams
);
C / C++
24 Trace Record
C / C++
typedef struct ompt_record_target_kernel_t {
  ompt_id_t host_op_id;
  unsigned int requested_num_teams;
  unsigned int granted_num_teams;
  ompt_device_time_t end_time;
} ompt_record_target_kernel_t;
C / C++

1 Semantics
2 A thread dispatches a registered ompt_callback_target_submit_emi or
3 ompt_callback_target_submit callback on the host before and after a target task initiates
4 creation of an initial task on a device.
5 Description of Arguments
6 The endpoint argument indicates that the callback signals the beginning or end of a scope.
7 The binding of the target_data argument is the target region.
8 The host_op_id argument points to a tool-controlled integer value, which identifies an initial task
9 on a target device.
10 The requested_num_teams argument is the number of teams that the host requested to execute the
11 kernel. The actual number of teams that execute the kernel may be smaller and generally will not be
12 known until the kernel begins to execute on the device.
If ompt_set_trace_ompt has configured the device to trace kernel execution then the device
will log an ompt_record_target_kernel_t record in a trace. The fields in the record are as
follows:
• The host_op_id field contains a tool-controlled identifier that can be used to correlate an
ompt_record_target_kernel_t record with its associated
ompt_callback_target_submit_emi or ompt_callback_target_submit
callback on the host;
• The requested_num_teams field contains the number of teams that the host requested to execute
the kernel;
• The granted_num_teams field contains the number of teams that the device actually used to
execute the kernel;
• The time when the initial task began execution on the device is recorded in the time field of an
enclosing ompt_record_t structure; and
• The time when the initial task completed execution on the device is recorded in the end_time
field.
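To illustrate how the two timestamps relate, the following sketch computes the duration of a kernel in device time units. The struct is a local stand-in for ompt_record_target_kernel_t, sketch_device_time_t is assumed to be a 64-bit unsigned integer like ompt_device_time_t, and the start_time parameter represents the time field of the enclosing ompt_record_t. Conversion of device time to host time is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sketch_device_time_t; /* stand-in for ompt_device_time_t */

/* Stand-in with the same fields as ompt_record_target_kernel_t. */
typedef struct sketch_record_target_kernel_t {
  uint64_t host_op_id;
  unsigned int requested_num_teams;
  unsigned int granted_num_teams;
  sketch_device_time_t end_time;
} sketch_record_target_kernel_t;

/* Returns the kernel duration in device time units. start_time is the
   time field taken from the enclosing trace record. */
sketch_device_time_t
sketch_kernel_ticks(sketch_device_time_t start_time,
                    const sketch_record_target_kernel_t *rec) {
  assert(rec != NULL && rec->end_time >= start_time);
  return rec->end_time - start_time;
}
```

The host_op_id field of the same record would let a tool attach this duration to the submit callback that it observed on the host.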
28 Restrictions
29 Restrictions to the ompt_callback_target_submit_emi and
30 ompt_callback_target_submit callbacks are as follows:
31 • These callbacks must not be registered at the same time.
32 Cross References
33 • ompt_data_t, see Section 19.4.4.4
34 • ompt_id_t, see Section 19.4.4.3
35 • ompt_scope_endpoint_t, see Section 19.4.4.11
36 • target directive, see Section 13.8

1 19.5.2.29 ompt_callback_control_tool_t
2 Summary
3 The ompt_callback_control_tool_t type is used for callbacks that dispatch tool-control
4 events.

5 Format
C / C++
typedef int (*ompt_callback_control_tool_t) (
  uint64_t command,
  uint64_t modifier,
  void *arg,
  const void *codeptr_ra
);
C / C++
12 Trace Record
C / C++
typedef struct ompt_record_control_tool_t {
  uint64_t command;
  uint64_t modifier;
  const void *codeptr_ra;
} ompt_record_control_tool_t;
C / C++
18 Semantics
19 Callbacks with type signature ompt_callback_control_tool_t may return any
20 non-negative value, which will be returned to the application as the return value of the
21 omp_control_tool call that triggered the callback.

22 Description of Arguments
23 The command argument passes a command from an application to a tool. Standard values for
24 command are defined by omp_control_tool_t in Section 18.14.
25 The modifier argument passes a command modifier from an application to a tool.
26 The command and modifier arguments may have tool-specific values. Tools must ignore command
27 values that they are not designed to handle.
28 The arg argument is a void pointer that enables a tool and an application to exchange arbitrary state.
29 The arg argument may be NULL.

1 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
2 runtime routine implements the region associated with a callback that has type signature
3 ompt_callback_control_tool_t then codeptr_ra contains the return address of the call to
4 that runtime routine. If the implementation of the region is inlined then codeptr_ra contains the
5 return address of the callback invocation. If attribution to source code is impossible or
6 inappropriate, codeptr_ra may be NULL.

7 Constraints on Arguments
8 Tool-specific values for command must be ≥ 64.
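A tool-control callback might be structured as in the sketch below, which handles two standard commands and ignores everything else, including tool-specific values that it does not recognize. The command and result constants are local stand-ins whose values are assumed to match omp_control_tool_t and omp_control_tool_result_t; the collecting flag and function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins; the values are ASSUMED to mirror the OpenMP definitions. */
enum { sketch_ctl_start = 1, sketch_ctl_pause = 2 };
enum { sketch_ctl_success = 0, sketch_ctl_ignored = 1 };

static int sketch_collecting; /* 1 while the tool is gathering data */

/* Same parameter list as ompt_callback_control_tool_t. */
int sketch_on_control_tool(uint64_t command, uint64_t modifier, void *arg,
                           const void *codeptr_ra) {
  (void)modifier;
  (void)arg;
  (void)codeptr_ra;
  switch (command) {
  case sketch_ctl_start:
    sketch_collecting = 1;
    return sketch_ctl_success;
  case sketch_ctl_pause:
    sketch_collecting = 0;
    return sketch_ctl_success;
  default:
    /* Commands that this tool is not designed to handle are ignored. */
    return sketch_ctl_ignored;
  }
}
```

The return value is non-negative in every branch, so it can be passed back to the application as the result of the omp_control_tool call.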

9 Cross References
10 • Tool Control Routine, see Section 18.14

11 19.5.2.30 ompt_callback_error_t
12 Summary
13 The ompt_callback_error_t type is used for callbacks that dispatch runtime-error events.

14 Format
C / C++
typedef void (*ompt_callback_error_t) (
  ompt_severity_t severity,
  const char *message,
  size_t length,
  const void *codeptr_ra
);
C / C++
21 Trace Record
C / C++
typedef struct ompt_record_error_t {
  ompt_severity_t severity;
  const char *message;
  size_t length;
  const void *codeptr_ra;
} ompt_record_error_t;
C / C++
28 Semantics
29 A thread dispatches a registered ompt_callback_error_t callback when an error directive
30 is encountered for which the at(execution) clause is specified.

1 Description of Arguments
2 The severity argument passes the specified severity level.
3 The message argument passes the C string from the message clause.
4 The length argument provides the length of the C string.
5 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
6 runtime routine implements the region associated with a callback that has type signature
7 ompt_callback_error_t then codeptr_ra contains the return address of the call to that
8 runtime routine. If the implementation of the region is inlined then codeptr_ra contains the return
9 address of the callback invocation. If attribution to source code is impossible or inappropriate,
10 codeptr_ra may be NULL.
11 Cross References
12 • ompt_severity_t, see Section 19.4.4.25
13 • error directive, see Section 8.5

14 19.6 OMPT Runtime Entry Points for Tools


15 OMPT supports two principal sets of runtime entry points for tools. One set of runtime entry points
16 enables a tool to register callbacks for OpenMP events and to inspect the state of an OpenMP thread
17 while executing in a tool callback or a signal handler. The second set of runtime entry points
18 enables a tool to trace activities on a device. When directed by the tracing interface, an OpenMP
19 implementation will trace activities on a device, collect buffers of trace records, and invoke
20 callbacks on the host to process these records. OMPT runtime entry points should not be global
21 symbols since tools cannot rely on the visibility of such symbols.
22 OMPT also supports runtime entry points for two classes of lookup routines. The first class of
23 lookup routines contains a single member: a routine that returns runtime entry points in the OMPT
24 callback interface. The second class of lookup routines includes a unique lookup routine for each
25 kind of device that can return runtime entry points in a device’s OMPT tracing interface.
26 The omp-tools.h C/C++ header file provides the definitions of the types that are specified
27 throughout this subsection.
28 Binding
29 The binding thread set for each of the entry points in this section is the encountering thread unless
30 otherwise specified. The binding task set is the task executing on the encountering thread.
31 Restrictions
32 Restrictions on OMPT runtime entry points are as follows:
33 • OMPT runtime entry points must not be called from a signal handler on a native thread before a
34 native-thread-begin or after a native-thread-end event.
35 • OMPT device runtime entry points must not be called after a device-finalize event for that device.

1 19.6.1 Entry Points in the OMPT Callback Interface
2 Entry points in the OMPT callback interface enable a tool to register callbacks for OpenMP events
3 and to inspect the state of an OpenMP thread while executing in a tool callback or a signal handler.
4 Pointers to these runtime entry points are obtained through the lookup function that is provided
5 through the OMPT initializer.

6 19.6.1.1 ompt_enumerate_states_t
7 Summary
8 The ompt_enumerate_states_t type is the type signature of the
9 ompt_enumerate_states runtime entry point, which enumerates the thread states that an
10 OpenMP implementation supports.

11 Format
C / C++
typedef int (*ompt_enumerate_states_t) (
  int current_state,
  int *next_state,
  const char **next_state_name
);
C / C++
17 Semantics
18 An OpenMP implementation may support only a subset of the states that the ompt_state_t
19 enumeration type defines. An OpenMP implementation may also support implementation-specific
20 states. The ompt_enumerate_states runtime entry point, which has type signature
21 ompt_enumerate_states_t, enables a tool to enumerate the supported thread states.
22 When a supported thread state is passed as current_state, the runtime entry point assigns the next
23 thread state in the enumeration to the variable passed by reference in next_state and assigns the
24 name associated with that state to the character pointer passed by reference in next_state_name.
25 Whenever one or more states are left in the enumeration, the ompt_enumerate_states
26 runtime entry point returns 1. When the last state in the enumeration is passed as current_state,
27 ompt_enumerate_states returns 0, which indicates that the enumeration is complete.

28 Description of Arguments
29 The current_state argument must be a thread state that the OpenMP implementation supports. To
30 begin enumerating the supported states, a tool should pass ompt_state_undefined as
31 current_state. Subsequent invocations of ompt_enumerate_states should pass the value
32 assigned to the variable that was passed by reference in next_state to the previous call.
33 The value ompt_state_undefined is reserved to indicate an invalid thread state.
34 ompt_state_undefined is defined as an integer with the value 0x102.



1 The next_state argument is a pointer to an integer in which ompt_enumerate_states returns
2 the value of the next state in the enumeration.
3 The next_state_name argument is a pointer to a character string pointer through which
4 ompt_enumerate_states returns a string that describes the next state.

5 Constraints on Arguments
6 Any string returned through the next_state_name argument must be immutable and defined for the
7 lifetime of program execution.

8 Cross References
9 • ompt_state_t, see Section 19.4.4.28
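The enumeration protocol described above can be sketched as a loop that starts from the undefined state and feeds each returned state back in until the entry point returns 0. This is an illustrative sketch only: mock_enumerate_states and its three states are hypothetical stand-ins for a runtime, and STATE_UNDEFINED simply repeats the start value given in the Description of Arguments.

```c
#include <stdio.h>

/* Local copy of the type signature from the Format section. */
typedef int (*ompt_enumerate_states_t)(int current_state, int *next_state,
                                       const char **next_state_name);

/* Start value for the enumeration, as stated in the Description of Arguments. */
enum { STATE_UNDEFINED = 0x102 };

/* Hypothetical runtime that supports three made-up states. */
static int mock_enumerate_states(int current, int *next, const char **name) {
    static const int ids[] = {0x001, 0x040, 0x101};
    static const char *const names[] = {"mock_work", "mock_wait", "mock_overhead"};
    const int n = 3;
    int i = 0;
    if (current != STATE_UNDEFINED) {
        while (i < n && ids[i] != current)
            i++;
        i++; /* step past the current state */
    }
    if (i >= n)
        return 0; /* enumeration complete */
    *next = ids[i];
    *name = names[i];
    return 1;
}

/* Prints every supported state and returns how many were reported. */
static int list_states(ompt_enumerate_states_t enumerate) {
    int state = STATE_UNDEFINED;
    int count = 0;
    const char *name = NULL;
    while (enumerate(state, &state, &name)) {
        printf("0x%03x %s\n", (unsigned)state, name);
        count++;
    }
    return count;
}
```

The same loop shape applies to ompt_enumerate_mutex_impls, with ompt_mutex_impl_none as the start value.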

10 19.6.1.2 ompt_enumerate_mutex_impls_t
11 Summary
12 The ompt_enumerate_mutex_impls_t type is the type signature of the
13 ompt_enumerate_mutex_impls runtime entry point, which enumerates the kinds of mutual
14 exclusion implementations that an OpenMP implementation employs.

15 Format
C / C++
typedef int (*ompt_enumerate_mutex_impls_t) (
    int current_impl,
    int *next_impl,
    const char **next_impl_name
);
C / C++
21 Semantics
22 Mutual exclusion for locks, critical sections, and atomic regions may be implemented in
23 several ways. The ompt_enumerate_mutex_impls runtime entry point, which has type
24 signature ompt_enumerate_mutex_impls_t, enables a tool to enumerate the supported
25 mutual exclusion implementations.
26 When a supported mutex implementation is passed as current_impl, the runtime entry point assigns
27 the next mutex implementation in the enumeration to the variable passed by reference in next_impl
28 and assigns the name associated with that mutex implementation to the character pointer passed by
29 reference in next_impl_name.
30 Whenever one or more mutex implementations are left in the enumeration, the
31 ompt_enumerate_mutex_impls runtime entry point returns 1. When the last mutex
32 implementation in the enumeration is passed as current_impl, the runtime entry point returns 0,
33 which indicates that the enumeration is complete.



1 Description of Arguments
2 The current_impl argument must be a mutex implementation that an OpenMP implementation
3 supports. To begin enumerating the supported mutex implementations, a tool should pass
4 ompt_mutex_impl_none as current_impl. Subsequent invocations of
5 ompt_enumerate_mutex_impls should pass the value assigned to the variable that was
6 passed by reference in next_impl to the previous call.
7 The value ompt_mutex_impl_none is reserved to indicate an invalid mutex implementation.
8 ompt_mutex_impl_none is defined as an integer with the value 0.
9 The next_impl argument is a pointer to an integer in which ompt_enumerate_mutex_impls
10 returns the value of the next mutex implementation in the enumeration.
11 The next_impl_name argument is a pointer to a character string pointer in which
12 ompt_enumerate_mutex_impls returns a string that describes the next mutex
13 implementation.

14 Constraints on Arguments
15 Any string returned through the next_impl_name argument must be immutable and defined for the
16 lifetime of a program execution.

17 19.6.1.3 ompt_set_callback_t
18 Summary
19 The ompt_set_callback_t type is the type signature of the ompt_set_callback runtime
20 entry point, which registers a pointer to a tool callback that an OpenMP implementation invokes
21 when a host OpenMP event occurs.

22 Format
C / C++
typedef ompt_set_result_t (*ompt_set_callback_t) (
    ompt_callbacks_t event,
    ompt_callback_t callback
);
26 );
C / C++
27 Semantics
28 OpenMP implementations can use callbacks to indicate the occurrence of events during the
29 execution of an OpenMP program. The ompt_set_callback runtime entry point, which has
30 type signature ompt_set_callback_t, registers a callback for an OpenMP event on the
31 current device. The return value of ompt_set_callback indicates the outcome of registering
32 the callback.



1 Description of Arguments
2 The event argument indicates the event for which the callback is being registered.
3 The callback argument is a tool callback function. If callback is NULL then callbacks associated
4 with event are disabled. If callbacks are successfully disabled then ompt_set_always is
5 returned.
6 Constraints on Arguments
7 When a tool registers a callback for an event, the type signature for the callback must match the
8 type signature appropriate for the event.
9 Restrictions
10 Restrictions on the ompt_set_callback runtime entry point are as follows:
11 • The entry point must not return ompt_set_impossible.
12 Cross References
13 • Callbacks, see Section 19.4.2
14 • Monitoring Activity on the Host with OMPT, see Section 19.2.4
15 • ompt_callback_t, see Section 19.4.4.1
16 • ompt_get_callback_t, see Section 19.6.1.4
17 • ompt_set_result_t, see Section 19.4.4.2

18 19.6.1.4 ompt_get_callback_t
19 Summary
20 The ompt_get_callback_t type is the type signature of the ompt_get_callback runtime
21 entry point, which retrieves a pointer to a registered tool callback routine (if any) that an OpenMP
22 implementation invokes when a host OpenMP event occurs.
23 Format
C / C++
typedef int (*ompt_get_callback_t) (
    ompt_callbacks_t event,
    ompt_callback_t *callback
);
C / C++
28 Semantics
29 The ompt_get_callback runtime entry point, which has type signature
30 ompt_get_callback_t, retrieves a pointer to the tool callback that an OpenMP
31 implementation may invoke when a host OpenMP event occurs. If a non-null tool callback is
32 registered for the specified event, the pointer to the tool callback is assigned to the variable passed
33 by reference in callback and ompt_get_callback returns 1; otherwise, it returns 0. If
34 ompt_get_callback returns 0, the value of the variable passed by reference as callback is
35 undefined.



1 Description of Arguments
2 The event argument indicates the event for which the callback would be invoked.
3 The callback argument returns a pointer to the callback associated with event.
4 Constraints on Arguments
5 The callback argument cannot be NULL and must point to valid storage.
6 Cross References
7 • Callbacks, see Section 19.4.2
8 • ompt_callback_t, see Section 19.4.4.1
9 • ompt_set_callback_t, see Section 19.6.1.3
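The interplay of ompt_set_callback and ompt_get_callback can be illustrated with a small sketch in which a tool registers a callback, checks the outcome, and then reads the registration back. The type definitions are local, reduced copies of the OMPT types named above; the mock_ functions are hypothetical stand-ins for a runtime, and the callback uses the generic ompt_callback_t signature for brevity (a real tool casts an event-specific callback to that type).

```c
#include <stddef.h>

/* Reduced local copies of OMPT types; a real tool uses <omp-tools.h>. */
typedef void (*ompt_callback_t)(void);
typedef enum {
    ompt_set_error = 0, ompt_set_never = 1, ompt_set_impossible = 2,
    ompt_set_sometimes = 3, ompt_set_sometimes_paired = 4, ompt_set_always = 5
} ompt_set_result_t;
typedef enum {
    ompt_callback_thread_begin = 1, ompt_callback_thread_end = 2
} ompt_callbacks_t;
typedef ompt_set_result_t (*ompt_set_callback_t)(ompt_callbacks_t event,
                                                 ompt_callback_t callback);
typedef int (*ompt_get_callback_t)(ompt_callbacks_t event,
                                   ompt_callback_t *callback);

/* Hypothetical runtime: one slot per supported event. */
static ompt_callback_t mock_table[3];

static ompt_set_result_t mock_set_callback(ompt_callbacks_t event, ompt_callback_t cb) {
    if ((int)event < 1 || (int)event > 2)
        return ompt_set_error;
    mock_table[event] = cb; /* NULL disables the callback */
    return ompt_set_always;
}

static int mock_get_callback(ompt_callbacks_t event, ompt_callback_t *cb) {
    if ((int)event < 1 || (int)event > 2 || mock_table[event] == NULL)
        return 0; /* *cb is left untouched; the caller must not read it */
    *cb = mock_table[event];
    return 1;
}

static void on_thread_begin(void) { /* tool work would go here */ }

/* Returns 1 if cb ends up registered for event, 0 otherwise. */
static int register_and_verify(ompt_set_callback_t set_cb, ompt_get_callback_t get_cb,
                               ompt_callbacks_t event, ompt_callback_t cb) {
    ompt_set_result_t result = set_cb(event, cb);
    if (result == ompt_set_error || result == ompt_set_never)
        return 0;
    ompt_callback_t current = NULL;
    if (!get_cb(event, &current))
        return 0;
    return current == cb;
}
```

A tool that needs every occurrence of an event would additionally compare the result against ompt_set_always rather than accepting any non-failing value.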

10 19.6.1.5 ompt_get_thread_data_t
11 Summary
12 The ompt_get_thread_data_t type is the type signature of the
13 ompt_get_thread_data runtime entry point, which returns the address of the thread data
14 object for the current thread.

15 Format
C / C++
typedef ompt_data_t *(*ompt_get_thread_data_t) (void);
C / C++
17 Semantics
18 Each OpenMP thread can have an associated thread data object of type ompt_data_t. The
19 ompt_get_thread_data runtime entry point, which has type signature
20 ompt_get_thread_data_t, retrieves a pointer to the thread data object, if any, that is
21 associated with the current thread. A tool may use a pointer to an OpenMP thread’s data object that
22 ompt_get_thread_data retrieves to inspect or to modify the value of the data object. When
23 an OpenMP thread is created, its data object is initialized with value ompt_data_none. This
24 runtime entry point is async signal safe.

25 Cross References
26 • ompt_data_t, see Section 19.4.4.4
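Because the returned pointer refers to storage that the tool may modify, a tool can keep small per-thread state directly in the data object. The sketch below, offered as an illustration only, counts events per thread in the value member. The ompt_data_t union is a local copy of the OMPT type, and the mock_ functions are hypothetical stand-ins for the runtime entry point.

```c
#include <stddef.h>
#include <stdint.h>

/* Local copy of the OMPT data type; a real tool uses <omp-tools.h>. */
typedef union ompt_data_t {
    uint64_t value;
    void *ptr;
} ompt_data_t;
typedef ompt_data_t *(*ompt_get_thread_data_t)(void);

/* Hypothetical runtime: one thread whose data object starts as all zero,
   which is the value the text calls ompt_data_none. */
static ompt_data_t mock_thread_data = {0};
static ompt_data_t *mock_get_thread_data(void) { return &mock_thread_data; }

/* Hypothetical runtime with no data object for the current thread. */
static ompt_data_t *mock_no_thread_data(void) { return NULL; }

/* Increments the per-thread event counter and returns the new count,
   or 0 if the current thread has no data object. */
static uint64_t count_event(ompt_get_thread_data_t get_thread_data) {
    ompt_data_t *data = get_thread_data();
    if (data == NULL)
        return 0;
    return ++data->value;
}
```

Tools that need more than one word of state typically store a pointer to their own structure in the ptr member instead.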

27 19.6.1.6 ompt_get_num_procs_t
28 Summary
29 The ompt_get_num_procs_t type is the type signature of the ompt_get_num_procs
30 runtime entry point, which returns the number of processors currently available to the execution
31 environment on the host device.



1 Format
C / C++
typedef int (*ompt_get_num_procs_t) (void);
C / C++
3 Binding
4 The binding thread set is all threads on the host device.

5 Semantics
6 The ompt_get_num_procs runtime entry point, which has type signature
7 ompt_get_num_procs_t, returns the number of processors that are available on the host
8 device at the time the routine is called. This value may change between the time that it is
9 determined and the time that it is read in the calling context due to system actions outside the
10 control of the OpenMP implementation. This runtime entry point is async signal safe.

11 19.6.1.7 ompt_get_num_places_t
12 Summary
13 The ompt_get_num_places_t type is the type signature of the ompt_get_num_places
14 runtime entry point, which returns the number of places currently available to the execution
15 environment in the place list.

16 Format
C / C++
typedef int (*ompt_get_num_places_t) (void);
C / C++
18 Binding
19 The binding thread set is all threads on a device.

20 Semantics
21 The ompt_get_num_places runtime entry point, which has type signature
22 ompt_get_num_places_t, returns the number of places in the place list. This value is
23 equivalent to the number of places in the place-partition-var ICV in the execution environment of
24 the initial task. This runtime entry point is async signal safe.

25 Cross References
26 • OMP_PLACES, see Section 21.1.6
27 • place-partition-var ICV, see Table 2.1



1 19.6.1.8 ompt_get_place_proc_ids_t
2 Summary
3 The ompt_get_place_proc_ids_t type is the type signature of the
4 ompt_get_place_proc_ids runtime entry point, which returns the numerical
5 identifiers of the processors that are available to the execution environment in the specified place.

6 Format
C / C++
typedef int (*ompt_get_place_proc_ids_t) (
    int place_num,
    int ids_size,
    int *ids
);
C / C++
12 Binding
13 The binding thread set is all threads on a device.

14 Semantics
15 The ompt_get_place_proc_ids runtime entry point, which has type signature
16 ompt_get_place_proc_ids_t, returns the numerical identifiers of each processor that is
17 associated with the specified place. These numerical identifiers are non-negative, and their meaning
18 is implementation defined.

19 Description of Arguments
20 The place_num argument specifies the place that is being queried.
21 The ids argument is an array in which the routine can return a vector of processor identifiers in the
22 specified place.
23 The ids_size argument indicates the size of the result array that is specified by ids.

24 Effect
25 If the ids array of size ids_size is large enough to contain all identifiers then they are returned in ids
26 and their order in the array is implementation defined. Otherwise, if the ids array is too small, the
27 values in ids when the function returns are unspecified. The routine always returns the number of
28 numerical identifiers of the processors that are available to the execution environment in the
29 specified place.
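Since the routine always returns the number of identifiers, even when the supplied array is too small, a tool can size its buffer with a first probing call and fetch the identifiers with a second call. The following sketch illustrates that pattern under stated assumptions: mock_get_place_proc_ids is a hypothetical runtime in which place 0 holds four processors, and place_proc_ids is a hypothetical helper name.

```c
#include <stdlib.h>

/* Local copy of the type signature from the Format section. */
typedef int (*ompt_get_place_proc_ids_t)(int place_num, int ids_size, int *ids);

/* Hypothetical runtime: place 0 contains processors 4..7, other places are empty. */
static int mock_get_place_proc_ids(int place_num, int ids_size, int *ids) {
    static const int procs[] = {4, 5, 6, 7};
    if (place_num != 0)
        return 0;
    if (ids_size >= 4) {
        for (int i = 0; i < 4; i++)
            ids[i] = procs[i];
    }
    return 4; /* the count is returned even if ids was too small */
}

/* Returns a malloc'd array of processor ids for place_num and stores its
   length in *count; returns NULL with *count == 0 if the place is empty,
   allocation fails, or the count changed between the two calls.
   The caller frees the result. */
static int *place_proc_ids(ompt_get_place_proc_ids_t get_ids, int place_num, int *count) {
    int probe; /* one-element buffer; its content is unspecified after the call */
    int needed = get_ids(place_num, 1, &probe);
    *count = 0;
    if (needed <= 0)
        return NULL;
    int *ids = malloc((size_t)needed * sizeof *ids);
    if (ids == NULL)
        return NULL;
    if (get_ids(place_num, needed, ids) != needed) {
        free(ids);
        return NULL;
    }
    *count = needed;
    return ids;
}
```

The same two-call pattern fits ompt_get_partition_place_nums, whose Effect section has the same shape. Note that malloc is not async signal safe, so a helper like this belongs in a callback rather than a signal handler.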

30 19.6.1.9 ompt_get_place_num_t
31 Summary
32 The ompt_get_place_num_t type is the type signature of the ompt_get_place_num
33 runtime entry point, which returns the place number of the place to which the current thread is
34 bound.



1 Format
C / C++
typedef int (*ompt_get_place_num_t) (void);
C / C++
3 Semantics
4 When the current thread is bound to a place, ompt_get_place_num returns the place number
5 associated with the thread. The returned value is between 0 and one less than the value returned by
6 ompt_get_num_places, inclusive. When the current thread is not bound to a place, the routine
7 returns -1. This runtime entry point is async signal safe.

8 19.6.1.10 ompt_get_partition_place_nums_t
9 Summary
10 The ompt_get_partition_place_nums_t type is the type signature of the
11 ompt_get_partition_place_nums runtime entry point, which returns a list of place
12 numbers that correspond to the places in the place-partition-var ICV of the innermost implicit task.

13 Format
C / C++
typedef int (*ompt_get_partition_place_nums_t) (
    int place_nums_size,
    int *place_nums
);
C / C++
18 Semantics
19 The ompt_get_partition_place_nums runtime entry point, which has type signature
20 ompt_get_partition_place_nums_t, returns a list of place numbers that correspond to
21 the places in the place-partition-var ICV of the innermost implicit task. This runtime entry point is
22 async signal safe.

23 Description of Arguments
24 The place_nums argument is an array in which the routine can return a vector of place identifiers.
25 The place_nums_size argument indicates the size of the result array that the place_nums argument
26 specifies.

27 Effect
28 If the place_nums array of size place_nums_size is large enough to contain all identifiers then they
29 are returned in place_nums and their order in the array is implementation defined. Otherwise, if the
30 place_nums array is too small, the values in place_nums when the function returns are unspecified.
31 The routine always returns the number of places in the place-partition-var ICV of the innermost
32 implicit task.



1 Cross References
2 • OMP_PLACES, see Section 21.1.6
3 • place-partition-var ICV, see Table 2.1

4 19.6.1.11 ompt_get_proc_id_t
5 Summary
6 The ompt_get_proc_id_t type is the type signature of the ompt_get_proc_id runtime
7 entry point, which returns the numerical identifier of the processor of the current thread.

8 Format
C / C++
typedef int (*ompt_get_proc_id_t) (void);
C / C++
10 Semantics
11 The ompt_get_proc_id runtime entry point, which has type signature
12 ompt_get_proc_id_t, returns the numerical identifier of the processor of the current thread.
13 A defined numerical identifier is non-negative, and its meaning is implementation defined. A
14 negative number indicates a failure to retrieve the numerical identifier. This runtime entry point is
15 async signal safe.

16 19.6.1.12 ompt_get_state_t
17 Summary
18 The ompt_get_state_t type is the type signature of the ompt_get_state runtime entry
19 point, which returns the state and the wait identifier of the current thread.

20 Format
C / C++
typedef int (*ompt_get_state_t) (
    ompt_wait_id_t *wait_id
);
C / C++
24 Semantics
25 Each OpenMP thread has an associated state and a wait identifier. If a thread’s state indicates that
26 the thread is waiting for mutual exclusion then its wait identifier contains an opaque handle that
27 indicates the data object upon which the thread is waiting. The ompt_get_state runtime entry
28 point, which has type signature ompt_get_state_t, retrieves the state and wait identifier of the
29 current thread. The returned value may be any one of the states predefined by ompt_state_t or
30 a value that represents an implementation-specific state. The tool may obtain a string representation
1 for each state with the ompt_enumerate_states function. If the returned state indicates that
2 the thread is waiting for a lock, nest lock, critical region, atomic region, or ordered region
3 then the value of the thread’s wait identifier is assigned to a non-null wait identifier passed as the
4 wait_id argument. This runtime entry point is async signal safe.

5 Description of Arguments
6 The wait_id argument is a pointer to an opaque handle that is available to receive the value of the
7 wait identifier of the thread. If wait_id is not NULL then the entry point assigns the value of the
8 wait identifier of the thread to the object to which wait_id points. If the returned state is not one of
9 the specified wait states then the value of the opaque object to which wait_id points is undefined
10 after the call.

11 Constraints on Arguments
12 The argument passed to the entry point must be a reference to a variable of the specified type or
13 NULL.

14 Cross References
15 • ompt_enumerate_states_t, see Section 19.6.1.1
16 • ompt_state_t, see Section 19.4.4.28
17 • ompt_wait_id_t, see Section 19.4.4.31
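The rule that the wait identifier is meaningful only for wait states can be sketched as follows. This is an illustration under assumptions: the MOCK_STATE_ codes are hypothetical values used only by the stand-in runtime (real codes come from ompt_state_t), and blocked_on_lock is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>

/* Local copies of the OMPT types; a real tool uses <omp-tools.h>. */
typedef uint64_t ompt_wait_id_t;
typedef int (*ompt_get_state_t)(ompt_wait_id_t *wait_id);

/* Hypothetical state codes for the mock only. */
enum { MOCK_STATE_WORKING = 1, MOCK_STATE_WAIT_LOCK = 2 };

/* Hypothetical runtime: the current thread's state is held in a variable. */
static int mock_state = MOCK_STATE_WAIT_LOCK;

static int mock_get_state(ompt_wait_id_t *wait_id) {
    if (wait_id != NULL && mock_state == MOCK_STATE_WAIT_LOCK)
        *wait_id = 0xABCD; /* opaque handle of the lock being waited on */
    return mock_state;
}

/* Returns 1 and stores the wait identifier in *lock_id if the current
   thread is waiting on a lock; returns 0 and leaves *lock_id alone otherwise. */
static int blocked_on_lock(ompt_get_state_t get_state, ompt_wait_id_t *lock_id) {
    ompt_wait_id_t id = 0;
    if (get_state(&id) != MOCK_STATE_WAIT_LOCK)
        return 0; /* id is undefined for non-wait states, so it is ignored */
    *lock_id = id;
    return 1;
}
```

Reading the state first and the identifier second keeps the tool from interpreting an undefined wait identifier.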

18 19.6.1.13 ompt_get_parallel_info_t
19 Summary
20 The ompt_get_parallel_info_t type is the type signature of the
21 ompt_get_parallel_info runtime entry point, which returns information about the parallel
22 region, if any, at the specified ancestor level for the current execution context.

23 Format
C / C++
typedef int (*ompt_get_parallel_info_t) (
    int ancestor_level,
    ompt_data_t **parallel_data,
    int *team_size
);
C / C++
29 Semantics
30 During execution, an OpenMP program may employ nested parallel regions. The
31 ompt_get_parallel_info runtime entry point, which has type signature
32 ompt_get_parallel_info_t, retrieves information about the current parallel region and any
33 enclosing parallel regions for the current execution context. The entry point returns 2 if a parallel
34 region exists at the specified ancestor level and the information is available, 1 if a parallel region
35 exists at the specified ancestor level but the information is currently unavailable, and 0 otherwise.



1 A tool may use the pointer to the data object of a parallel region that it obtains from this runtime
2 entry point to inspect or to modify the value of the data object. When a parallel region is created, its
3 data object will be initialized with the value ompt_data_none.
4 This runtime entry point is async signal safe.
5 Between a parallel-begin event and an implicit-task-begin event, a call to
6 ompt_get_parallel_info(0,...) may return information about the outer parallel team or
7 the new parallel team.
8 If a thread is in the state ompt_state_wait_barrier_implicit_parallel then a call to
9 ompt_get_parallel_info may return a pointer to a copy of the specified parallel region’s
10 parallel_data rather than a pointer to the data word for the region itself. This convention enables
11 the primary thread for a parallel region to free storage for the region immediately after the region
12 ends, yet avoid having some other thread in the team that is executing the region potentially
13 reference the parallel_data object for the region after it has been freed.
14 Description of Arguments
15 The ancestor_level argument specifies the parallel region of interest by its ancestor level. Ancestor
16 level 0 refers to the innermost parallel region; information about enclosing parallel regions may be
17 obtained using larger values for ancestor_level.
18 The parallel_data argument returns the parallel data if the argument is not NULL.
19 The team_size argument returns the team size if the argument is not NULL.
20 Effect
21 If the runtime entry point returns 0 or 1, no argument is modified. Otherwise,
22 ompt_get_parallel_info has the following effects:
23 • If a non-null value was passed for parallel_data, the value returned in parallel_data is a pointer
24 to a data word that is associated with the parallel region at the specified level; and
25 • If a non-null value was passed for team_size, the value returned in the integer to which team_size
26 points is the number of threads in the team that is associated with the parallel region.
27 Constraints on Arguments
28 While argument ancestor_level is passed by value, all other arguments to the entry point must be
29 pointers to variables of the specified types or NULL.
30 Cross References
31 • ompt_data_t, see Section 19.4.4.4
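The three return values and the ancestor-level convention suggest a simple walk from the innermost region outward. The sketch below is illustrative only: mock_get_parallel_info is a hypothetical runtime with two nested regions, and team_sizes is a hypothetical helper name. It passes NULL for parallel_data, which the Constraints on Arguments permit.

```c
#include <stddef.h>
#include <stdint.h>

/* Local copies of the OMPT types; a real tool uses <omp-tools.h>. */
typedef union ompt_data_t {
    uint64_t value;
    void *ptr;
} ompt_data_t;
typedef int (*ompt_get_parallel_info_t)(int ancestor_level,
                                        ompt_data_t **parallel_data,
                                        int *team_size);

/* Hypothetical runtime: an inner team of 4 threads nested in an outer team of 2. */
static ompt_data_t mock_regions[2];

static int mock_get_parallel_info(int level, ompt_data_t **data, int *team_size) {
    static const int sizes[] = {4, 2};
    if (level < 0 || level > 1)
        return 0; /* no parallel region at this level */
    if (data != NULL)
        *data = &mock_regions[level];
    if (team_size != NULL)
        *team_size = sizes[level];
    return 2; /* region exists and information is available */
}

/* Stores team sizes, innermost first, for up to max levels and returns the
   number of levels found. A level whose information is currently
   unavailable (return value 1) is recorded as -1. */
static int team_sizes(ompt_get_parallel_info_t get_info, int *sizes, int max) {
    int depth = 0;
    for (int level = 0; level < max; level++) {
        int size = 0;
        int rc = get_info(level, NULL, &size);
        if (rc == 0)
            break;
        sizes[depth++] = (rc == 2) ? size : -1;
    }
    return depth;
}
```

The same outward walk applies to ompt_get_task_info, which uses the same return-value convention.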

32 19.6.1.14 ompt_get_task_info_t
33 Summary
34 The ompt_get_task_info_t type is the type signature of the ompt_get_task_info
35 runtime entry point, which returns information about the task, if any, at the specified ancestor level
36 in the current execution context.



1 Format
C / C++
typedef int (*ompt_get_task_info_t) (
    int ancestor_level,
    int *flags,
    ompt_data_t **task_data,
    ompt_frame_t **task_frame,
    ompt_data_t **parallel_data,
    int *thread_num
);
C / C++
10 Semantics
11 During execution, an OpenMP thread may be executing an OpenMP task. Additionally, the stack of
12 the thread may contain procedure frames that are associated with suspended OpenMP tasks or
13 OpenMP runtime system routines. To obtain information about any task on the stack of the current
14 thread, a tool uses the ompt_get_task_info runtime entry point, which has type signature
15 ompt_get_task_info_t.
16 Ancestor level 0 refers to the active task; information about other tasks with associated frames
17 present on the stack in the current execution context may be queried at higher ancestor levels.
18 The ompt_get_task_info runtime entry point returns 2 if a task region exists at the specified
19 ancestor level and the information is available, 1 if a task region exists at the specified ancestor level
20 but the information is currently unavailable, and 0 otherwise.
21 If a task exists at the specified ancestor level and the information is available then information is
22 returned in the variables passed by reference to the entry point. If no task region exists at the
23 specified ancestor level or the information is unavailable then the values of variables passed by
24 reference to the entry point are undefined when ompt_get_task_info returns.
25 A tool may use a pointer to a data object for a task or parallel region that it obtains from
26 ompt_get_task_info to inspect or to modify the value of the data object. When either a
27 parallel region or a task region is created, its data object will be initialized with the value
28 ompt_data_none.
29 This runtime entry point is async signal safe.
30 Description of Arguments
31 The ancestor_level argument specifies the task region of interest by its ancestor level. Ancestor
32 level 0 refers to the active task; information about ancestor tasks found in the current execution
33 context may be queried at higher ancestor levels.
34 The flags argument returns the task type if the argument is not NULL.
35 The task_data argument returns the task data if the argument is not NULL.
36 The task_frame argument returns the task frame pointer if the argument is not NULL.



1 The parallel_data argument returns the parallel data if the argument is not NULL.
2 The thread_num argument returns the thread number if the argument is not NULL.
3 Effect
4 If the runtime entry point returns 0 or 1, no argument is modified. Otherwise,
5 ompt_get_task_info has the following effects:
6 • If a non-null value was passed for flags then the value returned in the integer to which flags
7 points represents the type of the task at the specified level; possible task types include initial,
8 implicit, explicit, and target tasks;
9 • If a non-null value was passed for task_data then the value that is returned in the object to which
10 it points is a pointer to a data word that is associated with the task at the specified level;
11 • If a non-null value was passed for task_frame then the value that is returned in the object to
12 which task_frame points is a pointer to the ompt_frame_t structure that is associated with the
13 task at the specified level;
14 • If a non-null value was passed for parallel_data then the value that is returned in the object to
15 which parallel_data points is a pointer to a data word that is associated with the parallel region
16 that contains the task at the specified level or, if the task at the specified level is an initial task,
17 NULL; and
18 • If a non-null value was passed for thread_num, then the value that is returned in the object to
19 which thread_num points indicates the number of the thread in the parallel region that is
20 executing the task at the specified level.
21 Constraints on Arguments
22 While argument ancestor_level is passed by value, all other arguments to
23 ompt_get_task_info must be pointers to variables of the specified types or NULL.

24 Cross References
25 • ompt_data_t, see Section 19.4.4.4
26 • ompt_frame_t, see Section 19.4.4.29
27 • ompt_task_flag_t, see Section 19.4.4.19

28 19.6.1.15 ompt_get_task_memory_t
29 Summary
30 The ompt_get_task_memory_t type is the type signature of the
31 ompt_get_task_memory runtime entry point, which returns information about memory ranges
32 that are associated with the task.



1 Format
C / C++
typedef int (*ompt_get_task_memory_t) (
    void **addr,
    size_t *size,
    int block
);
C / C++
7 Semantics
8 During execution, an OpenMP thread may be executing an OpenMP task. The OpenMP
9 implementation must preserve the data environment from the creation of the task for the execution
10 of the task. The ompt_get_task_memory runtime entry point, which has type signature
11 ompt_get_task_memory_t, provides information about the memory ranges used to store the
12 data environment for the current task. Multiple memory ranges may be used to store these data.
13 The block argument supports iteration over these memory ranges. The
14 ompt_get_task_memory runtime entry point returns 1 if more memory ranges are available,
15 and 0 otherwise. If no memory is used for a task, size is set to 0. In this case, addr is unspecified.
16 This runtime entry point is async signal safe.

17 Description of Arguments
18 The addr argument is a pointer to a void pointer return value to provide the start address of a
19 memory block.
20 The size argument is a pointer to a size type return value to provide the size of the memory block.
21 The block argument is an integer value to specify the memory block of interest.
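Iteration over the memory ranges with the block argument can be sketched as a loop that requests block 0, 1, 2, and so on until the entry point returns 0. The sketch assumes that a return value of 1 means further blocks exist beyond the one just reported; to stay correct if a runtime instead returns 0 without writing its outputs, size is reset to 0 before every call. The mock_ runtime with two blocks and the helper name task_memory_bytes are hypothetical.

```c
#include <stddef.h>

/* Local copy of the type signature from the Format section. */
typedef int (*ompt_get_task_memory_t)(void **addr, size_t *size, int block);

/* Hypothetical runtime: the current task's data environment uses two blocks. */
static char mock_block0[64];
static char mock_block1[32];

static int mock_get_task_memory(void **addr, size_t *size, int block) {
    if (block == 0) {
        *addr = mock_block0;
        *size = sizeof mock_block0;
        return 1; /* another block follows */
    }
    if (block == 1) {
        *addr = mock_block1;
        *size = sizeof mock_block1;
        return 0; /* last block */
    }
    *addr = NULL;
    *size = 0;
    return 0;
}

/* Returns the total number of bytes over all memory blocks of the current task. */
static size_t task_memory_bytes(ompt_get_task_memory_t get_task_memory) {
    size_t total = 0;
    int block = 0;
    int more;
    do {
        void *addr = NULL;
        size_t size = 0; /* stays 0 if the runtime reports no memory */
        more = get_task_memory(&addr, &size, block++);
        total += size;
    } while (more);
    return total;
}
```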

22 19.6.1.16 ompt_get_target_info_t
23 Summary
24 The ompt_get_target_info_t type is the type signature of the
25 ompt_get_target_info runtime entry point, which returns identifiers that specify a thread’s
26 current target region and target operation ID, if any.

27 Format
C / C++
typedef int (*ompt_get_target_info_t) (
    uint64_t *device_num,
    ompt_id_t *target_id,
    ompt_id_t *host_op_id
);
C / C++



1 Semantics
2 The ompt_get_target_info entry point, which has type signature
3 ompt_get_target_info_t, returns 1 if the current thread is in a target region and 0
4 otherwise. If the entry point returns 0 then the values of the variables passed by reference as its
5 arguments are undefined. If the current thread is in a target region then
6 ompt_get_target_info returns information about the current device, active target region,
7 and active host operation, if any. This runtime entry point is async signal safe.

8 Description of Arguments
9 The device_num argument returns the device number if the current thread is in a target region.
10 The target_id argument returns the target region identifier if the current thread is in a target
11 region.
12 If the current thread is in the process of initiating an operation on a target device (for example,
13 copying data to or from an accelerator or launching a kernel), then host_op_id returns the identifier
14 for the operation; otherwise, host_op_id returns ompt_id_none.

15 Constraints on Arguments
16 Arguments passed to the entry point must be valid references to variables of the specified types.

17 Cross References
18 • ompt_id_t, see Section 19.4.4.3

19 19.6.1.17 ompt_get_num_devices_t
20 Summary
21 The ompt_get_num_devices_t type is the type signature of the
22 ompt_get_num_devices runtime entry point, which returns the number of available devices.

23 Format
C / C++
typedef int (*ompt_get_num_devices_t) (void);
C / C++
25 Semantics
26 The ompt_get_num_devices runtime entry point, which has type signature
27 ompt_get_num_devices_t, returns the number of devices available to an OpenMP program.
28 This runtime entry point is async signal safe.

29 19.6.1.18 ompt_get_unique_id_t
30 Summary
31 The ompt_get_unique_id_t type is the type signature of the ompt_get_unique_id
32 runtime entry point, which returns a unique number.



1 Format
C / C++
typedef uint64_t (*ompt_get_unique_id_t) (void);
C / C++
3 Semantics
4 The ompt_get_unique_id runtime entry point, which has type signature
5 ompt_get_unique_id_t, returns a number that is unique for the duration of an OpenMP
6 program. Successive invocations may not result in consecutive or even increasing numbers. This
7 runtime entry point is async signal safe.

8 19.6.1.19 ompt_finalize_tool_t
9 Summary
10 The ompt_finalize_tool_t type is the type signature of the ompt_finalize_tool
11 runtime entry point, which enables a tool to finalize itself.

12 Format
C / C++
typedef void (*ompt_finalize_tool_t) (void);
C / C++
14 Semantics
15 A tool may detect that the execution of an OpenMP program is ending before the OpenMP
16 implementation does. To facilitate clean termination of the tool, the tool may invoke the
17 ompt_finalize_tool runtime entry point, which has type signature
18 ompt_finalize_tool_t. Upon completion of ompt_finalize_tool, no OMPT
19 callbacks are dispatched.
20 Effect
21 The ompt_finalize_tool routine detaches the tool from the runtime, unregisters all callbacks
22 and invalidates all OMPT entry points passed to the tool in the lookup-function. Upon completion
23 of ompt_finalize_tool, no further callbacks will be issued on any thread. Before the
24 callbacks are unregistered, the OpenMP runtime should attempt to dispatch all outstanding
25 registered callbacks as well as the callbacks that would be encountered during shutdown of the
26 runtime, if possible in the current execution context.

27 19.6.2 Entry Points in the OMPT Device Tracing Interface


28 The runtime entry points with type signatures of the types that are specified in this section enable a
29 tool to trace activities on a device.



1 19.6.2.1 ompt_get_device_num_procs_t
2 Summary
3 The ompt_get_device_num_procs_t type is the type signature of the
4 ompt_get_device_num_procs runtime entry point, which returns the number of processors
5 currently available to the execution environment on the specified device.
6 Format
C / C++
typedef int (*ompt_get_device_num_procs_t) (
    ompt_device_t *device
);
C / C++
10 Semantics
11 The ompt_get_device_num_procs runtime entry point, which has type signature
12 ompt_get_device_num_procs_t, returns the number of processors that are available on the
13 device at the time the routine is called. This value may change between the time that it is
14 determined and the time that it is read in the calling context due to system actions outside the
15 control of the OpenMP implementation.

16 Description of Arguments
17 The device argument is a pointer to an opaque object that represents the target device instance. The
18 pointer to the device instance object is used by functions in the device tracing interface to identify
19 the device being addressed.

20 Cross References
21 • ompt_device_t, see Section 19.4.4.5

22 19.6.2.2 ompt_get_device_time_t
23 Summary
24 The ompt_get_device_time_t type is the type signature of the
25 ompt_get_device_time runtime entry point, which returns the current time on the specified
26 device.
27 Format
C / C++
typedef ompt_device_time_t (*ompt_get_device_time_t) (
    ompt_device_t *device
);
C / C++



1 Semantics
2 Host and target devices are typically distinct and run independently. If host and target devices are
3 different hardware components, they may use different clock generators. For this reason, a common
4 time base for ordering host-side and device-side events may not be available. The
5 ompt_get_device_time runtime entry point, which has type signature
6 ompt_get_device_time_t, returns the current time on the specified device. A tool can use
7 this information to align time stamps from different devices.
8 Description of Arguments
9 The device argument is a pointer to an opaque object that represents the target device instance. The
10 pointer to the device instance object is used by functions in the device tracing interface to identify
11 the device being addressed.
12 Cross References
13 • ompt_device_t, see Section 19.4.4.5
14 • ompt_device_time_t, see Section 19.4.4.6

15 19.6.2.3 ompt_translate_time_t
16 Summary
17 The ompt_translate_time_t type is the type signature of the ompt_translate_time
18 runtime entry point, which translates a time value that is obtained from the specified device to a
19 corresponding time value on the host device.
20 Format
C / C++
typedef double (*ompt_translate_time_t) (
  ompt_device_t *device,
  ompt_device_time_t time
);
C / C++
25 Semantics
26 The ompt_translate_time runtime entry point, which has type signature
27 ompt_translate_time_t, translates a time value obtained from the specified device to a
28 corresponding time value on the host device. The returned value for the host time has the same
29 meaning as the value returned from omp_get_wtime.
30 Description of Arguments
31 The device argument is a pointer to an opaque object that represents the target device instance. The
32 pointer to the device instance object is used by functions in the device tracing interface to identify
33 the device being addressed.
34 The time argument is a time from the specified device.



1 Cross References
2 • omp_get_wtime, see Section 18.10.1
3 • ompt_device_t, see Section 19.4.4.5
4 • ompt_device_time_t, see Section 19.4.4.6
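As a non-normative illustration, the two entry points above can be combined to express a device time stamp on the host time base. The sketch below assumes that the tool has already obtained both entry points through the device lookup function. The helper device_now_on_host and the mock_ functions are hypothetical, and the typedefs are stand-ins for the declarations in omp-tools.h so that the fragment compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in declarations; a real tool would include omp-tools.h instead. */
typedef void ompt_device_t;            /* opaque device object */
typedef uint64_t ompt_device_time_t;   /* raw device time stamp */

typedef ompt_device_time_t (*ompt_get_device_time_t)(ompt_device_t *device);
typedef double (*ompt_translate_time_t)(ompt_device_t *device,
                                        ompt_device_time_t time);

/* Hypothetical helper: read the device clock and translate the value so
   that it is comparable with values returned by omp_get_wtime. */
double device_now_on_host(ompt_device_t *device,
                          ompt_get_device_time_t get_time,
                          ompt_translate_time_t translate)
{
    ompt_device_time_t t = get_time(device);
    return translate(device, t);
}

/* Mock entry points used only to exercise the helper. The fake device
   counts nanoseconds and started 10 s after the host time origin. */
ompt_device_time_t mock_get_device_time(ompt_device_t *device)
{
    (void)device;
    return UINT64_C(2500000000);   /* 2.5 s worth of device ticks */
}

double mock_translate_time(ompt_device_t *device, ompt_device_time_t time)
{
    (void)device;
    return 10.0 + (double)time / 1e9;
}
```

With the mock entry points, device_now_on_host(NULL, mock_get_device_time, mock_translate_time) yields a host time of about 12.5 seconds. The tick rate and offset of a real device are implementation specific, which is why the translation is delegated to ompt_translate_time.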

5 19.6.2.4 ompt_set_trace_ompt_t
6 Summary
7 The ompt_set_trace_ompt_t type is the type signature of the ompt_set_trace_ompt
8 runtime entry point, which enables or disables the recording of trace records for one or more types
9 of OMPT events.

10 Format
C / C++
typedef ompt_set_result_t (*ompt_set_trace_ompt_t) (
  ompt_device_t *device,
  unsigned int enable,
  unsigned int etype
);
C / C++
16 Description of Arguments
17 The device argument points to an opaque object that represents the target device instance. Functions
18 in the device tracing interface use this pointer to identify the device that is being addressed.
19 The etype argument indicates the events to which the invocation of ompt_set_trace_ompt
20 applies. If the value of etype is 0 then the invocation applies to all events. If etype is positive then it
21 applies to the event in ompt_callbacks_t that matches that value.
22 The enable argument indicates whether tracing should be enabled or disabled for the event or events
23 that the etype argument specifies. A positive value for enable indicates that recording should be
24 enabled; a value of 0 for enable indicates that recording should be disabled.

25 Restrictions
26 Restrictions on the ompt_set_trace_ompt runtime entry point are as follows:
27 • The entry point must not return ompt_set_sometimes_paired.

28 Cross References
29 • Callbacks, see Section 19.4.2
30 • Tracing Activity on Target Devices with OMPT, see Section 19.2.5
31 • ompt_device_t, see Section 19.4.4.5
32 • ompt_set_result_t, see Section 19.4.4.2
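The interplay of the enable and etype arguments can be sketched as follows (non-normative). The helper trace_all_but_one enables recording for all events by passing 0 for etype and then disables a single event. The helper, the mock entry point and the event numbering 1..8 are hypothetical; the ompt_set_result_t enumerators are reproduced as stand-ins for the definitions in Section 19.4.4.2 and should be taken from omp-tools.h in a real tool.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins mirroring omp-tools.h so that the sketch compiles on its own. */
typedef void ompt_device_t;
typedef enum ompt_set_result_t {
    ompt_set_error            = 0,
    ompt_set_never            = 1,
    ompt_set_impossible       = 2,
    ompt_set_sometimes        = 3,
    ompt_set_sometimes_paired = 4,
    ompt_set_always           = 5
} ompt_set_result_t;

typedef ompt_set_result_t (*ompt_set_trace_ompt_t)(ompt_device_t *device,
                                                   unsigned int enable,
                                                   unsigned int etype);

/* Hypothetical helper: enable recording for every event (etype == 0),
   then switch one event off again. Returns the result of the first call. */
ompt_set_result_t trace_all_but_one(ompt_device_t *device,
                                    ompt_set_trace_ompt_t set_trace_ompt,
                                    unsigned int excluded_event)
{
    ompt_set_result_t all = set_trace_ompt(device, 1, 0);
    if (excluded_event != 0)
        (void)set_trace_ompt(device, 0, excluded_event);
    return all;
}

/* Mock entry point: one enable flag per event number 1..MOCK_EVENTS. */
enum { MOCK_EVENTS = 8 };
unsigned char mock_enabled[MOCK_EVENTS + 1];

ompt_set_result_t mock_set_trace_ompt(ompt_device_t *device,
                                      unsigned int enable,
                                      unsigned int etype)
{
    (void)device;
    if (etype > MOCK_EVENTS)
        return ompt_set_error;
    if (etype == 0) {
        /* etype 0 applies the request to all events */
        for (unsigned int e = 1; e <= MOCK_EVENTS; ++e)
            mock_enabled[e] = (enable != 0);
    } else {
        mock_enabled[etype] = (enable != 0);
    }
    return ompt_set_always;
}
```

A real implementation may report a weaker result than ompt_set_always, so a tool would normally inspect the returned value rather than assume that every requested event is recorded.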



1 19.6.2.5 ompt_set_trace_native_t
2 Summary
3 The ompt_set_trace_native_t type is the type signature of the
4 ompt_set_trace_native runtime entry point, which enables or disables the recording of
5 native trace records for a device.

6 Format
C / C++
typedef ompt_set_result_t (*ompt_set_trace_native_t) (
  ompt_device_t *device,
  int enable,
  int flags
);
C / C++
12 Semantics
13 This interface is designed for use by a tool that cannot directly use native control functions for the
14 device. If a tool can directly use the native control functions then it can invoke native control
15 functions directly using pointers that the lookup function associated with the device provides and
16 that are described in the documentation string that is provided to the device initializer callback.

17 Description of Arguments
18 The device argument points to an opaque object that represents the target device instance. Functions
19 in the device tracing interface use this pointer to identify the device that is being addressed.
20 The enable argument indicates whether this invocation should enable or disable recording of events.
21 The flags argument specifies the kinds of native device monitoring to enable or to disable. Each
kind of monitoring is specified by a flag bit. Flags can be composed by using bitwise OR to combine
enumeration values from type ompt_native_mon_flag_t.

24 Restrictions
25 Restrictions on the ompt_set_trace_native runtime entry point are as follows:
26 • The entry point must not return ompt_set_sometimes_paired.

27 Cross References
28 • Tracing Activity on Target Devices with OMPT, see Section 19.2.5
29 • ompt_device_t, see Section 19.4.4.5
30 • ompt_native_mon_flag_t, see Section 19.4.4.18
31 • ompt_set_result_t, see Section 19.4.4.2



1 19.6.2.6 ompt_start_trace_t
2 Summary
3 The ompt_start_trace_t type is the type signature of the ompt_start_trace runtime
4 entry point, which starts tracing of activity on a specific device.

5 Format
C / C++
typedef int (*ompt_start_trace_t) (
  ompt_device_t *device,
  ompt_callback_buffer_request_t request,
  ompt_callback_buffer_complete_t complete
);
C / C++
11 Semantics
12 A device’s ompt_start_trace runtime entry point, which has type signature
13 ompt_start_trace_t, initiates tracing on the device. Under normal operating conditions,
14 every event buffer provided to a device by a tool callback is returned to the tool before the OpenMP
15 runtime shuts down. If an exceptional condition terminates execution of an OpenMP program, the
16 OpenMP runtime may not return buffers provided to the device. An invocation of
17 ompt_start_trace returns 1 if the command succeeds and 0 otherwise.

18 Description of Arguments
19 The device argument points to an opaque object that represents the target device instance. Functions
20 in the device tracing interface use this pointer to identify the device that is being addressed.
21 The request argument specifies a tool callback that supplies a buffer in which a device can deposit
22 events.
23 The complete argument specifies a tool callback that is invoked by the OpenMP implementation to
24 empty a buffer that contains event records.

25 Cross References
26 • ompt_callback_buffer_complete_t, see Section 19.5.2.24
27 • ompt_callback_buffer_request_t, see Section 19.5.2.23
28 • ompt_device_t, see Section 19.4.4.5

29 19.6.2.7 ompt_pause_trace_t
30 Summary
31 The ompt_pause_trace_t type is the type signature of the ompt_pause_trace runtime
32 entry point, which pauses or restarts activity tracing on a specific device.



1 Format
C / C++
typedef int (*ompt_pause_trace_t) (
  ompt_device_t *device,
  int begin_pause
);
C / C++
6 Semantics
7 A device’s ompt_pause_trace runtime entry point, which has type signature
8 ompt_pause_trace_t, pauses or resumes tracing on a device. An invocation of
9 ompt_pause_trace returns 1 if the command succeeds and 0 otherwise. Redundant pause or
10 resume commands are idempotent and will return the same value as the prior command.

11 Description of Arguments
12 The device argument points to an opaque object that represents the target device instance. Functions
13 in the device tracing interface use this pointer to identify the device that is being addressed.
14 The begin_pause argument indicates whether to pause or to resume tracing. To resume tracing,
15 zero should be supplied for begin_pause; to pause tracing, any other value should be supplied.

16 Cross References
17 • ompt_device_t, see Section 19.4.4.5

18 19.6.2.8 ompt_flush_trace_t
19 Summary
20 The ompt_flush_trace_t type is the type signature of the ompt_flush_trace runtime
21 entry point, which causes all pending trace records for the specified device to be delivered.
22 Format
C / C++
typedef int (*ompt_flush_trace_t) (
  ompt_device_t *device
);
C / C++
26 Semantics
27 A device’s ompt_flush_trace runtime entry point, which has type signature
28 ompt_flush_trace_t, causes the OpenMP implementation to issue a sequence of zero or more
29 buffer completion callbacks to deliver all trace records that have been collected prior to the flush.
30 An invocation of ompt_flush_trace returns 1 if the command succeeds and 0 otherwise.
31 Description of Arguments
32 The device argument points to an opaque object that represents the target device instance. Functions
33 in the device tracing interface use this pointer to identify the device that is being addressed.



1 Cross References
2 • ompt_device_t, see Section 19.4.4.5

3 19.6.2.9 ompt_stop_trace_t
4 Summary
5 The ompt_stop_trace_t type is the type signature of the ompt_stop_trace runtime entry
6 point, which stops tracing for a device.
7 Format
C / C++
typedef int (*ompt_stop_trace_t) (
  ompt_device_t *device
);
C / C++
11 Semantics
12 A device’s ompt_stop_trace runtime entry point, which has type signature
13 ompt_stop_trace_t, halts tracing on the device and requests that any pending trace records be
14 flushed. An invocation of ompt_stop_trace returns 1 if the command succeeds and 0
15 otherwise.
16 Description of Arguments
17 The device argument points to an opaque object that represents the target device instance. Functions
18 in the device tracing interface use this pointer to identify the device that is being addressed.

19 Cross References
20 • ompt_device_t, see Section 19.4.4.5
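The trace control entry points described in Sections 19.6.2.6 through 19.6.2.9 are typically used as one sequence. The following non-normative sketch walks through start, pause, resume, flush and stop, and checks that each call returns 1. The trace_api structure, run_trace_session and the mock_ functions are hypothetical; the typedefs are stand-ins for omp-tools.h, and the two buffer callback signatures are reproduced from Sections 19.5.2.23 and 19.5.2.24 on the assumption that they match the installed header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins mirroring omp-tools.h so that the sketch compiles on its own. */
typedef void ompt_device_t;
typedef void ompt_buffer_t;
typedef uint64_t ompt_buffer_cursor_t;
typedef void (*ompt_callback_buffer_request_t)(int device_num,
                                               ompt_buffer_t **buffer,
                                               size_t *bytes);
typedef void (*ompt_callback_buffer_complete_t)(int device_num,
                                                ompt_buffer_t *buffer,
                                                size_t bytes,
                                                ompt_buffer_cursor_t begin,
                                                int buffer_owned);
typedef int (*ompt_start_trace_t)(ompt_device_t *device,
                                  ompt_callback_buffer_request_t request,
                                  ompt_callback_buffer_complete_t complete);
typedef int (*ompt_pause_trace_t)(ompt_device_t *device, int begin_pause);
typedef int (*ompt_flush_trace_t)(ompt_device_t *device);
typedef int (*ompt_stop_trace_t)(ompt_device_t *device);

/* Hypothetical bundle of entry points that a tool obtained via lookup. */
typedef struct trace_api {
    ompt_start_trace_t start;
    ompt_pause_trace_t pause;
    ompt_flush_trace_t flush;
    ompt_stop_trace_t  stop;
} trace_api;

/* Runs the five steps in order and returns how many of them succeeded
   (5 when every entry point returned 1). */
int run_trace_session(const trace_api *api, ompt_device_t *device,
                      ompt_callback_buffer_request_t request,
                      ompt_callback_buffer_complete_t complete)
{
    int steps = 0;
    if (api->start(device, request, complete) != 1) return steps;
    ++steps;
    if (api->pause(device, 1) != 1) return steps;  /* nonzero: pause  */
    ++steps;
    if (api->pause(device, 0) != 1) return steps;  /* zero:    resume */
    ++steps;
    if (api->flush(device) != 1) return steps;
    ++steps;
    if (api->stop(device) != 1) return steps;
    ++steps;
    return steps;
}

/* Mock device state and entry points used only to exercise the sketch. */
typedef struct mock_device { int tracing; int paused; int flushes; } mock_device;

int mock_start(ompt_device_t *d, ompt_callback_buffer_request_t rq,
               ompt_callback_buffer_complete_t cp)
{
    mock_device *m = d;
    (void)rq; (void)cp;
    if (m->tracing) return 0;
    m->tracing = 1;
    m->paused = 0;
    return 1;
}

int mock_pause(ompt_device_t *d, int begin_pause)
{
    mock_device *m = d;
    if (!m->tracing) return 0;
    m->paused = (begin_pause != 0);
    return 1;
}

int mock_flush(ompt_device_t *d)
{
    mock_device *m = d;
    if (!m->tracing) return 0;
    m->flushes++;
    return 1;
}

int mock_stop(ompt_device_t *d)
{
    mock_device *m = d;
    if (!m->tracing) return 0;
    m->tracing = 0;
    return 1;
}
```

The sketch stops at the first failing step for brevity; a production tool would more likely attempt to stop tracing on any failure after a successful start so that outstanding buffers are returned.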

21 19.6.2.10 ompt_advance_buffer_cursor_t
22 Summary
23 The ompt_advance_buffer_cursor_t type is the type signature of the
24 ompt_advance_buffer_cursor runtime entry point, which advances a trace buffer cursor to
25 the next record.

26 Format
C / C++
typedef int (*ompt_advance_buffer_cursor_t) (
  ompt_device_t *device,
  ompt_buffer_t *buffer,
  size_t size,
  ompt_buffer_cursor_t current,
  ompt_buffer_cursor_t *next
);
C / C++



1 Semantics
2 A device’s ompt_advance_buffer_cursor runtime entry point, which has type signature
3 ompt_advance_buffer_cursor_t, advances a trace buffer pointer to the next trace record.
4 An invocation of ompt_advance_buffer_cursor returns true if the advance is successful
5 and the next position in the buffer is valid.

6 Description of Arguments
7 The device argument points to an opaque object that represents the target device instance. Functions
8 in the device tracing interface use this pointer to identify the device that is being addressed.
9 The buffer argument indicates a trace buffer that is associated with the cursors.
The size argument indicates the size of buffer in bytes.
11 The current argument is an opaque buffer cursor.
12 The next argument returns the next value of an opaque buffer cursor.

13 Cross References
14 • ompt_buffer_cursor_t, see Section 19.4.4.8
15 • ompt_device_t, see Section 19.4.4.5

16 19.6.2.11 ompt_get_record_type_t
17 Summary
18 The ompt_get_record_type_t type is the type signature of the
19 ompt_get_record_type runtime entry point, which inspects the type of a trace record.
20 Format
C / C++
typedef ompt_record_t (*ompt_get_record_type_t) (
  ompt_buffer_t *buffer,
  ompt_buffer_cursor_t current
);
C / C++
25 Semantics
26 Trace records for a device may be in one of two forms: native record format, which may be
27 device-specific, or OMPT record format, in which each trace record corresponds to an OpenMP
28 event and most fields in the record structure are the arguments that would be passed to the OMPT
29 callback for the event. A device’s ompt_get_record_type runtime entry point, which has
30 type signature ompt_get_record_type_t, inspects the type of a trace record and indicates
31 whether the record at the current position in the trace buffer is an OMPT record, a native record, or
32 an invalid record. An invalid record type is returned if the cursor is out of bounds.



1 Description of Arguments
2 The buffer argument indicates a trace buffer.
3 The current argument is an opaque buffer cursor.
4 Cross References
5 • Record Type, see Section 19.4.3.1
6 • ompt_buffer_cursor_t, see Section 19.4.4.8
7 • ompt_buffer_t, see Section 19.4.4.7
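A tool typically combines ompt_get_record_type with ompt_advance_buffer_cursor to walk a completed trace buffer. The following non-normative sketch counts OMPT and native records starting from a given cursor. The helper count_records, the mock_buffer layout and the mock_ entry points are hypothetical; the typedefs and the ompt_record_t enumerators are stand-ins for the declarations in omp-tools.h (see Section 19.4.3.1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins mirroring omp-tools.h so that the sketch compiles on its own. */
typedef void ompt_device_t;
typedef void ompt_buffer_t;
typedef uint64_t ompt_buffer_cursor_t;
typedef enum ompt_record_t {
    ompt_record_ompt    = 1,
    ompt_record_native  = 2,
    ompt_record_invalid = 3
} ompt_record_t;

typedef int (*ompt_advance_buffer_cursor_t)(ompt_device_t *device,
                                            ompt_buffer_t *buffer,
                                            size_t size,
                                            ompt_buffer_cursor_t current,
                                            ompt_buffer_cursor_t *next);
typedef ompt_record_t (*ompt_get_record_type_t)(ompt_buffer_t *buffer,
                                                ompt_buffer_cursor_t current);

typedef struct record_counts { int ompt; int native; } record_counts;

/* Hypothetical helper: classify every record reachable from 'begin'. */
record_counts count_records(ompt_device_t *device, ompt_buffer_t *buffer,
                            size_t bytes, ompt_buffer_cursor_t begin,
                            ompt_advance_buffer_cursor_t advance,
                            ompt_get_record_type_t get_type)
{
    record_counts n = {0, 0};
    ompt_buffer_cursor_t cur = begin;
    int more = 1;
    while (more) {
        ompt_record_t kind = get_type(buffer, cur);
        if (kind == ompt_record_invalid)
            break;                       /* cursor is out of bounds */
        if (kind == ompt_record_ompt)
            n.ompt++;
        else if (kind == ompt_record_native)
            n.native++;
        ompt_buffer_cursor_t next = cur;
        more = advance(device, buffer, bytes, cur, &next);
        cur = next;
    }
    return n;
}

/* Mock buffer: one byte per record, cursor is simply the record index. */
typedef struct mock_buffer { size_t count; unsigned char kind[4]; } mock_buffer;

ompt_record_t mock_get_type(ompt_buffer_t *buffer, ompt_buffer_cursor_t current)
{
    mock_buffer *b = buffer;
    if (current >= b->count)
        return ompt_record_invalid;
    return (ompt_record_t)b->kind[current];
}

int mock_advance(ompt_device_t *device, ompt_buffer_t *buffer, size_t size,
                 ompt_buffer_cursor_t current, ompt_buffer_cursor_t *next)
{
    mock_buffer *b = buffer;
    (void)device; (void)size;
    *next = current + 1;
    return *next < b->count;   /* nonzero while the next position is valid */
}
```

In a real tool this loop would usually run inside the buffer completion callback, with the cursor, buffer and size taken from the callback arguments. Cursors are opaque, so only the mock treats them as indices.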

8 19.6.2.12 ompt_get_record_ompt_t
9 Summary
10 The ompt_get_record_ompt_t type is the type signature of the
11 ompt_get_record_ompt runtime entry point, which obtains a pointer to an OMPT trace
12 record from a trace buffer associated with a device.
13 Format
C / C++
typedef ompt_record_ompt_t *(*ompt_get_record_ompt_t) (
  ompt_buffer_t *buffer,
  ompt_buffer_cursor_t current
);
C / C++
18 Semantics
19 A device’s ompt_get_record_ompt runtime entry point, which has type signature
ompt_get_record_ompt_t, returns a pointer that may point either to a record in the trace buffer or
to a record in thread-local storage in which the information extracted from a record was
assembled. The information available for an event depends upon its type. The return value of the
23 ompt_record_ompt_t type includes a field of a union type that can represent information for
24 any OMPT event record type. Another call to the runtime entry point may overwrite the contents of
25 the fields in a record returned by a prior invocation.

26 Description of Arguments
27 The buffer argument indicates a trace buffer.
28 The current argument is an opaque buffer cursor.

29 Cross References
30 • Standard Trace Record Type, see Section 19.4.3.4
31 • ompt_buffer_cursor_t, see Section 19.4.4.8
32 • ompt_device_t, see Section 19.4.4.5



1 19.6.2.13 ompt_get_record_native_t
2 Summary
3 The ompt_get_record_native_t type is the type signature of the
4 ompt_get_record_native runtime entry point, which obtains a pointer to a native trace
5 record from a trace buffer associated with a device.

6 Format
C / C++
typedef void *(*ompt_get_record_native_t) (
  ompt_buffer_t *buffer,
  ompt_buffer_cursor_t current,
  ompt_id_t *host_op_id
);
C / C++
12 Semantics
13 A device’s ompt_get_record_native runtime entry point, which has type signature
14 ompt_get_record_native_t, returns a pointer that may point into the specified trace buffer,
15 or into thread-local storage in which the information extracted from a trace record was assembled.
16 The information available for a native event depends upon its type. If the function returns a non-null
17 result, it will also set the object to which host_op_id points to a host-side identifier for the
18 operation that is associated with the record. A subsequent call to ompt_get_record_native
19 may overwrite the contents of the fields in a record returned by a prior invocation.

20 Description of Arguments
21 The buffer argument indicates a trace buffer.
22 The current argument is an opaque buffer cursor.
The host_op_id argument points to an identifier that the entry point sets. On return, the
identifier holds the host-side identifier that was created when the host initiated the associated
operation on the target device.

26 Cross References
27 • ompt_buffer_cursor_t, see Section 19.4.4.8
28 • ompt_buffer_t, see Section 19.4.4.7
29 • ompt_id_t, see Section 19.4.4.3

30 19.6.2.14 ompt_get_record_abstract_t
31 Summary
32 The ompt_get_record_abstract_t type is the type signature of the
33 ompt_get_record_abstract runtime entry point, which summarizes the context of a native
34 (device-specific) trace record.



1 Format
C / C++
typedef ompt_record_abstract_t *(*ompt_get_record_abstract_t) (
  void *native_record
);
C / C++
5 Semantics
6 An OpenMP implementation may execute on a device that logs trace records in a native
7 (device-specific) format that a tool cannot interpret directly. The
8 ompt_get_record_abstract runtime entry point of a device, which has type signature
9 ompt_get_record_abstract_t, translates a native trace record into a standard form.

10 Description of Arguments
11 The native_record argument is a pointer to a native trace record.

12 Cross References
13 • Native Record Abstract Type, see Section 19.4.3.3

14 19.6.3 Lookup Entry Points: ompt_function_lookup_t


15 Summary
16 The ompt_function_lookup_t type is the type signature of the lookup runtime entry points
17 that provide pointers to runtime entry points that are part of the OMPT interface.

18 Format
C / C++
typedef void (*ompt_interface_fn_t) (void);

typedef ompt_interface_fn_t (*ompt_function_lookup_t) (
  const char *interface_function_name
);
C / C++
24 Semantics
25 An OpenMP implementation provides pointers to lookup routines that provide pointers to OMPT
26 runtime entry points. When the implementation invokes a tool initializer to configure the OMPT
27 callback interface, it provides a lookup function that provides pointers to runtime entry points that
28 implement routines that are part of the OMPT callback interface. Alternatively, when it invokes a
29 tool initializer to configure the OMPT tracing interface for a device, it provides a lookup function
30 that provides pointers to runtime entry points that implement tracing control routines appropriate
31 for that device.



1 If the provided function name is unknown to the OpenMP implementation, the function returns
2 NULL. In a compliant implementation, the lookup function provided by the tool initializer for the
3 OMPT callback interface returns a valid function pointer for any OMPT runtime entry point name
4 listed in Table 19.1.
5 A compliant implementation of a lookup function passed to a tool’s
6 ompt_device_initialize callback must provide non-NULL function pointers for all strings
7 in Table 19.4, except for ompt_set_trace_ompt and ompt_get_record_ompt, as
8 described in Section 19.2.5.

9 Description of Arguments
10 The interface_function_name argument is a C string that represents the name of a runtime entry
11 point.

12 Cross References
13 • Entry Points in the OMPT Callback Interface, see Section 19.6.1
14 • Entry Points in the OMPT Device Tracing Interface, see Section 19.6.2
15 • Tracing Activity on Target Devices with OMPT, see Section 19.2.5
16 • ompt_initialize_t, see Section 19.5.1.1
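The lookup mechanism can be sketched as follows (non-normative). A tool passes the name of an entry point as a C string, receives a generic ompt_interface_fn_t pointer, and casts it to the specific type signature before calling it. The helper find_flush_trace and the mock_ functions are hypothetical, and the typedefs are stand-ins for omp-tools.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins mirroring omp-tools.h so that the sketch compiles on its own. */
typedef void ompt_device_t;
typedef void (*ompt_interface_fn_t)(void);
typedef ompt_interface_fn_t (*ompt_function_lookup_t)(
    const char *interface_function_name);
typedef int (*ompt_flush_trace_t)(ompt_device_t *device);

/* Hypothetical helper: resolve one entry point by name. The result is
   NULL if the name is unknown to the implementation. */
ompt_flush_trace_t find_flush_trace(ompt_function_lookup_t lookup)
{
    return (ompt_flush_trace_t)lookup("ompt_flush_trace");
}

/* Mock entry point and mock lookup function, for demonstration only. */
int mock_flush_trace(ompt_device_t *device)
{
    (void)device;
    return 1;
}

ompt_interface_fn_t mock_lookup(const char *name)
{
    if (strcmp(name, "ompt_flush_trace") == 0)
        return (ompt_interface_fn_t)mock_flush_trace;
    return NULL;   /* unknown name */
}
```

Because a lookup function may return NULL, a tool would check each resolved pointer before use, in particular for ompt_set_trace_ompt and ompt_get_record_ompt, which a device lookup function is not required to provide.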



1 20 OMPD Interface
2 This chapter describes OMPD, which is an interface for third-party tools. Third-party tools exist in
3 separate processes from the OpenMP program. To provide OMPD support, an OpenMP
4 implementation must provide an OMPD library that the third-party tool can load. An OpenMP
5 implementation does not need to maintain any extra information to support OMPD inquiries from
6 third-party tools unless it is explicitly instructed to do so.
7 OMPD allows third-party tools such as debuggers to inspect the OpenMP state of a live program or
8 core file in an implementation-agnostic manner. That is, a third-party tool that uses OMPD should
9 work with any conforming OpenMP implementation. An OpenMP implementer provides a library
10 for OMPD that a third-party tool can dynamically load. The third-party tool can use the interface
11 exported by the OMPD library to inspect the OpenMP state of a program. In order to satisfy
12 requests from the third-party tool, the OMPD library may need to read data from the OpenMP
13 program, or to find the addresses of symbols in it. The OMPD library provides this functionality
14 through a callback interface that the third-party tool must instantiate for the OMPD library.
15 To use OMPD, the third-party tool loads the OMPD library. The OMPD library exports the API
16 that is defined throughout this section, and the third-party tool uses the API to determine OpenMP
17 information about the OpenMP program. The OMPD library must look up the symbols and read
18 data out of the program. It does not perform these operations directly but instead directs the third-
19 party tool to perform them by using the callback interface that the third-party tool exports.
20 The OMPD design insulates third-party tools from the internal structure of the OpenMP runtime,
21 while the OMPD library is insulated from the details of how to access the OpenMP program. This
22 decoupled design allows for flexibility in how the OpenMP program and third-party tool are
23 deployed, so that, for example, the third-party tool and the OpenMP program are not required to
24 execute on the same machine.
25 Generally, the third-party tool does not interact directly with the OpenMP runtime but instead
26 interacts with the runtime through the OMPD library. However, a few cases require the third-party
27 tool to access the OpenMP runtime directly. These cases fall into two broad categories. The first is
28 during initialization where the third-party tool must look up symbols and read variables in the
29 OpenMP runtime in order to identify the OMPD library that it should use, which is discussed in
30 Section 20.2.2 and Section 20.2.3. The second category relates to arranging for the third-party tool
31 to be notified when certain events occur during the execution of the OpenMP program. For this
32 purpose, the OpenMP implementation must define certain symbols in the runtime code, as is
33 discussed in Section 20.6. Each of these symbols corresponds to an event type. The OpenMP
34 runtime must ensure that control passes through the appropriate named location when events occur.
If the third-party tool requires notification of an event, it can plant a breakpoint at the matching
location. The location may, but need not, be a function; it can, for example, simply be a label.
However, the names of the locations must have external C linkage.

3 20.1 OMPD Interfaces Definitions


C / C++
4 A compliant implementation must supply a set of definitions for the OMPD runtime entry points,
5 OMPD third-party tool callback signatures, third-party tool interface functions and the special data
6 types of their parameters and return values. These definitions, which are listed throughout this
7 chapter, and their associated declarations shall be provided in a header file named omp-tools.h.
8 In addition, the set of definitions may specify other implementation-specific values.
9 The ompd_dll_locations variable, all OMPD third-party tool interface functions, and all
10 OMPD runtime entry points are external symbols with C linkage.
C / C++

11 20.2 Activating a Third-Party Tool


12 The third-party tool and the OpenMP program exist as separate processes. Thus, coordination is
13 required between the OpenMP runtime and the third-party tool for OMPD.

14 20.2.1 Enabling Runtime Support for OMPD


15 In order to support third-party tools, the OpenMP runtime may need to collect and to store
16 information that it may not otherwise maintain. The OpenMP runtime collects whatever
17 information is necessary to support OMPD if the environment variable OMP_DEBUG is set to
18 enabled.

19 Cross References
20 • OMP_DEBUG, see Section 21.4.1

21 20.2.2 ompd_dll_locations
22 Summary
23 The ompd_dll_locations global variable points to the locations of OMPD libraries that are
24 compatible with the OpenMP implementation.

25 Format
C
extern const char **ompd_dll_locations;
C



1 Semantics
2 An OpenMP runtime may have more than one OMPD library. The third-party tool must be able to
3 locate the right library to use for the OpenMP program that it is examining. The OpenMP runtime
4 system must provide a public variable ompd_dll_locations, which is an argv-style vector of
5 pathname string pointers that provides the names of any compatible OMPD libraries. This variable
6 must have C linkage. The third-party tool uses the name of the variable verbatim and, in particular,
7 does not apply any name mangling before performing the look up.
8 The architecture on which the third-party tool and, thus, the OMPD library execute does not have to
9 match the architecture on which the OpenMP program that is being examined executes. The
10 third-party tool must interpret the contents of ompd_dll_locations to find a suitable OMPD
11 library that matches its own architectural characteristics. On platforms that support different
12 architectures (for example, 32-bit vs 64-bit), OpenMP implementations are encouraged to provide
13 an OMPD library for each supported architecture that can handle OpenMP programs that run on
14 any supported architecture. Thus, for example, a 32-bit debugger that uses OMPD should be able to
15 debug a 64-bit OpenMP program by loading a 32-bit OMPD implementation that can manage a
16 64-bit OpenMP runtime.
17 The ompd_dll_locations variable points to a NULL-terminated vector of zero or more
18 null-terminated pathname strings that do not have any filename conventions. This vector must be
19 fully initialized before ompd_dll_locations is set to a non-null value. Thus, if a third-party
20 tool, such as a debugger, stops execution of the OpenMP program at any point at which
21 ompd_dll_locations is non-null, the vector of strings to which it points shall be valid and
22 complete.

23 Cross References
24 • ompd_dll_locations_valid, see Section 20.2.3
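The argv-style layout of ompd_dll_locations can be traversed as in the following non-normative sketch. It assumes that the tool has already copied the vector and its strings out of the OpenMP program into its own memory, since the variable lives in the address space of the program that is being examined. The helper select_ompd_library, the library_probe_t predicate and the path names are hypothetical; a real probe might attempt to load the library and check its architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate supplied by the tool: returns nonzero if the
   library at 'path' is usable by this tool (for example, it can be loaded
   and matches the tool's own architecture). */
typedef int (*library_probe_t)(const char *path);

/* Return the first usable pathname from a NULL-terminated vector of
   pathname strings, or NULL if the vector is not yet valid or no entry
   is usable. */
const char *select_ompd_library(const char **locations, library_probe_t probe)
{
    if (locations == NULL)          /* not yet initialized by the runtime */
        return NULL;
    for (size_t i = 0; locations[i] != NULL; ++i) {
        if (probe(locations[i]))
            return locations[i];
    }
    return NULL;
}

/* Mock probe: accepts exactly one made-up pathname. */
int mock_probe(const char *path)
{
    return strcmp(path, "/opt/omp/libompd64.so") == 0;
}
```

The pathnames carry no filename conventions, so the sketch deliberately avoids guessing suitability from the name and leaves that decision to the probe.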

25 20.2.3 ompd_dll_locations_valid
26 Summary
27 The OpenMP runtime notifies third-party tools that ompd_dll_locations is valid by allowing
28 execution to pass through a location that the symbol ompd_dll_locations_valid identifies.
29 Format
C
void ompd_dll_locations_valid(void);
C
31 Semantics
32 Since ompd_dll_locations may not be a static variable, it may require runtime initialization.
33 The OpenMP runtime notifies third-party tools that ompd_dll_locations is valid by having
34 execution pass through a location that the symbol ompd_dll_locations_valid identifies. If
35 ompd_dll_locations is NULL, a third-party tool can place a breakpoint at
36 ompd_dll_locations_valid to be notified that ompd_dll_locations is initialized. In
37 practice, the symbol ompd_dll_locations_valid may not be a function; instead, it may be a
38 labeled machine instruction through which execution passes once the vector is valid.



1 20.3 OMPD Data Types
2 This section defines OMPD data types.

3 20.3.1 Size Type


4 Summary
5 The ompd_size_t type specifies the number of bytes in opaque data objects that are passed
6 across the OMPD API.
7 Format
C / C++
typedef uint64_t ompd_size_t;
C / C++

9 20.3.2 Wait ID Type


10 Summary
11 A variable of ompd_wait_id_t type identifies the object on which a thread waits.
12 Format
C / C++
typedef uint64_t ompd_wait_id_t;
C / C++
14 Semantics
15 The values and meaning of ompd_wait_id_t are the same as those defined for the
16 ompt_wait_id_t type.

17 Cross References
18 • ompt_wait_id_t, see Section 19.4.4.31

19 20.3.3 Basic Value Types


20 Summary
21 These definitions represent word, address, and segment value types.
22 Format
C / C++
typedef uint64_t ompd_addr_t;
typedef int64_t ompd_word_t;
typedef uint64_t ompd_seg_t;
C / C++



1 Semantics
2 The ompd_addr_t type represents an address in an OpenMP process with an unsigned integer type.
3 The ompd_word_t type represents a data word from the OpenMP runtime with a signed integer
4 type. The ompd_seg_t type represents a segment value with an unsigned integer type.

5 20.3.4 Address Type


6 Summary
7 The ompd_address_t type is used to specify device addresses.
8 Format
C / C++
typedef struct ompd_address_t {
  ompd_seg_t segment;
  ompd_addr_t address;
} ompd_address_t;
C / C++
13 Semantics
14 The ompd_address_t type is a structure that OMPD uses to specify device addresses, which
15 may or may not be segmented. For non-segmented architectures, ompd_segment_none is used
16 in the segment field of ompd_address_t; it is an instance of the ompd_seg_t type that has the
17 value 0.

18 Cross References
19 • Basic Value Types, see Section 20.3.3
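For a non-segmented architecture, an ompd_address_t value can be built as in the following non-normative sketch. The helpers make_flat_address and address_add are hypothetical. The typedefs repeat the Format above, and ompd_segment_none is defined locally as a stand-in with the value 0; a real tool would take all of these from omp-tools.h.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins mirroring omp-tools.h so that the sketch compiles on its own. */
typedef uint64_t ompd_addr_t;
typedef uint64_t ompd_seg_t;
typedef struct ompd_address_t {
    ompd_seg_t segment;
    ompd_addr_t address;
} ompd_address_t;

#define ompd_segment_none ((ompd_seg_t)0)   /* stand-in for the header value */

/* Hypothetical helper: wrap a flat address for a non-segmented target. */
ompd_address_t make_flat_address(ompd_addr_t addr)
{
    ompd_address_t a;
    a.segment = ompd_segment_none;
    a.address = addr;
    return a;
}

/* Hypothetical helper: offset an address within the same segment. */
ompd_address_t address_add(ompd_address_t base, ompd_addr_t offset)
{
    base.address += offset;
    return base;
}
```

For example, address_add(make_flat_address(0x1000), 0x20) yields segment 0 and address 0x1020.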

20 20.3.5 Frame Information Type


21 Summary
22 The ompd_frame_info_t type is used to specify frame information.
23 Format
C / C++
typedef struct ompd_frame_info_t {
  ompd_address_t frame_address;
  ompd_word_t frame_flag;
} ompd_frame_info_t;
C / C++
28 Semantics
29 The ompd_frame_info_t type is a structure that OMPD uses to specify frame information.
30 The frame_address field of ompd_frame_info_t identifies a frame. The frame_flag field of
ompd_frame_info_t indicates what type of information is provided in frame_address. The
values and their meanings are the same as those defined for the ompt_frame_flag_t enumeration type.



1 Cross References
2 • Address Type, see Section 20.3.4
3 • Basic Value Types, see Section 20.3.3
4 • ompt_frame_flag_t, see Section 19.4.4.30

5 20.3.6 System Device Identifiers


6 Summary
7 The ompd_device_t type provides information about OpenMP devices.
8 Format
C / C++
typedef uint64_t ompd_device_t;
C / C++
10 Semantics
11 OpenMP runtimes may utilize different underlying devices, each represented by a device identifier.
12 The device identifiers can vary in size and format and, thus, are not explicitly represented in the
13 OMPD interface. Instead, a device identifier is passed across the interface via its
14 ompd_device_t kind, its size in bytes and a pointer to where it is stored. The OMPD library and
15 the third-party tool use the ompd_device_t kind to interpret the format of the device identifier
16 that is referenced by the pointer argument. Each different device identifier kind is represented by a
17 unique unsigned 64-bit integer value. Recommended values of ompd_device_t kinds are
defined in the ompd-types.h header file, which is available on http://www.openmp.org/.

19 20.3.7 Native Thread Identifiers


20 Summary
21 The ompd_thread_id_t type provides information about native threads.
22 Format
C / C++
typedef uint64_t ompd_thread_id_t;
C / C++
24 Semantics
25 OpenMP runtimes may use different native thread implementations. Native thread identifiers for
26 these implementations can vary in size and format and, thus, are not explicitly represented in the
27 OMPD interface. Instead, a native thread identifier is passed across the interface via its
28 ompd_thread_id_t kind, its size in bytes and a pointer to where it is stored. The OMPD
29 library and the third-party tool use the ompd_thread_id_t kind to interpret the format of the
30 native thread identifier that is referenced by the pointer argument. Each different native thread
31 identifier kind is represented by a unique unsigned 64-bit integer value. Recommended values of
32 ompd_thread_id_t kinds, and formats for some corresponding native thread identifiers, are
defined in the ompd-types.h header file, which is available on http://www.openmp.org/.



1 20.3.8 OMPD Handle Types
2 Summary
3 The OMPD library defines handles for referring to address spaces, threads, parallel regions and
4 tasks that are managed by the OpenMP runtime. The internal structures that these handles represent
5 are opaque to the third-party tool.
6 Format
C / C++
typedef struct _ompd_aspace_handle ompd_address_space_handle_t;
typedef struct _ompd_thread_handle ompd_thread_handle_t;
typedef struct _ompd_parallel_handle ompd_parallel_handle_t;
typedef struct _ompd_task_handle ompd_task_handle_t;
C / C++
11 Semantics
12 OMPD uses handles for the following entities that are managed by the OpenMP runtime: address
13 spaces (ompd_address_space_handle_t), threads (ompd_thread_handle_t), parallel
14 regions (ompd_parallel_handle_t), and tasks (ompd_task_handle_t). Each operation
15 of the OMPD interface that applies to a particular address space, thread, parallel region or task
16 must explicitly specify a corresponding handle. Handles are defined by the OMPD library and are
17 opaque to the third-party tool. A handle remains constant and valid while the associated entity is
18 managed by the OpenMP runtime or until it is released with the corresponding third-party tool
19 interface routine for releasing handles of that type. If a tool receives notification of the end of the
20 lifetime of a managed entity (see Section 20.6) or it releases the handle, the handle may no longer
21 be referenced.
22 Defining externally visible type names in this way introduces type safety to the interface, and helps
23 to catch instances where incorrect handles are passed by the third-party tool to the OMPD library.
24 The structures do not need to be defined; instead, the OMPD library must cast incoming (pointers
25 to) handles to the appropriate internal, private types.
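
The opaque-handle pattern described above can be sketched in C as follows. This is only an illustration: the private structure example_private_thread, its fields, and the example_* functions are hypothetical names, and malloc/free stand in for the tool-provided memory callbacks that a real OMPD library would have to use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Tool-visible declaration, as in the Format above. The struct is never
 * defined in the public header, so the tool can only hold pointers to it. */
typedef struct _ompd_thread_handle ompd_thread_handle_t;

/* Hypothetical private layout, known only inside the OMPD library. */
typedef struct example_private_thread {
  uint64_t native_id; /* assumed field, for illustration only */
  int generation;     /* assumed field, for illustration only */
} example_private_thread;

/* Library side: build a private object and hand it out as an opaque handle. */
ompd_thread_handle_t *example_make_thread_handle(uint64_t native_id) {
  example_private_thread *p = malloc(sizeof *p);
  if (p == NULL)
    return NULL;
  p->native_id = native_id;
  p->generation = 1;
  return (ompd_thread_handle_t *)p;
}

/* Library side: cast the incoming handle back to the private type. */
uint64_t example_thread_handle_native_id(ompd_thread_handle_t *handle) {
  assert(handle != NULL);
  const example_private_thread *p = (const example_private_thread *)handle;
  return p->native_id;
}

/* Library side: release routine that the tool calls instead of free(). */
void example_release_thread_handle(ompd_thread_handle_t *handle) {
  free(handle);
}
```

Because each handle kind is a distinct incomplete type, passing a task handle where a thread handle is expected is diagnosed by the C compiler rather than discovered at run time.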

26 20.3.9 OMPD Scope Types


27 Summary
28 The ompd_scope_t type identifies OMPD scopes.
29 Format
C / C++
typedef enum ompd_scope_t {
  ompd_scope_global = 1,
  ompd_scope_address_space = 2,
  ompd_scope_thread = 3,
  ompd_scope_parallel = 4,
  ompd_scope_implicit_task = 5,
  ompd_scope_task = 6
} ompd_scope_t;
C / C++

CHAPTER 20. OMPD INTERFACE 545


1 Semantics
2 The ompd_scope_t type identifies OpenMP scopes, including those related to parallel regions
3 and tasks. When used in an OMPD interface function call, the scope type and the OMPD handle
4 must match according to Table 20.1.

TABLE 20.1: Mapping of Scope Type and OMPD Handles

Scope types                 Handles
ompd_scope_global           Address space handle for the host device
ompd_scope_address_space    Any address space handle
ompd_scope_thread           Any thread handle
ompd_scope_parallel         Any parallel region handle
ompd_scope_implicit_task    Task handle for an implicit task
ompd_scope_task             Any task handle

5 20.3.10 ICV ID Type


6 Summary
7 The ompd_icv_id_t type identifies an OpenMP implementation ICV.
8 Format
C / C++
typedef uint64_t ompd_icv_id_t;
C / C++
10 Semantics
11 The ompd_icv_id_t type identifies OpenMP implementation ICVs. ompd_icv_undefined
12 is an instance of this type with the value 0.

13 20.3.11 Tool Context Types


14 Summary
15 A third-party tool defines contexts to identify abstractions uniquely. The internal structures that
16 these contexts represent are opaque to the OMPD library.
17 Format
C / C++
typedef struct _ompd_aspace_cont ompd_address_space_context_t;
typedef struct _ompd_thread_cont ompd_thread_context_t;
C / C++
20 Semantics
21 A third-party tool uniquely defines an address space context to identify the address space for the
22 process that it is monitoring. Similarly, it uniquely defines a thread context to identify a native
23 thread of the process that it is monitoring. These contexts are opaque to the OMPD library.



1 20.3.12 Return Code Types
2 Summary
3 The ompd_rc_t type is the return code type of an OMPD operation.
4 Format
C / C++
typedef enum ompd_rc_t {
  ompd_rc_ok = 0,
  ompd_rc_unavailable = 1,
  ompd_rc_stale_handle = 2,
  ompd_rc_bad_input = 3,
  ompd_rc_error = 4,
  ompd_rc_unsupported = 5,
  ompd_rc_needs_state_tracking = 6,
  ompd_rc_incompatible = 7,
  ompd_rc_device_read_error = 8,
  ompd_rc_device_write_error = 9,
  ompd_rc_nomem = 10,
  ompd_rc_incomplete = 11,
  ompd_rc_callback_error = 12
} ompd_rc_t;
C / C++
20 Semantics
21 The ompd_rc_t type is used for the return codes of OMPD operations. The return code types and
22 their semantics are defined as follows:
23 • ompd_rc_ok is returned when the operation is successful;
24 • ompd_rc_unavailable is returned when information is not available for the specified
25 context;
26 • ompd_rc_stale_handle is returned when the specified handle is no longer valid;
27 • ompd_rc_bad_input is returned when the input parameters (other than handle) are invalid;
28 • ompd_rc_error is returned when a fatal error occurred;
29 • ompd_rc_unsupported is returned when the requested operation is not supported;
30 • ompd_rc_needs_state_tracking is returned when the state tracking operation failed
31 because state tracking is not currently enabled;
32 • ompd_rc_device_read_error is returned when a read operation failed on the device;
33 • ompd_rc_device_write_error is returned when a write operation failed on the device;
34 • ompd_rc_incompatible is returned when this OMPD library is incompatible with the
35 OpenMP program or is not capable of handling it;



1 • ompd_rc_nomem is returned when a memory allocation fails;
2 • ompd_rc_incomplete is returned when the information provided on return is incomplete,
3 while the arguments are still set to valid values; and
4 • ompd_rc_callback_error is returned when the callback interface or any one of the
5 required callback routines provided by the third-party tool is invalid.
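
A tool will typically want to log these codes in readable form. The following sketch maps each enumerator listed above to its name; the helper example_rc_name is a hypothetical function and is not part of the OMPD interface.

```c
#include <assert.h>
#include <string.h>

/* Return codes as listed in the Format of this section. */
typedef enum ompd_rc_t {
  ompd_rc_ok = 0,
  ompd_rc_unavailable = 1,
  ompd_rc_stale_handle = 2,
  ompd_rc_bad_input = 3,
  ompd_rc_error = 4,
  ompd_rc_unsupported = 5,
  ompd_rc_needs_state_tracking = 6,
  ompd_rc_incompatible = 7,
  ompd_rc_device_read_error = 8,
  ompd_rc_device_write_error = 9,
  ompd_rc_nomem = 10,
  ompd_rc_incomplete = 11,
  ompd_rc_callback_error = 12
} ompd_rc_t;

/* Hypothetical helper: printable name of a return code, for diagnostics. */
const char *example_rc_name(ompd_rc_t rc) {
  switch (rc) {
  case ompd_rc_ok:                   return "ompd_rc_ok";
  case ompd_rc_unavailable:          return "ompd_rc_unavailable";
  case ompd_rc_stale_handle:         return "ompd_rc_stale_handle";
  case ompd_rc_bad_input:            return "ompd_rc_bad_input";
  case ompd_rc_error:                return "ompd_rc_error";
  case ompd_rc_unsupported:          return "ompd_rc_unsupported";
  case ompd_rc_needs_state_tracking: return "ompd_rc_needs_state_tracking";
  case ompd_rc_incompatible:         return "ompd_rc_incompatible";
  case ompd_rc_device_read_error:    return "ompd_rc_device_read_error";
  case ompd_rc_device_write_error:   return "ompd_rc_device_write_error";
  case ompd_rc_nomem:                return "ompd_rc_nomem";
  case ompd_rc_incomplete:           return "ompd_rc_incomplete";
  case ompd_rc_callback_error:       return "ompd_rc_callback_error";
  }
  /* Values outside the enumeration, e.g. from a newer library version. */
  return "unknown ompd_rc_t value";
}
```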

6 20.3.13 Primitive Type Sizes


7 Summary
8 The ompd_device_type_sizes_t type provides the size of primitive types in the OpenMP
9 architecture address space.

10 Format
C / C++
typedef struct ompd_device_type_sizes_t {
  uint8_t sizeof_char;
  uint8_t sizeof_short;
  uint8_t sizeof_int;
  uint8_t sizeof_long;
  uint8_t sizeof_long_long;
  uint8_t sizeof_pointer;
} ompd_device_type_sizes_t;
C / C++
19 Semantics
20 The ompd_device_type_sizes_t type is used in operations through which the OMPD
21 library can interrogate the third-party tool about the size of primitive types for the target
22 architecture of the OpenMP runtime, as returned by the sizeof operator. The fields of
23 ompd_device_type_sizes_t give the sizes of the eponymous basic types used by the
24 OpenMP runtime. As the third-party tool and the OMPD library, by definition, execute on the same
25 architecture, the size of the fields can be given as uint8_t.
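
As a minimal sketch, a tool that debugs a process running on the same machine and ABI as the tool itself could fill this structure directly with sizeof. The function name example_sizeof_type, the body of struct _ompd_aspace_cont, and the choice of ompd_rc_bad_input for a null output pointer are assumptions made for illustration; a tool that inspects a foreign architecture would instead report sizes taken from its knowledge of the target ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Excerpt of the return codes from Section 20.3.12. */
typedef enum ompd_rc_t {
  ompd_rc_ok = 0,
  ompd_rc_stale_handle = 2,
  ompd_rc_bad_input = 3
} ompd_rc_t;

typedef struct ompd_device_type_sizes_t {
  uint8_t sizeof_char;
  uint8_t sizeof_short;
  uint8_t sizeof_int;
  uint8_t sizeof_long;
  uint8_t sizeof_long_long;
  uint8_t sizeof_pointer;
} ompd_device_type_sizes_t;

/* The tool owns the context definition; this body is hypothetical. */
struct _ompd_aspace_cont { int pid; };
typedef struct _ompd_aspace_cont ompd_address_space_context_t;

/* Matches the ompd_callback_sizeof_fn_t signature. Assumes the target
 * process uses the same ABI as the tool (native debugging). */
ompd_rc_t example_sizeof_type(ompd_address_space_context_t *address_space_context,
                              ompd_device_type_sizes_t *sizes) {
  if (address_space_context == NULL)
    return ompd_rc_stale_handle;
  if (sizes == NULL)
    return ompd_rc_bad_input; /* assumed choice of error code */
  sizes->sizeof_char = (uint8_t)sizeof(char);
  sizes->sizeof_short = (uint8_t)sizeof(short);
  sizes->sizeof_int = (uint8_t)sizeof(int);
  sizes->sizeof_long = (uint8_t)sizeof(long);
  sizes->sizeof_long_long = (uint8_t)sizeof(long long);
  sizes->sizeof_pointer = (uint8_t)sizeof(void *);
  return ompd_rc_ok;
}
```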

26 Cross References
27 • ompd_callback_sizeof_fn_t, see Section 20.4.2.2



1 20.4 OMPD Third-Party Tool Callback Interface
2 For the OMPD library to provide information about the internal state of the OpenMP runtime
3 system in an OpenMP process or core file, it must have a means to extract information from the
4 OpenMP process that the third-party tool is examining. The OpenMP process on which the
5 third-party tool is operating may be either a “live” process or a core file, and a thread may be either
6 a “live” thread in an OpenMP process or a thread in a core file. To enable the OMPD library to
7 extract state information from an OpenMP process or core file, the third-party tool must supply the
8 OMPD library with callback functions to inquire about the size of primitive types in the device of
9 the OpenMP process, to look up the addresses of symbols, and to read and to write memory in the
10 device. The OMPD library uses these callbacks to implement its interface operations. The OMPD
11 library only invokes the callback functions in direct response to calls made by the third-party tool to
12 the OMPD library.

13 Description of Return Codes


Every OMPD callback function must return either a function-specific return code or one of the
following general return codes:
• ompd_rc_ok on success; or
• ompd_rc_stale_handle if an invalid context argument is provided.

18 20.4.1 Memory Management of OMPD Library


19 ompd_callback_memory_alloc_fn_t (see Section 20.4.1.1) and
20 ompd_callback_memory_free_fn_t (see Section 20.4.1.2) are provided by the third-party
21 tool to obtain and to release heap memory. This mechanism ensures that the library does not
22 interfere with any custom memory management scheme that the third-party tool may use.
23 If the OMPD library is implemented in C++ then memory management operators, like new and
24 delete and their variants, must all be overloaded and implemented in terms of the callbacks that
25 the third-party tool provides. The OMPD library must be implemented in a manner such that any of
26 its definitions of new or delete do not interfere with any that the third-party tool defines.
27 In some cases, the OMPD library must allocate memory to return results to the third-party tool.
28 The third-party tool then owns this memory and has the responsibility to release it. Thus, the
29 OMPD library and the third-party tool must use the same memory manager.
30 The OMPD library creates OMPD handles, which are opaque to the third-party tool and may have a
31 complex internal structure. The third-party tool cannot determine if the handle pointers that the
32 API returns correspond to discrete heap allocations. Thus, the third-party tool must not simply
33 deallocate a handle by passing an address that it receives from the OMPD library to its own
34 memory manager. Instead, the OMPD API includes functions that the third-party tool must use
35 when it no longer needs a handle.



1 A third-party tool creates contexts and passes them to the OMPD library. The OMPD library does
2 not release contexts; instead the third-party tool releases them after it releases any handles that may
3 reference the contexts.

4 20.4.1.1 ompd_callback_memory_alloc_fn_t
5 Summary
6 The ompd_callback_memory_alloc_fn_t type is the type signature of the callback routine
7 that the third-party tool provides to the OMPD library to allocate memory.

8 Format
C
typedef ompd_rc_t (*ompd_callback_memory_alloc_fn_t) (
  ompd_size_t nbytes,
  void **ptr
);
C
13 Semantics
14 The ompd_callback_memory_alloc_fn_t type is the type signature of the memory
15 allocation callback routine that the third-party tool provides. The OMPD library may call the
16 ompd_callback_memory_alloc_fn_t callback function to allocate memory.

17 Description of Arguments
18 The nbytes argument is the size in bytes of the block of memory to allocate.
19 The address of the newly allocated block of memory is returned in the location to which the ptr
20 argument points. The newly allocated block is suitably aligned for any type of variable and is not
21 guaranteed to be set to zero.

22 Description of Return Codes


23 Routines that use the ompd_callback_memory_alloc_fn_t type may return the general
24 return codes listed at the beginning of Section 20.4.

25 Cross References
26 • Return Code Types, see Section 20.3.12
27 • Size Type, see Section 20.3.1
28 • The Callback Interface, see Section 20.4.6

29 20.4.1.2 ompd_callback_memory_free_fn_t
30 Summary
31 The ompd_callback_memory_free_fn_t type is the type signature of the callback routine
32 that the third-party tool provides to the OMPD library to deallocate memory.



1 Format
C
typedef ompd_rc_t (*ompd_callback_memory_free_fn_t) (
  void *ptr
);
C
5 Semantics
6 The ompd_callback_memory_free_fn_t type is the type signature of the memory
7 deallocation callback routine that the third-party tool provides. The OMPD library may call the
8 ompd_callback_memory_free_fn_t callback function to deallocate memory that was
9 obtained from a prior call to the ompd_callback_memory_alloc_fn_t callback function.

10 Description of Arguments
11 The ptr argument is the address of the block to be deallocated.
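
A tool without a custom memory manager could back both memory callbacks with the C heap, as in the sketch below. The names example_alloc_memory and example_free_memory are hypothetical, ompd_size_t is assumed to be the uint64_t typedef of Section 20.3.1, and returning ompd_rc_nomem or ompd_rc_bad_input on failure is an assumption, since only the general return codes are listed for these callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t ompd_size_t; /* assumed, per Section 20.3.1 */

/* Excerpt of the return codes from Section 20.3.12. */
typedef enum ompd_rc_t {
  ompd_rc_ok = 0,
  ompd_rc_bad_input = 3,
  ompd_rc_nomem = 10
} ompd_rc_t;

/* Matches ompd_callback_memory_alloc_fn_t. malloc returns memory that is
 * suitably aligned for any object type and is not zero-initialized, which
 * agrees with the description of the ptr argument. */
ompd_rc_t example_alloc_memory(ompd_size_t nbytes, void **ptr) {
  if (ptr == NULL)
    return ompd_rc_bad_input; /* assumed choice of error code */
  *ptr = NULL;
  if ((ompd_size_t)(size_t)nbytes != nbytes)
    return ompd_rc_nomem; /* request does not fit in size_t on this host */
  void *block = malloc((size_t)nbytes);
  if (block == NULL)
    return ompd_rc_nomem; /* assumed choice of error code */
  *ptr = block;
  return ompd_rc_ok;
}

/* Matches ompd_callback_memory_free_fn_t. Only blocks obtained from
 * example_alloc_memory may be passed here. */
ompd_rc_t example_free_memory(void *ptr) {
  free(ptr);
  return ompd_rc_ok;
}
```

Pairing the two callbacks on one allocator matters because the tool may later release memory that the library allocated to return results.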

12 Description of Return Codes


13 Routines that use the ompd_callback_memory_free_fn_t type may return the general
14 return codes listed at the beginning of Section 20.4.

15 Cross References
16 • Return Code Types, see Section 20.3.12
17 • The Callback Interface, see Section 20.4.6
18 • ompd_callback_memory_alloc_fn_t, see Section 20.4.1.1

19 20.4.2 Context Management and Navigation


20 Summary
21 The third-party tool provides the OMPD library with callbacks to manage and to navigate context
22 relationships.

23 20.4.2.1 ompd_callback_get_thread_context_for_thread_id_fn_t
24 Summary
25 The ompd_callback_get_thread_context_for_thread_id_fn_t is the type
26 signature of the callback routine that the third-party tool provides to the OMPD library to map a
27 native thread identifier to a third-party tool thread context.



1 Format
C
typedef ompd_rc_t
(*ompd_callback_get_thread_context_for_thread_id_fn_t) (
  ompd_address_space_context_t *address_space_context,
  ompd_thread_id_t kind,
  ompd_size_t sizeof_thread_id,
  const void *thread_id,
  ompd_thread_context_t **thread_context
);
C
10 Semantics
11 The ompd_callback_get_thread_context_for_thread_id_fn_t is the type
12 signature of the context mapping callback routine that the third-party tool provides. This callback
13 maps a native thread identifier to a third-party tool thread context. The native thread identifier is
14 within the address space that address_space_context identifies. The OMPD library can use the
15 thread context, for example, to access thread local storage.
16 Description of Arguments
17 The address_space_context argument is an opaque handle that the third-party tool provides to
18 reference an address space. The kind, sizeof_thread_id, and thread_id arguments represent a native
19 thread identifier. On return, the thread_context argument provides an opaque handle that maps a
20 native thread identifier to a third-party tool thread context.
21 Description of Return Codes
22 In addition to the general return codes listed at the beginning of Section 20.4, routines that use the
23 ompd_callback_get_thread_context_for_thread_id_fn_t type may also return
24 the following return codes:
25 • ompd_rc_bad_input if a different value in sizeof_thread_id is expected for the native thread
26 identifier kind given by kind; or
27 • ompd_rc_unsupported if the native thread identifier kind is not supported.
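
The argument checks and return codes above can be sketched as follows. Everything tool-specific here is hypothetical: the kind value EXAMPLE_KIND_U64, the context structure bodies, the fixed-size thread table, and the use of ompd_rc_unavailable when no thread matches. Real kind values come from the ompd-types.h header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t ompd_size_t;      /* assumed, per Section 20.3.1 */
typedef uint64_t ompd_thread_id_t; /* per Section 20.3.7 */

/* Excerpt of the return codes from Section 20.3.12. */
typedef enum ompd_rc_t {
  ompd_rc_ok = 0,
  ompd_rc_unavailable = 1,
  ompd_rc_stale_handle = 2,
  ompd_rc_bad_input = 3,
  ompd_rc_unsupported = 5
} ompd_rc_t;

/* Tool-owned context definitions; these bodies are hypothetical. */
struct _ompd_thread_cont { uint64_t native_id; };
struct _ompd_aspace_cont { struct _ompd_thread_cont threads[4]; int nthreads; };
typedef struct _ompd_aspace_cont ompd_address_space_context_t;
typedef struct _ompd_thread_cont ompd_thread_context_t;

/* Hypothetical identifier kind: a 64-bit integer thread id. */
#define EXAMPLE_KIND_U64 ((ompd_thread_id_t)1000)

ompd_rc_t example_get_thread_context(
    ompd_address_space_context_t *address_space_context,
    ompd_thread_id_t kind, ompd_size_t sizeof_thread_id,
    const void *thread_id, ompd_thread_context_t **thread_context) {
  if (address_space_context == NULL)
    return ompd_rc_stale_handle;
  if (kind != EXAMPLE_KIND_U64)
    return ompd_rc_unsupported; /* kind not supported by this tool */
  if (sizeof_thread_id != sizeof(uint64_t) || thread_id == NULL ||
      thread_context == NULL)
    return ompd_rc_bad_input; /* size does not match the kind */

  uint64_t id;
  memcpy(&id, thread_id, sizeof id); /* the id arrives as raw bytes */
  for (int i = 0; i < address_space_context->nthreads; ++i) {
    if (address_space_context->threads[i].native_id == id) {
      *thread_context = &address_space_context->threads[i];
      return ompd_rc_ok;
    }
  }
  return ompd_rc_unavailable; /* assumed code for "no such thread" */
}
```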

28 Restrictions
29 Restrictions on routines that use
30 ompd_callback_get_thread_context_for_thread_id_fn_t are as follows:
31 • The provided thread_context must be valid until the OMPD library returns from the OMPD
32 third-party tool interface routine.



1 Cross References
2 • Native Thread Identifiers, see Section 20.3.7
3 • Return Code Types, see Section 20.3.12
4 • Size Type, see Section 20.3.1
5 • The Callback Interface, see Section 20.4.6
6 • Tool Context Types, see Section 20.3.11

7 20.4.2.2 ompd_callback_sizeof_fn_t
8 Summary
9 The ompd_callback_sizeof_fn_t type is the type signature of the callback routine that the
10 third-party tool provides to the OMPD library to determine the sizes of the primitive types in an
11 address space.

12 Format
C
typedef ompd_rc_t (*ompd_callback_sizeof_fn_t) (
  ompd_address_space_context_t *address_space_context,
  ompd_device_type_sizes_t *sizes
);
C
17 Semantics
18 The ompd_callback_sizeof_fn_t is the type signature of the type-size query callback
19 routine that the third-party tool provides. This callback provides the sizes of the basic primitive
20 types for a given address space.

21 Description of Arguments
22 The callback returns the sizes of the basic primitive types used by the address space context that the
23 address_space_context argument specifies in the location to which the sizes argument points.

24 Description of Return Codes


25 Routines that use the ompd_callback_sizeof_fn_t type may return the general return
26 codes listed at the beginning of Section 20.4.

27 Cross References
28 • Primitive Type Sizes, see Section 20.3.13
29 • Return Code Types, see Section 20.3.12
30 • The Callback Interface, see Section 20.4.6
31 • Tool Context Types, see Section 20.3.11



20.4.3 Accessing Memory in the OpenMP Program or Runtime
3 The OMPD library cannot directly read from or write to memory of the OpenMP program. Instead
4 the OMPD library must use callbacks that the third-party tool provides so that the third-party tool
5 performs the operation.

6 20.4.3.1 ompd_callback_symbol_addr_fn_t
7 Summary
8 The ompd_callback_symbol_addr_fn_t type is the type signature of the callback that the
9 third-party tool provides to look up the addresses of symbols in an OpenMP program.
10 Format
C
typedef ompd_rc_t (*ompd_callback_symbol_addr_fn_t) (
  ompd_address_space_context_t *address_space_context,
  ompd_thread_context_t *thread_context,
  const char *symbol_name,
  ompd_address_t *symbol_addr,
  const char *file_name
);
C
18 Semantics
19 The ompd_callback_symbol_addr_fn_t is the type signature of the symbol-address query
20 callback routine that the third-party tool provides. This callback looks up addresses of symbols
21 within a specified address space.
22 Description of Arguments
23 This callback looks up the symbol provided in the symbol_name argument.
24 The address_space_context argument is the third-party tool’s representation of the address space of
25 the process, core file, or device.
26 The thread_context argument is NULL for global memory accesses. If thread_context is not NULL,
27 thread_context gives the thread-specific context for the symbol lookup for the purpose of
28 calculating thread local storage addresses. In this case, the thread to which thread_context refers
29 must be associated with either the process or the device that corresponds to the
30 address_space_context argument.
31 The third-party tool uses the symbol_name argument that the OMPD library supplies verbatim. In
32 particular, no name mangling, demangling or other transformations are performed prior to the
33 lookup. The symbol_name parameter must correspond to a statically allocated symbol within the
34 specified address space. The symbol can correspond to any type of object, such as a variable,
35 thread local storage variable, function, or untyped label. The symbol can have local, global, or
36 weak binding.



1 The file_name argument is an optional input parameter that indicates the name of the shared library
2 in which the symbol is defined, and it is intended to help the third-party tool disambiguate symbols
3 that are defined multiple times across the executable or shared library files. The shared library name
4 may not be an exact match for the name seen by the third-party tool. If file_name is NULL then the
5 third-party tool first tries to find the symbol in the executable file, and, if the symbol is not found,
6 the third-party tool tries to find the symbol in the shared libraries in the order in which the shared
7 libraries are loaded into the address space. If file_name is non-null then the third-party tool first
8 tries to find the symbol in the libraries that match the name in the file_name argument, and, if the
9 symbol is not found, the third-party tool then uses the same procedure as when file_name is NULL.
10 The callback does not support finding either symbols that are dynamically allocated on the call
11 stack or statically allocated symbols that are defined within the scope of a function or subroutine.
12 The callback returns the address of the symbol in the location to which symbol_addr points.
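
The search order described above can be sketched over a toy module table. This is an illustration under several assumptions: the example_* types and the body of struct _ompd_aspace_cont are hypothetical, modules[0] stands for the executable with the remaining entries in load order, a substring match approximates the inexact file_name comparison, ompd_address_t is assumed to have the segment and address members of Section 20.3.4 with segment 0 meaning unspecified, and thread local storage is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t ompd_seg_t;  /* assumed, per Section 20.3.4 */
typedef uint64_t ompd_addr_t; /* assumed, per Section 20.3.4 */
typedef struct ompd_address_t {
  ompd_seg_t segment;
  ompd_addr_t address;
} ompd_address_t;

/* Excerpt of the return codes from Section 20.3.12. */
typedef enum ompd_rc_t {
  ompd_rc_ok = 0, ompd_rc_stale_handle = 2,
  ompd_rc_bad_input = 3, ompd_rc_error = 4
} ompd_rc_t;

/* Hypothetical tool-side symbol tables. */
typedef struct example_symbol { const char *name; uint64_t addr; } example_symbol;
typedef struct example_module {
  const char *file; const example_symbol *symbols; size_t nsymbols;
} example_module;
struct _ompd_aspace_cont { const example_module *modules; size_t nmodules; };
typedef struct _ompd_aspace_cont ompd_address_space_context_t;
typedef struct _ompd_thread_cont ompd_thread_context_t;

static int example_find_in_module(const example_module *m, const char *name,
                                  uint64_t *addr) {
  for (size_t i = 0; i < m->nsymbols; ++i) {
    if (strcmp(m->symbols[i].name, name) == 0) { /* verbatim, no demangling */
      *addr = m->symbols[i].addr;
      return 1;
    }
  }
  return 0;
}

ompd_rc_t example_symbol_addr_lookup(
    ompd_address_space_context_t *address_space_context,
    ompd_thread_context_t *thread_context, const char *symbol_name,
    ompd_address_t *symbol_addr, const char *file_name) {
  (void)thread_context; /* thread local storage is not modeled here */
  if (address_space_context == NULL)
    return ompd_rc_stale_handle;
  if (symbol_name == NULL || symbol_addr == NULL)
    return ompd_rc_bad_input;
  const example_module *mods = address_space_context->modules;
  size_t n = address_space_context->nmodules;
  uint64_t found = 0;
  int ok = 0;
  /* Step 1: if file_name is given, try the matching libraries first. */
  if (file_name != NULL)
    for (size_t i = 0; i < n && !ok; ++i)
      if (strstr(mods[i].file, file_name) != NULL)
        ok = example_find_in_module(&mods[i], symbol_name, &found);
  /* Step 2: executable first, then shared libraries in load order. */
  for (size_t i = 0; i < n && !ok; ++i)
    ok = example_find_in_module(&mods[i], symbol_name, &found);
  if (!ok)
    return ompd_rc_error; /* symbol not found */
  symbol_addr->segment = 0; /* assumed "unspecified" segment value */
  symbol_addr->address = found;
  return ompd_rc_ok;
}
```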

13 Description of Return Codes


14 In addition to the general return codes listed at the beginning of Section 20.4, routines that use the
15 ompd_callback_symbol_addr_fn_t type may also return the following return codes:
16 • ompd_rc_error if the requested symbol is not found; or
17 • ompd_rc_bad_input if no symbol name is provided.

18 Restrictions
19 Restrictions on routines that use the ompd_callback_symbol_addr_fn_t type are as
20 follows:
21 • The address_space_context argument must be non-null.
22 • The symbol that the symbol_name argument specifies must be defined.

23 Cross References
24 • Address Type, see Section 20.3.4
25 • Return Code Types, see Section 20.3.12
26 • The Callback Interface, see Section 20.4.6
27 • Tool Context Types, see Section 20.3.11

28 20.4.3.2 ompd_callback_memory_read_fn_t
29 Summary
30 The ompd_callback_memory_read_fn_t type is the type signature of the callback that the
31 third-party tool provides to read data (read_memory) or a string (read_string) from an OpenMP
32 program.



1 Format
C
typedef ompd_rc_t (*ompd_callback_memory_read_fn_t) (
  ompd_address_space_context_t *address_space_context,
  ompd_thread_context_t *thread_context,
  const ompd_address_t *addr,
  ompd_size_t nbytes,
  void *buffer
);
C
9 Semantics
10 The ompd_callback_memory_read_fn_t is the type signature of the read callback routines
11 that the third-party tool provides.
12 The read_memory callback copies a block of data from addr within the address space given by
13 address_space_context to the third-party tool buffer.
The read_string callback copies a string to which addr points, including the terminating null byte
('\0'), to the third-party tool buffer. At most nbytes bytes are copied. If a null byte is not among
the first nbytes bytes, the string placed in buffer is not null-terminated.

17 Description of Arguments
18 The address from which the data are to be read in the OpenMP program that
19 address_space_context specifies is given by addr. The nbytes argument is the number of bytes to
20 be transferred. The thread_context argument for global memory accesses should be NULL. If it is
21 non-null, thread_context identifies the thread-specific context for the memory access for the
22 purpose of accessing thread local storage.
23 The data are returned through buffer, which is allocated and owned by the OMPD library. The
24 contents of the buffer are unstructured, raw bytes. The OMPD library must arrange for any
25 transformations such as byte-swapping that may be necessary (see Section 20.4.4) to interpret the
26 data.

27 Description of Return Codes


28 In addition to the general return codes listed at the beginning of Section 20.4, routines that use the
29 ompd_callback_memory_read_fn_t type may also return the following return codes:
30 • ompd_rc_incomplete if no terminating null byte is found while reading nbytes using the
31 read_string callback; or
32 • ompd_rc_error if unallocated memory is reached while reading nbytes using either the
33 read_memory or read_string callback.
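
The read_string behavior and its two specific return codes can be illustrated with a toy address space that is backed by a byte array. The body of struct _ompd_aspace_cont, the name example_read_string, and the assumed layouts of ompd_size_t and ompd_address_t (Sections 20.3.1 and 20.3.4) are illustrative assumptions; a real tool would read from a live process or core file instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t ompd_size_t; /* assumed, per Section 20.3.1 */
typedef struct ompd_address_t { /* assumed, per Section 20.3.4 */
  uint64_t segment;
  uint64_t address;
} ompd_address_t;

/* Excerpt of the return codes from Section 20.3.12. */
typedef enum ompd_rc_t {
  ompd_rc_ok = 0, ompd_rc_stale_handle = 2, ompd_rc_bad_input = 3,
  ompd_rc_error = 4, ompd_rc_incomplete = 11
} ompd_rc_t;

/* Hypothetical context: one mapped region [base, base + size). */
struct _ompd_aspace_cont { uint64_t base; const unsigned char *bytes; uint64_t size; };
typedef struct _ompd_aspace_cont ompd_address_space_context_t;
typedef struct _ompd_thread_cont ompd_thread_context_t;

/* Matches ompd_callback_memory_read_fn_t, with read_string semantics. */
ompd_rc_t example_read_string(
    ompd_address_space_context_t *address_space_context,
    ompd_thread_context_t *thread_context, const ompd_address_t *addr,
    ompd_size_t nbytes, void *buffer) {
  (void)thread_context; /* global memory access only in this sketch */
  if (address_space_context == NULL)
    return ompd_rc_stale_handle;
  if (addr == NULL || buffer == NULL)
    return ompd_rc_bad_input;
  if (addr->address < address_space_context->base)
    return ompd_rc_error; /* below the mapped region */

  uint64_t off = addr->address - address_space_context->base;
  char *out = buffer;
  for (ompd_size_t i = 0; i < nbytes; ++i) {
    if (off + i >= address_space_context->size)
      return ompd_rc_error; /* ran into unallocated memory */
    out[i] = (char)address_space_context->bytes[off + i];
    if (out[i] == '\0')
      return ompd_rc_ok; /* terminator copied, string is complete */
  }
  return ompd_rc_incomplete; /* nbytes copied, no terminator seen */
}
```

Since an ompd_rc_incomplete result leaves the buffer without a terminator, the caller has to add one itself, or retry with a larger buffer, before treating the contents as a C string.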



1 Cross References
2 • Address Type, see Section 20.3.4
3 • Data Format Conversion: ompd_callback_device_host_fn_t, see Section 20.4.4
4 • Return Code Types, see Section 20.3.12
5 • Size Type, see Section 20.3.1
6 • The Callback Interface, see Section 20.4.6
7 • Tool Context Types, see Section 20.3.11

8 20.4.3.3 ompd_callback_memory_write_fn_t
9 Summary
10 The ompd_callback_memory_write_fn_t type is the type signature of the callback that
11 the third-party tool provides to write data to an OpenMP program.

12 Format
C
typedef ompd_rc_t (*ompd_callback_memory_write_fn_t) (
  ompd_address_space_context_t *address_space_context,
  ompd_thread_context_t *thread_context,
  const ompd_address_t *addr,
  ompd_size_t nbytes,
  const void *buffer
);
C
20 Semantics
21 The ompd_callback_memory_write_fn_t is the type signature of the write callback
22 routine that the third-party tool provides. The OMPD library may call this callback to have the
23 third-party tool write a block of data to a location within an address space from a provided buffer.

24 Description of Arguments
25 The address to which the data are to be written in the OpenMP program that address_space_context
26 specifies is given by addr. The nbytes argument is the number of bytes to be transferred. The
27 thread_context argument for global memory accesses should be NULL. If it is non-null, then
28 thread_context identifies the thread-specific context for the memory access for the purpose of
29 accessing thread local storage.
30 The data to be written are passed through buffer, which is allocated and owned by the OMPD
31 library. The contents of the buffer are unstructured, raw bytes. The OMPD library must arrange for
32 any transformations such as byte-swapping that may be necessary (see Section 20.4.4) to render the
33 data into a form that is compatible with the OpenMP runtime.



1 Description of Return Codes
2 Routines that use the ompd_callback_memory_write_fn_t type may return the general
3 return codes listed at the beginning of Section 20.4.

4 Cross References
5 • Address Type, see Section 20.3.4
6 • Data Format Conversion: ompd_callback_device_host_fn_t, see Section 20.4.4
7 • Return Code Types, see Section 20.3.12
8 • Size Type, see Section 20.3.1
9 • The Callback Interface, see Section 20.4.6
10 • Tool Context Types, see Section 20.3.11

20.4.4 Data Format Conversion: ompd_callback_device_host_fn_t
13 Summary
14 The ompd_callback_device_host_fn_t type is the type signature of the callback that the
15 third-party tool provides to convert data between the formats that the third-party tool and the
16 OMPD library use and that the OpenMP program uses.

17 Format
C
typedef ompd_rc_t (*ompd_callback_device_host_fn_t) (
  ompd_address_space_context_t *address_space_context,
  const void *input,
  ompd_size_t unit_size,
  ompd_size_t count,
  void *output
);
C
25 Semantics
The architecture on which the third-party tool and the OMPD library execute may be different from
the architecture on which the OpenMP program that is being examined executes. Thus, the
conventions for representing data may differ. The callback interface includes operations to convert
between the conventions, such as the byte order (endianness), that the third-party tool and OMPD
library use and the ones that the OpenMP program uses. The callback with the
ompd_callback_device_host_fn_t type signature converts data between the formats.



1 Description of Arguments
2 The address_space_context argument specifies the OpenMP address space that is associated with
3 the data. The input argument is the source buffer and the output argument is the destination buffer.
4 The unit_size argument is the size of each of the elements to be converted. The count argument is
5 the number of elements to be transformed.
6 The OMPD library allocates and owns the input and output buffers. It must ensure that the buffers
7 have the correct size and are eventually deallocated when they are no longer needed.
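
For the common case in which the only difference between host and device is byte order, the conversion reduces to reversing the bytes of each element. The sketch below assumes exactly that situation; example_swap_bytes and the body of struct _ompd_aspace_cont are hypothetical, the input and output buffers are assumed not to overlap, and ompd_rc_bad_input for invalid arguments is an assumed choice. A tool whose host and device share a byte order would copy the count * unit_size bytes unchanged instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t ompd_size_t; /* assumed, per Section 20.3.1 */

/* Excerpt of the return codes from Section 20.3.12. */
typedef enum ompd_rc_t {
  ompd_rc_ok = 0, ompd_rc_stale_handle = 2, ompd_rc_bad_input = 3
} ompd_rc_t;

/* Tool-owned context definition; this body is hypothetical. */
struct _ompd_aspace_cont { int unused; };
typedef struct _ompd_aspace_cont ompd_address_space_context_t;

/* Matches ompd_callback_device_host_fn_t. Reverses the byte order of each
 * of the count elements of unit_size bytes. Because byte reversal is its
 * own inverse, the same routine can serve as both device_to_host and
 * host_to_device when the two sides differ only in endianness. */
ompd_rc_t example_swap_bytes(ompd_address_space_context_t *address_space_context,
                             const void *input, ompd_size_t unit_size,
                             ompd_size_t count, void *output) {
  if (address_space_context == NULL)
    return ompd_rc_stale_handle;
  if (input == NULL || output == NULL || unit_size == 0)
    return ompd_rc_bad_input; /* assumed choice of error code */
  const unsigned char *in = input;
  unsigned char *out = output;
  for (ompd_size_t i = 0; i < count; ++i) {
    const unsigned char *src = in + i * unit_size;
    unsigned char *dst = out + i * unit_size;
    for (ompd_size_t b = 0; b < unit_size; ++b)
      dst[b] = src[unit_size - 1 - b];
  }
  return ompd_rc_ok;
}
```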
8 Description of Return Codes
9 Routines that use the ompd_callback_device_host_fn_t type may return the general
10 return codes listed at the beginning of Section 20.4.
11 Cross References
12 • Return Code Types, see Section 20.3.12
13 • Size Type, see Section 20.3.1
14 • The Callback Interface, see Section 20.4.6
15 • Tool Context Types, see Section 20.3.11

16 20.4.5 ompd_callback_print_string_fn_t
17 Summary
18 The ompd_callback_print_string_fn_t type is the type signature of the callback that
19 the third-party tool provides so that the OMPD library can emit output.
20 Format
C
typedef ompd_rc_t (*ompd_callback_print_string_fn_t) (
  const char *string,
  int category
);
C
25 Semantics
26 The OMPD library may call the ompd_callback_print_string_fn_t callback function to
27 emit output, such as logging or debug information. The third-party tool may set the
28 ompd_callback_print_string_fn_t callback function to NULL to prevent the OMPD
29 library from emitting output. The OMPD library may not write to file descriptors that it did not
30 open.
31 Description of Arguments
32 The string argument is the null-terminated string to be printed. No conversion or formatting is
33 performed on the string.
34 The category argument is the implementation-defined category of the string to be printed.



1 Description of Return Codes
2 Routines that use the ompd_callback_print_string_fn_t type may return the general
3 return codes listed at the beginning of Section 20.4.

4 Cross References
5 • Return Code Types, see Section 20.3.12
6 • The Callback Interface, see Section 20.4.6

7 20.4.6 The Callback Interface


8 Summary
9 All OMPD library interactions with the OpenMP program must be through a set of callbacks that
10 the third-party tool provides. These callbacks must also be used for allocating or releasing
11 resources, such as memory, that the OMPD library needs.

12 Format
C
typedef struct ompd_callbacks_t {
  ompd_callback_memory_alloc_fn_t alloc_memory;
  ompd_callback_memory_free_fn_t free_memory;
  ompd_callback_print_string_fn_t print_string;
  ompd_callback_sizeof_fn_t sizeof_type;
  ompd_callback_symbol_addr_fn_t symbol_addr_lookup;
  ompd_callback_memory_read_fn_t read_memory;
  ompd_callback_memory_write_fn_t write_memory;
  ompd_callback_memory_read_fn_t read_string;
  ompd_callback_device_host_fn_t device_to_host;
  ompd_callback_device_host_fn_t host_to_device;
  ompd_callback_get_thread_context_for_thread_id_fn_t
    get_thread_context_for_thread_id;
} ompd_callbacks_t;

C
27 Semantics
28 The set of callbacks that the OMPD library must use is collected in the ompd_callbacks_t
29 structure. An instance of this type is passed to the OMPD library as a parameter to
30 ompd_initialize (see Section 20.5.1.1). Each field points to a function that the OMPD library
31 must use either to interact with the OpenMP program or for memory operations.
32 The alloc_memory and free_memory fields are pointers to functions the OMPD library uses to
33 allocate and to release dynamic memory.
34 The print_string field points to a function that prints a string.
35 The architecture on which the OMPD library and third-party tool execute may be different from the
36 architecture on which the OpenMP program that is being examined executes. The sizeof_type field



1 points to a function that allows the OMPD library to determine the sizes of the basic integer and
2 pointer types that the OpenMP program uses. Because of the potential differences in the targeted
3 architectures, the conventions for representing data in the OMPD library and the OpenMP program
4 may be different. The device_to_host field points to a function that translates data from the
5 conventions that the OpenMP program uses to those that the third-party tool and OMPD library
6 use. The reverse operation is performed by the function to which the host_to_device field points.
7 The symbol_addr_lookup field points to a callback that the OMPD library can use to find the
8 address of a global or thread local storage symbol. The read_memory, read_string and
9 write_memory fields are pointers to functions for reading from and writing to global memory or
10 thread local storage in the OpenMP program.
11 The get_thread_context_for_thread_id field is a pointer to a function that the OMPD library can
12 use to obtain a thread context that corresponds to a native thread identifier.

13 Cross References
14 • Data Format Conversion: ompd_callback_device_host_fn_t, see Section 20.4.4
15 • ompd_callback_get_thread_context_for_thread_id_fn_t, see
16 Section 20.4.2.1
17 • ompd_callback_memory_alloc_fn_t, see Section 20.4.1.1
18 • ompd_callback_memory_free_fn_t, see Section 20.4.1.2
19 • ompd_callback_memory_read_fn_t, see Section 20.4.3.2
20 • ompd_callback_memory_write_fn_t, see Section 20.4.3.3
21 • ompd_callback_print_string_fn_t, see Section 20.4.5
22 • ompd_callback_sizeof_fn_t, see Section 20.4.2.2
23 • ompd_callback_symbol_addr_fn_t, see Section 20.4.3.1

24 20.5 OMPD Tool Interface Routines


25 This section defines the interface provided by the OMPD library to be used by the third-party tool.
26 Some interface routines require one or more specified threads to be stopped for the returned values
27 to be meaningful. In this context, a stopped thread is a thread that is not modifying the observable
28 OpenMP runtime state.
29 Description of Return Codes
30 All of the OMPD Tool Interface Routines must return function-specific return codes or any of the
31 following return codes:
32 • ompd_rc_stale_handle if a provided handle is stale;
33 • ompd_rc_bad_input if an invalid value is provided for any input argument;



• ompd_rc_callback_error if a callback returned an unexpected error, which leads to a failure of the
query;
3 • ompd_rc_needs_state_tracking if the information cannot be provided while the
4 debug-var is disabled;
5 • ompd_rc_ok on success; or
6 • ompd_rc_error for any other error.
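A tool will typically want to report these general codes uniformly. The following is a minimal sketch, assuming a hypothetical tool-side helper named tool_describe_rc; the enumeration is a stand-in that models only some of the codes listed above, with placeholder numeric values, so that the sketch compiles without the OMPD header.

```c
/* Stand-in for part of ompd_rc_t (Section 20.3.12). Only some of the general
 * codes listed above are modeled, and the numeric values are placeholders. */
typedef enum {
    ompd_rc_ok,
    ompd_rc_stale_handle,
    ompd_rc_bad_input,
    ompd_rc_error,
    ompd_rc_needs_state_tracking
} ompd_rc_t;

/* Hypothetical tool-side helper: turn a return code into a short message. */
static const char *tool_describe_rc(ompd_rc_t rc)
{
    switch (rc) {
    case ompd_rc_ok:                   return "success";
    case ompd_rc_stale_handle:         return "stale handle";
    case ompd_rc_bad_input:            return "invalid input argument";
    case ompd_rc_needs_state_tracking: return "debug-var is disabled";
    case ompd_rc_error:                return "unspecified error";
    }
    /* Routines may also return function-specific codes not modeled here. */
    return "function-specific or unknown code";
}
```

The fall-through return covers the function-specific codes that individual routines add to the general list.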

7 20.5.1 Per OMPD Library Initialization and Finalization


8 The OMPD library must be initialized exactly once after it is loaded, and finalized exactly once
9 before it is unloaded. Per OpenMP process or core file initialization and finalization are also
10 required. Once loaded, the tool can determine the version of the OMPD API that the library
11 supports by calling ompd_get_api_version (see Section 20.5.1.2). If the tool supports the
12 version that ompd_get_api_version returns, the tool starts the initialization by calling
13 ompd_initialize (see Section 20.5.1.1) using the version of the OMPD API that the library
14 supports. If the tool does not support the version that ompd_get_api_version returns, it may
15 attempt to call ompd_initialize with a different version.
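The version negotiation described above can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the type definitions are stand-ins for those in the OMPD header, the two library routines are mocks that accept a single version, and tool_versions, tool_supports and tool_start_ompd are hypothetical tool-side names.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles without the OMPD header. */
typedef int64_t ompd_word_t;
typedef enum { ompd_rc_ok, ompd_rc_bad_input, ompd_rc_unsupported } ompd_rc_t;
typedef struct ompd_callbacks_t ompd_callbacks_t;

/* Mock library: supports exactly one API version (settable for testing). */
static ompd_word_t mock_library_version = 202111;

static ompd_rc_t ompd_get_api_version(ompd_word_t *version)
{
    if (version == NULL) return ompd_rc_bad_input;
    *version = mock_library_version;
    return ompd_rc_ok;
}

static ompd_rc_t ompd_initialize(ompd_word_t api_version,
                                 const ompd_callbacks_t *callbacks)
{
    (void)callbacks; /* the mock does not validate the callbacks */
    return api_version == mock_library_version ? ompd_rc_ok
                                               : ompd_rc_unsupported;
}

/* Versions this hypothetical tool understands, newest first. */
static const ompd_word_t tool_versions[] = { 202111, 202011, 201811 };
enum { TOOL_VERSION_COUNT = sizeof tool_versions / sizeof tool_versions[0] };

static int tool_supports(ompd_word_t v)
{
    for (int i = 0; i < TOOL_VERSION_COUNT; i++)
        if (tool_versions[i] == v) return 1;
    return 0;
}

/* Returns the negotiated API version, or 0 if initialization failed. */
static ompd_word_t tool_start_ompd(const ompd_callbacks_t *callbacks)
{
    ompd_word_t lib_version = 0;
    if (ompd_get_api_version(&lib_version) != ompd_rc_ok) return 0;

    /* Preferred path: the tool supports the version the library reports. */
    if (tool_supports(lib_version))
        return ompd_initialize(lib_version, callbacks) == ompd_rc_ok
                   ? lib_version : 0;

    /* Otherwise the tool may try the versions that it does support. */
    for (int i = 0; i < TOOL_VERSION_COUNT; i++)
        if (ompd_initialize(tool_versions[i], callbacks) == ompd_rc_ok)
            return tool_versions[i];
    return 0;
}
```

In a real tool the fallback loop would stop at the first successful call, since the library must be initialized exactly once.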

16 20.5.1.1 ompd_initialize
17 Summary
18 The ompd_initialize function initializes the OMPD library.
19 Format
C
ompd_rc_t ompd_initialize(
    ompd_word_t api_version,
    const ompd_callbacks_t *callbacks
);
C
24 Semantics
25 A tool that uses OMPD calls ompd_initialize to initialize each OMPD library that it loads.
26 More than one library may be present in a third-party tool, such as a debugger, because the tool
27 may control multiple devices, which may use different runtime systems that require different
28 OMPD libraries. This initialization must be performed exactly once before the tool can begin to
29 operate on an OpenMP process or core file.
30 Description of Arguments
31 The api_version argument is the OMPD API version that the tool requests to use. The tool may call
32 ompd_get_api_version to obtain the latest OMPD API version that the OMPD library
33 supports.



The tool provides the OMPD library with a set of callback functions in the callbacks input
argument, which enables the OMPD library to allocate and to deallocate memory in the tool’s
address space, to look up the sizes of basic primitive types in the device, to look up symbols in the
device, and to read and to write memory in the device.
5 Description of Return Codes
6 This routine must return any of the general return codes listed at the beginning of Section 20.5 or
7 any of the following return codes:
8 • ompd_rc_bad_input if invalid callbacks are provided; or
9 • ompd_rc_unsupported if the requested API version cannot be provided.
10 Cross References
11 • Return Code Types, see Section 20.3.12
12 • The Callback Interface, see Section 20.4.6
13 • ompd_get_api_version, see Section 20.5.1.2

14 20.5.1.2 ompd_get_api_version
15 Summary
16 The ompd_get_api_version function returns the OMPD API version.
17 Format
C
ompd_rc_t ompd_get_api_version(ompd_word_t *version);
C
19 Semantics
20 The tool may call the ompd_get_api_version function to obtain the latest OMPD API
21 version number of the OMPD library. The OMPD API version number is equal to the value of the
22 _OPENMP macro defined in the associated OpenMP implementation, if the C preprocessor is
23 supported. If the associated OpenMP implementation compiles Fortran codes without the use of a
24 C preprocessor, the OMPD API version number is equal to the value of the Fortran integer
25 parameter openmp_version.

26 Description of Arguments
27 The latest version number is returned into the location to which the version argument points.

28 Description of Return Codes


29 This routine must return any of the general return codes listed at the beginning of Section 20.5.

30 Cross References
31 • Return Code Types, see Section 20.3.12



1 20.5.1.3 ompd_get_version_string
2 Summary
3 The ompd_get_version_string function returns a descriptive string for the OMPD library
4 version.
5 Format
C
ompd_rc_t ompd_get_version_string(const char **string);
C
7 Semantics
8 The tool may call this function to obtain a pointer to a descriptive version string of the OMPD
9 library vendor, implementation, internal version, date, or any other information that may be useful
10 to a tool user or vendor. An implementation should provide a different string for every change to its
11 source code or build that could be visible to the interface user.
12 Description of Arguments
13 A pointer to a descriptive version string is placed into the location to which the string output
14 argument points. The OMPD library owns the string that the OMPD library returns; the tool must
15 not modify or release this string. The string remains valid for as long as the library is loaded. The
16 ompd_get_version_string function may be called before ompd_initialize (see
17 Section 20.5.1.1). Accordingly, the OMPD library must not use heap or stack memory for the string.
18 The signatures of ompd_get_api_version (see Section 20.5.1.2) and
19 ompd_get_version_string are guaranteed not to change in future versions of the API. In
20 contrast, the type definitions and prototypes in the rest of the API do not carry the same guarantee.
21 Therefore a tool that uses OMPD should check the version of the API of the loaded OMPD library
22 before it calls any other function of the API.
23 Description of Return Codes
24 This routine must return any of the general return codes listed at the beginning of Section 20.5.

25 Cross References
26 • Return Code Types, see Section 20.3.12

27 20.5.1.4 ompd_finalize
28 Summary
29 When the tool is finished with the OMPD library it should call ompd_finalize before it
30 unloads the library.
31 Format
C
ompd_rc_t ompd_finalize(void);
C



1 Semantics
2 The call to ompd_finalize must be the last OMPD call that the tool makes before it unloads the
3 library. This call allows the OMPD library to free any resources that it may be holding. The OMPD
4 library may implement a finalizer section, which executes as the library is unloaded and therefore
5 after the call to ompd_finalize. During finalization, the OMPD library may use the callbacks
6 that the tool provided earlier during the call to ompd_initialize.

7 Description of Return Codes


8 This routine must return any of the general return codes listed at the beginning of Section 20.5 or
9 the following return code:
10 • ompd_rc_unsupported if the OMPD library is not initialized.

11 Cross References
12 • Return Code Types, see Section 20.3.12

13 20.5.2 Per OpenMP Process Initialization and Finalization


14 20.5.2.1 ompd_process_initialize
15 Summary
16 A tool calls ompd_process_initialize to obtain an address space handle for the host device
17 when it initializes a session on a live process or core file.

18 Format
C
ompd_rc_t ompd_process_initialize(
    ompd_address_space_context_t *context,
    ompd_address_space_handle_t **host_handle
);
C
23 Semantics
24 A tool calls ompd_process_initialize to obtain an address space handle for the host device
25 when it initializes a session on a live process or core file. On return from
26 ompd_process_initialize, the tool owns the address space handle, which it must release
27 with ompd_rel_address_space_handle. The initialization function must be called before
28 any OMPD operations are performed on the OpenMP process or core file. This call allows the
29 OMPD library to confirm that it can handle the OpenMP process or core file that context identifies.
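The ownership rule above implies that every successful call is paired with exactly one release. A minimal sketch, assuming mock versions of the two routines and stand-in type definitions (the struct layouts, the live_handles counter and tool_inspect_process are hypothetical and exist only to make the example self-contained):

```c
#include <stddef.h>

/* Stand-ins so the sketch compiles without the OMPD header. */
typedef enum { ompd_rc_ok, ompd_rc_bad_input, ompd_rc_incompatible } ompd_rc_t;
struct mock_context { int compatible; };
struct mock_handle  { int unused; };
typedef struct mock_context ompd_address_space_context_t;
typedef struct mock_handle  ompd_address_space_handle_t;

static int live_handles = 0;            /* handles currently owned by the tool */
static struct mock_handle mock_host_handle;

/* Mock library routine: hands out a handle if the context is compatible. */
static ompd_rc_t ompd_process_initialize(
    ompd_address_space_context_t *context,
    ompd_address_space_handle_t **host_handle)
{
    if (context == NULL || host_handle == NULL) return ompd_rc_bad_input;
    if (!context->compatible) return ompd_rc_incompatible;
    *host_handle = &mock_host_handle;
    live_handles++;
    return ompd_rc_ok;
}

/* Mock library routine: takes the handle back. */
static ompd_rc_t ompd_rel_address_space_handle(
    ompd_address_space_handle_t *handle)
{
    if (handle == NULL) return ompd_rc_bad_input;
    live_handles--;
    return ompd_rc_ok;
}

/* Hypothetical tool-side session: initialize, query, release. */
static ompd_rc_t tool_inspect_process(ompd_address_space_context_t *context)
{
    ompd_address_space_handle_t *host = NULL;
    ompd_rc_t rc = ompd_process_initialize(context, &host);
    if (rc != ompd_rc_ok)
        return rc;                      /* no handle was handed out */

    /* ... OMPD queries on the address space would go here ... */

    return ompd_rel_address_space_handle(host);
}
```

On the failure path nothing is released because the tool never received a handle.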

30 Description of Arguments
31 The context argument is an opaque handle that the tool provides to address an address space from
32 the host device. On return, the host_handle argument provides an opaque handle to the tool for this
33 address space, which the tool must release when it is no longer needed.



1 Description of Return Codes
2 This routine must return any of the general return codes listed at the beginning of Section 20.5 or
3 the following return code:
4 • ompd_rc_incompatible if the OMPD library is incompatible with the runtime library
5 loaded in the process.

6 Cross References
7 • OMPD Handle Types, see Section 20.3.8
8 • Return Code Types, see Section 20.3.12
9 • Tool Context Types, see Section 20.3.11
10 • ompd_rel_address_space_handle, see Section 20.5.2.3

11 20.5.2.2 ompd_device_initialize
12 Summary
13 A tool calls ompd_device_initialize to obtain an address space handle for a non-host
14 device that has at least one active target region.

15 Format
C
ompd_rc_t ompd_device_initialize(
    ompd_address_space_handle_t *host_handle,
    ompd_address_space_context_t *device_context,
    ompd_device_t kind,
    ompd_size_t sizeof_id,
    void *id,
    ompd_address_space_handle_t **device_handle
);
C
24 Semantics
25 A tool calls ompd_device_initialize to obtain an address space handle for a non-host
26 device that has at least one active target region. On return from ompd_device_initialize,
27 the tool owns the address space handle.

28 Description of Arguments
29 The host_handle argument is an opaque handle that the tool provides to reference the host device
30 address space associated with an OpenMP process or core file. The device_context argument is an
31 opaque handle that the tool provides to reference a non-host device address space. The kind,
32 sizeof_id, and id arguments represent a device identifier. On return the device_handle argument
33 provides an opaque handle to the tool for this address space.



1 Description of Return Codes
2 This routine must return any of the general return codes listed at the beginning of Section 20.5 or
3 the following return code:
4 • ompd_rc_unsupported if the OMPD library has no support for the specific device.

5 Cross References
6 • OMPD Handle Types, see Section 20.3.8
7 • Return Code Types, see Section 20.3.12
8 • Size Type, see Section 20.3.1
9 • System Device Identifiers, see Section 20.3.6
10 • Tool Context Types, see Section 20.3.11

11 20.5.2.3 ompd_rel_address_space_handle
12 Summary
13 A tool calls ompd_rel_address_space_handle to release an address space handle.

14 Format
C
ompd_rc_t ompd_rel_address_space_handle(
    ompd_address_space_handle_t *handle
);
C
18 Semantics
19 When the tool is finished with the OpenMP process address space handle it should call
20 ompd_rel_address_space_handle to release the handle, which allows the OMPD library
21 to release any resources that it has related to the address space.

22 Description of Arguments
23 The handle argument is an opaque handle for the address space to be released.

24 Restrictions
25 Restrictions to the ompd_rel_address_space_handle routine are as follows:
26 • An address space context must not be used after the corresponding address space handle is
27 released.

28 Description of Return Codes


29 This routine must return any of the general return codes listed at the beginning of Section 20.5.



1 Cross References
2 • OMPD Handle Types, see Section 20.3.8
3 • Return Code Types, see Section 20.3.12

4 20.5.2.4 ompd_get_device_thread_id_kinds
5 Summary
6 The ompd_get_device_thread_id_kinds function returns a list of supported native
7 thread identifier kinds and a corresponding list of their respective sizes.

8 Format
C
ompd_rc_t ompd_get_device_thread_id_kinds(
    ompd_address_space_handle_t *device_handle,
    ompd_thread_id_t **kinds,
    ompd_size_t **thread_id_sizes,
    int *count
);
C
15 Semantics
16 The ompd_get_device_thread_id_kinds function returns an array of supported native
17 thread identifier kinds and a corresponding array of their respective sizes for a given device. The
18 OMPD library allocates storage for the arrays with the memory allocation callback that the tool
19 provides. Each supported native thread identifier kind is guaranteed to be recognizable by the
20 OMPD library and may be mapped to and from any OpenMP thread that executes on the device.
21 The third-party tool owns the storage for the array of kinds and the array of sizes that is returned via
22 the kinds and thread_id_sizes arguments, and it is responsible for freeing that storage.

23 Description of Arguments
24 The device_handle argument is a pointer to an opaque address space handle that represents a host
25 device (returned by ompd_process_initialize) or a non-host device (returned by
26 ompd_device_initialize). On return, the kinds argument is the address of a pointer to an
27 array of native thread identifier kinds, the thread_id_sizes argument is the address of a pointer to an
28 array of the corresponding native thread identifier sizes used by the OMPD library, and the count
argument is the address of a variable that indicates the number of elements in each of the returned arrays.
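The two returned arrays are parallel: entry i of the sizes array belongs to entry i of the kinds array, and the tool frees both. A minimal sketch, assuming a mock library routine that uses malloc in place of the tool's memory allocation callback; the kind values, the identifier sizes and the helper tool_thread_id_size are placeholders chosen for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins so the sketch compiles without the OMPD header. */
typedef enum { ompd_rc_ok, ompd_rc_error } ompd_rc_t;
typedef uint64_t ompd_thread_id_t;
typedef uint64_t ompd_size_t;
typedef struct mock_aspace ompd_address_space_handle_t;

/* Mock library routine. malloc stands in for the tool's allocation
 * callback; the two kinds and their sizes are placeholder values. */
static ompd_rc_t ompd_get_device_thread_id_kinds(
    ompd_address_space_handle_t *device_handle,
    ompd_thread_id_t **kinds,
    ompd_size_t **thread_id_sizes,
    int *count)
{
    (void)device_handle;
    *kinds = malloc(2 * sizeof **kinds);
    *thread_id_sizes = malloc(2 * sizeof **thread_id_sizes);
    if (*kinds == NULL || *thread_id_sizes == NULL) {
        free(*kinds);
        free(*thread_id_sizes);
        return ompd_rc_error;
    }
    (*kinds)[0] = 0; (*thread_id_sizes)[0] = 8;
    (*kinds)[1] = 1; (*thread_id_sizes)[1] = 4;
    *count = 2;
    return ompd_rc_ok;
}

/* Hypothetical tool-side helper: size of identifiers of the wanted kind,
 * or 0 if the device does not support that kind. */
static ompd_size_t tool_thread_id_size(ompd_address_space_handle_t *device,
                                       ompd_thread_id_t wanted)
{
    ompd_thread_id_t *kinds = NULL;
    ompd_size_t *sizes = NULL;
    int count = 0;
    ompd_size_t result = 0;

    if (ompd_get_device_thread_id_kinds(device, &kinds, &sizes, &count)
            != ompd_rc_ok)
        return 0;

    for (int i = 0; i < count; i++) {
        if (kinds[i] == wanted) {
            result = sizes[i];
            break;
        }
    }

    /* The tool owns both arrays and must free them with its own allocator. */
    free(kinds);
    free(sizes);
    return result;
}
```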

30 Description of Return Codes


31 This routine must return any of the general return codes listed at the beginning of Section 20.5.



1 Cross References
2 • Native Thread Identifiers, see Section 20.3.7
3 • OMPD Handle Types, see Section 20.3.8
4 • Return Code Types, see Section 20.3.12
5 • Size Type, see Section 20.3.1

6 20.5.3 Thread and Signal Safety


7 The OMPD library does not need to be reentrant. The tool must ensure that only one thread enters
8 the OMPD library at a time. The OMPD library must not install signal handlers or otherwise
9 interfere with the tool’s signal configuration.

10 20.5.4 Address Space Information


11 20.5.4.1 ompd_get_omp_version
12 Summary
13 The tool may call the ompd_get_omp_version function to obtain the version of the OpenMP
14 API that is associated with an address space.

15 Format
C
ompd_rc_t ompd_get_omp_version(
    ompd_address_space_handle_t *address_space,
    ompd_word_t *omp_version
);
C
20 Semantics
21 The tool may call the ompd_get_omp_version function to obtain the version of the OpenMP
22 API that is associated with the address space.

23 Description of Arguments
24 The address_space argument is an opaque handle that the tool provides to reference the address
25 space of the OpenMP process or device.
26 Upon return, the omp_version argument contains the version of the OpenMP runtime in the
27 _OPENMP version macro format.

28 Description of Return Codes


29 This routine must return any of the general return codes listed at the beginning of Section 20.5.



1 Cross References
2 • OMPD Handle Types, see Section 20.3.8
3 • Return Code Types, see Section 20.3.12

4 20.5.4.2 ompd_get_omp_version_string
5 Summary
6 The ompd_get_omp_version_string function returns a descriptive string for the OpenMP
7 API version that is associated with an address space.
8 Format
C
ompd_rc_t ompd_get_omp_version_string(
    ompd_address_space_handle_t *address_space,
    const char **string
);
C
13 Semantics
After initialization, the tool may call the ompd_get_omp_version_string function to obtain
a descriptive string for the version of the OpenMP API that is associated with an address space.
16 Description of Arguments
17 The address_space argument is an opaque handle that the tool provides to reference the address
18 space of the OpenMP process or device. A pointer to a descriptive version string is placed into the
19 location to which the string output argument points. After returning from the call, the tool owns the
20 string. The OMPD library must use the memory allocation callback that the tool provides to
21 allocate the string storage. The tool is responsible for releasing the memory.
22 Description of Return Codes
23 This routine must return any of the general return codes listed at the beginning of Section 20.5.

24 Cross References
25 • OMPD Handle Types, see Section 20.3.8
26 • Return Code Types, see Section 20.3.12

27 20.5.5 Thread Handles


28 20.5.5.1 ompd_get_thread_in_parallel
29 Summary
30 The ompd_get_thread_in_parallel function enables a tool to obtain handles for OpenMP
31 threads that are associated with a parallel region.



1 Format
C
ompd_rc_t ompd_get_thread_in_parallel(
    ompd_parallel_handle_t *parallel_handle,
    int thread_num,
    ompd_thread_handle_t **thread_handle
);
C
7 Semantics
8 A successful invocation of ompd_get_thread_in_parallel returns a pointer to a thread
9 handle in the location to which thread_handle points. This call yields meaningful results only
10 if all OpenMP threads in the team that is executing the parallel region are stopped.
11 Description of Arguments
12 The parallel_handle argument is an opaque handle for a parallel region and selects the parallel
13 region on which to operate. The thread_num argument represents the OpenMP thread number and
14 selects the thread, the handle for which is to be returned. On return, the thread_handle argument is
15 an opaque handle for the selected thread.
16 Description of Return Codes
17 This routine must return any of the general return codes listed at the beginning of Section 20.5 or
18 the following return code:
19 • ompd_rc_bad_input if the thread_num argument is greater than or equal to the
20 team-size-var ICV or negative.
21 Restrictions
22 Restrictions on the ompd_get_thread_in_parallel function are as follows:
23 • The value of thread_num must be a non-negative integer smaller than the team size that was
24 provided as the team-size-var ICV from ompd_get_icv_from_scope.
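Given the restriction above, a tool can enumerate a team by iterating thread_num from 0 up to, but not including, the team size. The following sketch assumes the team size has already been obtained (for instance from the team-size-var ICV) and is passed in as a plain integer; the routines are mocks, the struct layouts are stand-ins, and tool_visit_team is a hypothetical helper.

```c
#include <stdlib.h>

/* Stand-ins so the sketch compiles without the OMPD header. */
typedef enum { ompd_rc_ok, ompd_rc_bad_input, ompd_rc_error } ompd_rc_t;
struct mock_parallel { int team_size; };
struct mock_thread   { int num; };
typedef struct mock_parallel ompd_parallel_handle_t;
typedef struct mock_thread   ompd_thread_handle_t;

static int outstanding = 0;   /* thread handles not yet released */

/* Mock library routine: rejects thread numbers outside the team. */
static ompd_rc_t ompd_get_thread_in_parallel(
    ompd_parallel_handle_t *parallel_handle,
    int thread_num,
    ompd_thread_handle_t **thread_handle)
{
    if (parallel_handle == NULL || thread_handle == NULL ||
        thread_num < 0 || thread_num >= parallel_handle->team_size)
        return ompd_rc_bad_input;
    *thread_handle = malloc(sizeof **thread_handle);
    if (*thread_handle == NULL) return ompd_rc_error;
    (*thread_handle)->num = thread_num;
    outstanding++;
    return ompd_rc_ok;
}

/* Mock library routine: disposes of a thread handle. */
static ompd_rc_t ompd_rel_thread_handle(ompd_thread_handle_t *thread_handle)
{
    if (thread_handle == NULL) return ompd_rc_bad_input;
    free(thread_handle);
    outstanding--;
    return ompd_rc_ok;
}

/* Hypothetical tool-side helper: visits each thread of the team and
 * returns how many handles were obtained. */
static int tool_visit_team(ompd_parallel_handle_t *region, int team_size)
{
    int visited = 0;
    for (int n = 0; n < team_size; n++) {
        ompd_thread_handle_t *thread = NULL;
        if (ompd_get_thread_in_parallel(region, n, &thread) != ompd_rc_ok)
            break;                       /* e.g. thread_num out of range */
        visited++;
        /* ... inspect the thread here ... */
        ompd_rel_thread_handle(thread);  /* release every handle obtained */
    }
    return visited;
}
```

Passing a team size that is too large stops at the first ompd_rc_bad_input rather than reading past the team.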

25 Cross References
26 • OMPD Handle Types, see Section 20.3.8
27 • Return Code Types, see Section 20.3.12
28 • ompd_get_icv_from_scope, see Section 20.5.10.2

29 20.5.5.2 ompd_get_thread_handle
30 Summary
31 The ompd_get_thread_handle function maps a native thread to an OMPD thread handle.



1 Format
C
ompd_rc_t ompd_get_thread_handle(
    ompd_address_space_handle_t *handle,
    ompd_thread_id_t kind,
    ompd_size_t sizeof_thread_id,
    const void *thread_id,
    ompd_thread_handle_t **thread_handle
);
C
9 Semantics
10 The ompd_get_thread_handle function determines if the native thread identifier to which
11 thread_id points represents an OpenMP thread. If so, the function returns ompd_rc_ok and the
12 location to which thread_handle points is set to the thread handle for the OpenMP thread.
13 Description of Arguments
14 The handle argument is an opaque handle that the tool provides to reference an address space. The
15 kind, sizeof_thread_id, and thread_id arguments represent a native thread identifier. On return, the
16 thread_handle argument provides an opaque handle to the thread within the provided address space.
17 The native thread identifier to which thread_id points is guaranteed to be valid for the duration of
18 the call. If the OMPD library must retain the native thread identifier, it must copy it.
19 Description of Return Codes
20 This routine must return any of the general return codes listed at the beginning of Section 20.5 or
21 any of the following return codes:
• ompd_rc_bad_input if a different value in sizeof_thread_id is expected for a thread kind of
kind;
• ompd_rc_unsupported if the kind of thread is not supported; or
• ompd_rc_unavailable if the thread is not an OpenMP thread.
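The return codes above let a tool distinguish a native thread that is not an OpenMP thread from a genuine failure. A minimal sketch under stated assumptions: the library routines are mocks, the only supported kind is the placeholder value 0 with 8-byte identifiers, the rule that odd identifiers are not OpenMP threads is invented purely for the mock, and tool_is_openmp_thread is a hypothetical helper.

```c
#include <stdint.h>
#include <string.h>

/* Stand-ins so the sketch compiles without the OMPD header. */
typedef enum {
    ompd_rc_ok, ompd_rc_unavailable, ompd_rc_bad_input, ompd_rc_unsupported
} ompd_rc_t;
typedef uint64_t ompd_thread_id_t;
typedef uint64_t ompd_size_t;
typedef struct mock_aspace ompd_address_space_handle_t;
struct mock_thread { uint64_t id; };
typedef struct mock_thread ompd_thread_handle_t;

/* Mock library routine. Kind 0 with 8-byte identifiers is the only kind
 * supported; odd identifiers are treated as non-OpenMP threads. */
static ompd_rc_t ompd_get_thread_handle(
    ompd_address_space_handle_t *handle,
    ompd_thread_id_t kind,
    ompd_size_t sizeof_thread_id,
    const void *thread_id,
    ompd_thread_handle_t **thread_handle)
{
    static ompd_thread_handle_t slot;
    uint64_t id;
    (void)handle;
    if (kind != 0) return ompd_rc_unsupported;
    if (sizeof_thread_id != sizeof id || thread_id == NULL ||
        thread_handle == NULL)
        return ompd_rc_bad_input;
    /* Copy the identifier: it is only guaranteed valid during the call. */
    memcpy(&id, thread_id, sizeof id);
    if (id % 2 != 0) return ompd_rc_unavailable;
    slot.id = id;
    *thread_handle = &slot;
    return ompd_rc_ok;
}

/* Mock library routine: nothing to free for the static slot. */
static ompd_rc_t ompd_rel_thread_handle(ompd_thread_handle_t *thread_handle)
{
    return thread_handle != NULL ? ompd_rc_ok : ompd_rc_bad_input;
}

/* Hypothetical tool-side helper: 1 if the native thread is an OpenMP
 * thread, 0 if it is not, -1 on any other failure. */
static int tool_is_openmp_thread(ompd_address_space_handle_t *space,
                                 uint64_t native_id)
{
    ompd_thread_handle_t *thread = NULL;
    ompd_rc_t rc = ompd_get_thread_handle(space, 0, sizeof native_id,
                                          &native_id, &thread);
    if (rc == ompd_rc_unavailable) return 0;
    if (rc != ompd_rc_ok) return -1;
    ompd_rel_thread_handle(thread);
    return 1;
}
```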
26 Cross References
27 • Native Thread Identifiers, see Section 20.3.7
28 • OMPD Handle Types, see Section 20.3.8
29 • Return Code Types, see Section 20.3.12
30 • Size Type, see Section 20.3.1

31 20.5.5.3 ompd_rel_thread_handle
32 Summary
33 The ompd_rel_thread_handle function releases a thread handle.



1 Format
C
ompd_rc_t ompd_rel_thread_handle(
    ompd_thread_handle_t *thread_handle
);
C
5 Semantics
6 Thread handles are opaque to tools, which therefore cannot release them directly. Instead, when the
7 tool is finished with a thread handle it must pass it to ompd_rel_thread_handle for disposal.
8 Description of Arguments
9 The thread_handle argument is an opaque handle for a thread to be released.
10 Description of Return Codes
11 This routine must return any of the general return codes listed at the beginning of Section 20.5.
12 Cross References
13 • OMPD Handle Types, see Section 20.3.8
14 • Return Code Types, see Section 20.3.12

15 20.5.5.4 ompd_thread_handle_compare
16 Summary
17 The ompd_thread_handle_compare function allows tools to compare two thread handles.
18 Format
C
ompd_rc_t ompd_thread_handle_compare(
    ompd_thread_handle_t *thread_handle_1,
    ompd_thread_handle_t *thread_handle_2,
    int *cmp_value
);
C
24 Semantics
25 The internal structure of thread handles is opaque to a tool. While the tool can easily compare
26 pointers to thread handles, it cannot determine whether handles of two different addresses refer to
27 the same underlying thread. The ompd_thread_handle_compare function compares thread
28 handles.
29 On success, ompd_thread_handle_compare returns in the location to which cmp_value
30 points a signed integer value that indicates how the underlying threads compare: a value less than,
31 equal to, or greater than 0 indicates that the thread corresponding to thread_handle_1 is,
32 respectively, less than, equal to, or greater than that corresponding to thread_handle_2.
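Because two handles at different addresses may refer to the same thread, a tool should test equality through the comparison routine rather than by comparing pointers. A minimal sketch, assuming a mock comparator that orders handles by a made-up native_id field; the struct layout and tool_same_thread are hypothetical.

```c
#include <stddef.h>

/* Stand-ins so the sketch compiles without the OMPD header. */
typedef enum { ompd_rc_ok, ompd_rc_bad_input } ompd_rc_t;
struct mock_thread { int native_id; };
typedef struct mock_thread ompd_thread_handle_t;

/* Mock library routine: orders handles by the thread they refer to. */
static ompd_rc_t ompd_thread_handle_compare(
    ompd_thread_handle_t *thread_handle_1,
    ompd_thread_handle_t *thread_handle_2,
    int *cmp_value)
{
    if (thread_handle_1 == NULL || thread_handle_2 == NULL ||
        cmp_value == NULL)
        return ompd_rc_bad_input;
    *cmp_value = (thread_handle_1->native_id > thread_handle_2->native_id) -
                 (thread_handle_1->native_id < thread_handle_2->native_id);
    return ompd_rc_ok;
}

/* Hypothetical tool-side helper: 1 if both handles refer to the same
 * thread, 0 if they do not, -1 if the comparison failed. */
static int tool_same_thread(ompd_thread_handle_t *a, ompd_thread_handle_t *b)
{
    int cmp = 0;
    if (ompd_thread_handle_compare(a, b, &cmp) != ompd_rc_ok)
        return -1;
    return cmp == 0;   /* only the sign of cmp_value is meaningful */
}
```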



1 Description of Arguments
2 The thread_handle_1 and thread_handle_2 arguments are opaque handles for threads. On return
3 the cmp_value argument is set to a signed integer value.
4 Description of Return Codes
5 This routine must return any of the general return codes listed at the beginning of Section 20.5.

6 Cross References
7 • OMPD Handle Types, see Section 20.3.8
8 • Return Code Types, see Section 20.3.12

9 20.5.5.5 ompd_get_thread_id
10 Summary
11 The ompd_get_thread_id function maps an OMPD thread handle to a native thread.

12 Format
C
ompd_rc_t ompd_get_thread_id(
    ompd_thread_handle_t *thread_handle,
    ompd_thread_id_t kind,
    ompd_size_t sizeof_thread_id,
    void *thread_id
);
C
19 Semantics
20 The ompd_get_thread_id function maps an OMPD thread handle to a native thread identifier.
21 This call yields meaningful results only if the referenced OpenMP thread is stopped.

22 Description of Arguments
The thread_handle argument is an opaque thread handle. The kind argument represents the kind of
the native thread identifier. The sizeof_thread_id argument represents the size of the native thread identifier.
25 On return, the thread_id argument is a buffer that represents a native thread identifier.

26 Description of Return Codes


27 This routine must return any of the general return codes listed at the beginning of Section 20.5 or
28 any of the following return codes:
29 • ompd_rc_bad_input if a different value in sizeof_thread_id is expected for a thread kind of
30 kind; or
31 • ompd_rc_unsupported if the kind of thread is not supported.



1 Cross References
2 • Native Thread Identifiers, see Section 20.3.7
3 • OMPD Handle Types, see Section 20.3.8
4 • Return Code Types, see Section 20.3.12
5 • Size Type, see Section 20.3.1

6 20.5.5.6 ompd_get_device_from_thread
7 Summary
8 The ompd_get_device_from_thread function obtains a pointer to the address space handle
9 for a device on which an OpenMP thread is executing.

10 Format
C
ompd_rc_t ompd_get_device_from_thread(
    ompd_thread_handle_t *thread_handle,
    ompd_address_space_handle_t **device
);
C
15 Semantics
16 The ompd_get_device_from_thread function obtains a pointer to the address space handle
17 for a device on which an OpenMP thread is executing. The returned pointer will be the same as the
18 address space handle pointer that was previously returned by a call to
19 ompd_process_initialize (for a host device) or a call to ompd_device_initialize
20 (for a non-host device). This call yields meaningful results only if the referenced OpenMP thread is
21 stopped.

22 Description of Arguments
23 The thread_handle argument is a pointer to an opaque thread handle that represents an OpenMP
24 thread. On return, the device argument is the address of a pointer to an OMPD address space
25 handle.

26 Description of Return Codes


27 This routine must return any of the general return codes listed at the beginning of Section 20.5.

28 Cross References
29 • OMPD Handle Types, see Section 20.3.8
30 • Return Code Types, see Section 20.3.12



1 20.5.6 Parallel Region Handles
2 20.5.6.1 ompd_get_curr_parallel_handle
3 Summary
4 The ompd_get_curr_parallel_handle function obtains a pointer to the parallel handle for
5 an OpenMP thread’s current parallel region.

6 Format
C
ompd_rc_t ompd_get_curr_parallel_handle(
    ompd_thread_handle_t *thread_handle,
    ompd_parallel_handle_t **parallel_handle
);
C
11 Semantics
12 The ompd_get_curr_parallel_handle function enables the tool to obtain a pointer to the
13 parallel handle for the current parallel region that is associated with an OpenMP thread. This call
14 yields meaningful results only if the referenced OpenMP thread is stopped. The parallel handle is
15 owned by the tool and it must be released by calling ompd_rel_parallel_handle.

16 Description of Arguments
17 The thread_handle argument is an opaque handle for a thread and selects the thread on which to
18 operate. On return, the parallel_handle argument is set to a handle for the parallel region that the
19 associated thread is currently executing, if any.

20 Description of Return Codes


21 This routine must return any of the general return codes listed at the beginning of Section 20.5 or
22 the following return code:
23 • ompd_rc_unavailable if the thread is not currently part of a team.

24 Cross References
25 • OMPD Handle Types, see Section 20.3.8
26 • Return Code Types, see Section 20.3.12
27 • ompd_rel_parallel_handle, see Section 20.5.6.4

28 20.5.6.2 ompd_get_enclosing_parallel_handle
29 Summary
30 The ompd_get_enclosing_parallel_handle function obtains a pointer to the parallel
31 handle for an enclosing parallel region.



1 Format
C
ompd_rc_t ompd_get_enclosing_parallel_handle(
    ompd_parallel_handle_t *parallel_handle,
    ompd_parallel_handle_t **enclosing_parallel_handle
);
C
6 Semantics
7 The ompd_get_enclosing_parallel_handle function enables a tool to obtain a pointer
8 to the parallel handle for the parallel region that encloses the parallel region that
9 parallel_handle specifies. This call is meaningful only if at least one thread in the team that
10 is executing the parallel region is stopped. A pointer to the parallel handle for the enclosing region
11 is returned in the location to which enclosing_parallel_handle points. After the call, the tool owns
12 the handle; the tool must release the handle with ompd_rel_parallel_handle when it is no
13 longer required.
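Repeated calls let a tool walk outward through nested parallel regions until ompd_rc_unavailable signals that no enclosing region exists. The sketch below counts the enclosing regions and releases each handle that the walk itself obtained, leaving the caller's starting handle untouched. The routines are mocks, the parent-pointer struct is a stand-in, and tool_count_enclosing is a hypothetical helper.

```c
#include <stddef.h>

/* Stand-ins so the sketch compiles without the OMPD header. */
typedef enum { ompd_rc_ok, ompd_rc_unavailable, ompd_rc_bad_input } ompd_rc_t;
struct mock_parallel { struct mock_parallel *parent; };
typedef struct mock_parallel ompd_parallel_handle_t;

static int outstanding = 0;   /* handles obtained but not yet released */

/* Mock library routine: follows the parent link of the region. */
static ompd_rc_t ompd_get_enclosing_parallel_handle(
    ompd_parallel_handle_t *parallel_handle,
    ompd_parallel_handle_t **enclosing_parallel_handle)
{
    if (parallel_handle == NULL || enclosing_parallel_handle == NULL)
        return ompd_rc_bad_input;
    if (parallel_handle->parent == NULL)
        return ompd_rc_unavailable;      /* no enclosing region exists */
    *enclosing_parallel_handle = parallel_handle->parent;
    outstanding++;
    return ompd_rc_ok;
}

/* Mock library routine: takes a handle back. */
static ompd_rc_t ompd_rel_parallel_handle(
    ompd_parallel_handle_t *parallel_handle)
{
    if (parallel_handle == NULL) return ompd_rc_bad_input;
    outstanding--;
    return ompd_rc_ok;
}

/* Hypothetical tool-side helper: number of regions that enclose start,
 * or -1 on error. The caller keeps ownership of start. */
static int tool_count_enclosing(ompd_parallel_handle_t *start)
{
    int depth = 0;
    ompd_parallel_handle_t *current = start;
    for (;;) {
        ompd_parallel_handle_t *outer = NULL;
        ompd_rc_t rc = ompd_get_enclosing_parallel_handle(current, &outer);
        if (current != start)
            ompd_rel_parallel_handle(current); /* obtained by this walk */
        if (rc == ompd_rc_unavailable) return depth;
        if (rc != ompd_rc_ok) return -1;
        depth++;
        current = outer;
    }
}
```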
14 Description of Arguments
15 The parallel_handle argument is an opaque handle for a parallel region that selects the parallel
16 region on which to operate. On return, the enclosing_parallel_handle argument is set to a handle
17 for the parallel region that encloses the selected parallel region.
18 Description of Return Codes
19 This routine must return any of the general return codes listed at the beginning of Section 20.5 or
20 the following return code:
21 • ompd_rc_unavailable if no enclosing parallel region exists.
22 Cross References
23 • OMPD Handle Types, see Section 20.3.8
24 • Return Code Types, see Section 20.3.12
25 • ompd_rel_parallel_handle, see Section 20.5.6.4

26 20.5.6.3 ompd_get_task_parallel_handle
27 Summary
28 The ompd_get_task_parallel_handle function obtains a pointer to the parallel handle for
29 the parallel region that encloses a task region.
30 Format
C
ompd_rc_t ompd_get_task_parallel_handle(
    ompd_task_handle_t *task_handle,
    ompd_parallel_handle_t **task_parallel_handle
);
C



1 Semantics
2 The ompd_get_task_parallel_handle function enables a tool to obtain a pointer to the
3 parallel handle for the parallel region that encloses the task region that task_handle specifies. This
4 call yields meaningful results only if at least one thread in the team that is executing the parallel
region is stopped. A pointer to the parallel region's handle is returned in the location to which
6 task_parallel_handle points. The tool owns that parallel handle, which it must release with
7 ompd_rel_parallel_handle.

8 Description of Arguments
9 The task_handle argument is an opaque handle that selects the task on which to operate. On return,
the task_parallel_handle argument is set to a handle for the parallel region that encloses the selected task.

11 Description of Return Codes


12 This routine must return any of the general return codes listed at the beginning of Section 20.5.

13 Cross References
14 • OMPD Handle Types, see Section 20.3.8
15 • Return Code Types, see Section 20.3.12
16 • ompd_rel_parallel_handle, see Section 20.5.6.4

17 20.5.6.4 ompd_rel_parallel_handle
18 Summary
19 The ompd_rel_parallel_handle function releases a parallel region handle.

20 Format
C
ompd_rc_t ompd_rel_parallel_handle(
    ompd_parallel_handle_t *parallel_handle
);
C
24 Semantics
25 Parallel region handles are opaque so tools cannot release them directly. Instead, a tool must pass a
26 parallel region handle to the ompd_rel_parallel_handle function for disposal when
27 finished with it.

28 Description of Arguments
29 The parallel_handle argument is an opaque handle to be released.

30 Description of Return Codes


31 This routine must return any of the general return codes listed at the beginning of Section 20.5.



1 Cross References
2 • OMPD Handle Types, see Section 20.3.8
3 • Return Code Types, see Section 20.3.12

4 20.5.6.5 ompd_parallel_handle_compare
5 Summary
6 The ompd_parallel_handle_compare function compares two parallel region handles.

7 Format
C
ompd_rc_t ompd_parallel_handle_compare(
    ompd_parallel_handle_t *parallel_handle_1,
    ompd_parallel_handle_t *parallel_handle_2,
    int *cmp_value
);
C
13 Semantics
14 The internal structure of parallel region handles is opaque to tools. While tools can easily compare
15 pointers to parallel region handles, they cannot determine whether handles at two different
addresses refer to the same underlying parallel region and instead must use the
17 ompd_parallel_handle_compare function.
18 On success, ompd_parallel_handle_compare returns a signed integer value in the location
19 to which cmp_value points that indicates how the underlying parallel regions compare. A value less
20 than, equal to, or greater than 0 indicates that the region corresponding to parallel_handle_1 is,
21 respectively, less than, equal to, or greater than that corresponding to parallel_handle_2. This
22 function is provided since the means by which parallel region handles are ordered is
23 implementation defined.

24 Description of Arguments
25 The parallel_handle_1 and parallel_handle_2 arguments are opaque handles that correspond to
26 parallel regions. On return the cmp_value argument points to a signed integer value that indicates
27 how the underlying parallel regions compare.

28 Description of Return Codes


29 This routine must return any of the general return codes listed at the beginning of Section 20.5.

30 Cross References
31 • OMPD Handle Types, see Section 20.3.8
32 • Return Code Types, see Section 20.3.12



1 20.5.7 Task Handles
2 20.5.7.1 ompd_get_curr_task_handle
3 Summary
4 The ompd_get_curr_task_handle function obtains a pointer to the task handle for the
5 current task region that is associated with an OpenMP thread.

6 Format
C
ompd_rc_t ompd_get_curr_task_handle(
    ompd_thread_handle_t *thread_handle,
    ompd_task_handle_t **task_handle
);
C
11 Semantics
12 The ompd_get_curr_task_handle function obtains a pointer to the task handle for the
13 current task region that is associated with an OpenMP thread. This call yields meaningful results
14 only if the thread for which the handle is provided is stopped. The task handle must be released
15 with ompd_rel_task_handle.

16 Description of Arguments
17 The thread_handle argument is an opaque handle that selects the thread on which to operate. On
18 return, the task_handle argument points to a location that points to a handle for the task that the
19 thread is currently executing.

20 Description of Return Codes


21 This routine must return any of the general return codes listed at the beginning of Section 20.5 or
22 the following return code:
23 • ompd_rc_unavailable if the thread is currently not executing a task.

24 Cross References
25 • OMPD Handle Types, see Section 20.3.8
26 • Return Code Types, see Section 20.3.12
27 • ompd_rel_task_handle, see Section 20.5.7.5

28 20.5.7.2 ompd_get_generating_task_handle
29 Summary
30 The ompd_get_generating_task_handle function obtains a pointer to the task handle of
31 the generating task region.



1 Format
C
2 ompd_rc_t ompd_get_generating_task_handle(
3 ompd_task_handle_t *task_handle,
4 ompd_task_handle_t **generating_task_handle
5 );
C
6 Semantics
7 The ompd_get_generating_task_handle function obtains a pointer to the task handle for
8 the task that encountered the OpenMP task construct that generated the task represented by
9 task_handle. The generating task is the OpenMP task that was active when the task specified by
10 task_handle was created. This call yields meaningful results only if the thread that is executing the
11 task that task_handle specifies is stopped while executing the task. The generating task handle must
12 be released with ompd_rel_task_handle.
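The release rule above can be illustrated with a minimal sketch that follows generating-task links upward and releases every handle it acquires. The type definitions, the stub bodies, and the names stub_pool, live_handles, and count_generating_tasks are assumptions made so that the sketch compiles on its own; a real tool would take the types from omp-tools.h and call the OMPD library's functions instead.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins so the sketch compiles alone. Real task handles are opaque and
 * the enumerator values here are not those of the real ompd_rc_t. */
typedef enum { ompd_rc_ok, ompd_rc_unavailable, ompd_rc_error } ompd_rc_t;
typedef struct { int depth; } ompd_task_handle_t;

static ompd_task_handle_t stub_pool[8]; /* hypothetical handle storage */
static int live_handles = 0;            /* acquired but not yet released */

/* Stub: a task at depth 0 has no generating task region. */
static ompd_rc_t ompd_get_generating_task_handle(
    ompd_task_handle_t *task_handle,
    ompd_task_handle_t **generating_task_handle) {
  if (task_handle->depth <= 0) return ompd_rc_unavailable;
  if (task_handle->depth > 8) return ompd_rc_error;
  ompd_task_handle_t *parent = &stub_pool[task_handle->depth - 1];
  parent->depth = task_handle->depth - 1;
  live_handles++;
  *generating_task_handle = parent;
  return ompd_rc_ok;
}

/* Stub: only tracks the number of outstanding handles. */
static ompd_rc_t ompd_rel_task_handle(ompd_task_handle_t *task_handle) {
  (void)task_handle;
  live_handles--;
  return ompd_rc_ok;
}

/* Counts the generating tasks above `task`. Handles obtained inside the
 * loop are released inside the loop; `task` stays owned by the caller.
 * Returns -1 if a call fails for a reason other than ompd_rc_unavailable. */
static int count_generating_tasks(ompd_task_handle_t *task) {
  assert(task != NULL);
  int count = 0;
  ompd_task_handle_t *current = task;
  for (;;) {
    ompd_task_handle_t *parent = NULL;
    ompd_rc_t rc = ompd_get_generating_task_handle(current, &parent);
    if (current != task) ompd_rel_task_handle(current); /* done with this link */
    if (rc == ompd_rc_unavailable) return count; /* no generating task region */
    if (rc != ompd_rc_ok) return -1;
    count++;
    current = parent;
  }
}
```

In this sketch ompd_rc_unavailable is treated as the normal end of the walk rather than as a failure, which matches the return code described for this routine.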
13 Description of Arguments
14 The task_handle argument is an opaque handle that selects the task on which to operate. On return,
15 the generating_task_handle argument points to a location that points to a handle for the generating
16 task.
17 Description of Return Codes
18 This routine must return any of the general return codes listed at the beginning of Section 20.5 or
19 the following return code:
20 • ompd_rc_unavailable if no generating task region exists.
21 Cross References
22 • OMPD Handle Types, see Section 20.3.8
23 • Return Code Types, see Section 20.3.12
24 • ompd_rel_task_handle, see Section 20.5.7.5

25 20.5.7.3 ompd_get_scheduling_task_handle
26 Summary
27 The ompd_get_scheduling_task_handle function obtains a task handle for the task that
28 was active at a task scheduling point.
29 Format
C
30 ompd_rc_t ompd_get_scheduling_task_handle(
31 ompd_task_handle_t *task_handle,
32 ompd_task_handle_t **scheduling_task_handle
33 );
C



1 Semantics
2 The ompd_get_scheduling_task_handle function obtains a task handle for the task that
3 was active when the task that task_handle represents was scheduled. An implicit task does not have
4 a scheduling task. This call yields meaningful results only if the thread that is executing the task
5 that task_handle specifies is stopped while executing the task. The scheduling task handle must be
6 released with ompd_rel_task_handle.

7 Description of Arguments
8 The task_handle argument is an opaque handle for a task and selects the task on which to operate.
9 On return, the scheduling_task_handle argument points to a location that points to a handle for the
10 task that is still on the stack of execution on the same thread and was deferred in favor of executing
11 the selected task.

12 Description of Return Codes


13 This routine must return any of the general return codes listed at the beginning of Section 20.5 or
14 the following return code:
15 • ompd_rc_unavailable if no scheduling task exists.

16 Cross References
17 • OMPD Handle Types, see Section 20.3.8
18 • Return Code Types, see Section 20.3.12
19 • ompd_rel_task_handle, see Section 20.5.7.5

20 20.5.7.4 ompd_get_task_in_parallel
21 Summary
22 The ompd_get_task_in_parallel function obtains handles for the implicit tasks that are
23 associated with a parallel region.

24 Format
C
25 ompd_rc_t ompd_get_task_in_parallel(
26 ompd_parallel_handle_t *parallel_handle,
27 int thread_num,
28 ompd_task_handle_t **task_handle
29 );
C
30 Semantics
31 The ompd_get_task_in_parallel function obtains handles for the implicit tasks that are
32 associated with a parallel region. A successful invocation of ompd_get_task_in_parallel
33 returns a pointer to a task handle in the location to which task_handle points. This call yields
34 meaningful results only if all OpenMP threads in the parallel region are stopped.



1 Description of Arguments
2 The parallel_handle argument is an opaque handle that selects the parallel region on which to
3 operate. The thread_num argument selects the implicit task of the team to be returned. The
4 thread_num argument is equal to the thread-num-var ICV value of the selected implicit task. On
5 return, the task_handle argument points to a location that points to an opaque handle for the
6 selected implicit task.

7 Description of Return Codes


8 This routine must return any of the general return codes listed at the beginning of Section 20.5 or
9 the following return code:
10 • ompd_rc_bad_input if the thread_num argument is greater than or equal to the
11 team-size-var ICV or negative.

12 Restrictions
13 Restrictions on the ompd_get_task_in_parallel function are as follows:
14 • The value of thread_num must be a non-negative integer that is smaller than the team size,
15 which is the value of the team-size-var ICV that ompd_get_icv_from_scope returns.
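As a hedged illustration of the restriction on thread_num, the following sketch visits the implicit tasks of a team by passing each value from 0 to the team size minus one. The handle layouts, the stub bodies, and the names stub_tasks, live_handles, and visit_implicit_tasks are hypothetical; in a real tool the team size would come from the team-size-var ICV rather than from a plain parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins so the sketch compiles alone; real handles are opaque. */
typedef enum { ompd_rc_ok, ompd_rc_bad_input } ompd_rc_t;
typedef struct { int team_size; } ompd_parallel_handle_t;
typedef struct { int thread_num; } ompd_task_handle_t;

#define STUB_MAX_TEAM 16
static ompd_task_handle_t stub_tasks[STUB_MAX_TEAM]; /* hypothetical storage */
static int live_handles = 0;

/* Stub: rejects a thread_num that is negative or not below the team size. */
static ompd_rc_t ompd_get_task_in_parallel(
    ompd_parallel_handle_t *parallel_handle, int thread_num,
    ompd_task_handle_t **task_handle) {
  if (thread_num < 0 || thread_num >= parallel_handle->team_size ||
      thread_num >= STUB_MAX_TEAM)
    return ompd_rc_bad_input;
  stub_tasks[thread_num].thread_num = thread_num;
  *task_handle = &stub_tasks[thread_num];
  live_handles++;
  return ompd_rc_ok;
}

/* Stub: only tracks the number of outstanding handles. */
static ompd_rc_t ompd_rel_task_handle(ompd_task_handle_t *task_handle) {
  (void)task_handle;
  live_handles--;
  return ompd_rc_ok;
}

/* Visits implicit tasks 0 .. team_size-1 of `region`.
 * Returns the number of tasks visited, or -1 if a lookup fails. */
static int visit_implicit_tasks(ompd_parallel_handle_t *region, int team_size) {
  assert(region != NULL);
  int visited = 0;
  for (int n = 0; n < team_size; ++n) {
    ompd_task_handle_t *task = NULL;
    if (ompd_get_task_in_parallel(region, n, &task) != ompd_rc_ok) return -1;
    /* A tool would inspect the implicit task here. */
    visited++;
    ompd_rel_task_handle(task);
  }
  return visited;
}
```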

16 Cross References
17 • OMPD Handle Types, see Section 20.3.8
18 • Return Code Types, see Section 20.3.12
19 • ompd_get_icv_from_scope, see Section 20.5.10.2

20 20.5.7.5 ompd_rel_task_handle
21 Summary
22 The ompd_rel_task_handle function releases a task handle.

23 Format
C
24 ompd_rc_t ompd_rel_task_handle(
25 ompd_task_handle_t *task_handle
26 );
C
27 Semantics
28 Task handles are opaque to tools; thus tools cannot release them directly. Instead, when a tool is
29 finished with a task handle it must use the ompd_rel_task_handle function to release it.

30 Description of Arguments
31 The task_handle argument is an opaque task handle to be released.



1 Description of Return Codes
2 This routine must return any of the general return codes listed at the beginning of Section 20.5.

3 Cross References
4 • OMPD Handle Types, see Section 20.3.8
5 • Return Code Types, see Section 20.3.12

6 20.5.7.6 ompd_task_handle_compare
7 Summary
8 The ompd_task_handle_compare function compares task handles.

9 Format
C
10 ompd_rc_t ompd_task_handle_compare(
11 ompd_task_handle_t *task_handle_1,
12 ompd_task_handle_t *task_handle_2,
13 int *cmp_value
14 );
C
15 Semantics
16 The internal structure of task handles is opaque, so tools cannot directly determine if handles at two
17 different addresses refer to the same underlying task. The ompd_task_handle_compare
18 function compares task handles. After a successful call to ompd_task_handle_compare, the
19 value of the location to which cmp_value points is a signed integer that indicates how the underlying
20 tasks compare: a value less than, equal to, or greater than 0 indicates that the task that corresponds
21 to task_handle_1 is, respectively, less than, equal to, or greater than the task that corresponds to
22 task_handle_2. The means by which task handles are ordered is implementation defined.
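The difference between comparing handle addresses and comparing the underlying tasks can be sketched as follows. The handle layout with a task_id field, the stub comparison, and the helper same_task are assumptions made purely for illustration; the real ordering is implementation defined and the real handle is opaque.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins so the sketch compiles alone. */
typedef enum { ompd_rc_ok, ompd_rc_bad_input } ompd_rc_t;
typedef struct { long task_id; } ompd_task_handle_t; /* hypothetical layout */

/* Stub: orders handles by a hypothetical task identifier. */
static ompd_rc_t ompd_task_handle_compare(ompd_task_handle_t *task_handle_1,
                                          ompd_task_handle_t *task_handle_2,
                                          int *cmp_value) {
  if (!task_handle_1 || !task_handle_2 || !cmp_value) return ompd_rc_bad_input;
  *cmp_value = (task_handle_1->task_id > task_handle_2->task_id) -
               (task_handle_1->task_id < task_handle_2->task_id);
  return ompd_rc_ok;
}

/* Returns 1 if both handles refer to the same task, 0 if they do not,
 * and -1 if the comparison itself fails. */
static int same_task(ompd_task_handle_t *a, ompd_task_handle_t *b) {
  assert(a != NULL && b != NULL);
  int cmp = 0;
  if (ompd_task_handle_compare(a, b, &cmp) != ompd_rc_ok) return -1;
  return cmp == 0;
}
```

Two handles at different addresses can therefore compare equal, which is why a tool should test the value stored through cmp_value instead of the handle pointers.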

23 Description of Arguments
24 The task_handle_1 and task_handle_2 arguments are opaque handles that correspond to tasks. On
25 return, the cmp_value argument points to a location in which a signed integer value indicates how
26 the underlying tasks compare.

27 Description of Return Codes


28 This routine must return any of the general return codes listed at the beginning of Section 20.5.

29 Cross References
30 • OMPD Handle Types, see Section 20.3.8
31 • Return Code Types, see Section 20.3.12



1 20.5.7.7 ompd_get_task_function
2 Summary
3 The ompd_get_task_function function returns the entry point of the code that corresponds
4 to the body of a task.
5 Format
C
6 ompd_rc_t ompd_get_task_function (
7 ompd_task_handle_t *task_handle,
8 ompd_address_t *entry_point
9 );
C
10 Semantics
11 The ompd_get_task_function function returns the entry point of the code that corresponds
12 to the body of code that the task executes. This call is meaningful only if the thread that is
13 executing the task that task_handle specifies is stopped while executing the task.
14 Description of Arguments
15 The task_handle argument is an opaque handle that selects the task on which to operate. On return,
16 the entry_point argument is set to an address that describes the beginning of application code that
17 executes the task region.
18 Description of Return Codes
19 This routine must return any of the general return codes listed at the beginning of Section 20.5.
20 Cross References
21 • Address Type, see Section 20.3.4
22 • OMPD Handle Types, see Section 20.3.8
23 • Return Code Types, see Section 20.3.12

24 20.5.7.8 ompd_get_task_frame
25 Summary
26 The ompd_get_task_frame function extracts the frame pointers of a task.
27 Format
C
28 ompd_rc_t ompd_get_task_frame (
29 ompd_task_handle_t *task_handle,
30 ompd_frame_info_t *exit_frame,
31 ompd_frame_info_t *enter_frame
32 );
C



1 Semantics
2 An OpenMP implementation maintains an ompt_frame_t object for every implicit or explicit
3 task. The ompd_get_task_frame function extracts the enter_frame and exit_frame fields of
4 the ompt_frame_t object of the task that task_handle identifies. This call yields meaningful
5 results only if the thread that is executing the task that task_handle specifies is stopped while
6 executing the task.

7 Description of Arguments
8 The task_handle argument specifies an OpenMP task. On return, the exit_frame argument points to
9 an ompd_frame_info_t object that has the frame information with the same semantics as the
10 exit_frame field in the ompt_frame_t object that is associated with the specified task. On return,
11 the enter_frame argument points to an ompd_frame_info_t object that has the frame
12 information with the same semantics as the enter_frame field in the ompt_frame_t object that is
13 associated with the specified task.

14 Description of Return Codes


15 This routine must return any of the general return codes listed at the beginning of Section 20.5.

16 Cross References
17 • Address Type, see Section 20.3.4
18 • Frame Information Type, see Section 20.3.5
19 • OMPD Handle Types, see Section 20.3.8
20 • Return Code Types, see Section 20.3.12
21 • ompt_frame_t, see Section 19.4.4.29

22 20.5.8 Querying Thread States


23 20.5.8.1 ompd_enumerate_states
24 Summary
25 The ompd_enumerate_states function enumerates thread states that an OpenMP
26 implementation supports.

27 Format
C
28 ompd_rc_t ompd_enumerate_states (
29 ompd_address_space_handle_t *address_space_handle,
30 ompd_word_t current_state,
31 ompd_word_t *next_state,
32 const char **next_state_name,
33 ompd_word_t *more_enums
34 );
C



1 Semantics
2 An OpenMP implementation may support only a subset of the states that the ompt_state_t
3 enumeration type defines. In addition, an OpenMP implementation may support
4 implementation-specific states. The ompd_enumerate_states call enables a tool to
5 enumerate the thread states that an OpenMP implementation supports.
6 When the current_state argument is a thread state that an OpenMP implementation supports, the
7 call assigns the value and string name of the next thread state in the enumeration to the locations to
8 which the next_state and next_state_name arguments point.
9 On return, the third-party tool owns the next_state_name string. The OMPD library allocates
10 storage for the string with the memory allocation callback that the tool provides. The tool is
11 responsible for releasing the memory.
12 On return, the location to which the more_enums argument points has the value 1 whenever one or
13 more states are left in the enumeration. On return, the location to which the more_enums argument
14 points has the value 0 when current_state is the last state in the enumeration.

15 Description of Arguments
16 The address_space_handle argument identifies the address space. The current_state argument must
17 be a thread state that the OpenMP implementation supports. To begin enumerating the supported
18 states, a tool should pass ompt_state_undefined as the value of current_state. Subsequent
19 calls to ompd_enumerate_states by the tool should pass the value that the call returned in
20 the next_state argument. On return, the next_state argument points to an integer with the value of
21 the next state in the enumeration. On return, the next_state_name argument points to a character
22 string that describes the next state. On return, the more_enums argument points to an integer with a
23 value of 1 when more states are left to enumerate and a value of 0 when no more states are left.
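The enumeration protocol can be sketched as a loop that starts from ompt_state_undefined and feeds each returned state back in as current_state. This is a sketch under stated assumptions: the stub table of three states, the use of malloc and free in place of the tool-provided allocation callback, and the helper list_states are all hypothetical. The sketch also assumes that every successful call yields a valid next state and that more_enums becomes 0 once that state is the last one; an implementation may read the wording differently.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins so the sketch compiles alone; values are placeholders. */
typedef enum { ompd_rc_ok, ompd_rc_bad_input, ompd_rc_nomem } ompd_rc_t;
typedef int64_t ompd_word_t;
typedef struct { int unused; } ompd_address_space_handle_t;
enum { ompt_state_undefined = 0x102 };

/* Hypothetical set of states supported by the stub. */
static const struct { ompd_word_t value; const char *name; } stub_states[] = {
  { 0x000, "ompt_state_work_serial" },
  { 0x001, "ompt_state_work_parallel" },
  { 0x100, "ompt_state_idle" },
};
#define STUB_NSTATES (sizeof stub_states / sizeof stub_states[0])

static ompd_rc_t ompd_enumerate_states(
    ompd_address_space_handle_t *address_space_handle,
    ompd_word_t current_state, ompd_word_t *next_state,
    const char **next_state_name, ompd_word_t *more_enums) {
  (void)address_space_handle;
  size_t next = 0;
  if (current_state != ompt_state_undefined) {
    size_t i = 0;
    while (i < STUB_NSTATES && stub_states[i].value != current_state) i++;
    if (i + 1 >= STUB_NSTATES) return ompd_rc_bad_input; /* unknown or last */
    next = i + 1;
  }
  char *copy = malloc(strlen(stub_states[next].name) + 1);
  if (copy == NULL) return ompd_rc_nomem;
  strcpy(copy, stub_states[next].name);
  *next_state = stub_states[next].value;
  *next_state_name = copy; /* ownership passes to the tool */
  *more_enums = (next + 1 < STUB_NSTATES);
  return ompd_rc_ok;
}

/* Prints every supported state; returns the number printed or -1. */
static int list_states(ompd_address_space_handle_t *as, FILE *out) {
  assert(as != NULL && out != NULL);
  ompd_word_t current = ompt_state_undefined;
  ompd_word_t more = 1;
  int count = 0;
  while (more) {
    ompd_word_t next = 0;
    const char *name = NULL;
    if (ompd_enumerate_states(as, current, &next, &name, &more) != ompd_rc_ok)
      return -1;
    fprintf(out, "0x%03llx %s\n", (unsigned long long)next, name);
    free((void *)name); /* the tool owns the string and must release it */
    count++;
    current = next;
  }
  return count;
}
```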

24 Description of Return Codes


25 This routine must return any of the general return codes listed at the beginning of Section 20.5 or
26 the following return code:
27 • ompd_rc_bad_input if an unknown value is provided in current_state.

28 Cross References
29 • OMPD Handle Types, see Section 20.3.8
30 • Return Code Types, see Section 20.3.12
31 • ompt_state_t, see Section 19.4.4.28

32 20.5.8.2 ompd_get_state
33 Summary
34 The ompd_get_state function obtains the state of a thread.



1 Format
C
2 ompd_rc_t ompd_get_state (
3 ompd_thread_handle_t *thread_handle,
4 ompd_word_t *state,
5 ompd_wait_id_t *wait_id
6 );
C
7 Semantics
8 The ompd_get_state function returns the state of an OpenMP thread. This call yields
9 meaningful results only if the referenced OpenMP thread is stopped.
10 Description of Arguments
11 The thread_handle argument identifies the thread. On return, the state argument points to the state of
12 that thread, expressed as a value that ompd_enumerate_states returns. On return, if the
13 wait_id argument is non-null then it points to a handle that corresponds to the wait_id wait
14 identifier of the thread. If the thread state is not one of the specified wait states, the value to which
15 wait_id points is undefined.
16 Description of Return Codes
17 This routine must return any of the general return codes listed at the beginning of Section 20.5.
18 Cross References
19 • OMPD Handle Types, see Section 20.3.8
20 • Return Code Types, see Section 20.3.12
21 • Wait ID Type, see Section 20.3.2
22 • ompd_enumerate_states, see Section 20.5.8.1

23 20.5.9 Display Control Variables


24 20.5.9.1 ompd_get_display_control_vars
25 Summary
26 The ompd_get_display_control_vars function returns a list of name/value pairs for
27 OpenMP control variables.
28 Format
C
29 ompd_rc_t ompd_get_display_control_vars (
30 ompd_address_space_handle_t *address_space_handle,
31 const char * const **control_vars
32 );
C



1 Semantics
2 The ompd_get_display_control_vars function returns a NULL-terminated vector of
3 null-terminated strings of name/value pairs of control variables that have user controllable settings
4 and are important to the operation or performance of an OpenMP runtime system. The control
5 variables that this interface exposes include all OpenMP environment variables, settings that may
6 come from vendor or platform-specific environment variables, and other settings that affect the
7 operation or functioning of an OpenMP runtime.
8 The format of the strings is "icv-name=icv-value".
9 On return, the third-party tool owns the vector and the strings. The OMPD library must satisfy the
10 termination constraints; it may use static or dynamic memory for the vector and/or the strings and is
11 unconstrained in how it arranges them in memory. If it uses dynamic memory then the OMPD
12 library must use the allocate callback that the tool provides to ompd_initialize. The tool must
13 use the ompd_rel_display_control_vars function to release the vector and the strings.
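A tool might search the returned vector as in the following minimal sketch, which relies only on the NULL-terminated layout and the "icv-name=icv-value" string format described above. The helper name control_var_value is hypothetical and not part of OMPD.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Looks up `name` in a NULL-terminated vector of "name=value" strings.
 * Returns a pointer to the value part inside the vector, or NULL if the
 * name is absent. The pointer stays valid only until the vector is released. */
static const char *control_var_value(const char *const *control_vars,
                                     const char *name) {
  assert(control_vars != NULL && name != NULL);
  size_t len = strlen(name);
  for (const char *const *entry = control_vars; *entry != NULL; ++entry) {
    /* Require the '=' directly after the name so prefixes do not match. */
    if (strncmp(*entry, name, len) == 0 && (*entry)[len] == '=')
      return *entry + len + 1;
  }
  return NULL;
}
```

Because the returned pointer aliases storage in the vector, a tool that needs the value after calling ompd_rel_display_control_vars should copy it first.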

14 Description of Arguments
15 The address_space_handle argument identifies the address space. On return, the control_vars
16 argument points to the vector of display control variables.

17 Description of Return Codes


18 This routine must return any of the general return codes listed at the beginning of Section 20.5.

19 Cross References
20 • OMPD Handle Types, see Section 20.3.8
21 • Return Code Types, see Section 20.3.12
22 • ompd_initialize, see Section 20.5.1.1
23 • ompd_rel_display_control_vars, see Section 20.5.9.2

24 20.5.9.2 ompd_rel_display_control_vars
25 Summary
26 The ompd_rel_display_control_vars function releases a list of name/value pairs of OpenMP
27 control variables previously acquired with ompd_get_display_control_vars.

28 Format
C
29 ompd_rc_t ompd_rel_display_control_vars (
30 const char * const **control_vars
31 );
C



1 Semantics
2 The third-party tool owns the vector and strings that ompd_get_display_control_vars
3 returns. The tool must call ompd_rel_display_control_vars to release the vector and the
4 strings.
5 Description of Arguments
6 The control_vars argument is the vector of display control variables to be released.
7 Description of Return Codes
8 This routine must return any of the general return codes listed at the beginning of Section 20.5.

9 Cross References
10 • Return Code Types, see Section 20.3.12
11 • ompd_get_display_control_vars, see Section 20.5.9.1

12 20.5.10 Accessing Scope-Specific Information


13 20.5.10.1 ompd_enumerate_icvs
14 Summary
15 The ompd_enumerate_icvs function enumerates ICVs.
16 Format
C
17 ompd_rc_t ompd_enumerate_icvs (
18 ompd_address_space_handle_t *handle,
19 ompd_icv_id_t current,
20 ompd_icv_id_t *next_id,
21 const char **next_icv_name,
22 ompd_scope_t *next_scope,
23 int *more
24 );
C
25 Semantics
26 An OpenMP implementation must support all ICVs listed in Section 2.1. An OpenMP
27 implementation may support additional implementation-specific variables. An implementation may
28 store ICVs in a different scope than Table 2.1 indicates. The ompd_enumerate_icvs function
29 enables a tool to enumerate the ICVs that an OpenMP implementation supports and their related
30 scopes. The ICVs num-procs-var, thread-num-var, final-task-var, explicit-task-var and
31 team-size-var must also be available with an ompd- prefix; this requirement has been deprecated.
32 When the current argument is set to the identifier of a supported ICV, ompd_enumerate_icvs
33 assigns the value, string name, and scope of the next ICV in the enumeration to the locations to
34 which the next_id, next_icv_name, and next_scope arguments point. On return, the third-party tool
35 owns the next_icv_name string. The OMPD library uses the memory allocation callback that the
36 tool provides to allocate the string storage; the tool is responsible for releasing the memory.



1 On return, the location to which the more argument points has the value of 1 whenever one or more
2 ICVs are left in the enumeration. On return, that location has the value 0 when current is the last
3 ICV in the enumeration.

4 Description of Arguments
5 The handle argument identifies the address space. The current argument must be
6 an ICV that the OpenMP implementation supports. To begin enumerating the ICVs, a tool should
7 pass ompd_icv_undefined as the value of current. Subsequent calls to
8 ompd_enumerate_icvs should pass the value returned by the call in the next_id output
9 argument. On return, the next_id argument points to an integer with the value of the ID of the next
10 ICV in the enumeration. On return, the next_icv_name argument points to a character string with
11 the name of the next ICV. On return, the next_scope argument points to the scope enum value of the
12 scope of the next ICV. On return, the more argument points to an integer with the value of 1
13 when more ICVs are left to enumerate and the value of 0 when no more ICVs are left.

14 Description of Return Codes


15 This routine must return any of the general return codes listed at the beginning of Section 20.5 or
16 the following return code:
17 • ompd_rc_bad_input if an unknown value is provided in current.

18 Cross References
19 • ICV ID Type, see Section 20.3.10
20 • OMPD Handle Types, see Section 20.3.8
21 • OMPD Scope Types, see Section 20.3.9
22 • Return Code Types, see Section 20.3.12

23 20.5.10.2 ompd_get_icv_from_scope
24 Summary
25 The ompd_get_icv_from_scope function returns the value of an ICV.

26 Format
C
27 ompd_rc_t ompd_get_icv_from_scope (
28 void *handle,
29 ompd_scope_t scope,
30 ompd_icv_id_t icv_id,
31 ompd_word_t *icv_value
32 );
C



1 Semantics
2 The ompd_get_icv_from_scope function provides access to the ICVs that
3 ompd_enumerate_icvs identifies.

4 Description of Arguments
5 The handle argument provides an OpenMP scope handle. The scope argument specifies the kind of
6 scope provided in handle. The icv_id argument specifies the ID of the requested ICV. On return,
7 the icv_value argument points to a location with the value of the requested ICV.
8 Constraints on Arguments
9 The provided handle must match the scope as defined in Section 20.3.9.
10 The provided scope must match the scope for icv_id as reported by ompd_enumerate_icvs.
11 Description of Return Codes
12 This routine must return any of the general return codes listed at the beginning of Section 20.5 or
13 any of the following return codes:
14 • ompd_rc_incompatible if the ICV cannot be represented as an integer;
15 • ompd_rc_incomplete if only the first item of the ICV is returned in the integer (e.g., if
16 nthreads-var is a list); or
17 • ompd_rc_bad_input if an unknown value is provided in icv_id.
18 Cross References
19 • ICV ID Type, see Section 20.3.10
20 • OMPD Handle Types, see Section 20.3.8
21 • OMPD Scope Types, see Section 20.3.9
22 • Return Code Types, see Section 20.3.12
23 • ompd_enumerate_icvs, see Section 20.5.10.1

24 20.5.10.3 ompd_get_icv_string_from_scope
25 Summary
26 The ompd_get_icv_string_from_scope function returns the value of an ICV as a string.
27 Format
C
28 ompd_rc_t ompd_get_icv_string_from_scope (
29 void *handle,
30 ompd_scope_t scope,
31 ompd_icv_id_t icv_id,
32 const char **icv_string
33 );
C



1 Semantics
2 The ompd_get_icv_string_from_scope function provides access to the ICVs that
3 ompd_enumerate_icvs identifies.
4 Description of Arguments
5 The handle argument provides an OpenMP scope handle. The scope argument specifies the kind of
6 scope provided in handle. The icv_id argument specifies the ID of the requested ICV. On return,
7 the icv_string argument points to a string representation of the requested ICV.
8 On return, the third-party tool owns the icv_string string. The OMPD library allocates the string
9 storage with the memory allocation callback that the tool provides. The tool is responsible for
10 releasing the memory.
11 Constraints on Arguments
12 The provided handle must match the scope as defined in Section 20.3.9.
13 The provided scope must match the scope for icv_id as reported by ompd_enumerate_icvs.
14 Description of Return Codes
15 This routine must return any of the general return codes listed at the beginning of Section 20.5 or
16 the following return code:
17 • ompd_rc_bad_input if an unknown value is provided in icv_id.
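One way a tool could combine the integer and string queries is sketched below: it asks for the integer value first and falls back to the string form when the integer query reports ompd_rc_incompatible or ompd_rc_incomplete. The stub ICV identifiers, the stub values, the int stand-in for ompd_scope_t, the use of malloc and free in place of the tool-provided allocation callback, and the helper format_icv are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins so the sketch compiles alone; values are placeholders. */
typedef enum { ompd_rc_ok, ompd_rc_bad_input, ompd_rc_incompatible,
               ompd_rc_incomplete, ompd_rc_nomem } ompd_rc_t;
typedef int64_t ompd_word_t;
typedef uint64_t ompd_icv_id_t;
typedef int ompd_scope_t; /* stand-in; the real type is an enumeration */

/* Hypothetical ICV identifiers served by the stubs. */
enum { STUB_ICV_INT = 1, STUB_ICV_LIST = 2, STUB_ICV_TEXT = 3 };

static ompd_rc_t ompd_get_icv_from_scope(void *handle, ompd_scope_t scope,
                                         ompd_icv_id_t icv_id,
                                         ompd_word_t *icv_value) {
  (void)handle; (void)scope;
  switch (icv_id) {
  case STUB_ICV_INT:  *icv_value = 8; return ompd_rc_ok;
  case STUB_ICV_LIST: *icv_value = 4; return ompd_rc_incomplete; /* first item only */
  case STUB_ICV_TEXT: return ompd_rc_incompatible; /* not an integer */
  default:            return ompd_rc_bad_input;
  }
}

static ompd_rc_t ompd_get_icv_string_from_scope(void *handle, ompd_scope_t scope,
                                                ompd_icv_id_t icv_id,
                                                const char **icv_string) {
  (void)handle; (void)scope;
  const char *text = icv_id == STUB_ICV_INT  ? "8"
                   : icv_id == STUB_ICV_LIST ? "4,2"
                   : icv_id == STUB_ICV_TEXT ? "spread" : NULL;
  if (text == NULL) return ompd_rc_bad_input;
  char *copy = malloc(strlen(text) + 1);
  if (copy == NULL) return ompd_rc_nomem;
  strcpy(copy, text);
  *icv_string = copy; /* ownership passes to the tool */
  return ompd_rc_ok;
}

/* Writes a printable form of the ICV into `out`. Returns 0 on success, -1 otherwise. */
static int format_icv(void *handle, ompd_scope_t scope, ompd_icv_id_t icv_id,
                      char *out, size_t out_size) {
  assert(out != NULL && out_size > 0);
  ompd_word_t value = 0;
  ompd_rc_t rc = ompd_get_icv_from_scope(handle, scope, icv_id, &value);
  if (rc == ompd_rc_ok) {
    snprintf(out, out_size, "%lld", (long long)value);
    return 0;
  }
  if (rc == ompd_rc_incompatible || rc == ompd_rc_incomplete) {
    const char *text = NULL;
    if (ompd_get_icv_string_from_scope(handle, scope, icv_id, &text) != ompd_rc_ok)
      return -1;
    snprintf(out, out_size, "%s", text);
    free((void *)text); /* the tool owns the string and must release it */
    return 0;
  }
  return -1;
}
```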
18 Cross References
19 • ICV ID Type, see Section 20.3.10
20 • OMPD Handle Types, see Section 20.3.8
21 • OMPD Scope Types, see Section 20.3.9
22 • Return Code Types, see Section 20.3.12
23 • ompd_enumerate_icvs, see Section 20.5.10.1

24 20.5.10.4 ompd_get_tool_data
25 Summary
26 The ompd_get_tool_data function provides access to the OMPT data variable stored for each
27 OpenMP scope.
28 Format
C
29 ompd_rc_t ompd_get_tool_data(
30 void* handle,
31 ompd_scope_t scope,
32 ompd_word_t *value,
33 ompd_address_t *ptr
34 );
C



1 Semantics
2 The ompd_get_tool_data function provides access to the OMPT tool data stored for each
3 scope. If the runtime library does not support OMPT then the function returns
4 ompd_rc_unsupported.
5 Description of Arguments
6 The handle argument provides an OpenMP scope handle. The scope argument specifies the kind of
7 scope provided in handle. On return, the value argument points to the value field of the
8 ompt_data_t union stored for the selected scope. On return, the ptr argument points to the ptr
9 field of the ompt_data_t union stored for the selected scope.
10 Description of Return Codes
11 This routine must return any of the general return codes listed at the beginning of Section 20.5 or
12 the following return code:
13 • ompd_rc_unsupported if the runtime library does not support OMPT.
14 Cross References
15 • OMPD Handle Types, see Section 20.3.8
16 • OMPD Scope Types, see Section 20.3.9
17 • Return Code Types, see Section 20.3.12
18 • ompt_data_t, see Section 19.4.4.4

19 20.6 Runtime Entry Points for OMPD


20 The OpenMP implementation must define several entry point symbols through which execution
21 must pass when particular events occur and data collection for OMPD is enabled. A tool can enable
22 notification of an event by setting a breakpoint at the address of the entry point symbol.
23 Entry point symbols have external C linkage and do not require demangling or other
24 transformations to look up their names to obtain the address in the OpenMP program. While each
25 entry point symbol conceptually has a function type signature, it may not be a function. It may be a
26 labeled location.
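A tool that sets breakpoints on these symbols needs some mapping from symbol name to event. The table below is a minimal sketch of such a mapping using the entry point names defined in this section; the table name ompd_entry_points, the event descriptions, and the helper entry_point_event are hypothetical, and resolving a symbol to an address and planting the breakpoint is left to the debugger.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Entry point symbols paired with a short, informal event description. */
static const struct { const char *symbol; const char *event; }
ompd_entry_points[] = {
  { "ompd_bp_parallel_begin", "parallel region begins" },
  { "ompd_bp_parallel_end",   "parallel region ends" },
  { "ompd_bp_task_begin",     "task region begins" },
  { "ompd_bp_task_end",       "task region ends" },
  { "ompd_bp_thread_begin",   "OpenMP thread begins" },
  { "ompd_bp_thread_end",     "OpenMP thread ends" },
  { "ompd_bp_device_begin",   "device is initialized" },
  { "ompd_bp_device_end",     "device is finalized" },
};

/* Returns the event description for a breakpoint symbol, or NULL when the
 * symbol is not one of the OMPD entry points listed above. */
static const char *entry_point_event(const char *symbol) {
  assert(symbol != NULL);
  size_t n = sizeof ompd_entry_points / sizeof ompd_entry_points[0];
  for (size_t i = 0; i < n; ++i) {
    if (strcmp(ompd_entry_points[i].symbol, symbol) == 0)
      return ompd_entry_points[i].event;
  }
  return NULL;
}
```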

27 20.6.1 Beginning Parallel Regions


28 Summary
29 Before starting the execution of an OpenMP parallel region, the implementation executes
30 ompd_bp_parallel_begin.
31 Format
C
32 void ompd_bp_parallel_begin(void);
C



1 Semantics
2 The OpenMP implementation must execute ompd_bp_parallel_begin at every
3 parallel-begin event. At the point that the implementation reaches
4 ompd_bp_parallel_begin, the binding for ompd_get_curr_parallel_handle is the
5 parallel region that is beginning and the binding for ompd_get_curr_task_handle is the
6 task that encountered the parallel construct.
7 Cross References
8 • ompd_get_curr_parallel_handle, see Section 20.5.6.1
9 • ompd_get_curr_task_handle, see Section 20.5.7.1
10 • parallel directive, see Section 10.1

11 20.6.2 Ending Parallel Regions


12 Summary
13 After finishing the execution of an OpenMP parallel region, the implementation executes
14 ompd_bp_parallel_end.
15 Format
C
16 void ompd_bp_parallel_end(void);
C
17 Semantics
18 The OpenMP implementation must execute ompd_bp_parallel_end at every parallel-end
19 event. At the point that the implementation reaches ompd_bp_parallel_end, the binding for
20 ompd_get_curr_parallel_handle is the parallel region that is ending and the binding
21 for ompd_get_curr_task_handle is the task that encountered the parallel construct.
22 After execution of ompd_bp_parallel_end, any parallel_handle that was acquired for the
23 parallel region is invalid and should be released.
24 Cross References
25 • ompd_get_curr_parallel_handle, see Section 20.5.6.1
26 • ompd_get_curr_task_handle, see Section 20.5.7.1
27 • ompd_rel_parallel_handle, see Section 20.5.6.4
28 • parallel directive, see Section 10.1

29 20.6.3 Beginning Task Regions


30 Summary
31 Before starting the execution of an OpenMP task region, the implementation executes
32 ompd_bp_task_begin.



1 Format
C
2 void ompd_bp_task_begin(void);
C
3 Semantics
4 The OpenMP implementation must execute ompd_bp_task_begin immediately before starting
5 execution of a structured-block that is associated with a non-merged task. At the point that the
6 implementation reaches ompd_bp_task_begin, the binding for
7 ompd_get_curr_task_handle is the task that is scheduled to execute.

8 Cross References
9 • ompd_get_curr_task_handle, see Section 20.5.7.1

10 20.6.4 Ending Task Regions


11 Summary
12 After finishing the execution of an OpenMP task region, the implementation executes
13 ompd_bp_task_end.

14 Format
C
15 void ompd_bp_task_end(void);
C
16 Semantics
17 The OpenMP implementation must execute ompd_bp_task_end immediately after completion
18 of a structured-block that is associated with a non-merged task. At the point that the implementation
19 reaches ompd_bp_task_end, the binding for ompd_get_curr_task_handle is the task
20 that finished execution. After execution of ompd_bp_task_end, any task_handle that was
21 acquired for the task region is invalid and should be released.

22 Cross References
23 • ompd_get_curr_task_handle, see Section 20.5.7.1
24 • ompd_rel_task_handle, see Section 20.5.7.5

25 20.6.5 Beginning OpenMP Threads


26 Summary
27 When starting an OpenMP thread, the implementation executes ompd_bp_thread_begin.



1 Format
C
2 void ompd_bp_thread_begin(void);
C
3 Semantics
4 The OpenMP implementation must execute ompd_bp_thread_begin at every
5 native-thread-begin and initial-thread-begin event. This execution occurs before the thread starts
6 the execution of any OpenMP region.

7 Cross References
8 • Initial Task, see Section 12.8
9 • parallel directive, see Section 10.1

10 20.6.6 Ending OpenMP Threads


11 Summary
12 When terminating an OpenMP thread, the implementation executes ompd_bp_thread_end.

13 Format
C
14 void ompd_bp_thread_end(void);
C
15 Semantics
16 The OpenMP implementation must execute ompd_bp_thread_end at every native-thread-end
17 and initial-thread-end event. This execution occurs after the thread completes the execution of all
18 OpenMP regions. After executing ompd_bp_thread_end, any thread_handle that was acquired
19 for this thread is invalid and should be released.

20 Cross References
21 • Initial Task, see Section 12.8
22 • ompd_rel_thread_handle, see Section 20.5.5.3
23 • parallel directive, see Section 10.1

24 20.6.7 Initializing OpenMP Devices


25 Summary
26 The OpenMP implementation must execute ompd_bp_device_begin at every device-initialize
27 event.



1 Format
C
2 void ompd_bp_device_begin(void);
C
3 Semantics
4 When initializing a device for execution of a target region, the implementation must execute
5 ompd_bp_device_begin. This execution occurs before the work associated with any OpenMP
6 region executes on the device.

7 Cross References
8 • Device Initialization, see Section 13.4

9 20.6.8 Finalizing OpenMP Devices


10 Summary
11 When finalizing an OpenMP device, the implementation executes ompd_bp_device_end.

12 Format
C
13 void ompd_bp_device_end(void);
C
14 Semantics
15 The OpenMP implementation must execute ompd_bp_device_end at every device-finalize
16 event. This execution occurs after the device executes all OpenMP regions. After execution of
17 ompd_bp_device_end, any address_space_handle that was acquired for this device is invalid
18 and should be released.

19 Cross References
20 • Device Initialization, see Section 13.4
21 • ompd_rel_address_space_handle, see Section 20.5.2.3



1 21 Environment Variables
2 This chapter describes the OpenMP environment variables that specify the settings of the ICVs that
3 affect the execution of OpenMP programs (see Chapter 2). The names of the environment variables
4 must be upper case. Unless otherwise specified, the values assigned to the environment variables
5 are case insensitive and may have leading and trailing white space. Modifications to the
6 environment variables after the program has started, even if modified by the program itself, are
7 ignored by the OpenMP implementation. However, the settings of some of the ICVs can be
8 modified during the execution of the OpenMP program by the use of the appropriate directive
9 clauses or OpenMP API routines.
10 The following examples demonstrate how the OpenMP environment variables can be set in
11 different environments:
• csh-like shells:
    setenv OMP_SCHEDULE "dynamic"

• bash-like shells:
    export OMP_SCHEDULE="dynamic"

• Windows Command Line:
    set OMP_SCHEDULE=dynamic

18 As defined following Table 2.2 in Section 2.2, device-specific environment variables extend many
19 of the environment variables defined in this chapter. If the corresponding environment variable for
20 a specific device number, including the host device, is set, then the setting for that environment
21 variable is used to set the value of the associated ICV of the device with the corresponding device
22 number. If the corresponding environment variable that includes the _DEV suffix but no device
23 number is set, then the setting of that environment variable is used to set the value of the associated
24 ICV of any non-host device for which the device-number-specific corresponding environment
25 variable is not set. In all cases the setting of an environment variable for which a device number is
26 specified takes precedence.
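
The precedence rules above can be illustrated with a minimal Python sketch. It is not part of the OpenMP API. The function name device_env_setting and its parameters are hypothetical, and the spellings NAME_DEV_<n> and NAME_DEV are an assumption here, since the naming is defined in Section 2.2 rather than in this chapter.

```python
import os


def device_env_setting(name, device_num, is_host, env=None):
    """Return the device-specific setting that applies to one device, or None.

    Assumption: device-specific variables are spelled NAME_DEV_<n> and
    NAME_DEV (see Section 2.2 for the authoritative naming).
    """
    env = os.environ if env is None else env
    # A variable that names the device number always takes precedence,
    # and it may name the host device as well.
    specific = env.get(f"{name}_DEV_{device_num}")
    if specific is not None:
        return specific
    # The _DEV form without a number only applies to non-host devices.
    if not is_host:
        return env.get(f"{name}_DEV")
    return None
```

With OMP_NUM_TEAMS_DEV=8 and OMP_NUM_TEAMS_DEV_1=4 in the environment, this sketch selects 4 for device 1 and 8 for any other non-host device, and it reports no device-specific setting for the host device.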

27 Restrictions
28 Restrictions to device-specific environment variables are as follows:
29 • Device-specific environment variables must not correspond to environment variables that
30 initialize ICVs with global scope.

1 21.1 Parallel Region Environment Variables
2 This section defines environment variables that affect the operation of parallel regions.

3 21.1.1 OMP_DYNAMIC
4 The OMP_DYNAMIC environment variable controls dynamic adjustment of the number of threads
5 to use for executing parallel regions by setting the initial value of the dyn-var ICV.
6 The value of this environment variable must be one of the following:
7 true | false
8 If the environment variable is set to true, the OpenMP implementation may adjust the number of
9 threads to use for executing parallel regions in order to optimize the use of system resources. If
10 the environment variable is set to false, the dynamic adjustment of the number of threads is
11 disabled. The behavior of the program is implementation defined if the value of OMP_DYNAMIC is
12 neither true nor false.
13 Example:
setenv OMP_DYNAMIC true

15 Cross References
16 • omp_get_dynamic, see Section 18.2.7
17 • omp_set_dynamic, see Section 18.2.6
18 • dyn-var ICV, see Table 2.1
19 • parallel directive, see Section 10.1

20 21.1.2 OMP_NUM_THREADS
21 The OMP_NUM_THREADS environment variable sets the number of threads to use for parallel
22 regions by setting the initial value of the nthreads-var ICV. See Chapter 2 for a comprehensive set
23 of rules about the interaction between the OMP_NUM_THREADS environment variable, the
24 num_threads clause, the omp_set_num_threads library routine and dynamic adjustment of
25 threads, and Section 10.1.1 for a complete algorithm that describes how the number of threads for a
26 parallel region is determined.
27 The value of this environment variable must be a list of positive integer values. The values of the
28 list set the number of threads to use for parallel regions at the corresponding nested levels.
29 The behavior of the program is implementation defined if any value of the list specified in the
30 OMP_NUM_THREADS environment variable leads to a number of threads that is greater than an
31 implementation can support, or if any value is not a positive integer.

The OMP_NUM_THREADS environment variable sets the max-active-levels-var ICV to the number
of active levels of parallelism that the implementation supports if the OMP_NUM_THREADS
environment variable is set to a comma-separated list of more than one value. The value of the
max-active-levels-var ICV may be overridden by setting OMP_MAX_ACTIVE_LEVELS or
OMP_NESTED. See Section 21.1.4 and Section 21.1.5 for details.
6 Example:
setenv OMP_NUM_THREADS 4,3,2
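
As an illustration only, the following Python sketch shows one way the list value could be interpreted. The function names are hypothetical, and raising ValueError for a malformed value is a choice made for this sketch; the specification leaves that behavior implementation defined.

```python
def parse_num_threads(value):
    """Parse an OMP_NUM_THREADS value into one thread count per nesting level."""
    counts = []
    for item in value.split(","):
        n = int(item.strip())  # raises ValueError for non-integers
        if n <= 0:
            raise ValueError(f"not a positive integer: {item!r}")
        counts.append(n)
    return counts


def requests_nesting(counts):
    # A list of more than one value causes max-active-levels-var to be set
    # to the number of active levels that the implementation supports.
    return len(counts) > 1
```

For the example value 4,3,2 the sketch yields the counts 4, 3 and 2 for nesting levels one to three, and requests_nesting reports that the list has more than one entry.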

8 Cross References
9 • OMP_MAX_ACTIVE_LEVELS, see Section 21.1.4
10 • OMP_NESTED (Deprecated), see Section 21.1.5
11 • omp_set_num_threads, see Section 18.2.1
12 • nthreads-var ICV, see Table 2.1
13 • num_threads clause, see Section 10.1.2
14 • parallel directive, see Section 10.1

15 21.1.3 OMP_THREAD_LIMIT
16 The OMP_THREAD_LIMIT environment variable sets the maximum number of OpenMP threads
17 to use in a contention group by setting the thread-limit-var ICV. The value of this environment
18 variable must be a positive integer. The behavior of the program is implementation defined if the
19 requested value of OMP_THREAD_LIMIT is greater than the number of threads an implementation
20 can support, or if the value is not a positive integer.

21 Cross References
22 • thread-limit-var ICV, see Table 2.1

23 21.1.4 OMP_MAX_ACTIVE_LEVELS
24 The OMP_MAX_ACTIVE_LEVELS environment variable controls the maximum number of nested
25 active parallel regions by setting the initial value of the max-active-levels-var ICV. The value
26 of this environment variable must be a non-negative integer. The behavior of the program is
27 implementation defined if the requested value of OMP_MAX_ACTIVE_LEVELS is greater than the
28 maximum number of nested active parallel levels an implementation can support, or if the value is
29 not a non-negative integer.

30 Cross References
31 • max-active-levels-var ICV, see Table 2.1

1 21.1.5 OMP_NESTED (Deprecated)
2 The OMP_NESTED environment variable controls nested parallelism by setting the initial value of
3 the max-active-levels-var ICV. If the environment variable is set to true, the initial value of
4 max-active-levels-var is set to the number of active levels of parallelism supported by the
5 implementation. If the environment variable is set to false, the initial value of
6 max-active-levels-var is set to 1. The behavior of the program is implementation defined if the
7 value of OMP_NESTED is neither true nor false.
8 If both the OMP_NESTED and OMP_MAX_ACTIVE_LEVELS environment variables are set, the
9 value of OMP_NESTED is false, and the value of OMP_MAX_ACTIVE_LEVELS is greater than
10 1, then the behavior is implementation defined. Otherwise, if both environment variables are set
11 then the OMP_NESTED environment variable has no effect.
12 The OMP_NESTED environment variable has been deprecated.
13 Example:
setenv OMP_NESTED false
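
The interaction between OMP_NESTED and OMP_MAX_ACTIVE_LEVELS can be summarized in a short Python sketch. It is a reading aid under stated assumptions, not a definitive algorithm: the function name is hypothetical, supported_levels and default_levels stand in for implementation-specific values, None marks the cases that the text leaves implementation defined, and the effect of list values in OMP_NUM_THREADS or OMP_PROC_BIND is not modeled.

```python
def initial_max_active_levels(env, supported_levels, default_levels):
    """Return the initial max-active-levels-var, or None where the text
    leaves the outcome implementation defined."""
    nested = env.get("OMP_NESTED")
    if nested is not None:
        nested = nested.strip().lower()
        if nested not in ("true", "false"):
            return None  # neither true nor false: implementation defined
    max_levels = env.get("OMP_MAX_ACTIVE_LEVELS")
    if max_levels is not None:
        levels = int(max_levels)
        if nested == "false" and levels > 1:
            return None  # conflicting settings: implementation defined
        return levels  # otherwise OMP_NESTED has no effect
    if nested == "true":
        return supported_levels
    if nested == "false":
        return 1
    return default_levels
```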

15 Cross References
16 • OMP_MAX_ACTIVE_LEVELS, see Section 21.1.4
17 • max-active-levels-var ICV, see Table 2.1

18 21.1.6 OMP_PLACES
19 The OMP_PLACES environment variable sets the initial value of the place-partition-var ICV. A list
20 of places can be specified in the OMP_PLACES environment variable. The value of OMP_PLACES
21 can be one of two types of values: either an abstract name that describes a set of places or an
22 explicit list of places described by non-negative numbers.
23 The OMP_PLACES environment variable can be defined using an explicit ordered list of
24 comma-separated places. A place is defined by an unordered set of comma-separated non-negative
25 numbers enclosed by braces, or a non-negative number. The meaning of the numbers and how the
26 numbering is done are implementation defined. Generally, the numbers represent the smallest unit
27 of execution exposed by the execution environment, typically a hardware thread.
28 Intervals may also be used to define places. Intervals can be specified using the <lower-bound> :
29 <length> : <stride> notation to represent the following list of numbers: “<lower-bound>,
30 <lower-bound> + <stride>, ..., <lower-bound> + (<length> - 1)*<stride>.” When <stride> is
31 omitted, a unit stride is assumed. Intervals can specify numbers within a place as well as sequences
32 of places.
33 An exclusion operator “!” can also be used to exclude the number or place immediately following
34 the operator.

1 Alternatively, the abstract names listed in Table 21.1 should be understood by the execution and
2 runtime environment. The precise definitions of the abstract names are implementation defined. An
3 implementation may also add abstract names as appropriate for the target platform.
A positive number in parentheses may be appended to the abstract name to denote the length of the
place list to be created, that is, abstract_name(num-places). When requesting fewer places than
available on the system, the determination of which resources of type abstract_name are to be
included in the place list is implementation defined. When requesting more resources than
available, the length of the place list is implementation defined.

TABLE 21.1: Predefined Abstract Names for OMP_PLACES

Abstract Name   Meaning
threads         Each place corresponds to a single hardware thread on the device.
cores           Each place corresponds to a single core (having one or more hardware threads)
                on the device.
ll_caches       Each place corresponds to a set of cores that share the last level cache on the
                device.
numa_domains    Each place corresponds to a set of cores for which their closest memory on the
                device is:
                • the same memory; and
                • at a similar distance from the cores.
sockets         Each place corresponds to a single socket (consisting of one or more cores) on
                the device.

9 The behavior of the program is implementation defined when the execution environment cannot
10 map a numerical value (either explicitly defined or implicitly derived from an interval) within the
11 OMP_PLACES list to a processor on the target platform, or if it maps to an unavailable processor.
12 The behavior is also implementation defined when the OMP_PLACES environment variable is
13 defined using an abstract name.
14 The following grammar describes the values accepted for the OMP_PLACES environment variable.

⟨list⟩ |= ⟨p-list⟩ | ⟨aname⟩
⟨p-list⟩ |= ⟨p-interval⟩ | ⟨p-list⟩,⟨p-interval⟩
⟨p-interval⟩ |= ⟨place⟩:⟨len⟩:⟨stride⟩ | ⟨place⟩:⟨len⟩ | ⟨place⟩ | !⟨place⟩
⟨place⟩ |= {⟨res-list⟩} | ⟨res⟩
⟨res-list⟩ |= ⟨res-interval⟩ | ⟨res-list⟩,⟨res-interval⟩
⟨res-interval⟩ |= ⟨res⟩:⟨num-places⟩:⟨stride⟩ | ⟨res⟩:⟨num-places⟩ | ⟨res⟩ | !⟨res⟩
⟨aname⟩ |= ⟨word⟩(⟨num-places⟩) | ⟨word⟩
⟨word⟩ |= sockets | cores | ll_caches | numa_domains
        | threads | <implementation-defined abstract name>
⟨res⟩ |= non-negative integer
⟨num-places⟩ |= positive integer
⟨stride⟩ |= integer
⟨len⟩ |= positive integer
1 Examples:
setenv OMP_PLACES threads
setenv OMP_PLACES "threads(4)"
setenv OMP_PLACES "{0,1,2,3},{4,5,6,7},{8,9,10,11},{12,13,14,15}"
setenv OMP_PLACES "{0:4},{4:4},{8:4},{12:4}"
setenv OMP_PLACES "{0:4}:4:4"

8 where each of the last three definitions corresponds to the same 4 places including the smallest
9 units of execution exposed by the execution environment numbered, in turn, 0 to 3, 4 to 7, 8 to 11,
10 and 12 to 15.
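
The equivalence of the last three definitions can be checked with a small Python sketch of the interval expansion. It is a sketch only and makes several assumptions: all function names are hypothetical, abstract names are not handled, the exclusion operator "!" is rejected rather than interpreted, and malformed input raises ValueError although the specification leaves such cases implementation defined.

```python
import re


def expand_interval(text):
    """Expand 'lower[:length[:stride]]' into a list of integers."""
    parts = [int(p) for p in text.split(":")]
    lower = parts[0]
    length = parts[1] if len(parts) > 1 else 1
    stride = parts[2] if len(parts) > 2 else 1
    return [lower + i * stride for i in range(length)]


def parse_place(text):
    """Parse '{res-list}' or a bare number into a sorted list of resources."""
    text = text.strip()
    if text.startswith("{") and text.endswith("}"):
        resources = set()
        for item in text[1:-1].split(","):
            resources.update(expand_interval(item))
        return sorted(resources)
    return [int(text)]


def parse_places(value):
    """Expand an explicit OMP_PLACES list into a list of places."""
    places = []
    # Split on commas that are not enclosed in braces.
    for item in re.split(r",(?![^{]*\})", value):
        item = item.strip()
        if "!" in item:
            raise ValueError("the exclusion operator is not handled in this sketch")
        if item.startswith("{"):
            end = item.index("}") + 1
            place_text, rest = item[:end], item[end:]
        else:
            place_text, _, tail = item.partition(":")
            rest = ":" + tail if tail else ""
        base = parse_place(place_text)
        counts = [int(p) for p in rest.split(":")[1:]]
        length = counts[0] if counts else 1
        stride = counts[1] if len(counts) > 1 else 1
        # Each further place shifts every resource of the first one by the stride.
        for i in range(length):
            places.append([r + i * stride for r in base])
    return places
```

Under these assumptions, parse_places returns the same four places, numbered 0 to 3, 4 to 7, 8 to 11 and 12 to 15, for each of the three explicit definitions in the examples.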

11 Cross References
12 • place-partition-var ICV, see Table 2.1

13 21.1.7 OMP_PROC_BIND
14 The OMP_PROC_BIND environment variable sets the initial value of the bind-var ICV. The value
15 of this environment variable is either true, false, or a comma separated list of primary,
16 master (master has been deprecated), close, or spread. The values of the list set the thread
17 affinity policy to be used for parallel regions at the corresponding nested level.
18 If the environment variable is set to false, the execution environment may move OpenMP threads
19 between OpenMP places, thread affinity is disabled, and proc_bind clauses on parallel
20 constructs are ignored.
21 Otherwise, the execution environment should not move OpenMP threads between OpenMP places,
22 thread affinity is enabled, and the initial thread is bound to the first place in the place-partition-var
23 ICV prior to the first active parallel region. An initial thread that is created by a teams construct is
24 bound to the first place in its place-partition-var ICV before it begins execution of the associated
25 structured block.
If the environment variable is set to true, the thread affinity policy is implementation defined but
must conform to the previous paragraph. The behavior of the program is implementation defined if
the value in the OMP_PROC_BIND environment variable is not true, false, or a comma-separated
list of primary, master (master has been deprecated), close, or spread. The behavior is also
implementation defined if an initial thread cannot be bound to the first place in the
place-partition-var ICV.
The OMP_PROC_BIND environment variable sets the max-active-levels-var ICV to the number of
active levels of parallelism that the implementation supports if the OMP_PROC_BIND environment
variable is set to a comma-separated list of more than one element. The value of the
max-active-levels-var ICV may be overridden by setting OMP_MAX_ACTIVE_LEVELS or
OMP_NESTED. See Section 21.1.4 and Section 21.1.5 for details.
9 Examples:
setenv OMP_PROC_BIND false
setenv OMP_PROC_BIND "spread, spread, close"

12 Cross References
13 • Controlling OpenMP Thread Affinity, see Section 10.1.3
14 • OMP_MAX_ACTIVE_LEVELS, see Section 21.1.4
15 • OMP_NESTED (Deprecated), see Section 21.1.5
16 • omp_get_proc_bind, see Section 18.3.1
17 • bind-var ICV, see Table 2.1
18 • max-active-levels-var ICV, see Table 2.1
19 • parallel directive, see Section 10.1
20 • place-partition-var ICV, see Table 2.1
21 • proc_bind clause, see Section 10.1.4
22 • teams directive, see Section 10.2

23 21.2 Program Execution Environment Variables


24 This section defines environment variables that affect program execution.

25 21.2.1 OMP_SCHEDULE
26 The OMP_SCHEDULE environment variable controls the schedule kind and chunk size of all
27 worksharing-loop directives that have the schedule kind runtime, by setting the value of the
28 run-sched-var ICV. The value of this environment variable takes the form [modifier:]kind[, chunk],
29 where:
30 • modifier is one of monotonic or nonmonotonic;
31 • kind is one of static, dynamic, guided, or auto;
32 • chunk is an optional positive integer that specifies the chunk size.

1 If the modifier is not present, the modifier is set to monotonic if kind is static; for any other
2 kind it is set to nonmonotonic.
3 If chunk is present, white space may be on either side of the “,”. See Section 11.5.3 for a detailed
4 description of the schedule kinds.
5 The behavior of the program is implementation defined if the value of OMP_SCHEDULE does not
6 conform to the above format.
7 Examples:
setenv OMP_SCHEDULE "guided,4"
setenv OMP_SCHEDULE "dynamic"
setenv OMP_SCHEDULE "nonmonotonic:dynamic,4"
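
The [modifier:]kind[, chunk] form and the default modifier rule can be sketched in Python as follows. The function name parse_schedule is hypothetical, and raising ValueError for a non-conforming value is a choice made for this sketch only, because the specification leaves that behavior implementation defined.

```python
KINDS = {"static", "dynamic", "guided", "auto"}
MODIFIERS = {"monotonic", "nonmonotonic"}


def parse_schedule(value):
    """Return (modifier, kind, chunk) for an OMP_SCHEDULE value.

    chunk is None when no chunk size is given.
    """
    text = value.strip().lower()  # values are case insensitive
    head, comma, chunk_text = text.partition(",")
    modifier, colon, kind = head.rpartition(":")
    modifier, kind = modifier.strip(), kind.strip()
    if kind not in KINDS:
        raise ValueError(f"unknown schedule kind: {kind!r}")
    if colon:
        if modifier not in MODIFIERS:
            raise ValueError(f"unknown modifier: {modifier!r}")
    else:
        # Default modifier: monotonic for static, nonmonotonic otherwise.
        modifier = "monotonic" if kind == "static" else "nonmonotonic"
    chunk = None
    if comma:
        chunk = int(chunk_text)  # white space around the comma is allowed
        if chunk <= 0:
            raise ValueError("chunk must be a positive integer")
    return modifier, kind, chunk
```

For the value "guided,4" the sketch yields the nonmonotonic modifier, the guided kind and a chunk size of 4.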

11 Cross References
12 • run-sched-var ICV, see Table 2.1
13 • schedule clause, see Section 11.5.3

14 21.2.2 OMP_STACKSIZE
15 The OMP_STACKSIZE environment variable controls the size of the stack for threads created by
16 the OpenMP implementation, by setting the value of the stacksize-var ICV. The environment
17 variable does not control the size of the stack for an initial thread. The value of this environment
18 variable takes the form size[unit], where:
19 • size is a positive integer that specifies the size of the stack for threads that are created by the
20 OpenMP implementation.
21 • unit is B, K, M, or G and specifies whether the given size is in Bytes, Kilobytes (1024 Bytes),
22 Megabytes (1024 Kilobytes), or Gigabytes (1024 Megabytes), respectively. If unit is present,
23 white space may occur between size and it, whereas if unit is not present then K is assumed.
24 The behavior of the program is implementation defined if OMP_STACKSIZE does not conform to
25 the above format, or if the implementation cannot provide a stack with the requested size.
26 Examples:
setenv OMP_STACKSIZE 2000500B
setenv OMP_STACKSIZE "3000 k "
setenv OMP_STACKSIZE 10M
setenv OMP_STACKSIZE " 10 M "
setenv OMP_STACKSIZE "20 m "
setenv OMP_STACKSIZE " 1G"
setenv OMP_STACKSIZE 20000
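
To make the unit rules concrete, the following Python sketch converts a value of the form size[unit] to a byte count. The function name parse_stacksize is hypothetical, and rejecting malformed values with ValueError is an assumption of the sketch, since the specification leaves that behavior implementation defined.

```python
import re

UNIT_BYTES = {"B": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_stacksize(value):
    """Convert an OMP_STACKSIZE value of the form size[unit] to bytes."""
    # Leading and trailing white space, and white space between size and
    # unit, are permitted; the unit letter is case insensitive.
    match = re.fullmatch(r"\s*(\d+)\s*([BKMG]?)\s*", value, re.IGNORECASE)
    if match is None or int(match.group(1)) == 0:
        raise ValueError(f"not a valid stack size: {value!r}")
    size = int(match.group(1))
    unit = match.group(2).upper() or "K"  # K is assumed when unit is absent
    return size * UNIT_BYTES[unit]
```

Under this reading, the example value 20000 denotes 20000 Kilobytes, that is 20480000 bytes, while 2000500B denotes exactly 2000500 bytes.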

34 Cross References
35 • stacksize-var ICV, see Table 2.1

1 21.2.3 OMP_WAIT_POLICY
2 The OMP_WAIT_POLICY environment variable provides a hint to an OpenMP implementation
3 about the desired behavior of waiting threads by setting the wait-policy-var ICV. A compliant
4 OpenMP implementation may or may not abide by the setting of the environment variable. The
5 value of this environment variable must be one of the following:
6 active | passive
7 The active value specifies that waiting threads should mostly be active, consuming processor
8 cycles, while waiting. An OpenMP implementation may, for example, make waiting threads spin.
9 The passive value specifies that waiting threads should mostly be passive, not consuming
10 processor cycles, while waiting. For example, an OpenMP implementation may make waiting
11 threads yield the processor to other threads or go to sleep. The details of the active and
12 passive behaviors are implementation defined. The behavior of the program is implementation
13 defined if the value of OMP_WAIT_POLICY is neither active nor passive.
14 Examples:
setenv OMP_WAIT_POLICY ACTIVE
setenv OMP_WAIT_POLICY active
setenv OMP_WAIT_POLICY PASSIVE
setenv OMP_WAIT_POLICY passive

19 Cross References
20 • wait-policy-var ICV, see Table 2.1

21 21.2.4 OMP_DISPLAY_AFFINITY
22 The OMP_DISPLAY_AFFINITY environment variable instructs the runtime to display formatted
23 affinity information by setting the display-affinity-var ICV. Affinity information is printed for all
24 OpenMP threads in the parallel region upon entering it and when any change occurs in the
25 information accessible by the format specifiers listed in Table 21.2. If affinity of any thread in a
26 parallel region changes then thread affinity information for all threads in that region is displayed. If
27 the thread affinity for each respective parallel region at each nesting level has already been displayed
28 and the thread affinity has not changed, then the information is not displayed again. Thread affinity
29 information for threads in the same parallel region may be displayed in any order. The value of the
30 OMP_DISPLAY_AFFINITY environment variable may be set to one of these values:
31 true | false
32 The true value instructs the runtime to display the OpenMP thread affinity information, and uses
33 the format setting defined in the affinity-format-var ICV. The runtime does not display the OpenMP
34 thread affinity information when the value of the OMP_DISPLAY_AFFINITY environment
35 variable is false or undefined. For all values of the environment variable other than true or
36 false, the display action is implementation defined.

1 Example:
setenv OMP_DISPLAY_AFFINITY TRUE

3 For this example, an OpenMP implementation displays thread affinity information during program
4 execution, in a format given by the affinity-format-var ICV. The following is a sample output:
nesting_level= 1, thread_num= 0, thread_affinity= 0,1
nesting_level= 1, thread_num= 1, thread_affinity= 2,3

7 Cross References
8 • Controlling OpenMP Thread Affinity, see Section 10.1.3
9 • OMP_AFFINITY_FORMAT, see Section 21.2.5
10 • affinity-format-var ICV, see Table 2.1
11 • display-affinity-var ICV, see Table 2.1

12 21.2.5 OMP_AFFINITY_FORMAT
13 The OMP_AFFINITY_FORMAT environment variable sets the initial value of the
14 affinity-format-var ICV which defines the format when displaying OpenMP thread affinity
15 information. The value of this environment variable is case sensitive and leading and trailing
16 whitespace is significant. Its value is a character string that may contain as substrings one or more
17 field specifiers (as well as other characters). The format of each field specifier is
%[[[0].] size ] type

19 where each specifier must contain the percent symbol (%) and a type, that must be either a single
20 character short name or its corresponding long name delimited with curly braces, such as %n or
21 %{thread_num}. A literal percent is specified as %%. Field specifiers can be provided in any
22 order. The behavior is implementation defined for field specifiers that do not conform to this format.
23 The 0 modifier indicates whether or not to add leading zeros to the output, following any indication
24 of sign or base. The . modifier indicates the output should be right justified when size is specified.
25 By default, output is left justified. The minimum field length is size, which is a decimal digit string
26 with a non-zero first digit. If no size is specified, the actual length needed to print the field will be
27 used. If the 0 modifier is used with type of A, {thread_affinity}, H, {host}, or a type that
28 is not printed as a number, the result is unspecified. Any other characters in the format string that
29 are not part of a field specifier will be included literally in the output.
30 Implementations may define additional field types. If an implementation does not have information
31 for a field type or an unknown field type is part of a field specifier, "undefined" is printed for this
32 field when displaying the OpenMP thread affinity information.

TABLE 21.2: Available Field Types for Formatting OpenMP Thread Affinity Information

Short Name   Long Name          Meaning
t            team_num           The value returned by omp_get_team_num().
T            num_teams          The value returned by omp_get_num_teams().
L            nesting_level      The value returned by omp_get_level().
n            thread_num         The value returned by omp_get_thread_num().
N            num_threads        The value returned by omp_get_num_threads().
a            ancestor_tnum      The value returned by omp_get_ancestor_thread_num(level),
                                where level is omp_get_level() minus 1.
H            host               The name for the host device on which the OpenMP program is
                                running.
P            process_id         The process identifier used by the implementation.
i            native_thread_id   The native thread identifier used by the implementation.
A            thread_affinity    The list of numerical identifiers, in the format of a
                                comma-separated list of integers or integer ranges, that
                                represent processors on which a thread may execute, subject to
                                OpenMP thread affinity control and/or other external affinity
                                mechanisms.

1 Example:
setenv OMP_AFFINITY_FORMAT "Thread Affinity: %0.3L %.8n %.15{thread_affinity} %.12H"

4 The above example causes an OpenMP implementation to display OpenMP thread affinity
5 information in the following form:
Thread Affinity: 001        0       0-1,16-17       nid003
Thread Affinity: 001        1       2-3,18-19       nid003
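
The field specifier rules can be exercised with a minimal Python sketch of a formatter. It is not how any particular runtime implements the feature. The function name format_affinity and the values dictionary, which maps long names to the values that the routines in Table 21.2 would return, are hypothetical.

```python
import re

LONG_NAMES = {
    "t": "team_num", "T": "num_teams", "L": "nesting_level",
    "n": "thread_num", "N": "num_threads", "a": "ancestor_tnum",
    "H": "host", "P": "process_id", "i": "native_thread_id",
    "A": "thread_affinity",
}

# %% or %[[[0].]size]type, where type is a short name or {long_name}.
FIELD = re.compile(
    r"%(?:(?P<pct>%)"
    r"|(?:(?:(?P<zero>0)?(?P<right>\.))?(?P<size>[1-9][0-9]*))?"
    r"(?:(?P<short>[A-Za-z])|\{(?P<long>\w+)\}))"
)


def format_affinity(fmt, values):
    """Expand the field specifiers in fmt using the given field values."""
    def expand(match):
        if match.group("pct"):
            return "%"
        name = match.group("long") or LONG_NAMES.get(match.group("short"))
        # Unknown or unavailable fields are printed as "undefined".
        text = str(values[name]) if name in values else "undefined"
        width = int(match.group("size") or 0)
        if not match.group("right"):
            return text.ljust(width)  # left justified by default
        # The 0 modifier pads with zeros, after any sign.
        return text.zfill(width) if match.group("zero") else text.rjust(width)
    return FIELD.sub(expand, fmt)
```

Applied to the format string of the example with nesting level 1, thread number 0, affinity 0-1,16-17 and host nid003, the sketch produces the nesting level as 001 and right justifies the remaining fields in widths of 8, 15 and 12 characters.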

Cross References
• Controlling OpenMP Thread Affinity, see Section 10.1.3
• omp_get_ancestor_thread_num, see Section 18.2.18
• omp_get_level, see Section 18.2.17
• omp_get_num_teams, see Section 18.4.1
• omp_get_num_threads, see Section 18.2.2
• omp_get_thread_num, see Section 18.2.4
• affinity-format-var ICV, see Table 2.1

5 21.2.6 OMP_CANCELLATION
6 The OMP_CANCELLATION environment variable sets the initial value of the cancel-var ICV. The
7 value of this environment variable must be one of the following:
true | false
9 If the environment variable is set to true, the effects of the cancel construct and of cancellation
10 points are enabled (i.e., cancellation is enabled). If the environment variable is set to false,
11 cancellation is disabled and the cancel construct and cancellation points are effectively ignored.
12 The behavior of the program is implementation defined if OMP_CANCELLATION is set to neither
13 true nor false.

14 Cross References
15 • cancel directive, see Section 16.1
16 • cancel-var ICV, see Table 2.1

17 21.2.7 OMP_DEFAULT_DEVICE
18 The OMP_DEFAULT_DEVICE environment variable sets the device number to use in device
19 constructs by setting the initial value of the default-device-var ICV. The value of this environment
20 variable must be a non-negative integer value.

21 Cross References
22 • Device Directives and Clauses, see Chapter 13
23 • default-device-var ICV, see Table 2.1

24 21.2.8 OMP_TARGET_OFFLOAD
25 The OMP_TARGET_OFFLOAD environment variable sets the initial value of the target-offload-var
26 ICV. Its value must be one of the following:
27 mandatory | disabled | default
28 The mandatory value specifies that the effect of any device construct or device memory routine
29 that uses a device that is unavailable or not supported by the implementation, or uses a
30 non-conforming device number, is as if the omp_invalid_device device number was used.

1 Support for the disabled value is implementation defined. If an implementation supports it, the
2 behavior is as if the only device is the host device. The default value specifies the default
3 behavior as described in Section 1.3.
4 Example:
% setenv OMP_TARGET_OFFLOAD mandatory

6 Cross References
7 • Device Directives and Clauses, see Chapter 13
8 • Device Memory Routines, see Section 18.8
9 • target-offload-var ICV, see Table 2.1

10 21.2.9 OMP_MAX_TASK_PRIORITY
11 The OMP_MAX_TASK_PRIORITY environment variable controls the use of task priorities by
12 setting the initial value of the max-task-priority-var ICV. The value of this environment variable
13 must be a non-negative integer.
14 Example:
% setenv OMP_MAX_TASK_PRIORITY 20

16 Cross References
17 • max-task-priority-var ICV, see Table 2.1

18 21.3 OMPT Environment Variables


19 This section defines environment variables that affect operation of the OMPT tool interface.

20 21.3.1 OMP_TOOL
21 The OMP_TOOL environment variable sets the tool-var ICV, which controls whether an OpenMP
22 runtime will try to register a first party tool. The value of this environment variable must be one of
23 the following:
24 enabled | disabled
25 If OMP_TOOL is set to any value other than enabled or disabled, the behavior is unspecified.
26 If OMP_TOOL is not defined, the default value for tool-var is enabled.
27 Example:
% setenv OMP_TOOL enabled

29 Cross References
30 • OMPT Interface, see Chapter 19
31 • tool-var ICV, see Table 2.1

1 21.3.2 OMP_TOOL_LIBRARIES
The OMP_TOOL_LIBRARIES environment variable sets the tool-libraries-var ICV to a list of tool
libraries that are considered for use on a device on which an OpenMP implementation is being
initialized. The value of this environment variable must be a list of names of dynamically-loadable
libraries, separated by an implementation-specific, platform-typical separator. Whether the value of
this environment variable is case sensitive is implementation defined.
7 If the tool-var ICV is not enabled, the value of tool-libraries-var is ignored. Otherwise, if
8 ompt_start_tool is not visible in the address space on a device where OpenMP is being
9 initialized or if ompt_start_tool returns NULL, an OpenMP implementation will consider
10 libraries in the tool-libraries-var list in a left-to-right order. The OpenMP implementation will
11 search the list for a library that meets two criteria: it can be dynamically loaded on the current
12 device and it defines the symbol ompt_start_tool. If an OpenMP implementation finds a
13 suitable library, no further libraries in the list will be considered.
14 Example:
% setenv OMP_TOOL_LIBRARIES libtoolXY64.so:/usr/local/lib/libtoolXY32.so

17 Cross References
18 • OMPT Interface, see Chapter 19
19 • ompt_start_tool, see Section 19.2.1
20 • tool-libraries-var ICV, see Table 2.1

21 21.3.3 OMP_TOOL_VERBOSE_INIT
22 The OMP_TOOL_VERBOSE_INIT environment variable sets the tool-verbose-init-var ICV, which
23 controls whether an OpenMP implementation will verbosely log the registration of a tool. The
24 value of this environment variable must be one of the following:
25 disabled | stdout | stderr | <filename>
The values disabled, stdout, and stderr are case insensitive. If OMP_TOOL_VERBOSE_INIT is
set to any other value, the value is interpreted as a filename and the OpenMP runtime will try to
log to a file with prefix filename. If the value is interpreted as a filename, whether it is case
sensitive is implementation defined. If opening the logfile fails, the output will be redirected to
stderr. If OMP_TOOL_VERBOSE_INIT is not defined, the default value for tool-verbose-init-var
is disabled. Support for logging to stdout or stderr is implementation defined. Unless
tool-verbose-init-var is disabled, the OpenMP runtime will log the steps of the tool activation
process defined in Section 19.2.2 to a file with a name that is constructed using the provided
filename prefix. The format and detail of the log are implementation defined. At a minimum, the log
will contain one of the following:
• That the tool-var ICV is disabled;
• An indication that a tool was available in the address space at program launch; or
• The path name of each tool in OMP_TOOL_LIBRARIES that is considered for dynamic loading,
whether dynamic loading was successful, and whether the ompt_start_tool function is
found in the loaded library.
5 In addition, if an ompt_start_tool function is called the log will indicate whether or not the
6 tool will use the OMPT interface.
7 Example:
% setenv OMP_TOOL_VERBOSE_INIT disabled
% setenv OMP_TOOL_VERBOSE_INIT STDERR
% setenv OMP_TOOL_VERBOSE_INIT ompt_load.log
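
How the three examples are classified can be shown with a brief Python sketch. The function name verbose_init_target and the (kind, filename) return shape are hypothetical and serve only to illustrate the keyword-versus-filename rule.

```python
def verbose_init_target(value):
    """Classify an OMP_TOOL_VERBOSE_INIT setting as (kind, filename)."""
    if value is None:
        return ("disabled", None)  # not defined: the default is disabled
    keyword = value.strip().lower()  # the keywords are case insensitive
    if keyword in ("disabled", "stdout", "stderr"):
        return (keyword, None)
    # Any other value is treated as a filename prefix for the log.
    return ("file", value.strip())
```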

11 Cross References
12 • OMPT Interface, see Chapter 19
13 • tool-verbose-init-var ICV, see Table 2.1

14 21.4 OMPD Environment Variables


15 This section defines environment variables that affect operation of the OMPD tool interface.

16 21.4.1 OMP_DEBUG
17 The OMP_DEBUG environment variable sets the debug-var ICV, which controls whether an
18 OpenMP runtime collects information that an OMPD library may need to support a tool. The value
19 of this environment variable must be one of the following:
20 enabled | disabled
21 If OMP_DEBUG is set to any value other than enabled or disabled then the behavior is
22 implementation defined.
23 Example:
% setenv OMP_DEBUG enabled

25 Cross References
26 • Enabling Runtime Support for OMPD, see Section 20.2.1
27 • OMPD Interface, see Chapter 20
28 • debug-var ICV, see Table 2.1

1 21.5 Memory Allocation Environment Variables
2 This section defines environment variables that affect memory allocations.

3 21.5.1 OMP_ALLOCATOR
4 The OMP_ALLOCATOR environment variable sets the initial value of the def-allocator-var ICV
5 that specifies the default allocator for allocation calls, directives and clauses that do not specify an
6 allocator. The following grammar describes the values accepted for the OMP_ALLOCATOR
7 environment variable.

⟨allocator⟩ |= ⟨predef-allocator⟩ | ⟨predef-mem-space⟩ | ⟨predef-mem-space⟩:⟨traits⟩
⟨traits⟩ |= ⟨trait⟩=⟨value⟩ | ⟨trait⟩=⟨value⟩,⟨traits⟩
⟨predef-allocator⟩ |= one of the predefined allocators from Table 6.3
⟨predef-mem-space⟩ |= one of the predefined memory spaces from Table 6.1
⟨trait⟩ |= one of the allocator trait names from Table 6.2
⟨value⟩ |= one of the allowed values from Table 6.2 | non-negative integer
         | ⟨predef-allocator⟩

The value can be an integer only if the trait accepts a numerical value; for the fb_data trait, the
value can only be predef-allocator. If the value of this environment variable is not a predefined
allocator, then a new allocator with the given predefined memory space and optional traits is
created and set as the def-allocator-var ICV. If the new allocator cannot be created, the
def-allocator-var ICV will be set to omp_default_mem_alloc.
13 Example:
setenv OMP_ALLOCATOR omp_high_bw_mem_alloc
setenv OMP_ALLOCATOR omp_large_cap_mem_space:alignment=16,\
pinned=true
setenv OMP_ALLOCATOR omp_high_bw_mem_space:pool_size=1048576,\
fallback=allocator_fb,fb_data=omp_low_lat_mem_alloc
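
The structure of such a value can be illustrated with a short Python sketch that splits it into a name and its traits. The function name parse_allocator is hypothetical. The sketch checks only the shape given by the grammar above and does not validate names or values against Tables 6.1 to 6.3.

```python
def parse_allocator(value):
    """Split an OMP_ALLOCATOR value into (name, traits).

    name is a predefined allocator or memory space; traits maps each
    trait name to its value as a string and is empty when none are given.
    """
    name, colon, traits_text = value.strip().partition(":")
    traits = {}
    if colon:
        for item in traits_text.split(","):
            trait, equals, trait_value = item.partition("=")
            if not equals:
                raise ValueError(f"expected trait=value, got {item!r}")
            traits[trait.strip()] = trait_value.strip()
    return name.strip(), traits
```

For the third example, the sketch returns omp_high_bw_mem_space as the name together with the traits pool_size, fallback and fb_data.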

19 Cross References
20 • Memory Allocators, see Section 6.2
21 • def-allocator-var ICV, see Table 2.1
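
The grammar for OMP_ALLOCATOR values can be illustrated with a small splitter. The following is only a sketch: the names `allocator_spec` and `parse_allocator_value` and the fixed buffer sizes are hypothetical and are not part of the OpenMP API. It checks syntax only and does not validate names against Tables 6.1 to 6.3, which a real runtime would also need to do.

```c
#include <assert.h>
#include <string.h>

#define MAX_TRAITS 8
#define MAX_TOKEN  64

/* Hypothetical holder for one parsed OMP_ALLOCATOR value. */
struct allocator_spec {
    char head[MAX_TOKEN];               /* predefined allocator or memory space */
    int  ntraits;                       /* number of trait=value pairs found */
    char trait[MAX_TRAITS][MAX_TOKEN];
    char value[MAX_TRAITS][MAX_TOKEN];
};

/* Copy len bytes starting at s into dst and terminate; 0 on success. */
static int copy_token(char *dst, const char *s, size_t len)
{
    if (len == 0 || len >= MAX_TOKEN)
        return -1;
    memcpy(dst, s, len);
    dst[len] = '\0';
    return 0;
}

/* Split "head" or "head:trait=value,trait=value".
 * Returns 0 on success and -1 on a syntax error. */
int parse_allocator_value(const char *text, struct allocator_spec *out)
{
    assert(text != NULL && out != NULL);
    out->ntraits = 0;

    const char *colon = strchr(text, ':');
    size_t head_len = colon ? (size_t)(colon - text) : strlen(text);
    if (copy_token(out->head, text, head_len) != 0)
        return -1;
    if (colon == NULL)
        return 0;                       /* no traits were given */

    const char *p = colon + 1;
    for (;;) {
        const char *end = strchr(p, ',');
        size_t pair_len = end ? (size_t)(end - p) : strlen(p);
        const char *eq = memchr(p, '=', pair_len);
        if (eq == NULL || out->ntraits == MAX_TRAITS)
            return -1;                  /* missing '=' or too many traits */
        int i = out->ntraits;
        size_t name_len = (size_t)(eq - p);
        if (copy_token(out->trait[i], p, name_len) != 0 ||
            copy_token(out->value[i], eq + 1, pair_len - name_len - 1) != 0)
            return -1;
        out->ntraits++;
        if (end == NULL)
            return 0;                   /* last pair consumed */
        p = end + 1;
    }
}
```

Given the second example value above (after the shell has joined the continued line), this sketch would report the head `omp_large_cap_mem_space` and the two pairs `alignment=16` and `pinned=true`.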

22 21.6 Teams Environment Variables


23 This section defines environment variables that affect the operation of teams regions.



1 21.6.1 OMP_NUM_TEAMS
2 The OMP_NUM_TEAMS environment variable sets the maximum number of teams created by a
3 teams construct by setting the nteams-var ICV. The value of this environment variable must be a
4 positive integer. The behavior of the program is implementation defined if the requested value of
5 OMP_NUM_TEAMS is greater than the number of teams that an implementation can support, or if
6 the value is not a positive integer.
7 Cross References
8 • nteams-var ICV, see Table 2.1
9 • teams directive, see Section 10.2
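
Because the behavior for a value that is not a positive integer is implementation defined, a program or launcher may want to check the setting before relying on it. The sketch below shows one way to do that in C. The helper names are hypothetical, and falling back to a caller-supplied default is only one possible policy, not behavior required by this specification.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Return the value of text as a positive int, or 0 if text is not a
 * positive integer that fits in an int. */
int parse_positive_int(const char *text)
{
    assert(text != NULL);
    char *end = NULL;
    errno = 0;
    long v = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0')
        return 0;                       /* overflow, empty, or trailing junk */
    if (v <= 0 || v > INT_MAX)
        return 0;                       /* zero, negative, or too large */
    return (int)v;
}

/* Hypothetical policy: use the request when it is valid, else fallback.
 * env_value would typically come from getenv("OMP_NUM_TEAMS"). */
int num_teams_or_default(const char *env_value, int fallback)
{
    if (env_value == NULL)
        return fallback;
    int v = parse_positive_int(env_value);
    return v > 0 ? v : fallback;
}
```

Whether a syntactically valid request exceeds what the implementation can support cannot be determined this way; that limit is known only to the runtime.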

10 21.6.2 OMP_TEAMS_THREAD_LIMIT
11 The OMP_TEAMS_THREAD_LIMIT environment variable sets the maximum number of OpenMP
12 threads to use in each contention group created by a teams construct by setting the
13 teams-thread-limit-var ICV. The value of this environment variable must be a positive integer. The
14 behavior of the program is implementation defined if the requested value of
15 OMP_TEAMS_THREAD_LIMIT is greater than the number of threads that an implementation can
16 support, or if the value is not a positive integer.
17 Cross References
18 • teams directive, see Section 10.2
19 • teams-thread-limit-var ICV, see Table 2.1

20 21.7 OMP_DISPLAY_ENV
21 The OMP_DISPLAY_ENV environment variable instructs the runtime to display the information as
22 described in the omp_display_env routine section (Section 18.15). The value of the
23 OMP_DISPLAY_ENV environment variable may be set to one of these values:
24 true | false | verbose
25 If the environment variable is set to true, the effect is as if the omp_display_env routine is
26 called with the verbose argument set to false at the beginning of the program. If the environment
27 variable is set to verbose, the effect is as if the omp_display_env routine is called with the
28 verbose argument set to true at the beginning of the program. If the environment variable is
29 undefined or set to false, the runtime does not display any information. For all values of the
30 environment variable other than true, false, and verbose, the displayed information is
31 unspecified.
32 Example:
33 % setenv OMP_DISPLAY_ENV true

34 For the output of the above example, see Section 18.15.


35 Cross References
36 • Environment Display Routine, see Section 18.15
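
The mapping from OMP_DISPLAY_ENV values to actions described above can be summarized as a small classifier. This is an illustrative sketch only: the enum and function names are hypothetical, and for simplicity the comparison is an exact match on lower-case strings with no trimming of white space.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification of an OMP_DISPLAY_ENV setting. */
enum display_env_action {
    DISPLAY_NONE,        /* unset or "false": nothing is displayed */
    DISPLAY_BRIEF,       /* "true": as if omp_display_env is called with verbose false */
    DISPLAY_VERBOSE,     /* "verbose": as if omp_display_env is called with verbose true */
    DISPLAY_UNSPECIFIED  /* any other value: displayed information is unspecified */
};

/* value is the setting of OMP_DISPLAY_ENV, or NULL if it is undefined. */
enum display_env_action classify_display_env(const char *value)
{
    if (value == NULL || strcmp(value, "false") == 0)
        return DISPLAY_NONE;
    if (strcmp(value, "true") == 0)
        return DISPLAY_BRIEF;
    if (strcmp(value, "verbose") == 0)
        return DISPLAY_VERBOSE;
    return DISPLAY_UNSPECIFIED;
}

/* Verbose argument of the omp_display_env call implied by the action. */
int display_env_verbose_arg(enum display_env_action action)
{
    assert(action == DISPLAY_BRIEF || action == DISPLAY_VERBOSE);
    return action == DISPLAY_VERBOSE;
}
```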



1 A OpenMP Implementation-Defined Behaviors
3 This appendix summarizes the behaviors that are described as implementation defined in the
4 OpenMP API. Each behavior is cross-referenced back to its description in the main specification.
5 An implementation is required to define and to document its behavior in these cases.

6 Chapter 1:
7 • Processor: A hardware unit that is implementation defined (see Section 1.2.1).
8 • Device: An implementation-defined logical execution engine (see Section 1.2.1).
9 • Device pointer: An implementation-defined handle that refers to a device address (see
10 Section 1.2.6).
11 • Supported active levels of parallelism: The maximum number of active parallel regions that
12 may enclose any region of code in the program is implementation defined (see Section 1.2.7).
13 • Deprecated features: For any deprecated feature, whether any modifications provided by its
14 replacement feature (if any) apply to the deprecated feature is implementation defined (see
15 Section 1.2.7).
16 • Memory model: The minimum size at which a memory update may also read and write back
17 adjacent variables that are part of another variable (as array elements or structure elements) is
18 implementation defined but is no larger than the base language requires. The manner in which a
19 program can obtain the referenced device address from a device pointer, outside the mechanisms
20 specified by OpenMP, is implementation defined (see Section 1.4.1).

21 Chapter 2:
22 • Internal control variables: The initial values of dyn-var, nthreads-var, run-sched-var, bind-var,
23 stacksize-var, wait-policy-var, thread-limit-var, max-active-levels-var, place-partition-var,
24 affinity-format-var, default-device-var, num-procs-var and def-allocator-var are implementation
25 defined (see Section 2.2).

26 Chapter 3:
C / C++
27 • A pragma directive that uses ompx as the first processing token is implementation defined (see
28 Section 3.1).
C / C++



C++
1 • The attribute namespace of an attribute specifier or the optional namespace qualifier within a
2 sequence attribute that uses ompx is implementation defined (see Section 3.1).
3 • Whether a throw executed inside a region that arises from an exception-aborting directive
4 results in runtime error termination is implementation defined (see Section 3.1).
C++
Fortran
5 • Any directive that uses omx or ompx in the sentinel is implementation defined (see Section 3.1).
Fortran
6 Chapter 4:
7 • Loop-iteration spaces and vectors: The particular integer type used to compute the iteration
8 count for the collapsed loop is implementation defined (see Section 4.4.2).
9 Chapter 5:
Fortran
10 • Data-sharing attributes: The data-sharing attributes of dummy arguments that do not have the
11 VALUE attribute are implementation defined if the associated actual argument is shared unless
12 the actual argument is a scalar variable, structure, an array that is not a pointer or assumed-shape
13 array, or a simply contiguous array section (see Section 5.1.2).
14 • threadprivate directive: If the conditions for values of data in the threadprivate objects of
15 threads (other than an initial thread) to persist between two consecutive active parallel regions do
16 not all hold, the allocation status of an allocatable variable in the second region is
17 implementation defined (see Section 5.2).
Fortran
18 • is_device_ptr clause: Support for pointers created outside of the OpenMP device data
19 management routines is implementation defined (see Section 5.4.7).
20 Chapter 6:
21 • Memory spaces: The actual storage resources that each memory space defined in Table 6.1
22 represents are implementation defined. The mechanism that provides the constant value of the
23 variables allocated in the omp_const_mem_space memory space is implementation defined
24 (see Section 6.1).
25 • Memory allocators: The minimum size for partitioning allocated memory over storage
26 resources is implementation defined. The default value for the pool_size allocator trait (see
27 Table 6.2) is implementation defined. The memory spaces associated with the predefined
28 omp_cgroup_mem_alloc, omp_pteam_mem_alloc and omp_thread_mem_alloc
29 allocators (see Table 6.3) are implementation defined (see Section 6.2).
30 • aligned clause: If the alignment modifier is not specified, the default alignments for SIMD
31 instructions on the target platforms are implementation defined (see Section 5.11).



1 Chapter 7:
2 • OpenMP context: The accepted isa-name values for the isa trait, the accepted arch-name values
3 for the arch trait, the accepted extension-name values for the extension trait and whether the
4 dispatch construct is added to the construct set are implementation defined (see Section 7.1).
5 • Metadirectives: The number of times that each expression of the context selector of a when
6 clause is evaluated is implementation defined (see Section 7.4.1).
7 • Declare variant directives: If two replacement candidates have the same score then their order
8 is implementation defined. The number of times each expression of the context selector of a
9 match clause is evaluated is implementation defined. For calls to constexpr base functions
10 that are evaluated in constant expressions, whether any variant replacement occurs is
11 implementation defined. Any differences that the specific OpenMP context requires in the
12 prototype of the variant from the base function prototype are implementation defined (see
13 Section 7.5).
14 • declare simd directive: If a SIMD version is created and the simdlen clause is not
15 specified, the number of concurrent arguments for the function is implementation defined (see
16 Section 7.7).
17 • Declare target directives: Whether the same version is generated for different devices, or
18 whether a version that is called in a target region differs from the version that is called outside
19 a target region, is implementation defined (see Section 7.8).

20 Chapter 8:
21 • requires directive: Support for any feature specified by a requirement clause on a
22 requires directive is implementation defined (see Section 8.2).

23 Chapter 9:
24 • unroll construct: If no clauses are specified, if and how the loop is unrolled is
25 implementation defined. If the partial clause is specified without an unroll-factor argument
26 then the unroll factor is a positive integer that is implementation defined (see Section 9.2).

27 Chapter 10:
28 • Dynamic adjustment of threads: Providing the ability to adjust the number of threads
29 dynamically is implementation defined (see Section 10.1.1).
30 • Thread affinity: For the close thread affinity policy, if T > P and P does not divide T evenly,
31 the exact number of threads in a particular place is implementation defined. For the spread
32 thread affinity, if T > P and P does not divide T evenly, the exact number of threads in a
33 particular subpartition is implementation defined. The determination of whether the affinity
34 request can be fulfilled is implementation defined. If the affinity request cannot be fulfilled, then
35 the affinity of threads in the team is implementation defined (see Section 10.1.3).
36 • teams construct: The number of teams that are created is implementation defined, but it is
37 greater than or equal to the lower bound and less than or equal to the upper bound values of the
38 num_teams clause if specified. If the num_teams clause is not specified, the number of



1 teams is less than or equal to the value of the nteams-var ICV if its value is greater than zero.
2 Otherwise, it is an implementation-defined value greater than or equal to 1 (see Section 10.2).
3 • simd construct: The number of iterations that are executed concurrently at any given time is
4 implementation defined (see Section 10.4).
5 Chapter 11:
6 • single construct: The method of choosing a thread to execute the structured block each time
7 the team encounters the construct is implementation defined (see Section 11.1).
8 • sections construct: The method of scheduling the structured block sequences among threads
9 in the team is implementation defined (see Section 11.3).
10 • Worksharing-loop directive: The schedule that is used is implementation defined if the
11 schedule clause is not specified or if the specified schedule has the kind auto. The value of
12 simd_width for the simd schedule modifier is implementation defined (see Section 11.5).
13 • distribute construct: If no dist_schedule clause is specified then the schedule for the
14 distribute construct is implementation defined (see Section 11.6).
15 Chapter 12:
16 • taskloop construct: The number of loop iterations assigned to a task created from a
17 taskloop construct is implementation defined, unless the grainsize or num_tasks
18 clause is specified (see Section 12.6).
C++
19 • taskloop construct: For firstprivate variables of class type, the number of invocations
20 of copy constructors to perform the initialization is implementation defined (see Section 12.6).
C++
21 Chapter 13:
22 • thread_limit clause: The maximum number of threads that participate in the contention
23 group that each team initiates is implementation defined if no thread_limit clause is
24 specified on the construct. Otherwise, it has the implementation defined upper bound of the
25 teams-thread-limit-var ICV, if the value of this ICV is greater than zero (see Section 13.3).
26 Chapter 14:
27 • interop Construct: The foreign-runtime-id values for the prefer_type clause that the
28 implementation supports, including non-standard names compatible with this clause, and the
29 default choice when the implementation supports multiple values are implementation defined
30 (see Section 14.1).
31 Chapter 15:
32 • atomic construct: A compliant implementation may enforce exclusive access between
33 atomic regions that update different storage locations. The circumstances under which this
34 occurs are implementation defined. If the storage location designated by x is not size-aligned
35 (that is, if the byte alignment of x is not a multiple of the size of x), then the behavior of the
36 atomic region is implementation defined (see Section 15.8.4).
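
The size-alignment condition in the atomic item above can be checked with simple address arithmetic. The helper below is a sketch with a hypothetical name. It assumes a flat address space in which converting a pointer to `uintptr_t` yields the byte address, which is typical but is itself implementation defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Nonzero when the byte address of addr is a multiple of size, that is,
 * when a storage location of that size at addr is size-aligned. */
int is_size_aligned(const void *addr, size_t size)
{
    assert(size > 0);
    return ((uintptr_t)addr % size) == 0;
}
```

For a variable `x`, a call such as `is_size_aligned(&x, sizeof x)` reports whether an atomic region on `x` falls under the implementation-defined case.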



1 Chapter 16:
2 • None.

3 Chapter 17:
4 • None.

5 Chapter 18:
6 • Runtime Routine names that begin with the ompx_ prefix are implementation-defined extensions
7 to the OpenMP Runtime API (see Chapter 18).
C / C++
8 • Runtime library definitions: The enum types for omp_allocator_handle_t,
9 omp_event_handle_t, omp_interop_fr_t and omp_memspace_handle_t are
10 implementation defined. The integral or pointer type for omp_interop_t is implementation
11 defined. The value of the omp_invalid_device enumerator is implementation defined (see
12 Section 18.1).
C / C++
Fortran
13 • Runtime library definitions: Whether the include file omp_lib.h or the module omp_lib
14 (or both) is provided is implementation defined. Whether the omp_lib.h file provides
15 derived-type definitions or those routines that require an explicit interface is implementation
16 defined. Whether any of the OpenMP runtime library routines that take an argument are
17 extended with a generic interface so arguments of different KIND type can be accommodated is
18 implementation defined. The value of the omp_invalid_device named constant is
19 implementation defined (see Section 18.1).
Fortran
20 • omp_set_num_threads routine: If the argument is not a positive integer, the behavior is
21 implementation defined (see Section 18.2.1).
22 • omp_set_schedule routine: For implementation-specific schedule kinds, the values and
23 associated meanings of the second argument are implementation defined (see Section 18.2.11).
24 • omp_get_schedule routine: The value returned by the second argument is implementation
25 defined for any schedule kinds other than static, dynamic and guided (see
26 Section 18.2.12).
27 • omp_get_supported_active_levels routine: The number of active levels of
28 parallelism supported by the implementation is implementation defined, but must be positive (see
29 Section 18.2.14).
30 • omp_set_max_active_levels routine: If the argument is a negative integer then the
31 behavior is implementation defined. If the argument is less than the active-levels-var ICV, the
32 max-active-levels-var ICV is set to an implementation-defined value between the value of the
33 argument and the value of active-levels-var, inclusive (see Section 18.2.15).



1 • omp_get_place_proc_ids routine: The meaning of the non-negative numerical identifiers
2 returned by the omp_get_place_proc_ids routine is implementation defined. The order of
3 the numerical identifiers returned in the array ids is implementation defined (see Section 18.3.4).
4 • omp_set_affinity_format routine: When called from within any parallel or teams
5 region, the binding thread set (and binding region, if required) for the
6 omp_set_affinity_format region and the effect of this routine are implementation
7 defined (see Section 18.3.8).
8 • omp_get_affinity_format routine: When called from within any parallel or teams
9 region, the binding thread set (and binding region, if required) for the
10 omp_get_affinity_format region is implementation defined (see Section 18.3.9).
11 • omp_display_affinity routine: If the format argument does not conform to the specified
12 format then the result is implementation defined (see Section 18.3.10).
13 • omp_capture_affinity routine: If the format argument does not conform to the specified
14 format then the result is implementation defined (see Section 18.3.11).
15 • omp_set_num_teams routine: If the argument does not evaluate to a positive integer, the
16 behavior of this routine is implementation defined (see Section 18.4.3).
17 • omp_set_teams_thread_limit routine: If the argument is not a positive integer, the
18 behavior is implementation defined (see Section 18.4.5).
19 • omp_pause_resource_all routine: The behavior of this routine is implementation
20 defined if the argument kind is not listed in Section 18.6.1 (see Section 18.6.2).
21 • omp_target_memcpy_rect and omp_target_memcpy_rect_async routines: The
22 maximum number of dimensions supported is implementation defined, but must be at least three
23 (see Section 18.8.6 and Section 18.8.8).
24 • Lock routines: If a lock contains a synchronization hint, the effect of the hint is implementation
25 defined (see Section 18.9).
26 • Interoperability routines: Implementation-defined properties may use zero and positive values
27 for properties associated with an omp_interop_t object (see Section 18.12).

28 Chapter 19:
29 • Tool callbacks: If a tool attempts to register a callback listed in Table 19.3, whether the
30 registered callback may never, sometimes or always invoke this callback for the associated events
31 is implementation defined (see Section 19.2.4).
32 • Device tracing: Whether a target device supports tracing or not is implementation defined; if a
33 target device does not support tracing, a NULL may be supplied for the lookup function to the
34 device initializer of a tool (see Section 19.2.5).
35 • ompt_set_trace_ompt and ompt_get_record_ompt runtime entry points: Whether
36 a device-specific tracing interface defines this runtime entry point, indicating that it can collect



1 traces in OMPT format, is implementation defined. The kinds of trace records available for a
2 device are implementation defined (see Section 19.2.5).
3 • Native record abstract type: The meaning of a hwid value for a device is implementation
4 defined (see Section 19.4.3.3).
5 • ompt_dispatch_chunk_t type: Whether the chunk of a taskloop is contiguous is
6 implementation defined (see Section 19.4.4.13).
7 • ompt_record_abstract_t type: The set of OMPT thread states supported is
8 implementation defined (see Section 19.4.4.28).
9 • ompt_callback_sync_region_t callback type: For the implicit-barrier-wait-begin and
10 implicit-barrier-wait-end events at the end of a parallel region, whether the parallel_data
11 argument is NULL or points to the parallel data of the current parallel region is implementation
12 defined (see Section 19.5.2.13).
13 • ompt_callback_target_data_op_emi_t and
14 ompt_callback_target_data_op_t callback types: Whether in some operations
15 src_addr or dest_addr might point to an intermediate buffer is implementation defined (see
16 Section 19.5.2.25).
17 • ompt_get_place_proc_ids_t entry point type: The meaning of the numerical
18 identifiers returned is implementation defined. The order of ids returned in the array is
19 implementation defined (see Section 19.6.1.8).
20 • ompt_get_partition_place_nums_t entry point type: The order of the identifiers
21 returned in the array place_nums is implementation defined (see Section 19.6.1.10).
22 • ompt_get_proc_id_t entry point type: The meaning of the numerical identifier returned
23 is implementation defined (see Section 19.6.1.11).

24 Chapter 20:
25 • ompd_callback_print_string_fn_t callback type: The value of category is
26 implementation defined (see Section 20.4.5).
27 • ompd_parallel_handle_compare operation: The means by which parallel region
28 handles are ordered is implementation defined (see Section 20.5.6.5).
29 • ompd_task_handle_compare operation: The means by which task handles are ordered is
30 implementation defined (see Section 20.5.7.6).

31 Chapter 21:
32 • OMP_DYNAMIC environment variable: If the value is neither true nor false, the behavior
33 of the program is implementation defined (see Section 21.1.1).
34 • OMP_NUM_THREADS environment variable: If any value of the list specified leads to a number
35 of threads that is greater than the implementation can support, or if any value is not a positive
36 integer, then the behavior of the program is implementation defined (see Section 21.1.2).



1 • OMP_THREAD_LIMIT environment variable: If the requested value is greater than the number
2 of threads an implementation can support, or if the value is not a positive integer, the behavior of
3 the program is implementation defined (see Section 21.1.3).
4 • OMP_MAX_ACTIVE_LEVELS environment variable: If the value is a negative integer or is
5 greater than the maximum number of nested active parallel levels that an implementation can
6 support then the behavior of the program is implementation defined (see Section 21.1.4).
7 • OMP_NESTED environment variable (deprecated): If the value is neither true nor false,
8 the behavior of the program is implementation defined (see Section 21.1.5).
9 • Conflicting OMP_NESTED (deprecated) and OMP_MAX_ACTIVE_LEVELS environment
10 variables: If both environment variables are set, the value of OMP_NESTED is false, and the
11 value of OMP_MAX_ACTIVE_LEVELS is greater than 1, then the behavior is implementation
12 defined (see Section 21.1.5).
13 • OMP_PLACES environment variable: The meaning of the numbers specified in the
14 environment variable and how the numbering is done are implementation defined. The precise
15 definitions of the abstract names are implementation defined. An implementation may add
16 implementation-defined abstract names as appropriate for the target platform. When creating a
17 place list of n elements by appending the number n to an abstract name, the determination of
18 which resources to include in the place list is implementation defined. When requesting more
19 resources than available, the length of the place list is also implementation defined. The behavior
20 of the program is implementation defined when the execution environment cannot map a
21 numerical value (either explicitly defined or implicitly derived from an interval) within the
22 OMP_PLACES list to a processor on the target platform, or if it maps to an unavailable processor.
23 The behavior is also implementation defined when the OMP_PLACES environment variable is
24 defined using an abstract name (see Section 21.1.6).
25 • OMP_PROC_BIND environment variable: If the value is not true, false, or a comma
26 separated list of primary (master has been deprecated), close, or spread, the behavior is
27 implementation defined. The behavior is also implementation defined if an initial thread cannot
28 be bound to the first place in the OpenMP place list. The thread affinity policy is implementation
29 defined if the value is true (see Section 21.1.7).
30 • OMP_SCHEDULE environment variable: If the value does not conform to the specified format
31 then the behavior of the program is implementation defined (see Section 21.2.1).
32 • OMP_STACKSIZE environment variable: If the value does not conform to the specified format
33 or the implementation cannot provide a stack of the specified size then the behavior is
34 implementation defined (see Section 21.2.2).
35 • OMP_WAIT_POLICY environment variable: The details of the active and passive
36 behaviors are implementation defined (see Section 21.2.3).
37 • OMP_DISPLAY_AFFINITY environment variable: For all values of the environment
38 variable other than true or false, the display action is implementation defined (see
39 Section 21.2.4).



1 • OMP_AFFINITY_FORMAT environment variable: Additional implementation-defined field
2 types can be added (see Section 21.2.5).
3 • OMP_CANCELLATION environment variable: If the value is set to neither true nor false,
4 the behavior of the program is implementation defined (see Section 21.2.6).
5 • OMP_TARGET_OFFLOAD environment variable: The support of disabled is
6 implementation defined (see Section 21.2.8).
7 • OMP_TOOL_LIBRARIES environment variable: Whether the value of the environment
8 variable is case sensitive is implementation defined (see Section 21.3.2).
9 • OMP_TOOL_VERBOSE_INIT environment variable: Support for logging to stdout or
10 stderr is implementation defined. Whether the value of the environment variable is case
11 sensitive when it is treated as a filename is implementation defined. The format and detail of the
12 log are implementation defined (see Section 21.3.3).
13 • OMP_DEBUG environment variable: If the value is neither disabled nor enabled, the
14 behavior is implementation defined (see Section 21.4.1).
15 • OMP_NUM_TEAMS environment variable: If the value is not a positive integer or is greater than
16 the number of teams that an implementation can support, the behavior of the program is
17 implementation defined (see Section 21.6.1).
18 • OMP_TEAMS_THREAD_LIMIT environment variable: If the value is not a positive integer or
19 is greater than the number of threads that an implementation can support, the behavior of the
20 program is implementation defined (see Section 21.6.2).



1 B Features History
2 This appendix summarizes the major changes between OpenMP API versions since version 2.5.

3 B.1 Deprecated Features


4 The following features were deprecated in Version 5.2:
5 • The syntax of the linear clause that specifies its argument and linear-modifier as
6 linear-modifier(list) was deprecated.
7 • The minus (-) operator for reductions was deprecated.
8 • The syntax of modifiers without comma separators in the map clause was deprecated.
Fortran
9 • The use of one or more allocate directives with an associated ALLOCATE statement was
10 deprecated.
Fortran
11 • The argument that specified the arguments of the uses_allocators clause as a
12 comma-separated list in which each list item is a clause-argument-specification of the form
13 allocator[(traits)] was deprecated.
14 • The use of the default clause on metadirectives was deprecated.
C / C++
15 • The delimited form of the declare target directive was deprecated.
C / C++
16 • The use of the to clause on the declare target directive was deprecated.
17 • The syntax of the destroy clause on the depobj construct with no argument was deprecated.
18 • The use of the keywords source and sink as task-dependence-type modifiers and the
19 associated syntax for the depend clause was deprecated.
20 • The init clause of the interop construct now accepts an interop_type in any position of the
21 modifier list.
22 • The requirement that the ICVs num-procs-var, thread-num-var, final-task-var, implicit-task-var
23 and team-size-var must also be available with an ompd- prefix was deprecated.
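
As an illustration of moving away from the deprecated minus (-) reduction operator, the following sketch uses the + operator while still subtracting inside the loop. It assumes the usual reduction semantics, in which each private copy starts at zero and the copies are combined with the original variable by addition, so the result is the same as with the deprecated form. The function name is hypothetical, and the `_OPENMP` guard only keeps compilers without OpenMP support from warning about an unknown pragma.

```c
#include <assert.h>
#include <stddef.h>

/* Subtract every element of values from start. */
long subtract_all(long start, const long *values, int n)
{
    assert(n == 0 || values != NULL);
    long result = start;
#ifdef _OPENMP
    /* + replaces the deprecated - operator; the combiner is addition either way. */
    #pragma omp parallel for reduction(+ : result)
#endif
    for (int i = 0; i < n; ++i)
        result -= values[i];
    return result;
}
```

Compiled without OpenMP, the loop simply runs serially and produces the same value.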



1 The following features were deprecated in Version 5.1:
Fortran
2 • Cray pointer support was deprecated.
3 • Specifying list items that are not of type C_PTR in a use_device_ptr or is_device_ptr
4 clause was deprecated.
Fortran
5 • The use of clauses supplied to the requires directive as context traits was deprecated.
6 • The master affinity policy was deprecated.
7 • The master construct and all combined and composite constructs of which it is a constituent
8 construct were deprecated.
9 • The constant omp_atv_sequential was deprecated.
10 • The ompt_sync_region_barrier and ompt_sync_region_barrier_implicit
11 values of the ompt_sync_region_t enum were deprecated.
12 • The ompt_state_wait_barrier and ompt_state_wait_barrier_implicit
13 values of the ompt_state_t enum were deprecated.
14 The following features were deprecated in Version 5.0:
15 • The nest-var ICV, the OMP_NESTED environment variable, and the omp_set_nested and
16 omp_get_nested routines were deprecated.
17 • Lock hints were renamed to synchronization hints. The following lock hint type and constants
18 were deprecated:
19 – the C/C++ type omp_lock_hint_t and the Fortran kind omp_lock_hint_kind;
20 – the constants omp_lock_hint_none, omp_lock_hint_uncontended,
21 omp_lock_hint_contended, omp_lock_hint_nonspeculative, and
22 omp_lock_hint_speculative.

23 B.2 Version 5.1 to 5.2 Differences


24 • The explicit-task-var ICV has replaced the implicit-task-var ICV and has the opposite meaning
25 and semantics (see Chapter 2). The omp_in_explicit_task routine was added to query if
26 a code region is executed from an explicit task region (see Section 18.5.2).
27 • Major reorganization and numerous changes were made to improve the quality of the
28 specification of OpenMP syntax and to increase consistency of restrictions and their wording.
29 These changes frequently result in the possible perception of differences to preceding versions of
30 the OpenMP specification. However, those differences almost always resolve ambiguities, which
31 may nonetheless have implications for existing implementations and programs.



1 • For OpenMP directives, reserved the omp sentinel (see Section 3.1, Section 3.1.1 and
2 Section 3.1.2) and, for implementation-defined directives that extend the OpenMP directives,
3 reserved the ompx sentinel for C/C++ and free source form Fortran (see Section 3.1 and
4 Section 3.1.2) and the omx sentinel for fixed source form Fortran to accommodate character
5 position requirements (see Section 3.1.1). Reserved clause names that begin with the ompx_
6 prefix for implementation-defined clauses on OpenMP directives (see Section 3.2). Reserved
7 names in the base language that start with the omp_ and ompx_ prefix and reserved the omp and
8 ompx namespaces (see Chapter 4) for the OpenMP runtime API and for implementation-defined
9 extensions to that API (see Chapter 18).
10 • Allowed any clause that can be specified on a paired end directive to be specified on the
11 directive (see Section 3.1), including the copyprivate clause (see Section 5.7.2) and the
12 nowait clause in Fortran (see Section 15.6).
13 • For consistency with the syntax of other definitions of the clause, the syntax of the destroy
14 clause on the depobj construct with no argument was deprecated (see Section 3.5).
15 • For consistency with the syntax of other clauses, the syntax of the linear clause that specifies
16 its argument and linear-modifier as linear-modifier(list) was deprecated and the step modifier
17 was added for specifying the linear step (see Section 5.4.6).
18 • The minus (-) operator for reductions was deprecated (see Section 5.5.5).
19 • The syntax of modifiers without comma separators in the map clause was deprecated (see
20 Section 5.8.3).
21 • To support the complete range of user-defined mappers and to improve consistency of map
22 clause usage, the declare mapper directive was extended to accept iterator-modifier and the
23 present map-type-modifier (see Section 5.8.3 and Section 5.8.8).
24 • If a matching mapped list item is not found in the data environment, the pointer retains its
25 original value as per the firstprivate semantics (see Section 5.8.6).
26 • The enter clause was added as a synonym for the to clause on the declare target directive, and
27 the corresponding to clause was deprecated to reduce parsing ambiguity (see Section 5.8.4 and
28 Section 7.8).
Fortran
29 • Metadirectives (see Section 7.4), assumption directives (see Section 8.3), nothing directives
30 (see Section 8.4), error directives (see Section 8.5) and loop transformation constructs (see
31 Chapter 9) were added to the list of directives that are allowed in a pure procedure (see
32 Chapter 3).
33 • The allocators construct was added to support the use of OpenMP allocators for variables
34 that are allocated by a Fortran ALLOCATE statement, and the application of allocate
35 directives to an ALLOCATE statement was deprecated (see Section 6.7).

1 • For consistency with other constructs with associated base language code, the dispatch
2 construct was extended to allow an optional paired end directive to be specified (see
3 Section 7.6).
Fortran
4 • To support the full range of allocators and to improve consistency with the syntax of other
5 clauses, the argument that specified the arguments of the uses_allocators clause as a
6 comma-separated list in which each list item is a clause-argument-specification of the form
7 allocator[(traits)] was deprecated (see Section 6.8).
8 • To improve code clarity and to reduce ambiguity in this specification, the otherwise clause
9 was added as a synonym for the default clause on metadirectives and the corresponding
10 default clause syntax was deprecated (see Section 7.4.2).
C / C++
11 • To improve overall syntax consistency and to reduce redundancy, the delimited form of the
12 declare target directive was deprecated (see Section 7.8.2).
C / C++
13 • The behavior of the order clause with the concurrent parameter was changed so that it only
14 affects whether a loop schedule is reproducible if a modifier is explicitly specified (see
15 Section 10.3).
16 • Support for the allocate and firstprivate clauses on the scope directive was added
17 (see Section 11.2).
18 • The ompt_callback_work callback work types for worksharing loop were added (see
19 Section 11.5).
20 • To simplify usage, the map clause on a target enter data or target exit data
21 construct now has a default map type that provides the same behavior as the to or from map
22 types, respectively (see Section 13.6 and Section 13.7).
23 • The doacross clause was added as a synonym for the depend clause with the keywords
24 source and sink as dependence-type modifiers and the corresponding depend clause syntax
25 was deprecated to improve code clarity and to reduce parsing ambiguity. Also, the
26 omp_cur_iteration keyword was added to represent an iteration vector that refers to the
27 current logical iteration (see Section 15.9.6).
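
As an illustration of the deprecation of the minus (-) reduction operator listed above, the following C sketch performs a subtractive accumulation with the + reduction-identifier instead. The function name subtract_all is hypothetical. The sketch assumes that each private copy is initialized to zero and that the copies are combined with the original value by addition, so subtracting inside the loop yields the expected result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns start minus the sum of values[0..n-1]. */
long subtract_all(long start, const long *values, size_t n)
{
    assert(values != NULL || n == 0);
    long total = start;

    /* The deprecated spelling would be reduction(-: total). With
       reduction(+: total), each private copy starts at 0, collects
       negative contributions, and is then added to the original total. */
    #pragma omp parallel for reduction(+: total)
    for (size_t i = 0; i < n; ++i) {
        total -= values[i];
    }
    return total;
}
```

The code also compiles without OpenMP support, in which case the pragma is ignored and the loop runs sequentially with the same result.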

28 B.3 Version 5.0 to 5.1 Differences


29 • Full support of C11, C++11, C++14, C++17, C++20 and Fortran 2008 was completed (see
30 Section 1.7).
31 • Various changes throughout the specification were made to provide initial support of Fortran
32 2018 (see Section 1.7).

1 • To support device-specific ICV settings, the environment variable syntax was extended to support
2 device-specific variables (see Section 2.2 and Chapter 21).
3 • The OpenMP directive syntax was extended to include C++ attribute specifiers (see Section 3.1).
4 • The omp_all_memory reserved locator was added (see Section 3.1), and the depend clause
5 was extended to allow its use (see Section 15.9.5).
6 • Support for private and firstprivate as an argument to the default clause in C and
7 C++ was added (see Section 5.4.1).
8 • Support was added so that iterators may be defined and used in a map clause (see Section 5.8.3)
9 or in a data-motion clause on a target update directive (see Section 13.9).
10 • The present argument was added to the defaultmap clause (see Section 5.8.7).
11 • Support for the align clause on the allocate directive and allocator and align
12 modifiers on the allocate clause was added (see Chapter 6).
13 • The target_device trait set was added to the OpenMP context (see Section 7.1), and the
14 target_device selector set was added to context selectors (see Section 7.2).
15 • For C/C++, the declare variant directive was extended to support elision of preprocessed code
16 and to allow enclosed function definitions to be interpreted as variant functions (see Section 7.5).
17 • The declare variant directive was extended with new clauses (adjust_args and
18 append_args) that support adjustment of the interface between the original function and its
19 variants (see Section 7.5).
20 • The dispatch construct was added to allow users to control when variant substitution happens
21 and to define additional information that can be passed as arguments to the function variants (see
22 Section 7.6).
23 • Support was added for indirect calls to the device version of a procedure or function in target
24 regions (see Section 7.8).
25 • Assumption directives were added to allow users to specify invariants (see Section 8.3).
26 • To support clarity in metadirectives, the nothing directive was added (see Section 8.4).
27 • To allow users to control the compilation process and runtime error actions, the error directive
28 was added (see Section 8.5).
29 • Loop transformation constructs were added (see Chapter 9).
30 • The masked construct was added to support restricting execution to a specific thread (see
31 Section 10.5).
32 • The scope directive was added to support reductions without requiring a parallel or
33 worksharing region (see Section 11.2).

1 • The grainsize and num_tasks clauses for the taskloop construct were extended with a
2 strict modifier to ensure a deterministic distribution of logical iterations to tasks (see
3 Section 12.6).
4 • The thread_limit clause was added to the target construct to control the upper bound on
5 the number of threads in the created contention group (see Section 13.8).
6 • The has_device_addr clause was added to the target construct to allow access to
7 variables or array sections that already have a device address (see Section 13.8).
8 • The interop directive was added to enable portable interoperability with foreign execution
9 contexts used to implement OpenMP (see Section 14.1). Runtime routines that facilitate use of
10 omp_interop_t objects were also added (see Section 18.12).
11 • The nowait clause was added to the taskwait directive to support insertion of non-blocking
12 join operations in a task dependence graph (see Section 15.5).
13 • Support was added for compare-and-swap and (for C and C++) minimum and maximum atomic
14 operations through the compare clause. Support was also added for the specification of the
15 memory order to apply to a failed comparing atomic operation with the fail clause (see
16 Section 15.8.4).
17 • Specification of the seq_cst clause on a flush construct was allowed, with the same
18 meaning as a flush construct without a list and without a clause (see Section 15.8.5).
19 • To support inout sets, the inoutset argument was added to the depend clause (see
20 Section 15.9.5).
21 • The omp_set_num_teams and omp_set_teams_thread_limit runtime routines were
22 added to control the number of teams and the size of those teams on the teams construct (see
23 Section 18.4.3 and Section 18.4.5). Additionally, the omp_get_max_teams and
24 omp_get_teams_thread_limit runtime routines were added to retrieve the values that
25 will be used in the next teams construct (see Section 18.4.4 and Section 18.4.6).
26 • The omp_target_is_accessible runtime routine was added to test whether host memory
27 is accessible from a given device (see Section 18.8.4).
28 • To support asynchronous device memory management, omp_target_memcpy_async and
29 omp_target_memcpy_rect_async runtime routines were added (see Section 18.8.7 and
30 Section 18.8.8).
31 • The omp_get_mapped_ptr runtime routine was added to support obtaining the device
32 pointer that is associated with a host pointer for a given device (see Section 18.8.11).
33 • The omp_calloc, omp_realloc, omp_aligned_alloc and omp_aligned_calloc
34 API routines were added (see Section 18.13).
35 • For the omp_alloctrait_key_t enum, the omp_atv_serialized value was added and
36 the omp_atv_default value was changed (see Section 18.13.1).

1 • The omp_display_env runtime routine was added to provide information about ICVs and
2 settings of environment variables (see Section 18.15).
3 • The ompt_scope_beginend value was added to the ompt_scope_endpoint_t enum
4 to indicate the coincident beginning and end of a scope (see Section 19.4.4.11).
5 • The ompt_sync_region_barrier_implicit_workshare,
6 ompt_sync_region_barrier_implicit_parallel, and
7 ompt_sync_region_barrier_teams values were added to the
8 ompt_sync_region_t enum (see Section 19.4.4.14).
9 • Values for asynchronous data transfers were added to the ompt_target_data_op_t enum
10 (see Section 19.4.4.15).
11 • The ompt_state_wait_barrier_implementation and
12 ompt_state_wait_barrier_teams values were added to the ompt_state_t enum
13 (see Section 19.4.4.28).
14 • The ompt_callback_target_data_op_emi_t, ompt_callback_target_emi_t,
15 ompt_callback_target_map_emi_t, and
16 ompt_callback_target_submit_emi_t callbacks were added to support external
17 monitoring interfaces (see Section 19.5.2.25, Section 19.5.2.26, Section 19.5.2.27 and
18 Section 19.5.2.28).
19 • The ompt_callback_error_t type was added (see Section 19.5.2.30).
20 • The OMP_PLACES syntax was extended (see Section 21.1.6).
21 • The OMP_NUM_TEAMS and OMP_TEAMS_THREAD_LIMIT environment variables were added
22 to control the number and size of teams on the teams construct (see Section 21.6.1 and
23 Section 21.6.2).
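
As an illustration of the compare clause on the atomic construct listed above, the following C sketch maintains a shared maximum with a conditional atomic update. The function name atomic_max is hypothetical, and the _OPENMP version check (202011 for version 5.1) is an assumption about how an implementation advertises support; when the check fails, the sketch falls back to a critical construct.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the largest element of data[0..n-1]. */
int atomic_max(const int *data, size_t n)
{
    assert(data != NULL && n > 0);
    int best = data[0];

    #pragma omp parallel for shared(best)
    for (size_t i = 1; i < n; ++i) {
        int v = data[i];
#if defined(_OPENMP) && _OPENMP >= 202011
        /* OpenMP 5.1 or later: the comparison and the store form one
           atomic conditional update. */
        #pragma omp atomic compare
        if (best < v) { best = v; }
#else
        /* Fallback for older or non-OpenMP compilers. */
        #pragma omp critical
        {
            if (best < v) { best = v; }
        }
#endif
    }
    return best;
}
```

A max reduction would usually be the simpler choice for this computation; the atomic form is shown only to demonstrate the conditional-update syntax.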

24 B.4 Version 4.5 to 5.0 Differences


25 • The memory model was extended to distinguish different types of flush operations according to
26 specified flush properties (see Section 1.4.4) and to define a happens before order based on
27 synchronizing flush operations (see Section 1.4.5).
28 • Various changes throughout the specification were made to provide initial support of C11,
29 C++11, C++14, C++17 and Fortran 2008 (see Section 1.7).
30 • Full support of Fortran 2003 was completed (see Section 1.7).
31 • The target-offload-var internal control variable (see Chapter 2) and the
32 OMP_TARGET_OFFLOAD environment variable (see Section 21.2.8) were added to support
33 runtime control of the execution of device constructs.

1 • Control over whether nested parallelism is enabled or disabled was integrated into the
2 max-active-levels-var internal control variable (see Section 2.2), the default value of which is
3 now implementation defined, unless determined according to the values of the
4 OMP_NUM_THREADS (see Section 21.1.2) or OMP_PROC_BIND (see Section 21.1.7)
5 environment variables.
6 • Support for array shaping (see Section 3.2.4) and for array sections with non-unit strides in C and
7 C++ (see Section 3.2.5) was added to facilitate specification of discontiguous storage, and the
8 target update construct (see Section 13.9) and the depend clause (see Section 15.9.5)
9 were extended to allow the use of shape-operators (see Section 3.2.4).
10 • Iterators (see Section 3.2.6) were added to support expressions in a list that expand to multiple
11 expressions.
12 • The canonical loop form was defined for Fortran and, for all base languages, extended to permit
13 non-rectangular loop nests (see Section 4.4.1).
14 • The relational-op in the canonical loop form for C/C++ was extended to include != (see
15 Section 4.4.1).
16 • To support conditional assignment to lastprivate variables, the conditional modifier was
17 added to the lastprivate clause (see Section 5.4.5).
18 • The inscan modifier for the reduction clause (see Section 5.5.8) and the scan directive
19 (see Section 5.6) were added to support inclusive and exclusive scan computations.
20 • To support task reductions, the task modifier was added to the reduction clause (see
21 Section 5.5.8), the task_reduction clause (see Section 5.5.9) was added to the
22 taskgroup construct (see Section 15.4), and the in_reduction clause (see Section 5.5.10)
23 was added to the task (see Section 12.5) and target (see Section 13.8) constructs.
24 • To support taskloop reductions, the reduction (see Section 5.5.8) and in_reduction (see
25 Section 5.5.10) clauses were added to the taskloop construct (see Section 12.6).
26 • The description of the map clause was modified to clarify the mapping order when multiple
27 map-types are specified for a variable or structure members of a variable on the same construct.
28 The close map-type-modifier was added as a hint for the runtime to allocate memory close to
29 the target device (see Section 5.8.3).
30 • The capability to map C/C++ pointer variables and to assign the address of device memory that
31 is mapped by an array section to them was added. Support for mapping of Fortran pointer and
32 allocatable variables, including pointer and allocatable components of variables, was added (see
33 Section 5.8.3).
34 • The defaultmap clause (see Section 5.8.7) was extended to allow selecting the data-mapping
35 or data-sharing attributes for any of the scalar, aggregate, pointer, or allocatable classes on a
36 per-region basis. Additionally, it accepts the none parameter to support the requirement that all
37 variables referenced in the construct must be explicitly mapped or privatized.

1 • The declare mapper directive was added to support mapping of data types with direct and
2 indirect members (see Section 5.8.8).
3 • Predefined memory spaces (see Section 6.1), predefined memory allocators and allocator traits
4 (see Section 6.2) and directives, clauses and API routines (see Chapter 6 and Section 18.13) to
5 use them were added to support different kinds of memories.
6 • Metadirectives (see Section 7.4) and declare variant directives (see Section 7.5) were added to
7 support selection of directive variants and declared function variants at a call site, respectively,
8 based on compile-time traits of the enclosing context.
9 • Support for nested declare target directives was added (see Section 7.8).
10 • The requires directive (see Section 8.2) was added to support applications that require
11 implementation-specific features.
12 • The teams construct (see Section 10.2) was extended to support execution on the host device
13 without an enclosing target construct (see Section 13.8).
14 • The loop construct and the order(concurrent) clause were added to support compiler
15 optimization and parallelization of loops for which iterations may execute in any order, including
16 concurrently (see Section 10.3 and Section 11.7).
17 • The collapse of associated loops that are imperfectly nested loops was defined for the simd (see
18 Section 10.4), worksharing-loop (see Section 11.5), distribute (see Section 11.6) and
19 taskloop (see Section 12.6) constructs.
20 • The simd construct (see Section 10.4) was extended to accept the if, nontemporal, and
21 order(concurrent) clauses and to allow the use of atomic constructs within it.
22 • The default loop schedule modifier for worksharing-loop constructs without the static
23 schedule and the ordered clause was changed to nonmonotonic (see Section 11.5).
24 • The affinity clause was added to the task construct (see Section 12.5) to support hints that
25 indicate data affinity of explicit tasks.
26 • The detach clause for the task construct (see Section 12.5) and the omp_fulfill_event
27 runtime routine (see Section 18.11.1) were added to support execution of detachable tasks.
28 • The taskloop construct (see Section 12.6) was added to the list of constructs that can be
29 canceled by the cancel construct (see Section 16.1).
30 • To support mutually exclusive inout sets, a mutexinoutset dependence-type was added to
31 the depend clause (see Section 12.9 and Section 15.9.5).
32 • The semantics of the use_device_ptr clause for pointer variables was clarified and the
33 use_device_addr clause for using the device address of non-pointer variables inside the
34 target data construct was added (see Section 13.5).
35 • To support reverse offload, the ancestor modifier was added to the device clause for the
36 target construct (see Section 13.8).

1 • To reduce programmer effort, implicit declare target directives for some functions (C, C++,
2 Fortran) and subroutines (Fortran) were added (see Section 13.8 and Section 7.8).
3 • The target update construct (see Section 13.9) was modified to allow array sections that
4 specify discontiguous storage.
5 • The to and from clauses on the target update construct (see Section 13.9), the depend
6 clause on task generating constructs (see Section 15.9.5), and the map clause (see Section 5.8.3)
7 were extended to allow any lvalue expression as a list item for C/C++.
8 • Lock hints were renamed to synchronization hints, and the old names were deprecated (see
9 Section 15.1).
10 • The depend clause was added to the taskwait construct (see Section 15.5).
11 • To support acquire and release semantics with weak memory ordering, the acq_rel,
12 acquire, and release clauses were added to the atomic construct (see Section 15.8.4) and
13 flush construct (see Section 15.8.5), and the memory ordering semantics of implicit flushes on
14 various constructs and runtime routines were clarified (see Section 15.8.6).
15 • The atomic construct was extended with the hint clause (see Section 15.8.4).
16 • The depend clause (see Section 15.9.5) was extended to support iterators and to support depend
17 objects that can be created with the new depobj construct.
18 • New combined constructs master taskloop, parallel master,
19 parallel master taskloop, master taskloop simd, and
20 parallel master taskloop simd (see Section 17.3) were added.
21 • The omp_set_nested (see Section 18.2.9) and omp_get_nested (see Section 18.2.10)
22 routines and the OMP_NESTED environment variable (see Section 21.1.5) were deprecated.
23 • The omp_get_supported_active_levels routine was added to query the number of
24 active levels of parallelism supported by the implementation (see Section 18.2.14).
25 • Runtime routines omp_set_affinity_format (see Section 18.3.8),
26 omp_get_affinity_format (see Section 18.3.9), omp_display_affinity (see
27 Section 18.3.10), and omp_capture_affinity (see Section 18.3.11) and environment
28 variables OMP_DISPLAY_AFFINITY (see Section 21.2.4) and OMP_AFFINITY_FORMAT
29 (see Section 21.2.5) were added to provide OpenMP runtime thread affinity information.
30 • The omp_pause_resource and omp_pause_resource_all runtime routines were
31 added to allow the runtime to relinquish resources used by OpenMP (see Section 18.6.1 and
32 Section 18.6.2).
33 • The omp_get_device_num runtime routine (see Section 18.7.5) was added to support
34 determination of the device on which a thread is executing.
35 • Support for a first-party tool interface (see Chapter 19) was added.
36 • Support for a third-party tool interface (see Chapter 20) was added.

1 • Support for controlling offloading behavior with the OMP_TARGET_OFFLOAD environment
2 variable was added (see Section 21.2.8).
3 • Stubs for Runtime Library Routines (previously Appendix A) were moved to a separate
4 document.
5 • Interface Declarations (previously Appendix B) were moved to a separate document.
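
As an illustration of the inscan reduction-modifier and the scan directive listed above, the following C sketch computes an inclusive prefix sum. The function name prefix_sum is hypothetical, and the sketch assumes a compiler that accepts the scan directive on a combined parallel worksharing-loop construct.

```c
#include <assert.h>

/* Hypothetical helper: out[i] = in[0] + ... + in[i]; returns the grand total. */
long prefix_sum(const int *in, long *out, int n)
{
    assert(n >= 0);
    long running = 0;

    #pragma omp parallel for reduction(inscan, +: running)
    for (int i = 0; i < n; ++i) {
        running += in[i];                    /* input phase */
        #pragma omp scan inclusive(running)
        out[i] = running;                    /* scan phase: inclusive prefix */
    }
    return running;
}
```

Replacing inclusive with exclusive, and moving the read of running ahead of the scan directive and the update after it, would give the exclusive variant of the same computation.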

6 B.5 Version 4.0 to 4.5 Differences


7 • Support for several features of Fortran 2003 was added (see Section 1.7).
8 • The if clause was extended to take a directive-name-modifier that allows it to apply to combined
9 constructs (see Section 3.4).
10 • The implicit data-sharing attribute for scalar variables in target regions was changed to
11 firstprivate (see Section 5.1.1).
12 • Use of some C++ reference types was allowed in some data-sharing attribute clauses (see
13 Section 5.4).
14 • The ref, val, and uval modifiers were added to the linear clause (see Section 5.4.6).
15 • Semantics for reductions on C/C++ array sections were added and restrictions on the use of
16 arrays and pointers in reductions were removed (see Section 5.5.8).
17 • Support was added to the map clauses to handle structure elements (see Section 5.8.3).
18 • To support unstructured data mapping for devices, the map clause (see Section 5.8.3) was
19 updated and the target enter data (see Section 13.6) and target exit data (see
20 Section 13.7) constructs were added.
21 • The declare target directive was extended to allow mapping of global variables to be
22 deferred to specific device executions and to allow an extended-list to be specified in C/C++ (see
23 Section 7.8).
24 • The simdlen clause was added to the simd construct (see Section 10.4) to support
25 specification of the exact number of iterations desired per SIMD chunk.
26 • A parameter was added to the ordered clause of the worksharing-loop construct (see
27 Section 11.5) and clauses were added to the ordered construct (see Section 15.10) to support
28 doacross loop nests and use of the simd construct on loops with loop-carried backward
29 dependences.
30 • The linear clause was added to the worksharing-loop construct (see Section 11.5).
31 • The priority clause was added to the task construct (see Section 12.5) to support hints that
32 specify the relative execution priority of explicit tasks. The
33 omp_get_max_task_priority routine was added to return the maximum supported
1 priority value (see Section 18.5.1) and the OMP_MAX_TASK_PRIORITY environment variable
2 was added to control the maximum priority value allowed (see Section 21.2.9).
3 • The taskloop construct (see Section 12.6) was added to support nestable parallel loops that
4 create OpenMP tasks.
5 • To support interaction with native device implementations, the use_device_ptr clause was
6 added to the target data construct (see Section 13.5) and the is_device_ptr clause was
7 added to the target construct (see Section 13.8).
8 • The nowait and depend clauses were added to the target construct (see Section 13.8) to
9 improve support for asynchronous execution of target regions.
10 • The private, firstprivate and defaultmap clauses were added to the target
11 construct (see Section 13.8).
12 • The hint clause was added to the critical construct (see Section 15.2).
13 • The source and sink dependence types were added to the depend clause (see
14 Section 15.9.5) to support doacross loop nests.
15 • To support a more complete set of device construct shortcuts, the target parallel, target
16 parallel worksharing-loop, target parallel worksharing-loop SIMD, and target simd (see
17 Section 17.3) combined constructs were added.
18 • Query functions for OpenMP thread affinity were added (see Section 18.3.2 to Section 18.3.7).
19 • Device memory routines were added to allow explicit allocation, deallocation, memory transfers,
20 and memory associations (see Section 18.8).
21 • The lock API was extended with lock routines that support storing a hint with a lock to select a
22 desired lock implementation for a lock’s intended usage by the application code (see
23 Section 18.9.2).
24 • C/C++ Grammar (previously Appendix B) was moved to a separate document.
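
As an illustration of reductions on C/C++ array sections listed above, the following C sketch builds a histogram in which every bin is a reduction list item. The function name histogram and the binning rule (value modulo the number of bins) are hypothetical choices made only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: adds one count to bins[values[i] mod nbins] for each
   element. The caller is expected to initialize bins beforehand. */
void histogram(const int *values, size_t n, int *bins, int nbins)
{
    assert(bins != NULL && nbins > 0);

    /* Each thread accumulates into a private, zero-initialized copy of
       bins[0..nbins-1]; the copies are added to the original bins at the end. */
    #pragma omp parallel for reduction(+: bins[0:nbins])
    for (size_t i = 0; i < n; ++i) {
        int b = values[i] % nbins;
        if (b < 0) {
            b += nbins;          /* keep the index non-negative */
        }
        bins[b] += 1;
    }
}
```

Without the array-section reduction, the update of bins[b] would need an atomic construct or a manually managed per-thread copy.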

25 B.6 Version 3.1 to 4.0 Differences


26 • Various changes throughout the specification were made to provide initial support of Fortran
27 2003 (see Section 1.7).
28 • C/C++ array syntax was extended to support array sections (see Section 3.2.5).
29 • The reduction clause (see Section 5.5.8) was extended and the declare reduction
30 construct (see Section 5.5.11) was added to support user defined reductions.
31 • The proc_bind clause (see Section 10.1.3), the OMP_PLACES environment variable (see
32 Section 21.1.6), and the omp_get_proc_bind runtime routine (see Section 18.3.1) were
33 added to support thread affinity policies.

1 • SIMD directives were added to support SIMD parallelism (see Section 10.4).
2 • Implementation defined task scheduling points for untied tasks were removed (see Section 12.9).
3 • Device directives (see Chapter 13), the OMP_DEFAULT_DEVICE environment variable (see
4 Section 21.2.7), and the omp_set_default_device, omp_get_default_device,
5 omp_get_num_devices, omp_get_num_teams, omp_get_team_num, and
6 omp_is_initial_device routines were added to support execution on devices.
7 • The taskgroup construct (see Section 15.4) was added to support deep task synchronization.
8 • The atomic construct (see Section 15.8.4) was extended to support atomic swap with the
9 capture clause, to allow new atomic update and capture forms, and to support sequentially
10 consistent atomic operations with a new seq_cst clause.
11 • The depend clause (see Section 15.9.5) was added to support task dependences.
12 • The cancel construct (see Section 16.1), the cancellation point construct (see
13 Section 16.2), the omp_get_cancellation runtime routine (see Section 18.2.8), and the
14 OMP_CANCELLATION environment variable (see Section 21.2.6) were added to support the
15 concept of cancellation.
16 • The OMP_DISPLAY_ENV environment variable (see Section 21.7) was added to display the
17 value of ICVs associated with the OpenMP environment variables.
18 • Examples (previously Appendix A) were moved to a separate document.
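
As an illustration of the depend clause listed above, the following C sketch orders three tasks through in and out dependences on shared variables. The function name dependent_pipeline and the arithmetic performed by each task are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: computes ((seed + 1) * 2) - 3 in three dependent tasks. */
int dependent_pipeline(int seed)
{
    int a = 0, b = 0, c = 0;

    #pragma omp parallel
    #pragma omp single
    {
        /* Producer of a. */
        #pragma omp task depend(out: a) shared(a)
        a = seed + 1;

        /* Runs only after the task that writes a has completed. */
        #pragma omp task depend(in: a) depend(out: b) shared(a, b)
        b = a * 2;

        /* Runs only after the task that writes b has completed. */
        #pragma omp task depend(in: b) depend(out: c) shared(b, c)
        c = b - 3;

        #pragma omp taskwait
    }

    /* Sanity check that the dependences were honored. */
    assert(c == (seed + 1) * 2 - 3);
    return c;
}
```

Without the depend clauses the three tasks could execute in any order, and the result would depend on scheduling.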

19 B.7 Version 3.0 to 3.1 Differences


20 • The bind-var ICV (see Section 2.1) and the OMP_PROC_BIND environment variable (see
21 Section 21.1.7) were added to support control of whether threads are bound to processors.
22 • Data environment restrictions were changed to allow intent(in) and const-qualified types
23 for the firstprivate clause (see Section 5.4.4).
24 • Data environment restrictions were changed to allow Fortran pointers in firstprivate (see
25 Section 5.4.4) and lastprivate (see Section 5.4.5) clauses.
26 • New reduction operators min and max were added for C and C++ (see Section 5.5).
27 • The nthreads-var ICV was modified to be a list of the number of threads to use at each nested
28 parallel region level, and the algorithm for determining the number of threads used in a parallel
29 region was modified to handle a list (see Section 10.1.1).
30 • The final and mergeable clauses (see Section 12.5) were added to the task construct to
31 support optimization of task data environments.
32 • The taskyield construct (see Section 12.7) was added to allow user-defined task scheduling
33 points.

1 • The atomic construct (see Section 15.8.4) was extended to include read, write, and
2 capture forms, and an update clause was added to apply the already existing form of the
3 atomic construct.
4 • The nesting restrictions in Section 17.1 were clarified to disallow closely-nested OpenMP
5 regions within an atomic region so that an atomic region can be consistently defined with
6 other OpenMP regions to include all code in the atomic construct.
7 • The omp_in_final runtime library routine (see Section 18.5.3) was added to support
8 specialization of final task regions.
9 • Descriptions of examples (previously Appendix A) were expanded and clarified.
10 • Incorrect use of omp_integer_kind in Fortran interfaces was replaced with
11 selected_int_kind(8).
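
As an illustration of the capture form of the atomic construct listed above, the following C sketch hands out unique output slots from a shared counter while filtering an array. The function name keep_positive is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies the positive elements of in[0..n-1] to out and
   returns how many were copied. The order of the elements in out is
   unspecified when the loop runs in parallel. */
size_t keep_positive(const int *in, size_t n, int *out)
{
    assert(out != NULL || n == 0);
    size_t count = 0;

    #pragma omp parallel for shared(count)
    for (size_t i = 0; i < n; ++i) {
        if (in[i] > 0) {
            size_t slot;
            /* Atomically read the old counter value and increment it. */
            #pragma omp atomic capture
            slot = count++;
            out[slot] = in[i];
        }
    }
    return count;
}
```

Because each iteration captures a distinct value of the counter, no two iterations write to the same element of out.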

12 B.8 Version 2.5 to 3.0 Differences


13 • The definition of active parallel region was changed so that a parallel region is active if
14 it is executed by a team that consists of more than one thread (see Section 1.2.2).
15 • The concept of tasks was added to the execution model (see Section 1.2.5 and Section 1.3).
16 • The OpenMP memory model was extended to cover atomicity of memory accesses (see
17 Section 1.4.1). The description of the behavior of volatile in terms of flush was removed.
18 • The definitions of the nest-var, dyn-var, nthreads-var and run-sched-var internal control variables
19 (ICVs) were modified to provide one copy of these ICVs per task instead of one copy for the
20 whole program (see Chapter 2). The omp_set_num_threads, omp_set_nested, and
21 omp_set_dynamic runtime library routines were specified to support their use from inside a
22 parallel region (see Section 18.2.1, Section 18.2.6 and Section 18.2.9).
23 • The thread-limit-var ICV, the omp_get_thread_limit runtime library routine and the
24 OMP_THREAD_LIMIT environment variable were added to support control of the maximum
25 number of threads (see Section 2.1, Section 18.2.13 and Section 21.1.3).
26 • The max-active-levels-var ICV, omp_set_max_active_levels and
27 omp_get_max_active_levels runtime library routines, and
28 OMP_MAX_ACTIVE_LEVELS environment variable were added to support control of the
29 number of nested active parallel regions (see Section 2.1, Section 18.2.15, Section 18.2.16
30 and Section 21.1.4).
31 • The stacksize-var ICV and the OMP_STACKSIZE environment variable were added to support
32 control of thread stack sizes (see Section 2.1 and Section 21.2.2).
33 • The wait-policy-var ICV and the OMP_WAIT_POLICY environment variable were added to
34 control the desired behavior of waiting threads (see Section 2.1 and Section 21.2.3).

1 • Predetermined data-sharing attributes were defined for Fortran assumed-size arrays (see
2 Section 5.1.1).
3 • Static class member variables were allowed in threadprivate directives (see Section 5.2).
4 • Invocations of constructors and destructors for private and threadprivate class type variables were
5 clarified (see Section 5.2, Section 5.4.3, Section 5.4.4, Section 5.7.1 and Section 5.7.2).
6 • The use of Fortran allocatable arrays was allowed in private, firstprivate,
7 lastprivate, reduction, copyin and copyprivate clauses (see Section 5.2,
8 Section 5.4.3, Section 5.4.4, Section 5.4.5, Section 5.5.8, Section 5.7.1 and Section 5.7.2).
9 • Support for firstprivate was added to the default clause in Fortran (see Section 5.4.1).
10 • Implementations were precluded from using the storage of the original list item to hold the new
11 list item on the primary thread for list items in the private clause, and the value was made
12 well defined on exit from the parallel region if no attempt is made to reference the original
13 list item inside the parallel region (see Section 5.4.3).
14 • Data environment restrictions were changed to allow intent(in) and const-qualified types
15 for the firstprivate clause (see Section 5.4.4).
16 • Data environment restrictions were changed to allow Fortran pointers in firstprivate (see
17 Section 5.4.4) and lastprivate (see Section 5.4.5).
18 • New reduction operators min and max were added for C and C++ (see Section 5.5).
19 • Determination of the number of threads in parallel regions was updated (see Section 10.1.1).
20 • The assignment of iterations to threads in a loop construct with a static schedule kind was
21 made deterministic (see Section 11.5).
22 • The worksharing-loop construct was extended to support association with more than one
23 perfectly nested loop through the collapse clause (see Section 11.5).
24 • Iteration variables for worksharing-loops were allowed to be random access iterators or of
25 unsigned integer type (see Section 11.5).
26 • The schedule kind auto was added to allow the implementation to choose any possible mapping
27 of iterations in a loop construct to threads in the team (see Section 11.5).
28 • The task construct (see Chapter 12) was added to support explicit tasks.
29 • The taskwait construct (see Section 15.5) was added to support task synchronization.
30 • The runtime library routines omp_set_schedule and omp_get_schedule were added to
31 set and to retrieve the value of the run-sched-var ICV (see Section 18.2.11 and Section 18.2.12).
32 • The omp_get_level runtime library routine was added to return the number of nested
33 parallel regions that enclose the task that contains the call (see Section 18.2.17).

• The omp_get_ancestor_thread_num runtime library routine was added to return the
thread number of the ancestor of the current thread (see Section 18.2.18).
• The omp_get_team_size runtime library routine was added to return the size of the thread
team to which the ancestor of the current thread belongs (see Section 18.2.19).
• The omp_get_active_level runtime library routine was added to return the number of
active parallel regions that enclose the task that contains the call (see Section 18.2.20).
• Lock ownership was defined in terms of tasks instead of threads (see Section 18.9).
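
Several of the features listed above can be seen together in a short C sketch. It is illustrative only and not part of the specification text. The names grid_min_max, fib_tasks, fib_parallel, probe_nesting and struct nest_info are hypothetical, and the code assumes a compiler with OpenMP support enabled (for example through an option such as -fopenmp). The _OPENMP guard lets the same source compile serially, in which case the directives are ignored and the nesting probe reports level 0.

```c
#include <assert.h>
#include <limits.h>
#ifdef _OPENMP
#include <omp.h>   /* runtime library routines; only used in OpenMP builds */
#endif

/* Hypothetical helper: min and max of a rows x cols grid stored row-major. */
void grid_min_max(const int *grid, int rows, int cols, int *lo, int *hi)
{
    assert(rows > 0 && cols > 0);
    int mn = INT_MAX, mx = INT_MIN;
    /* collapse(2) associates both loops with the construct, schedule(auto)
       leaves the iteration mapping to the implementation, and min/max are
       used as reduction operators. */
    #pragma omp parallel for collapse(2) schedule(auto) \
            reduction(min : mn) reduction(max : mx)
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++) {
            int v = grid[i * cols + j];
            if (v < mn) mn = v;
            if (v > mx) mx = v;
        }
    *lo = mn;
    *hi = mx;
}

/* Hypothetical helper: Fibonacci with explicit tasks and taskwait. */
long fib_tasks(int n)
{
    if (n < 2)
        return n;
    long a, b;
    #pragma omp task shared(a)
    a = fib_tasks(n - 1);
    #pragma omp task shared(b)
    b = fib_tasks(n - 2);
    /* Wait for both child tasks before reading a and b. */
    #pragma omp taskwait
    return a + b;
}

/* One thread of the team generates the root task; the others may help. */
long fib_parallel(int n)
{
    long r = 0;
    #pragma omp parallel
    #pragma omp single
    r = fib_tasks(n);
    return r;
}

/* Values observed from inside two nested parallel regions. */
struct nest_info {
    int level;            /* omp_get_level()        */
    int active_level;     /* omp_get_active_level() */
    int outer_team_size;  /* omp_get_team_size(1)   */
    int inner_team_size;  /* omp_get_team_size(2)   */
};

struct nest_info probe_nesting(void)
{
    struct nest_info info = {0, 0, 1, 1};   /* serial defaults */
#ifdef _OPENMP
    #pragma omp parallel num_threads(2)
    {
        #pragma omp parallel num_threads(1)
        {
            /* Only the thread whose level-1 ancestor is thread 0 records
               the values, so the shared struct has a single writer. */
            if (omp_get_ancestor_thread_num(1) == 0) {
                info.level = omp_get_level();
                info.active_level = omp_get_active_level();
                info.outer_team_size = omp_get_team_size(1);
                info.inner_team_size = omp_get_team_size(2);
            }
        }
    }
#endif
    return info;
}
```

In an OpenMP build, probe_nesting() is expected to report a level of 2, because both enclosing parallel regions are counted, while the active level depends on how many threads the implementation actually grants to the outer region. The inner region requests a single thread and therefore does not count as active.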

