mirror of
https://forge.sourceware.org/marek/gcc.git
synced 2026-02-22 03:47:02 -05:00
reflection
158 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
254a858ae7 | Update copyright years. | ||
|
|
05c2ad4a2e |
OpenMP/Fortran: Allow explicit map followed by implicit deep mapping [PR120505]
Consider the following source code, assuming tiles is allocatable:
```
!$omp target enter data map(var%tiles(1)%den1, var%tiles(1)%den2) ! (1)
[...]
!$omp target ! implicitly maps var, which triggers deep mapping of tiles (2)
```
Each omp directive causes a run-time error in libgomp:
(1) libgomp: Mapped array elements must be the same (0x14d729c0 vs 0x14d72a18)
(2) libgomp: Trying to map into device [0x3704ca50..0x3704cb00) object when
[0x3704ca50..0x3704caa8) is already mapped
Regarding (1), the OpenMP spec has the following restriction: "If multiple list
items are explicitly mapped on the same construct and have the same containing
array or have base pointers that share original storage, and if any of the list
items do not have corresponding list items that are present in the device data
environment prior to a task encountering the construct, then the list items must
refer to *the same array elements* of either the containing array or the
implicit array of the base pointers."
Because tiles is allocatable, we cannot prove at compile time that array
elements are the same, so the check is deferred to libgomp. But there the
condition enforcing that all addresses are the same is too strict, so this patch
relaxes it to only check that addresses are sorted in increasing order.
The OpenMP spec allows (2) as long as it is implicit, without extending the
original mapping. So this patch sets the GOMP_MAP_IMPLICIT flag appropriately
on deep maps at compile time to let libgomp know that it is fine.
This patch ensures that such user code is accepted by:
(1) Setting the GOMP_MAP_IMPLICIT flag appropriately on deep maps;
(2) Relaxing the restriction on struct mapping from different containing arrays,
so that the element index need not be the same, instead addresses must be sorted
in increasing order.
This fixes the two errors currently seen when running SPEC HPC clvleaf
benchmark. However, further mapping issues prevent the benchmark from running to
completion.
PR fortran/120505
gcc/ChangeLog:
* omp-low.cc (lower_omp_target): Set GOMP_MAP_IMPLICIT flag.
libgomp/ChangeLog:
* target.c (gomp_map_vars_internal): Allow struct mapping from different
containing array elements as long as adresses are in increasing order.
* testsuite/libgomp.c-c++-common/map-arrayofstruct-2.c: Adjust
dg-output.
* testsuite/libgomp.c-c++-common/map-arrayofstruct-3.c: Likewise.
* testsuite/libgomp.fortran/map-subarray-5.f90: Likewise.
* testsuite/libgomp.fortran/map-subarray-10.f90: New test.
* testsuite/libgomp.fortran/map-subarray-9.f90: New test.
|
||
|
|
62174ec27b |
openmp, nvptx: ompx_gnu_managed_mem_alloc
This adds support for using Cuda Managed Memory with omp_alloc. AMD support will be added in a future patch. There is one new predefined allocator, "ompx_gnu_managed_mem_alloc", plus a corresponding memory space, which can be used to allocate memory in the "managed" space. The nvptx plugin is modified to make the necessary Cuda calls, via two new (optional) plugin interfaces. gcc/fortran/ChangeLog: * openmp.cc (is_predefined_allocator): Use GOMP_OMP_PREDEF_ALLOC_MAX and GOMP_OMPX_PREDEF_ALLOC_MIN/MAX instead of hardcoded values in the comment. include/ChangeLog: * cuda/cuda.h (cuMemAllocManaged): Add declaration and related CU_MEM_ATTACH_GLOBAL flag. * gomp-constants.h (GOMP_OMPX_PREDEF_ALLOC_MAX): Update to 201. (GOMP_OMP_PREDEF_MEMSPACE_MAX): New constant. (GOMP_OMPX_PREDEF_MEMSPACE_MIN): New constant. (GOMP_OMPX_PREDEF_MEMSPACE_MAX): New constant. libgomp/ChangeLog: * allocator.c (ompx_gnu_max_predefined_alloc): Update to ompx_gnu_managed_mem_alloc. (_Static_assert): Fix assertion messages for allocators and add new assertions for memspace constants. (omp_max_predefined_mem_space): New define. (ompx_gnu_min_predefined_mem_space): New define. (ompx_gnu_max_predefined_mem_space): New define. (MEMSPACE_ALLOC): Add check for non-standard memspaces. (MEMSPACE_CALLOC): Likewise. (MEMSPACE_REALLOC): Likewise. (MEMSPACE_VALIDATE): Likewise. (predefined_ompx_gnu_alloc_mapping): Add ompx_gnu_managed_mem_space. (omp_init_allocator): Add ompx_gnu_managed_mem_space validation. * config/gcn/allocator.c (gcn_memspace_alloc): Add check for non-standard memspaces. (gcn_memspace_calloc): Likewise. (gcn_memspace_realloc): Likewise. (gcn_memspace_validate): Update to validate standard vs non-standard memspaces. * config/linux/allocator.c (linux_memspace_alloc): Add managed memory space handling. (linux_memspace_calloc): Likewise. (linux_memspace_free): Likewise. (linux_memspace_realloc): Likewise (returns NULL for fallback). * config/nvptx/allocator.c (nvptx_memspace_alloc): Add check for non-standard memspaces. (nvptx_memspace_calloc): Likewise. (nvptx_memspace_realloc): Likewise. (nvptx_memspace_validate): Update to validate standard vs non-standard memspaces. * env.c (parse_allocator): Add ompx_gnu_managed_mem_alloc, ompx_gnu_managed_mem_space, and some static asserts so I don't forget them again. * libgomp-plugin.h (GOMP_OFFLOAD_managed_alloc): New declaration. (GOMP_OFFLOAD_managed_free): New declaration. * libgomp.h (gomp_managed_alloc): New declaration. (gomp_managed_free): New declaration. (struct gomp_device_descr): Add managed_alloc_func and managed_free_func fields. * libgomp.texi: Document ompx_gnu_managed_mem_alloc and ompx_gnu_managed_mem_space, add C++ template documentation, and describe NVPTX and AMD support. * omp.h.in: Add ompx_gnu_managed_mem_space and ompx_gnu_managed_mem_alloc enumerators, and gnu_managed_mem C++ allocator template. * omp_lib.f90.in: Add Fortran bindings for new allocator and memory space. * omp_lib.h.in: Likewise. * plugin/cuda-lib.def: Add cuMemAllocManaged. * plugin/plugin-nvptx.c (nvptx_alloc): Add managed parameter to support cuMemAllocManaged. (GOMP_OFFLOAD_alloc): Move contents to ... (cleanup_and_alloc): ... this new function, and add managed support. (GOMP_OFFLOAD_managed_alloc): New function. (GOMP_OFFLOAD_managed_free): New function. * target.c (gomp_managed_alloc): New function. (gomp_managed_free): New function. (gomp_load_plugin_for_device): Load optional managed_alloc and managed_free plugin APIs. * testsuite/lib/libgomp.exp: Add check_effective_target_omp_managedmem. * testsuite/libgomp.c++/alloc-managed-1.C: New test. * testsuite/libgomp.c/alloc-managed-1.c: New test. * testsuite/libgomp.c/alloc-managed-2.c: New test. * testsuite/libgomp.c/alloc-managed-3.c: New test. * testsuite/libgomp.c/alloc-managed-4.c: New test. * testsuite/libgomp.fortran/alloc-managed-1.f90: New test. Co-authored-by: Kwok Cheung Yeung <kcyeung@baylibre.com> Co-authored-by: Thomas Schwinge <tschwinge@baylibre.com> |
||
|
|
5da963d988 |
OpenMP: Add omp_default_device named constant [PR119677]
OpenMP TR 14 (OpenMP 6.1) adds omp_default_device < -1 as named constant alongside omp_initial_device and omp_default_device. GCC supports it already internally via GOMP_DEVICE_DEFAULT_OMP_61, but this patch now adds the omp_default_device enum/PARAMETER to omp.h / omp_lib. Note that PR119677 requests some cleanups, which still have to be done. PR libgomp/119677 gcc/fortran/ChangeLog: * intrinsic.texi (OpenMP Modules): Add omp_default_device. * openmp.cc (gfc_resolve_omp_context_selector): Accept omp_default_device as conforming device number. libgomp/ChangeLog: * omp.h.in (omp_default_device): New enum value. * omp_lib.f90.in: New parameter. * omp_lib.h.in: Likewise * target.c (gomp_get_default_device): New. Split off from ... (resolve_device): ... here; call it. (omp_target_alloc, omp_target_free, omp_target_is_present, omp_target_memcpy_check, omp_target_memset, omp_target_memset_async, omp_target_associate_ptr, omp_get_mapped_ptr, omp_target_is_accessible, omp_pause_resource, omp_get_uid_from_device): Handle omp_default_device. * testsuite/libgomp.c/device_uid.c: Likewise. * testsuite/libgomp.fortran/device_uid.f90: Likewise. * testsuite/libgomp.c-c++-common/omp-default-device.c: New test. * testsuite/libgomp.fortran/omp-default-device.f90: New test. |
||
|
|
3b8d9d579c |
libgomp, nvptx: Cuda pinned memory
Use Cuda to pin memory, instead of Linux mlock, when available. There are two advantages: firstly, this gives a significant speed boost for NVPTX offloading, and secondly, it side-steps the usual OS ulimit/rlimit setting. The design adds a device independent plugin API for allocating pinned memory, and then implements it for NVPTX. At present, the other supported devices do not have equivalent capabilities (or requirements). libgomp/ChangeLog: * config/linux/allocator.c: Include assert.h. (using_device_for_page_locked): New variable. (linux_memspace_alloc): Add init0 parameter. Support device pinning. (linux_memspace_calloc): Set init0 to true. (linux_memspace_free): Support device pinning. (linux_memspace_realloc): Support device pinning. (MEMSPACE_ALLOC): Set init0 to false. * libgomp-plugin.h (GOMP_OFFLOAD_page_locked_host_alloc): New prototype. (GOMP_OFFLOAD_page_locked_host_free): Likewise. * libgomp.h (gomp_page_locked_host_alloc): Likewise. (gomp_page_locked_host_free): Likewise. (struct gomp_device_descr): Add page_locked_host_alloc_func and page_locked_host_free_func. * libgomp.texi: Adjust the docs for the pinned trait. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_page_locked_host_alloc): New function. (GOMP_OFFLOAD_page_locked_host_free): Likewise. * target.c (device_for_page_locked): New variable. (get_device_for_page_locked): New function. (gomp_page_locked_host_alloc): Likewise. (gomp_page_locked_host_free): Likewise. (gomp_load_plugin_for_device): Add page_locked_host_alloc and page_locked_host_free. * testsuite/libgomp.c/alloc-pinned-1.c: Change expectations for NVPTX devices. * testsuite/libgomp.c/alloc-pinned-2.c: Likewise. * testsuite/libgomp.c/alloc-pinned-3.c: Likewise. * testsuite/libgomp.c/alloc-pinned-4.c: Likewise. * testsuite/libgomp.c/alloc-pinned-5.c: Likewise. * testsuite/libgomp.c/alloc-pinned-6.c: Likewise. Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com> |
||
|
|
87262627fd |
openmp: Add support for iterators in 'target update' clauses (C/C++)
This adds support for iterators in 'to' and 'from' clauses in the 'target update' OpenMP directive. gcc/c/ * c-parser.cc (c_parser_omp_clause_from_to): Parse 'iterator' modifier. * c-typeck.cc (c_finish_omp_clauses): Finish iterators for to/from clauses. gcc/cp/ * parser.cc (cp_parser_omp_clause_from_to): Parse 'iterator' modifier. * semantics.cc (finish_omp_clauses): Finish iterators for to/from clauses. gcc/ * gimplify.cc (remove_unused_omp_iterator_vars): Display unused variable warning for 'to' and 'from' clauses. (gimplify_scan_omp_clauses): Add argument for iterator loop sequence. Gimplify the clause decl and size into the iterator loop if iterators are used. (gimplify_omp_workshare): Add argument for iterator loops sequence in call to gimplify_scan_omp_clauses. (gimplify_omp_target_update): Call remove_unused_omp_iterator_vars and build_omp_iterators_loops. Add loop sequence as argument when calling gimplify_scan_omp_clauses, gimplify_adjust_omp_clauses and building the Gimple statement. * tree-pretty-print.cc (dump_omp_clause): Call dump_omp_iterators for to/from clauses with iterators. * tree.cc (omp_clause_num_ops): Add extra operand for OMP_CLAUSE_FROM and OMP_CLAUSE_TO. * tree.h (OMP_CLAUSE_HAS_ITERATORS): Add check for OMP_CLAUSE_TO and OMP_CLAUSE_FROM. (OMP_CLAUSE_ITERATORS): Likewise. gcc/testsuite/ * c-c++-common/gomp/target-update-iterators-1.c: New. * c-c++-common/gomp/target-update-iterators-2.c: New. * c-c++-common/gomp/target-update-iterators-3.c: New. libgomp/ * target.c (gomp_update): Call gomp_merge_iterator_maps. Free allocated variables. * testsuite/libgomp.c-c++-common/target-update-iterators-1.c: New. * testsuite/libgomp.c-c++-common/target-update-iterators-2.c: New. * testsuite/libgomp.c-c++-common/target-update-iterators-3.c: New. |
||
|
|
8b8b0eada6 |
openmp: Add support for iterators in map clauses (C/C++)
This adds preliminary support for iterators in map clauses within OpenMP 'target' constructs (which includes constructs such as 'target enter data'). Iterators with non-constant loop bounds are not currently supported. gcc/c/ * c-parser.cc (c_parser_omp_variable_list): Use location of the map expression as the clause location. (c_parser_omp_clause_map): Parse 'iterator' modifier. * c-typeck.cc (c_finish_omp_clauses): Finish iterators. Apply iterators to generated clauses. gcc/cp/ * parser.cc (cp_parser_omp_clause_map): Parse 'iterator' modifier. * semantics.cc (finish_omp_clauses): Finish iterators. Apply iterators to generated clauses. gcc/ * gimple-pretty-print.cc (dump_gimple_omp_target): Print expanded iterator loops. * gimple.cc (gimple_build_omp_target): Add argument for iterator loops sequence. Initialize iterator loops field. * gimple.def (GIMPLE_OMP_TARGET): Set GSS symbol to GSS_OMP_TARGET. * gimple.h (gomp_target): Set GSS symbol to GSS_OMP_TARGET. Add extra field for iterator loops. (gimple_build_omp_target): Add argument for iterator loops sequence. (gimple_omp_target_iterator_loops): New. (gimple_omp_target_iterator_loops_ptr): New. (gimple_omp_target_set_iterator_loops): New. * gimplify.cc (find_var_decl): New. (copy_omp_iterator): New. (remap_omp_iterator_var_1): New. (remap_omp_iterator_var): New. (remove_unused_omp_iterator_vars): New. (struct iterator_loop_info_t): New type. (iterator_loop_info_map_t): New type. (build_omp_iterators_loops): New. (enter_omp_iterator_loop_context_1): New. (enter_omp_iterator_loop_context): New. (enter_omp_iterator_loop_context): New. (exit_omp_iterator_loop_context): New. (gimplify_adjust_omp_clauses): Add argument for iterator loop sequence. Gimplify the clause decl and size into the iterator loop if iterators are used. (gimplify_omp_workshare): Call remove_unused_omp_iterator_vars and build_omp_iterators_loops for OpenMP target expressions. Add loop sequence as argument when calling gimplify_adjust_omp_clauses and building the Gimple statement. * gimplify.h (enter_omp_iterator_loop_context): New prototype. (exit_omp_iterator_loop_context): New prototype. * gsstruct.def (GSS_OMP_TARGET): New. * omp-low.cc (lower_omp_map_iterator_expr): New. (lower_omp_map_iterator_size): New. (finish_omp_map_iterators): New. (lower_omp_target): Add sorry if iterators used with deep mapping. Call lower_omp_map_iterator_expr before assigning to sender ref. Call lower_omp_map_iterator_size before setting the size. Insert iterator loop sequence before the statements for the target clause. * tree-nested.cc (convert_nonlocal_reference_stmt): Walk the iterator loop sequence of OpenMP target statements. (convert_local_reference_stmt): Likewise. (convert_tramp_reference_stmt): Likewise. * tree-pretty-print.cc (dump_omp_iterators): Dump extra iterator information if present. (dump_omp_clause): Call dump_omp_iterators for iterators in map clauses. * tree.cc (omp_clause_num_ops): Add operand for OMP_CLAUSE_MAP. (walk_tree_1): Do not walk last operand of OMP_CLAUSE_MAP. * tree.h (OMP_CLAUSE_HAS_ITERATORS): New. (OMP_CLAUSE_ITERATORS): New. gcc/testsuite/ * c-c++-common/gomp/map-6.c (foo): Amend expected error message. * c-c++-common/gomp/target-map-iterators-1.c: New. * c-c++-common/gomp/target-map-iterators-2.c: New. * c-c++-common/gomp/target-map-iterators-3.c: New. * c-c++-common/gomp/target-map-iterators-4.c: New. libgomp/ * target.c (kind_to_name): New. (gomp_merge_iterator_maps): New. (gomp_map_vars_internal): Call gomp_merge_iterator_maps. Copy address of only the first iteration to target vars. Free allocated variables. * testsuite/libgomp.c-c++-common/target-map-iterators-1.c: New. * testsuite/libgomp.c-c++-common/target-map-iterators-2.c: New. * testsuite/libgomp.c-c++-common/target-map-iterators-3.c: New. Co-authored-by: Andrew Stubbs <ams@baylibre.com> |
||
|
|
8a759dbb69 |
libgomp/target.c: Fix buffer size for 'omp requires' diagnostic
One of the buffers that printed the list of set 'omp requires' requirements missed the 'self' clause addition, being potentially to short when all device-affecting clauses were passed. Solved it by moving the sizeof(<string of all permitted values>" into a new '#define' just above the associated gomp_requires_to_name function. libgomp/ChangeLog: * target.c (GOMP_REQUIRES_NAME_BUF_LEN): Define. (GOMP_offload_register_ver, gomp_target_init): Use it for the char buffer size. |
||
|
|
4e47e2f833 |
libgomp: Add OpenMP's omp_target_memset/omp_target_memset_async
PR libgomp/120444
include/ChangeLog:
* cuda/cuda.h (cuMemsetD8, cuMemsetD8Async): Declare.
libgomp/ChangeLog:
* libgomp-plugin.h (GOMP_OFFLOAD_memset): Declare.
* libgomp.h (struct gomp_device_descr): Add memset_func.
* libgomp.map (GOMP_6.0.1): Add omp_target_memset{,_async}.
* libgomp.texi (Device Memory Routines): Document them.
* omp.h.in (omp_target_memset, omp_target_memset_async): Declare.
* omp_lib.f90.in (omp_target_memset, omp_target_memset_async):
Add interfaces.
* omp_lib.h.in (omp_target_memset, omp_target_memset_async): Likewise.
* plugin/cuda-lib.def: Add cuMemsetD8.
* plugin/plugin-gcn.c (struct hsa_runtime_fn_info): Add
hsa_amd_memory_fill_fn.
(init_hsa_runtime_functions): DLSYM_OPT_FN load it.
(GOMP_OFFLOAD_memset): New.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_memset): New.
* target.c (omp_target_memset_int, omp_target_memset,
omp_target_memset_async_helper, omp_target_memset_async): New.
(gomp_load_plugin_for_device): Add DLSYM (memset).
* testsuite/libgomp.c-c++-common/omp_target_memset.c: New test.
* testsuite/libgomp.c-c++-common/omp_target_memset-2.c: New test.
* testsuite/libgomp.c-c++-common/omp_target_memset-3.c: New test.
* testsuite/libgomp.fortran/omp_target_memset.f90: New test.
* testsuite/libgomp.fortran/omp_target_memset-2.f90: New test.
|
||
|
|
f4aa6b5a8d |
libgomp: Add OpenACC's acc_memcpy_device{,_async} routines [PR93226]
libgomp/ChangeLog:
PR libgomp/93226
* libgomp-plugin.h (GOMP_OFFLOAD_openacc_async_dev2dev): New
prototype.
* libgomp.h (struct acc_dispatch_t): Add dev2dev_func.
(gomp_copy_dev2dev): New prototype.
* libgomp.map (OACC_2.6.1): New; add acc_memcpy_device{,_async}.
* libgomp.texi (acc_memcpy_device): New.
* oacc-mem.c (memcpy_tofrom_device): Change to take from/to
device boolean; use memcpy not memmove; add early return if
size == 0 or same device + same ptr.
(acc_memcpy_to_device, acc_memcpy_to_device_async,
acc_memcpy_from_device, acc_memcpy_from_device_async): Update.
(acc_memcpy_device, acc_memcpy_device_async): New.
* openacc.f90 (acc_memcpy_device, acc_memcpy_device_async):
Add interface.
* openacc_lib.h (acc_memcpy_device, acc_memcpy_device_async):
Likewise.
* openacc.h (acc_memcpy_device, acc_memcpy_device_async): Add
prototype.
* plugin/plugin-gcn.c (GOMP_OFFLOAD_openacc_async_host2dev):
Update comment.
(GOMP_OFFLOAD_openacc_async_dev2host): Update call.
(GOMP_OFFLOAD_openacc_async_dev2dev): New.
* plugin/plugin-nvptx.c (cuda_memcpy_dev_sanity_check): New.
(GOMP_OFFLOAD_dev2dev): Call it.
(GOMP_OFFLOAD_openacc_async_dev2dev): New.
* target.c (gomp_copy_dev2dev): New.
(gomp_load_plugin_for_device): Load dev2dev and async_dev2dev.
* testsuite/libgomp.oacc-c-c++-common/acc_memcpy_device-1.c: New test.
* testsuite/libgomp.oacc-fortran/acc_memcpy_device-1.f90: New test.
|
||
|
|
814e29e390 |
OpenMP: Fix mapping of zero-sized arrays with non-literal size: map(var[:n]), n = 0
For map(ptr[:0]), the used map kind is GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION and it is permitted that 'ptr' does not exist. 'ptr' is set to the device pointee if it exists or to the host value otherwise. For map(ptr[:3]), the variable is first mapped and then ptr is updated to point to the just-mapped device data; the attachment uses GOMP_MAP_ATTACH. For map(ptr[:n]), generates always a GOMP_MAP_ATTACH, but when n == 0, it was failing with: "pointer target not mapped for attach" The solution is not to fail but first to check whether it was mapped before. It turned out that for the mapping part, GCC adds a run-time check whether n == 0 - and uses GOMP_MAP_ZERO_LEN_ARRAY_SECTION for the mapping. Thus, we just have to check whether there such a mapping for the address for which the GOMP_MAP_ATTACH. was requested. And, if there was, the error diagnostic can be skipped. Unsurprisingly, this issue occurs in real-world code; it was detected in a code that distributes work via MPI and for some processes, some bounds ended up to be zero. libgomp/ChangeLog: * target.c (gomp_attach_pointer): Return bool; accept additional bool to optionally silence the fatal pointee-not-found error. (gomp_map_vars_internal): If the pointee could not be found, check whether it was mapped as GOMP_MAP_ZERO_LEN_ARRAY_SECTION. * libgomp.h (gomp_attach_pointer): Update prototype. * oacc-mem.c (acc_attach_async, goacc_enter_data_internal): Update calls. * testsuite/libgomp.c/target-map-zero-sized.c: New test. * testsuite/libgomp.c/target-map-zero-sized-2.c: New test. * testsuite/libgomp.c/target-map-zero-sized-3.c: New test. |
||
|
|
4d5d1a7326 |
libgomp: Save OpenMP device number when initializing the interop object
The interop object (opaque object to the user, used internally in libgomp) already had a 'device_num' member, but it was missed to actually set it. libgomp/ChangeLog: * target.c (gomp_interop_internal): Set the 'device_num' member when initializing an interop object. |
||
|
|
99e2906ae2 |
OpenMP: 'interop' construct - add ME support + target-independent libgomp
This patch partially enables use of the OpenMP interop construct by adding middle end support, mostly in the omplower pass, and in the target-independent part of the libgomp runtime. It follows up on previous patches for C, C++ and Fortran front ends support. The full interop feature requires another patch to enable foreign runtime support in libgomp plugins. gcc/ChangeLog: * builtin-types.def (BT_FN_VOID_INT_INT_PTR_PTR_PTR_INT_PTR_INT_PTR_UINT_PTR): New. * gimple-low.cc (lower_stmt): Handle GIMPLE_OMP_INTEROP. * gimple-pretty-print.cc (dump_gimple_omp_interop): New function. (pp_gimple_stmt_1): Handle GIMPLE_OMP_INTEROP. * gimple.cc (gimple_build_omp_interop): New function. (gimple_copy): Handle GIMPLE_OMP_INTEROP. * gimple.def (GIMPLE_OMP_INTEROP): Define. * gimple.h (gimple_build_omp_interop): Declare. (gimple_omp_interop_clauses): New function. (gimple_omp_interop_clauses_ptr): Likewise. (gimple_omp_interop_set_clauses): Likewise. (gimple_return_set_retval): Handle GIMPLE_OMP_INTEROP. * gimplify.cc (gimplify_scan_omp_clauses): Handle OMP_CLAUSE_INIT, OMP_CLAUSE_USE and OMP_CLAUSE_DESTROY. (gimplify_omp_interop): New function. (gimplify_expr): Replace sorry with call to gimplify_omp_interop. * omp-builtins.def (BUILT_IN_GOMP_INTEROP): Define. * omp-low.cc (scan_sharing_clauses): Handle OMP_CLAUSE_INIT, OMP_CLAUSE_USE and OMP_CLAUSE_DESTROY. (scan_omp_1_stmt): Handle GIMPLE_OMP_INTEROP. (lower_omp_interop_action_clauses): New function. (lower_omp_interop): Likewise. (lower_omp_1): Handle GIMPLE_OMP_INTEROP. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_clause_destroy): Make addressable. (c_parser_omp_clause_init): Make addressable. gcc/cp/ChangeLog: * parser.cc (cp_parser_omp_clause_init): Make addressable. gcc/fortran/ChangeLog: * trans-openmp.cc (gfc_trans_omp_clauses): Make OMP_CLAUSE_DESTROY and OMP_CLAUSE_INIT addressable. * types.def (BT_FN_VOID_INT_INT_PTR_PTR_PTR_INT_PTR_INT_PTR_UINT_PTR): New. include/ChangeLog: * gomp-constants.h (GOMP_DEVICE_DEFAULT_OMP_61, GOMP_INTEROP_TARGET, GOMP_INTEROP_TARGETSYNC, GOMP_INTEROP_FLAG_NOWAIT): Define. libgomp/ChangeLog: * icv-device.c (omp_set_default_device): Check GOMP_DEVICE_DEFAULT_OMP_61. * libgomp-plugin.h (struct interop_obj_t): New. (enum gomp_interop_flag): New. (GOMP_OFFLOAD_interop): Declare. (GOMP_OFFLOAD_get_interop_int): Declare. (GOMP_OFFLOAD_get_interop_ptr): Declare. (GOMP_OFFLOAD_get_interop_str): Declare. (GOMP_OFFLOAD_get_interop_type_desc): Declare. * libgomp.h (_LIBGOMP_OMP_LOCK_DEFINED): Define. (struct gomp_device_descr): Add interop_func, get_interop_int_func, get_interop_ptr_func, get_interop_str_func, get_interop_type_desc_func. * libgomp.map: Add GOMP_interop. * libgomp_g.h (GOMP_interop): Declare. * target.c (resolve_device): Handle GOMP_DEVICE_DEFAULT_OMP_61. (omp_get_interop_int): Replace stub with actual implementation. (omp_get_interop_ptr): Likewise. (omp_get_interop_str): Likewise. (omp_get_interop_type_desc): Likewise. (struct interop_data_t): Define. (gomp_interop_internal): New function. (GOMP_interop): Likewise. (gomp_load_plugin_for_device): Load symbols for get_interop_int, get_interop_ptr, get_interop_str and get_interop_type_desc. * testsuite/libgomp.c-c++-common/interop-1.c: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/interop-1.c: Remove dg-prune-output "sorry". * c-c++-common/gomp/interop-2.c: Likewise. * c-c++-common/gomp/interop-3.c: Likewise. * c-c++-common/gomp/interop-4.c: Remove dg-message "not supported". * g++.dg/gomp/interop-5.C: Likewise. * gfortran.dg/gomp/interop-4.f90: Likewise. * c-c++-common/gomp/interop-5.c: New test. * gfortran.dg/gomp/interop-5.f90: New test. Co-authored-by: Tobias Burnus <tburnus@baylibre.com> |
||
|
|
e759ff08e2 |
libgomp: Add '__attribute__((unused))' to variables used only in 'assert(...)'
Without this attribute, the building process will fail if GCC is configured with 'CFLAGS=-DNDEBUG'. libgomp/ChangeLog: * oacc-mem.c (acc_unmap_data, goacc_exit_datum_1, find_group_last, goacc_enter_data_internal): Add '__attribute__((unused))'. * target.c (gomp_unmap_vars_internal): Likewise. |
||
|
|
6441eb6dc0 | Update copyright years. | ||
|
|
4cb20dc043 |
libgomp: with USM, init 'link' variables with host address
If requires unified_shared_memory or self_maps is set, make 'declare target link' variables to point initially to the host pointer. libgomp/ChangeLog: * target.c (gomp_load_image_to_device): For requires unified_shared_memory, update 'link' vars to point to the host var. * testsuite/libgomp.c-c++-common/target-link-3.c: New test. * testsuite/libgomp.c-c++-common/target-link-4.c: New test. |
||
|
|
b752eed3e3 |
OpenMP: Add support for 'self_maps' to the 'require' directive
'self_maps' implies 'unified_shared_memory', except that the latter also permits that explicit maps copy data to device memory while self_maps does not. In GCC, currently, both are handled identical. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_requires): Handle self_maps clause. gcc/cp/ChangeLog: * parser.cc (cp_parser_omp_requires): Handle self_maps clause. gcc/fortran/ChangeLog: * gfortran.h (enum gfc_omp_requires_kind): Add OMP_REQ_SELF_MAPS. (gfc_namespace): Enlarge omp_requires bitfield. * module.cc (enum ab_attribute, attr_bits): Add AB_OMP_REQ_SELF_MAPS. (mio_symbol_attribute): Handle it. * openmp.cc (gfc_check_omp_requires, gfc_match_omp_requires): Handle self_maps clause. * parse.cc (gfc_parse_file): Handle self_maps clause. gcc/ChangeLog: * lto-cgraph.cc (output_offload_tables, omp_requires_to_name): Handle self_maps clause. * omp-general.cc (struct omp_ts_info, omp_context_selector_matches): Likewise for the associated trait. * omp-general.h (enum omp_requires): Add OMP_REQUIRES_SELF_MAPS. * omp-selectors.h (enum omp_ts_code): Add OMP_TRAIT_IMPLEMENTATION_SELF_MAPS. include/ChangeLog: * gomp-constants.h (GOMP_REQUIRES_SELF_MAPS): #define. libgomp/ChangeLog: * plugin/plugin-gcn.c (GOMP_OFFLOAD_get_num_devices): Accept self_maps clause. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices): Likewise. * libgomp.texi (TR13 Impl. Status): Set to 'Y'. * target.c (gomp_requires_to_name, GOMP_offload_register_ver, gomp_target_init): Handle self_maps clause. * testsuite/libgomp.fortran/self_maps.f90: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/declare-variant-1.c: Add self_maps test. * c-c++-common/gomp/requires-4.c: Likewise. * gfortran.dg/gomp/declare-variant-3.f90: Likewise. * c-c++-common/gomp/requires-2.c: Update dg-error msg. * gfortran.dg/gomp/requires-2.f90: Likewise. * gfortran.dg/gomp/requires-self-maps-aux.f90: New. * gfortran.dg/gomp/requires-self-maps.f90: New. |
||
|
|
cdb9aa0f62 |
OpenMP: Fix omp_get_device_from_uid, minor cleanup
In Fortran, omp_get_device_from_uid can also accept substrings, which are then not NUL terminated. Fixed by introducing a fortran.c wrapper function. Additionally, in case of a fail the plugin functions now return NULL instead of failing fatally such that a fall-back UID is generated. gcc/ChangeLog: * omp-general.cc (omp_runtime_api_procname): Strip "omp_" from string; move get_device_from_uid as now a '_' suffix exists. libgomp/ChangeLog: * fortran.c (omp_get_device_from_uid_): New function. * libgomp.map (GOMP_6.0): Add it. * oacc-host.c (host_dispatch): Init '.uid' and '.get_uid_func'. * omp_lib.f90.in: Make it used by removing bind(C). * omp_lib.h.in: Likewise. * target.c (omp_get_device_from_uid): Ensure the device is initialized. * plugin/plugin-gcn.c (GOMP_OFFLOAD_get_uid): Add function comment; return NULL in case of an error. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_uid): Likewise. * testsuite/libgomp.fortran/device_uid.f90: Update to test substrings. |
||
|
|
bf4a5efa80 |
OpenMP: Add get_device_from_uid/omp_get_uid_from_device routines
Those TR13/OpenMP 6.0 routines permit a reproducible offloading to a specific device by mapping an OpenMP device number to a unique ID (UID). The GPU device UIDs should be universally unique, the one for the host is not. gcc/ChangeLog: * omp-general.cc (omp_runtime_api_procname): Add get_device_from_uid and omp_get_uid_from_device routines. include/ChangeLog: * cuda/cuda.h (cuDeviceGetUuid): Declare. (cuDeviceGetUuid_v2): Add prototype. libgomp/ChangeLog: * config/gcn/target.c (omp_get_uid_from_device, omp_get_device_from_uid): Add stub implementation. * config/nvptx/target.c (omp_get_uid_from_device, omp_get_device_from_uid): Likewise. * fortran.c (omp_get_uid_from_device_, omp_get_uid_from_device_8_): New functions. * libgomp-plugin.h (GOMP_OFFLOAD_get_uid): Add prototype. * libgomp.h (struct gomp_device_descr): Add 'uid' and 'get_uid_func'. * libgomp.map (GOMP_6.0): New, includind the new UID routines. * libgomp.texi (OpenMP Technical Report 13): Mark UID routines as 'Y'. (Device Information Routines): Document new UID routines. (Offload-Target Specifics): Document UID format. * omp.h.in (omp_get_device_from_uid, omp_get_uid_from_device): New prototype. * omp_lib.f90.in (omp_get_device_from_uid, omp_get_uid_from_device): New interface. * omp_lib.h.in: Likewise. * plugin/cuda-lib.def: Add cuDeviceGetUuid and cuDeviceGetUuid_v2 via CUDA_ONE_CALL_MAYBE_NULL. * plugin/plugin-gcn.c (GOMP_OFFLOAD_get_uid): New. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_uid): New. * target.c (str_omp_initial_device): New static var. (STR_OMP_DEV_PREFIX): Define. (gomp_get_uid_for_device, omp_get_uid_from_device, omp_get_device_from_uid): New. (gomp_load_plugin_for_device): DLSYM_OPT the function 'get_uid'. (gomp_target_init): Set the device's 'uid' field to NULL. * testsuite/libgomp.c/device_uid.c: New test. * testsuite/libgomp.fortran/device_uid.f90: New test. |
||
|
|
0beac1db38 |
libgomp: Add interop types and routines to OpenMP's headers and module
This commit adds OpenMP 5.1+'s interop enumeration, type and routine declarations to the C/C++ header file and, new in OpenMP TR13, also to the Fortran module and omp_lib.h header file. While a stub implementation is provided, only with foreign runtime support by the libgomp GPU plugins and with the 'interop' directive, this becomes really useful. libgomp/ChangeLog: * fortran.c (omp_get_interop_str_, omp_get_interop_name_, omp_get_interop_type_desc_, omp_get_interop_rc_desc_): Add. * libgomp.map (GOMP_5.1.3): New; add interop routines. * omp.h.in: Add interop typedefs, enum and prototypes. (__GOMP_DEFAULT_NULL): Define. (omp_target_memcpy_async, omp_target_memcpy_rect_async): Use it for the optional depend argument. * omp_lib.f90.in: Add paramters and interfaces for interop. * omp_lib.h.in: Likewise; move F90 '&' to column 81 for -ffree-length-80. * target.c (omp_get_num_interop_properties, omp_get_interop_int, omp_get_interop_ptr, omp_get_interop_str, omp_get_interop_name, omp_get_interop_type_desc, omp_get_interop_rc_desc): Add. * config/gcn/target.c (omp_get_num_interop_properties, omp_get_interop_int, omp_get_interop_ptr, omp_get_interop_str, omp_get_interop_name, omp_get_interop_type_desc, omp_get_interop_rc_desc): Add. * config/nvptx/target.c (omp_get_num_interop_properties, omp_get_interop_int, omp_get_interop_ptr, omp_get_interop_str, omp_get_interop_name, omp_get_interop_type_desc, omp_get_interop_rc_desc): Add. * testsuite/libgomp.c-c++-common/interop-routines-1.c: New test. * testsuite/libgomp.c-c++-common/interop-routines-2.c: New test. * testsuite/libgomp.fortran/interop-routines-1.F90: New test. * testsuite/libgomp.fortran/interop-routines-2.F90: New test. * testsuite/libgomp.fortran/interop-routines-3.F: New test. * testsuite/libgomp.fortran/interop-routines-4.F: New test. * testsuite/libgomp.fortran/interop-routines-5.F: New test. * testsuite/libgomp.fortran/interop-routines-6.F: New test. * testsuite/libgomp.fortran/interop-routines-7.F90: New test. |
||
|
|
0c56fd6a1f |
libgomp: Device load_image - improve minor num-funcs/vars check
The run time library loads the offload functions and variable and optionally the ICV variable and returns the number of loaded items, which has to match the host side. The plugin returns "+1" (since GCC 12) for the ICV variable entry, independently whether it was loaded or not, but the var's value (start == end == 0) can be used to detect when this failed. Thus, we can tighten the assert check - which this commit does together with making the output less surprising - and simplify the condition further below. libgomp/ChangeLog: * target.c (gomp_load_image_to_device): Extend fatal-error message; simplify a condition. |
||
|
|
14c47e7eb0 |
libgomp: Fix declare target link with offset array-section mapping [PR116107]
Assume that 'int var[100]' is 'omp declare target link(var)'. When now mapping an array section with offset such as 'map(to:var[20:10])', the device-side link pointer has to store &<device-storage-data>[0] minus the offset such that var[20] will access <device-storage-data>[0]. But the offset calculation was missed such that the device-side 'var' pointed to the first element of the mapped data - and var[20] points beyond at some invalid memory. PR middle-end/116107 libgomp/ChangeLog: * target.c (gomp_map_vars_internal): Honor array mapping offsets with declare-target 'link' variables. * testsuite/libgomp.c-c++-common/target-link-2.c: New test. |
||
|
|
a95c1911d8 |
libgomp: Document 'GOMP_teams4'
For reference: - <https://inbox.sourceware.org/20211111190313.GV2710@tucnak> "[PATCH] openmp: Honor OpenMP 5.1 num_teams lower bound" - <https://inbox.sourceware.org/20211112132023.GC2710@tucnak> "[PATCH] libgomp, nvptx: Honor OpenMP 5.1 num_teams lower bound" libgomp/ * config/gcn/target.c (GOMP_teams4): Document. * config/nvptx/target.c (GOMP_teams4): Likewise. * target.c (GOMP_teams4): Likewise. |
||
|
|
4ccb3366ad |
libgomp: Enable USM for some nvptx devices
A few high-end nvptx devices support the attribute CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS; for those, unified shared memory is supported in hardware. This patch enables support for those - if all installed nvptx devices have this feature (as the capabilities are per device type). This exposes a bug in gomp_copy_back_icvs as it did before use omp_get_mapped_ptr to find mapped variables, but that returns the unchanged pointer in cased of shared memory. But in this case, we have a few actually mapped pointers - like the ICV variables. Additionally, there was a mismatch with regards to '-1' for the device number as gomp_copy_back_icvs and omp_get_mapped_ptr count differently. Hence, do the lookup manually. include/ChangeLog: * cuda/cuda.h (CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS): Add. libgomp/ChangeLog: * libgomp.texi (nvptx): Update USM description. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices): Claim support when requesting USM and all devices support CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS. * target.c (gomp_copy_back_icvs): Fix device ptr lookup. (gomp_target_init): Set GOMP_OFFLOAD_CAP_SHARED_MEM is the devices supports USM. |
||
|
|
a7578a077e |
OpenACC 2.7: Adjust acc_map_data/acc_unmap_data interaction with reference counters
This patch adjusts the implementation of acc_map_data/acc_unmap_data API library routines to more fit the description in the OpenACC 2.7 specification. Instead of using REFCOUNT_INFINITY, we now define a REFCOUNT_ACC_MAP_DATA special value to mark acc_map_data-created mappings. Adjustment around mapping related code to respect OpenACC semantics are also added. libgomp/ChangeLog: * libgomp.h (REFCOUNT_ACC_MAP_DATA): Define as (REFCOUNT_SPECIAL | 2). * oacc-mem.c (acc_map_data): Adjust to use REFCOUNT_ACC_MAP_DATA, initialize dynamic_refcount as 1. (acc_unmap_data): Adjust to use REFCOUNT_ACC_MAP_DATA, (goacc_map_var_existing): Add REFCOUNT_ACC_MAP_DATA case. (goacc_exit_datum_1): Add REFCOUNT_ACC_MAP_DATA case, respect REFCOUNT_ACC_MAP_DATA when decrementing/finalizing. Force lowest dynamic_refcount to be 1 for REFCOUNT_ACC_MAP_DATA. (goacc_enter_data_internal): Add REFCOUNT_ACC_MAP_DATA case. * target.c (gomp_increment_refcount): Return early for REFCOUNT_ACC_MAP_DATA case. (gomp_decrement_refcount): Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-96.c: New testcase. * testsuite/libgomp.oacc-c-c++-common/unmap-infinity-1.c: Adjust testcase error output scan test. |
||
|
|
dea9ac2a00 |
libgomp: Use void (*) (void *) rather than void (*)() for host_fn type [PR114216]
For the type of the target callbacks we use elsehwere void (*) (void *) and IMHO should use that for the reverse offload fallback as well (where the actual callback is emitted using the same code as for host fallback or device kernel entry routines), even when it is also ok to use void (*) () before C23 and we aren't building libgomp with C23 yet. On some arches perhaps void (*) () could result in worse code generation because calls in that case like casts to unprototyped functions need to sometimes pass argument in two different spots etc. so that it deals with both passing it through ... and as a named argument. 2024-03-04 Jakub Jelinek <jakub@redhat.com> PR libgomp/114216 * target.c (gomp_target_rev): Change host_fn type and corresponding cast from void (*)() to void (*) (void *). |
||
|
|
a945c346f5 | Update copyright years. | ||
|
|
f5745dc142 |
OpenMP/OpenACC: Unordered/non-constant component offset runtime diagnostic
This patch adds support for non-constant component offsets in "map" clauses for OpenMP (and the equivalants for OpenACC), which are not able to be sorted into order at compile time. Normally struct accesses in such clauses are gathered together and sorted into increasing address order after a "GOMP_MAP_STRUCT" node: if we have variable indices, that is no longer possible. This version of the patch scales back the previously-posted version to merely add a diagnostic for incorrect usage of component accesses with variably-indexed arrays of structs: the only permitted variant is where we have multiple indices that are the same, but we could not prove so at compile time. Rather than silently producing the wrong result for cases where the indices are in fact different, we error out (e.g., "map(dtarr(i)%arrptr, dtarr(j)%arrptr(4:8))", for different i/j). For now, multiple *constant* array indices are still supported (see map-arrayofstruct-1.c). That could perhaps be addressed with a follow-up patch, if necessary. This version of the patch renumbers the GOMP_MAP_STRUCT_UNORD kind to avoid clashing with the OpenACC "non-contiguous" dynamic array support (though that is not yet applied to mainline). 2023-08-18 Julian Brown <julian@codesourcery.com> gcc/ * gimplify.cc (extract_base_bit_offset): Add VARIABLE_OFFSET parameter. (omp_get_attachment, omp_group_last, omp_group_base, omp_directive_maps_explicitly): Add GOMP_MAP_STRUCT_UNORD support. (omp_accumulate_sibling_list): Update calls to extract_base_bit_offset. Support GOMP_MAP_STRUCT_UNORD. (omp_build_struct_sibling_lists, gimplify_scan_omp_clauses, gimplify_adjust_omp_clauses, gimplify_omp_target_update): Add GOMP_MAP_STRUCT_UNORD support. * omp-low.cc (lower_omp_target): Add GOMP_MAP_STRUCT_UNORD support. * tree-pretty-print.cc (dump_omp_clause): Likewise. include/ * gomp-constants.h (gomp_map_kind): Add GOMP_MAP_STRUCT_UNORD. libgomp/ * oacc-mem.c (find_group_last, goacc_enter_data_internal, goacc_exit_data_internal, GOACC_enter_exit_data): Add GOMP_MAP_STRUCT_UNORD support. * target.c (gomp_map_vars_internal): Add GOMP_MAP_STRUCT_UNORD support. Detect incorrect use of variable indexing of arrays of structs. (GOMP_target_enter_exit_data, gomp_target_task_fn): Add GOMP_MAP_STRUCT_UNORD support. * testsuite/libgomp.c-c++-common/map-arrayofstruct-1.c: New test. * testsuite/libgomp.c-c++-common/map-arrayofstruct-2.c: New test. * testsuite/libgomp.c-c++-common/map-arrayofstruct-3.c: New test. * testsuite/libgomp.fortran/map-subarray-5.f90: New test. |
||
|
|
5fdb150cd4 |
OpenMP/OpenACC: Rework clause expansion and nested struct handling
This patch reworks clause expansion in the C, C++ and (to a lesser
extent) Fortran front ends for OpenMP and OpenACC mapping nodes used in
GPU offloading support.
At present a single clause may be turned into several mapping nodes,
or have its mapping type changed, in several places scattered through
the front- and middle-end. The analysis relating to which particular
transformations are needed for some given expression has become quite hard
to follow. Briefly, we manipulate clause types in the following places:
1. During parsing, in c_omp_adjust_map_clauses. Depending on a set of
rules, we may change a FIRSTPRIVATE_POINTER (etc.) mapping into
ATTACH_DETACH, or mark the decl addressable.
2. In semantics.cc or c-typeck.cc, clauses are expanded in
handle_omp_array_sections (called via {c_}finish_omp_clauses, or in
finish_omp_clauses itself. The two cases are for processing array
sections (the former), or non-array sections (the latter).
3. In gimplify.cc, we build sibling lists for struct accesses, which
groups and sorts accesses along with their struct base, creating
new ALLOC/RELEASE nodes for pointers.
4. In gimplify.cc:gimplify_adjust_omp_clauses, mapping nodes may be
adjusted or created.
This patch doesn't completely disrupt this scheme, though clause
types are no longer adjusted in c_omp_adjust_map_clauses (step 1).
Clause expansion in step 2 (for C and C++) now uses a single, unified
mechanism, parts of which are also reused for analysis in step 3.
Rather than the kind-of "ad-hoc" pattern matching on addresses used to
expand clauses used at present, a new method for analysing addresses is
introduced. This does a recursive-descent tree walk on expression nodes,
and emits a vector of tokens describing each "part" of the address.
This tokenized address can then be translated directly into mapping nodes,
with the assurance that no part of the expression has been inadvertently
skipped or misinterpreted. In this way, all the variations of ways
pointers, arrays, references and component accesses might be combined
can be teased apart into easily-understood cases - and we know we've
"parsed" the whole address before we start analysis, so the right code
paths can easily be selected.
For example, a simple access "arr[idx]" might parse as:
base-decl access-indexed-array
or "mystruct->foo[x]" with a pointer "foo" component might parse as:
base-decl access-pointer component-selector access-pointer
A key observation is that support for "array" bases, e.g. accesses
whose root nodes are not structures, but describe scalars or arrays,
and also *one-level deep* structure accesses, have first-class support
in gimplify and beyond. Expressions that use deeper struct accesses
or e.g. multiple indirections were more problematic: some cases worked,
but lots of cases didn't. This patch reimplements the support for those
in gimplify.cc, again using the new "address tokenization" support.
An expression like "mystruct->foo->bar[0:10]" used in a mapping node will
translate the right-hand access directly in the front-end. The base for
the access will be "mystruct->foo". This is handled recursively in
gimplify.cc -- there may be several accesses of "mystruct"'s members
on the same directive, so the sibling-list building machinery can be
used again. (This was already being done for OpenACC, but the new
implementation differs somewhat in details, and is more robust.)
For OpenMP, in the case where the base pointer itself,
i.e. "mystruct->foo" here, is NOT mapped on the same directive, we
create a "fragile" mapping. This turns the "foo" component access
into a zero-length allocation (which is a new feature for the runtime,
so support has been added there too).
A couple of changes have been made to how mapping clauses are turned
into mapping nodes:
The first change is based on the observation that it is probably never
correct to use GOMP_MAP_ALWAYS_POINTER for component accesses (e.g. for
references), because if the containing struct is already mapped on the
target then the host version of the pointer in question will be corrupted
if the struct is copied back from the target. This patch removes all
such uses, across each of C, C++ and Fortran.
The second change is to the way that GOMP_MAP_ATTACH_DETACH nodes
are processed during sibling-list creation. For OpenMP, for pointer
components, we must map the base pointer separately from an array section
that uses the base pointer, so e.g. we must have both "map(mystruct.base)"
and "map(mystruct.base[0:10])" mappings. These create nodes such as:
GOMP_MAP_TOFROM mystruct.base
G_M_TOFROM *mystruct.base [len: 10*elemsize] G_M_ATTACH_DETACH mystruct.base
Instead of using the first of these directly when building the struct
sibling list then skipping the group using GOMP_MAP_ATTACH_DETACH,
leading to:
GOMP_MAP_STRUCT mystruct [len: 1] GOMP_MAP_TOFROM mystruct.base
we now introduce a new "mini-pass", omp_resolve_clause_dependencies, that
drops the GOMP_MAP_TOFROM for the base pointer, marks the second group
as having had a base-pointer mapping, then omp_build_struct_sibling_lists
can create:
GOMP_MAP_STRUCT mystruct [len: 1] GOMP_MAP_ALLOC mystruct.base [len: ptrsize]
This ends up working better in many cases, particularly those involving
references. (The "alloc" space is immediately overwritten by a pointer
attachment, so this is mildly more efficient than a redundant TO mapping
at runtime also.)
There is support in the address tokenizer for "arbitrary" base expressions
which aren't rooted at a decl, but that is not used as present because
such addresses are disallowed at parse time.
In the front-ends, the address tokenization machinery is mostly only
used for clause expansion and not for diagnostics at present. It could
be used for those too, which would allow more of my previous "address
inspector" implementation to be removed.
The new bits in gimplify.cc work with OpenACC also.
This version of the patch addresses several first-pass review comments
from Tobias, and fixes a few previously-missed cases for manually-managed
ragged array mappings (including cases using references). Some arbitrary
differences between handling of clause expansion for C vs. C++ have also
been fixed, and some fragments from later in the patch series have been
moved forward (where they were useful for fixing bugs). Several new
test cases have been added.
2023-11-29 Julian Brown <julian@codesourcery.com>
gcc/c-family/
* c-common.h (c_omp_region_type): Add C_ORT_EXIT_DATA,
C_ORT_OMP_EXIT_DATA and C_ORT_ACC_TARGET.
(omp_addr_token): Add forward declaration.
(c_omp_address_inspector): New class.
* c-omp.cc (c_omp_adjust_map_clauses): Mark decls addressable here, but
do not change any mapping node types.
(c_omp_address_inspector::unconverted_ref_origin,
c_omp_address_inspector::component_access_p,
c_omp_address_inspector::check_clause,
c_omp_address_inspector::get_root_term,
c_omp_address_inspector::map_supported_p,
c_omp_address_inspector::get_origin,
c_omp_address_inspector::maybe_unconvert_ref,
c_omp_address_inspector::maybe_zero_length_array_section,
c_omp_address_inspector::expand_array_base,
c_omp_address_inspector::expand_component_selector,
c_omp_address_inspector::expand_map_clause): New methods.
(omp_expand_access_chain): New function.
gcc/c/
* c-parser.cc (c_parser_oacc_all_clauses): Add TARGET_P parameter. Use
to select region type for c_finish_omp_clauses call.
(c_parser_oacc_loop): Update calls to c_parser_oacc_all_clauses.
(c_parser_oacc_compute): Likewise.
(c_parser_omp_target_data, c_parser_omp_target_enter_data): Support
ATTACH kind.
(c_parser_omp_target_exit_data): Support DETACH kind.
(check_clauses): Handle GOMP_MAP_POINTER and GOMP_MAP_ATTACH here.
* c-typeck.cc (handle_omp_array_sections_1,
handle_omp_array_sections, c_finish_omp_clauses): Use
c_omp_address_inspector class and OMP address tokenizer to analyze and
expand map clause expressions. Fix some diagnostics. Fix "is OpenACC"
condition for C_ORT_ACC_TARGET addition.
gcc/cp/
* parser.cc (cp_parser_oacc_all_clauses): Add TARGET_P parameter. Use
to select region type for finish_omp_clauses call.
(cp_parser_omp_target_data, cp_parser_omp_target_enter_data): Support
GOMP_MAP_ATTACH kind.
(cp_parser_omp_target_exit_data): Support GOMP_MAP_DETACH kind.
(cp_parser_oacc_declare): Update call to cp_parser_oacc_all_clauses.
(cp_parser_oacc_loop): Update calls to cp_parser_oacc_all_clauses.
(cp_parser_oacc_compute): Likewise.
* pt.cc (tsubst_expr): Use C_ORT_ACC_TARGET for call to
tsubst_omp_clauses for OpenACC compute regions.
* semantics.cc (cp_omp_address_inspector): New class, derived from
c_omp_address_inspector.
(handle_omp_array_sections_1, handle_omp_array_sections,
finish_omp_clauses): Use cp_omp_address_inspector class and OMP address
tokenizer to analyze and expand OpenMP map clause expressions. Fix
some diagnostics. Support C_ORT_ACC_TARGET.
(finish_omp_target): Handle GOMP_MAP_POINTER.
gcc/fortran/
* trans-openmp.cc (gfc_trans_omp_array_section): Add OPENMP parameter.
Use GOMP_MAP_ATTACH_DETACH instead of GOMP_MAP_ALWAYS_POINTER for
derived type components.
(gfc_trans_omp_clauses): Update calls to gfc_trans_omp_array_section.
gcc/
* gimplify.cc (build_struct_comp_nodes): Don't process
GOMP_MAP_ATTACH_DETACH "middle" nodes here.
(omp_mapping_group): Add REPROCESS_STRUCT and FRAGILE booleans for
nested struct handling.
(omp_strip_components_and_deref, omp_strip_indirections): Remove
functions.
(omp_get_attachment): Handle GOMP_MAP_DETACH here.
(omp_group_last): Handle GOMP_MAP_*, GOMP_MAP_DETACH,
GOMP_MAP_ATTACH_DETACH groups for "exit data" of reference-to-pointer
component array sections.
(omp_gather_mapping_groups_1): Initialise reprocess_struct and fragile
fields.
(omp_group_base): Handle GOMP_MAP_ATTACH_DETACH after GOMP_MAP_STRUCT.
(omp_index_mapping_groups_1): Skip reprocess_struct groups.
(omp_get_nonfirstprivate_group, omp_directive_maps_explicitly,
omp_resolve_clause_dependencies, omp_first_chained_access_token): New
functions.
(omp_check_mapping_compatibility): Adjust accepted node combinations
for "from" clauses using release instead of alloc.
(omp_accumulate_sibling_list): Add GROUP_MAP, ADDR_TOKENS, FRAGILE_P,
REPROCESSING_STRUCT, ADDED_TAIL parameters. Use OMP address tokenizer
to analyze addresses. Reimplement nested struct handling, and
implement "fragile groups".
(omp_build_struct_sibling_lists): Adjust for changes to
omp_accumulate_sibling_list. Recalculate bias for ATTACH_DETACH nodes
after GOMP_MAP_STRUCT nodes.
(gimplify_scan_omp_clauses): Call omp_resolve_clause_dependencies. Use
OMP address tokenizer.
(gimplify_adjust_omp_clauses_1): Use build_fold_indirect_ref_loc
instead of build_simple_mem_ref_loc.
* omp-general.cc (omp-general.h, tree-pretty-print.h): Include.
(omp_addr_tokenizer): New namespace.
(omp_addr_tokenizer::omp_addr_token): New.
(omp_addr_tokenizer::omp_parse_component_selector,
omp_addr_tokenizer::omp_parse_ref,
omp_addr_tokenizer::omp_parse_pointer,
omp_addr_tokenizer::omp_parse_access_method,
omp_addr_tokenizer::omp_parse_access_methods,
omp_addr_tokenizer::omp_parse_structure_base,
omp_addr_tokenizer::omp_parse_structured_expr,
omp_addr_tokenizer::omp_parse_array_expr,
omp_addr_tokenizer::omp_access_chain_p,
omp_addr_tokenizer::omp_accessed_addr): New functions.
(omp_parse_expr, debug_omp_tokenized_addr): New functions.
* omp-general.h (omp_addr_tokenizer::access_method_kinds,
omp_addr_tokenizer::structure_base_kinds,
omp_addr_tokenizer::token_type,
omp_addr_tokenizer::omp_addr_token,
omp_addr_tokenizer::omp_access_chain_p,
omp_addr_tokenizer::omp_accessed_addr): New.
(omp_addr_token, omp_parse_expr): New.
* omp-low.cc (scan_sharing_clauses): Skip error check for references
to pointers.
* tree.h (OMP_CLAUSE_ATTACHMENT_MAPPING_ERASED): New macro.
gcc/testsuite/
* c-c++-common/gomp/clauses-2.c: Fix error output.
* c-c++-common/gomp/target-implicit-map-2.c: Adjust scan output.
* c-c++-common/gomp/target-50.c: Adjust scan output.
* c-c++-common/gomp/target-enter-data-1.c: Adjust scan output.
* g++.dg/gomp/static-component-1.C: New test.
* gcc.dg/gomp/target-3.c: Adjust scan output.
* gfortran.dg/gomp/map-9.f90: Adjust scan output.
libgomp/
* target.c (gomp_map_pointer): Modify zero-length array section
pointer handling.
(gomp_attach_pointer): Likewise.
(gomp_map_fields_existing): Use gomp_map_0len_lookup.
(gomp_attach_pointer): Allow attaching null pointers (or Fortran
"unassociated" pointers).
(gomp_map_vars_internal): Handle zero-sized struct members. Add
diagnostic for unmapped struct pointer members.
* testsuite/libgomp.c-c++-common/baseptrs-1.c: New test.
* testsuite/libgomp.c-c++-common/baseptrs-2.c: New test.
* testsuite/libgomp.c-c++-common/baseptrs-6.c: New test.
* testsuite/libgomp.c-c++-common/baseptrs-7.c: New test.
* testsuite/libgomp.c-c++-common/ptr-attach-2.c: New test.
* testsuite/libgomp.c-c++-common/target-implicit-map-2.c: Fix missing
"free".
* testsuite/libgomp.c-c++-common/target-implicit-map-5.c: New test.
* testsuite/libgomp.c-c++-common/target-map-zlas-1.c: New test.
* testsuite/libgomp.c++/class-array-1.C: New test.
* testsuite/libgomp.c++/baseptrs-3.C: New test.
* testsuite/libgomp.c++/baseptrs-4.C: New test.
* testsuite/libgomp.c++/baseptrs-5.C: New test.
* testsuite/libgomp.c++/baseptrs-8.C: New test.
* testsuite/libgomp.c++/baseptrs-9.C: New test.
* testsuite/libgomp.c++/ref-mapping-1.C: New test.
* testsuite/libgomp.c++/target-48.C: New test.
* testsuite/libgomp.c++/target-49.C: New test.
* testsuite/libgomp.c++/target-exit-data-reftoptr-1.C: New test.
* testsuite/libgomp.c++/target-lambda-1.C: Update for OpenMP 5.2
semantics.
* testsuite/libgomp.c++/target-this-3.C: Likewise.
* testsuite/libgomp.c++/target-this-4.C: Likewise.
* testsuite/libgomp.fortran/struct-elem-map-1.f90: Add temporary XFAIL.
* testsuite/libgomp.fortran/target-enter-data-6.f90: Likewise.
|
||
|
|
d4b6d14792 |
OpenMP/Fortran: Implement omp allocators/allocate for ptr/allocatables
This commit adds -fopenmp-allocators which enables support for 'omp allocators' and 'omp allocate' that are associated with a Fortran allocate-stmt. If such a construct is encountered, an error is shown, unless the -fopenmp-allocators flag is present. With -fopenmp -fopenmp-allocators, those constructs get turned into GOMP_alloc allocations, while -fopenmp-allocators (also without -fopenmp) ensures deallocation and reallocation (via intrinsic assignments) are properly directed to GOMP_free/omp_realloc - while normal Fortran allocations are processed by free/realloc. In order to distinguish a 'malloc'ed from a 'GOMP_alloc'ed memory, the version field of the Fortran array discriptor is (mis)used: 0 indicates the normal Fortran allocation while 1 denotes GOMP_alloc. For scalars, there is record keeping in libgomp: GOMP_add_alloc(ptr) will add the pointer address to a splay_tree while GOMP_is_alloc(ptr) will return true it was previously added but also removes it from the list. Besides Fortran FE work, BUILT_IN_GOMP_REALLOC is no part of omp-builtins.def and libgomp gains the mentioned two new function. gcc/ChangeLog: * builtin-types.def (BT_FN_PTR_PTR_SIZE_PTRMODE_PTRMODE): New. * omp-builtins.def (BUILT_IN_GOMP_REALLOC): New. * builtins.cc (builtin_fnspec): Handle it. * gimple-ssa-warn-access.cc (fndecl_alloc_p, matching_alloc_calls_p): Likewise. * gimple.cc (nonfreeing_call_p): Likewise. * predict.cc (expr_expected_value_1): Likewise. * tree-ssa-ccp.cc (evaluate_stmt): Likewise. * tree.cc (fndecl_dealloc_argno): Likewise. gcc/fortran/ChangeLog: * dump-parse-tree.cc (show_omp_node): Handle EXEC_OMP_ALLOCATE and EXEC_OMP_ALLOCATORS. * f95-lang.cc (ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LIST): Add 'ECF_LEAF | ECF_MALLOC' to existing 'ECF_NOTHROW'. (ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LEAF_LIST): Define. * gfortran.h (gfc_omp_clauses): Add contained_in_target_construct. * invoke.texi (-fopenacc, -fopenmp): Update based on C version. (-fopenmp-simd): New, based on C version. (-fopenmp-allocators): New. * lang.opt (fopenmp-allocators): Add. * openmp.cc (resolve_omp_clauses): For allocators/allocate directive, add target and no dynamic_allocators diagnostic and more invalid diagnostic. * parse.cc (decode_omp_directive): Set contains_teams_construct. * trans-array.h (gfc_array_allocate): Update prototype. (gfc_conv_descriptor_version): New prototype. * trans-decl.cc (gfc_init_default_dt): Fix comment. * trans-array.cc (gfc_conv_descriptor_version): New. (gfc_array_allocate): Support GOMP_alloc allocation. (gfc_alloc_allocatable_for_assignment, structure_alloc_comps): Handle GOMP_free/omp_realloc as needed. * trans-expr.cc (gfc_conv_procedure_call): Likewise. (alloc_scalar_allocatable_for_assignment): Likewise. * trans-intrinsic.cc (conv_intrinsic_move_alloc): Likewise. * trans-openmp.cc (gfc_trans_omp_allocators, gfc_trans_omp_directive): Handle allocators/allocate directive. (gfc_omp_call_add_alloc, gfc_omp_call_is_alloc): New. * trans-stmt.h (gfc_trans_allocate): Update prototype. * trans-stmt.cc (gfc_trans_allocate): Support GOMP_alloc. * trans-types.cc (gfc_get_dtype_rank_type): Set version field. * trans.cc (gfc_allocate_using_malloc, gfc_allocate_allocatable): Update to handle GOMP_alloc. (gfc_deallocate_with_status, gfc_deallocate_scalar_with_status): Handle GOMP_free. (trans_code): Update call. * trans.h (gfc_allocate_allocatable, gfc_allocate_using_malloc): Update prototype. (gfc_omp_call_add_alloc, gfc_omp_call_is_alloc): New prototype. * types.def (BT_FN_PTR_PTR_SIZE_PTRMODE_PTRMODE): New. libgomp/ChangeLog: * allocator.c (struct fort_alloc_splay_tree_key_s, fort_alloc_splay_compare, GOMP_add_alloc, GOMP_is_alloc): New. * libgomp.h: Define splay_tree_static for 'reverse' splay tree. * libgomp.map (GOMP_5.1.2): New; add GOMP_add_alloc and GOMP_is_alloc; move GOMP_target_map_indirect_ptr from ... (GOMP_5.1.1): ... here. * libgomp.texi (Impl. Status, Memory management): Update for allocators/allocate directives. * splay-tree.c: Handle splay_tree_static define to declare all functions as static. (splay_tree_lookup_node): New. * splay-tree.h: Handle splay_tree_decl_only define. (splay_tree_lookup_node): New prototype. * target.c: Define splay_tree_static for 'reverse'. * testsuite/libgomp.fortran/allocators-1.f90: New test. * testsuite/libgomp.fortran/allocators-2.f90: New test. * testsuite/libgomp.fortran/allocators-3.f90: New test. * testsuite/libgomp.fortran/allocators-4.f90: New test. * testsuite/libgomp.fortran/allocators-5.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/allocate-14.f90: Add coarray and not-listed tests. * gfortran.dg/gomp/allocate-5.f90: Remove sorry dg-message. * gfortran.dg/bind_c_array_params_2.f90: Update expected dump for dtype '.version=0'. * gfortran.dg/gomp/allocate-16.f90: New test. * gfortran.dg/gomp/allocators-3.f90: New test. * gfortran.dg/gomp/allocators-4.f90: New test. |
||
|
|
a49c7d3193 |
openmp: Add support for the 'indirect' clause in C/C++
This adds support for the 'indirect' clause in the 'declare target' directive. Functions declared as indirect may be called via function pointers passed from the host in offloaded code. Virtual calls to member functions via the object pointer in C++ are currently not supported in target regions. 2023-11-07 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/c-family/ * c-attribs.cc (c_common_attribute_table): Add attribute for indirect functions. * c-pragma.h (enum parma_omp_clause): Add entry for indirect clause. gcc/c/ * c-decl.cc (c_decl_attributes): Add attribute for indirect functions. * c-lang.h (c_omp_declare_target_attr): Add indirect field. * c-parser.cc (c_parser_omp_clause_name): Handle indirect clause. (c_parser_omp_clause_indirect): New. (c_parser_omp_all_clauses): Handle indirect clause. (OMP_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask. (c_parser_omp_declare_target): Handle indirect clause. Emit error message if device_type or indirect clauses used alone. Emit error if indirect clause used with device_type that is not 'any'. (OMP_BEGIN_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask. (c_parser_omp_begin): Handle indirect clause. * c-typeck.cc (c_finish_omp_clauses): Handle indirect clause. gcc/cp/ * cp-tree.h (cp_omp_declare_target_attr): Add indirect field. * decl2.cc (cplus_decl_attributes): Add attribute for indirect functions. * parser.cc (cp_parser_omp_clause_name): Handle indirect clause. (cp_parser_omp_clause_indirect): New. (cp_parser_omp_all_clauses): Handle indirect clause. (handle_omp_declare_target_clause): Add extra parameter. Add indirect attribute for indirect functions. (OMP_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask. (cp_parser_omp_declare_target): Handle indirect clause. Emit error message if device_type or indirect clauses used alone. Emit error if indirect clause used with device_type that is not 'any'. (OMP_BEGIN_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask. (cp_parser_omp_begin): Handle indirect clause. * semantics.cc (finish_omp_clauses): Handle indirect clause. gcc/ * lto-cgraph.cc (enum LTO_symtab_tags): Add tag for indirect functions. (output_offload_tables): Write indirect functions. (input_offload_tables): read indirect functions. * lto-section-names.h (OFFLOAD_IND_FUNC_TABLE_SECTION_NAME): New. * omp-builtins.def (BUILT_IN_GOMP_TARGET_MAP_INDIRECT_PTR): New. * omp-offload.cc (offload_ind_funcs): New. (omp_discover_implicit_declare_target): Add functions marked with 'omp declare target indirect' to indirect functions list. (omp_finish_file): Add indirect functions to section for offload indirect functions. (execute_omp_device_lower): Redirect indirect calls on target by passing function pointer to BUILT_IN_GOMP_TARGET_MAP_INDIRECT_PTR. (pass_omp_device_lower::gate): Run pass_omp_device_lower if indirect functions are present on an accelerator device. * omp-offload.h (offload_ind_funcs): New. * tree-core.h (omp_clause_code): Add OMP_CLAUSE_INDIRECT. * tree.cc (omp_clause_num_ops): Add entry for OMP_CLAUSE_INDIRECT. (omp_clause_code_name): Likewise. * tree.h (OMP_CLAUSE_INDIRECT_EXPR): New. * config/gcn/mkoffload.cc (process_asm): Process offload_ind_funcs section. Count number of indirect functions. (process_obj): Emit number of indirect functions. * config/nvptx/mkoffload.cc (ind_func_ids, ind_funcs_tail): New. (process): Emit offload_ind_func_table in PTX code. Emit indirect function names and count in image. * config/nvptx/nvptx.cc (nvptx_record_offload_symbol): Mark indirect functions in PTX code with IND_FUNC_MAP. gcc/testsuite/ * c-c++-common/gomp/declare-target-7.c: Update expected error message. * c-c++-common/gomp/declare-target-indirect-1.c: New. * c-c++-common/gomp/declare-target-indirect-2.c: New. * g++.dg/gomp/attrs-21.C (v12): Update expected error message. * g++.dg/gomp/declare-target-indirect-1.C: New. * gcc.dg/gomp/attrs-21.c (v12): Update expected error message. include/ * gomp-constants.h (GOMP_VERSION): Increment to 3. (GOMP_VERSION_SUPPORTS_INDIRECT_FUNCS): New. libgcc/ * offloadstuff.c (OFFLOAD_IND_FUNC_TABLE_SECTION_NAME): New. (__offload_ind_func_table): New. (__offload_ind_funcs_end): New. (__OFFLOAD_TABLE__): Add entries for indirect functions. libgomp/ * Makefile.am (libgomp_la_SOURCES): Add target-indirect.c. * Makefile.in: Regenerate. * libgomp-plugin.h (GOMP_INDIRECT_ADDR_MAP): New define. (GOMP_OFFLOAD_load_image): Add extra argument. * libgomp.h (struct indirect_splay_tree_key_s): New. (indirect_splay_tree_node, indirect_splay_tree, indirect_splay_tree_key): New. (indirect_splay_compare): New. * libgomp.map (GOMP_5.1.1): Add GOMP_target_map_indirect_ptr. * libgomp.texi (OpenMP 5.1): Update documentation on indirect calls in target region and on indirect clause. (Other new OpenMP 5.2 features): Add entry for virtual function calls. * libgomp_g.h (GOMP_target_map_indirect_ptr): Add prototype. * oacc-host.c (host_load_image): Add extra argument. * target.c (gomp_load_image_to_device): If the GOMP_VERSION is high enough, read host indirect functions table and pass to load_image_func. * config/accel/target-indirect.c: New. * config/linux/target-indirect.c: New. * config/gcn/team.c (build_indirect_map): Add prototype. (gomp_gcn_enter_kernel): Initialize support for indirect function calls on GCN target. * config/nvptx/team.c (build_indirect_map): Add prototype. (gomp_nvptx_main): Initialize support for indirect function calls on NVPTX target. * plugin/plugin-gcn.c (struct gcn_image_desc): Add field for indirect functions count. (GOMP_OFFLOAD_load_image): Add extra argument. If the GOMP_VERSION is high enough, build address translation table and copy it to target memory. * plugin/plugin-nvptx.c (nvptx_tdata): Add field for indirect functions count. (GOMP_OFFLOAD_load_image): Add extra argument. If the GOMP_VERSION is high enough, Build address translation table and copy it to target memory. * testsuite/libgomp.c-c++-common/declare-target-indirect-1.c: New. * testsuite/libgomp.c-c++-common/declare-target-indirect-2.c: New. * testsuite/libgomp.c++/declare-target-indirect-1.C: New. |
||
|
|
d22cd7745f |
Revert: "Another revert test with a bogus hash"
This reverts commit ffffffffffffffffffffffffffffffffffffffff. This should get rejected because of the invalid hash. If it still is accepted, it does something sensible: It removes tailing white space from a line in libgomp/target.c. |
||
|
|
8b9e559fe7 |
libgomp: cuda.h and omp_target_memcpy_rect cleanup
Fixes for commit r14-2792-g25072a477a56a727b369bf9b20f4d18198ff5894
"OpenMP: Call cuMemcpy2D/cuMemcpy3D for nvptx for omp_target_memcpy_rect",
namely:
In that commit, the code was changed to handle shared-memory devices;
however, as pointed out, omp_target_memcpy_check already set the pointer
to NULL in that case. Hence, this commit reverts to the prior version.
In cuda.h, it adds cuMemcpyPeer{,Async} for symmetry for cuMemcpy3DPeer
(all currently unused) and in three structs, fixes reserved-member names
and remove a bogus 'const' in three structs.
And it changes a DLSYM to DLSYM_OPT as not all plugins support the new
functions, yet.
include/ChangeLog:
* cuda/cuda.h (CUDA_MEMCPY2D, CUDA_MEMCPY3D, CUDA_MEMCPY3D_PEER):
Remove bogus 'const' from 'const void *dst' and fix reserved-name
name in those structs.
(cuMemcpyPeer, cuMemcpyPeerAsync): Add.
libgomp/ChangeLog:
* target.c (omp_target_memcpy_rect_worker): Undo dim=1 change for
GOMP_OFFLOAD_CAP_SHARED_MEM.
(omp_target_memcpy_rect_copy): Likewise for lock condition.
(gomp_load_plugin_for_device): Use DLSYM_OPT not DLSYM for
memcpy3d/memcpy2d.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_memcpy2d,
GOMP_OFFLOAD_memcpy3d): Use memset 0 to nullify reserved and
unused src/dst fields for that mem type; remove '{src,dst}LOD = 0'.
|
||
|
|
25072a477a |
OpenMP: Call cuMemcpy2D/cuMemcpy3D for nvptx for omp_target_memcpy_rect
When copying a 2D or 3D rectangular memmory block, the performance is better when using CUDA's cuMemcpy2D/cuMemcpy3D instead of copying the data one by one. That's what this commit does. Additionally, it permits device-to-device copies, if neccessary using a temporary variable on the host. include/ChangeLog: * cuda/cuda.h (CUlimit): Add CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_INVALID_HANDLE. (CUarray, CUmemorytype, CUDA_MEMCPY2D, CUDA_MEMCPY3D, CUDA_MEMCPY3D_PEER): New typdefs. (cuMemcpy2D, cuMemcpy2DAsync, cuMemcpy2DUnaligned, cuMemcpy3D, cuMemcpy3DAsync, cuMemcpy3DPeer, cuMemcpy3DPeerAsync): New prototypes. libgomp/ChangeLog: * libgomp-plugin.h (GOMP_OFFLOAD_memcpy2d, GOMP_OFFLOAD_memcpy3d): New prototypes. * libgomp.h (struct gomp_device_descr): Add memcpy2d_func and memcpy3d_func. * libgomp.texi (nvtpx): Document when cuMemcpy2D/cuMemcpy3D is used. * oacc-host.c (memcpy2d_func, .memcpy3d_func): Init with NULL. * plugin/cuda-lib.def (cuMemcpy2D, cuMemcpy2DUnaligned, cuMemcpy3D): Invoke via CUDA_ONE_CALL. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_memcpy2d, GOMP_OFFLOAD_memcpy3d): New. * target.c (omp_target_memcpy_rect_worker): (omp_target_memcpy_rect_check, omp_target_memcpy_rect_copy): Permit all device-to-device copyies; invoke new plugins for 2D and 3D copying when available. (gomp_load_plugin_for_device): DLSYM the new plugin functions. * testsuite/libgomp.c/target-12.c: Fix dimension bug. * testsuite/libgomp.fortran/target-12.f90: Likewise. * testsuite/libgomp.fortran/target-memcpy-rect-1.f90: New test. |
||
|
|
b25ea7ab78 |
OpenMP (C/C++): Keep pointer value of unmapped ptr with default mapping [PR110270]
For C/C++ pointers, default implicit mapping firstprivatizes the pointer but if the memory it points to is mapped, the it is updated to point to the device memory (by attaching a zero sized array section of the pointed-to storage). However, if the pointed-to storage wasn't mapped, the pointer was set to NULL on the device side (OpenMP 5.0/5.1 semantic). With this commit, the pointer retains the on-host address in that case (OpenMP 5.2 semantic). The new semantic avoids an explicit map/firstprivate/is_device_ptr in the following sensible cases: Special values (e.g. pointer or 0x1, 0x2 etc.), explicitly device allocated memory (e.g. omp_target_alloc), and with (unified) shared memory. (Note: With (U)SM, mappings still must be tracked, at least when omp_target_associate_ptr does not fail when passing in two destinct pointers.) libgomp/ PR middle-end/110270 * target.c (gomp_map_vars_internal): Copy host value instead of NULL for GOMP_MAP_ZERO_LEN_ARRAY_SECTION if not mapped. * libgomp.texi (OpenMP 5.2 Impl.): Mark as 'Y'. * testsuite/libgomp.c/target-19.c: Update expected value. * testsuite/libgomp.c++/target-18.C: Likewise. * testsuite/libgomp.c++/target-19.C: Likewise. * testsuite/libgomp.c-c++-common/requires-unified-addr-2.c: New test. * testsuite/libgomp.c-c++-common/target-implicit-map-3.c: New test. * testsuite/libgomp.c-c++-common/target-implicit-map-4.c: New test. |
||
|
|
8216ca8503 |
libgomp: Fix OMP_TARGET_OFFLOAD=mandatory
It turned out that gomp_init_targets_once() was not run when directly calling 'omp target' or 'omp target (enter/exit) data' causing an abort with OMP_TARGET_OFFLOAD=mandatory wrongly claiming that no device is available. It was called a tiny bit later but few lines too late for updating the default-device-var. libgomp/ChangeLog: * target.c (resolve_device): Call gomp_get_num_devices early to ensure gomp_init_targets_once was called before using default-device-var. * testsuite/libgomp.c/target-55.c: New test. * testsuite/libgomp.c/target-55a.c: New test. |
||
|
|
f2ef1dabbc |
Align a 'OMP_TARGET_OFFLOAD=mandatory' diagnostic with others
On 2023-06-14T11:42:22+0200, Tobias Burnus <tobias@codesourcery.com> wrote:
> On 14.06.23 10:09, Thomas Schwinge wrote:
>> Let me know if I should also adjust the new 'target { ! offload_device }'
>> diagnostic "[...] MANDATORY but only the host device is available" to
>> include a comma before 'but', for consistency with the other existing
>> diagnostics (cited above)?
>
> I think it makes sense to be consistent. Thus: Yes, please add the commas.
Fix-up for recent commit
|
||
|
|
18c8b56c7d |
OpenMP: Set default-device-var with OMP_TARGET_OFFLOAD=mandatory
OMP_TARGET_OFFLOAD=mandatory handling was before inconsistent. Hence, in OpenMP 5.2 it was clarified/extended by having implications on the default-device-var; additionally, omp_initial_device and omp_invalid_device enum values/PARAMETERs were added; support for it was added in r13-1066-g1158fe43407568 including aborting for omp_invalid_device and non-conforming device numbers. Only the mandatory handling was missing. Namely, while the default-device-var is usually initialized to value 0, with 'mandatory' it must have the value 'omp_invalid_device' if and only if zero non-host devices are available. (The OMP_DEFAULT_DEVICE env var overrides this as it comes semantically after the initialization.) To achieve this, default-device-var is now initialized to MIN_INT. If there is no 'mandatory', it is set to 0 directly after env var parsing. Otherwise, it is updated in gomp_target_init to either 0 or omp_invalid_device. To ensure INT_MIN is never seen by the user, both the omp_get_default_device API routine and omp_display_env (user call and OMP_DISPLAY_ENV env var) call gomp_init_targets_once() in that case. libgomp/ChangeLog: * env.c (gomp_default_icv_values): Init default_device_var to an nonconforming value - INT_MIN. (initialize_env): After env-var parsing, set default_device_var to device 0 unless OMP_TARGET_OFFLOAD=mandatory. (omp_display_env): If default_device_var is INT_MIN, call gomp_init_targets_once. * icv-device.c (omp_get_default_device): Likewise. * libgomp.texi (OMP_DEFAULT_DEVICE): Update init description. (OpenMP 5.2 Impl. Status): Mark OMP_TARGET_OFFLOAD=mandatory as 'Y'. * target.c (resolve_device): Improve error message device-num < 0 with 'mandatory' and no no-host devices available. (gomp_target_init): Set default-device-var if INT_MIN. * testsuite/libgomp.c/target-48.c: New test. * testsuite/libgomp.c/target-49.c: New test. * testsuite/libgomp.c/target-50.c: New test. * testsuite/libgomp.c/target-50a.c: New test. * testsuite/libgomp.c/target-51.c: New test. * testsuite/libgomp.c/target-52.c: New test. * testsuite/libgomp.c/target-53.c: New test. * testsuite/libgomp.c/target-54.c: New test. |
||
|
|
38944ec2a6 |
OpenMP: Cleanups related to the 'present' modifier
Reduce number of enum values passed to libgomp as
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} have the same semantic as
GOMP_MAP_FORCE_PRESENT (i.e. abort if not present, otherwise ignore);
that's different to GOMP_MAP_ALWAYS_PRESENT_{TO,TOFROM,FROM} which also
abort if not present but copy data when present. This is is a follow-up to
the commit r14-1579-g4ede915d5dde93 done 6 days ago.
Additionally, the commit improves a libgomp run-time and a C/C++ compile-time
error wording and extends testcases a tiny bit.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_clause_map): Reword error message for
clearness especially with 'omp target (enter/exit) data.'
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_clause_map): Reword error message for
clearness especially with 'omp target (enter/exit) data.'
* semantics.cc (handle_omp_array_sections): Handle
GOMP_MAP_{ALWAYS_,}PRESENT_{TO,TOFROM,FROM,ALLOC} enum values.
gcc/ChangeLog:
* gimplify.cc (gimplify_adjust_omp_clauses_1): Use
GOMP_MAP_FORCE_PRESENT for 'present alloc' implicit mapping.
(gimplify_adjust_omp_clauses): Change
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} to the equivalent
GOMP_MAP_FORCE_PRESENT.
* omp-low.cc (lower_omp_target): Remove handling of no-longer valid
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC}; update map kinds used for
to/from clauses with present modifier.
include/ChangeLog:
* gomp-constants.h (enum gomp_map_kind): Change the enum values
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} to be compiler only.
(GOMP_MAP_PRESENT_P): Update to include also GOMP_MAP_FORCE_PRESENT.
libgomp/ChangeLog:
* target.c (gomp_to_device_kind_p, gomp_map_vars_internal): Replace
GOMP_MAP_PRESENT_{FROM,TO,TOFROM,ACLLOC} by GOMP_MAP_FORCE_PRESENT.
(gomp_map_vars_internal, gomp_update): Likewise; unify and improve
error message.
* testsuite/libgomp.c-c++-common/target-present-2.c: Update for
changed error message.
* testsuite/libgomp.fortran/target-present-1.f90: Likewise.
* testsuite/libgomp.fortran/target-present-2.f90: Likewise.
* testsuite/libgomp.oacc-c-c++-common/present-1.c: Likewise.
* testsuite/libgomp.c-c++-common/target-present-1.c: Likewise and
extend testcase to check that data is copied when needed.
* testsuite/libgomp.c-c++-common/target-present-3.c: Likewise.
* testsuite/libgomp.fortran/target-present-3.f90: Likewise.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/defaultmap-4.c: Update scan-tree-dump.
* c-c++-common/gomp/map-9.c: Likewise.
* gfortran.dg/gomp/defaultmap-8.f90: Likewise.
* gfortran.dg/gomp/map-11.f90: Likewise.
* gfortran.dg/gomp/target-update-1.f90: Likewise.
* gfortran.dg/gomp/map-12.f90: Likewise; also check original dump.
* c-c++-common/gomp/map-6.c: Update dg-error and also check
clause error with 'target (enter/exit) data'.
|
||
|
|
4ede915d5d |
openmp: Add support for the 'present' modifier
This implements support for the OpenMP 5.1 'present' modifier, which can be used in map clauses in the 'target', 'target data', 'target data enter' and 'target data exit' constructs, and in the 'to' and 'from' clauses of the 'target update' construct. It is also supported in defaultmap. The modifier triggers a fatal runtime error if the data specified by the clause is not already present on the target device. It can also be combined with 'always' in map clauses. 2023-06-06 Kwok Cheung Yeung <kcy@codesourcery.com> Tobias Burnus <tobias@codesourcery.com> gcc/c/ * c-parser.cc (c_parser_omp_clause_defaultmap, c_parser_omp_clause_map): Parse 'present'. (c_parser_omp_clause_to, c_parser_omp_clause_from): Remove. (c_parser_omp_clause_from_to): New; parse to/from clauses with optional present modifer. (c_parser_omp_all_clauses): Update call. (c_parser_omp_target_data, c_parser_omp_target_enter_data, c_parser_omp_target_exit_data): Handle new map enum values for 'present' mapping. gcc/cp/ * parser.cc (cp_parser_omp_clause_defaultmap, cp_parser_omp_clause_map): Parse 'present'. (cp_parser_omp_clause_from_to): New; parse to/from clauses with optional 'present' modifier. (cp_parser_omp_all_clauses): Update call. (cp_parser_omp_target_data, cp_parser_omp_target_enter_data, cp_parser_omp_target_exit_data): Handle new enum value for 'present' mapping. * semantics.cc (finish_omp_target): Likewise. gcc/fortran/ * dump-parse-tree.cc (show_omp_namelist): Display 'present' map modifier. (show_omp_clauses): Display 'present' motion modifier for 'to' and 'from' clauses. * gfortran.h (enum gfc_omp_map_op): Add entries with 'present' modifiers. (struct gfc_omp_namelist): Add 'present_modifer'. * openmp.cc (gfc_match_motion_var_list): New, handles optional 'present' modifier for to/from clauses. (gfc_match_omp_clauses): Call it for to/from clauses; parse 'present' in defaultmap and map clauses. (resolve_omp_clauses): Allow 'present' modifiers on 'target', 'target data', 'target enter' and 'target exit' directives. * trans-openmp.cc (gfc_trans_omp_clauses): Apply 'present' modifiers to tree node for 'map', 'to' and 'from' clauses. Apply 'present' for defaultmap. gcc/ * gimplify.cc (omp_notice_variable): Apply GOVD_MAP_ALLOC_ONLY flag and defaultmap flags if the defaultmap has GOVD_MAP_FORCE_PRESENT flag set. (omp_get_attachment): Handle map clauses with 'present' modifier. (omp_group_base): Likewise. (gimplify_scan_omp_clauses): Reorder present maps to come first. Set GOVD flags for present defaultmaps. (gimplify_adjust_omp_clauses_1): Set map kind for present defaultmaps. * omp-low.cc (scan_sharing_clauses): Handle 'always, present' map clauses. (lower_omp_target): Handle map clauses with 'present' modifier. Handle 'to' and 'from' clauses with 'present'. * tree-core.h (enum omp_clause_defaultmap_kind): Add OMP_CLAUSE_DEFAULTMAP_PRESENT defaultmap kind. * tree-pretty-print.cc (dump_omp_clause): Handle 'map', 'to' and 'from' clauses with 'present' modifier. Handle present defaultmap. * tree.h (OMP_CLAUSE_MOTION_PRESENT): New #define. include/ * gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_5): New. (GOMP_MAP_FLAG_FORCE): Redefine. (GOMP_MAP_FLAG_PRESENT, GOMP_MAP_FLAG_ALWAYS_PRESENT): New. (enum gomp_map_kind): Add map kinds with 'present' modifiers. (GOMP_MAP_COPY_TO_P, GOMP_MAP_COPY_FROM_P): Evaluate to true for map variants with 'present' (GOMP_MAP_ALWAYS_TO_P, GOMP_MAP_ALWAYS_FROM_P): Evaluate to true for map variants with 'always, present' modifiers. (GOMP_MAP_ALWAYS): Redefine. (GOMP_MAP_FORCE_P, GOMP_MAP_PRESENT_P): New. libgomp/ * libgomp.texi (OpenMP 5.1 Impl. status): Set 'present' support for defaultmap to 'Y', add 'Y' entry for 'present' on to/from/map clauses. * target.c (gomp_to_device_kind_p): Add map kinds with 'present' modifier. (gomp_map_vars_existing): Use new GOMP_MAP_FORCE_P macro. (gomp_map_vars_internal, gomp_update, gomp_target_rev): Emit runtime error if memory region not present. * testsuite/libgomp.c-c++-common/target-present-1.c: New test. * testsuite/libgomp.c-c++-common/target-present-2.c: New test. * testsuite/libgomp.c-c++-common/target-present-3.c: New test. * testsuite/libgomp.fortran/target-present-1.f90: New test. * testsuite/libgomp.fortran/target-present-2.f90: New test. * testsuite/libgomp.fortran/target-present-3.f90: New test. gcc/testsuite/ * c-c++-common/gomp/map-6.c: Update dg-error, extend to test for duplicated 'present' and extend scan-dump tests for 'present'. * gfortran.dg/gomp/defaultmap-1.f90: Update dg-error. * gfortran.dg/gomp/map-7.f90: Extend parse and dump test for 'present'. * gfortran.dg/gomp/map-8.f90: Extend for duplicate 'present' modifier checking. * c-c++-common/gomp/defaultmap-4.c: New test. * c-c++-common/gomp/map-9.c: New test. * c-c++-common/gomp/target-update-1.c: New test. * gfortran.dg/gomp/defaultmap-8.f90: New test. * gfortran.dg/gomp/map-11.f90: New test. * gfortran.dg/gomp/map-12.f90: New test. * gfortran.dg/gomp/target-update-1.f90: New test. |
||
|
|
130c2f3c3a |
libgomp: Simplify OpenMP reverse offload host <-> device memory copy implementation
... by using the existing 'goacc_asyncqueue' instead of re-coding parts of it. Follow-up to commit |
||
|
|
e8fec6998b |
Add caveat/safeguard to OpenMP: Handle descriptors in target's firstprivate [PR104949]
Follow-up to commit
|
||
|
|
f8332e52a4 |
Use 'GOMP_MAP_VARS_TARGET' for OpenACC compute constructs [PR90596]
Thereby considerably simplify the device plugins' 'GOMP_OFFLOAD_openacc_exec',
'GOMP_OFFLOAD_openacc_async_exec' functions: in terms of lines of code, but in
particular conceptually: no more device memory allocation, host to device data
copying, device memory deallocation -- 'GOMP_MAP_VARS_TARGET' does all that for
us.
This depends on commit
|
||
|
|
2b2340e236 |
Allow libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral' data
This does *allow*, but under no circumstances is this currently going to be
used: all potentially applicable data is non-'ephemeral', and thus not
considered for 'gomp_coalesce_buf_add' for OpenACC 'async'. (But a use will
emerge later.)
Follow-up to commit r12-2530-gd88a6951586c7229b25708f4486eaaf4bf4b5bbe
"Don't use libgomp 'cbuf' buffering with OpenACC 'async'", addressing this
TODO comment:
TODO ... but we could allow CBUF usage for EPHEMERAL data? (Open question:
is it more performant to use libgomp CBUF buffering or individual device
asyncronous copying?)
Ephemeral data is small, and therefore individual device asyncronous copying
does seem dubious -- in particular given that for all those, we'd individually
have to allocate and queue for deallocation a temporary buffer to capture the
ephemeral data. Instead, just let the 'cbuf' *be* the temporary buffer.
libgomp/
* target.c (gomp_copy_host2dev, gomp_map_vars_internal): Allow
libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral'
data.
|
||
|
|
199867d07b |
Simplify OpenACC 'no_create' clause implementation
For 'OFFSET_INLINED', 'gomp_map_val' does the right thing, and we may then simplify the device plugins accordingly. This is a follow-up to Subversion r279551 (Git commit |
||
|
|
edaf1d6078 |
libgomp: Fix reverse-offload for GOMP_MAP_TO_PSET
libgomp/ * target.c (gomp_target_rev): Dereference ptr to get device address. * testsuite/libgomp.fortran/reverse-offload-5.f90: Add test for unallocated allocatable. |
||
|
|
c7a9655be6 |
libgomp: Fix 'target enter data' with always pointer
As GOMP_MAP_ALWAYS_POINTER operates on the previous map item, ensure that with 'target enter data' both are passed together to gomp_map_vars_internal. libgomp/ChangeLog: * target.c (gomp_map_vars_internal): Add 'i > 0' before doing a kind check. (GOMP_target_enter_exit_data): If the next map item is GOMP_MAP_ALWAYS_POINTER map it together with the current item. * testsuite/libgomp.fortran/target-enter-data-3.f90: New test. |
||
|
|
0b1ce70a81 |
libgomp: Fix reverse offload issues
If there is nothing to map, skip the mapping and avoid attempting to copy 0 bytes from addrs, sizes and kinds. Additionally, it could happen that a non-allocated address was deallocated, such as a pointer set, leading to a free for the actual data. libgomp/ * target.c (gomp_target_rev): Handle mapnum == 0 and avoid freeing not allocated memory. * testsuite/libgomp.fortran/reverse-offload-6.f90: New test. |
||
|
|
83ffe9cde7 | Update copyright years. | ||
|
|
ea4b23d9c8 |
libgomp: Handle OpenMP's reverse offloads
This commit enabled reverse offload for nvptx such that gomp_target_rev actually gets called. And it fills the latter function to do all of the following: finding the host function to the device func ptr and copying the arguments to the host, processing the mapping/firstprivate, calling the host function, copying back the data and freeing as needed. The data handling is made easier by assuming that all host variables either existed before (and are in the mapping) or that those are devices variables not yet available on the host. Thus, the reverse mapping can do without refcounts etc. Note that the spec disallows inside a target region device-affecting constructs other than target plus ancestor device-modifier and it also limits the clauses permitted on this construct. For the function addresses, an additional splay tree is used; for the lookup of mapped variables, the existing splay-tree is used. Unfortunately, its data structure requires a full walk of the tree; Additionally, the just mapped variables are recorded in a separate data structure an extra lookup. While the lookup is slow, assuming that only few variables get mapped in each reverse offload construct and that reverse offload is the exception and not performance critical, this seems to be acceptable. libgomp/ChangeLog: * libgomp.h (struct target_mem_desc): Predeclare; move below after 'reverse_splay_tree_node' and add rev_array member. (struct reverse_splay_tree_key_s, reverse_splay_compare): New. (reverse_splay_tree_node, reverse_splay_tree, reverse_splay_tree_key): New typedef. (struct gomp_device_descr): Add mem_map_rev member. * oacc-host.c (host_dispatch): NULL init .mem_map_rev. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices): Claim support for GOMP_REQUIRES_REVERSE_OFFLOAD. * splay-tree.h (splay_tree_callback_stop): New typedef; like splay_tree_callback but returning int not void. (splay_tree_foreach_lazy): Define; like splay_tree_foreach but taking splay_tree_callback_stop as argument. * splay-tree.c (splay_tree_foreach_internal_lazy, splay_tree_foreach_lazy): New; but early exit if callback returns nonzero. * target.c: Instatiate splay_tree_c with splay_tree_prefix 'reverse'. (gomp_map_lookup_rev): New. (gomp_load_image_to_device): Handle reverse-offload function lookup table. (gomp_unload_image_from_device): Free devicep->mem_map_rev. (struct gomp_splay_tree_rev_lookup_data, gomp_splay_tree_rev_lookup, gomp_map_rev_lookup, struct cpy_data, gomp_map_cdata_lookup_int, gomp_map_cdata_lookup): New auxiliary structs and functions for gomp_target_rev. (gomp_target_rev): Implement reverse offloading and its mapping. (gomp_target_init): Init current_device.mem_map_rev.root. * testsuite/libgomp.fortran/reverse-offload-2.f90: New test. * testsuite/libgomp.fortran/reverse-offload-3.f90: New test. * testsuite/libgomp.fortran/reverse-offload-4.f90: New test. * testsuite/libgomp.fortran/reverse-offload-5.f90: New test. * testsuite/libgomp.fortran/reverse-offload-5a.f90: New test without mapping of on-device allocated variables. |