Commit Graph

158 Commits

Author SHA1 Message Date
Jakub Jelinek
254a858ae7 Update copyright years. 2026-01-02 09:56:11 +01:00
Paul-Antoine Arras
05c2ad4a2e OpenMP/Fortran: Allow explicit map followed by implicit deep mapping [PR120505]
Consider the following source code, assuming tiles is allocatable:

```
!$omp target enter data map(var%tiles(1)%den1, var%tiles(1)%den2) !        (1)
[...]
!$omp target ! implicitly maps var, which triggers deep mapping of tiles   (2)
```

Each omp directive causes a run-time error in libgomp:
(1) libgomp: Mapped array elements must be the same (0x14d729c0 vs 0x14d72a18)
(2) libgomp: Trying to map into device [0x3704ca50..0x3704cb00) object when
             [0x3704ca50..0x3704caa8) is already mapped

Regarding (1), the OpenMP spec has the following restriction: "If multiple list
items are explicitly mapped on the same construct and have the same containing
array or have base pointers that share original storage, and if any of the list
items do not have corresponding list items that are present in the device data
environment prior to a task encountering the construct, then the list items must
refer to *the same array elements* of either the containing array or the
implicit array of the base pointers."
Because tiles is allocatable, we cannot prove at compile time that array
elements are the same, so the check is deferred to libgomp. But there the
condition enforcing that all addresses are the same is too strict, so this patch
relaxes it to only check that addresses are sorted in increasing order.

The OpenMP spec allows (2) as long as it is implicit, without extending the
original mapping. So this patch sets the GOMP_MAP_IMPLICIT flag appropriately
on deep maps at compile time to let libgomp know that it is fine.

This patch ensures that such user code is accepted by:
(1) Setting the GOMP_MAP_IMPLICIT flag appropriately on deep maps;
(2) Relaxing the restriction on struct mapping from different containing arrays,
so that the element index need not be the same, instead addresses must be sorted
in increasing order.

This fixes the two errors currently seen when running SPEC HPC clvleaf
benchmark. However, further mapping issues prevent the benchmark from running to
completion.

	PR fortran/120505

gcc/ChangeLog:

	* omp-low.cc (lower_omp_target): Set GOMP_MAP_IMPLICIT flag.

libgomp/ChangeLog:

	* target.c (gomp_map_vars_internal): Allow struct mapping from different
	containing array elements as long as adresses are in increasing order.
	* testsuite/libgomp.c-c++-common/map-arrayofstruct-2.c: Adjust
	dg-output.
	* testsuite/libgomp.c-c++-common/map-arrayofstruct-3.c: Likewise.
	* testsuite/libgomp.fortran/map-subarray-5.f90: Likewise.
	* testsuite/libgomp.fortran/map-subarray-10.f90: New test.
	* testsuite/libgomp.fortran/map-subarray-9.f90: New test.
2025-12-01 11:35:44 +01:00
Andrew Stubbs
62174ec27b openmp, nvptx: ompx_gnu_managed_mem_alloc
This adds support for using Cuda Managed Memory with omp_alloc.  AMD support
will be added in a future patch.

There is one new predefined allocator, "ompx_gnu_managed_mem_alloc", plus a
corresponding memory space, which can be used to allocate memory in the
"managed" space.

The nvptx plugin is modified to make the necessary Cuda calls, via two new
(optional) plugin interfaces.

gcc/fortran/ChangeLog:

	* openmp.cc (is_predefined_allocator): Use GOMP_OMP_PREDEF_ALLOC_MAX
	and GOMP_OMPX_PREDEF_ALLOC_MIN/MAX instead of hardcoded values in the
	comment.

include/ChangeLog:

	* cuda/cuda.h (cuMemAllocManaged): Add declaration and related
	CU_MEM_ATTACH_GLOBAL flag.
	* gomp-constants.h (GOMP_OMPX_PREDEF_ALLOC_MAX): Update to 201.
	(GOMP_OMP_PREDEF_MEMSPACE_MAX): New constant.
	(GOMP_OMPX_PREDEF_MEMSPACE_MIN): New constant.
	(GOMP_OMPX_PREDEF_MEMSPACE_MAX): New constant.

libgomp/ChangeLog:

	* allocator.c (ompx_gnu_max_predefined_alloc): Update to
	ompx_gnu_managed_mem_alloc.
	(_Static_assert): Fix assertion messages for allocators and add
	new assertions for memspace constants.
	(omp_max_predefined_mem_space): New define.
	(ompx_gnu_min_predefined_mem_space): New define.
	(ompx_gnu_max_predefined_mem_space): New define.
	(MEMSPACE_ALLOC): Add check for non-standard memspaces.
	(MEMSPACE_CALLOC): Likewise.
	(MEMSPACE_REALLOC): Likewise.
	(MEMSPACE_VALIDATE): Likewise.
	(predefined_ompx_gnu_alloc_mapping): Add ompx_gnu_managed_mem_space.
	(omp_init_allocator): Add ompx_gnu_managed_mem_space validation.
	* config/gcn/allocator.c (gcn_memspace_alloc): Add check for
	non-standard memspaces.
	(gcn_memspace_calloc): Likewise.
	(gcn_memspace_realloc): Likewise.
	(gcn_memspace_validate): Update to validate standard vs non-standard
	memspaces.
	* config/linux/allocator.c (linux_memspace_alloc): Add managed
	memory space handling.
	(linux_memspace_calloc): Likewise.
	(linux_memspace_free): Likewise.
	(linux_memspace_realloc): Likewise (returns NULL for fallback).
	* config/nvptx/allocator.c (nvptx_memspace_alloc): Add check for
	non-standard memspaces.
	(nvptx_memspace_calloc): Likewise.
	(nvptx_memspace_realloc): Likewise.
	(nvptx_memspace_validate): Update to validate standard vs non-standard
	memspaces.
	* env.c (parse_allocator): Add ompx_gnu_managed_mem_alloc,
	ompx_gnu_managed_mem_space, and some static asserts so I don't forget
	them again.
	* libgomp-plugin.h (GOMP_OFFLOAD_managed_alloc): New declaration.
	(GOMP_OFFLOAD_managed_free): New declaration.
	* libgomp.h (gomp_managed_alloc): New declaration.
	(gomp_managed_free): New declaration.
	(struct gomp_device_descr): Add managed_alloc_func and
	managed_free_func fields.
	* libgomp.texi: Document ompx_gnu_managed_mem_alloc and
	ompx_gnu_managed_mem_space, add C++ template documentation, and
	describe NVPTX and AMD support.
	* omp.h.in: Add ompx_gnu_managed_mem_space and
	ompx_gnu_managed_mem_alloc enumerators, and gnu_managed_mem C++
	allocator template.
	* omp_lib.f90.in: Add Fortran bindings for new allocator and
	memory space.
	* omp_lib.h.in: Likewise.
	* plugin/cuda-lib.def: Add cuMemAllocManaged.
	* plugin/plugin-nvptx.c (nvptx_alloc): Add managed parameter to
	support cuMemAllocManaged.
	(GOMP_OFFLOAD_alloc): Move contents to ...
	(cleanup_and_alloc): ... this new function, and add managed support.
	(GOMP_OFFLOAD_managed_alloc): New function.
	(GOMP_OFFLOAD_managed_free): New function.
	* target.c (gomp_managed_alloc): New function.
	(gomp_managed_free): New function.
	(gomp_load_plugin_for_device): Load optional managed_alloc
	and managed_free plugin APIs.
	* testsuite/lib/libgomp.exp: Add check_effective_target_omp_managedmem.
	* testsuite/libgomp.c++/alloc-managed-1.C: New test.
	* testsuite/libgomp.c/alloc-managed-1.c: New test.
	* testsuite/libgomp.c/alloc-managed-2.c: New test.
	* testsuite/libgomp.c/alloc-managed-3.c: New test.
	* testsuite/libgomp.c/alloc-managed-4.c: New test.
	* testsuite/libgomp.fortran/alloc-managed-1.f90: New test.

Co-authored-by: Kwok Cheung Yeung <kcyeung@baylibre.com>
Co-authored-by: Thomas Schwinge <tschwinge@baylibre.com>
2025-11-13 14:16:09 +00:00
Tobias Burnus
5da963d988 OpenMP: Add omp_default_device named constant [PR119677]
OpenMP TR 14 (OpenMP 6.1) adds omp_default_device < -1 as
named constant alongside omp_initial_device and omp_default_device.

GCC supports it already internally via GOMP_DEVICE_DEFAULT_OMP_61,
but this patch now adds the omp_default_device enum/PARAMETER to
omp.h / omp_lib.

Note that PR119677 requests some cleanups, which still have to be
done.

	PR libgomp/119677

gcc/fortran/ChangeLog:

	* intrinsic.texi (OpenMP Modules): Add omp_default_device.
	* openmp.cc (gfc_resolve_omp_context_selector): Accept
	omp_default_device as conforming device number.

libgomp/ChangeLog:

	* omp.h.in (omp_default_device): New enum value.
	* omp_lib.f90.in: New parameter.
	* omp_lib.h.in: Likewise
	* target.c (gomp_get_default_device): New. Split off from ...
	(resolve_device): ... here; call it.
	(omp_target_alloc, omp_target_free, omp_target_is_present,
	omp_target_memcpy_check, omp_target_memset, omp_target_memset_async,
	omp_target_associate_ptr, omp_get_mapped_ptr,
	omp_target_is_accessible, omp_pause_resource,
	omp_get_uid_from_device): Handle omp_default_device.
	* testsuite/libgomp.c/device_uid.c: Likewise.
	* testsuite/libgomp.fortran/device_uid.f90: Likewise.
	* testsuite/libgomp.c-c++-common/omp-default-device.c: New test.
	* testsuite/libgomp.fortran/omp-default-device.f90: New test.
2025-11-12 10:18:18 +01:00
Andrew Stubbs
3b8d9d579c libgomp, nvptx: Cuda pinned memory
Use Cuda to pin memory, instead of Linux mlock, when available.

There are two advantages: firstly, this gives a significant speed boost for
NVPTX offloading, and secondly, it side-steps the usual OS ulimit/rlimit
setting.

The design adds a device independent plugin API for allocating pinned memory,
and then implements it for NVPTX.  At present, the other supported devices do
not have equivalent capabilities (or requirements).

libgomp/ChangeLog:

	* config/linux/allocator.c: Include assert.h.
	(using_device_for_page_locked): New variable.
	(linux_memspace_alloc): Add init0 parameter. Support device pinning.
	(linux_memspace_calloc): Set init0 to true.
	(linux_memspace_free): Support device pinning.
	(linux_memspace_realloc): Support device pinning.
	(MEMSPACE_ALLOC): Set init0 to false.
	* libgomp-plugin.h
	(GOMP_OFFLOAD_page_locked_host_alloc): New prototype.
	(GOMP_OFFLOAD_page_locked_host_free): Likewise.
	* libgomp.h (gomp_page_locked_host_alloc): Likewise.
	(gomp_page_locked_host_free): Likewise.
	(struct gomp_device_descr): Add page_locked_host_alloc_func and
	page_locked_host_free_func.
	* libgomp.texi: Adjust the docs for the pinned trait.
	* plugin/plugin-nvptx.c
	(GOMP_OFFLOAD_page_locked_host_alloc): New function.
	(GOMP_OFFLOAD_page_locked_host_free): Likewise.
	* target.c (device_for_page_locked): New variable.
	(get_device_for_page_locked): New function.
	(gomp_page_locked_host_alloc): Likewise.
	(gomp_page_locked_host_free): Likewise.
	(gomp_load_plugin_for_device): Add page_locked_host_alloc and
	page_locked_host_free.
	* testsuite/libgomp.c/alloc-pinned-1.c: Change expectations for NVPTX
	devices.
	* testsuite/libgomp.c/alloc-pinned-2.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-3.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-4.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-5.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-6.c: Likewise.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2025-10-23 11:08:06 +00:00
Kwok Cheung Yeung
87262627fd openmp: Add support for iterators in 'target update' clauses (C/C++)
This adds support for iterators in 'to' and 'from' clauses in the
'target update' OpenMP directive.

gcc/c/

	* c-parser.cc (c_parser_omp_clause_from_to): Parse 'iterator' modifier.
	* c-typeck.cc (c_finish_omp_clauses): Finish iterators for to/from
	clauses.

gcc/cp/

	* parser.cc (cp_parser_omp_clause_from_to): Parse 'iterator' modifier.
	* semantics.cc (finish_omp_clauses): Finish iterators for to/from
	clauses.

gcc/

	* gimplify.cc (remove_unused_omp_iterator_vars): Display unused
	variable warning for 'to' and 'from' clauses.
	(gimplify_scan_omp_clauses): Add argument for iterator loop sequence.
	Gimplify the clause decl and size into the iterator loop if iterators
	are used.
	(gimplify_omp_workshare): Add argument for iterator loops sequence
	in call to gimplify_scan_omp_clauses.
	(gimplify_omp_target_update): Call remove_unused_omp_iterator_vars and
	build_omp_iterators_loops.  Add loop sequence as argument when calling
	gimplify_scan_omp_clauses, gimplify_adjust_omp_clauses and building
	the Gimple statement.
	* tree-pretty-print.cc (dump_omp_clause): Call dump_omp_iterators
	for to/from clauses with iterators.
	* tree.cc (omp_clause_num_ops): Add extra operand for OMP_CLAUSE_FROM
	and OMP_CLAUSE_TO.
	* tree.h (OMP_CLAUSE_HAS_ITERATORS): Add check for OMP_CLAUSE_TO and
	OMP_CLAUSE_FROM.
	(OMP_CLAUSE_ITERATORS): Likewise.

gcc/testsuite/

	* c-c++-common/gomp/target-update-iterators-1.c: New.
	* c-c++-common/gomp/target-update-iterators-2.c: New.
	* c-c++-common/gomp/target-update-iterators-3.c: New.

libgomp/

	* target.c (gomp_update): Call gomp_merge_iterator_maps.  Free
	allocated variables.
	* testsuite/libgomp.c-c++-common/target-update-iterators-1.c: New.
	* testsuite/libgomp.c-c++-common/target-update-iterators-2.c: New.
	* testsuite/libgomp.c-c++-common/target-update-iterators-3.c: New.
2025-08-06 01:37:10 +01:00
Kwok Cheung Yeung
8b8b0eada6 openmp: Add support for iterators in map clauses (C/C++)
This adds preliminary support for iterators in map clauses within OpenMP
'target' constructs (which includes constructs such as 'target enter data').

Iterators with non-constant loop bounds are not currently supported.

gcc/c/

	* c-parser.cc (c_parser_omp_variable_list): Use location of the
	map expression as the clause location.
	(c_parser_omp_clause_map): Parse 'iterator' modifier.
	* c-typeck.cc (c_finish_omp_clauses): Finish iterators.  Apply
	iterators to generated clauses.

gcc/cp/

	* parser.cc (cp_parser_omp_clause_map): Parse 'iterator' modifier.
	* semantics.cc (finish_omp_clauses): Finish iterators.  Apply
	iterators to generated clauses.

gcc/

	* gimple-pretty-print.cc (dump_gimple_omp_target): Print expanded
	iterator loops.
	* gimple.cc (gimple_build_omp_target): Add argument for iterator
	loops sequence.  Initialize iterator loops field.
	* gimple.def (GIMPLE_OMP_TARGET): Set GSS symbol to GSS_OMP_TARGET.
	* gimple.h (gomp_target): Set GSS symbol to GSS_OMP_TARGET.  Add extra
	field for iterator loops.
	(gimple_build_omp_target): Add argument for iterator loops sequence.
	(gimple_omp_target_iterator_loops): New.
	(gimple_omp_target_iterator_loops_ptr): New.
	(gimple_omp_target_set_iterator_loops): New.
	* gimplify.cc (find_var_decl): New.
	(copy_omp_iterator): New.
	(remap_omp_iterator_var_1): New.
	(remap_omp_iterator_var): New.
	(remove_unused_omp_iterator_vars): New.
	(struct iterator_loop_info_t): New type.
	(iterator_loop_info_map_t): New type.
	(build_omp_iterators_loops): New.
	(enter_omp_iterator_loop_context_1): New.
	(enter_omp_iterator_loop_context): New.
	(enter_omp_iterator_loop_context): New.
	(exit_omp_iterator_loop_context): New.
	(gimplify_adjust_omp_clauses): Add argument for iterator loop
	sequence.  Gimplify the clause decl and size into the iterator
	loop if iterators are used.
	(gimplify_omp_workshare): Call remove_unused_omp_iterator_vars and
	build_omp_iterators_loops for OpenMP target expressions.  Add
	loop sequence as argument when calling gimplify_adjust_omp_clauses
	and building the Gimple statement.
	* gimplify.h (enter_omp_iterator_loop_context): New prototype.
	(exit_omp_iterator_loop_context): New prototype.
	* gsstruct.def (GSS_OMP_TARGET): New.
	* omp-low.cc (lower_omp_map_iterator_expr): New.
	(lower_omp_map_iterator_size): New.
	(finish_omp_map_iterators): New.
	(lower_omp_target): Add sorry if iterators used with deep mapping.
	Call lower_omp_map_iterator_expr before assigning to sender ref.
	Call lower_omp_map_iterator_size before setting the size.  Insert
	iterator loop sequence before the statements for the target clause.
	* tree-nested.cc (convert_nonlocal_reference_stmt): Walk the iterator
	loop sequence of OpenMP target statements.
	(convert_local_reference_stmt): Likewise.
	(convert_tramp_reference_stmt): Likewise.
	* tree-pretty-print.cc (dump_omp_iterators): Dump extra iterator
	information if present.
	(dump_omp_clause): Call dump_omp_iterators for iterators in map
	clauses.
	* tree.cc (omp_clause_num_ops): Add operand for OMP_CLAUSE_MAP.
	(walk_tree_1): Do not walk last operand of OMP_CLAUSE_MAP.
	* tree.h (OMP_CLAUSE_HAS_ITERATORS): New.
	(OMP_CLAUSE_ITERATORS): New.

gcc/testsuite/

	* c-c++-common/gomp/map-6.c (foo): Amend expected error message.
	* c-c++-common/gomp/target-map-iterators-1.c: New.
	* c-c++-common/gomp/target-map-iterators-2.c: New.
	* c-c++-common/gomp/target-map-iterators-3.c: New.
	* c-c++-common/gomp/target-map-iterators-4.c: New.

libgomp/

	* target.c (kind_to_name): New.
	(gomp_merge_iterator_maps): New.
	(gomp_map_vars_internal): Call gomp_merge_iterator_maps.  Copy
	address of only the first iteration to target vars.  Free allocated
	variables.
	* testsuite/libgomp.c-c++-common/target-map-iterators-1.c: New.
	* testsuite/libgomp.c-c++-common/target-map-iterators-2.c: New.
	* testsuite/libgomp.c-c++-common/target-map-iterators-3.c: New.

Co-authored-by: Andrew Stubbs <ams@baylibre.com>
2025-08-06 01:37:10 +01:00
Tobias Burnus
8a759dbb69 libgomp/target.c: Fix buffer size for 'omp requires' diagnostic
One of the buffers that printed the list of set 'omp requires'
requirements missed the 'self' clause addition, being potentially
to short when all device-affecting clauses were passed. Solved it
by moving the sizeof(<string of all permitted values>" into a new
'#define' just above the associated gomp_requires_to_name function.

libgomp/ChangeLog:

	* target.c (GOMP_REQUIRES_NAME_BUF_LEN): Define.
	(GOMP_offload_register_ver, gomp_target_init): Use it for the
	char buffer size.
2025-06-19 21:16:42 +02:00
Tobias Burnus
4e47e2f833 libgomp: Add OpenMP's omp_target_memset/omp_target_memset_async
PR libgomp/120444

include/ChangeLog:

	* cuda/cuda.h (cuMemsetD8, cuMemsetD8Async): Declare.

libgomp/ChangeLog:

	* libgomp-plugin.h (GOMP_OFFLOAD_memset): Declare.
	* libgomp.h (struct gomp_device_descr): Add memset_func.
	* libgomp.map (GOMP_6.0.1): Add omp_target_memset{,_async}.
	* libgomp.texi (Device Memory Routines): Document them.
	* omp.h.in (omp_target_memset, omp_target_memset_async): Declare.
	* omp_lib.f90.in (omp_target_memset, omp_target_memset_async):
	Add interfaces.
	* omp_lib.h.in (omp_target_memset, omp_target_memset_async): Likewise.
	* plugin/cuda-lib.def: Add cuMemsetD8.
	* plugin/plugin-gcn.c (struct hsa_runtime_fn_info): Add
	hsa_amd_memory_fill_fn.
	(init_hsa_runtime_functions): DLSYM_OPT_FN load it.
	(GOMP_OFFLOAD_memset): New.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_memset): New.
	* target.c (omp_target_memset_int, omp_target_memset,
	omp_target_memset_async_helper, omp_target_memset_async): New.
	(gomp_load_plugin_for_device): Add DLSYM (memset).
	* testsuite/libgomp.c-c++-common/omp_target_memset.c: New test.
	* testsuite/libgomp.c-c++-common/omp_target_memset-2.c: New test.
	* testsuite/libgomp.c-c++-common/omp_target_memset-3.c: New test.
	* testsuite/libgomp.fortran/omp_target_memset.f90: New test.
	* testsuite/libgomp.fortran/omp_target_memset-2.f90: New test.
2025-06-02 17:43:57 +02:00
Tobias Burnus
f4aa6b5a8d libgomp: Add OpenACC's acc_memcpy_device{,_async} routines [PR93226]
libgomp/ChangeLog:

	PR libgomp/93226
	* libgomp-plugin.h (GOMP_OFFLOAD_openacc_async_dev2dev): New
	prototype.
	* libgomp.h (struct acc_dispatch_t): Add dev2dev_func.
	(gomp_copy_dev2dev): New prototype.
	* libgomp.map (OACC_2.6.1): New; add acc_memcpy_device{,_async}.
	* libgomp.texi (acc_memcpy_device): New.
	* oacc-mem.c (memcpy_tofrom_device): Change to take from/to
	device boolean; use memcpy not memmove; add early return if
	size == 0 or same device + same ptr.
	(acc_memcpy_to_device, acc_memcpy_to_device_async,
	acc_memcpy_from_device, acc_memcpy_from_device_async): Update.
	(acc_memcpy_device, acc_memcpy_device_async): New.
	* openacc.f90 (acc_memcpy_device, acc_memcpy_device_async):
	Add interface.
	* openacc_lib.h (acc_memcpy_device, acc_memcpy_device_async):
	Likewise.
	* openacc.h (acc_memcpy_device, acc_memcpy_device_async): Add
	prototype.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_openacc_async_host2dev):
	Update comment.
	(GOMP_OFFLOAD_openacc_async_dev2host): Update call.
	(GOMP_OFFLOAD_openacc_async_dev2dev): New.
	* plugin/plugin-nvptx.c (cuda_memcpy_dev_sanity_check): New.
	(GOMP_OFFLOAD_dev2dev): Call it.
	(GOMP_OFFLOAD_openacc_async_dev2dev): New.
	* target.c (gomp_copy_dev2dev): New.
	(gomp_load_plugin_for_device): Load dev2dev and async_dev2dev.
	* testsuite/libgomp.oacc-c-c++-common/acc_memcpy_device-1.c: New test.
	* testsuite/libgomp.oacc-fortran/acc_memcpy_device-1.f90: New test.
2025-05-29 22:47:06 +02:00
Tobias Burnus
814e29e390 OpenMP: Fix mapping of zero-sized arrays with non-literal size: map(var[:n]), n = 0
For map(ptr[:0]), the used map kind is GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION
and it is permitted that 'ptr' does not exist. 'ptr' is set to the device
pointee if it exists or to the host value otherwise.

For map(ptr[:3]), the variable is first mapped and then ptr is updated to point
to the just-mapped device data; the attachment uses GOMP_MAP_ATTACH.

For map(ptr[:n]), generates always a GOMP_MAP_ATTACH, but when n == 0, it
was failing with:
   "pointer target not mapped for attach"

The solution is not to fail but first to check whether it was mapped before.
It turned out that for the mapping part, GCC adds a run-time check whether
n == 0 - and uses GOMP_MAP_ZERO_LEN_ARRAY_SECTION for the mapping.
Thus, we just have to check whether there such a mapping for the address
for which the GOMP_MAP_ATTACH. was requested. And, if there was, the
error diagnostic can be skipped.

Unsurprisingly, this issue occurs in real-world code; it was detected in
a code that distributes work via MPI and for some processes, some bounds
ended up to be zero.

libgomp/ChangeLog:

	* target.c (gomp_attach_pointer): Return bool; accept additional
	bool to optionally silence the fatal pointee-not-found error.
	(gomp_map_vars_internal): If the pointee could not be found,
	check whether it was mapped as GOMP_MAP_ZERO_LEN_ARRAY_SECTION.
	* libgomp.h (gomp_attach_pointer): Update prototype.
	* oacc-mem.c (acc_attach_async, goacc_enter_data_internal): Update
	calls.
	* testsuite/libgomp.c/target-map-zero-sized.c: New test.
	* testsuite/libgomp.c/target-map-zero-sized-2.c: New test.
	* testsuite/libgomp.c/target-map-zero-sized-3.c: New test.
2025-05-14 20:06:49 +02:00
Tobias Burnus
4d5d1a7326 libgomp: Save OpenMP device number when initializing the interop object
The interop object (opaque object to the user, used internally in libgomp)
already had a 'device_num' member, but it was missed to actually set it.

libgomp/ChangeLog:

	* target.c (gomp_interop_internal): Set the 'device_num' member
	when initializing an interop object.
2025-03-24 19:52:10 +01:00
Paul-Antoine Arras
99e2906ae2 OpenMP: 'interop' construct - add ME support + target-independent libgomp
This patch partially enables use of the OpenMP interop construct by adding
middle end support, mostly in the omplower pass, and in the target-independent
part of the libgomp runtime. It follows up on previous patches for C, C++ and
Fortran front ends support. The full interop feature requires another patch to
enable foreign runtime support in libgomp plugins.

gcc/ChangeLog:

	* builtin-types.def
	(BT_FN_VOID_INT_INT_PTR_PTR_PTR_INT_PTR_INT_PTR_UINT_PTR): New.
	* gimple-low.cc (lower_stmt): Handle GIMPLE_OMP_INTEROP.
	* gimple-pretty-print.cc (dump_gimple_omp_interop): New function.
	(pp_gimple_stmt_1): Handle GIMPLE_OMP_INTEROP.
	* gimple.cc (gimple_build_omp_interop): New function.
	(gimple_copy): Handle GIMPLE_OMP_INTEROP.
	* gimple.def (GIMPLE_OMP_INTEROP): Define.
	* gimple.h (gimple_build_omp_interop): Declare.
	(gimple_omp_interop_clauses): New function.
	(gimple_omp_interop_clauses_ptr): Likewise.
	(gimple_omp_interop_set_clauses): Likewise.
	(gimple_return_set_retval): Handle GIMPLE_OMP_INTEROP.
	* gimplify.cc (gimplify_scan_omp_clauses): Handle OMP_CLAUSE_INIT,
	OMP_CLAUSE_USE and OMP_CLAUSE_DESTROY.
	(gimplify_omp_interop): New function.
	(gimplify_expr): Replace sorry with call to gimplify_omp_interop.
	* omp-builtins.def (BUILT_IN_GOMP_INTEROP): Define.
	* omp-low.cc (scan_sharing_clauses): Handle OMP_CLAUSE_INIT,
	OMP_CLAUSE_USE and OMP_CLAUSE_DESTROY.
	(scan_omp_1_stmt): Handle GIMPLE_OMP_INTEROP.
	(lower_omp_interop_action_clauses): New function.
	(lower_omp_interop): Likewise.
	(lower_omp_1): Handle GIMPLE_OMP_INTEROP.

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_omp_clause_destroy): Make addressable.
	(c_parser_omp_clause_init): Make addressable.

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_omp_clause_init): Make addressable.

gcc/fortran/ChangeLog:

	* trans-openmp.cc (gfc_trans_omp_clauses): Make OMP_CLAUSE_DESTROY and
	OMP_CLAUSE_INIT addressable.
	* types.def (BT_FN_VOID_INT_INT_PTR_PTR_PTR_INT_PTR_INT_PTR_UINT_PTR):
	New.

include/ChangeLog:

	* gomp-constants.h (GOMP_DEVICE_DEFAULT_OMP_61, GOMP_INTEROP_TARGET,
	GOMP_INTEROP_TARGETSYNC, GOMP_INTEROP_FLAG_NOWAIT): Define.

libgomp/ChangeLog:

	* icv-device.c (omp_set_default_device): Check
	GOMP_DEVICE_DEFAULT_OMP_61.
	* libgomp-plugin.h (struct interop_obj_t): New.
	(enum gomp_interop_flag): New.
	(GOMP_OFFLOAD_interop): Declare.
	(GOMP_OFFLOAD_get_interop_int): Declare.
	(GOMP_OFFLOAD_get_interop_ptr): Declare.
	(GOMP_OFFLOAD_get_interop_str): Declare.
	(GOMP_OFFLOAD_get_interop_type_desc): Declare.
	* libgomp.h (_LIBGOMP_OMP_LOCK_DEFINED): Define.
	(struct gomp_device_descr): Add interop_func, get_interop_int_func,
	get_interop_ptr_func, get_interop_str_func, get_interop_type_desc_func.
	* libgomp.map: Add GOMP_interop.
	* libgomp_g.h (GOMP_interop): Declare.
	* target.c (resolve_device): Handle GOMP_DEVICE_DEFAULT_OMP_61.
	(omp_get_interop_int): Replace stub with actual implementation.
	(omp_get_interop_ptr): Likewise.
	(omp_get_interop_str): Likewise.
	(omp_get_interop_type_desc): Likewise.
	(struct interop_data_t): Define.
	(gomp_interop_internal): New function.
	(GOMP_interop): Likewise.
	(gomp_load_plugin_for_device): Load symbols for get_interop_int,
	get_interop_ptr, get_interop_str and get_interop_type_desc.
	* testsuite/libgomp.c-c++-common/interop-1.c: New test.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/interop-1.c: Remove dg-prune-output "sorry".
	* c-c++-common/gomp/interop-2.c: Likewise.
	* c-c++-common/gomp/interop-3.c: Likewise.
	* c-c++-common/gomp/interop-4.c: Remove dg-message "not supported".
	* g++.dg/gomp/interop-5.C: Likewise.
	* gfortran.dg/gomp/interop-4.f90: Likewise.
	* c-c++-common/gomp/interop-5.c: New test.
	* gfortran.dg/gomp/interop-5.f90: New test.

Co-authored-by: Tobias Burnus <tburnus@baylibre.com>
2025-03-21 19:24:16 +01:00
shynur
e759ff08e2 libgomp: Add '__attribute__((unused))' to variables used only in 'assert(...)'
Without this attribute, the building process will fail if GCC is configured
with 'CFLAGS=-DNDEBUG'.

libgomp/ChangeLog:

	* oacc-mem.c (acc_unmap_data, goacc_exit_datum_1, find_group_last,
	goacc_enter_data_internal): Add '__attribute__((unused))'.
	* target.c (gomp_unmap_vars_internal): Likewise.
2025-02-22 22:37:50 +01:00
Jakub Jelinek
6441eb6dc0 Update copyright years. 2025-01-02 11:59:57 +01:00
Tobias Burnus
4cb20dc043 libgomp: with USM, init 'link' variables with host address
If requires unified_shared_memory or self_maps is set, make
'declare target link' variables to point initially to the host pointer.

libgomp/ChangeLog:

	* target.c (gomp_load_image_to_device): For requires
	unified_shared_memory, update 'link' vars to point to the host var.
	* testsuite/libgomp.c-c++-common/target-link-3.c: New test.
	* testsuite/libgomp.c-c++-common/target-link-4.c: New test.
2024-09-24 17:41:39 +02:00
Tobias Burnus
b752eed3e3 OpenMP: Add support for 'self_maps' to the 'require' directive
'self_maps' implies 'unified_shared_memory', except that the latter
also permits that explicit maps copy data to device memory while
self_maps does not. In GCC, currently, both are handled identical.

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_omp_requires): Handle self_maps clause.

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_omp_requires): Handle self_maps clause.

gcc/fortran/ChangeLog:

	* gfortran.h (enum gfc_omp_requires_kind): Add OMP_REQ_SELF_MAPS.
	(gfc_namespace): Enlarge omp_requires bitfield.
	* module.cc (enum ab_attribute, attr_bits): Add AB_OMP_REQ_SELF_MAPS.
	(mio_symbol_attribute): Handle it.
	* openmp.cc (gfc_check_omp_requires, gfc_match_omp_requires): Handle
	self_maps clause.
	* parse.cc (gfc_parse_file): Handle self_maps clause.

gcc/ChangeLog:

	* lto-cgraph.cc (output_offload_tables, omp_requires_to_name): Handle
	self_maps clause.
	* omp-general.cc (struct omp_ts_info, omp_context_selector_matches):
	Likewise for the associated trait.
	* omp-general.h (enum omp_requires): Add OMP_REQUIRES_SELF_MAPS.
	* omp-selectors.h (enum omp_ts_code): Add
	OMP_TRAIT_IMPLEMENTATION_SELF_MAPS.

include/ChangeLog:

	* gomp-constants.h (GOMP_REQUIRES_SELF_MAPS): #define.

libgomp/ChangeLog:

	* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_num_devices):
	Accept self_maps clause.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices):
	Likewise.
	* libgomp.texi (TR13 Impl. Status): Set to 'Y'.
	* target.c (gomp_requires_to_name, GOMP_offload_register_ver,
	gomp_target_init): Handle self_maps clause.
	* testsuite/libgomp.fortran/self_maps.f90: New test.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/declare-variant-1.c: Add self_maps test.
	* c-c++-common/gomp/requires-4.c: Likewise.
	* gfortran.dg/gomp/declare-variant-3.f90:  Likewise.
	* c-c++-common/gomp/requires-2.c: Update dg-error msg.
	* gfortran.dg/gomp/requires-2.f90: Likewise.
	* gfortran.dg/gomp/requires-self-maps-aux.f90: New.
	* gfortran.dg/gomp/requires-self-maps.f90: New.
2024-09-24 10:53:59 +02:00
Tobias Burnus
cdb9aa0f62 OpenMP: Fix omp_get_device_from_uid, minor cleanup
In Fortran, omp_get_device_from_uid can also accept substrings, which are
then not NUL terminated.  Fixed by introducing a fortran.c wrapper function.
Additionally, in case of a fail the plugin functions now return NULL instead
of failing fatally such that a fall-back UID is generated.

gcc/ChangeLog:

	* omp-general.cc (omp_runtime_api_procname): Strip "omp_" from
	string; move get_device_from_uid as now a '_' suffix exists.

libgomp/ChangeLog:

	* fortran.c (omp_get_device_from_uid_): New function.
	* libgomp.map (GOMP_6.0): Add it.
	* oacc-host.c (host_dispatch): Init '.uid' and '.get_uid_func'.
	* omp_lib.f90.in: Make it used by removing bind(C).
	* omp_lib.h.in: Likewise.
	* target.c (omp_get_device_from_uid): Ensure the device is initialized.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_uid): Add function comment;
	return NULL in case of an error.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_uid): Likewise.
	* testsuite/libgomp.fortran/device_uid.f90: Update to test substrings.
2024-09-23 15:58:39 +02:00
Tobias Burnus
bf4a5efa80 OpenMP: Add get_device_from_uid/omp_get_uid_from_device routines
Those TR13/OpenMP 6.0 routines permit a reproducible offloading to
a specific device by mapping an OpenMP device number to a
unique ID (UID). The GPU device UIDs should be universally unique,
the one for the host is not.

gcc/ChangeLog:

	* omp-general.cc (omp_runtime_api_procname): Add
	get_device_from_uid and omp_get_uid_from_device routines.

include/ChangeLog:

	* cuda/cuda.h (cuDeviceGetUuid): Declare.
	(cuDeviceGetUuid_v2): Add prototype.

libgomp/ChangeLog:

	* config/gcn/target.c (omp_get_uid_from_device,
	omp_get_device_from_uid): Add stub implementation.
	* config/nvptx/target.c (omp_get_uid_from_device,
	omp_get_device_from_uid): Likewise.
	* fortran.c (omp_get_uid_from_device_,
	omp_get_uid_from_device_8_): New functions.
	* libgomp-plugin.h (GOMP_OFFLOAD_get_uid): Add prototype.
	* libgomp.h (struct gomp_device_descr): Add 'uid' and 'get_uid_func'.
	* libgomp.map (GOMP_6.0): New, includind the new UID routines.
	* libgomp.texi (OpenMP Technical Report 13): Mark UID routines as 'Y'.
	(Device Information Routines): Document new UID routines.
	(Offload-Target Specifics): Document UID format.
	* omp.h.in (omp_get_device_from_uid, omp_get_uid_from_device):
	New prototype.
	* omp_lib.f90.in (omp_get_device_from_uid, omp_get_uid_from_device):
	New interface.
	* omp_lib.h.in: Likewise.
	* plugin/cuda-lib.def: Add cuDeviceGetUuid and cuDeviceGetUuid_v2 via
	CUDA_ONE_CALL_MAYBE_NULL.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_uid): New.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_uid): New.
	* target.c (str_omp_initial_device): New static var.
	(STR_OMP_DEV_PREFIX): Define.
	(gomp_get_uid_for_device, omp_get_uid_from_device,
	omp_get_device_from_uid): New.
	(gomp_load_plugin_for_device): DLSYM_OPT the function 'get_uid'.
	(gomp_target_init): Set the device's 'uid' field to NULL.
	* testsuite/libgomp.c/device_uid.c: New test.
	* testsuite/libgomp.fortran/device_uid.f90: New test.
2024-09-20 09:25:33 +02:00
Tobias Burnus
0beac1db38 libgomp: Add interop types and routines to OpenMP's headers and module
This commit adds OpenMP 5.1+'s interop enumeration, type and routine
declarations to the C/C++ header file and, new in OpenMP TR13, also to
the Fortran module and omp_lib.h header file.

While a stub implementation is provided, only with foreign runtime
support by the libgomp GPU plugins and with the 'interop' directive,
this becomes really useful.

libgomp/ChangeLog:

	* fortran.c (omp_get_interop_str_, omp_get_interop_name_,
	omp_get_interop_type_desc_, omp_get_interop_rc_desc_): Add.
	* libgomp.map (GOMP_5.1.3): New; add interop routines.
	* omp.h.in: Add interop typedefs, enum and prototypes.
	(__GOMP_DEFAULT_NULL): Define.
	(omp_target_memcpy_async, omp_target_memcpy_rect_async):
	Use it for the optional depend argument.
	* omp_lib.f90.in: Add paramters and interfaces for interop.
	* omp_lib.h.in: Likewise; move F90 '&' to column 81 for
	-ffree-length-80.
	* target.c (omp_get_num_interop_properties, omp_get_interop_int,
	omp_get_interop_ptr, omp_get_interop_str, omp_get_interop_name,
	omp_get_interop_type_desc, omp_get_interop_rc_desc): Add.
	* config/gcn/target.c (omp_get_num_interop_properties,
	omp_get_interop_int, omp_get_interop_ptr, omp_get_interop_str,
	omp_get_interop_name, omp_get_interop_type_desc,
	omp_get_interop_rc_desc): Add.
	* config/nvptx/target.c (omp_get_num_interop_properties,
	omp_get_interop_int, omp_get_interop_ptr, omp_get_interop_str,
	omp_get_interop_name, omp_get_interop_type_desc,
	omp_get_interop_rc_desc): Add.
	* testsuite/libgomp.c-c++-common/interop-routines-1.c: New test.
	* testsuite/libgomp.c-c++-common/interop-routines-2.c: New test.
	* testsuite/libgomp.fortran/interop-routines-1.F90: New test.
	* testsuite/libgomp.fortran/interop-routines-2.F90: New test.
	* testsuite/libgomp.fortran/interop-routines-3.F: New test.
	* testsuite/libgomp.fortran/interop-routines-4.F: New test.
	* testsuite/libgomp.fortran/interop-routines-5.F: New test.
	* testsuite/libgomp.fortran/interop-routines-6.F: New test.
	* testsuite/libgomp.fortran/interop-routines-7.F90: New test.
2024-08-28 11:50:43 +02:00
Tobias Burnus
0c56fd6a1f libgomp: Device load_image - improve minor num-funcs/vars check
The run time library loads the offload functions and variable and optionally
the ICV variable and returns the number of loaded items, which has to match
the host side. The plugin returns "+1" (since GCC 12) for the ICV variable
entry, independently whether it was loaded or not, but the var's value
(start == end == 0) can be used to detect when this failed.

Thus, we can tighten the assert check - which this commit does together with
making the output less surprising - and simplify the condition further below.

libgomp/ChangeLog:

	* target.c (gomp_load_image_to_device): Extend fatal-error message;
	simplify a condition.
2024-08-06 10:34:28 +02:00
Tobias Burnus
14c47e7eb0 libgomp: Fix declare target link with offset array-section mapping [PR116107]
Assume that 'int var[100]' is 'omp declare target link(var)'. When now
mapping an array section with offset such as 'map(to:var[20:10])',
the device-side link pointer has to store &<device-storage-data>[0] minus
the offset such that var[20] will access <device-storage-data>[0]. But
the offset calculation was missed such that the device-side 'var' pointed
to the first element of the mapped data - and var[20] points beyond at
some invalid memory.

	PR middle-end/116107

libgomp/ChangeLog:

	* target.c (gomp_map_vars_internal): Honor array mapping offsets
	with declare-target 'link' variables.
	* testsuite/libgomp.c-c++-common/target-link-2.c: New test.
2024-07-29 11:40:38 +02:00
Thomas Schwinge
a95c1911d8 libgomp: Document 'GOMP_teams4'
For reference:

  - <https://inbox.sourceware.org/20211111190313.GV2710@tucnak> "[PATCH] openmp: Honor OpenMP 5.1 num_teams lower bound"
  - <https://inbox.sourceware.org/20211112132023.GC2710@tucnak> "[PATCH] libgomp, nvptx: Honor OpenMP 5.1 num_teams lower bound"

	libgomp/
	* config/gcn/target.c (GOMP_teams4): Document.
	* config/nvptx/target.c (GOMP_teams4): Likewise.
	* target.c (GOMP_teams4): Likewise.
2024-07-19 21:55:47 +02:00
Tobias Burnus
4ccb3366ad libgomp: Enable USM for some nvptx devices
A few high-end nvptx devices support the attribute
CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS; for those, unified shared
memory is supported in hardware. This patch enables support for those -
if all installed nvptx devices have this feature (as the capabilities
are per device type).

This exposes a bug in gomp_copy_back_icvs as it did before use
omp_get_mapped_ptr to find mapped variables, but that returns
the unchanged pointer in cased of shared memory. But in this case,
we have a few actually mapped pointers - like the ICV variables.
Additionally, there was a mismatch with regards to '-1' for the
device number as gomp_copy_back_icvs and omp_get_mapped_ptr count
differently. Hence, do the lookup manually.

include/ChangeLog:

	* cuda/cuda.h (CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS): Add.

libgomp/ChangeLog:

	* libgomp.texi (nvptx): Update USM description.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices):
	Claim support when requesting USM and all devices support
	CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS.
	* target.c (gomp_copy_back_icvs): Fix device ptr lookup.
	(gomp_target_init): Set GOMP_OFFLOAD_CAP_SHARED_MEM is the
	devices supports USM.
2024-05-29 15:14:38 +02:00
Chung-Lin Tang
a7578a077e OpenACC 2.7: Adjust acc_map_data/acc_unmap_data interaction with reference counters
This patch adjusts the implementation of acc_map_data/acc_unmap_data API library
routines to more fit the description in the OpenACC 2.7 specification.

Instead of using REFCOUNT_INFINITY, we now define a REFCOUNT_ACC_MAP_DATA
special value to mark acc_map_data-created mappings. Adjustment around
mapping related code to respect OpenACC semantics are also added.

libgomp/ChangeLog:

	* libgomp.h (REFCOUNT_ACC_MAP_DATA): Define as (REFCOUNT_SPECIAL | 2).
	* oacc-mem.c (acc_map_data): Adjust to use REFCOUNT_ACC_MAP_DATA,
	initialize dynamic_refcount as 1.
	(acc_unmap_data): Adjust to use REFCOUNT_ACC_MAP_DATA,
	(goacc_map_var_existing): Add REFCOUNT_ACC_MAP_DATA case.
	(goacc_exit_datum_1): Add REFCOUNT_ACC_MAP_DATA case, respect
	REFCOUNT_ACC_MAP_DATA when decrementing/finalizing. Force lowest
	dynamic_refcount to be 1 for REFCOUNT_ACC_MAP_DATA.
	(goacc_enter_data_internal): Add REFCOUNT_ACC_MAP_DATA case.
	* target.c (gomp_increment_refcount): Return early for
	REFCOUNT_ACC_MAP_DATA case.
	(gomp_decrement_refcount): Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-96.c: New testcase.
	* testsuite/libgomp.oacc-c-c++-common/unmap-infinity-1.c: Adjust
	testcase error output scan test.
2024-04-16 09:04:11 +00:00
Jakub Jelinek
dea9ac2a00 libgomp: Use void (*) (void *) rather than void (*)() for host_fn type [PR114216]
For the type of the target callbacks we use elsehwere void (*) (void *) and
IMHO should use that for the reverse offload fallback as well (where the actual
callback is emitted using the same code as for host fallback or device kernel
entry routines), even when it is also ok to use void (*) () before C23 and
we aren't building libgomp with C23 yet.  On some arches perhaps void (*) ()
could result in worse code generation because calls in that case like casts
to unprototyped functions need to sometimes pass argument in two different spots
etc. so that it deals with both passing it through ... and as a named argument.

2024-03-04  Jakub Jelinek  <jakub@redhat.com>

	PR libgomp/114216
	* target.c (gomp_target_rev): Change host_fn type and corresponding
	cast from void (*)() to void (*) (void *).
2024-03-04 11:48:40 +01:00
Jakub Jelinek
a945c346f5 Update copyright years. 2024-01-03 12:19:35 +01:00
Julian Brown
f5745dc142 OpenMP/OpenACC: Unordered/non-constant component offset runtime diagnostic
This patch adds support for non-constant component offsets in "map"
clauses for OpenMP (and the equivalants for OpenACC), which are not able
to be sorted into order at compile time.  Normally struct accesses in
such clauses are gathered together and sorted into increasing address
order after a "GOMP_MAP_STRUCT" node: if we have variable indices,
that is no longer possible.

This version of the patch scales back the previously-posted version to
merely add a diagnostic for incorrect usage of component accesses with
variably-indexed arrays of structs: the only permitted variant is where
we have multiple indices that are the same, but we could not prove so
at compile time.  Rather than silently producing the wrong result for
cases where the indices are in fact different, we error out (e.g.,
"map(dtarr(i)%arrptr, dtarr(j)%arrptr(4:8))", for different i/j).

For now, multiple *constant* array indices are still supported (see
map-arrayofstruct-1.c).  That could perhaps be addressed with a follow-up
patch, if necessary.

This version of the patch renumbers the GOMP_MAP_STRUCT_UNORD kind to
avoid clashing with the OpenACC "non-contiguous" dynamic array support
(though that is not yet applied to mainline).

2023-08-18  Julian Brown  <julian@codesourcery.com>

gcc/
	* gimplify.cc (extract_base_bit_offset): Add VARIABLE_OFFSET parameter.
	(omp_get_attachment, omp_group_last, omp_group_base,
	omp_directive_maps_explicitly): Add GOMP_MAP_STRUCT_UNORD support.
	(omp_accumulate_sibling_list): Update calls to extract_base_bit_offset.
	Support GOMP_MAP_STRUCT_UNORD.
	(omp_build_struct_sibling_lists, gimplify_scan_omp_clauses,
	gimplify_adjust_omp_clauses, gimplify_omp_target_update): Add
	GOMP_MAP_STRUCT_UNORD support.
	* omp-low.cc (lower_omp_target): Add GOMP_MAP_STRUCT_UNORD support.
	* tree-pretty-print.cc (dump_omp_clause): Likewise.

include/
	* gomp-constants.h (gomp_map_kind): Add GOMP_MAP_STRUCT_UNORD.

libgomp/
	* oacc-mem.c (find_group_last, goacc_enter_data_internal,
	goacc_exit_data_internal, GOACC_enter_exit_data): Add
	GOMP_MAP_STRUCT_UNORD support.
	* target.c (gomp_map_vars_internal): Add GOMP_MAP_STRUCT_UNORD support.
	Detect incorrect use of variable indexing of arrays of structs.
	(GOMP_target_enter_exit_data, gomp_target_task_fn): Add
	GOMP_MAP_STRUCT_UNORD support.
	* testsuite/libgomp.c-c++-common/map-arrayofstruct-1.c: New test.
	* testsuite/libgomp.c-c++-common/map-arrayofstruct-2.c: New test.
	* testsuite/libgomp.c-c++-common/map-arrayofstruct-3.c: New test.
	* testsuite/libgomp.fortran/map-subarray-5.f90: New test.
2023-12-15 10:33:52 +00:00
Julian Brown
5fdb150cd4 OpenMP/OpenACC: Rework clause expansion and nested struct handling
This patch reworks clause expansion in the C, C++ and (to a lesser
extent) Fortran front ends for OpenMP and OpenACC mapping nodes used in
GPU offloading support.

At present a single clause may be turned into several mapping nodes,
or have its mapping type changed, in several places scattered through
the front- and middle-end.  The analysis relating to which particular
transformations are needed for some given expression has become quite hard
to follow.  Briefly, we manipulate clause types in the following places:

 1. During parsing, in c_omp_adjust_map_clauses.  Depending on a set of
    rules, we may change a FIRSTPRIVATE_POINTER (etc.) mapping into
    ATTACH_DETACH, or mark the decl addressable.

 2. In semantics.cc or c-typeck.cc, clauses are expanded in
    handle_omp_array_sections (called via {c_}finish_omp_clauses, or in
    finish_omp_clauses itself.  The two cases are for processing array
    sections (the former), or non-array sections (the latter).

 3. In gimplify.cc, we build sibling lists for struct accesses, which
    groups and sorts accesses along with their struct base, creating
    new ALLOC/RELEASE nodes for pointers.

 4. In gimplify.cc:gimplify_adjust_omp_clauses, mapping nodes may be
    adjusted or created.

This patch doesn't completely disrupt this scheme, though clause
types are no longer adjusted in c_omp_adjust_map_clauses (step 1).
Clause expansion in step 2 (for C and C++) now uses a single, unified
mechanism, parts of which are also reused for analysis in step 3.

Rather than the kind-of "ad-hoc" pattern matching on addresses used to
expand clauses used at present, a new method for analysing addresses is
introduced.  This does a recursive-descent tree walk on expression nodes,
and emits a vector of tokens describing each "part" of the address.
This tokenized address can then be translated directly into mapping nodes,
with the assurance that no part of the expression has been inadvertently
skipped or misinterpreted.  In this way, all the variations of ways
pointers, arrays, references and component accesses might be combined
can be teased apart into easily-understood cases - and we know we've
"parsed" the whole address before we start analysis, so the right code
paths can easily be selected.

For example, a simple access "arr[idx]" might parse as:

  base-decl access-indexed-array

or "mystruct->foo[x]" with a pointer "foo" component might parse as:

  base-decl access-pointer component-selector access-pointer

A key observation is that support for "array" bases, e.g. accesses
whose root nodes are not structures, but describe scalars or arrays,
and also *one-level deep* structure accesses, have first-class support
in gimplify and beyond.  Expressions that use deeper struct accesses
or e.g. multiple indirections were more problematic: some cases worked,
but lots of cases didn't.  This patch reimplements the support for those
in gimplify.cc, again using the new "address tokenization" support.

An expression like "mystruct->foo->bar[0:10]" used in a mapping node will
translate the right-hand access directly in the front-end.  The base for
the access will be "mystruct->foo".  This is handled recursively in
gimplify.cc -- there may be several accesses of "mystruct"'s members
on the same directive, so the sibling-list building machinery can be
used again.  (This was already being done for OpenACC, but the new
implementation differs somewhat in details, and is more robust.)

For OpenMP, in the case where the base pointer itself,
i.e. "mystruct->foo" here, is NOT mapped on the same directive, we
create a "fragile" mapping.  This turns the "foo" component access
into a zero-length allocation (which is a new feature for the runtime,
so support has been added there too).

A couple of changes have been made to how mapping clauses are turned
into mapping nodes:

The first change is based on the observation that it is probably never
correct to use GOMP_MAP_ALWAYS_POINTER for component accesses (e.g. for
references), because if the containing struct is already mapped on the
target then the host version of the pointer in question will be corrupted
if the struct is copied back from the target.  This patch removes all
such uses, across each of C, C++ and Fortran.

The second change is to the way that GOMP_MAP_ATTACH_DETACH nodes
are processed during sibling-list creation.  For OpenMP, for pointer
components, we must map the base pointer separately from an array section
that uses the base pointer, so e.g. we must have both "map(mystruct.base)"
and "map(mystruct.base[0:10])" mappings.  These create nodes such as:

  GOMP_MAP_TOFROM mystruct.base
  G_M_TOFROM *mystruct.base [len: 10*elemsize] G_M_ATTACH_DETACH mystruct.base

Instead of using the first of these directly when building the struct
sibling list then skipping the group using GOMP_MAP_ATTACH_DETACH,
leading to:

  GOMP_MAP_STRUCT mystruct [len: 1] GOMP_MAP_TOFROM mystruct.base

we now introduce a new "mini-pass", omp_resolve_clause_dependencies, that
drops the GOMP_MAP_TOFROM for the base pointer, marks the second group
as having had a base-pointer mapping, then omp_build_struct_sibling_lists
can create:

  GOMP_MAP_STRUCT mystruct [len: 1] GOMP_MAP_ALLOC mystruct.base [len: ptrsize]

This ends up working better in many cases, particularly those involving
references.  (The "alloc" space is immediately overwritten by a pointer
attachment, so this is mildly more efficient than a redundant TO mapping
at runtime also.)

There is support in the address tokenizer for "arbitrary" base expressions
which aren't rooted at a decl, but that is not used as present because
such addresses are disallowed at parse time.

In the front-ends, the address tokenization machinery is mostly only
used for clause expansion and not for diagnostics at present.  It could
be used for those too, which would allow more of my previous "address
inspector" implementation to be removed.

The new bits in gimplify.cc work with OpenACC also.

This version of the patch addresses several first-pass review comments
from Tobias, and fixes a few previously-missed cases for manually-managed
ragged array mappings (including cases using references).  Some arbitrary
differences between handling of clause expansion for C vs. C++ have also
been fixed, and some fragments from later in the patch series have been
moved forward (where they were useful for fixing bugs).  Several new
test cases have been added.

2023-11-29  Julian Brown  <julian@codesourcery.com>

gcc/c-family/
	* c-common.h (c_omp_region_type): Add C_ORT_EXIT_DATA,
	C_ORT_OMP_EXIT_DATA and C_ORT_ACC_TARGET.
	(omp_addr_token): Add forward declaration.
	(c_omp_address_inspector): New class.
	* c-omp.cc (c_omp_adjust_map_clauses): Mark decls addressable here, but
	do not change any mapping node types.
	(c_omp_address_inspector::unconverted_ref_origin,
	c_omp_address_inspector::component_access_p,
	c_omp_address_inspector::check_clause,
	c_omp_address_inspector::get_root_term,
	c_omp_address_inspector::map_supported_p,
	c_omp_address_inspector::get_origin,
	c_omp_address_inspector::maybe_unconvert_ref,
	c_omp_address_inspector::maybe_zero_length_array_section,
	c_omp_address_inspector::expand_array_base,
	c_omp_address_inspector::expand_component_selector,
	c_omp_address_inspector::expand_map_clause): New methods.
	(omp_expand_access_chain): New function.

gcc/c/
	* c-parser.cc (c_parser_oacc_all_clauses): Add TARGET_P parameter. Use
	to select region type for c_finish_omp_clauses call.
	(c_parser_oacc_loop): Update calls to c_parser_oacc_all_clauses.
	(c_parser_oacc_compute): Likewise.
	(c_parser_omp_target_data, c_parser_omp_target_enter_data): Support
	ATTACH kind.
	(c_parser_omp_target_exit_data): Support DETACH kind.
	(check_clauses): Handle GOMP_MAP_POINTER and GOMP_MAP_ATTACH here.
	* c-typeck.cc (handle_omp_array_sections_1,
	handle_omp_array_sections, c_finish_omp_clauses): Use
	c_omp_address_inspector class and OMP address tokenizer to analyze and
	expand map clause expressions.  Fix some diagnostics.  Fix "is OpenACC"
	condition for C_ORT_ACC_TARGET addition.

gcc/cp/
	* parser.cc (cp_parser_oacc_all_clauses): Add TARGET_P parameter. Use
	to select region type for finish_omp_clauses call.
	(cp_parser_omp_target_data, cp_parser_omp_target_enter_data): Support
	GOMP_MAP_ATTACH kind.
	(cp_parser_omp_target_exit_data): Support GOMP_MAP_DETACH kind.
	(cp_parser_oacc_declare): Update call to cp_parser_oacc_all_clauses.
	(cp_parser_oacc_loop): Update calls to cp_parser_oacc_all_clauses.
	(cp_parser_oacc_compute): Likewise.
	* pt.cc (tsubst_expr): Use C_ORT_ACC_TARGET for call to
	tsubst_omp_clauses for OpenACC compute regions.
	* semantics.cc (cp_omp_address_inspector): New class, derived from
	c_omp_address_inspector.
	(handle_omp_array_sections_1, handle_omp_array_sections,
	finish_omp_clauses): Use cp_omp_address_inspector class and OMP address
	tokenizer to analyze and expand OpenMP map clause expressions.  Fix
	some diagnostics.  Support C_ORT_ACC_TARGET.
	(finish_omp_target): Handle GOMP_MAP_POINTER.

gcc/fortran/
	* trans-openmp.cc (gfc_trans_omp_array_section): Add OPENMP parameter.
	Use GOMP_MAP_ATTACH_DETACH instead of GOMP_MAP_ALWAYS_POINTER for
	derived type components.
	(gfc_trans_omp_clauses): Update calls to gfc_trans_omp_array_section.

gcc/
	* gimplify.cc (build_struct_comp_nodes): Don't process
	GOMP_MAP_ATTACH_DETACH "middle" nodes here.
	(omp_mapping_group): Add REPROCESS_STRUCT and FRAGILE booleans for
	nested struct handling.
	(omp_strip_components_and_deref, omp_strip_indirections): Remove
	functions.
	(omp_get_attachment): Handle GOMP_MAP_DETACH here.
	(omp_group_last): Handle GOMP_MAP_*, GOMP_MAP_DETACH,
	GOMP_MAP_ATTACH_DETACH groups for "exit data" of reference-to-pointer
	component array sections.
	(omp_gather_mapping_groups_1): Initialise reprocess_struct and fragile
	fields.
	(omp_group_base): Handle GOMP_MAP_ATTACH_DETACH after GOMP_MAP_STRUCT.
	(omp_index_mapping_groups_1): Skip reprocess_struct groups.
	(omp_get_nonfirstprivate_group, omp_directive_maps_explicitly,
	omp_resolve_clause_dependencies, omp_first_chained_access_token): New
	functions.
	(omp_check_mapping_compatibility): Adjust accepted node combinations
	for "from" clauses using release instead of alloc.
	(omp_accumulate_sibling_list): Add GROUP_MAP, ADDR_TOKENS, FRAGILE_P,
	REPROCESSING_STRUCT, ADDED_TAIL parameters.  Use OMP address tokenizer
	to analyze addresses.  Reimplement nested struct handling, and
	implement "fragile groups".
	(omp_build_struct_sibling_lists): Adjust for changes to
	omp_accumulate_sibling_list.  Recalculate bias for ATTACH_DETACH nodes
	after GOMP_MAP_STRUCT nodes.
	(gimplify_scan_omp_clauses): Call omp_resolve_clause_dependencies.  Use
	OMP address tokenizer.
	(gimplify_adjust_omp_clauses_1): Use build_fold_indirect_ref_loc
	instead of build_simple_mem_ref_loc.
	* omp-general.cc (omp-general.h, tree-pretty-print.h): Include.
	(omp_addr_tokenizer): New namespace.
	(omp_addr_tokenizer::omp_addr_token): New.
	(omp_addr_tokenizer::omp_parse_component_selector,
	omp_addr_tokenizer::omp_parse_ref,
	omp_addr_tokenizer::omp_parse_pointer,
	omp_addr_tokenizer::omp_parse_access_method,
	omp_addr_tokenizer::omp_parse_access_methods,
	omp_addr_tokenizer::omp_parse_structure_base,
	omp_addr_tokenizer::omp_parse_structured_expr,
	omp_addr_tokenizer::omp_parse_array_expr,
	omp_addr_tokenizer::omp_access_chain_p,
	omp_addr_tokenizer::omp_accessed_addr): New functions.
	(omp_parse_expr, debug_omp_tokenized_addr): New functions.
	* omp-general.h (omp_addr_tokenizer::access_method_kinds,
	omp_addr_tokenizer::structure_base_kinds,
	omp_addr_tokenizer::token_type,
	omp_addr_tokenizer::omp_addr_token,
	omp_addr_tokenizer::omp_access_chain_p,
	omp_addr_tokenizer::omp_accessed_addr): New.
	(omp_addr_token, omp_parse_expr): New.
	* omp-low.cc (scan_sharing_clauses): Skip error check for references
	to pointers.
	* tree.h (OMP_CLAUSE_ATTACHMENT_MAPPING_ERASED): New macro.

gcc/testsuite/
	* c-c++-common/gomp/clauses-2.c: Fix error output.
	* c-c++-common/gomp/target-implicit-map-2.c: Adjust scan output.
	* c-c++-common/gomp/target-50.c: Adjust scan output.
	* c-c++-common/gomp/target-enter-data-1.c: Adjust scan output.
	* g++.dg/gomp/static-component-1.C: New test.
	* gcc.dg/gomp/target-3.c: Adjust scan output.
	* gfortran.dg/gomp/map-9.f90: Adjust scan output.

libgomp/
	* target.c (gomp_map_pointer): Modify zero-length array section
	pointer handling.
	(gomp_attach_pointer): Likewise.
	(gomp_map_fields_existing): Use gomp_map_0len_lookup.
	(gomp_attach_pointer): Allow attaching null pointers (or Fortran
	"unassociated" pointers).
	(gomp_map_vars_internal): Handle zero-sized struct members.  Add
	diagnostic for unmapped struct pointer members.
	* testsuite/libgomp.c-c++-common/baseptrs-1.c: New test.
	* testsuite/libgomp.c-c++-common/baseptrs-2.c: New test.
	* testsuite/libgomp.c-c++-common/baseptrs-6.c: New test.
	* testsuite/libgomp.c-c++-common/baseptrs-7.c: New test.
	* testsuite/libgomp.c-c++-common/ptr-attach-2.c: New test.
	* testsuite/libgomp.c-c++-common/target-implicit-map-2.c: Fix missing
	"free".
	* testsuite/libgomp.c-c++-common/target-implicit-map-5.c: New test.
	* testsuite/libgomp.c-c++-common/target-map-zlas-1.c: New test.
	* testsuite/libgomp.c++/class-array-1.C: New test.
	* testsuite/libgomp.c++/baseptrs-3.C: New test.
	* testsuite/libgomp.c++/baseptrs-4.C: New test.
	* testsuite/libgomp.c++/baseptrs-5.C: New test.
	* testsuite/libgomp.c++/baseptrs-8.C: New test.
	* testsuite/libgomp.c++/baseptrs-9.C: New test.
	* testsuite/libgomp.c++/ref-mapping-1.C: New test.
	* testsuite/libgomp.c++/target-48.C: New test.
	* testsuite/libgomp.c++/target-49.C: New test.
	* testsuite/libgomp.c++/target-exit-data-reftoptr-1.C: New test.
	* testsuite/libgomp.c++/target-lambda-1.C: Update for OpenMP 5.2
	semantics.
	* testsuite/libgomp.c++/target-this-3.C: Likewise.
	* testsuite/libgomp.c++/target-this-4.C: Likewise.
	* testsuite/libgomp.fortran/struct-elem-map-1.f90: Add temporary XFAIL.
	* testsuite/libgomp.fortran/target-enter-data-6.f90: Likewise.
2023-12-13 20:30:49 +00:00
Tobias Burnus
d4b6d14792 OpenMP/Fortran: Implement omp allocators/allocate for ptr/allocatables
This commit adds -fopenmp-allocators which enables support for
'omp allocators' and 'omp allocate' that are associated with a Fortran
allocate-stmt. If such a construct is encountered, an error is shown,
unless the -fopenmp-allocators flag is present.

With -fopenmp -fopenmp-allocators, those constructs get turned into
GOMP_alloc allocations, while -fopenmp-allocators (also without -fopenmp)
ensures deallocation and reallocation (via intrinsic assignments) are
properly directed to GOMP_free/omp_realloc - while normal Fortran
allocations are processed by free/realloc.

In order to distinguish a 'malloc'ed from a 'GOMP_alloc'ed memory, the
version field of the Fortran array discriptor is (mis)used: 0 indicates
the normal Fortran allocation while 1 denotes GOMP_alloc. For scalars,
there is record keeping in libgomp: GOMP_add_alloc(ptr) will add the
pointer address to a splay_tree while GOMP_is_alloc(ptr) will return
true it was previously added but also removes it from the list.

Besides Fortran FE work, BUILT_IN_GOMP_REALLOC is no part of
omp-builtins.def and libgomp gains the mentioned two new function.

gcc/ChangeLog:

	* builtin-types.def (BT_FN_PTR_PTR_SIZE_PTRMODE_PTRMODE): New.
	* omp-builtins.def (BUILT_IN_GOMP_REALLOC): New.
	* builtins.cc (builtin_fnspec): Handle it.
	* gimple-ssa-warn-access.cc (fndecl_alloc_p,
	matching_alloc_calls_p): Likewise.
	* gimple.cc (nonfreeing_call_p): Likewise.
	* predict.cc (expr_expected_value_1): Likewise.
	* tree-ssa-ccp.cc (evaluate_stmt): Likewise.
	* tree.cc (fndecl_dealloc_argno): Likewise.

gcc/fortran/ChangeLog:

	* dump-parse-tree.cc (show_omp_node): Handle EXEC_OMP_ALLOCATE
	and EXEC_OMP_ALLOCATORS.
	* f95-lang.cc (ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LIST):
	Add 'ECF_LEAF | ECF_MALLOC' to existing 'ECF_NOTHROW'.
	(ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LEAF_LIST): Define.
	* gfortran.h (gfc_omp_clauses): Add contained_in_target_construct.
	* invoke.texi (-fopenacc, -fopenmp): Update based on C version.
	(-fopenmp-simd): New, based on C version.
	(-fopenmp-allocators): New.
	* lang.opt (fopenmp-allocators): Add.
	* openmp.cc (resolve_omp_clauses): For allocators/allocate directive,
	add target and no dynamic_allocators diagnostic and more invalid
	diagnostic.
	* parse.cc (decode_omp_directive): Set contains_teams_construct.
	* trans-array.h (gfc_array_allocate): Update prototype.
	(gfc_conv_descriptor_version): New prototype.
	* trans-decl.cc (gfc_init_default_dt): Fix comment.
	* trans-array.cc (gfc_conv_descriptor_version): New.
	(gfc_array_allocate): Support GOMP_alloc allocation.
	(gfc_alloc_allocatable_for_assignment, structure_alloc_comps):
	Handle GOMP_free/omp_realloc as needed.
	* trans-expr.cc (gfc_conv_procedure_call): Likewise.
	(alloc_scalar_allocatable_for_assignment): Likewise.
	* trans-intrinsic.cc (conv_intrinsic_move_alloc): Likewise.
	* trans-openmp.cc (gfc_trans_omp_allocators,
	gfc_trans_omp_directive): Handle allocators/allocate directive.
	(gfc_omp_call_add_alloc, gfc_omp_call_is_alloc): New.
	* trans-stmt.h (gfc_trans_allocate): Update prototype.
	* trans-stmt.cc (gfc_trans_allocate): Support GOMP_alloc.
	* trans-types.cc (gfc_get_dtype_rank_type): Set version field.
	* trans.cc (gfc_allocate_using_malloc, gfc_allocate_allocatable):
	Update to handle GOMP_alloc.
	(gfc_deallocate_with_status, gfc_deallocate_scalar_with_status):
	Handle GOMP_free.
	(trans_code): Update call.
	* trans.h (gfc_allocate_allocatable, gfc_allocate_using_malloc):
	Update prototype.
	(gfc_omp_call_add_alloc, gfc_omp_call_is_alloc): New prototype.
	* types.def (BT_FN_PTR_PTR_SIZE_PTRMODE_PTRMODE): New.

libgomp/ChangeLog:

	* allocator.c (struct fort_alloc_splay_tree_key_s,
	fort_alloc_splay_compare, GOMP_add_alloc, GOMP_is_alloc): New.
	* libgomp.h: Define splay_tree_static for 'reverse' splay tree.
	* libgomp.map (GOMP_5.1.2): New; add GOMP_add_alloc and
	GOMP_is_alloc; move GOMP_target_map_indirect_ptr from ...
	(GOMP_5.1.1): ... here.
	* libgomp.texi (Impl. Status, Memory management): Update for
	allocators/allocate directives.
	* splay-tree.c: Handle splay_tree_static define to declare all
	functions as static.
	(splay_tree_lookup_node): New.
	* splay-tree.h: Handle splay_tree_decl_only define.
	(splay_tree_lookup_node): New prototype.
	* target.c: Define splay_tree_static for 'reverse'.
	* testsuite/libgomp.fortran/allocators-1.f90: New test.
	* testsuite/libgomp.fortran/allocators-2.f90: New test.
	* testsuite/libgomp.fortran/allocators-3.f90: New test.
	* testsuite/libgomp.fortran/allocators-4.f90: New test.
	* testsuite/libgomp.fortran/allocators-5.f90: New test.

gcc/testsuite/ChangeLog:

	* gfortran.dg/gomp/allocate-14.f90: Add coarray and
	not-listed tests.
	* gfortran.dg/gomp/allocate-5.f90: Remove sorry dg-message.
	* gfortran.dg/bind_c_array_params_2.f90: Update expected
	dump for dtype '.version=0'.
	* gfortran.dg/gomp/allocate-16.f90: New test.
	* gfortran.dg/gomp/allocators-3.f90: New test.
	* gfortran.dg/gomp/allocators-4.f90: New test.
2023-12-08 15:18:25 +01:00
Kwok Cheung Yeung
a49c7d3193 openmp: Add support for the 'indirect' clause in C/C++
This adds support for the 'indirect' clause in the 'declare target'
directive.  Functions declared as indirect may be called via function
pointers passed from the host in offloaded code.

Virtual calls to member functions via the object pointer in C++ are
currently not supported in target regions.

2023-11-07  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/c-family/
	* c-attribs.cc (c_common_attribute_table): Add attribute for
	indirect functions.
	* c-pragma.h (enum parma_omp_clause): Add entry for indirect clause.

gcc/c/
	* c-decl.cc (c_decl_attributes): Add attribute for indirect
	functions.
	* c-lang.h (c_omp_declare_target_attr): Add indirect field.
	* c-parser.cc (c_parser_omp_clause_name): Handle indirect clause.
	(c_parser_omp_clause_indirect): New.
	(c_parser_omp_all_clauses): Handle indirect clause.
	(OMP_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
	(c_parser_omp_declare_target): Handle indirect clause.  Emit error
	message if device_type or indirect clauses used alone.  Emit error
	if indirect clause used with device_type that is not 'any'.
	(OMP_BEGIN_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
	(c_parser_omp_begin): Handle indirect clause.
	* c-typeck.cc (c_finish_omp_clauses): Handle indirect clause.

gcc/cp/
	* cp-tree.h (cp_omp_declare_target_attr): Add indirect field.
	* decl2.cc (cplus_decl_attributes): Add attribute for indirect
	functions.
	* parser.cc (cp_parser_omp_clause_name): Handle indirect clause.
	(cp_parser_omp_clause_indirect): New.
	(cp_parser_omp_all_clauses): Handle indirect clause.
	(handle_omp_declare_target_clause): Add extra parameter.  Add
	indirect attribute for indirect functions.
	(OMP_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
	(cp_parser_omp_declare_target): Handle indirect clause.  Emit error
	message if device_type or indirect clauses used alone.  Emit error
	if indirect clause used with device_type that is not 'any'.
	(OMP_BEGIN_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
	(cp_parser_omp_begin): Handle indirect clause.
	* semantics.cc (finish_omp_clauses): Handle indirect clause.

gcc/
	* lto-cgraph.cc (enum LTO_symtab_tags): Add tag for indirect
	functions.
	(output_offload_tables): Write indirect functions.
	(input_offload_tables): read indirect functions.
	* lto-section-names.h (OFFLOAD_IND_FUNC_TABLE_SECTION_NAME): New.
	* omp-builtins.def (BUILT_IN_GOMP_TARGET_MAP_INDIRECT_PTR): New.
	* omp-offload.cc (offload_ind_funcs): New.
	(omp_discover_implicit_declare_target): Add functions marked with
	'omp declare target indirect' to indirect functions list.
	(omp_finish_file): Add indirect functions to section for offload
	indirect functions.
	(execute_omp_device_lower): Redirect indirect calls on target by
	passing function pointer to BUILT_IN_GOMP_TARGET_MAP_INDIRECT_PTR.
	(pass_omp_device_lower::gate): Run pass_omp_device_lower if
	indirect functions are present on an accelerator device.
	* omp-offload.h (offload_ind_funcs): New.
	* tree-core.h (omp_clause_code): Add OMP_CLAUSE_INDIRECT.
	* tree.cc (omp_clause_num_ops): Add entry for OMP_CLAUSE_INDIRECT.
	(omp_clause_code_name): Likewise.
	* tree.h (OMP_CLAUSE_INDIRECT_EXPR): New.
	* config/gcn/mkoffload.cc (process_asm): Process offload_ind_funcs
	section.  Count number of indirect functions.
	(process_obj): Emit number of indirect functions.
	* config/nvptx/mkoffload.cc (ind_func_ids, ind_funcs_tail): New.
	(process): Emit offload_ind_func_table in PTX code.  Emit indirect
	function names and count in image.
	* config/nvptx/nvptx.cc (nvptx_record_offload_symbol): Mark
	indirect functions in PTX code with IND_FUNC_MAP.

gcc/testsuite/
	* c-c++-common/gomp/declare-target-7.c: Update expected error message.
	* c-c++-common/gomp/declare-target-indirect-1.c: New.
	* c-c++-common/gomp/declare-target-indirect-2.c: New.
	* g++.dg/gomp/attrs-21.C (v12): Update expected error message.
	* g++.dg/gomp/declare-target-indirect-1.C: New.
	* gcc.dg/gomp/attrs-21.c (v12): Update expected error message.

include/
	* gomp-constants.h (GOMP_VERSION): Increment to 3.
	(GOMP_VERSION_SUPPORTS_INDIRECT_FUNCS): New.

libgcc/
	* offloadstuff.c (OFFLOAD_IND_FUNC_TABLE_SECTION_NAME): New.
	(__offload_ind_func_table): New.
	(__offload_ind_funcs_end): New.
	(__OFFLOAD_TABLE__): Add entries for indirect functions.

libgomp/
	* Makefile.am (libgomp_la_SOURCES): Add target-indirect.c.
	* Makefile.in: Regenerate.
	* libgomp-plugin.h (GOMP_INDIRECT_ADDR_MAP): New define.
	(GOMP_OFFLOAD_load_image): Add extra argument.
	* libgomp.h (struct indirect_splay_tree_key_s): New.
	(indirect_splay_tree_node, indirect_splay_tree,
	indirect_splay_tree_key): New.
	(indirect_splay_compare): New.
	* libgomp.map (GOMP_5.1.1): Add GOMP_target_map_indirect_ptr.
	* libgomp.texi (OpenMP 5.1): Update documentation on indirect
	calls in target region and on indirect clause.
	(Other new OpenMP 5.2 features): Add entry for virtual function calls.
	* libgomp_g.h (GOMP_target_map_indirect_ptr): Add prototype.
	* oacc-host.c (host_load_image): Add extra argument.
	* target.c (gomp_load_image_to_device): If the GOMP_VERSION is high
	enough, read host indirect functions table and pass to
	load_image_func.
	* config/accel/target-indirect.c: New.
	* config/linux/target-indirect.c: New.
	* config/gcn/team.c (build_indirect_map): Add prototype.
	(gomp_gcn_enter_kernel): Initialize support for indirect
	function calls on GCN target.
	* config/nvptx/team.c (build_indirect_map): Add prototype.
	(gomp_nvptx_main): Initialize support for indirect function
	calls on NVPTX target.
	* plugin/plugin-gcn.c (struct gcn_image_desc): Add field for
	indirect functions count.
	(GOMP_OFFLOAD_load_image): Add extra argument.  If the GOMP_VERSION
	is high enough, build address translation table and copy it to target
	memory.
	* plugin/plugin-nvptx.c (nvptx_tdata): Add field for indirect
	functions count.
	(GOMP_OFFLOAD_load_image): Add extra argument.  If the GOMP_VERSION
	is high enough, Build address translation table and copy it to target
	memory.
	* testsuite/libgomp.c-c++-common/declare-target-indirect-1.c: New.
	* testsuite/libgomp.c-c++-common/declare-target-indirect-2.c: New.
	* testsuite/libgomp.c++/declare-target-indirect-1.C: New.
2023-11-07 15:44:50 +00:00
Tobias Burnus
d22cd7745f Revert: "Another revert test with a bogus hash"
This reverts commit ffffffffffffffffffffffffffffffffffffffff.

This should get rejected because of the invalid hash.
If it still is accepted, it does something sensible:
It removes tailing white space from a line in libgomp/target.c.
2023-09-07 13:33:35 +02:00
Tobias Burnus
8b9e559fe7 libgomp: cuda.h and omp_target_memcpy_rect cleanup
Fixes for commit r14-2792-g25072a477a56a727b369bf9b20f4d18198ff5894
"OpenMP: Call cuMemcpy2D/cuMemcpy3D for nvptx for omp_target_memcpy_rect",
namely:

In that commit, the code was changed to handle shared-memory devices;
however, as pointed out, omp_target_memcpy_check already set the pointer
to NULL in that case.  Hence, this commit reverts to the prior version.

In cuda.h, it adds cuMemcpyPeer{,Async} for symmetry for cuMemcpy3DPeer
(all currently unused) and in three structs, fixes reserved-member names
and remove a bogus 'const' in three structs.

And it changes a DLSYM to DLSYM_OPT as not all plugins support the new
functions, yet.

include/ChangeLog:

	* cuda/cuda.h (CUDA_MEMCPY2D, CUDA_MEMCPY3D, CUDA_MEMCPY3D_PEER):
	Remove bogus 'const' from 'const void *dst' and fix reserved-name
	name in those structs.
	(cuMemcpyPeer, cuMemcpyPeerAsync): Add.

libgomp/ChangeLog:

	* target.c (omp_target_memcpy_rect_worker): Undo dim=1 change for
	GOMP_OFFLOAD_CAP_SHARED_MEM.
	(omp_target_memcpy_rect_copy): Likewise for lock condition.
	(gomp_load_plugin_for_device): Use DLSYM_OPT not DLSYM for
	memcpy3d/memcpy2d.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_memcpy2d,
	GOMP_OFFLOAD_memcpy3d): Use memset 0 to nullify reserved and
	unused src/dst fields for that mem type; remove '{src,dst}LOD = 0'.
2023-07-29 13:25:03 +02:00
Tobias Burnus
25072a477a OpenMP: Call cuMemcpy2D/cuMemcpy3D for nvptx for omp_target_memcpy_rect
When copying a 2D or 3D rectangular memmory block, the performance is
better when using CUDA's cuMemcpy2D/cuMemcpy3D instead of copying the
data one by one. That's what this commit does.

Additionally, it permits device-to-device copies, if neccessary using a
temporary variable on the host.

include/ChangeLog:

	* cuda/cuda.h (CUlimit): Add CUDA_ERROR_NOT_INITIALIZED,
	CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_INVALID_HANDLE.
	(CUarray, CUmemorytype, CUDA_MEMCPY2D, CUDA_MEMCPY3D,
	CUDA_MEMCPY3D_PEER): New typdefs.
	(cuMemcpy2D, cuMemcpy2DAsync, cuMemcpy2DUnaligned,
	cuMemcpy3D, cuMemcpy3DAsync, cuMemcpy3DPeer,
	cuMemcpy3DPeerAsync): New prototypes.

libgomp/ChangeLog:

	* libgomp-plugin.h (GOMP_OFFLOAD_memcpy2d,
	GOMP_OFFLOAD_memcpy3d): New prototypes.
	* libgomp.h (struct gomp_device_descr): Add memcpy2d_func
	and memcpy3d_func.
	* libgomp.texi (nvtpx): Document when cuMemcpy2D/cuMemcpy3D is used.
	* oacc-host.c (memcpy2d_func, .memcpy3d_func): Init with NULL.
	* plugin/cuda-lib.def (cuMemcpy2D, cuMemcpy2DUnaligned,
	cuMemcpy3D): Invoke via CUDA_ONE_CALL.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_memcpy2d,
	GOMP_OFFLOAD_memcpy3d): New.
	* target.c (omp_target_memcpy_rect_worker):
	(omp_target_memcpy_rect_check, omp_target_memcpy_rect_copy):
	Permit all device-to-device copyies; invoke new plugins for
	2D and 3D copying when available.
	(gomp_load_plugin_for_device): DLSYM the new plugin functions.
	* testsuite/libgomp.c/target-12.c: Fix dimension bug.
	* testsuite/libgomp.fortran/target-12.f90: Likewise.
	* testsuite/libgomp.fortran/target-memcpy-rect-1.f90: New test.
2023-07-26 16:22:35 +02:00
Tobias Burnus
b25ea7ab78 OpenMP (C/C++): Keep pointer value of unmapped ptr with default mapping [PR110270]
For C/C++ pointers, default implicit mapping firstprivatizes the pointer
but if the memory it points to is mapped, the it is updated to point to
the device memory (by attaching a zero sized array section of the pointed-to
storage).

However, if the pointed-to storage wasn't mapped, the pointer was set to
NULL on the device side (OpenMP 5.0/5.1 semantic). With this commit, the
pointer retains the on-host address in that case (OpenMP 5.2 semantic).

The new semantic avoids an explicit map/firstprivate/is_device_ptr in the
following sensible cases: Special values (e.g. pointer or 0x1, 0x2 etc.),
explicitly device allocated memory (e.g. omp_target_alloc), and with
(unified) shared memory.
(Note: With (U)SM, mappings still must be tracked, at least when
omp_target_associate_ptr does not fail when passing in two destinct pointers.)

libgomp/

	PR middle-end/110270
	* target.c (gomp_map_vars_internal): Copy host value instead of NULL
	for  GOMP_MAP_ZERO_LEN_ARRAY_SECTION if not mapped.
	* libgomp.texi (OpenMP 5.2 Impl.): Mark as 'Y'.
	* testsuite/libgomp.c/target-19.c: Update expected value.
	* testsuite/libgomp.c++/target-18.C: Likewise.
	* testsuite/libgomp.c++/target-19.C: Likewise.
	* testsuite/libgomp.c-c++-common/requires-unified-addr-2.c: New test.
	* testsuite/libgomp.c-c++-common/target-implicit-map-3.c: New test.
	* testsuite/libgomp.c-c++-common/target-implicit-map-4.c: New test.
2023-06-19 09:08:51 +02:00
Tobias Burnus
8216ca8503 libgomp: Fix OMP_TARGET_OFFLOAD=mandatory
It turned out that gomp_init_targets_once() was not run when directly
calling 'omp target' or 'omp target (enter/exit) data' causing an
abort with OMP_TARGET_OFFLOAD=mandatory wrongly claiming that no
device is available. It was called a tiny bit later but few lines too
late for updating the default-device-var.

libgomp/ChangeLog:

	* target.c (resolve_device): Call gomp_get_num_devices early to ensure
	gomp_init_targets_once was called before using default-device-var.
	* testsuite/libgomp.c/target-55.c: New test.
	* testsuite/libgomp.c/target-55a.c: New test.
2023-06-16 17:48:09 +02:00
Thomas Schwinge
f2ef1dabbc Align a 'OMP_TARGET_OFFLOAD=mandatory' diagnostic with others
On 2023-06-14T11:42:22+0200, Tobias Burnus <tobias@codesourcery.com> wrote:
> On 14.06.23 10:09, Thomas Schwinge wrote:
>> Let me know if I should also adjust the new 'target { ! offload_device }'
>> diagnostic "[...] MANDATORY but only the host device is available" to
>> include a comma before 'but', for consistency with the other existing
>> diagnostics (cited above)?
>
> I think it makes sense to be consistent. Thus: Yes, please add the commas.

Fix-up for recent commit 18c8b56c7d
"OpenMP: Set default-device-var with OMP_TARGET_OFFLOAD=mandatory".

	libgomp/
	* target.c (resolve_device): Align a
	'OMP_TARGET_OFFLOAD=mandatory' diagnostic with others.
	* testsuite/libgomp.c/target-51.c: Adjust.
2023-06-14 12:56:35 +02:00
Tobias Burnus
18c8b56c7d OpenMP: Set default-device-var with OMP_TARGET_OFFLOAD=mandatory
OMP_TARGET_OFFLOAD=mandatory handling was before inconsistent. Hence, in
OpenMP 5.2 it was clarified/extended by having implications on the
default-device-var; additionally, omp_initial_device and omp_invalid_device
enum values/PARAMETERs were added; support for it was added
in r13-1066-g1158fe43407568 including aborting for omp_invalid_device and
non-conforming device numbers. Only the mandatory handling was missing.

Namely, while the default-device-var is usually initialized to value 0,
with 'mandatory' it must have the value 'omp_invalid_device' if and only if
zero non-host devices are available. (The OMP_DEFAULT_DEVICE env var
overrides this as it comes semantically after the initialization.)

To achieve this, default-device-var is now initialized to MIN_INT. If
there is no 'mandatory', it is set to 0 directly after env var parsing.
Otherwise, it is updated in gomp_target_init to either 0 or
omp_invalid_device. To ensure INT_MIN is never seen by the user, both
the omp_get_default_device API routine and omp_display_env (user call
and OMP_DISPLAY_ENV env var) call gomp_init_targets_once() in that case.

libgomp/ChangeLog:

	* env.c (gomp_default_icv_values): Init default_device_var to
	an nonconforming value - INT_MIN.
	(initialize_env): After env-var parsing, set default_device_var to
	device 0 unless OMP_TARGET_OFFLOAD=mandatory.
	(omp_display_env): If default_device_var is INT_MIN, call
	gomp_init_targets_once.
	* icv-device.c (omp_get_default_device): Likewise.
	* libgomp.texi (OMP_DEFAULT_DEVICE): Update init description.
	(OpenMP 5.2 Impl. Status): Mark OMP_TARGET_OFFLOAD=mandatory as 'Y'.
	* target.c (resolve_device): Improve error message device-num < 0
	with 'mandatory' and no no-host devices available.
	(gomp_target_init): Set default-device-var if INT_MIN.
	* testsuite/libgomp.c/target-48.c: New test.
	* testsuite/libgomp.c/target-49.c: New test.
	* testsuite/libgomp.c/target-50.c: New test.
	* testsuite/libgomp.c/target-50a.c: New test.
	* testsuite/libgomp.c/target-51.c: New test.
	* testsuite/libgomp.c/target-52.c: New test.
	* testsuite/libgomp.c/target-53.c: New test.
	* testsuite/libgomp.c/target-54.c: New test.
2023-06-14 07:53:02 +02:00
Tobias Burnus
38944ec2a6 OpenMP: Cleanups related to the 'present' modifier
Reduce number of enum values passed to libgomp as
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} have the same semantic as
GOMP_MAP_FORCE_PRESENT (i.e. abort if not present, otherwise ignore);
that's different to GOMP_MAP_ALWAYS_PRESENT_{TO,TOFROM,FROM} which also
abort if not present but copy data when present. This is is a follow-up to
the commit r14-1579-g4ede915d5dde93 done 6 days ago.

Additionally, the commit improves a libgomp run-time and a C/C++ compile-time
error wording and extends testcases a tiny bit.

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_omp_clause_map): Reword error message for
	clearness especially with 'omp target (enter/exit) data.'

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_omp_clause_map): Reword error message for
	clearness especially with 'omp target (enter/exit) data.'
	* semantics.cc (handle_omp_array_sections): Handle
	GOMP_MAP_{ALWAYS_,}PRESENT_{TO,TOFROM,FROM,ALLOC} enum values.

gcc/ChangeLog:

	* gimplify.cc (gimplify_adjust_omp_clauses_1): Use
	GOMP_MAP_FORCE_PRESENT for 'present alloc' implicit mapping.
	(gimplify_adjust_omp_clauses): Change
	GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} to the equivalent
	GOMP_MAP_FORCE_PRESENT.
	* omp-low.cc (lower_omp_target): Remove handling of no-longer valid
	GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC}; update map kinds used for
	to/from clauses with present modifier.

include/ChangeLog:

	* gomp-constants.h (enum gomp_map_kind): Change the enum values
	GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} to be compiler only.
	(GOMP_MAP_PRESENT_P): Update to include also GOMP_MAP_FORCE_PRESENT.

libgomp/ChangeLog:

	* target.c (gomp_to_device_kind_p, gomp_map_vars_internal): Replace
	GOMP_MAP_PRESENT_{FROM,TO,TOFROM,ACLLOC} by GOMP_MAP_FORCE_PRESENT.
	(gomp_map_vars_internal, gomp_update): Likewise; unify and improve
	error message.
	* testsuite/libgomp.c-c++-common/target-present-2.c: Update for
	changed error message.
	* testsuite/libgomp.fortran/target-present-1.f90: Likewise.
	* testsuite/libgomp.fortran/target-present-2.f90: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/present-1.c: Likewise.
	* testsuite/libgomp.c-c++-common/target-present-1.c: Likewise and
	extend testcase to check that data is copied when needed.
	* testsuite/libgomp.c-c++-common/target-present-3.c: Likewise.
	* testsuite/libgomp.fortran/target-present-3.f90: Likewise.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/defaultmap-4.c: Update scan-tree-dump.
	* c-c++-common/gomp/map-9.c: Likewise.
	* gfortran.dg/gomp/defaultmap-8.f90: Likewise.
	* gfortran.dg/gomp/map-11.f90: Likewise.
	* gfortran.dg/gomp/target-update-1.f90: Likewise.
	* gfortran.dg/gomp/map-12.f90: Likewise; also check original dump.
	* c-c++-common/gomp/map-6.c: Update dg-error and also check
	clause error with 'target (enter/exit) data'.
2023-06-12 18:15:28 +02:00
Tobias Burnus
4ede915d5d openmp: Add support for the 'present' modifier
This implements support for the OpenMP 5.1 'present' modifier, which can be
used in map clauses in the 'target', 'target data', 'target data enter' and
'target data exit' constructs, and in the 'to' and 'from' clauses of the
'target update' construct.  It is also supported in defaultmap.

The modifier triggers a fatal runtime error if the data specified by the
clause is not already present on the target device.  It can also be combined
with 'always' in map clauses.

2023-06-06  Kwok Cheung Yeung  <kcy@codesourcery.com>
	    Tobias Burnus  <tobias@codesourcery.com>

gcc/c/
	* c-parser.cc (c_parser_omp_clause_defaultmap,
	c_parser_omp_clause_map): Parse 'present'.
	(c_parser_omp_clause_to, c_parser_omp_clause_from): Remove.
	(c_parser_omp_clause_from_to): New; parse to/from clauses with
	optional present modifer.
	(c_parser_omp_all_clauses): Update call.
	(c_parser_omp_target_data, c_parser_omp_target_enter_data,
	c_parser_omp_target_exit_data): Handle new map enum values
	for 'present' mapping.

gcc/cp/
	* parser.cc (cp_parser_omp_clause_defaultmap,
	cp_parser_omp_clause_map): Parse 'present'.
	(cp_parser_omp_clause_from_to): New; parse to/from
	clauses with optional 'present' modifier.
	(cp_parser_omp_all_clauses): Update call.
	(cp_parser_omp_target_data, cp_parser_omp_target_enter_data,
	cp_parser_omp_target_exit_data): Handle new enum value for
	'present' mapping.
	* semantics.cc (finish_omp_target): Likewise.

gcc/fortran/
	* dump-parse-tree.cc (show_omp_namelist): Display 'present' map
	modifier.
	(show_omp_clauses): Display 'present' motion modifier for 'to'
	and 'from' clauses.

	* gfortran.h (enum gfc_omp_map_op): Add entries with 'present'
	modifiers.
	(struct gfc_omp_namelist): Add 'present_modifer'.
	* openmp.cc (gfc_match_motion_var_list): New, handles optional
	'present' modifier for to/from clauses.
	(gfc_match_omp_clauses): Call it for to/from clauses; parse 'present'
	in defaultmap and map clauses.
	(resolve_omp_clauses): Allow 'present' modifiers on 'target',
	'target data', 'target enter' and 'target exit'	directives.
	* trans-openmp.cc (gfc_trans_omp_clauses): Apply 'present' modifiers
	to tree node for 'map', 'to' and 'from'	clauses.  Apply 'present' for
	defaultmap.

gcc/
	* gimplify.cc (omp_notice_variable): Apply GOVD_MAP_ALLOC_ONLY flag
	and defaultmap flags if the defaultmap has GOVD_MAP_FORCE_PRESENT flag
	set.
	(omp_get_attachment): Handle map clauses with 'present' modifier.
	(omp_group_base): Likewise.
	(gimplify_scan_omp_clauses): Reorder present maps to come first.
	Set GOVD flags for present defaultmaps.
	(gimplify_adjust_omp_clauses_1): Set map kind for present defaultmaps.
	* omp-low.cc (scan_sharing_clauses): Handle 'always, present' map
	clauses.
	(lower_omp_target): Handle map clauses with 'present' modifier.
	Handle 'to' and 'from' clauses with 'present'.
	* tree-core.h (enum omp_clause_defaultmap_kind): Add
	OMP_CLAUSE_DEFAULTMAP_PRESENT defaultmap kind.
	* tree-pretty-print.cc (dump_omp_clause): Handle 'map', 'to' and
	'from' clauses with 'present' modifier.  Handle present defaultmap.
	* tree.h (OMP_CLAUSE_MOTION_PRESENT): New #define.

include/
	* gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_5): New.
	(GOMP_MAP_FLAG_FORCE): Redefine.
	(GOMP_MAP_FLAG_PRESENT, GOMP_MAP_FLAG_ALWAYS_PRESENT): New.
	(enum gomp_map_kind): Add map kinds with 'present' modifiers.
	(GOMP_MAP_COPY_TO_P, GOMP_MAP_COPY_FROM_P): Evaluate to true for
	map variants with 'present'
	(GOMP_MAP_ALWAYS_TO_P, GOMP_MAP_ALWAYS_FROM_P): Evaluate to true
	for map variants with 'always, present' modifiers.
	(GOMP_MAP_ALWAYS): Redefine.
	(GOMP_MAP_FORCE_P, GOMP_MAP_PRESENT_P): New.

libgomp/
	* libgomp.texi (OpenMP 5.1 Impl. status): Set 'present' support for
	defaultmap to 'Y', add 'Y' entry for 'present' on to/from/map clauses.
	* target.c (gomp_to_device_kind_p): Add map kinds with 'present'
	modifier.
	(gomp_map_vars_existing): Use new GOMP_MAP_FORCE_P macro.
	(gomp_map_vars_internal, gomp_update, gomp_target_rev):
	Emit runtime error if memory region not present.
	* testsuite/libgomp.c-c++-common/target-present-1.c: New test.
	* testsuite/libgomp.c-c++-common/target-present-2.c: New test.
	* testsuite/libgomp.c-c++-common/target-present-3.c: New test.
	* testsuite/libgomp.fortran/target-present-1.f90: New test.
	* testsuite/libgomp.fortran/target-present-2.f90: New test.
	* testsuite/libgomp.fortran/target-present-3.f90: New test.

gcc/testsuite/

	* c-c++-common/gomp/map-6.c: Update dg-error, extend to test for
	duplicated 'present' and extend scan-dump tests for 'present'.
	* gfortran.dg/gomp/defaultmap-1.f90: Update dg-error.
	* gfortran.dg/gomp/map-7.f90: Extend parse and dump test for
	'present'.
	* gfortran.dg/gomp/map-8.f90: Extend for duplicate 'present'
	modifier checking.
	* c-c++-common/gomp/defaultmap-4.c: New test.
	* c-c++-common/gomp/map-9.c: New test.
	* c-c++-common/gomp/target-update-1.c: New test.
	* gfortran.dg/gomp/defaultmap-8.f90: New test.
	* gfortran.dg/gomp/map-11.f90: New test.
	* gfortran.dg/gomp/map-12.f90: New test.
	* gfortran.dg/gomp/target-update-1.f90: New test.
2023-06-06 16:49:22 +02:00
Thomas Schwinge
130c2f3c3a libgomp: Simplify OpenMP reverse offload host <-> device memory copy implementation
... by using the existing 'goacc_asyncqueue' instead of re-coding parts of it.

Follow-up to commit 131d18e928
"libgomp/nvptx: Prepare for reverse-offload callback handling",
and commit ea4b23d9c8
"libgomp: Handle OpenMP's reverse offloads".

	libgomp/
	* target.c (gomp_target_rev): Instead of 'dev_to_host_cpy',
	'host_to_dev_cpy', 'token', take a single 'goacc_asyncqueue'.
	* libgomp.h (gomp_target_rev): Adjust.
	* libgomp-plugin.c (GOMP_PLUGIN_target_rev): Adjust.
	* libgomp-plugin.h (GOMP_PLUGIN_target_rev): Adjust.
	* plugin/plugin-gcn.c (process_reverse_offload): Adjust.
	* plugin/plugin-nvptx.c (rev_off_dev_to_host_cpy)
	(rev_off_host_to_dev_cpy): Remove.
	(GOMP_OFFLOAD_run): Adjust.
2023-05-08 15:58:05 +02:00
Thomas Schwinge
e8fec6998b Add caveat/safeguard to OpenMP: Handle descriptors in target's firstprivate [PR104949]
Follow-up to commit 49d1a2f913
"OpenMP: Handle descriptors in target's firstprivate [PR104949]".

	PR fortran/104949
	libgomp/
	* target.c (gomp_map_vars_internal) <GOMP_MAP_FIRSTPRIVATE>: Add
	caveat/safeguard.
2023-03-24 17:14:54 +01:00
Thomas Schwinge
f8332e52a4 Use 'GOMP_MAP_VARS_TARGET' for OpenACC compute constructs [PR90596]
Thereby considerably simplify the device plugins' 'GOMP_OFFLOAD_openacc_exec',
'GOMP_OFFLOAD_openacc_async_exec' functions: in terms of lines of code, but in
particular conceptually: no more device memory allocation, host to device data
copying, device memory deallocation -- 'GOMP_MAP_VARS_TARGET' does all that for
us.

This depends on commit 2b2340e236
"Allow libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral' data",
where I said that "a use will emerge later", which is this one here.

	PR libgomp/90596
	libgomp/
	* target.c (gomp_map_vars_internal): Allow for
	'param_kind == GOMP_MAP_VARS_OPENACC | GOMP_MAP_VARS_TARGET'.
	* oacc-parallel.c (GOACC_parallel_keyed): Pass
	'GOMP_MAP_VARS_TARGET' to 'goacc_map_vars'.
	* plugin/plugin-gcn.c (alloc_by_agent, gcn_exec)
	(GOMP_OFFLOAD_openacc_exec, GOMP_OFFLOAD_openacc_async_exec):
	Adjust, simplify.
	(gomp_offload_free): Remove.
	* plugin/plugin-nvptx.c (nvptx_exec, GOMP_OFFLOAD_openacc_exec)
	(GOMP_OFFLOAD_openacc_async_exec): Adjust, simplify.
	(cuda_free_argmem): Remove.
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c:
	Adjust.
2023-03-10 18:05:27 +01:00
Thomas Schwinge
2b2340e236 Allow libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral' data
This does *allow*, but under no circumstances is this currently going to be
used: all potentially applicable data is non-'ephemeral', and thus not
considered for 'gomp_coalesce_buf_add' for OpenACC 'async'.  (But a use will
emerge later.)

Follow-up to commit r12-2530-gd88a6951586c7229b25708f4486eaaf4bf4b5bbe
"Don't use libgomp 'cbuf' buffering with OpenACC 'async'", addressing this
TODO comment:

    TODO ... but we could allow CBUF usage for EPHEMERAL data?  (Open question:
    is it more performant to use libgomp CBUF buffering or individual device
    asyncronous copying?)

Ephemeral data is small, and therefore individual device asyncronous copying
does seem dubious -- in particular given that for all those, we'd individually
have to allocate and queue for deallocation a temporary buffer to capture the
ephemeral data.  Instead, just let the 'cbuf' *be* the temporary buffer.

	libgomp/
	* target.c (gomp_copy_host2dev, gomp_map_vars_internal): Allow
	libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral'
	data.
2023-03-10 16:19:53 +01:00
Thomas Schwinge
199867d07b Simplify OpenACC 'no_create' clause implementation
For 'OFFSET_INLINED', 'gomp_map_val' does the right thing, and we may then
simplify the device plugins accordingly.

This is a follow-up to
Subversion r279551 (Git commit a6163563f2)
"Add OpenACC 2.6's no_create",
Subversion r279622 (Git commit 5bcd470bf0)
"Use gomp_map_val for OpenACC host-to-device address translation".

	libgomp/
	* target.c (gomp_map_vars_internal): Use 'OFFSET_INLINED' for
	'GOMP_MAP_IF_PRESENT'.
	* plugin/plugin-gcn.c (gcn_exec, GOMP_OFFLOAD_openacc_exec)
	(GOMP_OFFLOAD_openacc_async_exec): Adjust.
	* plugin/plugin-nvptx.c (nvptx_exec, GOMP_OFFLOAD_openacc_exec)
	(GOMP_OFFLOAD_openacc_async_exec): Likewise.
	* testsuite/libgomp.oacc-c-c++-common/no_create-1.c: Add 'async'
	testing.
	* testsuite/libgomp.oacc-c-c++-common/no_create-2.c: Likewise.
2023-03-10 15:48:43 +01:00
Tobias Burnus
edaf1d6078 libgomp: Fix reverse-offload for GOMP_MAP_TO_PSET
libgomp/
	* target.c (gomp_target_rev): Dereference ptr
	to get device address.
	* testsuite/libgomp.fortran/reverse-offload-5.f90: Add test
	for unallocated allocatable.
2023-02-15 11:21:11 +01:00
Tobias Burnus
c7a9655be6 libgomp: Fix 'target enter data' with always pointer
As GOMP_MAP_ALWAYS_POINTER operates on the previous map item, ensure that
with 'target enter data' both are passed together to gomp_map_vars_internal.

libgomp/ChangeLog:

	* target.c (gomp_map_vars_internal): Add 'i > 0' before doing a
	kind check.
	(GOMP_target_enter_exit_data): If the next map item is
	GOMP_MAP_ALWAYS_POINTER map it together with the current item.
	* testsuite/libgomp.fortran/target-enter-data-3.f90: New test.
2023-02-15 11:18:31 +01:00
Tobias Burnus
0b1ce70a81 libgomp: Fix reverse offload issues
If there is nothing to map, skip the mapping and avoid attempting to
copy 0 bytes from addrs, sizes and kinds.

Additionally, it could happen that a non-allocated address was deallocated,
such as a pointer set, leading to a free for the actual data.

libgomp/
	* target.c (gomp_target_rev): Handle mapnum == 0 and avoid
	freeing not allocated memory.
	* testsuite/libgomp.fortran/reverse-offload-6.f90: New test.
2023-02-03 11:31:53 +01:00
Jakub Jelinek
83ffe9cde7 Update copyright years. 2023-01-16 11:52:17 +01:00
Tobias Burnus
ea4b23d9c8 libgomp: Handle OpenMP's reverse offloads
This commit enabled reverse offload for nvptx such that gomp_target_rev
actually gets called.  And it fills the latter function to do all of
the following: finding the host function to the device func ptr and
copying the arguments to the host, processing the mapping/firstprivate,
calling the host function, copying back the data and freeing as needed.

The data handling is made easier by assuming that all host variables
either existed before (and are in the mapping) or that those are
devices variables not yet available on the host. Thus, the reverse
mapping can do without refcounts etc. Note that the spec disallows
inside a target region device-affecting constructs other than target
plus ancestor device-modifier and it also limits the clauses permitted
on this construct.

For the function addresses, an additional splay tree is used; for
the lookup of mapped variables, the existing splay-tree is used.
Unfortunately, its data structure requires a full walk of the tree;
Additionally, the just mapped variables are recorded in a separate
data structure an extra lookup. While the lookup is slow, assuming
that only few variables get mapped in each reverse offload construct
and that reverse offload is the exception and not performance critical,
this seems to be acceptable.

libgomp/ChangeLog:

	* libgomp.h (struct target_mem_desc): Predeclare; move
	below after 'reverse_splay_tree_node' and add rev_array
	member.
	(struct reverse_splay_tree_key_s, reverse_splay_compare): New.
	(reverse_splay_tree_node, reverse_splay_tree,
	reverse_splay_tree_key): New typedef.
	(struct gomp_device_descr): Add mem_map_rev member.
	* oacc-host.c (host_dispatch): NULL init .mem_map_rev.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices): Claim
	support for GOMP_REQUIRES_REVERSE_OFFLOAD.
	* splay-tree.h (splay_tree_callback_stop): New typedef; like
	splay_tree_callback but returning int not void.
	(splay_tree_foreach_lazy): Define; like splay_tree_foreach but
	taking splay_tree_callback_stop as argument.
	* splay-tree.c (splay_tree_foreach_internal_lazy,
	splay_tree_foreach_lazy): New; but early exit if callback returns
	nonzero.
	* target.c: Instatiate splay_tree_c with splay_tree_prefix 'reverse'.
	(gomp_map_lookup_rev): New.
	(gomp_load_image_to_device): Handle reverse-offload function
	lookup table.
	(gomp_unload_image_from_device): Free devicep->mem_map_rev.
	(struct gomp_splay_tree_rev_lookup_data, gomp_splay_tree_rev_lookup,
	gomp_map_rev_lookup, struct cpy_data, gomp_map_cdata_lookup_int,
	gomp_map_cdata_lookup): New auxiliary structs and functions for
	gomp_target_rev.
	(gomp_target_rev): Implement reverse offloading and its mapping.
	(gomp_target_init): Init current_device.mem_map_rev.root.
	* testsuite/libgomp.fortran/reverse-offload-2.f90: New test.
	* testsuite/libgomp.fortran/reverse-offload-3.f90: New test.
	* testsuite/libgomp.fortran/reverse-offload-4.f90: New test.
	* testsuite/libgomp.fortran/reverse-offload-5.f90: New test.
	* testsuite/libgomp.fortran/reverse-offload-5a.f90: New test without
	mapping of on-device allocated variables.
2022-12-10 13:42:08 +01:00