mirror of
https://gcc.gnu.org/git/gcc.git
synced 2026-02-22 03:46:53 -05:00
723b18ce3dd24ebcbb6426fb40a375642477123f
1335 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
723b18ce3d |
libgomp, amdgcn: Implement Managed Memory
This patch implements "managed" memory for AMD GCN GPUs in OpenMP. It builds on the support added to the NVPTX libgomp for CUDA Managed Memory, a week or two ago.
These features were first posted here a few years ago, as part of a larger Unified Shared Memory patch series, and then in a slightly changed version just over a year ago. Hopefully this time the controversial bits have been removed.
Since we do not use HIP we cannot use hipMallocManaged, so this patch attempts to replicate the same effect by setting the appropriate attributes. This works on more devices than support proper USM, but still I cannot be sure that the settings are correct for every device out there (I have tested on gfx900, gfx906, gfx908, gfx90a, and gfx1100).
The HSA header file update uses the most recent files relicensed for us by AMD, at the time of the first patch posting. Those files have certainly moved on in the upstream sources, but I did not ask to get those relicensed.
include/ChangeLog:
* hsa.h: Import newer version.
* hsa_ext_amd.h: Likewise.
* hsa_ext_image.h: Likewise.
libgomp/ChangeLog:
* Makefile.in: Regenerate.
* libgomp-plugin.h (gomp_simple_alloc_init_context): New prototype.
(gomp_simple_alloc_register_memory): New prototype.
(gomp_simple_alloc): New prototype.
(gomp_simple_free): New prototype.
(gomp_simple_realloc): New prototype.
* libgomp.h (gomp_simple_alloc_init_context): Move to libgomp-plugin.h.
(gomp_simple_alloc_register_memory): Likewise.
(gomp_simple_alloc): Likewise.
(gomp_simple_free): Likewise.
(gomp_simple_realloc): Likewise.
* libgomp.texi: Update AMD managed memory description.
* plugin/Makefrag.am (libgomp_plugin_gcn_la_SOURCES): Add simple-allocator.c and plugin/mutex.c.
* plugin/plugin-gcn.c: Include sys/mman.h and unistd.h.
(struct hsa_runtime_fn_info): Add hsa_amd_svm_attributes_set_fn.
(dump_hsa_system_info): Add HSA_AMD_SYSTEM_INFO_SVM_SUPPORTED and HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT to the GCN_DEBUG output.
(init_hsa_runtime_functions): Add hsa_amd_svm_attributes_set.
(isa_matches_agent): Add a new error message for the case where the ISA doesn't match but the name does.
(managed_ctx): New variable.
(managed_heap_create): New function.
(GOMP_OFFLOAD_managed_alloc): Likewise.
(GOMP_OFFLOAD_managed_free): Likewise.
* simple-allocator.c (gomp_fatal): New macro.
* testsuite/lib/libgomp.exp (check_effective_target_omp_managedmem): Add amdgcn support checker.
(check_effective_target_offload_target_amdgcn_with_xnack): New.
* testsuite/libgomp.c-c++-common/requires-4.c: Ignore xnack warning.
* testsuite/libgomp.c-c++-common/requires-4a.c: Ignore xnack warning.
* testsuite/libgomp.c-c++-common/requires-5.c: Ignore xnack warning.
* testsuite/libgomp.c++/alloc-managed-1.C: Add -mxnack=on, if needed.
* testsuite/libgomp.c/alloc-managed-1.c: Likewise.
* testsuite/libgomp.c/alloc-managed-2.c: Likewise.
* testsuite/libgomp.c/alloc-managed-3.c: Likewise.
* testsuite/libgomp.c/alloc-managed-4.c: Likewise.
* testsuite/libgomp.fortran/alloc-managed-1.f90: Likewise.
* plugin/mutex.c: New file.
|
||
|
|
05c2ad4a2e |
OpenMP/Fortran: Allow explicit map followed by implicit deep mapping [PR120505]
Consider the following source code, assuming tiles is allocatable:
```
!$omp target enter data map(var%tiles(1)%den1, var%tiles(1)%den2) ! (1)
[...]
!$omp target ! implicitly maps var, which triggers deep mapping of tiles (2)
```
Each omp directive causes a run-time error in libgomp:
(1) libgomp: Mapped array elements must be the same (0x14d729c0 vs 0x14d72a18)
(2) libgomp: Trying to map into device [0x3704ca50..0x3704cb00) object when
[0x3704ca50..0x3704caa8) is already mapped
Regarding (1), the OpenMP spec has the following restriction: "If multiple list
items are explicitly mapped on the same construct and have the same containing
array or have base pointers that share original storage, and if any of the list
items do not have corresponding list items that are present in the device data
environment prior to a task encountering the construct, then the list items must
refer to *the same array elements* of either the containing array or the
implicit array of the base pointers."
Because tiles is allocatable, we cannot prove at compile time that array
elements are the same, so the check is deferred to libgomp. But there the
condition enforcing that all addresses are the same is too strict, so this patch
relaxes it to only check that addresses are sorted in increasing order.
The OpenMP spec allows (2) as long as it is implicit, without extending the
original mapping. So this patch sets the GOMP_MAP_IMPLICIT flag appropriately
on deep maps at compile time to let libgomp know that it is fine.
This patch ensures that such user code is accepted by:
(1) Setting the GOMP_MAP_IMPLICIT flag appropriately on deep maps;
(2) Relaxing the restriction on struct mapping from different containing arrays,
so that the element index need not be the same, instead addresses must be sorted
in increasing order.
This fixes the two errors currently seen when running the SPEC HPC clvleaf
benchmark. However, further mapping issues prevent the benchmark from running to
completion.
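The difference between the old and the relaxed run-time check can be sketched as follows. This is only an illustration, not the code in gomp_map_vars_internal: `addresses_all_same` and `addresses_increasing` are hypothetical helpers, and accepting equal neighbouring addresses in the relaxed check is an assumption made so that the previously accepted case still passes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Old, stricter rule (hypothetical helper): every mapped element must
// sit at the same address.
bool addresses_all_same(const std::vector<std::uintptr_t>& addrs) {
    for (std::size_t i = 1; i < addrs.size(); ++i)
        if (addrs[i] != addrs[0])
            return false;
    return true;
}

// Relaxed rule (hypothetical helper): addresses only need to be sorted
// in increasing order. Equal neighbours are accepted by assumption.
bool addresses_increasing(const std::vector<std::uintptr_t>& addrs) {
    for (std::size_t i = 1; i < addrs.size(); ++i)
        if (addrs[i] < addrs[i - 1])
            return false;
    return true;
}
```

With the two addresses from error message (1), 0x14d729c0 and 0x14d72a18, the strict helper rejects the mapping while the relaxed one accepts it.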
PR fortran/120505
gcc/ChangeLog:
* omp-low.cc (lower_omp_target): Set GOMP_MAP_IMPLICIT flag.
libgomp/ChangeLog:
* target.c (gomp_map_vars_internal): Allow struct mapping from different
containing array elements as long as addresses are in increasing order.
* testsuite/libgomp.c-c++-common/map-arrayofstruct-2.c: Adjust
dg-output.
* testsuite/libgomp.c-c++-common/map-arrayofstruct-3.c: Likewise.
* testsuite/libgomp.fortran/map-subarray-5.f90: Likewise.
* testsuite/libgomp.fortran/map-subarray-10.f90: New test.
* testsuite/libgomp.fortran/map-subarray-9.f90: New test.
|
||
|
|
0044388575 |
Change the default C++ dialect to gnu++20
On Mon, Nov 03, 2025 at 01:34:28PM -0500, Marek Polacek via Gcc wrote:
> I would like us to declare that C++20 is no longer experimental and
> change the default dialect to gnu++20. Last time we changed the default
> was over 5 years ago in GCC 11:
> <https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=0801f419440c14f6772b28f763ad7d40f7f7a580>
> and before that in 2015 in GCC 6.1, so this happens roughly every 5 years.
>
> I had been hoping to move to C++20 in GCC 15 (see bug 113920), but at that time
> libstdc++ still had incomplete C++20 support and the compiler had issues to iron
> out (mangling of concepts, modules work, etc.). Are we ready now? Is anyone
> aware of any blockers? Presumably we still wouldn't enable Modules by default.
>
> I'm willing to do the work if we decide that it's time to switch the default
> C++ dialect (that includes updating cxx-status.html and adding a new caveat to
> changes.html).
I haven't seen a patch posted for this, so, just so that something is posted during stage1 if we decide to do it, here is a patch.
The patch makes -std=gnu++20 the default C++ dialect and documents that -fmodules is still not implied by that or -std=c++20 and modules support is still experimental.
2025-11-26 Jakub Jelinek <jakub@redhat.com>
gcc/
* doc/invoke.texi (gnu++17): Remove comment about the default.
(c++20): Remove note about experimental support, except add a note that modules are still experimental and need to be enabled separately.
(gnu++20): Likewise. Move here comment about the default.
(fcoroutines): Mention it is enabled by default for C++20 and later.
* doc/standards.texi: Document that the default for C++ is -std=gnu++20.
gcc/c-family/
* c-opts.cc (c_common_init_options): Call set_std_cxx20 rather than set_std_cxx17.
* c.opt (std=c++2a): Change description to deprecated option wording.
(std=c++20): Remove experimental support part.
(std=c++2b): Change description to deprecated option wording.
(std=gnu++2a): Likewise.
(std=gnu++20): Remove experimental support part.
(std=gnu++2b): Change description to deprecated option wording.
gcc/testsuite/
* lib/target-supports.exp: Set cxx_default to c++20 rather than c++17.
* lib/g++-dg.exp (g++-std-flags): Reorder list to put 20 first and 17 after 26.
* g++.dg/debug/pr80461.C (bar): Use v = v + 1; instead of ++v;.
* g++.dg/debug/pr94459.C: Add -std=gnu++17 to dg-options.
* g++.dg/diagnostic/virtual-constexpr.C: Remove dg-skip-if, instead use { c++11 && c++17_down } effective target instead of c++11.
* g++.dg/guality/pr67192.C: Add -std=gnu++17.
* g++.dg/torture/pr84961-1.C: Likewise.
* g++.dg/torture/pr84961-2.C: Likewise.
* g++.dg/torture/pr51482.C (anim_track_bez_wvect::tangent): Cast key_class to int before multiplying it by float.
* g++.dg/torture/stackalign/unwind-4.C (foo): Use g_a = g_a + 1; instead of g_a++;.
* g++.dg/tree-prof/partition1.C (bar): Use l = l + 1; return l; instead of return ++l;.
* obj-c++.dg/exceptions-3.mm: Add -std=gnu++17.
* obj-c++.dg/exceptions-5.mm: Likewise.
libgomp/
* testsuite/libgomp.c++/atomic-12.C (main): Add ()s around array reference index.
* testsuite/libgomp.c++/atomic-13.C: Likewise.
* testsuite/libgomp.c++/atomic-8.C: Likewise.
* testsuite/libgomp.c++/atomic-9.C: Likewise.
* testsuite/libgomp.c++/loop-6.C: Use count = count + 1; return count > 0; instead of return ++count > 0;.
* testsuite/libgomp.c++/pr38650.C: Add -std=gnu++17.
* testsuite/libgomp.c++/target-lambda-1.C (merge_data_func): Use [=,this] instead of just [=] in lambda captures.
* testsuite/libgomp.c-c++-common/target-40.c (f1): Use v += 1; instead of v++;.
* testsuite/libgomp.c-c++-common/depend-iterator-2.c: Use v = v + 1; instead of v++.
|
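Several of the testsuite adjustments in this commit replace `++v` or `v++` with `v = v + 1`. The commit message does not give the reason; one plausible explanation, offered here as an assumption, is that C++20 deprecates increment and decrement on volatile-qualified operands, so the old spelling would trigger new diagnostics under the new default. The sketch below uses a hypothetical `counter` and `bump`, neither of which comes from the GCC testsuite.

```cpp
#include <cassert>

// Hypothetical volatile counter, for illustration only.
volatile int counter = 0;

// Increments the counter without using ++ or --, which C++20 deprecates
// on volatile operands. The read and the write are spelled out separately.
int bump() {
    counter = counter + 1;
    return counter;
}
```

The rewritten form behaves the same as `return ++counter;` in single-threaded code and is accepted without deprecation warnings in both C++17 and C++20.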
||
|
|
2b033568b9 |
libgomp/oacc: fix atomic_capture-3 iteration ordering issues
In r11-3059-g8183ebcdc1c843, Julian fixed a few issues with atomic_capture-2.c relying on iteration order guarantees that do not exist under OpenACC parallelized loops and, notably, do not happen even by accident on AMDGCN.
The atomic_capture-3.c testcase was made by copying it from atomic_capture-2.c and adding additional options in commit r12-310-g4cf3b10f27b199, but from an older version of atomic_capture-2.c, which lacked these ordering fixes, so the issues resurfaced in this test.
This patch ports those fixes from atomic_capture-2.c into atomic_capture-3.c.
libgomp/ChangeLog:
* testsuite/libgomp.oacc-c-c++-common/atomic_capture-3.c: Copy changes in r11-3059-g8183ebcdc1c843 from atomic_capture-2.c.
|
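To illustrate what "relying on iteration order" means here, the following sketch captures the old value of a shared counter in each iteration and then checks only that the captured values form a permutation of 0..n-1, not that iteration i saw value i. It is a hypothetical example, not the code of either testcase; without -fopenacc the pragmas are ignored and the loop runs sequentially on the host.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical check: each iteration atomically captures the old value of
// 'counter'. The values must be distinct and cover 0..n-1, but which
// iteration gets which value is unspecified in a parallelized loop.
bool captured_values_form_permutation(int n) {
    assert(n >= 0);
    std::vector<int> captured(static_cast<std::size_t>(n), -1);
    int *out = captured.data();
    int counter = 0;

    #pragma acc parallel loop copy(counter) copyout(out[0:n])
    for (int i = 0; i < n; ++i) {
        int old;
        #pragma acc atomic capture
        { old = counter; counter += 1; }
        out[i] = old;
    }

    // Order-independent verification: every value seen exactly once.
    std::vector<bool> seen(static_cast<std::size_t>(n), false);
    for (int v : captured) {
        if (v < 0 || v >= n || seen[static_cast<std::size_t>(v)])
            return false;
        seen[static_cast<std::size_t>(v)] = true;
    }
    return counter == n;
}
```

A check written as `out[i] == i` would pass on the host and could fail on a device, which is the kind of accidental ordering assumption the commit message describes.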
||
|
|
bd6d591799 |
OpenMP: C/C++ common testcases for "omp begin declare variant"
gcc/testsuite/ChangeLog
* c-c++-common/gomp/delim-declare-variant-1.c: New.
* c-c++-common/gomp/delim-declare-variant-2.c: New.
* c-c++-common/gomp/delim-declare-variant-3.c: New.
* c-c++-common/gomp/delim-declare-variant-4.c: New.
* c-c++-common/gomp/delim-declare-variant-5.c: New.
* c-c++-common/gomp/delim-declare-variant-6.c: New.
* c-c++-common/gomp/delim-declare-variant-7.c: New.
* c-c++-common/gomp/delim-declare-variant-8.c: New.
* c-c++-common/gomp/delim-declare-variant-9.c: New.
libgomp/ChangeLog
* testsuite/libgomp.c-c++-common/delim-declare-variant-1.c: New.
* testsuite/libgomp.c-c++-common/delim-declare-variant-2.c: New.
Co-Authored-By: Tobias Burnus <tburnus@baylibre.com>
|
||
|
|
a469cf3df2 |
OpenMP: C++ front end support for "begin declare variant"
This patch implements C++ support for the "begin declare variant" construct. The OpenMP specification is hazy on interaction of this feature with C++ language features.
Variant functions in classes are supported but must be defined as members in the class definition, using an unqualified name for the base function which also must be present in that class. Similarly variant functions in a namespace can only be defined in that namespace using an unqualified name for a base function already declared in that namespace. Variants for template functions or inside template classes seem to (mostly) work.
gcc/c-family/ChangeLog
* c-omp.cc (c_omp_directives): Uncomment "begin declare variant" and "end declare variant".
gcc/cp/ChangeLog
* cp-tree.h (struct cp_omp_declare_variant_attr): New.
(struct saved_scope): Add omp_declare_variant_attribute field.
* decl.cc (omp_declare_variant_finalize_one): Add logic to inject "this" parameter for method calls.
* parser.cc (cp_parser_skip_to_pragma_omp_end_declare_variant): New.
(cp_parser_translation_unit): Handle leftover "begin declare variant" functions.
(omp_start_variant_function): New.
(omp_finish_variant_function): New.
(omp_maybe_record_variant_base): New.
(cp_parser_init_declarator): Handle variant functions.
(cp_parser_class_specifier): Handle deferred lookup of base functions when the entire class has been seen.
(cp_parser_member_declaration): Handle variant functions.
(cp_finish_omp_declare_variant): Merge context selectors if in a "begin declare variant" block.
(cp_parser_omp_begin): Match "omp begin declare variant". Adjust error messages.
(cp_parser_omp_end): Match "omp end declare variant".
* parser.h (struct omp_begin_declare_variant_map_entry): New.
(struct cp_parser): Add omp_begin_declare_variant_map field.
* semantics.cc (finish_translation_unit): Detect unmatched "omp begin declare variant".
gcc/testsuite/ChangeLog
* g++.dg/gomp/delim-declare-variant-1.C: New.
* g++.dg/gomp/delim-declare-variant-2.C: New.
* g++.dg/gomp/delim-declare-variant-3.C: New.
* g++.dg/gomp/delim-declare-variant-4.C: New.
* g++.dg/gomp/delim-declare-variant-5.C: New.
* g++.dg/gomp/delim-declare-variant-6.C: New.
* g++.dg/gomp/delim-declare-variant-7.C: New.
* g++.dg/gomp/delim-declare-variant-40.C: New.
* g++.dg/gomp/delim-declare-variant-41.C: New.
* g++.dg/gomp/delim-declare-variant-50.C: New.
* g++.dg/gomp/delim-declare-variant-51.C: New.
* g++.dg/gomp/delim-declare-variant-52.C: New.
* g++.dg/gomp/delim-declare-variant-70.C: New.
* g++.dg/gomp/delim-declare-variant-71.C: New.
libgomp/
* testsuite/libgomp.c++/bdv_module1.C: New.
* testsuite/libgomp.c++/bdv_module1_main.C: New.
* testsuite/libgomp.c++/bdv_module2.C: New.
* testsuite/libgomp.c++/bdv_module2_impl.C: New.
* testsuite/libgomp.c++/bdv_module2_main.C: New.
* testsuite/libgomp.c++/bdv_module3.C: New.
* testsuite/libgomp.c++/bdv_module3_impl.C: New.
* testsuite/libgomp.c++/bdv_module3_main.C: New.
* testsuite/libgomp.c++/delim-declare-variant-1.C: New.
* testsuite/libgomp.c++/delim-declare-variant-2.C: New.
* testsuite/libgomp.c++/delim-declare-variant-7.C: New.
Co-Authored-By: Julian Brown <julian@codesourcery.com>
Co-Authored-By: waffl3x <waffl3x@baylibre.com>
|
||
|
|
10d9df13f9 |
libgomp: Fix race condition data-2{,-lib}.c testcase
In the testcases, the kernels scheduled on queues 11, 12, 13, 14 have data dependencies on, respectively, 'b', 'c', 'd', and 'e', as they write to them. However, they also have a data dependency on 'a' and 'N', as they read those.
Previously, the testcases exited 'a' on queue 10 and 'N' on queue 15, meaning that it was possible for the aforementioned kernels to execute and to have 'a' and 'N' pulled out from under their feet.
This patch adds waits for each of the kernels onto queue 10 before freeing 'a', guaranteeing that 'a' outlives the kernels, and the same on 'N'.
libgomp/ChangeLog:
* testsuite/libgomp.oacc-c-c++-common/data-2-lib.c (explanatory header): Fix typo.
(main): Insert waits on kernels reading 'a' into queue 10 before exiting 'a', and waits on kernels reading 'N' into queue 15 before exiting 'N'.
* testsuite/libgomp.oacc-c-c++-common/data-2.c: Ditto.
|
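The ordering problem can be shown with a much smaller, hypothetical example than data-2.c: one kernel on queue 11 reads an array that is managed on queue 10. The function name `doubled` and the data layout are assumptions for illustration. Without -fopenacc the pragmas are ignored and the loop runs on the host.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example: 'a' is uploaded and freed on queue 10, while the
// kernel that reads it runs on queue 11.
std::vector<float> doubled(const std::vector<float>& input) {
    const int n = static_cast<int>(input.size());
    std::vector<float> output(input.size());
    const float *a = input.data();
    float *b = output.data();

    #pragma acc enter data copyin(a[0:n]) async(10)
    #pragma acc enter data create(b[0:n]) async(11)

    // The kernel waits for the upload on queue 10, then runs on queue 11.
    #pragma acc parallel loop present(a[0:n], b[0:n]) wait(10) async(11)
    for (int i = 0; i < n; ++i)
        b[i] = 2.0f * a[i];

    // Enqueue a wait on queue 10 for queue 11, so that 'a' is only freed
    // after the kernel reading it has finished.
    #pragma acc wait(11) async(10)
    #pragma acc exit data delete(a[0:n]) async(10)

    #pragma acc exit data copyout(b[0:n]) async(11)
    #pragma acc wait
    return output;
}
```

Dropping the `wait(11) async(10)` line would recreate the race described above: the delete on queue 10 could run while the kernel on queue 11 is still reading 'a'.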
||
|
|
f9c01c47f8 |
ipa: Remove LTO requirement for builtin callback carriers.
Due to the if statement in ipa_compute_jump_functions_for_bb, callback edges were never constructed for builtin functions unless LTO was enabled. This patch corrects this behavior, allowing GCC to optimize callbacks more broadly. It also extends our testing capabilities.
gcc/ChangeLog:
* attr-callback.cc (callback_edge_callee_has_attr): New function.
* attr-callback.h (callback_edge_callee_has_attr): New function decl.
* ipa-prop.cc (ipa_compute_jump_functions_for_bb): Don't skip callback carriers when calculating jump functions.
libgomp/ChangeLog:
* testsuite/libgomp.c/ipcp-cb-spec1.c: Remove LTO requirement.
* testsuite/libgomp.c/ipcp-cb-spec2.c: Likewise.
* testsuite/libgomp.c/ipcp-cb1.c: Likewise.
Signed-off-by: Josef Melcr <josef.melcr@suse.com>
|
||
|
|
edc821b60c |
OpenMP/OpenACC tests vs. C++26
OpenMP/OpenACC array sections, generally expr[expr:expr] or
expr[expr:expr:expr], can have any of the exprs between [ and ]
omitted: the first (low-bound) defaults to 0, the last (stride) defaults
to 1, and the middle (length) for some arrays defaults to
ceil((size − lower_bound)/stride).
People have been writing this for years without spaces between [ and :
and between : and ] when that expr has been omitted, but I guess for C++26
one needs to add a space. I think [ :: ] isn't going to be parsed
the same as [ : : ] either.
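A small sketch of the spelling change follows. The commit message does not say why C++26 needs the space; one likely reason, stated here as an assumption, is that C++26 reflection treats `[:` and `:]` as single tokens. The function `sum_first` is hypothetical; without -fopenmp the pragma is ignored and the loop runs on the host.

```cpp
#include <cassert>

// Hypothetical helper: sums the first n elements of 'a' in a target region.
int sum_first(const int *a, int n) {
    assert(n >= 0);
    int sum = 0;
    // The array section omits its lower bound. It is written "a[ :n]"
    // with a space after '[' instead of the older "a[:n]" spelling.
    #pragma omp target map(to: a[ :n]) map(tofrom: sum)
    for (int i = 0; i < n; ++i)
        sum += a[i];
    return sum;
}
```

The same idea applies to the closing side: a section with an omitted trailing expression would be written `: ]` instead of `:]`.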
gcc/testsuite/
* c-c++-common/goacc/cache-3-1.c: Add dg-skip-if for c++26.
* g++.dg/goacc/data-clause-2.C: Likewise.
* g++.dg/gomp/allocate-3.C: Likewise.
* c-c++-common/gomp/affinity-2.c: Use { c || c++23_down } effective
target.
* c-c++-common/goacc/cache-3-2.c: Replace [: in OpenMP or OpenACC
pragmas or attributes with [ : and :] with : ].
* c-c++-common/goacc/data-clause-1.c: Likewise.
* c-c++-common/goacc/data-clause-2.c: Likewise.
* c-c++-common/goacc/data-clause-duplicate-1.c: Likewise.
* c-c++-common/goacc/mdc-2.c: Likewise.
* c-c++-common/goacc/readonly-1.c: Likewise.
* c-c++-common/gomp/allocate-4.c: Likewise.
* c-c++-common/gomp/clauses-3.c: Likewise.
* c-c++-common/gomp/declare-mapper-3.c: Likewise.
* c-c++-common/gomp/depend-1.c: Likewise.
* c-c++-common/gomp/depend-2.c: Likewise.
* c-c++-common/gomp/depend-3.c: Likewise.
* c-c++-common/gomp/depend-4.c: Likewise.
* c-c++-common/gomp/depend-5.c: Likewise.
* c-c++-common/gomp/depend-6.c: Likewise.
* c-c++-common/gomp/dispatch-1.c: Likewise.
* c-c++-common/gomp/loop-5.c: Likewise.
* c-c++-common/gomp/map-1.c: Likewise.
* c-c++-common/gomp/map-2.c: Likewise.
* c-c++-common/gomp/map-4.c: Likewise.
* c-c++-common/gomp/map-7.c: Likewise.
* c-c++-common/gomp/pr100902-1.c: Likewise.
* c-c++-common/gomp/pr103642.c: Likewise.
* c-c++-common/gomp/pr120180-1.c: Likewise.
* c-c++-common/gomp/pr61486-1.c: Likewise.
* c-c++-common/gomp/pr81006.c: Likewise.
* c-c++-common/gomp/pr91920.c: Likewise.
* c-c++-common/gomp/pr96867.c: Likewise.
* c-c++-common/gomp/pr99928-16.c: Likewise.
* c-c++-common/gomp/reduction-1.c: Likewise.
* c-c++-common/gomp/scan-1.c: Likewise.
* c-c++-common/gomp/target-data-1.c: Likewise.
* c-c++-common/gomp/target-enter-data-1.c: Likewise.
* c-c++-common/gomp/target-has-device-addr-1.c: Likewise.
* c-c++-common/gomp/target-implicit-map-2.c: Likewise.
* c-c++-common/gomp/target-map-iterators-1.c: Likewise.
* c-c++-common/gomp/target-map-iterators-3.c: Likewise.
* c-c++-common/gomp/target-update-iterators-1.c: Likewise.
* c-c++-common/gomp/target-update-iterators-3.c: Likewise.
* g++.dg/goacc/cache-3-1.C: Likewise.
* g++.dg/goacc/cache-3-2.C: Likewise.
* g++.dg/goacc/data-clause-1.C: Likewise.
* g++.dg/goacc/mdc.C: Likewise.
* g++.dg/gomp/array-section-2.C: Likewise.
* g++.dg/gomp/bad-array-section-10.C: Likewise.
* g++.dg/gomp/bad-array-section-11.C: Likewise.
* g++.dg/gomp/bad-array-section-9.C: Likewise.
* g++.dg/gomp/declare-mapper-1.C: Likewise.
* g++.dg/gomp/declare-mapper-2.C: Likewise.
* g++.dg/gomp/depend-1.C: Likewise.
* g++.dg/gomp/depend-2.C: Likewise.
* g++.dg/gomp/ind-base-3.C: Likewise.
* g++.dg/gomp/map-1.C: Likewise.
* g++.dg/gomp/map-2.C: Likewise.
* g++.dg/gomp/map-ptrmem-1.C: Likewise.
* g++.dg/gomp/map-ptrmem-2.C: Likewise.
* g++.dg/gomp/member-array-2.C: Likewise.
* g++.dg/gomp/target-this-3.C: Likewise.
* g++.dg/gomp/target-this-4.C: Likewise.
libgomp/
* testsuite/libgomp.c++/allocate-1.C: Replace [: in OpenMP or OpenACC
pragmas or attributes with [ : and :] with : ].
* testsuite/libgomp.c++/baseptrs-3.C: Likewise.
* testsuite/libgomp.c++/baseptrs-5.C: Likewise.
* testsuite/libgomp.c++/class-array-1.C: Likewise.
* testsuite/libgomp.c++/examples-4/target_data-5.C: Likewise.
* testsuite/libgomp.c++/lvalue-tofrom-2.C: Likewise.
* testsuite/libgomp.c++/pr101544-1.C: Likewise.
* testsuite/libgomp.c++/pr108286.C: Likewise.
* testsuite/libgomp.c++/reduction-10.C: Likewise.
* testsuite/libgomp.c++/reduction-11.C: Likewise.
* testsuite/libgomp.c++/reduction-12.C: Likewise.
* testsuite/libgomp.c++/reduction-5.C: Likewise.
* testsuite/libgomp.c++/reduction-6.C: Likewise.
* testsuite/libgomp.c++/reduction-7.C: Likewise.
* testsuite/libgomp.c++/reduction-8.C: Likewise.
* testsuite/libgomp.c++/reduction-9.C: Likewise.
* testsuite/libgomp.c++/target-18.C: Likewise.
* testsuite/libgomp.c++/target-19.C: Likewise.
* testsuite/libgomp.c++/target-2.C: Likewise.
* testsuite/libgomp.c++/target-22.C: Likewise.
* testsuite/libgomp.c++/target-23.C: Likewise.
* testsuite/libgomp.c++/target-9.C: Likewise.
* testsuite/libgomp.c++/target-flex-100.C: Likewise.
* testsuite/libgomp.c++/target-flex-101.C: Likewise.
* testsuite/libgomp.c++/target-flex-12.C: Likewise.
* testsuite/libgomp.c++/target-flex-2003.C: Likewise.
* testsuite/libgomp.c++/target-flex-30.C: Likewise.
* testsuite/libgomp.c++/target-flex-300.C: Likewise.
* testsuite/libgomp.c++/target-flex-32.C: Likewise.
* testsuite/libgomp.c++/target-flex-33.C: Likewise.
* testsuite/libgomp.c++/target-flex-41.C: Likewise.
* testsuite/libgomp.c++/target-flex-60.C: Likewise.
* testsuite/libgomp.c++/target-flex-61.C: Likewise.
* testsuite/libgomp.c++/target-flex-62.C: Likewise.
* testsuite/libgomp.c++/target-flex-80.C: Likewise.
* testsuite/libgomp.c++/target-flex-81.C: Likewise.
* testsuite/libgomp.c++/target-has-device-addr-7.C: Likewise.
* testsuite/libgomp.c++/target-in-reduction-1.C: Likewise.
* testsuite/libgomp.c++/target-in-reduction-2.C: Likewise.
* testsuite/libgomp.c++/target-lambda-1.C: Likewise.
* testsuite/libgomp.c++/target-lambda-3.C: Likewise.
* testsuite/libgomp.c++/target-map-class-1.C: Likewise.
* testsuite/libgomp.c++/target-std__array-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__bitset-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__deque-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__flat_map-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__flat_multimap-concurrent.C:
Likewise.
* testsuite/libgomp.c++/target-std__flat_multiset-concurrent.C:
Likewise.
* testsuite/libgomp.c++/target-std__flat_set-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__forward_list-concurrent.C:
Likewise.
* testsuite/libgomp.c++/target-std__list-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__map-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__multimap-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__multiset-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__set-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__span-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__unordered_map-concurrent.C:
Likewise.
* testsuite/libgomp.c++/target-std__unordered_multimap-concurrent.C:
Likewise.
* testsuite/libgomp.c++/target-std__unordered_multiset-concurrent.C:
Likewise.
* testsuite/libgomp.c++/target-std__unordered_set-concurrent.C:
Likewise.
* testsuite/libgomp.c++/target-std__valarray-1.C: Likewise.
* testsuite/libgomp.c++/target-std__valarray-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__vector-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-this-3.C: Likewise.
* testsuite/libgomp.c++/target-this-4.C: Likewise.
* testsuite/libgomp.c++/target-virtual-1.C: Likewise.
* testsuite/libgomp.c++/task-reduction-11.C: Likewise.
* testsuite/libgomp.c++/task-reduction-12.C: Likewise.
* testsuite/libgomp.c++/task-reduction-13.C: Likewise.
* testsuite/libgomp.c++/task-reduction-17.C: Likewise.
* testsuite/libgomp.c++/task-reduction-18.C: Likewise.
* testsuite/libgomp.c++/task-reduction-19.C: Likewise.
* testsuite/libgomp.c++/task-reduction-4.C: Likewise.
* testsuite/libgomp.c++/task-reduction-5.C: Likewise.
* testsuite/libgomp.c++/task-reduction-6.C: Likewise.
* testsuite/libgomp.c++/task-reduction-7.C: Likewise.
* testsuite/libgomp.c++/taskloop-reduction-2.C: Likewise.
* testsuite/libgomp.c++/taskloop-reduction-3.C: Likewise.
* testsuite/libgomp.c++/taskloop-reduction-4.C: Likewise.
* testsuite/libgomp.c-c++-common/allocate-1.c: Likewise.
* testsuite/libgomp.c-c++-common/allocate-3.c: Likewise.
* testsuite/libgomp.c-c++-common/baseptrs-2.c: Likewise.
* testsuite/libgomp.c-c++-common/dispatch-1.c: Likewise.
* testsuite/libgomp.c-c++-common/dispatch-2.c: Likewise.
* testsuite/libgomp.c-c++-common/interop-2.c: Likewise.
* testsuite/libgomp.c-c++-common/matrix-omp-target-teams-distribute-parallel-for-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/ptr-attach-1.c: Likewise.
* testsuite/libgomp.c-c++-common/ptr-attach-2.c: Likewise.
* testsuite/libgomp.c-c++-common/refcount-1.c: Likewise.
* testsuite/libgomp.c-c++-common/struct-elem-4.c: Likewise.
* testsuite/libgomp.c-c++-common/target-2.c: Likewise.
* testsuite/libgomp.c-c++-common/target-has-device-addr-1.c: Likewise.
* testsuite/libgomp.c-c++-common/target-implicit-map-2.c: Likewise.
* testsuite/libgomp.c-c++-common/target-implicit-map-5.c: Likewise.
* testsuite/libgomp.c-c++-common/target-in-reduction-1.c: Likewise.
* testsuite/libgomp.c-c++-common/target-in-reduction-2.c: Likewise.
* testsuite/libgomp.c-c++-common/target-map-iterators-1.c: Likewise.
* testsuite/libgomp.c-c++-common/target-map-iterators-2.c: Likewise.
* testsuite/libgomp.c-c++-common/target-map-iterators-3.c: Likewise.
* testsuite/libgomp.c-c++-common/target-map-zlas-1.c: Likewise.
* testsuite/libgomp.c-c++-common/target-update-iterators-1.c: Likewise.
* testsuite/libgomp.c-c++-common/target-update-iterators-2.c: Likewise.
* testsuite/libgomp.c-c++-common/target-update-iterators-3.c: Likewise.
* testsuite/libgomp.c-c++-common/task-reduction-11.c: Likewise.
* testsuite/libgomp.c-c++-common/task-reduction-12.c: Likewise.
* testsuite/libgomp.c-c++-common/task-reduction-16.c: Likewise.
* testsuite/libgomp.c-c++-common/task-reduction-3.c: Likewise.
* testsuite/libgomp.c-c++-common/task-reduction-7.c: Likewise.
* testsuite/libgomp.c-c++-common/task-reduction-9.c: Likewise.
* testsuite/libgomp.c-c++-common/taskloop-reduction-2.c: Likewise.
* testsuite/libgomp.c-c++-common/teams-nteams-icv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-16.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c: Likewise.
|
||
|
|
62174ec27b |
openmp, nvptx: ompx_gnu_managed_mem_alloc
This adds support for using CUDA Managed Memory with omp_alloc. AMD support will be added in a future patch.
There is one new predefined allocator, "ompx_gnu_managed_mem_alloc", plus a corresponding memory space, which can be used to allocate memory in the "managed" space.
The nvptx plugin is modified to make the necessary CUDA calls, via two new (optional) plugin interfaces.
gcc/fortran/ChangeLog:
* openmp.cc (is_predefined_allocator): Use GOMP_OMP_PREDEF_ALLOC_MAX and GOMP_OMPX_PREDEF_ALLOC_MIN/MAX instead of hardcoded values in the comment.
include/ChangeLog:
* cuda/cuda.h (cuMemAllocManaged): Add declaration and related CU_MEM_ATTACH_GLOBAL flag.
* gomp-constants.h (GOMP_OMPX_PREDEF_ALLOC_MAX): Update to 201.
(GOMP_OMP_PREDEF_MEMSPACE_MAX): New constant.
(GOMP_OMPX_PREDEF_MEMSPACE_MIN): New constant.
(GOMP_OMPX_PREDEF_MEMSPACE_MAX): New constant.
libgomp/ChangeLog:
* allocator.c (ompx_gnu_max_predefined_alloc): Update to ompx_gnu_managed_mem_alloc.
(_Static_assert): Fix assertion messages for allocators and add new assertions for memspace constants.
(omp_max_predefined_mem_space): New define.
(ompx_gnu_min_predefined_mem_space): New define.
(ompx_gnu_max_predefined_mem_space): New define.
(MEMSPACE_ALLOC): Add check for non-standard memspaces.
(MEMSPACE_CALLOC): Likewise.
(MEMSPACE_REALLOC): Likewise.
(MEMSPACE_VALIDATE): Likewise.
(predefined_ompx_gnu_alloc_mapping): Add ompx_gnu_managed_mem_space.
(omp_init_allocator): Add ompx_gnu_managed_mem_space validation.
* config/gcn/allocator.c (gcn_memspace_alloc): Add check for non-standard memspaces.
(gcn_memspace_calloc): Likewise.
(gcn_memspace_realloc): Likewise.
(gcn_memspace_validate): Update to validate standard vs non-standard memspaces.
* config/linux/allocator.c (linux_memspace_alloc): Add managed memory space handling.
(linux_memspace_calloc): Likewise.
(linux_memspace_free): Likewise.
(linux_memspace_realloc): Likewise (returns NULL for fallback).
* config/nvptx/allocator.c (nvptx_memspace_alloc): Add check for non-standard memspaces.
(nvptx_memspace_calloc): Likewise.
(nvptx_memspace_realloc): Likewise.
(nvptx_memspace_validate): Update to validate standard vs non-standard memspaces.
* env.c (parse_allocator): Add ompx_gnu_managed_mem_alloc, ompx_gnu_managed_mem_space, and some static asserts so I don't forget them again.
* libgomp-plugin.h (GOMP_OFFLOAD_managed_alloc): New declaration.
(GOMP_OFFLOAD_managed_free): New declaration.
* libgomp.h (gomp_managed_alloc): New declaration.
(gomp_managed_free): New declaration.
(struct gomp_device_descr): Add managed_alloc_func and managed_free_func fields.
* libgomp.texi: Document ompx_gnu_managed_mem_alloc and ompx_gnu_managed_mem_space, add C++ template documentation, and describe NVPTX and AMD support.
* omp.h.in: Add ompx_gnu_managed_mem_space and ompx_gnu_managed_mem_alloc enumerators, and gnu_managed_mem C++ allocator template.
* omp_lib.f90.in: Add Fortran bindings for new allocator and memory space.
* omp_lib.h.in: Likewise.
* plugin/cuda-lib.def: Add cuMemAllocManaged.
* plugin/plugin-nvptx.c (nvptx_alloc): Add managed parameter to support cuMemAllocManaged.
(GOMP_OFFLOAD_alloc): Move contents to ...
(cleanup_and_alloc): ... this new function, and add managed support.
(GOMP_OFFLOAD_managed_alloc): New function.
(GOMP_OFFLOAD_managed_free): New function.
* target.c (gomp_managed_alloc): New function.
(gomp_managed_free): New function.
(gomp_load_plugin_for_device): Load optional managed_alloc and managed_free plugin APIs.
* testsuite/lib/libgomp.exp: Add check_effective_target_omp_managedmem.
* testsuite/libgomp.c++/alloc-managed-1.C: New test.
* testsuite/libgomp.c/alloc-managed-1.c: New test.
* testsuite/libgomp.c/alloc-managed-2.c: New test.
* testsuite/libgomp.c/alloc-managed-3.c: New test.
* testsuite/libgomp.c/alloc-managed-4.c: New test.
* testsuite/libgomp.fortran/alloc-managed-1.f90: New test.
Co-authored-by: Kwok Cheung Yeung <kcyeung@baylibre.com>
Co-authored-by: Thomas Schwinge <tschwinge@baylibre.com>
|
||
|
|
2de6462c38 |
libgomp.{c-c++-common,fortran}/target-is-accessible-1.c: Fix testcases for omp_default_device [PR119677]
Commit r16-5188-g5da963d988e8ea added omp_default_device such that -5 became a conforming device number, but the tests used it as a non-conforming number; now -6 is used instead.
libgomp/ChangeLog:
PR libgomp/119677
* testsuite/libgomp.c-c++-common/target-is-accessible-1.c: Modify test as -5 is now a conforming device number.
* testsuite/libgomp.fortran/target-is-accessible-1.f90: Likewise.
|
||
|
|
5da963d988 |
OpenMP: Add omp_default_device named constant [PR119677]
OpenMP TR 14 (OpenMP 6.1) adds omp_default_device < -1 as a named constant alongside omp_initial_device and omp_invalid_device. GCC already supports it internally via GOMP_DEVICE_DEFAULT_OMP_61, but this patch now adds the omp_default_device enum/PARAMETER to omp.h / omp_lib.
Note that PR119677 requests some cleanups, which still have to be done.
PR libgomp/119677
gcc/fortran/ChangeLog:
* intrinsic.texi (OpenMP Modules): Add omp_default_device.
* openmp.cc (gfc_resolve_omp_context_selector): Accept omp_default_device as conforming device number.
libgomp/ChangeLog:
* omp.h.in (omp_default_device): New enum value.
* omp_lib.f90.in: New parameter.
* omp_lib.h.in: Likewise.
* target.c (gomp_get_default_device): New. Split off from ...
(resolve_device): ... here; call it.
(omp_target_alloc, omp_target_free, omp_target_is_present, omp_target_memcpy_check, omp_target_memset, omp_target_memset_async, omp_target_associate_ptr, omp_get_mapped_ptr, omp_target_is_accessible, omp_pause_resource, omp_get_uid_from_device): Handle omp_default_device.
* testsuite/libgomp.c/device_uid.c: Likewise.
* testsuite/libgomp.fortran/device_uid.f90: Likewise.
* testsuite/libgomp.c-c++-common/omp-default-device.c: New test.
* testsuite/libgomp.fortran/omp-default-device.f90: New test.
|
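The idea of a named "default device" constant can be sketched as a sentinel that is replaced by the current default device before a device number is used. This is a hypothetical illustration, not libgomp's resolve_device: the names `kInitialDevice`, `kDefaultDevice`, and `resolve` are made up, the value -5 is taken from the follow-up test fix in this log, and -1 for the initial device is an assumption.

```cpp
#include <cassert>

// Assumed values for illustration: -1 for the initial (host) device and
// -5 for the default-device sentinel.
constexpr int kInitialDevice = -1;
constexpr int kDefaultDevice = -5;

// Hypothetical resolver: maps the sentinel to the current default device
// and passes every other device number through unchanged.
int resolve(int requested, int current_default) {
    if (requested == kDefaultDevice)
        return current_default;
    return requested;
}
```

A caller can then pass the sentinel wherever a device number is expected and let the runtime substitute whatever the default device is at that point.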
||
|
|
90f2ab4b6e |
libgomp.c++/target-std__multimap-concurrent.C: Fix USM memory freeing
Fix the unified-shared memory test,
libgomp.c++/target-std__multimap-concurrent-usm.C
added in commit r16-1010-g83ca283853f195
libgomp: Add testcases for concurrent access to standard C++ containers
on offload targets, a number of USM variants
This test includes the actual code of target-std__multimap-concurrent.C.
The issue is that multimap.insert allocates memory – which is freed by
the destructor. However, if the memory is allocated on a device
('insert'), it also needs to be freed there ('clear') as in general
freeing device-allocated memory is not possible on the host.
libgomp/ChangeLog:
* testsuite/libgomp.c++/target-std__multimap-concurrent.C: Fix memory
freeing of device allocated memory with USM.
|
||
|
|
28d20a591d |
libgomp.fortran/omp_target_memset.f90 - Avoid implicit mapping by an uninit size [PR122543]
In OpenMP, pointers are implicitly mapped - which means for Fortran that their pointer target is also mapped. However, for uninitialized memory, this means that some random pointee with some random amount of memory is copied - in the good case, size == 0, but if not, odd things can happen. Solution: Use 'fptr => null()' before the target mapping or - as done here - declare the pointer inside the region. libgomp/ChangeLog: PR libgomp/122543 * testsuite/libgomp.fortran/omp_target_memset.f90: Move fptr inside the target to avoid implicit mapping of its uninit pointee. * testsuite/libgomp.fortran/omp_target_memset-2.f90: Likewise. |
||
|
|
7a25badc5d |
Fix 'libgomp.c/pr122281.c' for non-USM offloading execution [PR122281]
... where it currently runs into:
libgomp: cuCtxSynchronize error: an illegal memory access was encountered
... for nvptx, or similarly for GCN:
Memory access fault by GPU node-1 (Agent handle: 0x34d77290) on address 0x7fff3c553000. Reason: Page not present or supervisor privilege.
Fix-up for commit r16-4961-ge2cbcd1b27c0da92bdcd96664064d3d0c1d44e6f
"Fix gimple_copy for OpenMP atomic load/store [PR122281, PR105001]".
PR libgomp/122281
libgomp/
* testsuite/libgomp.c/pr122281.c: Fix for non-USM offloading
execution.
|
||
|
|
e2cbcd1b27 |
Fix gimple_copy for OpenMP atomic load/store [PR122281, PR105001]
PR libgomp/122281 PR middle-end/105001 gcc/ChangeLog: * gimple.cc (gimple_copy): Add missing unshare_expr for GIMPLE_OMP_ATOMIC_LOAD and GIMPLE_OMP_ATOMIC_STORE. |
||
|
|
9e5a9aa490 |
libgomp: fine-grained pinned memory allocator
This patch introduces a new custom memory allocator for use with pinned memory (in the case where the Cuda allocator isn't available). In future, this allocator will also be used for Managed Memory. Both memories are incompatible with the system malloc because allocated memory cannot share a page with memory allocated for other purposes. This means that small allocations will no longer consume an entire page of pinned memory. Unfortunately, it also means that pinned memory pages will never be unmapped (although they may be reused). This isn't a technical limitation; the "free" algorithm could be extended in future, if needed. The implementation is not perfect; there are various corner cases (especially related to extending onto new pages) where allocations and reallocations may be sub-optimal, but it should still be a step forward in support for small allocations. I have considered using libmemkind's "fixed" memory but rejected it for three reasons: 1) libmemkind may not always be present at runtime, 2) there's no currently documented means to extend a "fixed" kind one page at a time (although the code appears to have an undocumented function that may do the job, and/or extending libmemkind to support the MAP_LOCKED mmap flag with its regular kinds would be straight-forward), 3) Managed Memory benefits from having the metadata located in different memory and using an external implementation makes it hard to guarantee this. libgomp/ChangeLog: * Makefile.am (libgomp_la_SOURCES): Add simple-allocator.c. * Makefile.in: Regenerate. * basic-allocator.c: Mention simple-allocator in the comment. * config/linux/allocator.c: Include unistd.h. (pin_ctx): New variable. (ctxlock): New variable. (linux_init_pin_ctx): New function. (linux_memspace_alloc): Use simple-allocator for pinned memory. (linux_memspace_free): Likewise. (linux_memspace_realloc): Likewise. * libgomp.h (gomp_simple_alloc_init_context): New prototype. (gomp_simple_alloc_register_memory): New prototype. 
(gomp_simple_alloc): New prototype. (gomp_simple_free): New prototype. (gomp_simple_realloc): New prototype. * libgomp.texi: Update pinned memory trait documentation. * testsuite/libgomp.c/alloc-pinned-8.c: New test. * simple-allocator.c: New file. |
||
|
|
3b8d9d579c |
libgomp, nvptx: Cuda pinned memory
Use Cuda to pin memory, instead of Linux mlock, when available. There are two advantages: firstly, this gives a significant speed boost for NVPTX offloading, and secondly, it side-steps the usual OS ulimit/rlimit setting. The design adds a device independent plugin API for allocating pinned memory, and then implements it for NVPTX. At present, the other supported devices do not have equivalent capabilities (or requirements). libgomp/ChangeLog: * config/linux/allocator.c: Include assert.h. (using_device_for_page_locked): New variable. (linux_memspace_alloc): Add init0 parameter. Support device pinning. (linux_memspace_calloc): Set init0 to true. (linux_memspace_free): Support device pinning. (linux_memspace_realloc): Support device pinning. (MEMSPACE_ALLOC): Set init0 to false. * libgomp-plugin.h (GOMP_OFFLOAD_page_locked_host_alloc): New prototype. (GOMP_OFFLOAD_page_locked_host_free): Likewise. * libgomp.h (gomp_page_locked_host_alloc): Likewise. (gomp_page_locked_host_free): Likewise. (struct gomp_device_descr): Add page_locked_host_alloc_func and page_locked_host_free_func. * libgomp.texi: Adjust the docs for the pinned trait. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_page_locked_host_alloc): New function. (GOMP_OFFLOAD_page_locked_host_free): Likewise. * target.c (device_for_page_locked): New variable. (get_device_for_page_locked): New function. (gomp_page_locked_host_alloc): Likewise. (gomp_page_locked_host_free): Likewise. (gomp_load_plugin_for_device): Add page_locked_host_alloc and page_locked_host_free. * testsuite/libgomp.c/alloc-pinned-1.c: Change expectations for NVPTX devices. * testsuite/libgomp.c/alloc-pinned-2.c: Likewise. * testsuite/libgomp.c/alloc-pinned-3.c: Likewise. * testsuite/libgomp.c/alloc-pinned-4.c: Likewise. * testsuite/libgomp.c/alloc-pinned-5.c: Likewise. * testsuite/libgomp.c/alloc-pinned-6.c: Likewise. Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com> |
||
|
|
7292780932 |
testsuite: Move ipcp-cb* from ipa to libgomp
This patch addresses the incorrectly placed tests, which fail if the testsuite is run and gcc has not been installed yet, as discussed here: https://gcc.gnu.org/pipermail/gcc-patches/2025-October/698095.html. gcc/testsuite/ChangeLog: * gcc.dg/ipa/ipcp-cb-spec1.c: Moved to libgomp/testsuite/libgomp.c/. * gcc.dg/ipa/ipcp-cb-spec2.c: Likewise. * gcc.dg/ipa/ipcp-cb1.c: Likewise. libgomp/ChangeLog: * testsuite/libgomp.c/ipcp-cb-spec1.c: Moved from gcc/testsuite/gcc.dg/ipa/. * testsuite/libgomp.c/ipcp-cb-spec2.c: Likewise. * testsuite/libgomp.c/ipcp-cb1.c: Likewise. Signed-off-by: Josef Melcr <jmelcr02@gmail.com> |
||
|
|
651df6b43e |
c++, gimplify: Implement C++26 P2795R5 - Erroneous behavior for uninitialized reads: Adjust 'libgomp.c++/{target-flex-101.C,target-std__flat_map-concurrent.C,target-std__flat_multimap-concurrent.C}' [PR114457, PR122268, PR120450]
With commit r16-4212-gf256a13f8aed833fe964a2ba541b7b30ad9b4a76
"c++, gimplify: Implement C++26 P2795R5 - Erroneous behavior for uninitialized reads [PR114457]",
we acquired:
{+FAIL: libgomp.c++/target-flex-101.C (internal compiler error: in assign_temp, at function.cc:990)+}
[-PASS:-]{+FAIL:+} libgomp.c++/target-flex-101.C (test for excess errors)
[-PASS:-]{+UNRESOLVED:+} libgomp.c++/target-flex-101.C [-execution test-]{+compilation failed to produce executable+}
... for GCN, nvptx offloading compilation, and on the other hand:
[-XFAIL:-]{+XPASS:+} libgomp.c++/target-std__flat_map-concurrent.C (internal compiler error[-: in assign_temp, at function.cc:990)-]
[-XFAIL:-]{+XPASS:+} libgomp.c++/target-std__flat_map-concurrent.C (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} libgomp.c++/target-std__flat_map-concurrent.C [-compilation failed to produce executable-]{+execution test+}
[-XFAIL:-]{+XPASS:+} libgomp.c++/target-std__flat_multimap-concurrent.C (internal compiler error[-: in assign_temp, at function.cc:990)-]
[-XFAIL:-]{+XPASS:+} libgomp.c++/target-std__flat_multimap-concurrent.C (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} libgomp.c++/target-std__flat_multimap-concurrent.C [-compilation failed to produce executable-]{+execution test+}
... for GCN offloading compilation (already PASSed for nvptx).
Note that these test cases explicitly use '-std=c++23', so don't undergo the
new C++26 P2795R5 functionality. Yet, comparing before vs. after that commit,
in the 'gimple' dumps (that is, early host compilation), there are a lot of
changes where 'gimple_assign <constructor, [...], {CLOBBER(bob)}, NULL, NULL>'s
and relatedly 'gimple_bind's newly appear/no longer appear elsewhere. This
leads to correspondingly different code at the beginning of offloading
compilation. Why/how that now ('libgomp.c++/target-flex-101.C') vs. before
('libgomp.c++/{target-std__flat_map-concurrent.C,target-std__flat_multimap-concurrent.C}')
translates into 'expand' ICEs, I can't tell.
PR c++/114457
PR c++/122268
PR c++/120450
libgomp/
* testsuite/libgomp.c++/target-flex-101.C: XFAIL GCN, nvptx
offloading compilation.
* testsuite/libgomp.c++/target-std__flat_map-concurrent.C:
Un-XFAIL GCN offloading compilation.
* testsuite/libgomp.c++/target-std__flat_multimap-concurrent.C:
Likewise.
|
||
|
|
e1e5444ff2 |
libgomp.c/declare-variant-4-gfx*: Add missing archs + dg-excess-errors
Add missing tests for gfx* context selectors; mark all but the default-arch declare-variant-4.c with 'dg-excess-errors' to silence libgomp not-found errors (still passing the scan-offload-tree-dump check) - or at least causing just UNRESOLVED errors if the error is "built without library support ... consider compiling for the associated generic architecture". In case the multilib is configured, the result will be an XPASS. libgomp/ChangeLog: * testsuite/libgomp.c/declare-variant-4-gfx10-3-generic.c: Add dg-excess-errors to handle possible missing libgomp multi lib. * testsuite/libgomp.c/declare-variant-4-gfx1030.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx1036.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx11-generic.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx1100.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx1103.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx9-4-generic.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx9-generic.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx900.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx906.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx908.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx90a.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx90c.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx942.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx1031.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1032.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1033.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1034.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1035.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1101.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1102.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1150.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1151.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1152.c: New test. 
* testsuite/libgomp.c/declare-variant-4-gfx1153.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx902.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx904.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx909.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx950.c: New test. |
||
|
|
b3c0e9aadb |
gcn: Add missing GFX9_4_GENERIC, OpenMP context-selector update
The definition for gfx942 and gfx950 missed the GFX9_4_GENERIC
family flag.
For OpenMP context selectors: The t-omp-device file missed the
generic selectors.
Additionally, there is now a note in the OpenMP documentation that
there is a one-to-one match for ISA names, ignoring any compatibility.
For instance, for Nvidia GPUs 'isa("sm_70")' is only true when compiling
for 'sm_70', even though sm < 7.0 code also runs on sm_70 hardware.
And, for AMD GPUs, gfx9-4-generic neither matches 'gfx942'
(even though such generic code runs on gfx942) - nor the reverse
(although all gfx9-4-generic code runs on gfx942).
gcc/ChangeLog:
* config/gcn/gcn-devices.def (gfx942, gfx950): Set generic name
to GFX9_4_GENERIC.
* config/gcn/t-omp-device: Include generic names for OpenMP's
ISA trait.
libgomp/ChangeLog:
* libgomp.texi (OpenMP Context Selectors): Add note that there is
currently an exact match between ISA and compilation, ignoring
compatibilities in both ways.
* testsuite/libgomp.c/declare-variant-4.h: Add missing variant
functions for specific and generic AMD GPUs.
* testsuite/libgomp.c/declare-variant-4-gfx10-3-generic.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx11-generic.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx9-4-generic.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx9-generic.c: New test.
|
||
|
|
ac273977ad |
*: regenerate autotools
libatomic/ChangeLog: * Makefile.in: Regenerate. * aclocal.m4: Regenerate. * configure: Regenerate. * testsuite/Makefile.in: Regenerate. libcc1/ChangeLog: * Makefile.in: Regenerate. * aclocal.m4: Regenerate. libffi/ChangeLog: * Makefile.in: Regenerate. * aclocal.m4: Regenerate. * configure: Regenerate. * include/Makefile.in: Regenerate. * man/Makefile.in: Regenerate. * testsuite/Makefile.in: Regenerate. libgcobol/ChangeLog: * Makefile.in: Regenerate. * aclocal.m4: Regenerate. * configure: Regenerate. libgfortran/ChangeLog: * Makefile.in: Regenerate. * aclocal.m4: Regenerate. * configure: Regenerate. libgm2/ChangeLog: * Makefile.in: Regenerate. * aclocal.m4: Regenerate. * configure: Regenerate. * libm2cor/Makefile.in: Regenerate. * libm2iso/Makefile.in: Regenerate. * libm2log/Makefile.in: Regenerate. * libm2min/Makefile.in: Regenerate. * libm2pim/Makefile.in: Regenerate. libgomp/ChangeLog: * Makefile.in: Regenerate. * aclocal.m4: Regenerate. * configure: Regenerate. * testsuite/Makefile.in: Regenerate. libgrust/ChangeLog: * Makefile.in: Regenerate. * aclocal.m4: Regenerate. * configure: Regenerate. * libformat_parser/Makefile.in: Regenerate. * libproc_macro_internal/Makefile.in: Regenerate. libitm/ChangeLog: * Makefile.in: Regenerate. * aclocal.m4: Regenerate. * configure: Regenerate. * testsuite/Makefile.in: Regenerate. libobjc/ChangeLog: * aclocal.m4: Regenerate. * configure: Regenerate. libphobos/ChangeLog: * Makefile.in: Regenerate. * aclocal.m4: Regenerate. * configure: Regenerate. * libdruntime/Makefile.in: Regenerate. * src/Makefile.in: Regenerate. * testsuite/Makefile.in: Regenerate. libquadmath/ChangeLog: * Makefile.in: Regenerate. * aclocal.m4: Regenerate. * configure: Regenerate. libsanitizer/ChangeLog: * Makefile.in: Regenerate. * aclocal.m4: Regenerate. * asan/Makefile.in: Regenerate. * configure: Regenerate. * hwasan/Makefile.in: Regenerate. * interception/Makefile.in: Regenerate. * libbacktrace/Makefile.in: Regenerate. * lsan/Makefile.in: Regenerate. 
* sanitizer_common/Makefile.in: Regenerate. * tsan/Makefile.in: Regenerate. * ubsan/Makefile.in: Regenerate. libssp/ChangeLog: * Makefile.in: Regenerate. * aclocal.m4: Regenerate. * configure: Regenerate. libstdc++-v3/ChangeLog: * Makefile.in: Regenerate. * aclocal.m4: Regenerate. * configure: Regenerate. * doc/Makefile.in: Regenerate. * include/Makefile.in: Regenerate. * libsupc++/Makefile.in: Regenerate. * po/Makefile.in: Regenerate. * python/Makefile.in: Regenerate. * src/Makefile.in: Regenerate. * src/c++11/Makefile.in: Regenerate. * src/c++17/Makefile.in: Regenerate. * src/c++20/Makefile.in: Regenerate. * src/c++23/Makefile.in: Regenerate. * src/c++26/Makefile.in: Regenerate. * src/c++98/Makefile.in: Regenerate. * src/experimental/Makefile.in: Regenerate. * src/filesystem/Makefile.in: Regenerate. * src/libbacktrace/Makefile.in: Regenerate. * testsuite/Makefile.in: Regenerate. libvtv/ChangeLog: * Makefile.in: Regenerate. * aclocal.m4: Regenerate. * configure: Regenerate. * testsuite/Makefile.in: Regenerate. lto-plugin/ChangeLog: * Makefile.in: Regenerate. * aclocal.m4: Regenerate. * configure: Regenerate. |
||
|
|
97c1d2fa97 |
OpenMP: Unshare expr in context-selector condition [PR121922]
As the testcase shows, a missing unshare_expr caused the condition to be evaluated only once instead of every time a 'declare variant' was resolved. PR middle-end/121922 gcc/ChangeLog: * omp-general.cc (omp_dynamic_cond): Use 'unshare_expr' for the user condition. libgomp/ChangeLog: * testsuite/libgomp.c-c++-common/declare-variant-1.c: New test. Co-authored-by: Sandra Loosemore <sloosemore@baylibre.com> |
||
|
|
16d2b8881c |
libgomp: Add Fortran version of acc_copyout_finalize_async and acc_delete_finalize_async
OpenACC 2.5 added several functions for C and Fortran; while
acc_{copyout,delete}{,_finalize,_async} exist for both, for some
reason only the C version of acc_{copyout,delete}_finalize_async
was actually added, even though the documentation (.texi) and
the .map file also listed the auxiliary Fortran functions!
OpenACC 2.5 added the Fortran version with the following odd
interface: 'type, dimension(:[,:]...)'. In OpenACC 2.6, it
was then updated to the Fortran 2018 syntax:
'type(*), dimension(..)', which is also used in openacc.f90
internally.
This commit now also updates the documentation to the newer
syntax - plus fixes a function-name typo: acc_delete_async_finalize
should have the _async at the end not in the middle!
libgomp/ChangeLog:
* libgomp.map (OACC_2.5): Move previously unimplemented
acc_{copyout,delete}_finalize_async_{32,64,array}_h_ to ...
(OACC_2.6.1): ... here.
* libgomp.texi (acc_copyin, acc_present_or_copyin, acc_create,
acc_present_or_create, acc_copyout, acc_update_device,
acc_update_self, acc_is_present): Use 'type(*), dimension(..)'
instead of 'type, dimension(:[,:]...)' for Fortran.
(acc_delete): Likewise; change acc_delete_async_finalize to
acc_delete_finalize_async.
* openacc.f90 (openacc_internal): Add interfaces for
acc_{copyout,delete}_finalize_async_{{32,64,array}_h,_l}.
(openacc): Add generic interfaces for
acc_copyout_finalize_async and acc_delete_finalize_async.
(acc_{copyout,delete}_finalize_async_{32,64,array}_h): New.
* openacc_lib.h: Add generic interfaces for
acc_copyout_finalize_async and acc_delete_finalize_async.
* testsuite/libgomp.oacc-fortran/pr92970-1.f90: New test.
|
||
|
|
87262627fd |
openmp: Add support for iterators in 'target update' clauses (C/C++)
This adds support for iterators in 'to' and 'from' clauses in the 'target update' OpenMP directive. gcc/c/ * c-parser.cc (c_parser_omp_clause_from_to): Parse 'iterator' modifier. * c-typeck.cc (c_finish_omp_clauses): Finish iterators for to/from clauses. gcc/cp/ * parser.cc (cp_parser_omp_clause_from_to): Parse 'iterator' modifier. * semantics.cc (finish_omp_clauses): Finish iterators for to/from clauses. gcc/ * gimplify.cc (remove_unused_omp_iterator_vars): Display unused variable warning for 'to' and 'from' clauses. (gimplify_scan_omp_clauses): Add argument for iterator loop sequence. Gimplify the clause decl and size into the iterator loop if iterators are used. (gimplify_omp_workshare): Add argument for iterator loops sequence in call to gimplify_scan_omp_clauses. (gimplify_omp_target_update): Call remove_unused_omp_iterator_vars and build_omp_iterators_loops. Add loop sequence as argument when calling gimplify_scan_omp_clauses, gimplify_adjust_omp_clauses and building the Gimple statement. * tree-pretty-print.cc (dump_omp_clause): Call dump_omp_iterators for to/from clauses with iterators. * tree.cc (omp_clause_num_ops): Add extra operand for OMP_CLAUSE_FROM and OMP_CLAUSE_TO. * tree.h (OMP_CLAUSE_HAS_ITERATORS): Add check for OMP_CLAUSE_TO and OMP_CLAUSE_FROM. (OMP_CLAUSE_ITERATORS): Likewise. gcc/testsuite/ * c-c++-common/gomp/target-update-iterators-1.c: New. * c-c++-common/gomp/target-update-iterators-2.c: New. * c-c++-common/gomp/target-update-iterators-3.c: New. libgomp/ * target.c (gomp_update): Call gomp_merge_iterator_maps. Free allocated variables. * testsuite/libgomp.c-c++-common/target-update-iterators-1.c: New. * testsuite/libgomp.c-c++-common/target-update-iterators-2.c: New. * testsuite/libgomp.c-c++-common/target-update-iterators-3.c: New. |
||
|
|
8b8b0eada6 |
openmp: Add support for iterators in map clauses (C/C++)
This adds preliminary support for iterators in map clauses within OpenMP 'target' constructs (which includes constructs such as 'target enter data'). Iterators with non-constant loop bounds are not currently supported. gcc/c/ * c-parser.cc (c_parser_omp_variable_list): Use location of the map expression as the clause location. (c_parser_omp_clause_map): Parse 'iterator' modifier. * c-typeck.cc (c_finish_omp_clauses): Finish iterators. Apply iterators to generated clauses. gcc/cp/ * parser.cc (cp_parser_omp_clause_map): Parse 'iterator' modifier. * semantics.cc (finish_omp_clauses): Finish iterators. Apply iterators to generated clauses. gcc/ * gimple-pretty-print.cc (dump_gimple_omp_target): Print expanded iterator loops. * gimple.cc (gimple_build_omp_target): Add argument for iterator loops sequence. Initialize iterator loops field. * gimple.def (GIMPLE_OMP_TARGET): Set GSS symbol to GSS_OMP_TARGET. * gimple.h (gomp_target): Set GSS symbol to GSS_OMP_TARGET. Add extra field for iterator loops. (gimple_build_omp_target): Add argument for iterator loops sequence. (gimple_omp_target_iterator_loops): New. (gimple_omp_target_iterator_loops_ptr): New. (gimple_omp_target_set_iterator_loops): New. * gimplify.cc (find_var_decl): New. (copy_omp_iterator): New. (remap_omp_iterator_var_1): New. (remap_omp_iterator_var): New. (remove_unused_omp_iterator_vars): New. (struct iterator_loop_info_t): New type. (iterator_loop_info_map_t): New type. (build_omp_iterators_loops): New. (enter_omp_iterator_loop_context_1): New. (enter_omp_iterator_loop_context): New. (enter_omp_iterator_loop_context): New. (exit_omp_iterator_loop_context): New. (gimplify_adjust_omp_clauses): Add argument for iterator loop sequence. Gimplify the clause decl and size into the iterator loop if iterators are used. (gimplify_omp_workshare): Call remove_unused_omp_iterator_vars and build_omp_iterators_loops for OpenMP target expressions. 
Add loop sequence as argument when calling gimplify_adjust_omp_clauses and building the Gimple statement. * gimplify.h (enter_omp_iterator_loop_context): New prototype. (exit_omp_iterator_loop_context): New prototype. * gsstruct.def (GSS_OMP_TARGET): New. * omp-low.cc (lower_omp_map_iterator_expr): New. (lower_omp_map_iterator_size): New. (finish_omp_map_iterators): New. (lower_omp_target): Add sorry if iterators used with deep mapping. Call lower_omp_map_iterator_expr before assigning to sender ref. Call lower_omp_map_iterator_size before setting the size. Insert iterator loop sequence before the statements for the target clause. * tree-nested.cc (convert_nonlocal_reference_stmt): Walk the iterator loop sequence of OpenMP target statements. (convert_local_reference_stmt): Likewise. (convert_tramp_reference_stmt): Likewise. * tree-pretty-print.cc (dump_omp_iterators): Dump extra iterator information if present. (dump_omp_clause): Call dump_omp_iterators for iterators in map clauses. * tree.cc (omp_clause_num_ops): Add operand for OMP_CLAUSE_MAP. (walk_tree_1): Do not walk last operand of OMP_CLAUSE_MAP. * tree.h (OMP_CLAUSE_HAS_ITERATORS): New. (OMP_CLAUSE_ITERATORS): New. gcc/testsuite/ * c-c++-common/gomp/map-6.c (foo): Amend expected error message. * c-c++-common/gomp/target-map-iterators-1.c: New. * c-c++-common/gomp/target-map-iterators-2.c: New. * c-c++-common/gomp/target-map-iterators-3.c: New. * c-c++-common/gomp/target-map-iterators-4.c: New. libgomp/ * target.c (kind_to_name): New. (gomp_merge_iterator_maps): New. (gomp_map_vars_internal): Call gomp_merge_iterator_maps. Copy address of only the first iteration to target vars. Free allocated variables. * testsuite/libgomp.c-c++-common/target-map-iterators-1.c: New. * testsuite/libgomp.c-c++-common/target-map-iterators-2.c: New. * testsuite/libgomp.c-c++-common/target-map-iterators-3.c: New. Co-authored-by: Andrew Stubbs <ams@baylibre.com> |
||
|
|
04b42c4245 |
Adjust 'libgomp.c++/target-cdtor-{1,2}.C' for 'targetm.cxx.use_aeabi_atexit' [PR119853, PR119854]
Fix-up for commit
|
||
|
|
aa143261bd |
GCN, nvptx offloading: Restrain 'WARNING: program timed out.' while in 'dynamic_cast' only for effective-target 'offload_device' [PR119692]
In PR119692 "C++ 'typeinfo', 'vtable' vs. OpenACC, OpenMP 'target' offloading": > --- Comment #8 from Rainer Orth <ro at gcc dot gnu.org> --- > The last commit made things worse on sparc-sun-solaris2.11: since that one > (dg-timeout 10) I regularly get > > WARNING: libgomp.c++/target-exceptions-bad_cast-1.C (test for excess errors) > program timed out. > FAIL: libgomp.c++/target-exceptions-bad_cast-1.C (test for excess errors) > UNRESOLVED: libgomp.c++/target-exceptions-bad_cast-1.C compilation failed to produce executable > UNRESOLVED: libgomp.c++/target-exceptions-bad_cast-1.C scan-tree-dump-times optimized "gimple_call <__cxa_bad_cast, " 1 > > Before that, the test had no issue. Compiling the test on an unloaded system > usually takes less than 1 sec, but when fully loaded, times can go up. To keep things simple, let's restrict this temporary (yeah...) workaround to apply only for effective-target 'offload_device', just like the 'dg-xfail-run-if' itself. PR target/119692 libgomp/ * testsuite/libgomp.c++/pr119692-1-4.C: '{ dg-timeout 10 { target offload_device } }'. * testsuite/libgomp.c++/pr119692-1-5.C: Likewise. * testsuite/libgomp.c++/target-exceptions-bad_cast-1.C: Likewise. * testsuite/libgomp.c++/target-exceptions-bad_cast-2.C: Likewise. * testsuite/libgomp.oacc-c++/exceptions-bad_cast-1.C: Likewise. * testsuite/libgomp.oacc-c++/exceptions-bad_cast-2.C: Likewise. |
||
|
|
1e35a51825 |
Fortran/OpenACC: Add Fortran support for acc_attach/acc_detach
While C/C++ have supported the acc_attach{,_async} and
acc_detach{,_finalize}{,_async} routines for a long time, the Fortran
API routines were only added in OpenACC 3.3.
Unfortunately, they cannot directly be implemented in the library as
GCC will introduce a temporary array descriptor in some cases, which
causes the attachment to be attempted on this temporary variable instead
of on the original one.
Therefore, those API routines are handled in a special way in the compiler.
gcc/fortran/ChangeLog:
* trans-stmt.cc (gfc_trans_call_acc_attach_detach): New.
(gfc_trans_call): Call it.
libgomp/ChangeLog:
* libgomp.texi (acc_attach, acc_detach): Update for Fortran
version.
* openacc.f90 (acc_attach{,_async}, acc_detach{,_finalize}{,_async}):
Add.
* openacc_lib.h: Likewise.
* testsuite/libgomp.oacc-fortran/acc-attach-detach-1.f90: New test.
* testsuite/libgomp.oacc-fortran/acc-attach-detach-2.f90: New test.
|
||
|
|
ea43b99537 |
OpenMP: Fix implicit 'declare target' for <ostream>
libstdc++-v3/include/std/ostream contains:
namespace std _GLIBCXX_VISIBILITY(default)
{
...
template<typename _CharT, typename _Traits>
inline basic_ostream<_CharT, _Traits>&
endl(basic_ostream<_CharT, _Traits>& __os)
{ return flush(__os.put(__os.widen('\n'))); }
...
#include <bits/ostream.tcc>
and the latter, libstdc++-v3/include/bits/ostream.tcc, has:
// Inhibit implicit instantiations for required instantiations,
// which are defined via explicit instantiations elsewhere.
#if _GLIBCXX_EXTERN_TEMPLATE
extern template class basic_ostream<char>;
extern template ostream& endl(ostream&);
Before this commit, omp_discover_declare_target_tgt_fn_r marked 'endl'
as (implicitly) declare target - but not the calls in it due to the
'extern' (DECL_EXTERNAL).
Thanks to inlining, 'endl' is (therefore) not used and, hence,
discarded by the linker; thus, it works with -O0 and -O1. However,
as the (unused) function still exists, IPA CP (enabled by -O2) will try
to do constant-value propagation and fail as the definition of 'widen'
is not available.
The solution is to still walk 'endl' despite being an 'extern(al)' decl;
this has been restricted for now to DECL_DECLARED_INLINE_P.
gcc/ChangeLog:
* omp-offload.cc (omp_discover_declare_target_tgt_fn_r): Also
walk external functions that are declared inline (and have a
DECL_SAVED_TREE).
libgomp/ChangeLog:
* testsuite/libgomp.c++/declare_target-2.C: New test.
|
||
|
|
37b454b7e1 |
gcn: Add experimental MI300 (gfx942) support
As gfx942 and gfx950 belong to gfx9-4-generic, the latter two are also added. Note that there are no specific optimizations for MI300, yet. No multilib is built by default for any of the mentioned devices; use '--with-multilib-list=' when configuring GCC to build them alongside. gfx942 was added in LLVM (and its mc assembler, used by GCC) in version 18, generic support in LLVM 19 and gfx950 in LLVM 20. gcc/ChangeLog: * config/gcn/gcn-devices.def: Add gfx942, gfx950 and gfx9-4-generic. * config/gcn/gcn-opts.h (TARGET_CDNA3, TARGET_CDNA3_PLUS, TARGET_GLC_NAME, TARGET_TARGET_SC_CACHE): Define. (TARGET_ARCHITECTED_FLAT_SCRATCH): Use also for CDNA3. * config/gcn/gcn.h (gcn_isa): Add ISA_CDNA3 to the enum. * config/gcn/gcn.cc (print_operand): Update 'g' to use TARGET_GLC_NAME; add 'G' to print TARGET_GLC_NAME unconditionally. * config/gcn/gcn-valu.md (scatter, gather): Use TARGET_GLC_NAME. * config/gcn/gcn.md: Use %G<num> instead of glc; use 'buffer_inv sc1' for TARGET_TARGET_SC_CACHE. * doc/invoke.texi (march): Add gfx942, gfx950 and gfx9-4-generic. * doc/install.texi (amdgcn*-*-*): Add gfx942, gfx950 and gfx9-4-generic. * config/gcn/gcn-tables.opt: Regenerate. libgomp/ChangeLog: * testsuite/libgomp.c/declare-variant-4.h (gfx942): New variant function. * testsuite/libgomp.c/declare-variant-4-gfx942.c: New test. |
||
|
|
16c742e107 |
libgomp.c/target-map-zero-sized-3.c: Fix code for non-USM offload [PR120530]
A mapping clause was missing, causing the code to fail with offloading when a host pointer was not device accessible. libgomp/ChangeLog: PR target/120530 * testsuite/libgomp.c/target-map-zero-sized-3.c (main): Add missing map clause; remove unused variable. |
||
|
|
a8c03f056f |
libgomp: Fix up omp_target_memset-3.c test for C++ [PR120444]
The test PASSes for C, but FAILs for C++:
.../libgomp.c-c++-common/omp_target_memset-3.c: In function 'void test_it(void*, int, size_t)':
.../libgomp.c-c++-common/omp_target_memset-3.c:31:7: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
.../libgomp.c-c++-common/omp_target_memset-3.c:33:13: error: invalid conversion from 'void*' to 'int8_t*' {aka 'signed char*'} [-fpermissive]
.../libgomp.c-c++-common/omp_target_memset-3.c:10:19: note: initializing argument 1 of 'void init_val(int8_t*, int, size_t)'
.../libgomp.c-c++-common/omp_target_memset-3.c:37:14: error: invalid conversion from 'void*' to 'int8_t*' {aka 'signed char*'} [-fpermissive]
.../libgomp.c-c++-common/omp_target_memset-3.c:17:20: note: initializing argument 1 of 'void check_val(int8_t*, int, size_t)'
.../libgomp.c-c++-common/omp_target_memset-3.c:38:18: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
.../libgomp.c-c++-common/omp_target_memset-3.c:38:18: error: invalid conversion from 'void*' to 'int8_t*' {aka 'signed char*'} [-fpermissive]
.../libgomp.c-c++-common/omp_target_memset-3.c:17:20: note: initializing argument 1 of 'void check_val(int8_t*, int, size_t)'
.../libgomp.c-c++-common/omp_target_memset-3.c: In function 'int main()':
.../libgomp.c-c++-common/omp_target_memset-3.c:46:7: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
The following two-liner fixes that, tested on x86_64-linux and i686-linux.
2025-06-03 Jakub Jelinek <jakub@redhat.com>
PR libgomp/120444
* testsuite/libgomp.c-c++-common/omp_target_memset-3.c (test_it):
Change ptr argument type from void * to int8_t *.
(main): Change ptr variable type from void * to int8_t * and cast
omp_target_alloc result to the latter type.
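The pattern behind the fix can be sketched in the common subset of C and C++. The bodies of `init_val` and `check_val` below are hypothetical stand-ins (only their signatures appear in the diagnostics above), `run_test` is an invented driver, and `malloc` stands in for `omp_target_alloc` so the sketch does not depend on an offload device.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the helpers named in the diagnostics. */
static void init_val(int8_t *ptr, int val, size_t count)
{
  memset(ptr, val, count);
}

static int check_val(const int8_t *ptr, int val, size_t count)
{
  for (size_t i = 0; i < count; i++)
    if (ptr[i] != (int8_t) val)
      return 0;
  return 1;
}

/* Taking int8_t * instead of void * makes 'ptr + 1' valid arithmetic in
   both C and C++ and removes the implicit void * conversion.  */
int test_it(int8_t *ptr, int val, size_t count)
{
  init_val(ptr + 1, val, count);
  return check_val(ptr + 1, val, count);
}

int run_test(void)
{
  /* Assumption: malloc replaces omp_target_alloc here.  The explicit cast
     is what C++ requires for the void * result.  */
  int8_t *ptr = (int8_t *) malloc(16);
  if (!ptr)
    return 0;
  int ok = test_it(ptr, 0x5a, 8);
  free(ptr);
  return ok;
}
```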
|
||
|
|
4e47e2f833 |
libgomp: Add OpenMP's omp_target_memset/omp_target_memset_async
PR libgomp/120444
include/ChangeLog:
* cuda/cuda.h (cuMemsetD8, cuMemsetD8Async): Declare.
libgomp/ChangeLog:
* libgomp-plugin.h (GOMP_OFFLOAD_memset): Declare.
* libgomp.h (struct gomp_device_descr): Add memset_func.
* libgomp.map (GOMP_6.0.1): Add omp_target_memset{,_async}.
* libgomp.texi (Device Memory Routines): Document them.
* omp.h.in (omp_target_memset, omp_target_memset_async): Declare.
* omp_lib.f90.in (omp_target_memset, omp_target_memset_async):
Add interfaces.
* omp_lib.h.in (omp_target_memset, omp_target_memset_async): Likewise.
* plugin/cuda-lib.def: Add cuMemsetD8.
* plugin/plugin-gcn.c (struct hsa_runtime_fn_info): Add
hsa_amd_memory_fill_fn.
(init_hsa_runtime_functions): DLSYM_OPT_FN load it.
(GOMP_OFFLOAD_memset): New.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_memset): New.
* target.c (omp_target_memset_int, omp_target_memset,
omp_target_memset_async_helper, omp_target_memset_async): New.
(gomp_load_plugin_for_device): Add DLSYM (memset).
* testsuite/libgomp.c-c++-common/omp_target_memset.c: New test.
* testsuite/libgomp.c-c++-common/omp_target_memset-2.c: New test.
* testsuite/libgomp.c-c++-common/omp_target_memset-3.c: New test.
* testsuite/libgomp.fortran/omp_target_memset.f90: New test.
* testsuite/libgomp.fortran/omp_target_memset-2.f90: New test.
|
||
|
|
2ffada0296 |
Add 'libgomp.c++/target-valarray-1.C'
libgomp/ * testsuite/libgomp.c++/target-std__valarray-1.C: New. * testsuite/libgomp.c++/target-std__valarray-1.output: Likewise. |
||
|
|
83ca283853 |
libgomp: Add testcases for concurrent access to standard C++ containers on offload targets, a number of USM variants
libgomp/ * testsuite/libgomp.c++/target-std__array-concurrent-usm.C: New. * testsuite/libgomp.c++/target-std__array-concurrent.C: Adjust. * testsuite/libgomp.c++/target-std__bitset-concurrent-usm.C: New. * testsuite/libgomp.c++/target-std__bitset-concurrent.C: Adjust. * testsuite/libgomp.c++/target-std__deque-concurrent-usm.C: New. * testsuite/libgomp.c++/target-std__deque-concurrent.C: Adjust. * testsuite/libgomp.c++/target-std__forward_list-concurrent-usm.C: New. * testsuite/libgomp.c++/target-std__forward_list-concurrent.C: Adjust. * testsuite/libgomp.c++/target-std__list-concurrent-usm.C: New. * testsuite/libgomp.c++/target-std__list-concurrent.C: Adjust. * testsuite/libgomp.c++/target-std__map-concurrent-usm.C: New. * testsuite/libgomp.c++/target-std__map-concurrent.C: Adjust. * testsuite/libgomp.c++/target-std__multimap-concurrent-usm.C: New. * testsuite/libgomp.c++/target-std__multimap-concurrent.C: Adjust. * testsuite/libgomp.c++/target-std__multiset-concurrent-usm.C: New. * testsuite/libgomp.c++/target-std__multiset-concurrent.C: Adjust. * testsuite/libgomp.c++/target-std__set-concurrent-usm.C: New. * testsuite/libgomp.c++/target-std__set-concurrent.C: Adjust. * testsuite/libgomp.c++/target-std__span-concurrent-usm.C: New. * testsuite/libgomp.c++/target-std__span-concurrent.C: Adjust. * testsuite/libgomp.c++/target-std__valarray-concurrent-usm.C: New. * testsuite/libgomp.c++/target-std__valarray-concurrent.C: Adjust. * testsuite/libgomp.c++/target-std__vector-concurrent-usm.C: New. * testsuite/libgomp.c++/target-std__vector-concurrent.C: Adjust. |
||
|
|
a811d1d722 |
libgomp: Add testcases for concurrent access to standard C++ containers on offload targets
libgomp/ * testsuite/libgomp.c++/target-std__array-concurrent.C: New. * testsuite/libgomp.c++/target-std__bitset-concurrent.C: Likewise. * testsuite/libgomp.c++/target-std__deque-concurrent.C: Likewise. * testsuite/libgomp.c++/target-std__flat_map-concurrent.C: Likewise. * testsuite/libgomp.c++/target-std__flat_multimap-concurrent.C: Likewise. * testsuite/libgomp.c++/target-std__flat_multiset-concurrent.C: Likewise. * testsuite/libgomp.c++/target-std__flat_set-concurrent.C: Likewise. * testsuite/libgomp.c++/target-std__forward_list-concurrent.C: Likewise. * testsuite/libgomp.c++/target-std__list-concurrent.C: Likewise. * testsuite/libgomp.c++/target-std__map-concurrent.C: Likewise. * testsuite/libgomp.c++/target-std__multimap-concurrent.C: Likewise. * testsuite/libgomp.c++/target-std__multiset-concurrent.C: Likewise. * testsuite/libgomp.c++/target-std__set-concurrent.C: Likewise. * testsuite/libgomp.c++/target-std__span-concurrent.C: Likewise. * testsuite/libgomp.c++/target-std__unordered_map-concurrent.C: Likewise. * testsuite/libgomp.c++/target-std__unordered_multimap-concurrent.C: Likewise. * testsuite/libgomp.c++/target-std__unordered_multiset-concurrent.C: Likewise. * testsuite/libgomp.c++/target-std__unordered_set-concurrent.C: Likewise. * testsuite/libgomp.c++/target-std__valarray-concurrent.C: Likewise. * testsuite/libgomp.c++/target-std__vector-concurrent.C: Likewise. Co-authored-by: Thomas Schwinge <tschwinge@baylibre.com> |
||
|
|
fbcd0ad41f |
libgomp: Add testcases for the standard C++ math library on offload targets
libgomp/ * testsuite/libgomp.c++/target-std__cmath.C: New. * testsuite/libgomp.c++/target-std__complex.C: Likewise. * testsuite/libgomp.c++/target-std__numbers.C: Likewise. |
||
|
|
28a5bc2d4f |
Add 'libgomp.c++/target-flex-[...].C' test cases
libgomp/ChangeLog: * testsuite/libgomp.c++/target-flex-10.C: New test. * testsuite/libgomp.c++/target-flex-100.C: New test. * testsuite/libgomp.c++/target-flex-101.C: New test. * testsuite/libgomp.c++/target-flex-11.C: New test. * testsuite/libgomp.c++/target-flex-12.C: New test. * testsuite/libgomp.c++/target-flex-2000.C: New test. * testsuite/libgomp.c++/target-flex-2001.C: New test. * testsuite/libgomp.c++/target-flex-2002.C: New test. * testsuite/libgomp.c++/target-flex-2003.C: New test. * testsuite/libgomp.c++/target-flex-30.C: New test. * testsuite/libgomp.c++/target-flex-300.C: New test. * testsuite/libgomp.c++/target-flex-31.C: New test. * testsuite/libgomp.c++/target-flex-32.C: New test. * testsuite/libgomp.c++/target-flex-33.C: New test. * testsuite/libgomp.c++/target-flex-41.C: New test. * testsuite/libgomp.c++/target-flex-60.C: New test. * testsuite/libgomp.c++/target-flex-61.C: New test. * testsuite/libgomp.c++/target-flex-62.C: New test. * testsuite/libgomp.c++/target-flex-70.C: New test. * testsuite/libgomp.c++/target-flex-80.C: New test. * testsuite/libgomp.c++/target-flex-81.C: New test. * testsuite/libgomp.c++/target-flex-90.C: New test. * testsuite/libgomp.c++/target-flex-common.h: New test. Co-authored-by: Thomas Schwinge <tschwinge@baylibre.com> |
||
|
|
543f7e1d59 |
Defuse 'RESULT_DECL' check in 'pass_nrv' (for offloading compilation) [PR119835]
... to avoid running into ICEs per PR119835, until that's resolved properly. PR middle-end/119835 gcc/ * tree-nrv.cc (pass_nrv::execute): Defuse 'RESULT_DECL' check. libgomp/ * testsuite/libgomp.oacc-c-c++-common/abi-struct-1.c: '#pragma GCC optimize "-fno-inline"'. * testsuite/libgomp.c-c++-common/target-abi-struct-1.c: New. * testsuite/libgomp.c-c++-common/target-abi-struct-1-O0.c: Adjust. Co-authored-by: Richard Biener <rguenther@suse.de> |
||
|
|
0c8d92a9ac |
OpenMP: Support OpenMP 5.0 "declare mapper" directives for C
This patch adds support for "declare mapper" directives (and the "mapper" modifier on "map" clauses) for C. gcc/c/ChangeLog: * c-decl.cc (c_omp_mapper_id, c_omp_mapper_decl, c_omp_mapper_lookup, c_omp_extract_mapper_directive, c_omp_map_array_section, c_omp_scan_mapper_bindings_r, c_omp_scan_mapper_bindings): New functions. * c-objc-common.h (LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES, LANG_HOOKS_OMP_MAPPER_LOOKUP, LANG_HOOKS_OMP_EXTRACT_MAPPER_DIRECTIVE, LANG_HOOKS_OMP_MAP_ARRAY_SECTION): Define langhooks for C. * c-parser.cc (c_parser_omp_clause_map): Add declare_mapper_p parameter; handle mapper modifier. (c_parser_omp_all_clauses): Update call to c_parser_omp_clause_map. (c_parser_omp_target): Instantiate explicit mappers and record bindings for implicit mappers. (c_parser_omp_declare_mapper): Parse "declare mapper" directives. (c_parser_omp_declare): Support "declare mapper". (c_parser_omp_declare_reduction): Use inform not error_at. * c-tree.h (c_omp_finish_mapper_clauses, c_omp_mapper_lookup, c_omp_extract_mapper_directive, c_omp_map_array_section, c_omp_mapper_id, c_omp_mapper_decl, c_omp_scan_mapper_bindings, c_omp_instantiate_mappers): Add prototypes. * c-typeck.cc (c_finish_omp_clauses): Handle GOMP_MAP_PUSH_MAPPER_NAME and GOMP_MAP_POP_MAPPER_NAME. (c_omp_finish_mapper_clauses): New function (langhook). libgomp/ChangeLog: * testsuite/libgomp.c-c++-common/declare-mapper-9.c: Enable for C. * testsuite/libgomp.c-c++-common/declare-mapper-10.c: Likewise. * testsuite/libgomp.c-c++-common/declare-mapper-11.c: Likewise. * testsuite/libgomp.c-c++-common/declare-mapper-12.c: Likewise. * testsuite/libgomp.c-c++-common/declare-mapper-13.c: Likewise. * testsuite/libgomp.c-c++-common/declare-mapper-14.c: Likewise. gcc/testsuite/ChangeLog: * c-c++-common/gomp/declare-mapper-3.c: Enable for C. * c-c++-common/gomp/declare-mapper-4.c: Likewise. * c-c++-common/gomp/declare-mapper-5.c: Likewise. * c-c++-common/gomp/declare-mapper-6.c: Likewise. 
* c-c++-common/gomp/declare-mapper-7.c: Likewise. * c-c++-common/gomp/declare-mapper-8.c: Likewise. * c-c++-common/gomp/declare-mapper-9.c: Likewise. * c-c++-common/gomp/declare-mapper-10.c: Likewise. * c-c++-common/gomp/declare-mapper-12.c: Likewise. * c-c++-common/gomp/map-6.c: Update dg-error. * gcc.dg/gomp/udr-3.c: Update for change to dg-note. * c-c++-common/gomp/declare-mapper-11.c: New. * gcc.dg/gomp/declare-mapper-10.c: New test. * gcc.dg/gomp/declare-mapper-11.c: New test. * gcc.dg/gomp/declare-mapper-13.c: New test. |
||
|
|
48973e8783 |
OpenMP: C++ "declare mapper" support
This patch adds support for OpenMP 5.0 "declare mapper" functionality
for C++. I've merged it to og13 based on the last version
posted upstream, with some minor changes due to the newly-added
'present' map modifier support. There's also a fix to splay-tree
traversal in gimplify.cc:omp_instantiate_implicit_mappers, and this patch
omits the rearrangement of gimplify.cc:gimplify_{scan,adjust}_omp_clauses
that I separated out into its own patch and applied (to og13) already.
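For readers unfamiliar with the directive, here is a minimal sketch of a default mapper, written in the common subset of C and C++. The `struct vec` type and the `double_all` function are hypothetical and not taken from the testcases listed below. A compiler without "declare mapper" support may reject the directive when OpenMP is enabled; without OpenMP the pragmas are ignored and the code runs on the host.

```c
#include <stdlib.h>

struct vec { int len; int *data; };

/* Default mapper for struct vec: whenever a 'struct vec' is mapped, also
   map the array its 'data' member points to.  */
#pragma omp declare mapper(struct vec v) map(v, v.data[0:v.len])

/* Hypothetical example: fill a vector with 0..n-1, double every element
   in a target region and return the sum.  Returns -1 on allocation
   failure.  */
int double_all(int n)
{
  struct vec v;
  v.len = n;
  v.data = (int *) malloc(n * sizeof (int));
  if (!v.data)
    return -1;
  for (int i = 0; i < n; i++)
    v.data[i] = i;

  /* The map clause names only 'v'; the mapper supplies v.data[0:v.len].  */
  #pragma omp target map(tofrom: v)
  for (int i = 0; i < v.len; i++)
    v.data[i] *= 2;

  int sum = 0;
  for (int i = 0; i < n; i++)
    sum += v.data[i];
  free(v.data);
  return sum;
}
```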
gcc/c-family/
* c-common.h (c_omp_region_type): Add C_ORT_DECLARE_MAPPER and
C_ORT_OMP_DECLARE_MAPPER codes.
(omp_mapper_list): Add forward declaration.
(c_omp_find_nested_mappers, c_omp_instantiate_mappers): Add prototypes.
* c-omp.cc (c_omp_find_nested_mappers): New function.
(remap_mapper_decl_info): New struct.
(remap_mapper_decl_1, omp_instantiate_mapper,
c_omp_instantiate_mappers): New functions.
gcc/cp/
* constexpr.cc (reduced_constant_expression_p): Add OMP_DECLARE_MAPPER
case.
(cxx_eval_constant_expression, potential_constant_expression_1):
Likewise.
* cp-gimplify.cc (cxx_omp_finish_mapper_clauses): New function.
* cp-objcp-common.h (LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES,
LANG_HOOKS_OMP_MAPPER_LOOKUP, LANG_HOOKS_OMP_EXTRACT_MAPPER_DIRECTIVE,
LANG_HOOKS_OMP_MAP_ARRAY_SECTION): Define langhooks.
* cp-tree.h (lang_decl_base): Add omp_declare_mapper_p field. Recount
spare bits comment.
(DECL_OMP_DECLARE_MAPPER_P): New macro.
(omp_mapper_id): Add prototype.
(cp_check_omp_declare_mapper): Add prototype.
(omp_instantiate_mappers): Add prototype.
(cxx_omp_finish_mapper_clauses): Add prototype.
(cxx_omp_mapper_lookup): Add prototype.
(cxx_omp_extract_mapper_directive): Add prototype.
(cxx_omp_map_array_section): Add prototype.
* decl.cc (check_initializer): Add OpenMP declare mapper support.
(cp_finish_decl): Set DECL_INITIAL for OpenMP declare mapper var decls
as appropriate.
* decl2.cc (mark_used): Instantiate OpenMP "declare mapper" magic var
decls.
* error.cc (dump_omp_declare_mapper): New function.
(dump_simple_decl): Use above.
* parser.cc (cp_parser_omp_clause_map): Add KIND parameter. Support
"mapper" modifier.
(cp_parser_omp_all_clauses): Add KIND argument to
cp_parser_omp_clause_map call.
(cp_parser_omp_target): Call omp_instantiate_mappers before
finish_omp_clauses.
(cp_parser_omp_declare_mapper): New function.
(cp_parser_omp_declare): Add "declare mapper" support.
* pt.cc (tsubst_decl): Adjust name of "declare mapper" magic var decls
once we know their type.
(tsubst_omp_clauses): Call omp_instantiate_mappers before
finish_omp_clauses, for target regions.
(tsubst_expr): Support OMP_DECLARE_MAPPER nodes.
(instantiate_decl): Instantiate initialiser (i.e. definition) for OpenMP
declare mappers.
* semantics.cc (gimplify.h): Include.
(omp_mapper_id, omp_mapper_lookup, omp_extract_mapper_directive,
cxx_omp_map_array_section, cp_check_omp_declare_mapper): New functions.
(finish_omp_clauses): Delete GOMP_MAP_PUSH_MAPPER_NAME and
GOMP_MAP_POP_MAPPER_NAME artificial clauses.
(omp_target_walk_data): Add MAPPERS field.
(finish_omp_target_clauses_r): Scan for uses of struct/union/class type
variables.
(finish_omp_target_clauses): Create artificial mapper binding clauses
for used structs/unions/classes in offload region.
gcc/fortran/
* parse.cc (tree.h, fold-const.h, tree-hash-traits.h): Add includes
(for additions to omp-general.h).
gcc/
* gimplify.cc (gimplify_omp_ctx): Add IMPLICIT_MAPPERS field.
(new_omp_context): Initialise IMPLICIT_MAPPERS hash map.
(delete_omp_context): Delete IMPLICIT_MAPPERS hash map.
(instantiate_mapper_info): New structs.
(remap_mapper_decl_1, omp_mapper_copy_decl, omp_instantiate_mapper,
omp_instantiate_implicit_mappers): New functions.
(gimplify_scan_omp_clauses): Handle MAPPER_BINDING clauses.
(gimplify_adjust_omp_clauses): Instantiate implicit declared mappers.
(gimplify_omp_declare_mapper): New function.
(gimplify_expr): Call above function.
* langhooks-def.h (lhd_omp_mapper_lookup,
lhd_omp_extract_mapper_directive, lhd_omp_map_array_section): Add
prototypes.
(LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES,
LANG_HOOKS_OMP_MAPPER_LOOKUP, LANG_HOOKS_OMP_EXTRACT_MAPPER_DIRECTIVE,
LANG_HOOKS_OMP_MAP_ARRAY_SECTION): Define macros.
(LANG_HOOK_DECLS): Add above macros.
* langhooks.cc (lhd_omp_mapper_lookup,
lhd_omp_extract_mapper_directive, lhd_omp_map_array_section): New
dummy functions.
* langhooks.h (lang_hooks_for_decls): Add OMP_FINISH_MAPPER_CLAUSES,
OMP_MAPPER_LOOKUP, OMP_EXTRACT_MAPPER_DIRECTIVE, OMP_MAP_ARRAY_SECTION
hooks.
* omp-general.h (omp_name_type<T>): Add templatized struct, hash type
traits (for omp_name_type<tree> specialization).
(omp_mapper_list<T>): Add struct.
* tree-core.h (omp_clause_code): Add OMP_CLAUSE__MAPPER_BINDING_.
* tree-pretty-print.cc (dump_omp_clause): Support GOMP_MAP_UNSET,
GOMP_MAP_PUSH_MAPPER_NAME, GOMP_MAP_POP_MAPPER_NAME artificial mapping
clauses. Support OMP_CLAUSE__MAPPER_BINDING_ and OMP_DECLARE_MAPPER.
* tree.cc (omp_clause_num_ops, omp_clause_code_name): Add
OMP_CLAUSE__MAPPER_BINDING_.
* tree.def (OMP_DECLARE_MAPPER): New tree code.
* tree.h (OMP_DECLARE_MAPPER_ID, OMP_DECLARE_MAPPER_DECL,
OMP_DECLARE_MAPPER_CLAUSES): New defines.
(OMP_CLAUSE__MAPPER_BINDING__ID, OMP_CLAUSE__MAPPER_BINDING__DECL,
OMP_CLAUSE__MAPPER_BINDING__MAPPER): New defines.
include/
* gomp-constants.h (gomp_map_kind): Add GOMP_MAP_UNSET,
GOMP_MAP_PUSH_MAPPER_NAME, GOMP_MAP_POP_MAPPER_NAME artificial mapping
clause types.
gcc/testsuite/
* c-c++-common/gomp/map-6.c: Update error scan output.
* c-c++-common/gomp/declare-mapper-3.c: New test (only enabled for C++
for now).
* c-c++-common/gomp/declare-mapper-4.c: Likewise.
* c-c++-common/gomp/declare-mapper-5.c: Likewise.
* c-c++-common/gomp/declare-mapper-6.c: Likewise.
* c-c++-common/gomp/declare-mapper-7.c: Likewise.
* c-c++-common/gomp/declare-mapper-8.c: Likewise.
* c-c++-common/gomp/declare-mapper-9.c: Likewise.
* c-c++-common/gomp/declare-mapper-10.c: Likewise.
* c-c++-common/gomp/declare-mapper-12.c: Likewise.
* g++.dg/gomp/declare-mapper-1.C: New test.
* g++.dg/gomp/declare-mapper-2.C: New test.
* g++.dg/gomp/declare-mapper-3.C: New test.
libgomp/
* testsuite/libgomp.c++/declare-mapper-1.C: New test.
* testsuite/libgomp.c++/declare-mapper-2.C: New test.
* testsuite/libgomp.c++/declare-mapper-3.C: New test.
* testsuite/libgomp.c++/declare-mapper-4.C: New test.
* testsuite/libgomp.c++/declare-mapper-5.C: New test.
* testsuite/libgomp.c++/declare-mapper-6.C: New test.
* testsuite/libgomp.c++/declare-mapper-7.C: New test.
* testsuite/libgomp.c++/declare-mapper-8.C: New test.
* testsuite/libgomp.c-c++-common/declare-mapper-9.c: New test (only
enabled for C++ for now).
* testsuite/libgomp.c-c++-common/declare-mapper-10.c: Likewise.
* testsuite/libgomp.c-c++-common/declare-mapper-11.c: Likewise.
* testsuite/libgomp.c-c++-common/declare-mapper-12.c: Likewise.
* testsuite/libgomp.c-c++-common/declare-mapper-13.c: Likewise.
* testsuite/libgomp.c-c++-common/declare-mapper-14.c: Likewise.
Co-authored-by: Tobias Burnus <tburnus@baylibre.com>
|
||
|
|
f4aa6b5a8d |
libgomp: Add OpenACC's acc_memcpy_device{,_async} routines [PR93226]
libgomp/ChangeLog:
PR libgomp/93226
* libgomp-plugin.h (GOMP_OFFLOAD_openacc_async_dev2dev): New
prototype.
* libgomp.h (struct acc_dispatch_t): Add dev2dev_func.
(gomp_copy_dev2dev): New prototype.
* libgomp.map (OACC_2.6.1): New; add acc_memcpy_device{,_async}.
* libgomp.texi (acc_memcpy_device): New.
* oacc-mem.c (memcpy_tofrom_device): Change to take from/to
device boolean; use memcpy not memmove; add early return if
size == 0 or same device + same ptr.
(acc_memcpy_to_device, acc_memcpy_to_device_async,
acc_memcpy_from_device, acc_memcpy_from_device_async): Update.
(acc_memcpy_device, acc_memcpy_device_async): New.
* openacc.f90 (acc_memcpy_device, acc_memcpy_device_async):
Add interface.
* openacc_lib.h (acc_memcpy_device, acc_memcpy_device_async):
Likewise.
* openacc.h (acc_memcpy_device, acc_memcpy_device_async): Add
prototype.
* plugin/plugin-gcn.c (GOMP_OFFLOAD_openacc_async_host2dev):
Update comment.
(GOMP_OFFLOAD_openacc_async_dev2host): Update call.
(GOMP_OFFLOAD_openacc_async_dev2dev): New.
* plugin/plugin-nvptx.c (cuda_memcpy_dev_sanity_check): New.
(GOMP_OFFLOAD_dev2dev): Call it.
(GOMP_OFFLOAD_openacc_async_dev2dev): New.
* target.c (gomp_copy_dev2dev): New.
(gomp_load_plugin_for_device): Load dev2dev and async_dev2dev.
* testsuite/libgomp.oacc-c-c++-common/acc_memcpy_device-1.c: New test.
* testsuite/libgomp.oacc-fortran/acc_memcpy_device-1.f90: New test.
|
||
|
|
5b2e7afb13 |
libgomp.fortran/metadirective-1.f90: Expect 'error:' for nvptx compile [PR118694]
This should have been part of commit r16-838-gb3d07ec7ac2ccd or r16-883-g5d6ed6d604ff94 - all showing the same issue: '!$omp target' followed by a metadirective with 'teams'; if the metadirective cannot be resolved early, a diagnostic error is shown about using directives between 'target' and 'teams'. While the message is misleading, the problem is that the host invokes 'target' differently when 'teams' is present; in this case, host fallback + amdgcn offload require the no-teams case, while nvptx offload requires the teams case, so the choice can only be resolved at runtime. Mark the error as 'dg-bogus + xfail' to silence the FAIL when compiling for nvptx offloading. (If not, the metadirective can be resolved early during compilation.) libgomp/ChangeLog: PR middle-end/118694 * testsuite/libgomp.fortran/metadirective-1.f90: xfail when compiling (also) for nvptx offloading as an error is then expected. |
||
|
|
b3d07ec7ac |
libgomp.c-c++-common/metadirective-1.c: Expect 'error:' for nvptx compile [PR118694]
OpenMP's 'target teams' is strictly coupled with 'teams'; if the latter
exists, the kernel is launched directly with multiple teams. Thus,
the host has to know whether the teams construct exists or not. For
#pragma omp target
#pragma omp metadirective when (device={arch("nvptx")}: teams loop)
it is simple when 'nvptx' offloading is not supported, otherwise it depends
on the default device at runtime as the user code asks for a single team for
host fallback and gcn offload and multiple for nvptx offload.
In any case, this commit ensures that no FAIL is printed, whatever a
future solution might look like. Instead of a dg-bogus combined with an
'xfail offload_target_nvptx', one can also argue that a dg-error for
'target offload_target_nvptx' would be more appropriate.
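A fuller version of the two-line snippet above might look as follows. This is only a sketch: the `default (parallel loop)` clause, the array size `N` and the function name `scale_and_check` are assumptions, not a copy of metadirective-1.c. When compiled with nvptx offloading enabled, code of this shape is what triggers the diagnostic described here; without OpenMP the pragmas are ignored.

```c
#define N 100

/* Hypothetical example: double every element of a local array in a target
   region and report whether the result is as expected.  */
int scale_and_check(void)
{
  int x[N];
  for (int i = 0; i < N; i++)
    x[i] = i;

  /* Whether 'teams' is present depends on the device architecture, so the
     host cannot decide at compile time how to launch the kernel.  */
  #pragma omp target map(tofrom: x[0:N])
  #pragma omp metadirective \
      when (device={arch("nvptx")}: teams loop) \
      default (parallel loop)
  for (int i = 0; i < N; i++)
    x[i] *= 2;

  for (int i = 0; i < N; i++)
    if (x[i] != 2 * i)
      return 0;
  return 1;
}
```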
libgomp/ChangeLog:
PR middle-end/118694
* testsuite/libgomp.c-c++-common/metadirective-1.c: xfail when
compiling (also) for nvptx offloading as an error is then expected.
|
||
|
|
9063810c86 |
'TYPE_EMPTY_P' vs. code offloading [PR120308]
We've got 'gcc/stor-layout.cc:finalize_type_size':
/* Handle empty records as per the x86-64 psABI. */
TYPE_EMPTY_P (type) = targetm.calls.empty_record_p (type);
(Indeed x86_64 is still the only target to define 'TARGET_EMPTY_RECORD_P',
calling 'gcc/tree.cc:default_is_empty_record'.)
And so it happens that for an empty struct used in code offloaded from x86_64
host (but not powerpc64le host, for example), we get to see 'TYPE_EMPTY_P' in
offloading compilation (where the offload targets (currently?) don't use it
themselves, and therefore aren't prepared to handle it).
For nvptx offloading compilation, this causes wrong code generation:
'ptxas [...] error : Call has wrong number of parameters', as nvptx code
generation for function definition doesn't pay attention to this flag (say, in
'gcc/config/nvptx/nvptx.cc:pass_in_memory', or wherever else would be
appropriate to handle that), but the generic code 'gcc/calls.cc:expand_call'
via 'gcc/function.cc:aggregate_value_p' does pay attention to it, and we thus
get mismatching function definition vs. function call.
This issue apparently isn't a problem for GCN offloading, but I don't know if
that's by design or by accident.
Richard Biener:
> It looks like TYPE_EMPTY_P is only used during RTL expansion for ABI
> purposes, so computing it during layout_type is premature as shown here.
>
> I would suggest to simply re-compute it at offload stream-in time.
(For avoidance of doubt, the additions to 'gcc.target/nvptx/abi-struct-arg.c',
'gcc.target/nvptx/abi-struct-ret.c' are not dependent on the offload streaming
code changes, but are just to mirror the changes to
'libgomp.oacc-c-c++-common/abi-struct-1.c'.)
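The kind of code affected can be sketched as below: an empty struct passed by value to a function that is called from an offloaded region. The names are hypothetical and this is not the content of abi-struct-1.c. An empty struct is a GNU extension in C, and whether the definition/call mismatch actually shows up depends on the host, the offload target and the compiler version.

```c
/* Empty struct: GNU C extension (and valid C++).  */
struct empty {};

#pragma omp declare target
/* If the function definition and its call site disagree about whether the
   empty argument is passed, the parameter lists no longer match.  */
int add_around_empty(int a, struct empty e, int b)
{
  (void) e;
  return a + b;
}
#pragma omp end declare target

/* Hypothetical driver: call the function from a target region.  */
int call_on_device(void)
{
  struct empty e = {};
  int r = 0;

  #pragma omp target map(from: r)
  r = add_around_empty(1, e, 2);

  return r;
}
```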
PR lto/120308
gcc/
* lto-streamer-out.cc (hash_tree): Don't handle 'TYPE_EMPTY_P' for
'lto_stream_offload_p'.
* tree-streamer-in.cc (unpack_ts_type_common_value_fields):
Likewise.
* tree-streamer-out.cc (pack_ts_type_common_value_fields):
Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/abi-struct-1.c: Add empty
structure testing.
gcc/testsuite/
* gcc.target/nvptx/abi-struct-arg.c: Add empty structure testing.
* gcc.target/nvptx/abi-struct-ret.c: Likewise.
|
||
|
|
45efda05c4 |
Add 'libgomp.c-c++-common/target-abi-struct-1-O0.c', 'libgomp.oacc-c-c++-common/abi-struct-1.c'
libgomp/ * testsuite/libgomp.c-c++-common/target-abi-struct-1-O0.c: New. * testsuite/libgomp.oacc-c-c++-common/abi-struct-1.c: Likewise. |
||
|
|
7ef9cb4caf |
Fix libgomp.oacc-fortran/lib-13.f90 async bug
libgomp/ * testsuite/libgomp.oacc-fortran/lib-13.f90: End data region after wait API calls. |
||
|
|
f99017c312 |
OpenMP/Fortran: Fix allocatable-component mapping of derived-type array comps
The check whether the location expression in a map clause has allocatable components was failing for some derived-type array expressions such as map(var%tiles(1)) as the compiler produced _4 = var.tiles; MEMREF(_4, _5); This commit now also handles this case. gcc/fortran/ChangeLog: * trans-openmp.cc (gfc_omp_deep_mapping_do): Handle SSA_NAME if a def_stmt is available. libgomp/ChangeLog: * testsuite/libgomp.fortran/alloc-comp-4.f90: New test. |