mirror of https://gcc.gnu.org/git/gcc.git synced 2026-02-22 03:46:53 -05:00

Author	SHA1	Message	Date
Jakub Jelinek	d4499a232a	libcpp: Add -Wleading-whitespace= warning The following patch on top of the r15-4346 patch adds -Wleading-whitespace= warning option. This warning doesn't care how much one actually indents which line in the source (that is something that can't be easily done in the preprocessor without doing syntactic analysis), but just simple checks on what kind of whitespace is used in the indentation. I think it is still useful to get warnings about such issues early, while git diagnoses some of it in patches (e.g. the tab after space case), getting the warnings earlier might help avoiding such issues sooner. There are projects which ban use of tabs and require just spaces, others which require indentation just with horizontal tabs, and finally projects which want indentation with tabs for multiples of tabstop size followed by spaces (fewer than tabstop size), like GCC. For all 3 kinds the warning diagnoses indentation with '\v' or '\f' characters (unless line contains just whitespace), and for the last one also cases where a space in the indentation is followed by horizontal tab or where there are N or more consecutive spaces in the indentation (for -ftabstop=N). BTW, for additional testing I've enabled the warnings (without -Werror for them) in stage3. There are many warnings (both trailing and leading whitespace), some of them something that can be easily fixed in the headers or source files, but others with whitespace issues in generated sources, so if we enable the warnings, either we'd need to adjust the generators or disable the warnings in (some of the) generated files. 2024-10-23 Jakub Jelinek <jakub@redhat.com> libcpp/ * include/cpplib.h (struct cpp_options): Add cpp_warn_leading_whitespace and cpp_tabstop members. (enum cpp_warning_reason): Add CPP_W_LEADING_WHITESPACE. * internal.h (struct _cpp_line_note): Document new line note kinds. * init.cc (cpp_create_reader): Set cpp_tabstop to 8. 
* lex.cc (find_leading_whitespace_issues): New function. (_cpp_clean_line): Use it. (_cpp_process_line_notes): Handle 'L', 'S' and 'T' line notes. (lex_raw_string): Clear type on 'L', 'S' and 'T' line notes inside of raw string literals. gcc/ * doc/invoke.texi (Wleading-whitespace=): Document. gcc/c-family/ * c.opt (Wleading-whitespace=): New option. * c-opts.cc (c_common_post_options): Set cpp_opts->cpp_tabstop to global_dc->m_tabstop. gcc/testsuite/ * c-c++-common/cpp/Wleading-whitespace-1.c: New test. * c-c++-common/cpp/Wleading-whitespace-2.c: New test. * c-c++-common/cpp/Wleading-whitespace-3.c: New test. * c-c++-common/cpp/Wleading-whitespace-4.c: New test.	2024-10-23 09:58:06 +02:00
François Dumont	ee030b2800	libstdc++: Always instantiate key_type to compute hash code [PR115285] Even if it is possible to compute a hash code from the inserted arguments we need to instantiate the key_type to guarantee hash code consistency. Preserve the lazy instantiation of the mapped_type in the context of associative containers. libstdc++-v3/ChangeLog: PR libstdc++/115285 * include/bits/hashtable.h (_S_forward_key<_Kt>): Always return a temporary key_type instance. * testsuite/23_containers/unordered_map/96088.cc: Adapt to additional instantiation. Also check that mapped_type is not instantiated when there is no insertion. * testsuite/23_containers/unordered_multimap/96088.cc: Adapt to additional instantiation. * testsuite/23_containers/unordered_multiset/96088.cc: Likewise. * testsuite/23_containers/unordered_set/96088.cc: Likewise. * testsuite/23_containers/unordered_set/pr115285.cc: New test case.	2024-10-23 06:25:21 +02:00
liuhongt	ee7e77e9c1	i386: Optimize EQ/NE comparison between avx512 kmask and -1. r15-974-gbf7745f887c765e06f2e75508f263debb60aeb2e has optimized for jcc/setcc, but missed movcc. The patch supports movcc. gcc/ChangeLog: PR target/117232 * config/i386/sse.md (kortest_cmp<SWI1248_AVX512BWDQ_64:mode>_movqicc): New define_insn_and_split. (kortest_cmp<SWI1248_AVX512BWDQ_64:mode>_mov<SWI248:mode>cc): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr117232-1.c: New test. * gcc.target/i386/pr117232-apx-1.c: New test.	2024-10-22 19:28:26 -07:00
GCC Administrator	01ed5c62bf	Daily bump.	2024-10-23 00:19:43 +00:00
Joseph Myers	ecb55d9473	c: Restore "originally defined" struct redefinition messages for C23 One failure with a -std=gnu23 default that indicates a quality-of-implementation regression in C23 mode is gcc.dg/pr39084.c, which loses the expected "originally defined here" message on struct redefinition errors (which occur in a different place in the front end for C23 because it is necessary to see the members of the struct to determine whether the redefinition is valid). That message seems a good thing to have both in and out of C23 mode, so add logic to restore it in the C23 case. Bootstrapped with no regressions for x86-64-pc-linux-gnu. gcc/c/ * c-decl.cc (c_struct_parse_info): Add member refloc. (start_struct): Store refloc in struct_parse_info. (finish_struct): Give "originally defined" message for C23 struct redefinition errors. gcc/testsuite/ * gcc.dg/gnu17-tag-1.c, gcc.dg/gnu23-tag-5.c: New tests.	2024-10-23 00:10:01 +00:00
Jason Merrill	71e13ea134	c++: non-dep structured binding decltype again [PR117107] The patch for PR92687 handled the usual case of a decomp variable not being in the table, but missed the case of there being nothing in the table yet. PR c++/117107 PR c++/92687 gcc/cp/ChangeLog: * decl.cc (lookup_decomp_type): Handle null table. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/decomp10.C: New test.	2024-10-22 17:56:23 -04:00
Jason Merrill	5c6c1aba33	c++: add testcase [PR116929] This testcase was fixed by r15-822-g0173dcce92baa6 . PR c++/116929 gcc/testsuite/ChangeLog: * g++.dg/modules/enum-14.C: New test.	2024-10-22 17:55:47 -04:00
Patrick Palka	f191c83015	libstdc++: Implement LWG 4166 changes to concat_view::end() This patch proactively implements the proposed resolution for this LWG issue, which seems straightforward and slated to get approved as-is. (No _GLIBCXX_RESOLVE_LIB_DEFECTS code comment is added since concat_view is C++26, so this isn't a defect against a published standard.) libstdc++-v3/ChangeLog: * include/std/ranges (concat_view::begin): Add space after 'requires' starting a requires-clause. (concat_view::end): Likewise. Refine condition for returning an iterator rather than default_sentinel as per LWG 4166. * testsuite/std/ranges/concat/1.cc (test03): Verify LWG 4166 example. Reviewed-by: Jonathan Wakely <jwakely@redhat.com>	2024-10-22 17:01:59 -04:00
Jakub Jelinek	a6db5908a5	c: Better fix for speed up compilation of large char array initializers when not using #embed [PR117190] On Wed, Oct 16, 2024 at 11:09:32PM +0200, Jakub Jelinek wrote: > Apparently my > c: Speed up compilation of large char array initializers when not using #embed > patch broke building glibc. > > The issue is that when using CPP_EMBED, we are guaranteed by the > preprocessor that there is CPP_NUMBER CPP_COMMA before it and > CPP_COMMA CPP_NUMBER after it (or CPP_COMMA CPP_EMBED), so RAW_DATA_CST > never ends up at the end of arrays of unknown length. > Now, the c_parser_initval optimization attempted to preserve that property > rather than changing everything that e.g. infers array number of elements > from the initializer etc. to deal with RAW_DATA_CST at the end, but > it didn't take into account the possibility that there could be > CPP_COMMA followed by CPP_CLOSE_BRACE (where the CPP_COMMA is redundant). > > As we are peeking already at 4 tokens in that code, peeking more would > require using raw tokens and that seems to be expensive doing it for > every pair of tokens due to vec_free done when we are out of raw tokens. Sorry for rushing the previous patch too much, turns out I was wrong, given that the c_parser_peek_nth_token numbering is 1 based, we can peek also with c_parser_peek_nth_token (parser, 4) and the loop actually peeked just at 3 tokens, not 4. So, I think it is better to revert the previous patch (but keep the new test) and instead peek the 4th non-raw token, which is what the following patch does. Additionally, PR117190 shows one further spot which missed the peek of the token after CPP_COMMA, in case it is incomplete array with exactly 65 elements with redundant comma after it, which this patch handles too. 2024-10-22 Jakub Jelinek <jakub@redhat.com> PR c/117190 gcc/c/ * c-parser.cc (c_parser_initval): Revert 2024-10-17 changes.
Instead peek the 4th token and if it is not CPP_NUMBER, handle it like 3rd token CPP_CLOSE_BRACE for orig_len == INT_MAX. Also, check (2 + 2 * i)th raw token for the orig_len == INT_MAX case and punt if it is not CPP_NUMBER. gcc/testsuite/ * c-c++-common/init-5.c: New test.	2024-10-22 22:36:03 +02:00
Jakub Jelinek	5fd1c0c1b6	c-family: Fix up -Wsizeof-pointer-memaccess ICEs [PR117230] In the following testcases, we ICE on all 4 function calls. The problem is using TYPE_PRECISION on vector types (but guess it would be similarly problematic on structures/unions/arrays). The test only differentiates between suggestions of what to do, whether to supply explicit size because sizeof (p) for {,{,un}signed }char p is not very likely what the user wants, or dereferencing the pointer, so I think limiting that suggestion to integral types is ok. 2024-10-22 Jakub Jelinek <jakub@redhat.com> PR c/117230 * c-warn.cc (sizeof_pointer_memaccess_warning): Only compare TYPE_PRECISION of TREE_TYPE (type) to precision of char if TREE_TYPE (type) is integral type. * c-c++-common/Wsizeof-pointer-memaccess5.c: New test.	2024-10-22 20:30:41 +02:00
Jakub Jelinek	f616bc412c	varasm: Handle RAW_DATA_CST in compare_constant [PR117199] On the following testcase without LTO we unnecessarily don't merge two identical .LC* constants (constant hashing computes the same hash, but as compare_constant returned false for the RAW_DATA_CST in it, it never compares equal), and with LTO fails to link because LTO assumes such constants have to be merged and so doesn't emit the other constant. 2024-10-22 Jakub Jelinek <jakub@redhat.com> PR middle-end/117199 * varasm.cc (compare_constant): Handle RAW_DATA_CST. Formatting fix in the STRING_CST case. * gcc.dg/lto/pr117199_0.c: New test.	2024-10-22 20:21:56 +02:00
Jakub Jelinek	8f173da452	varasm: Fix up RAW_DATA_CST handling in array_size_for_constructor [PR117190] CONSTRUCTOR indices for arrays have bitsize type, and the r15-4375 patch actually got it right in 6 other spots, but not in this function, where it used size_int rather than bitsize_int and so size_binop can ICE on type mismatch. This is covered by the init-5.c testcase I've just posted, though the ICE goes away when the C FE is fixed (and when it is not, there is another ICE). 2024-10-22 Jakub Jelinek <jakub@redhat.com> PR c/117190 * varasm.cc (array_size_for_constructor): For RAW_DATA_CST, use bitsize_int rather than size_int.	2024-10-22 20:21:17 +02:00
Tobias Burnus	1bdeebe69b	GCN: Initial generic-target handling, add more GCN macro defines Newer llvm-mc assemblers support the gfx-generic targets, permitting to generate code for all GPUs belonging to the same generation, even if not optimal code. This requires LLVM 19. This patch adds the compiler-side support for generic gfx and also adds -march=gfx10-3-generic and -march=gfx-11. However, those -march= are not documented nor used anywhere, yet. Disclaimer: Not tested (as my ROCm does not support it); additionally, libgomp/plugin/plugin-gcn.c has to be updated before it becomes useful. For better compatibility with LLVM's Clang, this commit additionally adds the macro definitions __GFX<9|10|11>__ for the architecture family, __AMDGPU__ besides the existing __AMDGCN__ and the two strings-containing macros __amdgcn_processor__ and __amdgcn_target_id__, where the former has '-' replaced by '_' but otherwise both contain the lower case name. For the new generic targets, the same happens, yielding, e.g., __gfx10_3_generic__. gcc/ChangeLog: config/gcn/gcn-devices.def: Add generic version/flag as additional value and architecture family entry; update; add gfx-10-3-generic and gfx11-generic. * config/gcn/gcn-hsa.h (ABI_VERSION_SPEC): Remove (ASM_SPEC): Use generated ABI_VERSION_OPT instead. * config/gcn/gcn-tables.opt: Regenerate * config/gcn/gcn.h (gcn_device_def): Add generic_version and arch_family members. (TARGET_CPU_CPP_BUILTINS): Fix allocation bug, handle '-' in the name and add additional macro defines. * config/gcn/gcn.cc (gcn_devices): Handle it. * config/gcn/gen-gcn-device-macros.awk: Likewise; use ELF name for the macro name; generate ABI_VERSION_OPT. * config/gcn/mkoffload.cc (ELFABIVERSION_AMDGPU_HSA_V6, EF_AMDGPU_GENERIC_VERSION_V, EF_AMDGPU_GENERIC_VERSION_OFFSET, GET_GENERIC_VERSION, SET_GENERIC_VERSION): Define. (get_arch): Call SET_GENERIC_VERSION flag on elf_flags. 
(copy_early_debug_info): If the arch sets the generic version, use ELFABIVERSION_AMDGPU_HSA_V6.	2024-10-22 20:06:50 +02:00
Torbjörn SVENSSON	205515da82	testsuite: arm: Use check-function-bodies in fp16-aapcs-* tests Converted the tests to use check-function-bodies in order to ensure that the sequence is correct. gcc/testsuite/ChangeLog: * gcc.target/arm/fp16-aapcs-1.c: Use check-function-bodies. * gcc.target/arm/fp16-aapcs-2.c: Likewise. * gcc.target/arm/fp16-aapcs-3.c: Likewise. * gcc.target/arm/fp16-aapcs-4.c: Likewise. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>	2024-10-22 19:04:00 +02:00
Torbjörn SVENSSON	a79ca49b5c	testsuite: arm: Relax expected asm in bitfield* and union-2 tests Below -O2, lsls/lsrs are preferred. For -O2 and above, lsl/lsr are preferred. gcc/testsuite/ChangeLog: * gcc.target/arm/cmse/mainline/8_1m/bitfield-4.c: Allow lsl and lsr instructions. * gcc.target/arm/cmse/mainline/8_1m/bitfield-6.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/bitfield-8.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/bitfield-and-union.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/union-2.c: Likewise. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>	2024-10-22 19:04:00 +02:00
Torbjörn SVENSSON	835ad52fbb	testsuite: arm: Use check-function-bodies in cmse-5 tests Converted the tests to use check-function-bodies in order to ensure that the sequence is correct. This also allows both APSR_nzcvq and APSR_nzcvqg as target selector does not work when the -march and/or -mcpu overrides the target to test. gcc/testsuite/ChangeLog: * gcc.target/arm/cmse/mainline/8m/hard-sp/cmse-5.c: Use check-function-bodies. * gcc.target/arm/cmse/mainline/8m/hard/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/8m/soft/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/8m/softfp-sp/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/8m/softfp/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/hard/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/soft/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/softfp-sp/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-5.c: Likewise. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>	2024-10-22 19:03:31 +02:00
Jonathan Wakely	85e5b80ee2	libstdc++: Avoid using std::__to_address with iterators In r12-3935-g82626be2d633a9 I added the partial specialization std::pointer_traits<__normal_iterator<It, Cont>> so that __to_address would work with __normal_iterator objects. Soon after that, François replaced it in r12-6004-g807ad4bc854cae with an overload of __to_address that served the same purpose, but was less complicated and less wrong. I now think that both commits were mistakes, and that instead of adding hacks to make __normal_iterator work with __to_address, we should not be using __to_address with iterators at all before C++20. The pre-C++20 std::__to_address function should only be used with pointer-like types, specifically allocator_traits<A>::pointer types. Those pointer-like types are guaranteed to be contiguous iterators, so that getting a raw memory address from them is OK. For arbitrary iterators, even random access iterators, we don't know that it's safe to lower the iterator to a pointer e.g. for std::deque iterators it's not, because (it + n) == (std::to_address(it) + n) only holds within the same block of the deque's storage. For C++20, std::to_address does work correctly for contiguous iterators, including __normal_iterator, and __to_address just calls std::to_address so also works. But we have to be sure we have an iterator that satisfies the std::contiguous_iterator concept for it to be safe, and we can't check that before C++20. So for pre-C++20 code the correct way to handle iterators that might be pointers or might be __normal_iterator is to call __niter_base, and if necessary use is_pointer to check whether __niter_base returned a real pointer. We currently have some uses of std::__to_address with iterators where we've checked that they're either pointers, or __normal_iterator wrappers around pointers, or satisfy std::contiguous_iterator. 
But this seems a little fragile, and it would be better to just use std::__niter_base for the pointers and __normal_iterator cases, and use C++20 std::to_address when the C++20 std::contiguous_iterator concept is satisfied. This patch does that. libstdc++-v3/ChangeLog: * include/bits/basic_string.h (basic_string::assign): Replace use of __to_address with __niter_base or std::to_address as appropriate. * include/bits/ptr_traits.h (__to_address): Add comment. * include/bits/shared_ptr_base.h (__shared_ptr): Qualify calls to __to_address. * include/bits/stl_algo.h (find): Replace use of __to_address with __niter_base or std::to_address as appropriate. Only use either of them when the range is not empty. * include/bits/stl_iterator.h (__to_address): Remove overload for __normal_iterator. * include/debug/safe_iterator.h (__to_address): Remove overload for _Safe_iterator. * include/std/ranges (views::counted): Replace use of __to_address with std::to_address. * testsuite/24_iterators/normal_iterator/to_address.cc: Removed.	2024-10-22 17:08:32 +01:00
Jennifer Schmitz	bf11ecbb02	testsuite: Add test directive checking removal of link_error This test needs a directive checking the removal of the link_error. Committed as obvious. Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com> gcc/testsuite/ * gcc.dg/tree-ssa/log_ident.c: Add scan for removal of link_error in optimized tree dump.	2024-10-22 15:16:10 +02:00
Patrick Palka	ae614b8a3d	c++: redundant hashing in register_specialization After r15-4050-g5dad738c1dd164 register_specialization needs to set elt.hash to the (maybe) precomputed hash so that the lookup uses it rather than redundantly computing it from scratch. gcc/cp/ChangeLog: * pt.cc (register_specialization): Set elt.hash. Reviewed-by: Jason Merrill <jason@redhat.com>	2024-10-22 08:01:16 -04:00
Richard Sandiford	4e80432c52	testsuite: Skip pr112305.c for -O[01] on simulators gcc.dg/torture/pr112305.c contains an inner loop that executes 0x8000_0014 times and an outer loop that executes 5 times, giving about 10 billion total executions of the inner loop body. At -O2 and above we are able to remove the inner loop, but at -O1 we keep a no-op loop: dls lr, r3 .L3: subs r3, r3, #1 le lr, .L3 and at -O0 we of course don't optimise. This can lead to long execution times on simulators, possibly triggering a timeout. gcc/testsuite * gcc.dg/torture/pr112305.c: Skip at -O0 and -O1 for simulators.	2024-10-22 12:47:45 +01:00
Nathaniel Shead	9f9afc65bb	c++/modules: Handle forward-declared class types In some cases we can access members of a namespace-scope class without ever having performed name-lookup on it; this can occur when a forward-declaration of the class is used as a return type, for instance, or with PIMPL. One possible approach would be to do name lookup in complete_type to force lazy loading to occur, but this seems overly expensive for a relatively rare case. Instead, this patch generalises the existing pending-entity support to handle this case as well. Unfortunately this does mean that almost every class definition will be added to the pending-entity table, and almost always unnecessarily, but I don't see a good way to avoid this. gcc/cp/ChangeLog: * module.cc (depset::DB_IS_MEMBER_BIT): Rename to... (depset::DB_IS_PENDING_BIT): ...this. (depset::is_member): Remove. (depset::is_pending_entity): New function. (depset::hash::make_dependency): Mark definitions of namespace-scope types as maybe-pending entities. (depset::hash::add_class_entities): Rename DB_IS_MEMBER_BIT to DB_IS_PENDING_BIT. (depset::hash::find_dependencies): Use is_pending_entity instead of is_member. (module_state::write_pendings): Likewise; adjust comment. gcc/testsuite/ChangeLog: * g++.dg/modules/inst-4_b.C: Adjust pending-entity count. * g++.dg/modules/member-def-1_c.C: Likewise. * g++.dg/modules/member-def-2_c.C: Likewise. * g++.dg/modules/tpl-spec-3_b.C: Likewise. * g++.dg/modules/tpl-spec-4_b.C: Likewise. * g++.dg/modules/tpl-spec-5_b.C: Likewise. * g++.dg/modules/class-9_a.H: New test. * g++.dg/modules/class-9_b.H: New test. * g++.dg/modules/class-9_c.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Jason Merrill <jason@redhat.com>	2024-10-22 22:34:30 +11:00
Richard Biener	d464a52d06	tree-optimization/117254 - ICE with access diagnostics The diagnostics code fails to handle non-constant domain max. PR tree-optimization/117254 * gimple-ssa-warn-access.cc (maybe_warn_nonstring_arg): Check the array domain max is constant before using it. * gcc.dg/pr117254.c: New testcase.	2024-10-22 13:25:48 +02:00
Andrew Stubbs	a6b26e5ea0	amdgcn: Refactor device settings into a def file Almost all device-specific settings are now centralised into gcn-devices.def for the compiler, mkoffload, and libgomp. No longer will we have to touch 10 files in multiple places just to add another device without any exotic features. (New ISAs and devices with incompatible metadata will continue to need a bit more.) In order to remove the device-specific conditionals in the code a new value HSACO_ATTR_UNSUPPORTED has been added, indicating that the assembler will reject any setting of that option. This incorporates some of Tobias's patch from March 2024. Co-Authored-By: Tobias Burnus <tburnus@baylibre.com> gcc/ChangeLog: * config.gcc (amdgcn): Add gcn-device-macros.h to tm_file. Add gcn-tables.opt to extra_options. * config/gcn/gcn-hsa.h (NO_XNACK): Delete. (NO_SRAM_ECC): Delete. (SRAMOPT): Move definition to generated file gcn-device-macros.h. (XNACKOPT): Likewise. (ASM_SPEC): Redefine using generated values from gcn-device-macros.h. * config/gcn/gcn-opts.h (enum processor_type): Generate from gcn-devices.def. (TARGET_VEGA10): Delete. (TARGET_VEGA20): Delete. (TARGET_GFX908): Delete. (TARGET_GFX90a): Delete. (TARGET_GFX90c): Delete. (TARGET_GFX1030): Delete. (TARGET_GFX1036): Delete. (TARGET_GFX1100): Delete. (TARGET_GFX1103): Delete. (TARGET_XNACK): Redefine to allow for HSACO_ATTR_UNSUPPORTED. (enum hsaco_attr_type): Add HSACO_ATTR_UNSUPPORTED. (TARGET_TGSPLIT): New define. * config/gcn/gcn.cc (gcn_devices): New constant table. (gcn_option_override): Rework to use gcn_devices table. (gcn_omp_device_kind_arch_isa): Likewise. (output_file_start): Likewise. (gcn_hsa_declare_function_name): Rework using TARGET_* macros. * config/gcn/gcn.h (gcn_devices): Declare struct and table. (TARGET_CPU_CPP_BUILTINS): Rework using gcn_devices. * config/gcn/gcn.opt: Move enum data to generated file gcn-tables.opt. Use new names for the default values. 
* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX900): Delete. (EF_AMDGPU_MACH_AMDGCN_GFX906): Delete. (EF_AMDGPU_MACH_AMDGCN_GFX908): Delete. (EF_AMDGPU_MACH_AMDGCN_GFX90a): Delete. (EF_AMDGPU_MACH_AMDGCN_GFX90c): Delete. (EF_AMDGPU_MACH_AMDGCN_GFX1030): Delete. (EF_AMDGPU_MACH_AMDGCN_GFX1036): Delete. (EF_AMDGPU_MACH_AMDGCN_GFX1100): Delete. (EF_AMDGPU_MACH_AMDGCN_GFX1103): Delete. (enum elf_arch_code): Define using gcn-devices.def. (get_arch): Rework using gcn-devices.def. (main): Rework using gcn-devices.def * config/gcn/t-gcn-hsa (gcn-tables.opt): Generate file. (gcn-device-macros.h): Generate file. * config/gcn/t-omp-device: Generate isa list from gcn-devices.def. * config/gcn/gcn-devices.def: New file. * config/gcn/gcn-tables.opt: New file. * config/gcn/gcn-tables.opt.urls: New file. * config/gcn/gen-gcn-device-macros.awk: New file. * config/gcn/gen-opt-tables.awk: New file. libgomp/ChangeLog: * plugin/plugin-gcn.c (EF_AMDGPU_MACH): Generate from gcn-devices.def. (gcn_gfx803_s): Delete. (gcn_gfx900_s): Delete. (gcn_gfx906_s): Delete. (gcn_gfx908_s): Delete. (gcn_gfx90a_s): Delete. (gcn_gfx90c_s): Delete. (gcn_gfx1030_s): Delete. (gcn_gfx1036_s): Delete. (gcn_gfx1100_s): Delete. (gcn_gfx1103_s): Delete. (gcn_isa_name_len): Delete. (isa_hsa_name): Rename ... (isa_name): ... to this, and rework using gcn-devices.def. (isa_gcc_name): Delete. (isa_code): Rework using gcn-devices.def. (max_isa_vgprs): Rework using gcn-devices.def. (isa_matches_agent): Update isa_name usage. (GOMP_OFFLOAD_init_device): Improve diagnostic using the name.	2024-10-22 11:07:05 +00:00
Richard Biener	c33d8c55a7	tree-optimization/117123 - missed PHI equivalence in VN Value-numbering can use its set of equivalences to prove that a PHI node with args <a_1, 5, 10> is equal to a_1 iff on the edges with the constants a_1 == 5 and a_1 == 10 hold. This breaks down when the order of PHI args is <5, 10, a_1> as then we drop to VARYING early. The following mitigates this by shuffling a copy of the edge vector to always process a SSA name argument first. Which should also handle the special-case of a two argument <5, a_1> we already had. PR tree-optimization/117123 * tree-ssa-sccvn.cc (visit_phi): First process a non-constant argument edge to handle more equivalences. Remove the two-arg special case. * g++.dg/tree-ssa/pr117123.C: New testcase.	2024-10-22 09:57:34 +02:00
Stefan Schulze Frielinghaus	9263523b7e	testsuite: Fix typo in ext-floating19.C gcc/testsuite/ChangeLog: * g++.dg/cpp23/ext-floating19.C: Fix typo for bfloat16 guard.	2024-10-22 08:58:14 +02:00
xuli	adf4ece4dc	RISC-V: Add testcases for unsigned .SAT_SUB form 1 with IMM = 1. form 1: T __attribute__((noinline)) \ sat_u_sub_imm##IMM##_##T##_fmt_1 (T y) \ { \ return (T)IMM >= y ? (T)IMM - y : 0; \ } Passed the rv64gcv regression test. Change-Id: I8805225b445cdbbc685f4f54a4d66c7ee8f748e1 Signed-off-by: Li Xu <xuli1@eswincomputing.com> gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_u_sub_imm-1_4.c: New test. * gcc.target/riscv/sat_u_sub_imm-2_4.c: New test. * gcc.target/riscv/sat_u_sub_imm-3_4.c: New test. * gcc.target/riscv/sat_u_sub_imm-4_2.c: New test.	2024-10-22 01:15:39 +00:00
xuli	4e65e12a9a	Match: Support IMM=1 for unsigned scalar .SAT_SUB IMM form 1 This patch would like to support .SAT_SUB when one of the op is IMM = 1 of form1. Form 1: #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \ T __attribute__((noinline)) \ sat_u_sub_imm##IMM##_##T##_fmt_1 (T y) \ { \ return IMM >= y ? IMM - y : 0; \ } Take below form 1 as example: DEF_SAT_U_SUB_IMM_FMT_1(uint8_t, 1) Before this patch: __attribute__((noinline)) uint8_t sat_u_sub_imm1_uint8_t_fmt_1 (uint8_t y) { uint8_t _1; uint8_t _3; <bb 2> [local count: 1073741824]: if (y_2(D) <= 1) goto <bb 3>; [41.00%] else goto <bb 4>; [59.00%] <bb 3> [local count: 440234144]: _3 = y_2(D) ^ 1; <bb 4> [local count: 1073741824]: # _1 = PHI <0(2), _3(3)> return _1; } After this patch: __attribute__((noinline)) uint8_t sat_u_sub_imm1_uint8_t_fmt_1 (uint8_t y) { uint8_t _1; ;; basic block 2, loop depth 0 ;; pred: ENTRY _1 = .SAT_SUB (1, y_2(D)); [tail call] return _1; ;; succ: EXIT } The below test suites are passed for this patch: 1. The rv64gcv fully regression tests. 2. The x86 bootstrap tests. 3. The x86 fully regression tests. Signed-off-by: Li Xu <xuli1@eswincomputing.com> gcc/ChangeLog: * match.pd: Support IMM=1.	2024-10-22 01:13:59 +00:00
xuli	93b6f28781	RISC-V: Add testcases for unsigned .SAT_SUB form 1 with IMM = max -1. form 1: T __attribute__((noinline)) \ sat_u_sub_imm##IMM##_##T##_fmt_1 (T y) \ { \ return (T)IMM >= y ? (T)IMM - y : 0; \ } Passed the rv64gcv regression test. Change-Id: Idaa1ab41f2a5785112279ea8ee2c93236457b740 Signed-off-by: Li Xu <xuli1@eswincomputing.com> gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_u_sub_imm-1_3.c: New test. * gcc.target/riscv/sat_u_sub_imm-2_3.c: New test. * gcc.target/riscv/sat_u_sub_imm-3_3.c: New test. * gcc.target/riscv/sat_u_sub_imm-4_1.c: New test.	2024-10-22 01:12:20 +00:00
xuli	1dccec47ab	Match: Support IMM=max-1 for unsigned scalar .SAT_SUB IMM form 1 This patch would like to support .SAT_SUB when one of the op is IMM = max - 1 of form1. Form 1: #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \ T __attribute__((noinline)) \ sat_u_sub_imm##IMM##_##T##_fmt_1 (T y) \ { \ return IMM >= y ? IMM - y : 0; \ } Take below form 1 as example: DEF_SAT_U_SUB_IMM_FMT_1(uint8_t, 254) Before this patch: __attribute__((noinline)) uint8_t sat_u_sub_imm254_uint8_t_fmt_1 (uint8_t y) { uint8_t _1; uint8_t _3; <bb 2> [local count: 1073741824]: if (y_2(D) != 255) goto <bb 3>; [66.00%] else goto <bb 4>; [34.00%] <bb 3> [local count: 708669600]: _3 = 254 - y_2(D); <bb 4> [local count: 1073741824]: # _1 = PHI <0(2), _3(3)> return _1; } After this patch: __attribute__((noinline)) uint8_t sat_u_sub_imm254_uint8_t_fmt_1 (uint8_t y) { uint8_t _1; <bb 2> [local count: 1073741824]: _1 = .SAT_SUB (254, y_2(D)); [tail call] return _1; } The below test suites are passed for this patch: 1. The rv64gcv fully regression tests. 2. The x86 bootstrap tests. 3. The x86 fully regression tests. Signed-off-by: Li Xu <xuli1@eswincomputing.com> gcc/ChangeLog: * match.pd: Support IMM=max-1.	2024-10-22 01:10:27 +00:00
GCC Administrator	52cc5f0436	Daily bump.	2024-10-22 00:20:27 +00:00
Jeff Law	36e91df771	[committed][PR rtl-optimization/116488] Fix SIGN_EXTEND source handling in ext-dce A while back I noticed that the code to call carry_backpropagate was being called after the optimization step. Which seemed wrong, but at the time I didn't have a testcase showing it as a problem. Now I have 4 :-) The way things used to work, the extension would be stripped away before calling carry_backpropagate, meaning carry_backpropagate would never see a SIGN_EXTENSION. Thus the code trying to account for the sign extended bit was never reached. Getting that bit marked live is what's needed to fix these testcases. Fallout is minor with just an adjustment needed to sensibly deal with vector modes in a place where we didn't have them before. I'm still somewhat concerned about this code. Specifically whether or not we can get in here with arbitrarily complex RTL, and if so do we need to recurse down and look at those sub-expressions. So while this patch fixes the most pressing issue, I wouldn't be terribly surprised if we're back inside this code at some point. Bootstrapped and regression tested on x86_64, ppc64le, riscv64, s390x, mips64, loongarch, aarch64, m68k, alpha, hppa, sh4, sh4eb, perhaps something else that I've forgotten... Also tested on all the crosses in my tester. PR rtl-optimization/116488 PR rtl-optimization/116579 PR rtl-optimization/116915 PR rtl-optimization/117226 gcc/ * ext-dce.cc (carry_backpropagate): Properly handle SIGN_EXTEND, add ZERO_EXTEND handling as well. (ext_dce_process_uses): Call carry_backpropagate before the optimization step. gcc/testsuite/ * gcc.dg/torture/pr116488.c: New test. * gcc.dg/torture/pr116579.c: New test. * gcc.dg/torture/pr116915.c: New test. * gcc.dg/torture/pr117226.c: New test.	2024-10-21 13:37:21 -06:00
Pan Li	cb131a401b	RISC-V: Add testcases for form 8 of vector signed SAT_TRUNC Form 8: #define DEF_VEC_SAT_S_TRUNC_FMT_8(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline)) \ vec_sat_s_trunc_##NT##_##WT##_fmt_8 (NT *out, WT *in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT x = in[i]; \ NT trunc = (NT)x; \ out[i] = (WT)NT_MIN >= x || x >= (WT)NT_MAX \ ? x < 0 ? NT_MIN : NT_MAX \ : trunc; \ } \ } The tests below passed for this patch. * The rv64gcv full regression test. It is a test-only patch and obvious up to a point; will commit it directly if there are no comments in the next 48H. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i64-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>	2024-10-21 22:14:30 +08:00
Pan Li	f138806811	RISC-V: Add testcases for form 7 of vector signed SAT_TRUNC Form 7: #define DEF_VEC_SAT_S_TRUNC_FMT_7(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline)) \ vec_sat_s_trunc_##NT##_##WT##_fmt_7 (NT out, WT in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT x = in[i]; \ NT trunc = (NT)x; \ out[i] = (WT)NT_MIN > x || x >= (WT)NT_MAX \ ? x < 0 ? NT_MIN : NT_MAX \ : trunc; \ } \ } The below test are passed for this patch. * The rv64gcv fully regression test. It is test only patch and obvious up to a point, will commit it directly if no comments in next 48H. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i64-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>	2024-10-21 22:14:30 +08:00
Pan Li	f411abe793	RISC-V: Add testcases for form 6 of vector signed SAT_TRUNC Form 6: #define DEF_VEC_SAT_S_TRUNC_FMT_6(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline)) \ vec_sat_s_trunc_##NT##_##WT##_fmt_6 (NT out, WT in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT x = in[i]; \ NT trunc = (NT)x; \ out[i] = (WT)NT_MIN >= x || x > (WT)NT_MAX \ ? x < 0 ? NT_MIN : NT_MAX \ : trunc; \ } \ } The below test are passed for this patch. * The rv64gcv fully regression test. It is test only patch and obvious up to a point, will commit it directly if no comments in next 48H. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i64-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>	2024-10-21 22:14:30 +08:00
Pan Li	108c8ef03d	RISC-V: Add testcases for form 5 of vector signed SAT_TRUNC Form 5: #define DEF_VEC_SAT_S_TRUNC_FMT_5(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline)) \ vec_sat_s_trunc_##NT##_##WT##_fmt_5 (NT out, WT in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT x = in[i]; \ NT trunc = (NT)x; \ out[i] = (WT)NT_MIN > x || x > (WT)NT_MAX \ ? x < 0 ? NT_MIN : NT_MAX \ : trunc; \ } \ } The below test are passed for this patch. * The rv64gcv fully regression test. It is test only patch and obvious up to a point, will commit it directly if no comments in next 48H. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i64-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>	2024-10-21 22:14:30 +08:00
Pan Li	f30ca9867a	RISC-V: Add testcases for form 4 of vector signed SAT_TRUNC Form 4: #define DEF_VEC_SAT_S_TRUNC_FMT_4(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline)) \ vec_sat_s_trunc_##NT##_##WT##_fmt_4 (NT out, WT in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT x = in[i]; \ NT trunc = (NT)x; \ out[i] = (WT)NT_MIN <= x && x < (WT)NT_MAX \ ? trunc \ : x < 0 ? NT_MIN : NT_MAX; \ } \ } The below test are passed for this patch. * The rv64gcv fully regression test. It is test only patch and obvious up to a point, will commit it directly if no comments in next 48H. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i64-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>	2024-10-21 22:14:30 +08:00
Pan Li	efa1617bfc	RISC-V: Add testcases for form 3 of vector signed SAT_TRUNC Form 3: #define DEF_VEC_SAT_S_TRUNC_FMT_3(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline)) \ vec_sat_s_trunc_##NT##_##WT##_fmt_3 (NT out, WT in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT x = in[i]; \ NT trunc = (NT)x; \ out[i] = (WT)NT_MIN < x && x < (WT)NT_MAX \ ? trunc \ : x < 0 ? NT_MIN : NT_MAX; \ } \ } The below test are passed for this patch. * The rv64gcv fully regression test. It is test only patch and obvious up to a point, will commit it directly if no comments in next 48H. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i64-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>	2024-10-21 22:14:30 +08:00
Pan Li	033900fc17	RISC-V: Add testcases for form 2 of vector signed SAT_TRUNC Form 2: #define DEF_VEC_SAT_S_TRUNC_FMT_2(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline)) \ vec_sat_s_trunc_##NT##_##WT##_fmt_2 (NT out, WT in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT x = in[i]; \ NT trunc = (NT)x; \ out[i] = (WT)NT_MIN < x && x < (WT)NT_MAX \ ? trunc \ : x < 0 ? NT_MIN : NT_MAX; \ } \ } The below test are passed for this patch. * The rv64gcv fully regression test. It is test only patch and obvious up to a point, will commit it directly if no comments in next 48H. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i64-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>	2024-10-21 22:14:30 +08:00
Pan Li	1f3a9c08af	RISC-V: Add testcases for form 1 of vector signed SAT_TRUNC Form 1: #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline)) \ vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT out, WT in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT x = in[i]; \ NT trunc = (NT)x; \ out[i] = (WT)NT_MIN <= x && x <= (WT)NT_MAX \ ? trunc \ : x < 0 ? NT_MIN : NT_MAX; \ } \ } The below test are passed for this patch. * The rv64gcv fully regression test. It is test only patch and obvious up to a point, will commit it directly if no comments in next 48H. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/unop/vec_sat_data.h: Add test data for signed SAT_TRUNC. * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i64-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>	2024-10-21 22:14:30 +08:00
Pan Li	b5a0581541	RISC-V: Implement vector SAT_TRUNC for signed integer This patch would like to implement the sstrunc for vector signed integer. Form 1: #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline)) \ vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT out, WT in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT x = in[i]; \ NT trunc = (NT)x; \ out[i] = (WT)NT_MIN <= x && x <= (WT)NT_MAX \ ? trunc \ : x < 0 ? NT_MIN : NT_MAX; \ } \ } DEF_VEC_SAT_S_TRUNC_FMT_1(int32_t, int64_t, INT32_MIN, INT32_MAX) Before this patch: 27 │ vsetvli a5,a2,e64,m1,ta,ma 28 │ vle64.v v1,0(a1) 29 │ slli a3,a5,3 30 │ slli a4,a5,2 31 │ sub a2,a2,a5 32 │ add a1,a1,a3 33 │ vadd.vv v0,v1,v5 34 │ vsetvli zero,zero,e32,mf2,ta,ma 35 │ vnsrl.wx v2,v1,a6 36 │ vncvt.x.x.w v1,v1 37 │ vsetvli zero,zero,e64,m1,ta,ma 38 │ vmsgtu.vv v0,v0,v4 39 │ vsetvli zero,zero,e32,mf2,ta,mu 40 │ vneg.v v2,v2 41 │ vxor.vv v1,v2,v3,v0.t 42 │ vse32.v v1,0(a0) 43 │ add a0,a0,a4 44 │ bne a2,zero,.L3 After this patch: 16 │ vsetvli a5,a2,e32,mf2,ta,ma 17 │ vle64.v v1,0(a1) 18 │ slli a3,a5,3 19 │ slli a4,a5,2 20 │ sub a2,a2,a5 21 │ add a1,a1,a3 22 │ vnclip.wi v1,v1,0 23 │ vse32.v v1,0(a0) 24 │ add a0,a0,a4 25 │ bne a2,zero,.L3 The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/ChangeLog: * config/riscv/autovec.md (sstrunc<mode><v_double_trunc>2): Add new pattern sstrunc for double trunc. (sstrunc<mode><v_quad_trunc>2): Ditto but for quad trunc. (sstrunc<mode><v_oct_trunc>2): Ditto but for oct trunc. * config/riscv/riscv-protos.h (expand_vec_double_sstrunc): Add new func decl to expand double trunc. (expand_vec_quad_sstrunc): Ditto but for quad trunc. (expand_vec_oct_sstrunc): Ditto but for oct trunc. * config/riscv/riscv-v.cc (expand_vec_double_sstrunc): Add new func to expand double trunc. (expand_vec_quad_sstrunc): Ditto but for quad trunc. (expand_vec_oct_sstrunc): Ditto but for oct trunc. Signed-off-by: Pan Li <pan2.li@intel.com>	2024-10-21 22:14:30 +08:00
Pan Li	2987ca6100	Vect: Try the pattern of vector signed integer SAT_TRUNC Almost the same as vector unsigned integer SAT_TRUNC, try to match the signed version during the vector pattern matching. The below test suites are passed for this patch. * The rv64gcv fully regression test. * The x86 bootstrap test. * The x86 fully regression test. gcc/ChangeLog: * tree-vect-patterns.cc (gimple_signed_integer_sat_trunc): Add new func decl for signed SAT_TRUNC. (vect_recog_sat_trunc_pattern): Try signed match pattern for the SAT_TRUNC. Signed-off-by: Pan Li <pan2.li@intel.com>	2024-10-21 22:12:08 +08:00
Pan Li	bdbb74e38f	Match: Support form 1 for vector signed integer SAT_TRUNC This patch would like to support the form 1 of the vector signed integer SAT_TRUNC. Aka below example: Form 1: #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline)) \ vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT out, WT in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT x = in[i]; \ NT trunc = (NT)x; \ out[i] = (WT)NT_MIN <= x && x <= (WT)NT_MAX \ ? trunc \ : x < 0 ? NT_MIN : NT_MAX; \ } \ } DEF_VEC_SAT_S_TRUNC_FMT_1(int32_t, int64_t, INT32_MIN, INT32_MAX) Before this patch: 48 │ _87 = .SELECT_VL (ivtmp_85, POLY_INT_CST [2, 2]); 49 │ ivtmp_64 = _87 * 8; 50 │ vect_x_14.10_67 = .MASK_LEN_LOAD (vectp_in.8_65, 64B, { -1, ... }, _87, 0); 51 │ vect_trunc_15.21_78 = (vector([2,2]) int) vect_x_14.10_67; 52 │ _61 = VIEW_CONVERT_EXPR<vector([2,2]) unsigned long>(vect_x_14.10_67); 53 │ _32 = _61 >> 63; 54 │ vect_patt_52.16_73 = (vector([2,2]) int) _32; 55 │ vect__46.17_74 = VIEW_CONVERT_EXPR<vector([2,2]) unsigned int>(vect_patt_52.16_73); 56 │ vect__47.18_75 = -vect__46.17_74; 57 │ vect__21.19_76 = VIEW_CONVERT_EXPR<vector([2,2]) int>(vect__47.18_75); 58 │ vect_x.11_68 = VIEW_CONVERT_EXPR<vector([2,2]) unsigned long>(vect_x_14.10_67); 59 │ vect__5.12_69 = vect_x.11_68 + { 2147483648, ... }; 60 │ mask__34.13_70 = vect__5.12_69 > { 4294967295, ... }; 61 │ _25 = .COND_XOR (mask__34.13_70, vect__21.19_76, { 2147483647, ... }, vect_trunc_15.21_78); 62 │ ivtmp_80 = _87 * 4; 63 │ .MASK_LEN_STORE (vectp_out.23_81, 32B, { -1, ... }, _87, 0, _25); 64 │ vectp_in.8_66 = vectp_in.8_65 + ivtmp_64; 65 │ vectp_out.23_82 = vectp_out.23_81 + ivtmp_80; 66 │ ivtmp_86 = ivtmp_85 - _87; After this patch: 38 │ _77 = .SELECT_VL (ivtmp_75, POLY_INT_CST [2, 2]); 39 │ ivtmp_65 = _77 * 8; 40 │ vect_x_14.10_68 = .MASK_LEN_LOAD (vectp_in.8_66, 64B, { -1, ... }, _77, 0); 41 │ vect_patt_53.11_69 = .SAT_TRUNC (vect_x_14.10_68); 42 │ ivtmp_70 = _77 * 4; 43 │ .MASK_LEN_STORE (vectp_out.12_71, 32B, { -1, ... }, _77, 0, vect_patt_53.11_69); 44 │ vectp_in.8_67 = vectp_in.8_66 + ivtmp_65; 45 │ vectp_out.12_72 = vectp_out.12_71 + ivtmp_70; 46 │ ivtmp_76 = ivtmp_75 - _77; The below test suites are passed for this patch. * The rv64gcv fully regression test. * The x86 bootstrap test. * The x86 fully regression test. gcc/ChangeLog: * match.pd: Refine matching for vector signed SAT_TRUNC form 1. Signed-off-by: Pan Li <pan2.li@intel.com>	2024-10-21 22:12:08 +08:00
Andrew Carlotti	8193e71a07	aarch64: Fix costing of move to/from MOVEABLE_SYSREGS This is necessary to prevent reload assuming that a direct FP->FPMR move is valid. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_register_move_cost): Increase costs involving MOVEABLE_SYSREGS.	2024-10-21 15:00:48 +01:00
Andrew Stubbs	0b6d94ce72	amdgcn: silence warning FIRST_SGPR_REG is register zero so the compiler always claims this comparison is redundant. It's right, of course, but I'd have preferred to keep the comparison for completeness. Probably the "correct" solution is to use an enum for these values. gcc/ChangeLog: * config/gcn/gcn.h (SGPR_REGNO_P): Silence warning.	2024-10-21 12:41:01 +00:00
Alex Coplan	c0e54ce199	pair-fusion: Assume alias conflict if common address reg changes [PR116783] As the PR shows, pair-fusion was tricking memory_modified_in_insn_p into returning false when a common base register (in this case, x1) was modified between the mem and the store insn. This led to wrong code as the accesses really did alias. To avoid this sort of problem, this patch avoids invoking RTL alias analysis altogether (and assumes an alias conflict) if the two insns to be compared share a common address register R, and the insns see different definitions of R (i.e. it was modified in between). gcc/ChangeLog: PR rtl-optimization/116783 * pair-fusion.cc (def_walker::cand_addr_uses): New. (def_walker::def_walker): Add parameter for candidate address uses. (def_walker::alias_conflict_p): Declare. (def_walker::addr_reg_conflict_p): New. (def_walker::conflict_p): New. (store_walker::store_walker): Add parameter for candidate address uses and pass to base ctor. (store_walker::conflict_p): Rename to ... (store_walker::alias_conflict_p): ... this. (load_walker::load_walker): Add parameter for candidate address uses and pass to base ctor. (load_walker::conflict_p): Rename to ... (load_walker::alias_conflict_p): ... this. (pair_fusion_bb_info::try_fuse_pair): Collect address register uses for candidate insns and pass down to alias walkers. gcc/testsuite/ChangeLog: PR rtl-optimization/116783 * g++.dg/torture/pr116783.C: New test.	2024-10-21 13:30:08 +01:00
Jonathan Wakely	d0d99fc6b6	libstdc++: Improve 26_numerics/headers/cmath/types_std_c++0x_neg.cc This test checks that the special functions in <cmath> are not declared prior to C++17. But we can remove the target selector and allow it to be tested for C++17 and later, and add target selectors to the individual dg-error directives instead. Also rename the test to match what it actually tests. libstdc++-v3/ChangeLog: * testsuite/26_numerics/headers/cmath/types_std_c++0x_neg.cc: Move to ... * testsuite/26_numerics/headers/cmath/specfun_c++17.cc: here and adjust test to be valid for all -std dialects.	2024-10-21 12:12:15 +01:00
Jonathan Wakely	1003a42815	libstdc++: Simplify C++98 std::vector::_M_data_ptr overload set We don't need separate overloads for returning a const or non-const pointer. We can make the member function const and return a non-const pointer, and let vector::data() const convert it to const as needed. libstdc++-v3/ChangeLog: * include/bits/stl_vector.h (vector::_M_data_ptr): Remove non-const overloads. Always return non-const pointer.	2024-10-21 12:12:15 +01:00
Jonathan Wakely	cba8069125	libstdc++: Fix order of [[...]] and __attribute__((...)) attrs [PR117220] GCC allows these in either order, but Clang doesn't like the C++11-style [[__nodiscard__]] coming after __attribute__((__always_inline__)). libstdc++-v3/ChangeLog: PR libstdc++/117220 * include/bits/stl_iterator.h: Move _GLIBCXX_NODISCARD annotations after __attribute__((__always_inline__)).	2024-10-21 12:12:15 +01:00
Jeevitha	1a4c5643a5	rs6000: Correct the function code for _AMO_LD_DEC_BOUNDED Corrected the function code for the Atomic Memory Operation "Fetch and Decrement Bounded", changing it from 0x1A to 0x1C. 2024-10-11 Jeevitha Palanisamy <jeevitha@linux.ibm.com> gcc/ * config/rs6000/amo.h (enum _AMO_LD): Correct the function code for _AMO_LD_DEC_BOUNDED.	2024-10-21 03:44:04 -05:00
Haochen Jiang	f132c006d7	i386: Refactor get_intel_cpu From ISE, it shows that we will have family 0x13 for Diamond Rapids. Therefore, we need to refactor the get_intel_cpu to accept new families. Also I did some reorder in the switch for clearness by putting earlier added products on top for search convenience. gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_intel_cpu): Refactor the function for future expansion on different family.	2024-10-21 13:42:12 +08:00


214630 Commits
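
For readers comparing the SAT_TRUNC forms listed in the log above, the following is a minimal scalar sketch in C of form 1 and form 5, specialised to int64_t to int32_t. It is not the testsuite macro itself: the function names are hypothetical, and it assumes that `out` and `in` are pointers (the flattened log shows them without `*`, which may have been lost in extraction). It only illustrates that the two conditions select the same saturated result.

```c
#include <assert.h>
#include <stdint.h>

/* Form 1 (range check first), specialised to int64_t -> int32_t.
   Assumption: out and in are pointers to arrays of at least `limit` elements. */
static void sat_s_trunc_fmt_1(int32_t *out, const int64_t *in, unsigned limit)
{
  for (unsigned i = 0; i < limit; i++)
    {
      int64_t x = in[i];
      /* Out-of-range conversion is implementation-defined, not undefined;
         the value is only used when x is in range. */
      int32_t trunc = (int32_t)x;
      out[i] = (int64_t)INT32_MIN <= x && x <= (int64_t)INT32_MAX
               ? trunc
               : x < 0 ? INT32_MIN : INT32_MAX;
    }
}

/* Form 5 (out-of-range check first), same specialisation. */
static void sat_s_trunc_fmt_5(int32_t *out, const int64_t *in, unsigned limit)
{
  for (unsigned i = 0; i < limit; i++)
    {
      int64_t x = in[i];
      int32_t trunc = (int32_t)x;
      out[i] = (int64_t)INT32_MIN > x || x > (int64_t)INT32_MAX
               ? x < 0 ? INT32_MIN : INT32_MAX
               : trunc;
    }
}

/* Hypothetical helper: checks that both forms agree on boundary inputs. */
static void self_check(void)
{
  const int64_t in[] = {
    INT64_MIN, (int64_t)INT32_MIN - 1, INT32_MIN, -1, 0, 1,
    INT32_MAX, (int64_t)INT32_MAX + 1, INT64_MAX
  };
  enum { N = sizeof in / sizeof in[0] };
  int32_t a[N], b[N];

  sat_s_trunc_fmt_1(a, in, N);
  sat_s_trunc_fmt_5(b, in, N);
  for (unsigned i = 0; i < N; i++)
    assert(a[i] == b[i]);
}
```

Calling `self_check()` from a `main` exercises both forms on values just inside and just outside the int32_t range. The other forms in the log differ only in whether each comparison is strict; at the exact boundary values the truncated and saturated results coincide, so the choice does not change the output.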