226394 Commits

Author SHA1 Message Date
Thomas Schwinge
b4eb45a15d Add 'libgomp.c++/target-std__[...]-concurrent-usm.C' test cases for C++ 'std::unordered_map', 'std::unordered_multimap', 'std::unordered_multiset', 'std::unordered_set'
libgomp/
	* testsuite/libgomp.c++/target-std__unordered_map-concurrent-usm.C:
	New.
	* testsuite/libgomp.c++/target-std__unordered_multimap-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__unordered_multiset-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__unordered_set-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__unordered_map-concurrent.C:
	Adjust.
	* testsuite/libgomp.c++/target-std__unordered_multimap-concurrent.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__unordered_multiset-concurrent.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__unordered_set-concurrent.C:
	Likewise.
2026-01-14 16:00:56 +01:00
Thomas Schwinge
4b60a4da49 Add 'libgomp.c++/target-std__[...]-concurrent-usm.C' test cases for C++ 'std::flat_map', 'std::flat_multimap', 'std::flat_multiset', 'std::flat_set'
libgomp/
	* testsuite/libgomp.c++/target-std__flat_map-concurrent-usm.C:
	New.
	* testsuite/libgomp.c++/target-std__flat_multimap-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__flat_multiset-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__flat_set-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__flat_map-concurrent.C: Adjust.
	* testsuite/libgomp.c++/target-std__flat_multimap-concurrent.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__flat_multiset-concurrent.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__flat_set-concurrent.C:
	Likewise.
2026-01-14 16:00:56 +01:00
Thomas Schwinge
e9e76f7607 Fix up 'libgomp.c++/target-std__[...]-concurrent-usm.C' dynamic memory allocation
OpenMP/USM implies memory accessible from host as well as device, but doesn't
imply that allocation vs. deallocation may be done in the opposite context.
For most of the test cases, (by construction) we're not allocating memory
during device execution, so have nothing to clean up.  (..., but still document
these semantics.)  But for a few, we have to clean up:
'libgomp.c++/target-std__map-concurrent-usm.C',
'libgomp.c++/target-std__multimap-concurrent-usm.C',
'libgomp.c++/target-std__multiset-concurrent-usm.C',
'libgomp.c++/target-std__set-concurrent-usm.C'.

For 'libgomp.c++/target-std__multimap-concurrent-usm.C' (only), this issue
already got addressed in commit 90f2ab4b6e
"libgomp.c++/target-std__multimap-concurrent.C: Fix USM memory freeing".
However, instead of invoking the 'clear' function (which doesn't generally
guarantee to release dynamically allocated memory; for example, see PR123582
"C++ unordered associative container: dynamic memory management"), we properly
restore the respective object into pristine state.

	libgomp/
	* testsuite/libgomp.c++/target-std__array-concurrent-usm.C:
	'#define OMP_USM'.
	* testsuite/libgomp.c++/target-std__forward_list-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__list-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__span-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__map-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__multimap-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__multiset-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__set-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__valarray-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__vector-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__bitset-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__deque-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__array-concurrent.C: Comment.
	* testsuite/libgomp.c++/target-std__bitset-concurrent.C: Likewise.
	* testsuite/libgomp.c++/target-std__deque-concurrent.C: Likewise.
	* testsuite/libgomp.c++/target-std__forward_list-concurrent.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__list-concurrent.C: Likewise.
	* testsuite/libgomp.c++/target-std__span-concurrent.C: Likewise.
	* testsuite/libgomp.c++/target-std__valarray-concurrent.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__vector-concurrent.C: Likewise.
	* testsuite/libgomp.c++/target-std__map-concurrent.C [OMP_USM]:
	Fix up dynamic memory allocation.
	* testsuite/libgomp.c++/target-std__multimap-concurrent.C
	[OMP_USM]: Likewise.
	* testsuite/libgomp.c++/target-std__multiset-concurrent.C
	[OMP_USM]: Likewise.
	* testsuite/libgomp.c++/target-std__set-concurrent.C [OMP_USM]:
	Likewise.
2026-01-14 16:00:56 +01:00
Thomas Schwinge
3dc9eedd95 libgomp: Add a few more OpenMP/USM test cases
... where there are clear differences in behavior for OpenMP/USM run-time
configurations.

We shall further clarify all the intended semantics, once the implementation
begins to differentiate OpenMP 'requires unified_shared_memory' vs.
'requires self_maps'.

	libgomp/
	* testsuite/libgomp.c-c++-common/map-arrayofstruct-2-usm.c: New.
	* testsuite/libgomp.c-c++-common/map-arrayofstruct-3-usm.c:
	Likewise.
	* testsuite/libgomp.c-c++-common/struct-elem-5-usm.c: Likewise.
	* testsuite/libgomp.c-c++-common/target-present-1-usm.c: Likewise.
	* testsuite/libgomp.c-c++-common/target-present-2-usm.c: Likewise.
	* testsuite/libgomp.c-c++-common/target-present-3-usm.c: Likewise.
	* testsuite/libgomp.fortran/map-subarray-5-usm.f90: Likewise.
	* testsuite/libgomp.fortran/map-subarray-6-usm.f90: Likewise.
	* testsuite/libgomp.fortran/map-subarray-7-usm.f90: Likewise.
	* testsuite/libgomp.fortran/target-allocatable-1-1-usm.f90:
	Likewise.
	* testsuite/libgomp.fortran/target-allocatable-1-2-usm.f90:
	Likewise.
	* testsuite/libgomp.fortran/target-enter-data-2-usm.F90: Likewise.
	* testsuite/libgomp.fortran/target-present-1-usm.f90: Likewise.
	* testsuite/libgomp.fortran/target-present-2-usm.f90: Likewise.
	* testsuite/libgomp.fortran/target-present-3-usm.f90: Likewise.
	* testsuite/libgomp.fortran/target-allocatable-1-1.f90: Adjust.
	* testsuite/libgomp.fortran/target-allocatable-1-2.f90: Likewise.
	* testsuite/libgomp.fortran/target-present-1.f90: Likewise.
	* testsuite/libgomp.fortran/target-present-2.f90: Likewise.
	* testsuite/libgomp.fortran/target-present-3.f90: Likewise.
2026-01-14 16:00:56 +01:00
Jakub Jelinek
3a05f190ff defaults: Use argument in default EH_RETURN_DATA_REGNO definition [PR123115]
All targets use the EH_RETURN_DATA_REGNO macro argument except for
NVPTX which uses the default.
The problem is that we get then -Wunused-but-set-variable warning
when building df-scan.cc for NVPTX target with GCC 16 (post r16-2258
PR44677) on:
      unsigned int i;
      /* Mark the registers that will contain data for the handler.  */
      for (i = 0; ; ++i)
        {
          unsigned regno = EH_RETURN_DATA_REGNO (i);
          if (regno == INVALID_REGNUM)
            break;
If it were multiple targets suffering from this, I'd think about
adding something to use i in loops like this, but as it is
just the default definition, the following patch fixes it by
using the argument.
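
The effect of the comma expression can be sketched outside of GCC. The following is a minimal illustration only: DEMO_INVALID_REGNUM and both DEMO_* macros are hypothetical stand-ins, not the actual defaults.h text. It shows a default that ignores its argument next to one that evaluates and discards it.

```cpp
#include <cassert>
#include <climits>

// Hypothetical stand-in for GCC's INVALID_REGNUM.
constexpr unsigned DEMO_INVALID_REGNUM = UINT_MAX;

// A default that ignores N: a counter passed here is set but never read.
#define DEMO_REGNO_IGNORING_ARG(N) DEMO_INVALID_REGNUM

// A default that evaluates and discards N before yielding the same value.
#define DEMO_REGNO_USING_ARG(N) ((void) (N), DEMO_INVALID_REGNUM)

// Counts registers reported before the terminator, mirroring the loop above.
unsigned count_data_regs ()
{
  unsigned found = 0;
  unsigned i;
  for (i = 0; ; ++i)
    {
      unsigned regno = DEMO_REGNO_USING_ARG (i);
      if (regno == DEMO_INVALID_REGNUM)
        break;
      ++found;
    }
  return found;
}

// Both variants yield the same value; only the use of the argument differs.
void demo_check ()
{
  assert (DEMO_REGNO_IGNORING_ARG (7) == DEMO_REGNO_USING_ARG (7));
  assert (count_data_regs () == 0);
}
```

Because the argument is cast to void, the value of the macro is unchanged; the only difference is that the loop counter now counts as read.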

2026-01-14  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/123115
	* defaults.h (EH_RETURN_DATA_REGNO): Add void (N) to the macro
	definition inside of a comma expression before INVALID_REGNUM.
2026-01-14 15:56:29 +01:00
Jakub Jelinek
fc2ee2f20c combine: Partially revert the r12-4475 changes [PR120250]
The r12-4475 change added extra code to recog_for_combine to attempt to
force some constants into the constant pool.
Unfortunately, as this (UB at runtime) testcase shows, such changes are
harmful for computed_jump_p jumps.  The computed_jump_p returns false
for loads from constant pool MEMs:
    case MEM:
      return ! (GET_CODE (XEXP (x, 0)) == SYMBOL_REF
                && CONSTANT_POOL_ADDRESS_P (XEXP (x, 0)));
and so if we try to optimize a computed jump that way, it becomes
a non-computed jump which doesn't match any other jump category
(simplejump_p, tablejump_p, condjump_p, returnjump_p, eh_returnjump_p,
asm goto) and doesn't have any label recorded in JUMP_LABEL (because,
it doesn't really jump to any LABEL), so some passes like dwarf2cfi
can get confused about it and ICE.

The following patch just prevents that, by only doing the r12-4475
changes if it is not a jump.

2026-01-14  Jakub Jelinek  <jakub@redhat.com>

	PR target/120250
	* combine.cc (recog_for_combine): Don't try to put SET_SRC
	into a constant pool if SET_DEST is pc_rtx.

	* gcc.c-torture/compile/pr120250.c: New test.
2026-01-14 15:53:44 +01:00
Richard Biener
948d33f490 tree-optimization/123190 - fix costing of permuted contiguous loads
The following fixes a regression from the time we split load groups
along SLP boundaries.  When we face a permuted load from an access
that is contiguous across loop iterations we emit code that loads
the whole group and then emit required permutations.  The permutations
might not need all those loads, and if we split the group we would
not have emitted them.  Fortunately when analyzing a permutation
we compute both the number of required permutes and the number of
loads that will survive the following DCE.  So make sure to use that
when costing.  This allows the previously added testcase for PR123190
to undergo epilog vectorization also at -O2 plus when using non-generic
tuning, such as tuning for Zen4 which ups the cost for XMM loads.

	PR tree-optimization/123190
	* tree-vectorizer.h (vect_load_store_data): Add n_loads member.
	* tree-vect-stmts.cc (get_load_store_type): Record the
	number of required loads for permuted loads.
	(vectorizable_load): Make use of this when costing loads
	for VMAT_CONTIGUOUS[_REVERSE].

	* gcc.dg/vect/costmodel/x86_64/costmodel-pr123190-1.c: Do not
	require -mtune=generic.
	* gcc.dg/vect/costmodel/x86_64/costmodel-pr123190-2.c: Add
	variant with -O2 instead of -O3, inner loop not unrolled.
2026-01-14 14:44:00 +01:00
Richard Biener
96bc77e45c tree-optimization/123190 - allow VF == 1 epilog vectorization
The following adjusts the condition where we reject vectorization
because the scalar loop runs only for a single iteration (or two,
in case we need to peel for gaps).  This is over-eager
when considering the case of VF == 1, where the cost model
should instead decide whether it is worthwhile or not.  I'm playing
conservative here and exclude the case of two iterations as I
do not have benchmark evidence.

This helps fix a regression observed with improved SLP handling,
not exactly for the options used in the PR though, but for a more
common -O3 -march=x86-64-v3 this speeds up 433.milc by 6%.

	PR tree-optimization/123190
	* tree-vect-loop.cc (vect_analyze_loop_costing): Allow
	vectorizing loops with a single scalar iteration iff the
	vectorization factor is 1.

	* gcc.dg/vect/costmodel/x86_64/costmodel-pr123190-1.c: New testcase.
	* gcc.dg/vect/slp-28.c: Avoid epilogue vectorization for
	simplicity.
2026-01-14 14:44:00 +01:00
Jakub Jelinek
9167c9eeea simplify-rtx: Fix up SUBREG and LSHIFTRT order canonicalization for AND with constant [PR123544]
On Tue, Nov 04, 2025 at 12:59:03PM +0530, Kishan Parmar wrote:
>       PR rtl-optimization/93738
>       * simplify-rtx.cc (simplify_binary_operation_1): Canonicalize
>       SUBREG(LSHIFTRT) into LSHIFTRT(SUBREG) when valid.

This change regressed the following testcase on aarch64-linux.
From what I can see, the PR93738 change has been written with non-paradoxical
SUBREGs in mind but on this testcase on aarch64 we have a paradoxical SUBREG,
in particular simplify_binary_operation_1 is called with AND, SImode,
(subreg:SI (lshiftrt:HI (subreg:HI (reg/v:SI 108 [ x ]) 0)
        (const_int 8 [0x8])) 0)
and op1 (const_int 32767 [0x7fff]) and simplifies that since the PR93738
optimization was added into
(and:SI (lshiftrt:SI (reg/v:SI 108 [ x ])
        (const_int 8 [0x8]))
    (const_int 32767 [0x7fff]))
This looks wrong to me.
Consider that (reg/v:SI 108 [ x ]) could have value 0x12345678U.
The original expression takes lowpart 16-bits from that, i.e. 0x5678U,
shifts that right logically by 8 bits, so 0x56U, makes a paradoxical SUBREG
from that, i.e. 0x????0056U and masks that with 0x7fff, i.e. result is 0x56U.
The new expression shifts 0x12345678U logically right by 8 bits, i.e. 0x123456U and
masks it by 0x7fff, result 0x3456U.
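
The arithmetic above can be replayed in plain C++. This is only a model of the two RTL expressions, not GCC code: the function names are made up for illustration, and the undefined upper bits of the paradoxical SUBREG are modeled by an explicit junk parameter.

```cpp
#include <cassert>
#include <cstdint>

// Original expression: low 16 bits, logical shift right by 8, widen, mask.
// 'junk' models the undefined upper half of the paradoxical SUBREG.
std::uint32_t original_form (std::uint32_t x, std::uint16_t junk)
{
  std::uint16_t low = static_cast<std::uint16_t> (x);            // (subreg:HI x 0)
  std::uint16_t shifted = static_cast<std::uint16_t> (low >> 8); // (lshiftrt:HI ... 8)
  std::uint32_t widened = (static_cast<std::uint32_t> (junk) << 16) | shifted;
  return widened & 0x7fffu;                                      // (and:SI ... 0x7fff)
}

// Rewritten expression: shift the full 32-bit value, then mask.
std::uint32_t rewritten_form (std::uint32_t x)
{
  return (x >> 8) & 0x7fffu;
}

// The two forms disagree for the value used in the text.
void demo_check ()
{
  assert (original_form (0x12345678u, 0xffffu) == 0x56u);
  assert (rewritten_form (0x12345678u) == 0x3456u);
}
```

Whatever the junk bits are, the mask removes them in the original form, so the mismatch comes solely from bits 16 to 22 of x leaking into the rewritten result.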

Thus, I think we need to limit to non-paradoxical SUBREGs.
On the rlwimi-2.c testcase I see on powerpc64le-linux no differences in
emitted assembly without/with the patch.

2026-01-14  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/123544
	* simplify-rtx.cc (simplify_context::simplify_binary_operation_1)
	<case AND>: Don't canonicalize (subreg (lshiftrt (x cnt)) low) into
	(lshiftrt (subreg x low) cnt) if the SUBREG is paradoxical.

	* gcc.dg/pr123544.c: New test.
2026-01-14 13:21:57 +01:00
Tomasz Kamiński
e16de4a10f libstdc++: Add comment justifying separate proxy_random_access_iterator_wrapper.
It meets Cpp17RandomAccessIterator requirements, but does not satisfy
random_access_iterator concept.

libstdc++-v3/ChangeLog:

	* testsuite/util/testsuite_iterators.h: Modify comment.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
2026-01-14 11:43:38 +01:00
Prathamesh Kulkarni
7f114df909 Enable time profile function reordering with AutoFDO.
The patch enables time profile based reordering with AutoFDO with
-fauto-profile -fprofile-reorder-functions, by mapping timestamps obtained from perf
into node->tp_first_run.

The rationale for doing this is:
(1) GCC already implements time-profile function reordering with PGO, the patch enables
it with AutoFDO.
(2) While time profile ordering is primarily meant for optimizing startup time,
we've also observed good effects on code-locality for large internal workloads.
(3) Possibly useful for function reordering when accurate profile annotation is
hard with AutoFDO, e.g. if branch samples are missing (due to absence of
an LBR-like structure).

On AutoFDO tools side, a corresponding patch extends gcov to emit 64-bit perf timestamp that
records first execution of function, which loosely corresponds to PGO's time_profile counter.
The timestamp is stored adjacent to head field in toplevel function info.

On GCC side, this patch makes the following changes:

(1) Changes to auto-profile pass:
The patch adds a new field timestamp to function_instance,
and populates it in read_function_instance.
It maintains a new timestamp_info_map from timestamp -> <name, tp_first_run>,
which maps timestamps sorted in ascending order to (1..N), so lowest ordered
timestamp is mapped to 1 and so on. The rationale for this is that
timestamps are 64-bit integers, and we don't need the full 64-bit range
for ordering by tp_first_run.
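
The rank compression described here can be sketched as follows. This is a simplified illustration, not the auto-profile.cc code: rank_timestamps is a hypothetical helper, and it maps a timestamp to its rank only, whereas the patch stores a <name, tp_first_run> pair.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Map each distinct 64-bit timestamp to its 1-based rank in ascending order.
std::map<std::uint64_t, int>
rank_timestamps (const std::vector<std::uint64_t> &timestamps)
{
  std::map<std::uint64_t, int> ranks;
  for (std::uint64_t ts : timestamps)
    ranks.emplace (ts, 0);   // std::map keeps its keys sorted
  int next = 1;
  for (auto &entry : ranks)
    entry.second = next++;   // lowest timestamp gets 1, and so on
  return ranks;
}

// Three functions first seen at different times get ranks 1..3.
void demo_check ()
{
  std::map<std::uint64_t, int> ranks = rank_timestamps ({900, 100, 500});
  assert (ranks.at (100) == 1);
  assert (ranks.at (500) == 2);
  assert (ranks.at (900) == 3);
}
```

Only the relative order of the timestamps survives, which is all that ordering by tp_first_run needs.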

During annotation, the timestamp associated with function_instance is looked up
in timestamp_info_map, and corresponding mapped value is assigned
to node->tp_first_run.

Dhruv's sourcefile tracking patch already handles LTO privatized symbols.
The patch adds a workaround for mismatched/empty filenames, which should go away
when the issues with AutoFDO tools dwarf parsing are resolved.

(2) Param to disable profile driven opts.
The patch adds param auto-profile-reorder-only which only enables time-profile reordering with
AutoFDO:
(a) Useful as a debugging aid to isolate regression to either function reordering or profile driven opts.
(b) As a stopgap measure to avoid regressions with AutoFDO profile driven opts.
(c) Possibly useful for architectures which do not support branch sampling.

gcc/ChangeLog:
	* auto-profile.cc: (string_table::filenames): New method.
	(function_instance::timestamp_): New member.
	(function_instance::timestamp): New accessor for timestamp_ member.
	(function_instance::set_timestamp): New function.
	(function_instance::prop_timestamp): Likewise.
	(function_instance::prop_timestamp_1): Likewise.
	(function_instance::function_instance): Initialize timestamp_ to 0.
	(function_instance::read_function_instance): Adjust prototype by
	replacing head_count with toplevel param with default value true, and
	stream in head_count and timestamp values from gcov file.
	(autofdo::timestamp_info_map): New std::map.
	(autofdo_source_profile::get_function_instance_by_decl): New argument
	filename with default value NULL.
	(autofdo_source_profile::read): Populate timestamp_info_map and
	propagate timestamp to inlined instances from toplevel function.
	(afdo_annotate_cfg): Assign node->tp_first_run based on
	timestamp_info_map and bail out of annotation if
	param_auto_profile_reorder_only is enabled.
	* params.opt: New param auto-profile-reorder-only.

Signed-off-by: Prathamesh Kulkarni <prathameshk@nvidia.com>
2026-01-14 09:41:03 +00:00
Jason Merrill
c6115e9cf9 MAINTAINERS: update Paolo Carlini email address
Paolo is no longer at Oracle.

ChangeLog:

	* MAINTAINERS: Update Paolo Carlini email address.
2026-01-14 11:29:12 +08:00
Lili Cui
dfd063aecc x86: Disable tight loop alignment for m_CORE_ATOM
For the E-core front end, aligning tight loops provides little benefit.

gcc/ChangeLog:

	* config/i386/x86-tune.def (X86_TUNE_ALIGN_TIGHT_LOOPS):
	disable tight loop alignment for m_CORE_ATOM.
2026-01-14 09:56:27 +08:00
Nathan Sidwell
7ad74d93c6 Clarify function body mismatch
Clearly label the expected and found function bodies.

	gcc/testsuite/
	* lib/scanasm.exp (check_function_body): Clarify mismatch labelling.
2026-01-13 20:28:28 -05:00
Daniel Barboza
2fb19bb38b gcc/tree.h, match.pd: remove 'warn_strict_overflow' ref
During ML discussions of a match.pd pattern that was introducing a new
instance of 'warn_strict_overflow', Richard mentioned that this use
should be discouraged [1]. After pointing out that this usage was
documented in tree.h he then explained that we should remove the note
from the header [2]. Here's the reasoning:

"Ah, we should remove that note.  -Wstrict-overflow proved useless IMO,
it's way too noisy as it diagnoses when the compiler relies on overflow
not happening, not diagnosing when it possibly happens.  That's not a
very useful diagnostic to have - it does not point to a possible problem
in the code (we could as well diagnose _all_ signed arithmetic
operations for the same argument that we might eventually rely on
overflow not happening)."

Aside from removing the tree.h note we're also removing the 2 references
in match.pd. match.pd patterns tend to be copied around to serve as a
base for new patterns (like I did in [3] adding a
'fold_overflow_warning'), and if we want to discourage the use avoiding
its spread is a good start.

Note that there are a lot of references left, most of them in
gcc/fold-const.cc. Some references are used in nested helpers inside
the file, entangled with code that does other things. Removing all
references from the project is out of scope for this quick patch.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2026-January/705320.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2026-January/705482.html
[3] https://gcc.gnu.org/pipermail/gcc-patches/2026-January/704992.html

gcc/ChangeLog:

	* match.pd: remove 'fold_overflow_warning' references.
	* tree.h (TYPE_OVERFLOW_UNDEFINED): remove note telling
	that we must use warn_strict_overflow for every optimization
	based on TYPE_OVERFLOW_UNDEFINED.

gcc/testsuite/ChangeLog:

	* gcc.dg/Wstrict-overflow-1.c: Removed because we no longer
	issue a 'fold_overflow_warning' with the
	`(le (minus (@0 INTEGER_CST@1)) INTEGER_CST@2)` pattern.

Signed-off-by: Daniel Barboza <daniel.barboza@oss.qualcomm.com>
2026-01-13 21:48:28 -03:00
Daniel Barboza
f9a7caf703 MAINTAINERS: add myself to write after approval
ChangeLog:

	* MAINTAINERS: Add myself to write after approval.
2026-01-13 21:38:14 -03:00
GCC Administrator
460edeb8be Daily bump.
2026-01-14 00:16:30 +00:00
Andrew Pinski
15b965d6bb match: Remove redundant type checks from (T1)(a bit_op (T2)b) pattern.
As mentioned in https://gcc.gnu.org/pipermail/gcc-patches/2026-January/705657.html,
there were some redundant checks in this pattern. In the first if,
the check for pointer and OFFSET_TYPE is redundant as there is a check for
INTEGRAL_TYPE_P beforehand. For the second one, the check for INTEGRAL_TYPE_P
on the innermost type is not needed as there is a types_match right afterwards.

Pushed as obvious after bootstrap/test on x86_64-linux-gnu.

gcc/ChangeLog:

	* match.pd (`(T1)(a bit_op (T2)b)`): Remove redundant
	type checks.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2026-01-13 15:21:56 -08:00
Nathaniel Shead
400e3b66b7 c++: modules and coroutines
While working on another issue I found that currently modules do not
work with coroutines at all.  This patch fixes a number of issues in
both the coroutines logic and modules logic to ensure that they play
well together.  To summarize:

- The coroutine proxy objects did not have a DECL_CONTEXT set (required
  for modules to merge declarations).

- The coroutine transformation functions are always non-inline, even
  for an inline ramp function, which means that modules need an override
  to ensure the definitions are available where needed.

- Coroutine transformation functions were not marked DECL_COROUTINE_P,
  despite accessors implying that they were.

- In an importing TU we had lost the connection between the ramp
  functions and the transform functions, as they were kept in a pair
  of global maps.

- Modules streaming couldn't discriminate between the actor or destroy
  functions when merging.

- Modules streaming wasn't setting the cfun->coroutine_component flag,
  needed to activate the middle-end coroutine lowering pass.

This patch also separates the coroutine_info_table initialization from
the ensure_coro_initialized function.  If the first time we see a
coroutine is from a module import, we need to register the
transformation functions now but calling ensure_coro_initialized would
lookup e.g. std::coroutine_traits, which may only be visible from this
module that we're currently reading, causing a recursive load.
Separating the concerns allows this to work correctly.

gcc/cp/ChangeLog:

	* coroutines.cc (create_coroutine_info_table): New function.
	(get_or_insert_coroutine_info): Mark static.
	(ensure_coro_initialized): Likewise; use
	create_coroutine_info_table.
	(coro_promise_type_found_p): Set DECL_CONTEXT for proxies.
	(coro_set_ramp_function): New function.
	(coro_set_transform_functions): New function.
	(coro_build_actor_or_destroy_function): Use
	coro_set_ramp_function, mark as DECL_COROUTINE_P.
	* cp-tree.h (coro_set_transform_functions): Declare.
	(coro_set_ramp_function): Declare.
	* module.cc (struct merge_key): New field coro_disc.
	(dumper::impl::nested_name): Distinguish coroutine transform
	functions.
	(get_coroutine_discriminator): New function.
	(trees_out::key_mergeable): Stream coroutine discriminator.
	(check_mergeable_decl): Adjust comment, check for matching
	coroutine discriminator.
	(trees_in::key_mergeable): Read coroutine discriminator.
	(has_definition): Override for coroutine transform functions.
	(trees_out::write_function_def): Stream linked ramp, actor, and
	destroy functions for coroutines.
	(trees_in::read_function_def): Read them.
	(module_state::read_cluster): Set cfun->coroutine_component.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/coro-1_a.C: New test.
	* g++.dg/modules/coro-1_b.C: New test.

Reviewed-by: Iain Sandoe <iain@sandoe.co.uk>
Reviewed-by: Jason Merrill <jason@redhat.com>
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2026-01-14 08:52:14 +11:00
Nathaniel Shead
ccb4ef5bd0 c++/modules: Update lang_decl_bool streaming
The set of lang_decl flags that we were streaming had gotten out of sync
with the current list; update them.

One notable change is that anticipated_p, which had previously been
deliberately skipped, is now only used for DECL_OMP_PRIVATIZED_MEMBER,
and so should probably be streamed as well.

gcc/cp/ChangeLog:

	* module.cc (trees_out::lang_decl_bools): Update list of flags.
	(trees_in::lang_decl_bools): Likewise.

Reviewed-by: Jason Merrill <jason@redhat.com>
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2026-01-14 08:52:13 +11:00
Pengxuan Zheng
8d5eb4f6c1 match: (X >> C) NE/EQ 0 -> X LT/GE 0 [PR123109]
Implement (X >> C) NE/EQ 0 -> X LT/GE 0 in match.pd instead of fold-const.cc.
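
As a rough illustration of the equivalence, the sketch below assumes the case where X is a signed 32-bit value and C is 31 (one less than the bit width), with an arithmetic right shift as GCC implements it and as C++20 guarantees. The exact conditions the pattern checks are not spelled out in this message, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// (X >> 31) != 0: only the sign bit survives an arithmetic shift by 31.
bool shift_ne_zero (std::int32_t x) { return (x >> 31) != 0; }

// X < 0: the form the comparison is rewritten to.
bool sign_test (std::int32_t x) { return x < 0; }

// The two predicates agree on boundary values.
void demo_check ()
{
  const std::int32_t samples[] = { INT32_MIN, -1, 0, 1, INT32_MAX };
  for (std::int32_t x : samples)
    assert (shift_ne_zero (x) == sign_test (x));
}
```

The EQ variant follows by negating both sides: (X >> 31) == 0 holds exactly when X >= 0.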

Bootstrapped and tested on x86_64 and aarch64.

	PR tree-optimization/123109

gcc/ChangeLog:

	* fold-const.cc (fold_binary_loc): Remove (X >> C) NE/EQ 0 -> X LT/GE 0
	folding.
	* match.pd (`(X >> C) NE/EQ 0 -> X LT/GE 0`): New pattern.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/vrp99.c: Update test.
	* gcc.dg/pr123109.c: New test.

Signed-off-by: Pengxuan Zheng <pengxuan.zheng@oss.qualcomm.com>
2026-01-13 13:04:23 -08:00
Andrew Pinski
80c77d9a1c match: Add simplification of (a*zero_one_valued_p) & b if a & b simplifies [PR119402]
This is a small reassociation of `a*bool & b` into `(a & b) * bool`, applied
only if `a & b` simplifies. That can happen when `b` is `~a`, `a`, or something
else that simplifies when ANDed with `a`.

Note this fixes a regression for aarch64 where the cost of a multiply vs `&-` changed
in GCC 14 and can no longer optimize some cases at the RTL level.

Bootstrapped and tested on x86_64-linux-gnu.

	PR tree-optimization/119402
gcc/ChangeLog:

	* match.pd (`(a*zero_one_valued_p) & b`): New pattern.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/bitops-14.c: New test.
	* gcc.dg/tree-ssa/bitops-15.c: New test.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2026-01-13 12:42:29 -08:00
Andrew Pinski
7a456056ff testsuite/aarch64: Fix aarch64/signbitv2sf.c [PR122522]
The problem here is that after some heuristics changes the check
loop is now unrolled, so we eliminate the array. This means
the check for not having -2147483648 no longer works as
we don't handle SLP in this case.
So the best option is to force the check loop not to unroll
(no vectorize) as this is just testing we SLP the normal
signbit places rather than dealing with the checking loop.

Pushed as obvious after testing the testcase on aarch64-linux-gnu.

	PR testsuite/122522
gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/signbitv2sf.c (main): Disable
	unrolling and vectorizer for the checking loop.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2026-01-13 11:49:07 -08:00
Martin Uecker
e2acc3d38e c: fix checking ICE related to transparent unions and atomic [PR123309]
When matching function arguments in composite_type_internal and one
type comes from a transparent union, it is possible to end up with
atomic and non-atomic types because this case is not handled correctly.
The type matching logic is rewritten in a cleaner way to use helper
functions and to not walk the argument lists three times.  With this
change, a checking assertion can be added to test for matching qualifiers
for pointers. (In general, this assumption is still violated for
function return types.)

	PR c/123309

gcc/c/ChangeLog:
	* c-typeck.cc (transparent_union_replacement): New function.
	(composite_type_internal): Rewrite logic.
	(type_lists_compatible_p): Remove dead code for NULL arguments.

gcc/testsuite/ChangeLog:
	* gcc.dg/pr123309.c: New test.
	* gcc.dg/union-composite-type.c: New test.
2026-01-13 20:32:56 +01:00
Tomasz Kamiński
76ad28b112 libstdc++: Fix handling iterators with proxy subscript in heap algorithms.
This patch replaces uses of subscripts in heap algorithms, that were introduced
in r16-4100-gaaeca77a79a9a8, with dereference of advanced iterators.

The Cpp17RandomAccessIterator requirements allow operator[] to return any
type that is convertible to reference; however, user-provided comparators are
required only to accept the result of dereferencing the iterator (i.e. reference
directly). This is visible when the comparator defines an operator() for which
template arguments can be deduced from reference (which will fail on a proxy)
or that accepts types convertible from reference (see included tests).
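
The deduction failure can be reproduced in isolation. The sketch below is not the libstdc++ test code: Box, BoxProxy and BoxLess are hypothetical types, and it assumes C++17 for std::is_invocable_v. It shows a comparator that deduces its template argument from the reference type and therefore rejects a proxy that is merely convertible to that reference.

```cpp
#include <cassert>
#include <type_traits>

template <typename T> struct Box { T value; };

// Hypothetical proxy: convertible to Box<T>&, but not itself a Box<T>.
template <typename T> struct BoxProxy
{
  Box<T> &target;
  operator Box<T> & () const { return target; }
};

// Comparator whose template argument must be deduced from Box<T>.
struct BoxLess
{
  template <typename T>
  bool operator() (const Box<T> &a, const Box<T> &b) const
  { return a.value < b.value; }
};

// Deduction works on the reference type, as obtained by dereferencing.
static_assert (std::is_invocable_v<BoxLess, Box<int> &, Box<int> &>,
               "callable with references");
// Deduction ignores user-defined conversions, so proxies are rejected.
static_assert (!std::is_invocable_v<BoxLess, BoxProxy<int>, BoxProxy<int>>,
               "not callable with proxies");

// Converting the proxies to references first makes the call valid again.
void demo_check ()
{
  Box<int> a{1}, b{2};
  BoxProxy<int> pa{a}, pb{b};
  assert (BoxLess{} (static_cast<Box<int> &> (pa), static_cast<Box<int> &> (pb)));
}
```

Passing the result of dereferencing an advanced iterator corresponds to the first static_assert; passing the subscript result of a proxy iterator corresponds to the second.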

For the tests we introduce a new proxy_random_access_iterator_wrapper iterator
in testsuite_iterators.h that returns a proxy type from the subscript operator.
This is a separate type (instead of an additional template argument and aliases),
as it is used for tests that work with C++98.

libstdc++-v3/ChangeLog:

	* include/bits/stl_heap.h (std::__is_heap_until, std::__push_heap)
	(std::__adjust_heap): Replace subscript with dereference of
	advanced iterator.
	* testsuite/util/testsuite_iterators.h (__gnu_test::subscript_proxy)
	(__gnu_test::proxy_random_access_iterator_wrapper): Define.
	* testsuite/25_algorithms/sort_heap/check_proxy_brackets.cc: New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
2026-01-13 19:14:26 +01:00
Jerry DeLisle
17582084fa Fortran: Detect missing quote in namelist read.
PR libfortran/123012

libgfortran/ChangeLog:

	* io/list_read.c (read_character): Add new check after
	get_string and provide better comments.

gcc/testsuite/ChangeLog:

	* gfortran.dg/namelist_101.f90: New test.
2026-01-13 09:55:12 -08:00
Andrew Pinski
e0a8b63625 ifcvt: Improve cmp?a&b:a to try with -1 [PR123312]
After the current improvements to ifcvt, on some targets for
cmp?a&b:a it is better to produce `(cmp?b:-1) & a` rather than
`(!cmp?a:0)|(a & b)`. So this extends noce_try_cond_zero_arith (with
a rename to noce_try_cond_arith) to see if `cmp ? a : -1` is cheaper than
`!cmp?a:0`.

Bootstrapped and tested on x86_64-linux-gnu.

	PR rtl-optimization/123312
gcc/ChangeLog:

	* ifcvt.cc (noce_try_cond_zero_arith): Rename to ...
	(noce_try_cond_arith): This. For AND try `cmp ? a : -1`
	also to see which one costs less.
	(noce_process_if_block): Handle the rename.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2026-01-13 09:52:31 -08:00
Jonathan Yong
ecc06d4563 winnt-utf8.manifest: make long path aware
Based on:
https://learn.microsoft.com/en-us/windows/win32/sbscs/application-manifests#longpathaware

gcc:
	* config/i386/winnt-utf8.manifest: enable longPathAware.

Signed-off-by: Jonathan Yong <10walls@gmail.com>
2026-01-13 17:41:10 +00:00
Jonathan Yong
610466e9f3 winnt-utf8.manifest: Use XML example from Microsoft
Based on example from:
https://learn.microsoft.com/en-us/windows/win32/sbscs/application-manifests#activecodepage

	PR driver/108865

gcc:
	* config/i386/winnt-utf8.manifest: correct XML tags

Signed-off-by: Jonathan Yong <10walls@gmail.com>
2026-01-13 17:41:09 +00:00
Jeff Law
77ff7c2bb2 [PR tree-optimization/123530] Fix ICE in recently added match.pd pattern
The gimple optimization passes can create negative shift counts and pass them
into the simplification routines as seen by the code in pr123530.  If we then
call tree_to_uhwi on those values we get a nice little ICE.

This guards the tree_to_uhwi calls on tree_fits_uhwi_p and resolves the ICE.  I
just protected them all in this recently added pattern.

Bootstrapped and regression tested on x86 and riscv.  Also tested on the rest
of the embedded targets without any regressions.

Pushing to the trunk.

	PR tree-optimization/123530
gcc/
	* match.pd (reassociating xor to enable rotations): Verify constants
	fit into a uhwi before trying to extract them as a uhwi.

gcc/testsuite/
	* gcc.dg/torture/pr123530.c: New test.
2026-01-13 07:16:05 -07:00
Richard Biener
e787d5ace5 middle-end/123573 - fix VEC_PERM folding more
The following fixes the fix from r16-6709-ga4716ece529dfd some
more by making sure permute to one operand folding faces same
element number vectors but also insert a VIEW_CONVERT_EXPR for
the case one is VLA and one is VLS (when the VLA case is actually
constant, like with -msve-vector-bits=128).  It also makes the
assert that output and input element numbers match done in
fold_vec_perm which this pattern eventually dispatches to into
a check (as the comment already indicates).

Testcases are in the target specific aarch64 testsuite already.

	PR middle-end/123573
	* fold-const.cc (fold_vec_perm): Actually check, not assert,
	that input and output vector element numbers agree.
	* match.pd (vec_perm @0 @1 @2): Make sure element numbers
	are the same when folding to an input vector and wrap that
	inside a VIEW_CONVERT_EXPR.
2026-01-13 14:39:58 +01:00
Robin Dapp
939dd2324e forwprop: Fix type mismatch in vec constructor [PR123525].
This issue got raised after r16-6671 in which I removed checks for
number-of-element equality.  In the splat case with conversion:

  vector(16) int w;
  vector(8) long int v;
  _13 = BIT_FIELD_REF <w_12(D), 32, 160>;
  _2 = (long int) _13;
  _3 = (long int) _13;
  ...
  _9 = (long int) _13;
  _1 = {_2, _3, _4, _5, _6, _7, _8, _9};

right now we do
  _16 = VEC_PERM_EXPR <w_12(D), w_12(D), { 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5 }>;
  _17 = VIEW_CONVERT_EXPR<vector(8) intD.6>(_16);

where the view convert is actually an optimized
  _17 = BIT_FIELD_REF (_16, 512, 0);

512 is the size of the unconverted source but we should actually use the
converted source type.  That's what this patch does.

	PR tree-optimization/123525

gcc/ChangeLog:

	* tree-ssa-forwprop.cc (simplify_vector_constructor): Use
	converted source type for conversion bit field ref.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/pr123525.c: New test.
	* g++.dg/vect/pr123525-2.cc: New test.
2026-01-13 12:47:38 +01:00
Robin Dapp
0616834fef if-conv: Prevent vector types in scalar cond reduction [PR123301].
Currently we allow vector types in scalar conditional reductions by
accident (via the GNU vector extension).  This patch prevents that.

	PR tree-optimization/123301

gcc/ChangeLog:

	* tree-if-conv.cc (convert_scalar_cond_reduction):
	Disallow vector types.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/pr123301.c: New test.
2026-01-13 12:47:38 +01:00
Robin Dapp
659f4d0e1e rtlanal: Determine nonzero bits of popcount from operand [PR123501].
The PR involves large mask vectors (e.g. V128BI) from which we take
the popcount.  Currently a (popcount:DI (V128BI)) is assumed to have
at most 8 set bits as we assume the popcount operand also has DImode.

This patch uses the operand mode for unary operations and thus
calculates a proper nonzero-bits mask.

We could do the same estimate for ctz and clz but they use nonzero in a
non-poly way and I didn't want to change more than necessary.  Therefore
the patch just returns -1 when we have a different operand mode for
ctz/clz.
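
To see why the operand width matters for popcount, here is a small
standalone sketch.  It assumes a mask of the form
(2 << floor_log2 (width)) - 1, which covers every possible count
0..width; the helper names are hypothetical and this is not the
rtlanal.cc code.

```c
#include <stdint.h>

/* floor (log2 (x)) for x > 0.  */
static unsigned
floor_log2_u (unsigned x)
{
  unsigned r = 0;
  while (x >>= 1)
    r++;
  return r;
}

/* Mask of the bits that can be nonzero in the popcount of a value that
   is WIDTH bits wide; the count itself ranges over 0..WIDTH.  */
static uint64_t
popcount_nonzero_mask (unsigned width)
{
  return ((uint64_t) 2 << floor_log2_u (width)) - 1;
}
```

In this model a width of 64 gives 0x7f while a width of 128 gives 0xff,
so a mask derived from the 64-bit result mode would wrongly claim that
bit 7 is always zero even though a 128-bit mask can have 128 bits set.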

	PR rtl-optimization/123501
	PR rtl-optimization/123444

gcc/ChangeLog:

	* rtlanal.cc (nonzero_bits1): Use operand mode instead of
	operation mode.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/reduc/pr123501.c: New test.
2026-01-13 12:47:38 +01:00
Thomas Schwinge
105fddf356 amdgcn: Adjust failure mode for gfx908 USM: 'libgomp.fortran/map-alloc-comp-9-usm.f90'
The change/rationale that commit 1cf9fda493
"amdgcn: Adjust failure mode for gfx908 USM" applied to a number of test cases
likewise applies to 'libgomp.fortran/map-alloc-comp-9-usm.f90'.

	libgomp/
	* testsuite/libgomp.fortran/map-alloc-comp-9-usm.f90: Require
	working Unified Shared Memory to run the test.
2026-01-13 11:10:03 +01:00
Thomas Schwinge
954f804b73 openmp: Bump Version from 4.5 to 5.2 (2/4): Some more '-Wno-deprecated-openmp'
These changes should've been included in
commit 382edf047e
"openmp: Bump Version from 4.5 to 5.2 (2/4)", to avoid some more instances of:

    warning: use of 'omp declare target' as a synonym for 'omp begin declare target' has been deprecated since OpenMP 5.2 [-Wdeprecated-openmp]

    warning: 'to' clause with 'declare target' deprecated since OpenMP 5.2, use 'enter' [-Wdeprecated-openmp]

    Warning: Non-C_PTR type argument at (1) is deprecated, use HAS_DEVICE_ADDR [-Wdeprecated-openmp]

    Warning: 'to' clause with 'declare target' at (1) deprecated since OpenMP 5.2, use 'enter' [-Wdeprecated-openmp]

	libgomp/
	* testsuite/libgomp.c++/examples-4/declare_target-2.C: Add
	'-Wno-deprecated-openmp'.
	* testsuite/libgomp.c/declare-variant-3-sm30.c: Likewise.
	* testsuite/libgomp.c/declare-variant-3-sm35.c: Likewise.
	* testsuite/libgomp.c/declare-variant-3-sm37.c: Likewise.
	* testsuite/libgomp.c/declare-variant-3-sm52.c: Likewise.
	* testsuite/libgomp.c/declare-variant-3-sm53.c: Likewise.
	* testsuite/libgomp.c/declare-variant-3-sm61.c: Likewise.
	* testsuite/libgomp.c/declare-variant-3-sm70.c: Likewise.
	* testsuite/libgomp.c/declare-variant-3-sm75.c: Likewise.
	* testsuite/libgomp.c/declare-variant-3-sm80.c: Likewise.
	* testsuite/libgomp.c/declare-variant-3-sm89.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx10-3-generic.c:
	Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx1030.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx1031.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx1032.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx1033.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx1034.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx1035.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx1036.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx11-generic.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx1100.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx1101.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx1102.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx1103.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx1150.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx1151.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx1152.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx1153.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx9-4-generic.c:
	Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx9-generic.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx900.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx902.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx904.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx906.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx908.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx909.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx90a.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx90c.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx942.c: Likewise.
	* testsuite/libgomp.c/declare-variant-4-gfx950.c: Likewise.
	* testsuite/libgomp.c/examples-4/async_target-2.c: Likewise.
	* testsuite/libgomp.c/interop-hsa.c: Likewise.
	* testsuite/libgomp.c/target-20.c: Likewise.
	* testsuite/libgomp.c/target-simd-clone-1.c: Likewise.
	* testsuite/libgomp.c/target-simd-clone-2.c: Likewise.
	* testsuite/libgomp.c/target-simd-clone-3.c: Likewise.
	* testsuite/libgomp.fortran/alloc-managed-1.f90: Likewise.
	* testsuite/libgomp.fortran/target9.f90: Likewise.
2026-01-13 11:08:34 +01:00
Thomas Schwinge
ba21851b8d openmp: Bump Version from 4.5 to 5.2 (2/4): 'libgomp.oacc-c-c++-common/vred2d-128.c' [PR123098]
'libgomp.oacc-c-c++-common/vred2d-128.c' had gotten '-Wno-deprecated-openmp'
applied as part of commit 382edf047e
"openmp: Bump Version from 4.5 to 5.2 (2/4)", which conceptually doesn't make
sense, as 'libgomp.oacc-c-c++-common/vred2d-128.c' isn't an OpenMP test case.
In commit 9c119b0fdd
"openmp: Limit - reduction -Wdeprecated-openmp diagnostics to OpenMP, testsuite fixes [PR123098]",
the erroneous diagnostic got disabled, so we don't need
'-Wno-deprecated-openmp' anymore.

	PR testsuite/123098
	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: Remove
	'-Wno-deprecated-openmp'.
2026-01-13 11:07:21 +01:00
Jakub Jelinek
8a99fdb704 Use -latomic_asneeded or -lgcc_s_asneeded to workaround libtool issues [PR123396]
On Mon, Jan 12, 2026 at 12:13:35PM +0100, Florian Weimer wrote:
> One way to work around the libtool problem would be to stick the
> as-needed into an existing .so linker script, or create a new one under
> a different name (say libatomic_optional.so) that has AS_NEEDED in it,
> and link with -latomic_optional.  Then libtool would not have to be
> taught about --push-state/--pop-state etc.

That seems to work.

So far bootstrapped (c,c++,fortran,lto only) and make install tested
on x86_64-linux, tested on a small program that does not need libatomic
and on
struct S { char a[25]; };
_Atomic struct S s;

int main () { struct S t = s; s = t; }
which does need it at -O0.
Before this patch I got
for i in `find x86_64-pc-linux-gnu/ -name lib\*.so.\*.\*`; do ldd -u $i 2>&1 | grep -q libatomic.so.1 && echo $i; done
x86_64-pc-linux-gnu/libsanitizer/ubsan/.libs/libubsan.so.1.0.0
x86_64-pc-linux-gnu/libsanitizer/asan/.libs/libasan.so.8.0.0
x86_64-pc-linux-gnu/libsanitizer/hwasan/.libs/libhwasan.so.0.0.0
x86_64-pc-linux-gnu/libsanitizer/lsan/.libs/liblsan.so.0.0.0
x86_64-pc-linux-gnu/libsanitizer/tsan/.libs/libtsan.so.2.0.0
x86_64-pc-linux-gnu/32/libsanitizer/ubsan/.libs/libubsan.so.1.0.0
x86_64-pc-linux-gnu/32/libsanitizer/asan/.libs/libasan.so.8.0.0
x86_64-pc-linux-gnu/32/libstdc++-v3/src/.libs/libstdc++.so.6.0.35
x86_64-pc-linux-gnu/libgcobol/.libs/libgcobol.so.2.0.0
x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so.6.0.35
With this patch it prints nothing.

2026-01-13  Jakub Jelinek  <jakub@redhat.com>

	PR libstdc++/123396
gcc/
	* configure.ac (gcc_cv_ld_use_as_needed_ldscript): New test.
	(USE_LD_AS_NEEDED_LDSCRIPT): New AC_DEFINE.
	* gcc.cc (LINK_LIBATOMIC_SPEC): Use "-latomic_asneeded" instead
	of LD_AS_NEEDED_OPTION " -latomic " LD_NO_AS_NEEDED_OPTION
	if USE_LD_AS_NEEDED_LDSCRIPT is defined.
	(init_gcc_specs): Use "-lgcc_s_asneeded" instead of
	LD_AS_NEEDED_OPTION " -lgcc_s " LD_NO_AS_NEEDED_OPTION
	if USE_LD_AS_NEEDED_LDSCRIPT is defined.
	* config.in: Regenerate.
	* configure: Regenerate.
libatomic/
	* acinclude.m4 (LIBAT_BUILD_ASNEEDED_SOLINK): New AM_CONDITIONAL.
	* libatomic_asneeded.so: New file.
	* libatomic_asneeded.a: New file.
	* Makefile.am (toolexeclib_DATA): Set if LIBAT_BUILD_ASNEEDED_SOLINK.
	(all-local): Install those files into gcc subdir.
	* Makefile.in: Regenerate.
	* configure: Regenerate.
libgcc/
	* config/t-slibgcc (SHLIB_ASNEEDED_SOLINK,
	SHLIB_MAKE_ASNEEDED_SOLINK, SHLIB_INSTALL_ASNEEDED_SOLINK): New
	vars.
	(SHLIB_LINK): Include $(SHLIB_MAKE_ASNEEDED_SOLINK).
	(SHLIB_INSTALL): Include $(SHLIB_INSTALL_ASNEEDED_SOLINK).
2026-01-13 10:06:47 +01:00
Paul Thomas
fdfb045223 Fortran: Check constant PDT type specification parameters [PR112460]
2026-01-14  Paul Thomas  <pault@gcc.gnu.org>

gcc/fortran
	PR fortran/112460
	* array.cc (resolve_array_list): Stash the first PDT element
	and check its type specification parameters against those of
	subsequent elements.
	* expr.cc (get_parm_list_from_expr): New function to extract the
	type spec lists from expressions to be compared.
	(gfc_check_type_spec_parms): New function to compare type spec
	lists between two expressions. Emit an error if any constant
	values are different.
	(gfc_check_assign): Check that the PDT type specification parms
	are the same on lhs and rhs.
	* gfortran.h: Add prototype for gfc_check_type_spec_parms.
	* trans-expr.cc (copyable_array_p): PDT arrays are not copyable.

gcc/testsuite
	PR fortran/112460
	* gfortran.dg/pdt_81.f03: New test.
2026-01-13 08:19:05 +00:00
Richard Biener
47d09318c4 tree-optimization/123539 - signed UB in vector reduction
With previous changes I overlooked one use of vectype.

	PR tree-optimization/123539
	* tree-vect-loop.cc (vect_create_epilog_for_reduction):
	Use the compute vectype to pun down to smaller or element
	size for by-element reductions.
2026-01-13 09:08:32 +01:00
Andrew Pinski
f8a2eb766f xfail store_merging_19.c for the same reason as store_merging_18.c
store_merging_19.c is almost the same as store_merging_18.c except
it has assume align in it to allow it to work on strict-align targets.
Somehow, when I was looking at the test results for failures, I noticed
18 but not 19.

Pushed as obvious.

gcc/testsuite/ChangeLog:

	* gcc.dg/store_merging_19.c: xfail.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2026-01-12 21:10:44 -08:00
Kito Cheng
cbb9d38152 VN: Fix VN ICE for large _BitInt types
gcc.dg/torture/bitint-18.c triggers an ICE in push_partial_def when
compiling for RISC-V with -O2.  The issue occurs because
build_nonstandard_integer_type cannot handle bit widths larger than
MAX_FIXED_MODE_SIZE.

For BITINT_TYPE with maxsizei > MAX_FIXED_MODE_SIZE, use build_bitint_type
instead of build_nonstandard_integer_type, similar to what tree-sra.cc does.
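
The selection can be sketched as a standalone model.  The enum and
choose_builder below are hypothetical names used only for illustration,
and the size limit is passed in as a parameter rather than taken from
the real MAX_FIXED_MODE_SIZE macro.

```c
#include <stdbool.h>

/* Which constructor the model would pick for the partial-def type.  */
enum int_type_builder
{
  USE_NONSTANDARD_INTEGER_TYPE,
  USE_BITINT_TYPE
};

/* Pick the _BitInt builder only for _BitInt types that are wider than
   the largest fixed-size integer mode; everything else keeps using the
   nonstandard integer type builder.  */
static enum int_type_builder
choose_builder (bool is_bitint, unsigned maxsizei,
		unsigned max_fixed_mode_size)
{
  if (is_bitint && maxsizei > max_fixed_mode_size)
    return USE_BITINT_TYPE;
  return USE_NONSTANDARD_INTEGER_TYPE;
}
```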

gcc/ChangeLog:

	* tree-ssa-sccvn.cc (vn_walk_cb_data::push_partial_def): Use
	build_bitint_type for BITINT_TYPE when maxsizei exceeds
	MAX_FIXED_MODE_SIZE.
2026-01-13 11:23:40 +08:00
Kito Cheng
808b684172 RISC-V: Add support for _BitInt [PR117581]
This patch implements _BitInt support for RISC-V target by defining the
type layout and ABI requirements.  The limb mode selection is based on
the bit width, using appropriate integer modes from QImode to TImode.
The implementation also adds the necessary libgcc version symbols for
_BitInt runtime support functions.

Changes in v3:
- Require sync_char_short effective target for bitint-64.c, bitint-82.c
  and bitint-84.c tests since they use atomic operations.
- Add -fno-section-anchors to bitint-32-on-rv64.c and adjust expected
  assembly output patterns.

Changes in v2:
- limb_mode uses up to XLEN when N > XLEN, which is a different setting
  from the abi_limb_mode.
- Add missing floatbitinthf in libgcc.

gcc/ChangeLog:

	PR target/117581
	* config/riscv/riscv.cc (riscv_bitint_type_info): New function.
	(TARGET_C_BITINT_TYPE_INFO): Define.

gcc/testsuite/ChangeLog:

	PR target/117581
	* gcc.dg/torture/bitint-64.c: Add sync_char_short effective target
	requirement.
	* gcc.dg/torture/bitint-82.c: Likewise.
	* gcc.dg/torture/bitint-84.c: Likewise.
	* gcc.target/riscv/bitint-32-on-rv64.c: New test.
	* gcc.target/riscv/bitint-alignments.c: New test.
	* gcc.target/riscv/bitint-args.c: New test.
	* gcc.target/riscv/bitint-sizes.c: New test.

libgcc/ChangeLog:

	PR target/117581
	* config/riscv/libgcc-riscv.ver: New file.
	* config/riscv/t-elf (SHLIB_MAPFILES): Add libgcc-riscv.ver.
	* config/riscv/t-softfp32 (softfp_extras): Add floatbitinttf and
	fixtfbitint.
2026-01-13 11:23:40 +08:00
H.J. Lu
e6470a44a2 pr122458.c: Replace .quad with .dc.a
Replace .quad with .dc.a to avoid

/export/build/gnu/tools-build/gcc/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/tools-build/gcc/build-x86_64-linux/gcc/ /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.dg/ipa/pr122458.c -m32 -fdiagnostics-plain-output -O2 -lm -o pr122458.exe
/usr/local/bin/as: /tmp/cc9Bw0pX.o: unsupported relocation type: 0x1
/tmp/ccGrIiOC.s: Assembler messages:
/tmp/ccGrIiOC.s:4: Error: cannot represent relocation type BFD_RELOC_64
compiler exited with status 1
FAIL: gcc.dg/ipa/pr122458.c (test for excess errors)

for 32-bit targets.

	PR ipa/122458
	* gcc.dg/ipa/pr122458.c: Replace .quad with .dc.a.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2026-01-13 11:17:57 +08:00
liuhongt
808fc71d15 Add TARGET_MMX_WITH_SSE to the condition of all 64-bit _Float16 vector related patterns.
gcc/ChangeLog:

	PR target/123484
	* config/i386/mmx.md (divv4hf3): Add TARGET_MMX_WITH_SSE to
	the condition.
	(cmlav4hf4): Ditto.
	(cmla_conjv4hf4): Ditto.
	(cmulv4hf3): Ditto.
	(cmul_conjv4hf3): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr123484.c: New test.
2026-01-12 18:05:47 -08:00
GCC Administrator
7c3584be8c Daily bump.
2026-01-13 00:16:32 +00:00
Andrew Pinski
50df2b1884 match: Simplify (T1)(a bit_op (T2)b) to ((T1)a bit_op b) When b is T1 type and truncating from T2 [PR122845]
This adds the simplification of:
```
  <unnamed-signed:3> _1;

  _2 = (signed char) _1;
  _3 = _2 ^ -47;
  _4 = (<unnamed-signed:3>) _3;
```

to:
```
  <unnamed-signed:3> _n;
  _4 = _1 ^ -47;
```
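
The reason this is safe is that the low bits of a bitwise operation
depend only on the low bits of its operands.  The following standalone
sketch checks that exhaustively for a 3-bit narrow type and an 8-bit
wide type.  It is an assumption-laden model: it uses unsigned values and
a mask instead of the signed bit-field types above, and trunc3 and
bit_ops_commute_with_truncation are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Keep the low 3 bits, modelling a truncating cast to a 3-bit type.
   Unsigned arithmetic keeps the truncation well defined in C.  */
static uint8_t
trunc3 (uint8_t x)
{
  return x & 0x7;
}

/* Check (T1)(a op b) == a op (T1)b for every 3-bit A, every 8-bit B
   and each of xor, and, ior.  */
static bool
bit_ops_commute_with_truncation (void)
{
  for (unsigned a = 0; a < 8; a++)
    for (unsigned b = 0; b < 256; b++)
      {
	uint8_t nb = trunc3 ((uint8_t) b);
	if (trunc3 ((uint8_t) (a ^ b)) != (uint8_t) (a ^ nb)
	    || trunc3 ((uint8_t) (a & b)) != (uint8_t) (a & nb)
	    || trunc3 ((uint8_t) (a | b)) != (uint8_t) (a | nb))
	  return false;
      }
  return true;
}
```

Because the check holds for every pair, the operation can be done
directly in the narrow type and the widening cast dropped.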

This also fixes PR 122843 by optimizing out the xor such that we get:
```
  _1 = b.a;
  _21 = (<unnamed-signed:3>) t_23(D);
  // t_23 in the original testcase was 200 so this is reduced to 0
  _5 = _1 ^ _21;
  # .MEM_24 = VDEF <.MEM_13>
  b.a = _5;
```
And then, since there is no cast, we catch this pattern:
`(bit_xor (convert1? (bit_xor:c @0 @1)) (convert2? (bit_xor:c @0 @2)))`
As we get:
```
  _21 = (<unnamed-signed:3>) t_23(D);
  _5 = _1 ^ _21;
  _22 = (<unnamed-signed:3>) t_23(D);
  _7 = _5 ^ _22;
  _25 = (<unnamed-signed:3>) t_23(D);
  _8 = _7 ^ _25;
  _26 = (<unnamed-signed:3>) t_23(D);
  _9 = _7 ^ _26;
```
After unrolling, fre will then optimize away all of those xors.

Bootstrapped and tested on x86_64-linux-gnu.

	PR tree-optimization/122845
	PR tree-optimization/122843
gcc/ChangeLog:

	* match.pd (`(T1)(a bit_op (T2)b)`): Also
	simplify if T1 is the same type as b and T2 is wider
	type than T1.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/bitops-12.c: New test.
	* gcc.dg/tree-ssa/bitops-13.c: New test.
	* gcc.dg/store_merging_18.c: xfail store merging.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2026-01-12 14:16:20 -08:00
Steven G. Kargl
4cf4b2cd2a Fortran: Add additional checks for constant expressions.
PR fortran/91960

gcc/fortran/ChangeLog:

	* resolve.cc (resolve_fl_parameter): Check the righthand symbol
	is a constant expression.

gcc/testsuite/ChangeLog:

	* gfortran.dg/pr69962.f90: Adjust testcase to ignore new error message.
	* gfortran.dg/pr91960_1.f90: New test.
	* gfortran.dg/pr91960_2.f90: New test.
2026-01-12 14:06:04 -08:00
Patrick Palka
716f4482c5 c++: deferred noexcept parsing for friend tmpl spec [PR123189]
Since we now defer noexcept parsing for templated friends, a couple of
routines related to deferred parsing need to be updated to cope with friend
template specializations -- their TI_TEMPLATE is a TREE_LIST rather than
a TEMPLATE_DECL, and they don't introduce new template parameters.

	PR c++/123189

gcc/cp/ChangeLog:

	* name-lookup.cc (binding_to_template_parms_of_scope_p):
	Gracefully handle TEMPLATE_INFO whose TI_TEMPLATE is a TREE_LIST.
	* pt.cc (maybe_begin_member_template_processing): For a friend
	template specialization consider its class context instead.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/noexcept92.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
2026-01-12 11:21:14 -05:00
Jason Merrill
f36534fe5f c++: tweak testcase for --stds=impcx
Implicit constexpr makes the use of x disappear, avoiding the exposure and
thus the diagnostic.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/internal-17_b.C: Add -fno-implicit-constexpr.
2026-01-13 00:19:19 +08:00