Commit Graph

216327 Commits

GCC Administrator
c5609a755b Daily bump. 2024-12-15 00:17:24 +00:00
Ian Lance Taylor
3e343ef7f0 libbacktrace: don't use ZSTD_CLEVEL_DEFAULT
PR 117812 reports that testing GCC with zstd 1.3.4 fails because
ZSTD_CLEVEL_DEFAULT is not defined, so avoid using it.
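
A minimal sketch of the portability issue (not the actual zstdtest.c
code), assuming a plain buffer-to-buffer compression call:
ZSTD_CLEVEL_DEFAULT only exists in newer zstd headers, while an explicit
level such as 3 (zstd's usual default) builds against old and new
releases alike.

  #include <zstd.h>

  static size_t
  compress_buf (void *dst, size_t dst_cap, const void *src, size_t src_len)
  {
    /* Fails to build against zstd 1.3.4:
       return ZSTD_compress (dst, dst_cap, src, src_len, ZSTD_CLEVEL_DEFAULT);  */
    return ZSTD_compress (dst, dst_cap, src, src_len, 3);
  }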

	PR libbacktrace/117812
	* zstdtest.c (test_large): Use 3 rather than ZSTD_CLEVEL_DEFAULT.
2024-12-14 14:32:11 -08:00
Jovan Vukic
ad519f4619 [PATCH v3] match.pd: Add pattern to simplify (a - 1) & -a to 0
Thank you for the feedback. I have made the minor changes that were requested.
Additionally, I extracted the repetitive code into a reusable helper function,
match_plus_neg_pattern, making the code much more readable. Furthermore, the
logic, code, and tests remain the same as in version 2 of the patch.
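
A self-contained illustration of the identities (hypothetical, not the
actual bitops-11.c test): in two's complement -a == ~(a - 1), so with
x = a - 1 the three expressions are x & ~x, x | ~x and x ^ ~x.

  unsigned f_and (unsigned a) { return (a - 1) & -a; }  /* folds to 0 */
  unsigned f_or  (unsigned a) { return (a - 1) | -a; }  /* folds to ~0u */
  unsigned f_xor (unsigned a) { return (a - 1) ^ -a; }  /* folds to ~0u */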

gcc/ChangeLog:

	* match.pd: New pattern.
	* simplify-rtx.cc (match_plus_neg_pattern): New helper function.
	(simplify_context::simplify_binary_operation_1): New
	code to handle (a - 1) & -a, (a - 1) | -a and (a - 1) ^ -a.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/bitops-11.c: New test.
2024-12-14 14:48:23 -07:00
Jose E. Marchesi
6866547e24 bpf: fix build adding new required arg to RESOLVE_OVERLOADED_BUILTIN
gcc/ChangeLog

	* config/bpf/bpf.cc (bpf_resolve_overloaded_builtin): Add argument
	`complain'.
2024-12-14 19:15:34 +01:00
Heiko Eißfeldt
a7df4961d1 doc: Fix typos for --enable-host-pie docs in install.texi
gcc/ChangeLog:

	* doc/install.texi (Configuration): Fix typos in documentation
	for --enable-host-pie.
2024-12-14 12:34:00 +00:00
Jakub Jelinek
7f4e85a954 gimple-fold: Fix the recent ifcombine optimization for _BitInt [PR118023]
The BIT_FIELD_REF verifier has:
          if (INTEGRAL_TYPE_P (TREE_TYPE (op))
              && !type_has_mode_precision_p (TREE_TYPE (op)))
            {
              error ("%qs of non-mode-precision operand", code_name);
              return true;
            }
check, among other things, so one can't extract something out of, say,
_BitInt(63) or _BitInt(4096).
The new ifcombine optimization happily creates such BIT_FIELD_REFs
and ICEs during their verification.

The following patch fixes that by rejecting those in decode_field_reference.
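
A reduced sketch of the trigger (hypothetical, not the actual
bitint-119.c test): merging the two masked tests would require a
BIT_FIELD_REF on a _BitInt(63) operand, which the verifier rejects
because the type's precision does not match its mode.

  struct s { _BitInt(63) x; };

  _Bool
  f (struct s *p)
  {
    return (p->x & 1) == 0 && (p->x & 0x100) == 0;
  }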

2024-12-14  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/118023
	* gimple-fold.cc (decode_field_reference): Return NULL_TREE if
	inner has non-type_has_mode_precision_p integral type.

	* gcc.dg/bitint-119.c: New test.
2024-12-14 11:28:25 +01:00
Jakub Jelinek
9537ca5ad9 warn-access: Fix up matching_alloc_calls_p [PR118024]
The following testcase ICEs because of a bug in matching_alloc_calls_p.
The loop was apparently meant to be walking the two attribute chains
in lock-step, but doesn't really do that.  If the first lookup_attribute
returns non-NULL, the second one is not done, so rmats in that case can
be some random unrelated attribute rather than a "malloc" attribute; the
body assumes that rmats, if non-NULL, is also a "malloc" attribute and relies
on its argument being a "malloc" argument, so if it is some other
attribute with an incompatible argument, it just crashes.

Now, fixing that in the obvious way, instead of doing
(amats = lookup_attribute ("malloc", amats))
 || (rmats = lookup_attribute ("malloc", rmats))
in the condition do
((amats = lookup_attribute ("malloc", amats)),
 (rmats = lookup_attribute ("malloc", rmats)),
 (amats || rmats))
fixes the testcase but regresses Wmismatched-dealloc-{2,3}.c tests.
The problem is that walking the attribute lists in lock-step is obviously
a very bad idea; there is no requirement that the same deallocators are
present in the same order on both decls, e.g. there could be an extra malloc
attribute without argument in just one of the lists, or the order of say
free/realloc could be swapped, etc.  We don't generally document or enforce
any particular ordering of attributes (even though for some attributes we just
handle the first one rather than all).

So, this patch instead simply splits it into two loops: the first one walks
the alloc_decl attributes, the second one walks the dealloc_decl attributes.
If the malloc attribute argument is a built-in, that doesn't change
anything, and otherwise we have the chance to populate the whole
common_deallocs hash_set in the first loop and can then check it in the
second one (and don't need to use the more expensive add method on it; we
can just check contains there).  Not to mention that it also fixes the case when
the function would incorrectly return true if there wasn't a common
deallocator between the two, but dealloc_decl had 2 malloc attributes with
the same deallocator.
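
A schematic analog of the two-loop structure (simplified standalone
types, not the GCC code): populate the set from the first decl's
deallocators, then only look entries up while walking the second decl's
list, instead of advancing both lists in lock-step.

  #include <set>
  #include <string>
  #include <vector>

  static bool
  have_common_dealloc (const std::vector<std::string> &alloc_deallocs,
                       const std::vector<std::string> &dealloc_deallocs)
  {
    std::set<std::string> common_deallocs;
    for (const std::string &d : alloc_deallocs)   /* First loop: add.  */
      common_deallocs.insert (d);
    for (const std::string &d : dealloc_deallocs) /* Second loop: contains.  */
      if (common_deallocs.count (d))
        return true;
    return false;
  }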

2024-12-14  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/118024
	* gimple-ssa-warn-access.cc (matching_alloc_calls_p): Walk malloc
	attributes of alloc_decl and dealloc_decl in separate loops rather
	than in lock-step.  Use common_deallocs.contains rather than
	common_deallocs.add in the second loop.

	* gcc.dg/pr118024.c: New test.
2024-12-14 11:27:20 +01:00
Jakub Jelinek
18f0b7d5f3 opts: Use OPTION_SET_P instead of magic value 2 for -fshort-enums default [PR118011]
The magic default values (usually -1, sometimes 2) for some options
date from the times before we had global_options_set; I think we should
eventually get rid of all of those.

The PR is about gcc -Q --help=optimizers reporting -fshort-enums as
[enabled] when it is disabled.
For this the following patch is just a partial fix; with an explicit
gcc -Q --help=optimizers -fshort-enums
or
gcc -Q --help=optimizers -fno-short-enums
it already worked correctly before.  With this patch it will report the
correct value even with just
gcc -Q --help=optimizers
on most targets, except 32-bit Arm with some options or
defaults, so I think it is a step in the right direction.

But, as I wrote in the PR, process_options isn't done before --help=
and even shouldn't be in its current form, where it warns on some option
combinations or errors or emits sorry on others.  So I think ideally
process_options should have some bool argument saying whether it is done
for --help= purposes or not: if it is, not emit warnings and just adjust
the options, otherwise do what it currently does.
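
A minimal sketch of the idea (hypothetical names, not the GCC code): an
explicit "was it set" flag, the analog of OPTION_SET_P, replaces the
magic "2 == not set yet" value.

  struct options
  {
    int short_enums;      /* 0 or 1 once finalized.  */
    int short_enums_set;  /* Nonzero if set on the command line.  */
  };

  static void
  finish_options (struct options *opts, int target_default)
  {
    if (!opts->short_enums_set)  /* Was: opts->short_enums == 2.  */
      opts->short_enums = target_default;
  }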

2024-12-14  Jakub Jelinek  <jakub@redhat.com>

	PR c/118011
gcc/
	* opts.cc (init_options_struct): Don't set opts->x_flag_short_enums to
	2.
	* toplev.cc (process_options): Test !OPTION_SET_P (flag_short_enums)
	rather than flag_short_enums == 2.
gcc/ada/
	* gcc-interface/misc.cc (gnat_post_options): Test
	!OPTION_SET_P (flag_short_enums) rather than flag_short_enums == 2.
2024-12-14 11:25:08 +01:00
Nathaniel Shead
a6a15bc5b7 c++: Disallow decomposition of lambda bases [PR90321]
Decomposition of lambda closure types is not allowed by
[dcl.struct.bind] p6, since members of a closure have no name.

r244909 made this an error, but missed the case where a lambda is used
as a base.  This patch moves the check to find_decomp_class_base to
handle this case.

As a drive-by improvement, we also slightly improve the diagnostics to
indicate why a base class was being inspected.  Ideally the diagnostic
would point directly at the relevant base, but there doesn't seem to be
an easy way to get this location just from the binfo so I don't worry
about that here.
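
A sketch of the newly diagnosed case (hypothetical, not the actual
decomp62.C test): the closure type appears as a base of S, so
find_decomp_class_base must reject the decomposition there as well.

  auto l = [x = 1] {};
  struct S : decltype (l) { int y; };
  S s{l, 2};
  auto [a, b] = s;  // now rejected: cannot decompose a lambda closure type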

	PR c++/90321

gcc/cp/ChangeLog:

	* decl.cc (find_decomp_class_base): Check for decomposing a
	lambda closure type.  Report base class chains if needed.
	(cp_finish_decomp): Remove no-longer-needed check.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp1z/decomp62.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Marek Polacek <polacek@redhat.com>
2024-12-14 12:28:23 +11:00
Abdo Eid
7238b074b5 libstdc++: Remove duplicate using-declaration in <wchar.h>
libstdc++-v3/ChangeLog:

	* include/c_compatibility/wchar.h (fgetwc): Remove duplicate
	using-declaration.
2024-12-14 01:21:19 +00:00
GCC Administrator
ec6cd3b97a Daily bump. 2024-12-14 00:19:52 +00:00
Jakub Jelinek
b626ebc0d7 cse: Fix up record_jump_equiv checks [PR117095]
The following testcase is miscompiled on s390x-linux with -O2 -march=z15.
The problem happens during cse2, which sees in an extended basic block
(jump_insn 217 78 216 10 (parallel [
            (set (pc)
                (if_then_else (ne (reg:SI 165)
                        (const_int 1 [0x1]))
                    (label_ref 216)
                    (pc)))
            (set (reg:SI 165)
                (plus:SI (reg:SI 165)
                    (const_int -1 [0xffffffffffffffff])))
            (clobber (scratch:SI))
            (clobber (reg:CC 33 %cc))
        ]) "t.c":14:17 discrim 1 2192 {doloop_si64}
     (int_list:REG_BR_PROB 955630228 (nil))
 -> 216)
...
(insn 99 98 100 12 (set (reg:SI 138)
        (const_int 1 [0x1])) "t.c":9:31 1507 {*movsi_zarch}
     (nil))
(insn 100 99 103 12 (parallel [
            (set (reg:SI 137)
                (minus:SI (reg:SI 138)
                    (subreg:SI (reg:HI 135 [ a ]) 0)))
            (clobber (reg:CC 33 %cc))
        ]) "t.c":9:31 1904 {*subsi3}
     (expr_list:REG_DEAD (reg:SI 138)
        (expr_list:REG_DEAD (reg:HI 135 [ a ])
            (expr_list:REG_UNUSED (reg:CC 33 %cc)
                (nil)))))
Note, cse2 has df_note_add_problem () before df_analyze, which adds
     (expr_list:REG_UNUSED (reg:SI 165)
        (expr_list:REG_UNUSED (reg:CC 33 %cc)
notes to the first insn (correctly so, %cc is clobbered there and pseudo
165 isn't used after the insn).
Now, cse_extended_basic_block has an extra optimization on conditional
jumps, where it records equivalence on the edge which continues in the ebb.
Here it sees that (ne (reg:SI 165) (const_int 1)) is false on the edge and
remembers that pseudo 165 is comparison equivalent to (const_int 1),
so on insn 100 it decides to replace (reg:SI 138) with (reg:SI 165).

This optimization isn't correct here though, because the JUMP_INSN has
multiple sets.  Before r0-77890 record_jump_equiv has been called from
cse_insn guarded on n_sets == 1 && any_condjump_p (insn), so it wouldn't
be done on the above JUMP_INSN where n_sets == 2.  But since that change
it is guarded with single_set (insn) && any_condjump_p (insn) and that
is true because of the REG_UNUSED note.  Looking at that note is
inappropriate in CSE though, because the whole intent of the pass is to
extend the lifetimes of the pseudos if equivalence is found, so the fact
that there is REG_UNUSED note for (reg:SI 165) and that the reg isn't used
later doesn't imply that it won't be used after the optimization.
So, unless we manage to process the other sets on the JUMP_INSN (it wouldn't
be terribly hard in this exact case, the doloop insn decreases the register
by 1 and so we could just record equivalence to (const_int 0) instead, but
generally it might be hard), we should IMHO just punt if there are multiple
sets.

The patch below adds a !multiple_sets (insn) check instead of replacing
the single_set (insn) check with it, because apparently any_condjump_p uses
pc_set, which supports the cases where PATTERN is a SET to PC (that is
single_set (insn) && !multiple_sets (insn)), PATTERN is a PARALLEL with a
single SET to PC (likewise) and some CLOBBERs, PATTERN is a PARALLEL with
two or more SETs where the first one is a SET to PC (that could be
single_set (insn) with REG_UNUSED notes but is not !multiple_sets (insn)),
or PATTERN is an UNSPEC/UNSPEC_VOLATILE with a SET inside of it.  For the
last case !multiple_sets (insn) will be true, but IMHO we shouldn't try to
derive anything from those because we haven't checked the rest of the
UNSPEC* and we don't really know what it does.

2024-12-13  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/117095
	* cse.cc (cse_extended_basic_block): Don't call record_jump_equiv
	if multiple_sets (insn).

	* gcc.c-torture/execute/pr117095.c: New test.
2024-12-14 00:41:00 +01:00
Patrick Palka
b8314ebff2 libstdc++: Avoid unnecessary copies in ranges::min/max [PR112349]
Use a local reference for the (now possibly lifetime-extended) result of
*__first so that we copy it only when necessary.

	PR libstdc++/112349

libstdc++-v3/ChangeLog:

	* include/bits/ranges_algo.h (__min_fn::operator()): Turn local
	object __tmp into a reference.
	* include/bits/ranges_util.h (__max_fn::operator()): Likewise.
	* testsuite/25_algorithms/max/constrained.cc (test04): New test.
	* testsuite/25_algorithms/min/constrained.cc (test04): New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
2024-12-13 13:17:29 -05:00
Christophe Lyon
2089009210 arm: [MVE intrinsics] Fix support for predicate constants [PR target/114801]
In this PR, we have to handle a case where MVE predicates are supplied
as a const_int, where individual predicates have illegal boolean
values (such as 0xc for a 4-bit boolean predicate).  To avoid the ICE,
fix the constant (any non-zero value is converted to all 1s) and emit
a warning.

On MVE, V8BI and V4BI multi-bit masks are interpreted byte-by-byte at
instruction level, but end-users should describe lanes rather than
bytes (so all bytes of a true-predicated lane should be '1'); see the
section on MVE intrinsics in the Arm ACLE specification.

Since force_lowpart_subreg cannot handle const_ints (because they have
VOID mode), use gen_lowpart on them, and force_lowpart_subreg otherwise.
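
A sketch of the situation (hypothetical, not the actual pr114801.c
test): 0x00cc sets only two of the four predicate bits of each of the
two low lanes of a V4BI predicate; GCC now treats any non-zero per-lane
value as all-ones and warns.

  #include <arm_mve.h>

  int32x4_t
  f (int32x4_t inactive, int32x4_t a, int32x4_t b)
  {
    return vaddq_m_s32 (inactive, a, b, 0x00cc);
  }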

2024-11-20  Christophe Lyon  <christophe.lyon@linaro.org>
	    Jakub Jelinek  <jakub@redhat.com>

	PR target/114801
	gcc/
	* config/arm/arm-mve-builtins.cc
	(function_expander::add_input_operand): Handle CONST_INT
	predicates.

	gcc/testsuite/
	* gcc.target/arm/mve/pr108443.c: Update predicate constant.
	* gcc.target/arm/mve/pr108443-run.c: Likewise.
	* gcc.target/arm/mve/pr114801.c: New test.
2024-12-13 14:25:36 +00:00
Christophe Lyon
4f4e13dd23 arm: [MVE intrinsics] rework vst2q vst4q vld2q vld4q
Implement vst2q, vst4q, vld2q and vld4q using the new MVE builtins
framework.

Since MVE uses different tuple modes than Neon, we need to use
VALID_MVE_STRUCT_MODE because VALID_NEON_STRUCT_MODE is no longer a
super-set of it, for instance in output_move_neon and
arm_print_operand_address.

In arm_hard_regno_mode_ok, the change is similar but a bit more
intrusive.

Expand the VSTRUCT iterator, so that mov<mode> and neon_mov<mode>
patterns from neon.md still work for MVE.

Besides the small updates to the patterns in mve.md, we have to update
vec_load_lanes and vec_store_lanes in vec-common.md so that the
vectorizer can handle the new modes. These patterns are now different
from Neon's, so maybe we should move them back to neon.md and mve.md.

The patch adds arm_array_mode, which is used by build_array_type_nelts
and makes it possible to support the new assert in
register_builtin_tuple_types.
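
A usage sketch (hypothetical function, not from the testsuite): vld2q
de-interleaves eight ints into a two-vector tuple, which the new V2x4SI
mode backs in the compiler.

  #include <arm_mve.h>

  void
  deinterleave (const int32_t *in, int32_t *even, int32_t *odd)
  {
    int32x4x2_t v = vld2q_s32 (in);
    vst1q_s32 (even, v.val[0]);
    vst1q_s32 (odd, v.val[1]);
  }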

gcc/ChangeLog:

	* config/arm/arm-mve-builtins-base.cc (class vst24_impl): New.
	(class vld24_impl): New.
	(vld2q, vld4q, vst2q, vst4q): New.
	* config/arm/arm-mve-builtins-base.def (vld2q, vld4q, vst2q)
	(vst4q): New.
	* config/arm/arm-mve-builtins-base.h (vld2q, vld4q, vst2q, vst4q):
	New.
	* config/arm/arm-mve-builtins.cc (register_builtin_tuple_types):
	Add more asserts.
	* config/arm/arm.cc (TARGET_ARRAY_MODE): New.
	(output_move_neon): Handle MVE struct modes.
	(arm_print_operand_address): Likewise.
	(arm_hard_regno_mode_ok): Likewise.
	(arm_array_mode): New.
	* config/arm/arm.h (VALID_MVE_STRUCT_MODE): Likewise.
	* config/arm/arm_mve.h (vst4q): Delete.
	(vst2q): Delete.
	(vld2q): Delete.
	(vld4q): Delete.
	(vst4q_s8): Delete.
	(vst4q_s16): Delete.
	(vst4q_s32): Delete.
	(vst4q_u8): Delete.
	(vst4q_u16): Delete.
	(vst4q_u32): Delete.
	(vst4q_f16): Delete.
	(vst4q_f32): Delete.
	(vst2q_s8): Delete.
	(vst2q_u8): Delete.
	(vld2q_s8): Delete.
	(vld2q_u8): Delete.
	(vld4q_s8): Delete.
	(vld4q_u8): Delete.
	(vst2q_s16): Delete.
	(vst2q_u16): Delete.
	(vld2q_s16): Delete.
	(vld2q_u16): Delete.
	(vld4q_s16): Delete.
	(vld4q_u16): Delete.
	(vst2q_s32): Delete.
	(vst2q_u32): Delete.
	(vld2q_s32): Delete.
	(vld2q_u32): Delete.
	(vld4q_s32): Delete.
	(vld4q_u32): Delete.
	(vld4q_f16): Delete.
	(vld2q_f16): Delete.
	(vst2q_f16): Delete.
	(vld4q_f32): Delete.
	(vld2q_f32): Delete.
	(vst2q_f32): Delete.
	(__arm_vst4q_s8): Delete.
	(__arm_vst4q_s16): Delete.
	(__arm_vst4q_s32): Delete.
	(__arm_vst4q_u8): Delete.
	(__arm_vst4q_u16): Delete.
	(__arm_vst4q_u32): Delete.
	(__arm_vst2q_s8): Delete.
	(__arm_vst2q_u8): Delete.
	(__arm_vld2q_s8): Delete.
	(__arm_vld2q_u8): Delete.
	(__arm_vld4q_s8): Delete.
	(__arm_vld4q_u8): Delete.
	(__arm_vst2q_s16): Delete.
	(__arm_vst2q_u16): Delete.
	(__arm_vld2q_s16): Delete.
	(__arm_vld2q_u16): Delete.
	(__arm_vld4q_s16): Delete.
	(__arm_vld4q_u16): Delete.
	(__arm_vst2q_s32): Delete.
	(__arm_vst2q_u32): Delete.
	(__arm_vld2q_s32): Delete.
	(__arm_vld2q_u32): Delete.
	(__arm_vld4q_s32): Delete.
	(__arm_vld4q_u32): Delete.
	(__arm_vst4q_f16): Delete.
	(__arm_vst4q_f32): Delete.
	(__arm_vld4q_f16): Delete.
	(__arm_vld2q_f16): Delete.
	(__arm_vst2q_f16): Delete.
	(__arm_vld4q_f32): Delete.
	(__arm_vld2q_f32): Delete.
	(__arm_vst2q_f32): Delete.
	(__arm_vst4q): Delete.
	(__arm_vst2q): Delete.
	(__arm_vld2q): Delete.
	(__arm_vld4q): Delete.
	* config/arm/arm_mve_builtins.def (vst4q, vst2q, vld4q, vld2q):
	Delete.
	* config/arm/iterators.md (VSTRUCT): Add V2x16QI, V2x8HI, V2x4SI,
	V2x8HF, V2x4SF, V4x16QI, V4x8HI, V4x4SI, V4x8HF, V4x4SF.
	(MVE_VLD2_VST2, MVE_vld2_vst2, MVE_VLD4_VST4, MVE_vld4_vst4): New.
	* config/arm/mve.md (mve_vst4q<mode>): Update into ...
	(@mve_vst4q<mode>): ... this.
	(mve_vst2q<mode>): Update into ...
	(@mve_vst2q<mode>): ... this.
	(mve_vld2q<mode>): Update into ...
	(@mve_vld2q<mode>): ... this.
	(mve_vld4q<mode>): Update into ...
	(@mve_vld4q<mode>): ... this.
	* config/arm/vec-common.md (vec_load_lanesoi<mode>) Remove MVE
	support.
	(vec_load_lanesxi<mode>): Likewise.
	(vec_store_lanesoi<mode>): Likewise.
	(vec_store_lanesxi<mode>): Likewise.
	(vec_load_lanes<MVE_vld2_vst2><mode>):
	New.
	(vec_store_lanes<MVE_vld2_vst2><mode>): New.
	(vec_load_lanes<MVE_vld4_vst4><mode>): New.
	(vec_store_lanes<MVE_vld4_vst4><mode>): New.
2024-12-13 14:25:35 +00:00
Christophe Lyon
87235d8ae8 arm: [MVE intrinsics] fix store shape to support tuples
Now that tuples are properly supported, we can update the store shape to
expect "t0" instead of "v0" as the last argument.

gcc/ChangeLog:

	* config/arm/arm-mve-builtins-shapes.cc (struct store_def): Add
	support for tuples.
2024-12-13 14:25:35 +00:00
Christophe Lyon
e9c36605a4 arm: [MVE intrinsics] add support for tuples
This patch is largely a copy/paste from the aarch64 SVE counterpart,
and adds support for tuples to the MVE intrinsics framework.

Introduce function_resolver::infer_tuple_type which will be used to
resolve overloaded vst2q and vst4q function names in a later patch.

Fix access to acle_vector_types in a few places, as well as in
infer_vector_or_tuple_type because we should shift the tuple size to
the right by one bit when computing the array index.

The new wrap_type_in_struct, register_type_decl and infer_tuple_type
are largely copies of the aarch64 versions, and
register_builtin_tuple_types is very similar.

gcc/ChangeLog:

	* config/arm/arm-mve-builtins-shapes.cc (parse_type): Fix access
	to acle_vector_types.
	* config/arm/arm-mve-builtins.cc (wrap_type_in_struct): New.
	(register_type_decl): New.
	(register_builtin_tuple_types): Fix support for tuples.
	(function_resolver::infer_tuple_type): New.
	* config/arm/arm-mve-builtins.h
	(function_resolver::infer_tuple_type): Declare.
	(function_instance::tuple_type): Fix access to acle_vector_types.
2024-12-13 14:25:35 +00:00
Christophe Lyon
1e52a6a2d4 arm: [MVE intrinsics] add modes for tuples
Add V2x and V4x modes, like we do on aarch64 for Advanced SIMD
q-registers.

gcc/ChangeLog:

	* config/arm/arm-modes.def (MVE_STRUCT_MODES): New.
2024-12-13 14:25:35 +00:00
Christophe Lyon
9553e13746 arm: [MVE intrinsics] remove V2DF from MVE_vecs iterator
V2DF is not supported by MVE, so remove it from the only iterator
which contains it.

gcc/ChangeLog:

	* config/arm/iterators.md (MVE_vecs): Remove V2DF.
2024-12-13 14:25:35 +00:00
Christophe Lyon
4d79603e83 arm: [MVE intrinsics] Fix condition for vec_extract patterns
Remove floating-point condition from mve_vec_extract_sext_internal and
mve_vec_extract_zext_internal, since the MVE_2 iterator does not
include any FP mode.

gcc/ChangeLog:

	* config/arm/mve.md (mve_vec_extract_sext_internal): Fix
	condition.
	(mve_vec_extract_zext_internal): Likewise.
2024-12-13 14:25:27 +00:00
Christophe Lyon
e860e8561a arm: [MVE intrinsics] remove useless call_properties implementations.
vstrq_impl derives from store_truncating and vldrq_impl derives from
load_extending, which both implement call_properties.

No need to re-implement it in the derived classes.

gcc/ChangeLog:

	* config/arm/arm-mve-builtins-base.cc (vstrq_impl): Remove
	call_properties.
	(vldrq_impl): Likewise.
2024-12-13 14:23:32 +00:00
Christophe Lyon
28e4682944 arm: [MVE intrinsics] rework vldr gather_base_wb
Implement vldr?q_gather_base_wb using the new MVE builtins framework.
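
A usage sketch (hypothetical, not from the testsuite): the vector of
base addresses is passed by pointer and written back after the load.

  #include <arm_mve.h>

  int32x4_t
  load_and_advance (uint32x4_t *addrs)
  {
    return vldrwq_gather_base_wb_s32 (addrs, 16);
  }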

gcc/ChangeLog:

	* config/arm/arm-builtins.cc (arm_ldrgbwbxu_qualifiers)
	(arm_ldrgbwbxu_z_qualifiers, arm_ldrgbwbs_qualifiers)
	(arm_ldrgbwbu_qualifiers, arm_ldrgbwbs_z_qualifiers)
	(arm_ldrgbwbu_z_qualifiers): Delete.
	* config/arm/arm-mve-builtins-base.cc (vldrq_gather_base_impl):
	Add support for MODE_wb.
	* config/arm/arm-mve-builtins-shapes.cc (struct
	load_gather_base_def): Likewise.
	* config/arm/arm_mve.h (vldrdq_gather_base_wb_s64): Delete.
	(vldrdq_gather_base_wb_u64): Delete.
	(vldrdq_gather_base_wb_z_s64): Delete.
	(vldrdq_gather_base_wb_z_u64): Delete.
	(vldrwq_gather_base_wb_f32): Delete.
	(vldrwq_gather_base_wb_s32): Delete.
	(vldrwq_gather_base_wb_u32): Delete.
	(vldrwq_gather_base_wb_z_f32): Delete.
	(vldrwq_gather_base_wb_z_s32): Delete.
	(vldrwq_gather_base_wb_z_u32): Delete.
	(__arm_vldrdq_gather_base_wb_s64): Delete.
	(__arm_vldrdq_gather_base_wb_u64): Delete.
	(__arm_vldrdq_gather_base_wb_z_s64): Delete.
	(__arm_vldrdq_gather_base_wb_z_u64): Delete.
	(__arm_vldrwq_gather_base_wb_s32): Delete.
	(__arm_vldrwq_gather_base_wb_u32): Delete.
	(__arm_vldrwq_gather_base_wb_z_s32): Delete.
	(__arm_vldrwq_gather_base_wb_z_u32): Delete.
	(__arm_vldrwq_gather_base_wb_f32): Delete.
	(__arm_vldrwq_gather_base_wb_z_f32): Delete.
	* config/arm/arm_mve_builtins.def (vldrwq_gather_base_nowb_z_u)
	(vldrdq_gather_base_nowb_z_u, vldrwq_gather_base_nowb_u)
	(vldrdq_gather_base_nowb_u, vldrwq_gather_base_nowb_z_s)
	(vldrwq_gather_base_nowb_z_f, vldrdq_gather_base_nowb_z_s)
	(vldrwq_gather_base_nowb_s, vldrwq_gather_base_nowb_f)
	(vldrdq_gather_base_nowb_s, vldrdq_gather_base_wb_z_s)
	(vldrdq_gather_base_wb_z_u, vldrdq_gather_base_wb_s)
	(vldrdq_gather_base_wb_u, vldrwq_gather_base_wb_z_s)
	(vldrwq_gather_base_wb_z_f, vldrwq_gather_base_wb_z_u)
	(vldrwq_gather_base_wb_s, vldrwq_gather_base_wb_f)
	(vldrwq_gather_base_wb_u): Delete.
	* config/arm/iterators.md (supf): Remove VLDRWQGBWB_S,
	VLDRWQGBWB_U, VLDRDQGBWB_S, VLDRDQGBWB_U.
	(VLDRWGBWBQ, VLDRDGBWBQ): Delete.
	* config/arm/mve.md (mve_vldrwq_gather_base_wb_<supf>v4si): Delete.
	(mve_vldrwq_gather_base_nowb_<supf>v4si): Delete.
	(mve_vldrwq_gather_base_wb_<supf>v4si_insn): Delete.
	(mve_vldrwq_gather_base_wb_z_<supf>v4si): Delete.
	(mve_vldrwq_gather_base_nowb_z_<supf>v4si): Delete.
	(mve_vldrwq_gather_base_wb_z_<supf>v4si_insn): Delete.
	(mve_vldrwq_gather_base_wb_fv4sf): Delete.
	(mve_vldrwq_gather_base_nowb_fv4sf): Delete.
	(mve_vldrwq_gather_base_wb_fv4sf_insn): Delete.
	(mve_vldrwq_gather_base_wb_z_fv4sf): Delete.
	(mve_vldrwq_gather_base_nowb_z_fv4sf): Delete.
	(mve_vldrwq_gather_base_wb_z_fv4sf_insn): Delete.
	(mve_vldrdq_gather_base_wb_<supf>v2di): Delete.
	(mve_vldrdq_gather_base_nowb_<supf>v2di): Delete.
	(mve_vldrdq_gather_base_wb_<supf>v2di_insn): Delete.
	(mve_vldrdq_gather_base_wb_z_<supf>v2di): Delete.
	(mve_vldrdq_gather_base_nowb_z_<supf>v2di): Delete.
	(mve_vldrdq_gather_base_wb_z_<supf>v2di_insn): Delete.
	(@mve_vldrq_gather_base_wb_<mode>): New.
	(@mve_vldrq_gather_base_wb_z_<mode>): New.
	* config/arm/unspecs.md (VLDRWQGBWB_S, VLDRWQGBWB_U, VLDRWQGBWB_F)
	(VLDRDQGBWB_S, VLDRDQGBWB_U): Delete.
	(VLDRGBWBQ, VLDRGBWBQ_Z): New.

gcc/testsuite/ChangeLog:

	* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c:
	Update expected output.
	* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c:
	Likewise.
2024-12-13 14:23:32 +00:00
Christophe Lyon
6505151088 arm: [MVE intrinsics] rework vldr gather_base
Implement vldr?q_gather_base using the new MVE builtins framework.

The patch updates two testcases rather than using different iterators
for predicated and non-predicated versions. According to ACLE:
vldrdq_gather_base_s64 is expected to generate VLDRD.64
vldrdq_gather_base_z_s64 is expected to generate VLDRDT.U64

Both are equally valid, however.
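
A usage sketch of the two variants mentioned above (hypothetical, not
from the testsuite):

  #include <arm_mve.h>

  int64x2_t
  f (uint64x2_t addrs)
  {
    return vldrdq_gather_base_s64 (addrs, 8);       /* VLDRD.64 */
  }

  int64x2_t
  g (uint64x2_t addrs, mve_pred16_t p)
  {
    return vldrdq_gather_base_z_s64 (addrs, 8, p);  /* VLDRDT.U64 */
  }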

gcc/ChangeLog:

	* config/arm/arm-builtins.cc (arm_ldrgbs_qualifiers)
	(arm_ldrgbu_qualifiers, arm_ldrgbs_z_qualifiers)
	(arm_ldrgbu_z_qualifiers): Delete.
	* config/arm/arm-mve-builtins-base.cc (class
	vldrq_gather_base_impl): New.
	(vldrdq_gather_base, vldrwq_gather_base): New.
	* config/arm/arm-mve-builtins-base.def (vldrdq_gather_base)
	(vldrwq_gather_base): New.
	* config/arm/arm-mve-builtins-base.h: (vldrdq_gather_base)
	(vldrwq_gather_base): New.
	* config/arm/arm_mve.h (vldrwq_gather_base_s32): Delete.
	(vldrwq_gather_base_u32): Delete.
	(vldrwq_gather_base_z_u32): Delete.
	(vldrwq_gather_base_z_s32): Delete.
	(vldrdq_gather_base_s64): Delete.
	(vldrdq_gather_base_u64): Delete.
	(vldrdq_gather_base_z_s64): Delete.
	(vldrdq_gather_base_z_u64): Delete.
	(vldrwq_gather_base_f32): Delete.
	(vldrwq_gather_base_z_f32): Delete.
	(__arm_vldrwq_gather_base_s32): Delete.
	(__arm_vldrwq_gather_base_u32): Delete.
	(__arm_vldrwq_gather_base_z_s32): Delete.
	(__arm_vldrwq_gather_base_z_u32): Delete.
	(__arm_vldrdq_gather_base_s64): Delete.
	(__arm_vldrdq_gather_base_u64): Delete.
	(__arm_vldrdq_gather_base_z_s64): Delete.
	(__arm_vldrdq_gather_base_z_u64): Delete.
	(__arm_vldrwq_gather_base_f32): Delete.
	(__arm_vldrwq_gather_base_z_f32): Delete.
	* config/arm/arm_mve_builtins.def (vldrwq_gather_base_s)
	(vldrwq_gather_base_u, vldrwq_gather_base_z_s)
	(vldrwq_gather_base_z_u, vldrdq_gather_base_s)
	(vldrwq_gather_base_f, vldrdq_gather_base_z_s)
	(vldrwq_gather_base_z_f, vldrdq_gather_base_u)
	(vldrdq_gather_base_z_u): Delete.
	* config/arm/iterators.md (supf): Remove VLDRWQGB_S, VLDRWQGB_U,
	VLDRDQGB_S, VLDRDQGB_U.
	(VLDRWGBQ, VLDRDGBQ): Delete.
	* config/arm/mve.md (mve_vldrwq_gather_base_<supf>v4si): Delete.
	(mve_vldrwq_gather_base_z_<supf>v4si): Delete.
	(mve_vldrdq_gather_base_<supf>v2di): Delete.
	(mve_vldrdq_gather_base_z_<supf>v2di): Delete.
	(mve_vldrwq_gather_base_fv4sf): Delete.
	(mve_vldrwq_gather_base_z_fv4sf): Delete.
	(@mve_vldrq_gather_base_<mode>): New.
	(@mve_vldrq_gather_base_z_<mode>): New.
	* config/arm/unspecs.md (VLDRWQGB_S, VLDRWQGB_U, VLDRDQGB_S)
	(VLDRDQGB_U, VLDRWQGB_F): Delete.
	(VLDRGBQ, VLDRGBQ_Z): New.

gcc/testsuite/ChangeLog:

	* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_s64.c: Update
	expected output.
	* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_u64.c:
	Likewise.
2024-12-13 14:23:32 +00:00
Christophe Lyon
6aae1658d2 arm: [MVE intrinsics] add load_gather_base shape
This patch adds the load_gather_base shape description.

Unlike other load_gather shapes, this one does not support overloaded
forms.

gcc/ChangeLog:

	* config/arm/arm-mve-builtins-shapes.cc (struct
	load_gather_base_def): New.
	* config/arm/arm-mve-builtins-shapes.h: (load_gather_base): New.
2024-12-13 14:23:32 +00:00
Christophe Lyon
e0c38d6c95 arm: [MVE intrinsics] rework vldr gather_shifted_offset
Implement vldr?q_gather_shifted_offset using the new MVE builtins
framework.

gcc/ChangeLog:

	* config/arm/arm-builtins.cc (arm_ldrgu_qualifiers)
	(arm_ldrgs_qualifiers, arm_ldrgs_z_qualifiers)
	(arm_ldrgu_z_qualifiers): Delete.
	* config/arm/arm-mve-builtins-base.cc (vldrq_gather_impl): Add
	support for shifted version.
	(vldrdq_gather_shifted, vldrhq_gather_shifted)
	(vldrwq_gather_shifted): New.
	* config/arm/arm-mve-builtins-base.def (vldrdq_gather_shifted)
	(vldrhq_gather_shifted, vldrwq_gather_shifted): New.
	* config/arm/arm-mve-builtins-base.h (vldrdq_gather_shifted)
	(vldrhq_gather_shifted, vldrwq_gather_shifted): New.
	* config/arm/arm_mve.h (vldrhq_gather_shifted_offset): Delete.
	(vldrhq_gather_shifted_offset_z): Delete.
	(vldrdq_gather_shifted_offset): Delete.
	(vldrdq_gather_shifted_offset_z): Delete.
	(vldrwq_gather_shifted_offset): Delete.
	(vldrwq_gather_shifted_offset_z): Delete.
	(vldrhq_gather_shifted_offset_s32): Delete.
	(vldrhq_gather_shifted_offset_s16): Delete.
	(vldrhq_gather_shifted_offset_u32): Delete.
	(vldrhq_gather_shifted_offset_u16): Delete.
	(vldrhq_gather_shifted_offset_z_s32): Delete.
	(vldrhq_gather_shifted_offset_z_s16): Delete.
	(vldrhq_gather_shifted_offset_z_u32): Delete.
	(vldrhq_gather_shifted_offset_z_u16): Delete.
	(vldrdq_gather_shifted_offset_s64): Delete.
	(vldrdq_gather_shifted_offset_u64): Delete.
	(vldrdq_gather_shifted_offset_z_s64): Delete.
	(vldrdq_gather_shifted_offset_z_u64): Delete.
	(vldrhq_gather_shifted_offset_f16): Delete.
	(vldrhq_gather_shifted_offset_z_f16): Delete.
	(vldrwq_gather_shifted_offset_f32): Delete.
	(vldrwq_gather_shifted_offset_s32): Delete.
	(vldrwq_gather_shifted_offset_u32): Delete.
	(vldrwq_gather_shifted_offset_z_f32): Delete.
	(vldrwq_gather_shifted_offset_z_s32): Delete.
	(vldrwq_gather_shifted_offset_z_u32): Delete.
	(__arm_vldrhq_gather_shifted_offset_s32): Delete.
	(__arm_vldrhq_gather_shifted_offset_s16): Delete.
	(__arm_vldrhq_gather_shifted_offset_u32): Delete.
	(__arm_vldrhq_gather_shifted_offset_u16): Delete.
	(__arm_vldrhq_gather_shifted_offset_z_s32): Delete.
	(__arm_vldrhq_gather_shifted_offset_z_s16): Delete.
	(__arm_vldrhq_gather_shifted_offset_z_u32): Delete.
	(__arm_vldrhq_gather_shifted_offset_z_u16): Delete.
	(__arm_vldrdq_gather_shifted_offset_s64): Delete.
	(__arm_vldrdq_gather_shifted_offset_u64): Delete.
	(__arm_vldrdq_gather_shifted_offset_z_s64): Delete.
	(__arm_vldrdq_gather_shifted_offset_z_u64): Delete.
	(__arm_vldrwq_gather_shifted_offset_s32): Delete.
	(__arm_vldrwq_gather_shifted_offset_u32): Delete.
	(__arm_vldrwq_gather_shifted_offset_z_s32): Delete.
	(__arm_vldrwq_gather_shifted_offset_z_u32): Delete.
	(__arm_vldrhq_gather_shifted_offset_f16): Delete.
	(__arm_vldrhq_gather_shifted_offset_z_f16): Delete.
	(__arm_vldrwq_gather_shifted_offset_f32): Delete.
	(__arm_vldrwq_gather_shifted_offset_z_f32): Delete.
	(__arm_vldrhq_gather_shifted_offset): Delete.
	(__arm_vldrhq_gather_shifted_offset_z): Delete.
	(__arm_vldrdq_gather_shifted_offset): Delete.
	(__arm_vldrdq_gather_shifted_offset_z): Delete.
	(__arm_vldrwq_gather_shifted_offset): Delete.
	(__arm_vldrwq_gather_shifted_offset_z): Delete.
	* config/arm/arm_mve_builtins.def
	(vldrhq_gather_shifted_offset_z_u, vldrhq_gather_shifted_offset_u)
	(vldrhq_gather_shifted_offset_z_s, vldrhq_gather_shifted_offset_s)
	(vldrdq_gather_shifted_offset_s, vldrhq_gather_shifted_offset_f)
	(vldrwq_gather_shifted_offset_f, vldrwq_gather_shifted_offset_s)
	(vldrdq_gather_shifted_offset_z_s)
	(vldrhq_gather_shifted_offset_z_f)
	(vldrwq_gather_shifted_offset_z_f)
	(vldrwq_gather_shifted_offset_z_s, vldrdq_gather_shifted_offset_u)
	(vldrwq_gather_shifted_offset_u, vldrdq_gather_shifted_offset_z_u)
	(vldrwq_gather_shifted_offset_z_u): Delete.
	* config/arm/iterators.md (supf): Remove VLDRHQGSO_S, VLDRHQGSO_U,
	VLDRDQGSO_S, VLDRDQGSO_U, VLDRWQGSO_S, VLDRWQGSO_U.
	(VLDRHGSOQ, VLDRDGSOQ, VLDRWGSOQ): Delete.
	* config/arm/mve.md
	(mve_vldrhq_gather_shifted_offset_<supf><mode>): Delete.
	(mve_vldrhq_gather_shifted_offset_z_<supf><mode>): Delete.
	(mve_vldrdq_gather_shifted_offset_<supf>v2di): Delete.
	(mve_vldrdq_gather_shifted_offset_z_<supf>v2di): Delete.
	(mve_vldrhq_gather_shifted_offset_fv8hf): Delete.
	(mve_vldrhq_gather_shifted_offset_z_fv8hf): Delete.
	(mve_vldrwq_gather_shifted_offset_fv4sf): Delete.
	(mve_vldrwq_gather_shifted_offset_<supf>v4si): Delete.
	(mve_vldrwq_gather_shifted_offset_z_fv4sf): Delete.
	(mve_vldrwq_gather_shifted_offset_z_<supf>v4si): Delete.
	(@mve_vldrq_gather_shifted_offset_<mode>): New.
	(@mve_vldrq_gather_shifted_offset_extend_v4si<US>): New.
	(@mve_vldrq_gather_shifted_offset_z_<mode>): New.
	(@mve_vldrq_gather_shifted_offset_z_extend_v4si<US>): New.
	* config/arm/unspecs.md (VLDRHQGSO_S, VLDRHQGSO_U, VLDRDQGSO_S)
	(VLDRDQGSO_U, VLDRHQGSO_F, VLDRWQGSO_F, VLDRWQGSO_S, VLDRWQGSO_U):
	Delete.
	(VLDRGSOQ, VLDRGSOQ_Z, VLDRGSOQ_EXT, VLDRGSOQ_EXT_Z): New.
2024-12-13 14:23:31 +00:00
Christophe Lyon
218881ac83 arm: [MVE intrinsics] rework vldr gather_offset
Implement vldr?q_gather_offset using the new MVE builtins framework.

The patch introduces a new attribute iterator (MVE_u_elem) to
accommodate the fact that ACLE's expected output description uses "uNN"
for all modes, except V8HF where it expects ".f16".  Using "V_sz_elem"
would work, but would require updating several testcases.
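
A usage sketch of that naming convention (hypothetical, not from the
testsuite): the expected mnemonic uses an unsigned element suffix for
every mode except f16.

  #include <arm_mve.h>

  int16x8_t
  f (const int16_t *base, uint16x8_t off)
  {
    return vldrhq_gather_offset_s16 (base, off);  /* vldrh.u16 */
  }

  float16x8_t
  g (const float16_t *base, uint16x8_t off)
  {
    return vldrhq_gather_offset_f16 (base, off);  /* vldrh.f16 */
  }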

gcc/ChangeLog:

	* config/arm/arm-mve-builtins-base.cc (class vldrq_gather_impl):
	New.
	(vldrbq_gather, vldrdq_gather, vldrhq_gather, vldrwq_gather): New.
	* config/arm/arm-mve-builtins-base.def (vldrbq_gather)
	(vldrdq_gather, vldrhq_gather, vldrwq_gather): New.
	* config/arm/arm-mve-builtins-base.h (vldrbq_gather)
	(vldrdq_gather, vldrhq_gather, vldrwq_gather): New.
	* config/arm/arm_mve.h (vldrbq_gather_offset): Delete.
	(vldrbq_gather_offset_z): Delete.
	(vldrhq_gather_offset): Delete.
	(vldrhq_gather_offset_z): Delete.
	(vldrdq_gather_offset): Delete.
	(vldrdq_gather_offset_z): Delete.
	(vldrwq_gather_offset): Delete.
	(vldrwq_gather_offset_z): Delete.
	(vldrbq_gather_offset_u8): Delete.
	(vldrbq_gather_offset_s8): Delete.
	(vldrbq_gather_offset_u16): Delete.
	(vldrbq_gather_offset_s16): Delete.
	(vldrbq_gather_offset_u32): Delete.
	(vldrbq_gather_offset_s32): Delete.
	(vldrbq_gather_offset_z_s16): Delete.
	(vldrbq_gather_offset_z_u8): Delete.
	(vldrbq_gather_offset_z_s32): Delete.
	(vldrbq_gather_offset_z_u16): Delete.
	(vldrbq_gather_offset_z_u32): Delete.
	(vldrbq_gather_offset_z_s8): Delete.
	(vldrhq_gather_offset_s32): Delete.
	(vldrhq_gather_offset_s16): Delete.
	(vldrhq_gather_offset_u32): Delete.
	(vldrhq_gather_offset_u16): Delete.
	(vldrhq_gather_offset_z_s32): Delete.
	(vldrhq_gather_offset_z_s16): Delete.
	(vldrhq_gather_offset_z_u32): Delete.
	(vldrhq_gather_offset_z_u16): Delete.
	(vldrdq_gather_offset_s64): Delete.
	(vldrdq_gather_offset_u64): Delete.
	(vldrdq_gather_offset_z_s64): Delete.
	(vldrdq_gather_offset_z_u64): Delete.
	(vldrhq_gather_offset_f16): Delete.
	(vldrhq_gather_offset_z_f16): Delete.
	(vldrwq_gather_offset_f32): Delete.
	(vldrwq_gather_offset_s32): Delete.
	(vldrwq_gather_offset_u32): Delete.
	(vldrwq_gather_offset_z_f32): Delete.
	(vldrwq_gather_offset_z_s32): Delete.
	(vldrwq_gather_offset_z_u32): Delete.
	(__arm_vldrbq_gather_offset_u8): Delete.
	(__arm_vldrbq_gather_offset_s8): Delete.
	(__arm_vldrbq_gather_offset_u16): Delete.
	(__arm_vldrbq_gather_offset_s16): Delete.
	(__arm_vldrbq_gather_offset_u32): Delete.
	(__arm_vldrbq_gather_offset_s32): Delete.
	(__arm_vldrbq_gather_offset_z_s8): Delete.
	(__arm_vldrbq_gather_offset_z_s32): Delete.
	(__arm_vldrbq_gather_offset_z_s16): Delete.
	(__arm_vldrbq_gather_offset_z_u8): Delete.
	(__arm_vldrbq_gather_offset_z_u32): Delete.
	(__arm_vldrbq_gather_offset_z_u16): Delete.
	(__arm_vldrhq_gather_offset_s32): Delete.
	(__arm_vldrhq_gather_offset_s16): Delete.
	(__arm_vldrhq_gather_offset_u32): Delete.
	(__arm_vldrhq_gather_offset_u16): Delete.
	(__arm_vldrhq_gather_offset_z_s32): Delete.
	(__arm_vldrhq_gather_offset_z_s16): Delete.
	(__arm_vldrhq_gather_offset_z_u32): Delete.
	(__arm_vldrhq_gather_offset_z_u16): Delete.
	(__arm_vldrdq_gather_offset_s64): Delete.
	(__arm_vldrdq_gather_offset_u64): Delete.
	(__arm_vldrdq_gather_offset_z_s64): Delete.
	(__arm_vldrdq_gather_offset_z_u64): Delete.
	(__arm_vldrwq_gather_offset_s32): Delete.
	(__arm_vldrwq_gather_offset_u32): Delete.
	(__arm_vldrwq_gather_offset_z_s32): Delete.
	(__arm_vldrwq_gather_offset_z_u32): Delete.
	(__arm_vldrhq_gather_offset_f16): Delete.
	(__arm_vldrhq_gather_offset_z_f16): Delete.
	(__arm_vldrwq_gather_offset_f32): Delete.
	(__arm_vldrwq_gather_offset_z_f32): Delete.
	(__arm_vldrbq_gather_offset): Delete.
	(__arm_vldrbq_gather_offset_z): Delete.
	(__arm_vldrhq_gather_offset): Delete.
	(__arm_vldrhq_gather_offset_z): Delete.
	(__arm_vldrdq_gather_offset): Delete.
	(__arm_vldrdq_gather_offset_z): Delete.
	(__arm_vldrwq_gather_offset): Delete.
	(__arm_vldrwq_gather_offset_z): Delete.
	* config/arm/arm_mve_builtins.def (vldrbq_gather_offset_u)
	(vldrbq_gather_offset_s, vldrbq_gather_offset_z_s)
	(vldrbq_gather_offset_z_u, vldrhq_gather_offset_z_u)
	(vldrhq_gather_offset_u, vldrhq_gather_offset_z_s)
	(vldrhq_gather_offset_s, vldrdq_gather_offset_s)
	(vldrhq_gather_offset_f, vldrwq_gather_offset_f)
	(vldrwq_gather_offset_s, vldrdq_gather_offset_z_s)
	(vldrhq_gather_offset_z_f, vldrwq_gather_offset_z_f)
	(vldrwq_gather_offset_z_s, vldrdq_gather_offset_u)
	(vldrwq_gather_offset_u, vldrdq_gather_offset_z_u)
	(vldrwq_gather_offset_z_u): Delete.
	* config/arm/iterators.md (MVE_u_elem): New.
	(supf): Remove VLDRBQGO_S, VLDRBQGO_U, VLDRHQGO_S, VLDRHQGO_U,
	VLDRDQGO_S, VLDRDQGO_U, VLDRWQGO_S, VLDRWQGO_U.
	(VLDRBGOQ, VLDRHGOQ, VLDRDGOQ, VLDRWGOQ): Delete.
	* config/arm/mve.md (mve_vldrbq_gather_offset_<supf><mode>):
	Delete.
	(mve_vldrbq_gather_offset_z_<supf><mode>): Delete.
	(mve_vldrhq_gather_offset_<supf><mode>): Delete.
	(mve_vldrhq_gather_offset_z_<supf><mode>): Delete.
	(mve_vldrdq_gather_offset_<supf>v2di): Delete.
	(mve_vldrdq_gather_offset_z_<supf>v2di): Delete.
	(mve_vldrhq_gather_offset_fv8hf): Delete.
	(mve_vldrhq_gather_offset_z_fv8hf): Delete.
	(mve_vldrwq_gather_offset_fv4sf): Delete.
	(mve_vldrwq_gather_offset_<supf>v4si): Delete.
	(mve_vldrwq_gather_offset_z_fv4sf): Delete.
	(mve_vldrwq_gather_offset_z_<supf>v4si): Delete.
	(@mve_vldrq_gather_offset_<mode>): New.
	(@mve_vldrq_gather_offset_extend_<mode><US>): New.
	(@mve_vldrq_gather_offset_z_<mode>): New.
	(@mve_vldrq_gather_offset_z_extend_<mode><US>): New.
	* config/arm/unspecs.md (VLDRBQGO_S, VLDRBQGO_U, VLDRHQGO_S)
	(VLDRHQGO_U, VLDRDQGO_S, VLDRDQGO_U, VLDRHQGO_F, VLDRWQGO_F)
	(VLDRWQGO_S, VLDRWQGO_U): Delete.
	(VLDRGOQ, VLDRGOQ_Z, VLDRGOQ_EXT, VLDRGOQ_EXT_Z): New.
2024-12-13 14:23:31 +00:00
Christophe Lyon
20e31a082c arm: [MVE intrinsics] add load_ext_gather_offset shape
This patch adds the load_ext_gather_offset shape description.

gcc/ChangeLog:

	* config/arm/arm-mve-builtins-shapes.cc (struct load_ext_gather):
	New.
	(struct load_ext_gather_offset_def): New.
	* config/arm/arm-mve-builtins-shapes.h (load_ext_gather_offset):
	New.
2024-12-13 14:23:31 +00:00
Christophe Lyon
b0512ae20b arm: [MVE intrinsics] rework vstr scatter_base_wb
Implement vstr?q_scatter_base_wb using the new MVE builtins framework.

The patch introduces a new 'b' type for signatures, which
represents the type of the 'base' argument of vstr?q_scatter_base_wb.

gcc/ChangeLog:

	* config/arm/arm-builtins.cc (arm_strsbwbs_qualifiers)
	(arm_strsbwbu_qualifiers, arm_strsbwbs_p_qualifiers)
	(arm_strsbwbu_p_qualifiers): Delete.
	* config/arm/arm-mve-builtins-base.cc (vstrq_scatter_base_impl):
	Add support for MODE_wb.
	* config/arm/arm-mve-builtins-shapes.cc (parse_type): Add support
	for 'b' type.
	(store_scatter_base): Add support for MODE_wb.
	* config/arm/arm-mve-builtins.cc
	(function_resolver::require_pointer_to_type): New.
	* config/arm/arm-mve-builtins.h
	(function_resolver::require_pointer_to_type): New.
	* config/arm/arm_mve.h (vstrdq_scatter_base_wb): Delete.
	(vstrdq_scatter_base_wb_p): Delete.
	(vstrwq_scatter_base_wb_p): Delete.
	(vstrwq_scatter_base_wb): Delete.
	(vstrdq_scatter_base_wb_p_s64): Delete.
	(vstrdq_scatter_base_wb_p_u64): Delete.
	(vstrdq_scatter_base_wb_s64): Delete.
	(vstrdq_scatter_base_wb_u64): Delete.
	(vstrwq_scatter_base_wb_p_s32): Delete.
	(vstrwq_scatter_base_wb_p_f32): Delete.
	(vstrwq_scatter_base_wb_p_u32): Delete.
	(vstrwq_scatter_base_wb_s32): Delete.
	(vstrwq_scatter_base_wb_u32): Delete.
	(vstrwq_scatter_base_wb_f32): Delete.
	(__arm_vstrdq_scatter_base_wb_s64): Delete.
	(__arm_vstrdq_scatter_base_wb_u64): Delete.
	(__arm_vstrdq_scatter_base_wb_p_s64): Delete.
	(__arm_vstrdq_scatter_base_wb_p_u64): Delete.
	(__arm_vstrwq_scatter_base_wb_p_s32): Delete.
	(__arm_vstrwq_scatter_base_wb_p_u32): Delete.
	(__arm_vstrwq_scatter_base_wb_s32): Delete.
	(__arm_vstrwq_scatter_base_wb_u32): Delete.
	(__arm_vstrwq_scatter_base_wb_f32): Delete.
	(__arm_vstrwq_scatter_base_wb_p_f32): Delete.
	(__arm_vstrdq_scatter_base_wb): Delete.
	(__arm_vstrdq_scatter_base_wb_p): Delete.
	(__arm_vstrwq_scatter_base_wb_p): Delete.
	(__arm_vstrwq_scatter_base_wb): Delete.
	* config/arm/arm_mve_builtins.def (vstrwq_scatter_base_wb_u)
	(vstrdq_scatter_base_wb_u, vstrwq_scatter_base_wb_p_u)
	(vstrdq_scatter_base_wb_p_u, vstrwq_scatter_base_wb_s)
	(vstrwq_scatter_base_wb_f, vstrdq_scatter_base_wb_s)
	(vstrwq_scatter_base_wb_p_s, vstrwq_scatter_base_wb_p_f)
	(vstrdq_scatter_base_wb_p_s): Delete.
	* config/arm/iterators.md (supf): Remove VSTRWQSBWB_S,
	VSTRWQSBWB_U, VSTRDQSBWB_S, VSTRDQSBWB_U.
	(VSTRDSBQ, VSTRWSBWBQ, VSTRDSBWBQ): Delete.
	* config/arm/mve.md (mve_vstrwq_scatter_base_wb_<supf>v4si): Delete.
	(mve_vstrwq_scatter_base_wb_p_<supf>v4si): Delete.
	(mve_vstrwq_scatter_base_wb_fv4sf): Delete.
	(mve_vstrwq_scatter_base_wb_p_fv4sf): Delete.
	(mve_vstrdq_scatter_base_wb_<supf>v2di): Delete.
	(mve_vstrdq_scatter_base_wb_p_<supf>v2di): Delete.
	(@mve_vstrq_scatter_base_wb_<mode>): New.
	(@mve_vstrq_scatter_base_wb_p_<mode>): New.
	* config/arm/unspecs.md (VSTRWQSBWB_S, VSTRWQSBWB_U, VSTRWQSBWB_F)
	(VSTRDQSBWB_S, VSTRDQSBWB_U): Delete.
	(VSTRSBWBQ, VSTRSBWBQ_P): New.
2024-12-13 14:23:31 +00:00
Christophe Lyon
39cc2ed30e arm: [MVE intrinsics] rework vstr scatter_base
Implement vstr?q_scatter_base using the new MVE builtins framework.

We need to introduce a new iterator (MVE_4) to support the set needed
by vstr?q_scatter_base (V4SI V4SF V2DI).

gcc/ChangeLog:

	* config/arm/arm-builtins.cc (arm_strsbs_qualifiers)
	(arm_strsbu_qualifiers, arm_strsbs_p_qualifiers)
	(arm_strsbu_p_qualifiers): Delete.
	* config/arm/arm-mve-builtins-base.cc (class
	vstrq_scatter_base_impl): New.
	(vstrwq_scatter_base, vstrdq_scatter_base): New.
	* config/arm/arm-mve-builtins-base.def (vstrwq_scatter_base)
	(vstrdq_scatter_base): New.
	* config/arm/arm-mve-builtins-base.h (vstrwq_scatter_base)
	(vstrdq_scatter_base): New.
	* config/arm/arm_mve.h (vstrwq_scatter_base): Delete.
	(vstrwq_scatter_base_p): Delete.
	(vstrdq_scatter_base_p): Delete.
	(vstrdq_scatter_base): Delete.
	(vstrwq_scatter_base_s32): Delete.
	(vstrwq_scatter_base_u32): Delete.
	(vstrwq_scatter_base_p_s32): Delete.
	(vstrwq_scatter_base_p_u32): Delete.
	(vstrdq_scatter_base_p_s64): Delete.
	(vstrdq_scatter_base_p_u64): Delete.
	(vstrdq_scatter_base_s64): Delete.
	(vstrdq_scatter_base_u64): Delete.
	(vstrwq_scatter_base_f32): Delete.
	(vstrwq_scatter_base_p_f32): Delete.
	(__arm_vstrwq_scatter_base_s32): Delete.
	(__arm_vstrwq_scatter_base_u32): Delete.
	(__arm_vstrwq_scatter_base_p_s32): Delete.
	(__arm_vstrwq_scatter_base_p_u32): Delete.
	(__arm_vstrdq_scatter_base_p_s64): Delete.
	(__arm_vstrdq_scatter_base_p_u64): Delete.
	(__arm_vstrdq_scatter_base_s64): Delete.
	(__arm_vstrdq_scatter_base_u64): Delete.
	(__arm_vstrwq_scatter_base_f32): Delete.
	(__arm_vstrwq_scatter_base_p_f32): Delete.
	(__arm_vstrwq_scatter_base): Delete.
	(__arm_vstrwq_scatter_base_p): Delete.
	(__arm_vstrdq_scatter_base_p): Delete.
	(__arm_vstrdq_scatter_base): Delete.
	* config/arm/arm_mve_builtins.def (vstrwq_scatter_base_s)
	(vstrwq_scatter_base_u, vstrwq_scatter_base_p_s)
	(vstrwq_scatter_base_p_u, vstrdq_scatter_base_s)
	(vstrwq_scatter_base_f, vstrdq_scatter_base_p_s)
	(vstrwq_scatter_base_p_f, vstrdq_scatter_base_u)
	(vstrdq_scatter_base_p_u): Delete.
	* config/arm/iterators.md (MVE_4): New.
	(supf): Remove VSTRWQSB_S, VSTRWQSB_U.
	(VSTRWSBQ): Delete.
	* config/arm/mve.md (mve_vstrwq_scatter_base_<supf>v4si): Delete.
	(mve_vstrwq_scatter_base_p_<supf>v4si): Delete.
	(mve_vstrdq_scatter_base_p_<supf>v2di): Delete.
	(mve_vstrdq_scatter_base_<supf>v2di): Delete.
	(mve_vstrwq_scatter_base_fv4sf): Delete.
	(mve_vstrwq_scatter_base_p_fv4sf): Delete.
	(@mve_vstrq_scatter_base_<mode>): New.
	(@mve_vstrq_scatter_base_p_<mode>): New.
	* config/arm/unspecs.md (VSTRWQSB_S, VSTRWQSB_U, VSTRWQSB_F):
	Delete.
	(VSTRSBQ, VSTRSBQ_P): New.
2024-12-13 14:23:30 +00:00
Christophe Lyon
1f2ab5b390 arm: [MVE intrinsics] Add store_scatter_base shape
This patch adds the store_scatter_base shape description.

gcc/ChangeLog:

	* config/arm/arm-mve-builtins-shapes.cc (store_scatter_base): New.
	* config/arm/arm-mve-builtins-shapes.h (store_scatter_base): New.
2024-12-13 14:23:30 +00:00
Christophe Lyon
c0ab343398 arm: [MVE intrinsics] Check immediate is a multiple in a range
This patch adds support to check that an immediate is a multiple of a
given value in a given range.

This will be used for instance by scatter_base to check that the offset
is in +/-4*[0..127].

Unlike require_immediate_range, require_immediate_range_multiple
accepts signed range bounds to handle the above case.
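
A minimal sketch of the check, assuming the bounds quoted above (the
offset must be a multiple of 4 in +/-4*[0..127]):

  #include <stdbool.h>

  static bool
  valid_scatter_base_offset (long imm)
  {
    return imm % 4 == 0 && imm >= -4 * 127 && imm <= 4 * 127;
  }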

gcc/ChangeLog:

	* config/arm/arm-mve-builtins.cc (report_out_of_range_multiple):
	New.
	(function_checker::require_signed_immediate): New.
	(function_checker::require_immediate_range_multiple): New.
	* config/arm/arm-mve-builtins.h
	(function_checker::require_immediate_range_multiple): New.
	(function_checker::require_signed_immediate): New.
2024-12-13 14:23:30 +00:00
Christophe Lyon
294e5424f2 arm: [MVE intrinsics] rework vstr_scatter_shifted_offset
Implement vstr?q_scatter_shifted_offset intrinsics using the MVE
builtins framework.

We use the same approach as the previous patch, and we now have four
sets of patterns:
- vector scatter stores with shifted offset (non-truncating)
- predicated vector scatter stores with shifted offset (non-truncating)
- truncating vector scatter stores with shifted offset
- predicated truncating vector scatter stores with shifted offset

Note that the truncating patterns do not use an iterator since there
is only one such variant: V4SI to V4HI.

We need to introduce new iterators:
- MVE_VLD_ST_scatter_shifted, same as MVE_VLD_ST_scatter without V16QI
- MVE_scatter_shift to map the mode to the shift amount
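
A usage sketch of the truncating case mentioned above (hypothetical,
not from the testsuite): V4SI data stored as halfwords at
base + (offset << 1).

  #include <arm_mve.h>

  void
  f (int16_t *base, uint32x4_t off, int32x4_t v)
  {
    vstrhq_scatter_shifted_offset_s32 (base, off, v);
  }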

gcc/ChangeLog:

	* config/arm/arm-builtins.cc (arm_strss_qualifiers)
	(arm_strsu_qualifiers, arm_strsu_p_qualifiers)
	(arm_strss_p_qualifiers): Delete.
	* config/arm/arm-mve-builtins-base.cc (class vstrq_scatter_impl):
	Add support for shifted version.
	(vstrdq_scatter_shifted, vstrhq_scatter_shifted)
	(vstrwq_scatter_shifted): New.
	* config/arm/arm-mve-builtins-base.def (vstrhq_scatter_shifted)
	(vstrwq_scatter_shifted, vstrdq_scatter_shifted): New.
	* config/arm/arm-mve-builtins-base.h (vstrhq_scatter_shifted)
	(vstrwq_scatter_shifted, vstrdq_scatter_shifted): New.
	* config/arm/arm_mve.h (vstrhq_scatter_shifted_offset): Delete.
	(vstrhq_scatter_shifted_offset_p): Delete.
	(vstrdq_scatter_shifted_offset_p): Delete.
	(vstrdq_scatter_shifted_offset): Delete.
	(vstrwq_scatter_shifted_offset_p): Delete.
	(vstrwq_scatter_shifted_offset): Delete.
	(vstrhq_scatter_shifted_offset_s32): Delete.
	(vstrhq_scatter_shifted_offset_s16): Delete.
	(vstrhq_scatter_shifted_offset_u32): Delete.
	(vstrhq_scatter_shifted_offset_u16): Delete.
	(vstrhq_scatter_shifted_offset_p_s32): Delete.
	(vstrhq_scatter_shifted_offset_p_s16): Delete.
	(vstrhq_scatter_shifted_offset_p_u32): Delete.
	(vstrhq_scatter_shifted_offset_p_u16): Delete.
	(vstrdq_scatter_shifted_offset_p_s64): Delete.
	(vstrdq_scatter_shifted_offset_p_u64): Delete.
	(vstrdq_scatter_shifted_offset_s64): Delete.
	(vstrdq_scatter_shifted_offset_u64): Delete.
	(vstrhq_scatter_shifted_offset_f16): Delete.
	(vstrhq_scatter_shifted_offset_p_f16): Delete.
	(vstrwq_scatter_shifted_offset_f32): Delete.
	(vstrwq_scatter_shifted_offset_p_f32): Delete.
	(vstrwq_scatter_shifted_offset_p_s32): Delete.
	(vstrwq_scatter_shifted_offset_p_u32): Delete.
	(vstrwq_scatter_shifted_offset_s32): Delete.
	(vstrwq_scatter_shifted_offset_u32): Delete.
	(__arm_vstrhq_scatter_shifted_offset_s32): Delete.
	(__arm_vstrhq_scatter_shifted_offset_s16): Delete.
	(__arm_vstrhq_scatter_shifted_offset_u32): Delete.
	(__arm_vstrhq_scatter_shifted_offset_u16): Delete.
	(__arm_vstrhq_scatter_shifted_offset_p_s32): Delete.
	(__arm_vstrhq_scatter_shifted_offset_p_s16): Delete.
	(__arm_vstrhq_scatter_shifted_offset_p_u32): Delete.
	(__arm_vstrhq_scatter_shifted_offset_p_u16): Delete.
	(__arm_vstrdq_scatter_shifted_offset_p_s64): Delete.
	(__arm_vstrdq_scatter_shifted_offset_p_u64): Delete.
	(__arm_vstrdq_scatter_shifted_offset_s64): Delete.
	(__arm_vstrdq_scatter_shifted_offset_u64): Delete.
	(__arm_vstrwq_scatter_shifted_offset_p_s32): Delete.
	(__arm_vstrwq_scatter_shifted_offset_p_u32): Delete.
	(__arm_vstrwq_scatter_shifted_offset_s32): Delete.
	(__arm_vstrwq_scatter_shifted_offset_u32): Delete.
	(__arm_vstrhq_scatter_shifted_offset_f16): Delete.
	(__arm_vstrhq_scatter_shifted_offset_p_f16): Delete.
	(__arm_vstrwq_scatter_shifted_offset_f32): Delete.
	(__arm_vstrwq_scatter_shifted_offset_p_f32): Delete.
	(__arm_vstrhq_scatter_shifted_offset): Delete.
	(__arm_vstrhq_scatter_shifted_offset_p): Delete.
	(__arm_vstrdq_scatter_shifted_offset_p): Delete.
	(__arm_vstrdq_scatter_shifted_offset): Delete.
	(__arm_vstrwq_scatter_shifted_offset_p): Delete.
	(__arm_vstrwq_scatter_shifted_offset): Delete.
	* config/arm/arm_mve_builtins.def
	(vstrhq_scatter_shifted_offset_p_u)
	(vstrhq_scatter_shifted_offset_u)
	(vstrhq_scatter_shifted_offset_p_s)
	(vstrhq_scatter_shifted_offset_s, vstrdq_scatter_shifted_offset_s)
	(vstrhq_scatter_shifted_offset_f, vstrwq_scatter_shifted_offset_f)
	(vstrwq_scatter_shifted_offset_s)
	(vstrdq_scatter_shifted_offset_p_s)
	(vstrhq_scatter_shifted_offset_p_f)
	(vstrwq_scatter_shifted_offset_p_f)
	(vstrwq_scatter_shifted_offset_p_s)
	(vstrdq_scatter_shifted_offset_u, vstrwq_scatter_shifted_offset_u)
	(vstrdq_scatter_shifted_offset_p_u)
	(vstrwq_scatter_shifted_offset_p_u): Delete.
	* config/arm/iterators.md (MVE_VLD_ST_scatter_shifted): New.
	(MVE_scatter_shift): New.
	(supf): Remove VSTRHQSSO_S, VSTRHQSSO_U, VSTRDQSSO_S, VSTRDQSSO_U,
	VSTRWQSSO_U, VSTRWQSSO_S.
	(VSTRHSSOQ, VSTRDSSOQ, VSTRWSSOQ): Delete.
	* config/arm/mve.md (mve_vstrhq_scatter_shifted_offset_p_<supf><mode>): Delete.
	(mve_vstrhq_scatter_shifted_offset_p_<supf><mode>_insn): Delete.
	(mve_vstrhq_scatter_shifted_offset_<supf><mode>): Delete.
	(mve_vstrhq_scatter_shifted_offset_<supf><mode>_insn): Delete.
	(mve_vstrdq_scatter_shifted_offset_p_<supf>v2di): Delete.
	(mve_vstrdq_scatter_shifted_offset_p_<supf>v2di_insn): Delete.
	(mve_vstrdq_scatter_shifted_offset_<supf>v2di): Delete.
	(mve_vstrdq_scatter_shifted_offset_<supf>v2di_insn): Delete.
	(mve_vstrhq_scatter_shifted_offset_fv8hf): Delete.
	(mve_vstrhq_scatter_shifted_offset_fv8hf_insn): Delete.
	(mve_vstrhq_scatter_shifted_offset_p_fv8hf): Delete.
	(mve_vstrhq_scatter_shifted_offset_p_fv8hf_insn): Delete.
	(mve_vstrwq_scatter_shifted_offset_fv4sf): Delete.
	(mve_vstrwq_scatter_shifted_offset_fv4sf_insn): Delete.
	(mve_vstrwq_scatter_shifted_offset_p_fv4sf): Delete.
	(mve_vstrwq_scatter_shifted_offset_p_fv4sf_insn): Delete.
	(mve_vstrwq_scatter_shifted_offset_p_<supf>v4si): Delete.
	(mve_vstrwq_scatter_shifted_offset_p_<supf>v4si_insn): Delete.
	(mve_vstrwq_scatter_shifted_offset_<supf>v4si): Delete.
	(mve_vstrwq_scatter_shifted_offset_<supf>v4si_insn): Delete.
	(@mve_vstrq_scatter_shifted_offset_<mode>): New.
	(@mve_vstrq_scatter_shifted_offset_p_<mode>): New.
	(mve_vstrq_truncate_scatter_shifted_offset_v4si): New.
	(mve_vstrq_truncate_scatter_shifted_offset_p_v4si): New.
	* config/arm/unspecs.md (VSTRDQSSO_S, VSTRDQSSO_U, VSTRWQSSO_S)
	(VSTRWQSSO_U, VSTRHQSSO_F, VSTRWQSSO_F, VSTRHQSSO_S, VSTRHQSSO_U):
	Delete.
	(VSTRSSOQ, VSTRSSOQ_P, VSTRSSOQ_TRUNC, VSTRSSOQ_TRUNC_P): New.
2024-12-13 14:23:30 +00:00
Christophe Lyon
5cfb8ff332 arm: [MVE intrinsics] rework vstr?q_scatter_offset
This patch implements vstr?q_scatter_offset using the new MVE builtins
framework.

It uses a similar approach to a previous patch which grouped
truncating and non-truncating stores in two sets of patterns, rather
than having groups of patterns depending on the destination size.

We need to add the 'integer_64' type suffixes in order to support
vstrdq_scatter_offset.

The patch introduces the MVE_VLD_ST_scatter iterator, similar to
MVE_VLD_ST but which also includes V2DI (again, for
vstrdq_scatter_offset).

The new MVE_scatter_offset mode attribute is used to map the
destination type to the offset type (both are usually equal, except
when the destination is floating-point).

We end up with four sets of patterns:
- vector scatter stores with offset (non-truncating)
- predicated vector scatter stores with offset (non-truncating)
- truncating vector scatter stores with offset
- predicated truncating vector scatter stores with offset
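
A usage sketch of the offset-type mapping (hypothetical, not from the
testsuite): for f32 the stored data is floating-point but the offsets
stay u32, which is what MVE_scatter_offset captures.

  #include <arm_mve.h>

  void
  f (float32_t *base, uint32x4_t off, float32x4_t v)
  {
    vstrwq_scatter_offset_f32 (base, off, v);
  }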

gcc/ChangeLog:

	* config/arm/arm-mve-builtins-base.cc (class vstrq_scatter_impl):
	New.
	(vstrbq_scatter, vstrhq_scatter, vstrwq_scatter, vstrdq_scatter):
	New.
	* config/arm/arm-mve-builtins-base.def (vstrbq_scatter)
	(vstrhq_scatter, vstrwq_scatter, vstrdq_scatter): New.
	* config/arm/arm-mve-builtins-base.h (vstrbq_scatter)
	(vstrhq_scatter, vstrwq_scatter, vstrdq_scatter): New.
	* config/arm/arm-mve-builtins.cc (integer_64): New.
	* config/arm/arm_mve.h (vstrbq_scatter_offset): Delete.
	(vstrbq_scatter_offset_p): Delete.
	(vstrhq_scatter_offset): Delete.
	(vstrhq_scatter_offset_p): Delete.
	(vstrdq_scatter_offset_p): Delete.
	(vstrdq_scatter_offset): Delete.
	(vstrwq_scatter_offset_p): Delete.
	(vstrwq_scatter_offset): Delete.
	(vstrbq_scatter_offset_s8): Delete.
	(vstrbq_scatter_offset_u8): Delete.
	(vstrbq_scatter_offset_u16): Delete.
	(vstrbq_scatter_offset_s16): Delete.
	(vstrbq_scatter_offset_u32): Delete.
	(vstrbq_scatter_offset_s32): Delete.
	(vstrbq_scatter_offset_p_s8): Delete.
	(vstrbq_scatter_offset_p_s32): Delete.
	(vstrbq_scatter_offset_p_s16): Delete.
	(vstrbq_scatter_offset_p_u8): Delete.
	(vstrbq_scatter_offset_p_u32): Delete.
	(vstrbq_scatter_offset_p_u16): Delete.
	(vstrhq_scatter_offset_s32): Delete.
	(vstrhq_scatter_offset_s16): Delete.
	(vstrhq_scatter_offset_u32): Delete.
	(vstrhq_scatter_offset_u16): Delete.
	(vstrhq_scatter_offset_p_s32): Delete.
	(vstrhq_scatter_offset_p_s16): Delete.
	(vstrhq_scatter_offset_p_u32): Delete.
	(vstrhq_scatter_offset_p_u16): Delete.
	(vstrdq_scatter_offset_p_s64): Delete.
	(vstrdq_scatter_offset_p_u64): Delete.
	(vstrdq_scatter_offset_s64): Delete.
	(vstrdq_scatter_offset_u64): Delete.
	(vstrhq_scatter_offset_f16): Delete.
	(vstrhq_scatter_offset_p_f16): Delete.
	(vstrwq_scatter_offset_f32): Delete.
	(vstrwq_scatter_offset_p_f32): Delete.
	(vstrwq_scatter_offset_p_s32): Delete.
	(vstrwq_scatter_offset_p_u32): Delete.
	(vstrwq_scatter_offset_s32): Delete.
	(vstrwq_scatter_offset_u32): Delete.
	(__arm_vstrbq_scatter_offset_s8): Delete.
	(__arm_vstrbq_scatter_offset_s32): Delete.
	(__arm_vstrbq_scatter_offset_s16): Delete.
	(__arm_vstrbq_scatter_offset_u8): Delete.
	(__arm_vstrbq_scatter_offset_u32): Delete.
	(__arm_vstrbq_scatter_offset_u16): Delete.
	(__arm_vstrbq_scatter_offset_p_s8): Delete.
	(__arm_vstrbq_scatter_offset_p_s32): Delete.
	(__arm_vstrbq_scatter_offset_p_s16): Delete.
	(__arm_vstrbq_scatter_offset_p_u8): Delete.
	(__arm_vstrbq_scatter_offset_p_u32): Delete.
	(__arm_vstrbq_scatter_offset_p_u16): Delete.
	(__arm_vstrhq_scatter_offset_s32): Delete.
	(__arm_vstrhq_scatter_offset_s16): Delete.
	(__arm_vstrhq_scatter_offset_u32): Delete.
	(__arm_vstrhq_scatter_offset_u16): Delete.
	(__arm_vstrhq_scatter_offset_p_s32): Delete.
	(__arm_vstrhq_scatter_offset_p_s16): Delete.
	(__arm_vstrhq_scatter_offset_p_u32): Delete.
	(__arm_vstrhq_scatter_offset_p_u16): Delete.
	(__arm_vstrdq_scatter_offset_p_s64): Delete.
	(__arm_vstrdq_scatter_offset_p_u64): Delete.
	(__arm_vstrdq_scatter_offset_s64): Delete.
	(__arm_vstrdq_scatter_offset_u64): Delete.
	(__arm_vstrwq_scatter_offset_p_s32): Delete.
	(__arm_vstrwq_scatter_offset_p_u32): Delete.
	(__arm_vstrwq_scatter_offset_s32): Delete.
	(__arm_vstrwq_scatter_offset_u32): Delete.
	(__arm_vstrhq_scatter_offset_f16): Delete.
	(__arm_vstrhq_scatter_offset_p_f16): Delete.
	(__arm_vstrwq_scatter_offset_f32): Delete.
	(__arm_vstrwq_scatter_offset_p_f32): Delete.
	(__arm_vstrbq_scatter_offset): Delete.
	(__arm_vstrbq_scatter_offset_p): Delete.
	(__arm_vstrhq_scatter_offset): Delete.
	(__arm_vstrhq_scatter_offset_p): Delete.
	(__arm_vstrdq_scatter_offset_p): Delete.
	(__arm_vstrdq_scatter_offset): Delete.
	(__arm_vstrwq_scatter_offset_p): Delete.
	(__arm_vstrwq_scatter_offset): Delete.
	* config/arm/arm_mve_builtins.def (vstrbq_scatter_offset_s)
	(vstrbq_scatter_offset_u, vstrbq_scatter_offset_p_s)
	(vstrbq_scatter_offset_p_u, vstrhq_scatter_offset_p_u)
	(vstrhq_scatter_offset_u, vstrhq_scatter_offset_p_s)
	(vstrhq_scatter_offset_s, vstrdq_scatter_offset_s)
	(vstrhq_scatter_offset_f, vstrwq_scatter_offset_f)
	(vstrwq_scatter_offset_s, vstrdq_scatter_offset_p_s)
	(vstrhq_scatter_offset_p_f, vstrwq_scatter_offset_p_f)
	(vstrwq_scatter_offset_p_s, vstrdq_scatter_offset_u)
	(vstrwq_scatter_offset_u, vstrdq_scatter_offset_p_u)
	(vstrwq_scatter_offset_p_u): Delete.
	* config/arm/iterators.md (MVE_VLD_ST_scatter): New.
	(MVE_scatter_offset): New.
	(MVE_elem_ch): Add entry for V2DI.
	(supf): Remove VSTRBQSO_S, VSTRBQSO_U, VSTRHQSO_S, VSTRHQSO_U,
	VSTRDQSO_S, VSTRDQSO_U, VSTRWQSO_U, VSTRWQSO_S.
	(VSTRBSOQ, VSTRHSOQ, VSTRDSOQ, VSTRWSOQ): Delete.
	* config/arm/mve.md (mve_vstrbq_scatter_offset_<supf><mode>):
	Delete.
	(mve_vstrbq_scatter_offset_<supf><mode>_insn): Delete.
	(mve_vstrbq_scatter_offset_p_<supf><mode>): Delete.
	(mve_vstrbq_scatter_offset_p_<supf><mode>_insn): Delete.
	(mve_vstrhq_scatter_offset_p_<supf><mode>): Delete.
	(mve_vstrhq_scatter_offset_p_<supf><mode>_insn): Delete.
	(mve_vstrhq_scatter_offset_<supf><mode>): Delete.
	(mve_vstrhq_scatter_offset_<supf><mode>_insn): Delete.
	(mve_vstrdq_scatter_offset_p_<supf>v2di): Delete.
	(mve_vstrdq_scatter_offset_p_<supf>v2di_insn): Delete.
	(mve_vstrdq_scatter_offset_<supf>v2di): Delete.
	(mve_vstrdq_scatter_offset_<supf>v2di_insn): Delete.
	(mve_vstrhq_scatter_offset_fv8hf): Delete.
	(mve_vstrhq_scatter_offset_fv8hf_insn): Delete.
	(mve_vstrhq_scatter_offset_p_fv8hf): Delete.
	(mve_vstrhq_scatter_offset_p_fv8hf_insn): Delete.
	(mve_vstrwq_scatter_offset_fv4sf): Delete.
	(mve_vstrwq_scatter_offset_fv4sf_insn): Delete.
	(mve_vstrwq_scatter_offset_p_fv4sf): Delete.
	(mve_vstrwq_scatter_offset_p_fv4sf_insn): Delete.
	(mve_vstrwq_scatter_offset_p_<supf>v4si): Delete.
	(mve_vstrwq_scatter_offset_p_<supf>v4si_insn): Delete.
	(mve_vstrwq_scatter_offset_<supf>v4si): Delete.
	(mve_vstrwq_scatter_offset_<supf>v4si_insn): Delete.
	(@mve_vstrq_scatter_offset_<mode>): New.
	(@mve_vstrq_scatter_offset_p_<mode>): New.
	(@mve_vstrq_truncate_scatter_offset_<mode>): New.
	(@mve_vstrq_truncate_scatter_offset_p_<mode>): New.
	* config/arm/unspecs.md (VSTRBQSO_S, VSTRBQSO_U, VSTRHQSO_S)
	(VSTRDQSO_S, VSTRDQSO_U, VSTRWQSO_S, VSTRWQSO_U, VSTRHQSO_F)
	(VSTRWQSO_F, VSTRHQSO_U): Delete.
	(VSTRQSO, VSTRQSO_P, VSTRQSO_TRUNC, VSTRQSO_TRUNC_P): New.
2024-12-13 14:23:30 +00:00
Christophe Lyon
bccbb696e5 arm: [MVE intrinsics] add store_scatter_offset shape
This patch adds the store_scatter_offset shape, implemented via a new
helper class (store_scatter) that later patches will also use.
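
Illustrative uses of the existing ACLE intrinsics this shape describes
(a usage sketch only; assumes an MVE-enabled target such as
-march=armv8.1-m.main+mve):

  #include <arm_mve.h>

  void
  f (int8_t *base, uint8x16_t offsets, int8x16_t data, mve_pred16_t p)
  {
    // Unpredicated and predicated scatter stores with a vector of
    // byte offsets from a scalar base pointer.
    vstrbq_scatter_offset (base, offsets, data);
    vstrbq_scatter_offset_p (base, offsets, data, p);
  }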

gcc/ChangeLog:

	* config/arm/arm-mve-builtins-shapes.cc (struct store_scatter): New.
	(struct store_scatter_offset_def): New.
	* config/arm/arm-mve-builtins-shapes.h (store_scatter_offset): New.
2024-12-13 14:23:29 +00:00
Christophe Lyon
8080760951 arm: [MVE intrinsics] add mode_after_pred helper in function_shape
This new helper returns true if the mode suffix goes after the
predicate suffix.  This is true in most cases, so the base
implementations in nonoverloaded_base and overloaded_base return true.
For instance: vaddq_m_n_s32.

This will be useful in later patches to implement
vstr?q_scatter_offset_p (_p appears after _offset).

gcc/ChangeLog:

	* config/arm/arm-mve-builtins-shapes.cc (struct
	nonoverloaded_base): Implement mode_after_pred.
	(struct overloaded_base): Likewise.
	* config/arm/arm-mve-builtins.cc (function_builder::get_name):
	Call mode_after_pred as needed.
	* config/arm/arm-mve-builtins.h (function_shape): Add
	mode_after_pred.
2024-12-13 14:23:29 +00:00
Tobias Burnus
46dd8acffe C++: reject OpenMP directives in constexpr functions
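A hypothetical example (not the new testcase) of the kind of code that
is now rejected:

  // error: OpenMP directives may not appear in constexpr functions
  constexpr int
  sum_to (int n)
  {
    int s = 0;
  #pragma omp parallel for reduction(+:s)
    for (int i = 0; i < n; i++)
      s += i;
    return s;
  }
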
gcc/cp/ChangeLog:

	* parser.cc (cp_parser_omp_construct, cp_parser_pragma): Reject
	OpenMP directives in constexpr functions.

gcc/testsuite/ChangeLog:

	* g++.dg/gomp/pr108607.C: Update dg-error.
	* g++.dg/gomp/pr79664.C: Update dg-error.
	* g++.dg/gomp/omp-constexpr.C: New test.
2024-12-13 14:27:08 +01:00
Robin Dapp
6dcfe87431 genrecog: Split into separate partitions [PR111600].
This patch makes genrecog split its output into separate files (10 by
default), in the same vein as genemit.  The changes are mostly
mechanical again, changing printf and puts calls to fprintf.
Because insn-recog.cc relies on being able to call other recog
functions, a header insn-recog.h is introduced that predeclares all of
them.

For simplicity the number of files is determined by (re-using)
--with-insnemit-partitions.

Bootstrapped and regtested on x86 and power10, regtested on riscv.
aarch64 bootstrap is currently blocked by the "maybe uninitialized"
issue discussed on IRC.

	PR target/111600

gcc/ChangeLog:

	* Makefile.in:  Add insn-recog split.
	* configure: Regenerate.
	* configure.ac: Document that the number of insnemit partitions is
	used for insn-recog as well.
	* genconditions.cc (write_one_condition): Use fprintf.
	* genpreds.cc (write_predicate_expr): Ditto.
	(write_init_reg_class_start_regs): Ditto.
	* genrecog.cc (write_header): Add header file to includes.
	(printf_indent): Use fprintf.
	(change_state): Ditto.
	(print_code): Ditto.
	(print_host_wide_int): Ditto.
	(print_parameter_value): Ditto.
	(print_test_rtx): Ditto.
	(print_nonbool_test): Ditto.
	(print_label_value): Ditto.
	(print_test): Ditto.
	(print_decision): Ditto.
	(print_state): Ditto.
	(print_subroutine_call): Ditto.
	(print_acceptance): Ditto.
	(print_subroutine_start): Ditto.
	(print_pattern): Ditto.
	(print_subroutine): Ditto.
	(print_subroutine_group): Ditto.
	(handle_arg): Add -O and -H for output and header file handling.
	(main): Use callback.
	* gentarget-def.cc (def_target_insn): Use fprintf.
	* read-md.cc (md_reader::print_c_condition): Ditto.
	* read-md.h (class md_reader): Ditto.
2024-12-13 14:20:13 +01:00
Jonathan Wakely
959a80a46d libstdc++: Fix uninitialized data in std::basic_spanbuf::seekoff
I noticed a -Wmaybe-uninitialized warning for this function, which turns
out to be correct. If the caller passes a valid std::ios_base::seekdir
value then there's no problem, but if they pass std::seekdir(999) then
we don't initialize the __base variable before adding it to __off.

Rather than initialize it to an arbitrary value, we should return an
error.

Also add [[unlikely]] attributes to the paths that return an error.
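
A minimal sketch of the idea with simplified names (not the actual
libstdc++ code; C++20 for [[unlikely]]):

  #include <ios>

  // Map a seekdir to a base offset; report failure for invalid values
  // instead of leaving the base uninitialized.
  bool
  seek_base (std::ios_base::seekdir way, std::streamoff cur,
             std::streamoff size, std::streamoff &base)
  {
    switch (way)
      {
      case std::ios_base::beg:
        base = 0;
        return true;
      case std::ios_base::cur:
        base = cur;
        return true;
      case std::ios_base::end:
        base = size;
        return true;
      default:                // e.g. std::ios_base::seekdir(999)
        [[unlikely]] return false;
      }
  }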

libstdc++-v3/ChangeLog:

	* include/std/spanstream (basic_spanbuf::seekoff): Return an
	error for invalid seekdir values.
2024-12-13 13:06:12 +00:00
Jonathan Wakely
233860f005 libstdc++: Swap expressions in noexcept-specifier of ranges::not_equal_to
Although this should never make a difference for sensible code, we
should really make the expression in the noexcept-specifier match the
expression in the function body.
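
A sketch of the shape of the change on a simplified function object (a
minimal illustration, not the libstdc++ source):

  #include <utility>

  struct not_equal_to_sketch
  {
    // The noexcept-specifier spells exactly the expression used in
    // the body, so the two can never diverge.
    template<typename T, typename U>
    constexpr bool
    operator() (T&& t, U&& u) const
      noexcept(noexcept(!(std::forward<T>(t) == std::forward<U>(u))))
    {
      return !(std::forward<T>(t) == std::forward<U>(u));
    }
  };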

libstdc++-v3/ChangeLog:

	* include/bits/ranges_cmp.h (not_equal_to): Make order of
	expressions in noexcept-specifier match the body.
	* testsuite/20_util/function_objects/range.cmp/not_equal_to.cc:
	Check noexcept.
2024-12-13 13:04:37 +00:00
Jonathan Wakely
ba1b6ed1c9 libstdc++: Fix -Wsign-compare warning in <regex>
libstdc++-v3/ChangeLog:

	* include/bits/regex.tcc: Fix -Wsign-compare warning.
2024-12-13 12:00:42 +00:00
Jonathan Wakely
55ed7c4443 libstdc++: Fix -Wreorder warning in <pstl/parallel_backend_tbb.h>
libstdc++-v3/ChangeLog:

	* include/pstl/parallel_backend_tbb.h (__merge_func): Fix order
	of mem-initializers.
2024-12-13 12:00:42 +00:00
Jonathan Wakely
29dbd301a2 libstdc++: Fix -Wmisleading-indentation warning in testcase
libstdc++-v3/ChangeLog:

	* testsuite/26_numerics/random/random_device/entropy.cc: Fix
	indentation to avoid -Wmisleading-indentation warning.
2024-12-13 12:00:42 +00:00
Tamar Christina
6a5a1b8175 AArch64: Set L1 data cache size according to size on CPUs
This sets the L1 data cache size for some cores based on the sizes
documented in their Technical Reference Manuals.

Today the port minimum is 256 bytes, as explained in commit
g:9a99559a478111f7fbeec29bd78344df7651c707; however, like Neoverse V2, most
cores actually define the L1 cache size as 64 bytes.  The generic Armv9-A
model was already changed in g:f000cb8cbc58b23a91c84d47d69481904981a1d9 and
this change follows suit for a few other cores based on their TRMs.

This results in less memory pressure when running on large core count machines.

gcc/ChangeLog:

	* config/aarch64/tuning_models/cortexx925.h: Set L1 cache size to 64b.
	* config/aarch64/tuning_models/neoverse512tvb.h: Likewise.
	* config/aarch64/tuning_models/neoversen1.h: Likewise.
	* config/aarch64/tuning_models/neoversen2.h: Likewise.
	* config/aarch64/tuning_models/neoversen3.h: Likewise.
	* config/aarch64/tuning_models/neoversev1.h: Likewise.
	* config/aarch64/tuning_models/neoversev2.h: Likewise.
	(neoversev2_prefetch_tune): Removed.
	* config/aarch64/tuning_models/neoversev3.h: Likewise.
	* config/aarch64/tuning_models/neoversev3ae.h: Likewise.
2024-12-13 11:20:18 +00:00
Tamar Christina
4a9427f75b AArch64: Add CMP+CSEL and CMP+CSET for cores that support it
GCC 15 added two new fusions CMP+CSEL and CMP+CSET.

This patch enables them for cores that support them, based on their
Software Optimization Guides, and generically on Armv9-A.  Even if a
core does not support them there is no negative performance impact.
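
Hypothetical functions whose compiled AArch64 code typically contains
the two fusible pairs (illustration only):

  int
  pick (int a, int b, int c, int d)
  {
    return a < b ? c : d;   // compiles to cmp + csel
  }

  int
  flag (int a, int b)
  {
    return a < b;           // compiles to cmp + cset
  }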

gcc/ChangeLog:

	* config/aarch64/aarch64-fusion-pairs.def (AARCH64_FUSE_NEOVERSE_BASE):
	New.
	* config/aarch64/tuning_models/neoverse512tvb.h: Use it.
	* config/aarch64/tuning_models/neoversen2.h: Use it.
	* config/aarch64/tuning_models/neoversen3.h: Use it.
	* config/aarch64/tuning_models/neoversev1.h: Use it.
	* config/aarch64/tuning_models/neoversev2.h: Use it.
	* config/aarch64/tuning_models/neoversev3.h: Use it.
	* config/aarch64/tuning_models/neoversev3ae.h: Use it.
	* config/aarch64/tuning_models/cortexx925.h: Add fusions.
	* config/aarch64/tuning_models/generic_armv9_a.h: Add fusions.
2024-12-13 11:17:55 +00:00
Jakub Jelinek
99b9dfaff6 i386: Add vec_fm{addsub,subadd}v2sf4 patterns [PR116979]
As mentioned in the PR, the addition of the vec_addsubv2sf3 expander
caused the testcase to be vectorized and to no longer use fma.
The following patch adds new expanders so that it can be vectorized
again with the alternating add/sub fma instructions.

There is some bug on the SLP cost computation side which causes it not
to count some scalar multiplication costs, but I think the patch is
desirable anyway before that is fixed, and for now the testcase just
uses -fvect-cost-model=unlimited.
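
A hypothetical kernel of the kind the new expanders target (the even
lane does a multiply-subtract, the odd lane a multiply-add, matching
the alternating fma semantics):

  void
  f (float *r, const float *a, const float *b, const float *c)
  {
    r[0] = a[0] * b[0] - c[0];   // even lane: fused multiply-subtract
    r[1] = a[1] * b[1] + c[1];   // odd lane: fused multiply-add
  }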

2024-12-13  Jakub Jelinek  <jakub@redhat.com>

	PR target/116979
	* config/i386/mmx.md (vec_fmaddsubv2sf4, vec_fmsubaddv2sf4): New
	define_expand patterns.

	* gcc.target/i386/pr116979.c: New test.
2024-12-13 10:32:57 +01:00
Robin Dapp
12a5ab1461 RISC-V: Improve slide1up pattern.
This patch adds a second variant to implement the extract/slide1up
pattern.  In order to do a permutation like
<3, 4, 5, 6> from vectors <0, 1, 2, 3> and <4, 5, 6, 7>
we currently extract <3> from the first vector and re-insert it into the
second vector.  Unless register-file crossing latency is essentially
zero, it should be preferable to first slide the second vector up by
one, then slide the first vector down by (nunits - 1).
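
The permutation in question, written with GCC's generic vector
extension (illustrative only):

  typedef int v4si __attribute__ ((vector_size (16)));

  v4si
  shift_in_one (v4si a, v4si b)
  {
    // With a = <0, 1, 2, 3> and b = <4, 5, 6, 7> this yields
    // <3, 4, 5, 6>, now expandable as a slide1up of b combined with
    // a slidedown of a.
    return __builtin_shufflevector (a, b, 3, 4, 5, 6);
  }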

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (riscv_register_move_cost):
	Export.
	* config/riscv/riscv-v.cc (shuffle_extract_and_slide1up_patterns):
	Rename...
	(shuffle_off_by_one_patterns): ... to this and add slideup/slidedown
	variant.
	(expand_vec_perm_const_1): Call renamed function.
	* config/riscv/riscv.cc (riscv_secondary_memory_needed): Remove
	static.
	(riscv_register_move_cost): Add VR<->GR/FR handling.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/pr112599-2.c: Adjust test
	expectation.
2024-12-13 10:12:46 +01:00
Robin Dapp
528567a7b1 RISC-V: Add even/odd vec_perm_const pattern.
This adds handling for even/odd patterns.
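
Illustrative even and odd permutations, written with GCC's generic
vector extension:

  typedef int v4si __attribute__ ((vector_size (16)));

  v4si
  even (v4si a, v4si b)
  {
    return __builtin_shufflevector (a, b, 0, 2, 4, 6);
  }

  v4si
  odd (v4si a, v4si b)
  {
    return __builtin_shufflevector (a, b, 1, 3, 5, 7);
  }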

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (shuffle_even_odd_patterns): New
	function.
	(expand_vec_perm_const_1): Use new function.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-evenodd-run.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-evenodd.c: New test.
2024-12-13 10:12:40 +01:00
Robin Dapp
cff3050a4f RISC-V: Add interleave pattern.
This patch adds efficient handling of interleaving patterns like
[0 4 1 5] to vec_perm_const.  It is implemented by a slideup and a
gather.
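
An illustrative interleave of the low elements of two vectors,
matching the [0 4 1 5] pattern above:

  typedef int v4si __attribute__ ((vector_size (16)));

  v4si
  interleave_lo (v4si a, v4si b)
  {
    return __builtin_shufflevector (a, b, 0, 4, 1, 5);
  }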

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (shuffle_interleave_patterns): New
	function.
	(expand_vec_perm_const_1): Use new function.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-interleave-run.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-interleave.c: New test.
2024-12-13 10:12:27 +01:00
Robin Dapp
71bfc8c33e RISC-V: Add slide to perm_const strategies.
This patch adds a shuffle_slide_patterns to expand_vec_perm_const.
It recognizes permutations like

  {0, 1, 4, 5}
or
  {2, 3, 6, 7}

which can be constructed by a slideup or slidedown of one of the vectors
into the other one.
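
Illustrative versions of the two permutations, written with GCC's
generic vector extension:

  typedef int v4si __attribute__ ((vector_size (16)));

  v4si
  low_halves (v4si a, v4si b)   // {0, 1, 4, 5}: b slid up into a
  {
    return __builtin_shufflevector (a, b, 0, 1, 4, 5);
  }

  v4si
  high_halves (v4si a, v4si b)  // {2, 3, 6, 7}: a slid down into b
  {
    return __builtin_shufflevector (a, b, 2, 3, 6, 7);
  }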

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (shuffle_slide_patterns): New.
	(expand_vec_perm_const_1): Call new function.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-slide-run.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-slide.c: New test.
2024-12-13 10:12:16 +01:00
Robin Dapp
cfdab86f20 RISC-V: Emit vector shift pattern for const_vector [PR117353].
In PR117353 and PR117878 we expand a const vector during reload.  For
this we use an unpredicated left shift.  Normally an insn like this is
split, but as we introduce it late and cannot create pseudos anymore,
it remains unpredicated and is not recognized by the vsetvl pass (where
we expect all insns to be in predicated RVV format).

This patch directly emits a predicated shift instead.  We could
distinguish between !lra_in_progress and lra_in_progress and emit
an unpredicated shift in the former case, but we're not very likely
to optimize it anyway, so it doesn't seem worth it.
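
An illustrative stepped constant of the kind expanded this way: a
vector like {0, 2, 4, 6} can be materialized as vid.v followed by a
left shift by 1, and that shift must now be emitted in predicated RVV
form when created during reload (a sketch, assuming such a constant
reaches expand_const_vector):

  typedef int v4si __attribute__ ((vector_size (16)));

  v4si
  stepped (void)
  {
    v4si v = {0, 2, 4, 6};
    return v;
  }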

	PR target/117353
	PR target/117878

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (expand_const_vector): Use predicated
	instead of simple shift.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/pr117353.c: New test.
2024-12-13 10:08:11 +01:00