It can yield an incorrect layout when there is a partial representation
clause on a discriminated record type with a variant part.
gcc/ada/
* gcc-interface/decl.c (components_to_record): If the first component
with rep clause is the _Parent field with variable size, temporarily
set it aside when computing the internal layout of the REP part again.
* gcc-interface/utils.c (finish_record_type): Revert to taking the
maximum when merging sizes for all record types with rep clause.
(merge_sizes): Put SPECIAL parameter last and adjust recursive calls.
This polishes a few rough edges visible in LTO mode.
gcc/ada/
* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Array_Type>: Make the
two fields of the fat pointer type addressable, and do not make the
template type read-only.
<E_Record_Type>: If the type has discriminants mark it as may_alias.
* gcc-interface/utils.c (make_dummy_type): Likewise.
(build_dummy_unc_pointer_types): Likewise.
gcc/fortran/ChangeLog:
PR fortran/98913
* dependency.c (gfc_dep_resolver): Treat local access
to coarrays like any array access in dependency analysis.
gcc/testsuite/ChangeLog:
PR fortran/98913
* gfortran.dg/coarray/array_temporary.f90: New test.
This fixes more memory leaks as discovered by building 521.wrf_r.
2021-02-03 Richard Biener <rguenther@suse.de>
* lto-streamer.c (lto_get_section_name): Free temporary
buffer.
* tree-loop-distribution.c
(loop_distribution::merge_dep_scc_partitions): Free edge data.
As the testcase shows, RTL ifcvt can throw random RTL (whatever it found in
some insns) at expand_binop or expand_unop and expects it to do something
(and then will check if it created valid insns and punts if not).
In the end, if the operands don't match, these functions try to
copy_to_mode_reg the operands, which does
if (!general_operand (x, VOIDmode))
x = force_operand (x, temp);
but force_operand is far from handling all possible RTLs; it will ICE for
all the more unusual RTL codes. It basically handles just simple arithmetic
and unary RTL operations if they have an optab, and
expand_simple_binop/expand_simple_unop ICE on others.
The following patch fixes it by adding some operand verification (whether
there is a hope that copy_to_mode_reg will succeed on those). It is added
both to noce_emit_move_insn (not needed for this exact testcase,
that function simply tries to recog the insn as is and if it fails,
handles some simple binop/unop cases; the patch performs the verification
of their operands) and noce_try_sign_mask.
2021-02-03 Jakub Jelinek <jakub@redhat.com>
PR middle-end/97487
* ifcvt.c (noce_can_force_operand): New function.
(noce_emit_move_insn): Use it.
(noce_try_sign_mask): Likewise. Formatting fix.
* gcc.dg/pr97487-1.c: New test.
* gcc.dg/pr97487-2.c: New test.
The following testcase is ice-on-invalid: it can't be reloaded, but we
shouldn't ICE the compiler because the user typed nonsense.
In current_insn_transform we have:
if (process_alt_operands (reused_alternative_num))
alt_p = true;
if (check_only_p)
return ! alt_p || best_losers != 0;
/* If insn is commutative (it's safe to exchange a certain pair of
operands) then we need to try each alternative twice, the second
time matching those two operands as if we had exchanged them. To
do this, really exchange them in operands.
If we have just tried the alternatives the second time, return
operands to normal and drop through. */
if (reused_alternative_num < 0 && commutative >= 0)
{
curr_swapped = !curr_swapped;
if (curr_swapped)
{
swap_operands (commutative);
goto try_swapped;
}
else
swap_operands (commutative);
}
if (! alt_p && ! sec_mem_p)
{
/* No alternative works with reloads?? */
if (INSN_CODE (curr_insn) >= 0)
fatal_insn ("unable to generate reloads for:", curr_insn);
error_for_asm (curr_insn,
"inconsistent operand constraints in an %<asm%>");
lra_asm_error_p = true;
...
and so handle inline asms there differently (and delete/nullify them after
this) - fatal_insn is only called for non-inline asm.
But in process_alt_operands we do:
/* Both the earlyclobber operand and conflicting operand
cannot both be user defined hard registers. */
if (HARD_REGISTER_P (operand_reg[i])
&& REG_USERVAR_P (operand_reg[i])
&& operand_reg[j] != NULL_RTX
&& HARD_REGISTER_P (operand_reg[j])
&& REG_USERVAR_P (operand_reg[j]))
fatal_insn ("unable to generate reloads for "
"impossible constraints:", curr_insn);
and thus ICE even for inline-asms.
I think it is inappropriate to delete/nullify the insn in
process_alt_operands, as it could be done e.g. in the check_only_p mode,
so this patch just returns false in that case, which results in the
caller having alt_p false, and as inline asm isn't a simple move, sec_mem_p
will also be false (and it isn't commutative either), so for check_only_p
it will suggest to the callers that it isn't ok, and otherwise it will emit
an error and delete/nullify the inline asm insn.
2021-02-03 Jakub Jelinek <jakub@redhat.com>
PR middle-end/97971
* lra-constraints.c (process_alt_operands): For inline asm, don't call
fatal_insn, but instead return false.
* gcc.target/i386/pr97971.c: New test.
On Tue, Feb 02, 2021 at 02:23:55PM +0100, Richard Biener wrote:
> All I say is that the x86 target
> should either not advertise V1DF shifts or advertise the basic
> ops that reasonable simplification would expect to exist.
The backend has several V1?Imode shifts, but optab only for those V1DImode
ones:
grep '[la]sh[lr]v1[qhsdtox]' tmp-mddump.md
(define_insn ("mmx_ashlv1di3")
(define_insn ("mmx_lshrv1di3")
(define_insn ("avx512bw_ashlv1ti3")
(define_insn ("avx512bw_lshrv1ti3")
(define_insn ("sse2_ashlv1ti3")
(define_insn ("sse2_lshrv1ti3")
(define_expand ("ashlv1di3")
(define_expand ("lshrv1di3")
emit_insn (gen_sse2_lshrv1ti3 (tmp, gen_lowpart (V1TImode, operands[1]),
I think it has been introduced with
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89021#c13
Before we didn't have any V1DImode expanders (except mov/movmisalign, but
those are needed and are supplied for other V1??mode modes too).
This patch just removes the two V1DImode shift expanders with standard names.
2021-02-03 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/98287
* config/i386/mmx.md (<insn><mode>3): For shifts don't enable expander
for V1DImode.
* gcc.dg/pr98287.c: New test.
Previously the SLP pattern matcher was using STMT_VINFO_SLP_VECT_ONLY as a way
to dissolve the SLP only patterns during SLP cancellation. However it seems
like the semantics for STMT_VINFO_SLP_VECT_ONLY are slightly different than what
I expected.
Namely, the non-SLP path can still use a statement marked
STMT_VINFO_SLP_VECT_ONLY. One such example is masked loads, which are used
both in the SLP and non-SLP paths.
To fix this I now introduce a new flag STMT_VINFO_SLP_VECT_ONLY_PATTERN which is
used only by the pattern matcher.
gcc/ChangeLog:
PR tree-optimization/98928
* tree-vect-loop.c (vect_analyze_loop_2): Change
STMT_VINFO_SLP_VECT_ONLY to STMT_VINFO_SLP_VECT_ONLY_PATTERN.
* tree-vect-slp-patterns.c (complex_pattern::build): Likewise.
* tree-vectorizer.h (STMT_VINFO_SLP_VECT_ONLY_PATTERN): New.
(class _stmt_vec_info): Add slp_vect_pattern_only_p.
gcc/testsuite/ChangeLog:
PR tree-optimization/98928
* gcc.target/i386/pr98928.c: New test.
My patch for 96199 had us re-substitute the parameter types of a constructor
in order to rewrite mentions of members into dependent references. We need
to do that for member functions, too.
gcc/cp/ChangeLog:
PR c++/98929
PR c++/96199
* error.c (dump_expr): Ignore dummy object.
* pt.c (tsubst_baselink): Handle dependent scope.
gcc/testsuite/ChangeLog:
PR c++/98929
* g++.dg/cpp1z/class-deduction-decltype1.C: New test.
Another very simple move from inline asm to builtins.
Only two intrinsics this time.
gcc/ChangeLog:
* config/aarch64/aarch64-simd-builtins.def (ursqrte): Define builtin.
* config/aarch64/aarch64-simd.md (aarch64_ursqrte<mode>): New pattern.
* config/aarch64/arm_neon.h (vrsqrte_u32): Reimplement using builtin.
(vrsqrteq_u32): Likewise.
Another transition from inline asm to builtin.
Only 3 intrinsics are converted this time, but they use the "+w" constraint in their inline asm,
so they are more likely to generate redundant moves and benefit more from reimplementation.
gcc/ChangeLog:
* config/aarch64/aarch64-simd-builtins.def (sqxtun2): Define builtin.
* config/aarch64/aarch64-simd.md (aarch64_sqxtun2<mode>_le): Define.
(aarch64_sqxtun2<mode>_be): Likewise.
(aarch64_sqxtun2<mode>): Likewise.
* config/aarch64/arm_neon.h (vqmovun_high_s16): Reimplement using builtin.
(vqmovun_high_s32): Likewise.
(vqmovun_high_s64): Likewise.
* config/aarch64/iterators.md (UNSPEC_SQXTUN2): Define.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/narrow_high-intrinsics.c: Adjust sqxtun2 scan.
This patch updates the flags for the bfloat16 builtins.
The bfdot ones aren't affected by the FPCR/FPSR so can be AUTO_FP
whereas the bfmlal ones follow the normal floating-point instructions and get FP.
gcc/ChangeLog:
* config/aarch64/aarch64-simd-builtins.def (bfdot_lane, bfdot_laneq): Use
AUTO_FP flags.
(bfmlalb_lane, bfmlalt_lane, bfmlalb_lane_q, bfmlalt_lane_q): Use FP flags.
We already have a STORE flag that we use for builtins. This patch introduces a LOAD set
that uses AUTO_FP and FLAG_READ_MEMORY. This allows for more aggressive optimisation of the load
intrinsics.
Turns out we have a great many testcases that do:
float16x4x2_t
f_vld2_lane_f16 (float16_t * p, float16x4x2_t v)
{
float16x4x2_t res;
/* { dg-error "lane 4 out of range 0 - 3" "" { target *-*-* } 0 } */
res = vld2_lane_f16 (p, v, 4);
/* { dg-error "lane -1 out of range 0 - 3" "" { target *-*-* } 0 } */
res = vld2_lane_f16 (p, v, -1);
return res;
}
but since the first res is unused it now gets eliminated early on before we get to give an error
message. Ideally we'd like to warn for both.
This patch takes the conservative approach and doesn't convert the load-lane builtins to LOAD;
that's something we can improve later.
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.c (FLAG_LOAD): Define.
* config/aarch64/aarch64-simd-builtins.def (ld1x2, ld2, ld3, ld4, ld2r,
ld3r, ld4r, ld1, ld1x3, ld1x4): Use LOAD flags.
This patch relaxes the flags for some builtins to AUTO_FP. These
builtins do permutes and similar, so they shouldn't get the FP flags
when operating on floating-point modes as they don't care about
FPCR/FPSR and exceptions.
gcc/ChangeLog:
* config/aarch64/aarch64-simd-builtins.def (combine, zip1, zip2,
uzp1, uzp2, trn1, trn2, simd_bsl): Use AUTO_FP flags.
This patch relaxes the flags for most integer builtins to NONE as they don't read/write memory
and don't care about the FPCR/FPSR or exceptions so we should be more aggressive with them.
This leads to fallout in a testcase where the result of an intrinsic was unused and it is now
DCE'd. The testcase is adjusted.
gcc/ChangeLog:
* config/aarch64/aarch64-simd-builtins.def (clrsb, clz, ctz, popcount,
vec_smult_lane_, vec_smlal_lane_, vec_smult_laneq_, vec_smlal_laneq_,
vec_umult_lane_, vec_umlal_lane_, vec_umult_laneq_, vec_umlal_laneq_,
ashl, sshl, ushl, srshl, urshl, sdot_lane, udot_lane, sdot_laneq,
udot_laneq, usdot_lane, usdot_laneq, sudot_lane, sudot_laneq, ashr,
ashr_simd, lshr, lshr_simd, srshr_n, urshr_n, ssra_n, usra_n, srsra_n,
ursra_n, sshll_n, ushll_n, sshll2_n, ushll2_n, ssri_n, usri_n, ssli_n,
ssli_n, usli_n, bswap, rbit, simd_bsl, eor3q, rax1q, xarq, bcaxq): Use
NONE builtin flags.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/arg-type-diagnostics-1.c: Return result from foo.
As discussed in the PR, the reduction code isn't able to cope with type
promotions/demotions in the reduction computation, so if we recognize an
over-widening pattern that has vect_reduction_def type, we most likely make
it non-vectorizable.
2021-02-02 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/98848
* tree-vect-patterns.c (vect_recog_over_widening_pattern): Punt if
STMT_VINFO_DEF_TYPE (last_stmt_info) is vect_reduction_def.
* gcc.dg/vect/pr98848.c: New test.
* gcc.dg/vect/pr92205.c: Remove xfail.
This testcase has been fixed by
r11-5904-g4cf70c20cb10acd6fb1016611d05540728176b60
so I'm checking it in so that we can close the PR.
2021-02-02 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/97960
* g++.dg/torture/pr97960.C: New test.
- Check that the mode of `from` is not BLKmode before calling store_expr;
calling store_expr with BLKmode will cause an ICE.
- Verified with riscv64, x86_64 and aarch64; no new regressions introduced.
Note: This logic was introduced by 3e60ddeb82,
so I cc Jakub for review.
Changes for V2:
- Checking mode of `from` rather than mode of `to`.
- Verified on riscv64, x86_64 and aarch64 again.
gcc/ChangeLog:
PR target/98743
* expr.c: Check mode before calling store_expr.
gcc/testsuite/ChangeLog:
PR target/98743
* g++.dg/opt/pr98743.C: New.
This patch enables MVE vornq instructions for auto-vectorization. MVE
vornq insns in mve.md are modified to use ior instead of unspec
expression.
2021-02-01 Christophe Lyon <christophe.lyon@linaro.org>
gcc/
* config/arm/iterators.md (supf): Remove VORNQ_S and VORNQ_U.
(VORNQ): Remove.
* config/arm/mve.md (mve_vornq_s<mode>): New entry for vorn
instruction using expression ior.
(mve_vornq_u<mode>): New expander.
(mve_vornq_f<mode>): Use ior code instead of unspec.
* config/arm/unspecs.md (VORNQ_S, VORNQ_U, VORNQ_F): Remove.
gcc/testsuite/
* gcc.target/arm/simd/mve-vorn.c: Add vorn tests.
Ada makes extensive use of nested functions, which turn all automatic
variables of the enclosing function that are used in nested ones into
members of an artificial FRAME record type.
The address of a local variable is usually passed to asan marking
functions without using a temporary. asan_expand_mark_ifn will reject
an ADDR_EXPR if it's split out from the call into an SSA_NAME.
Taking the address of a member of FRAME within a nested function was
not regarded as a gimple val: while introducing FRAME variables,
current_function_decl pointed to the outermost function, even while
processing a nested function, so decl_address_invariant_p, checking
that the context of the variable is current_function_decl, returned
false for such ADDR_EXPRs.
decl_address_invariant_p, called when determining whether an
expression is a legitimate gimple value, compares the context of
automatic variables with current_function_decl. Some of the
tree-nested function processing doesn't set current_function_decl, but
ADDR_EXPR-processing bits temporarily override it. However, they
restore it before re-gimplifying, which causes even ADDR_EXPRs
referencing automatic variables in the FRAME struct of a nested
function to not be regarded as address-invariant.
This patch moves the restores of current_function_decl in the
ADDR_EXPR-handling bits after the re-gimplification, so that the
correct current_function_decl is used when testing for address
invariance.
for gcc/ChangeLog
* tree-nested.c (convert_nonlocal_reference_op): Move
current_function_decl restore after re-gimplification.
(convert_local_reference_op): Likewise.
for gcc/testsuite/ChangeLog
* gcc.dg/asan/nested-1.c: New.
PR analyzer/93355 tracks that -fanalyzer fails to report the FILE *
leak in read_alias_file in intl/localealias.c.
One reason for the failure is that read_alias_file is marked as
"static", and the path leading to the single call of
read_alias_file is falsely rejected as infeasible due to
PR analyzer/96374. I have been attempting to fix that bug, but
don't have a good solution yet.
Previously, -fanalyzer only directly explored "static" functions
if they were needed for call summaries, instead forcing them to
be indirectly explored, but if we have a feasibility bug like
above, we will fail to report any issues in a function that's
only called by such a falsely infeasible path.
It now seems wrong to me to reject directly exploring static
functions: even if there is currently no way to call a function,
it seems reasonable to warn about bugs within them, since
otherwise these latent bugs are a timebomb in the code.
Hence this patch reworks toplevel_function_p to directly explore
almost all functions, working around these feasibility issues.
It introduces a naming convention that "__analyzer_"-prefixed
function names don't get directly explored, since this is
useful in the analyzer's DejaGnu-based tests.
This workaround gets PR analyzer/93355 closer to working, but
unfortunately there is a second instance of PR analyzer/96374
within read_alias_file itself which means even with this patch
-fanalyzer falsely rejects the path as infeasible.
Still, this ought to help in other cases, and simplifies the
implementation.
gcc/analyzer/ChangeLog:
PR analyzer/93355
PR analyzer/96374
* engine.cc (toplevel_function_p): Simplify so that
we only reject functions with a "__analyzer_" prefix.
(add_any_callbacks): Delete.
(exploded_graph::build_initial_worklist): Update for
dropped param of toplevel_function_p.
(exploded_graph::build_initial_worklist): Don't bother
looking for callbacks that are reachable from global
initializers.
gcc/testsuite/ChangeLog:
PR analyzer/93355
PR analyzer/96374
* gcc.dg/analyzer/conditionals-3.c: Add "__analyzer_"
prefix to support subroutines where necessary.
* gcc.dg/analyzer/data-model-1.c: Likewise.
* gcc.dg/analyzer/feasibility-1.c (called_by_test_6a): New.
(test_6a): New.
* gcc.dg/analyzer/params.c: Add "__analyzer_" prefix to support
subroutines where necessary.
* gcc.dg/analyzer/pr96651-2.c: Likewise.
* gcc.dg/analyzer/signal-4b.c: Likewise.
* gcc.dg/analyzer/single-field.c: Likewise.
* gcc.dg/analyzer/torture/conditionals-2.c: Likewise.
This patch adds a couple more reduced test cases derived from the
integration test for PR analyzer/93355. In both cases, the analyzer
falsely rejects the buggy code paths as being infeasible due to
PR analyzer/96374, and so the tests are marked as XFAIL for now.
gcc/testsuite/ChangeLog:
PR analyzer/93355
PR analyzer/96374
* gcc.dg/analyzer/pr93355-localealias-feasibility-2.c: New test.
* gcc.dg/analyzer/pr93355-localealias-feasibility-3.c: New test.
This adds a special formatter to OutBuffer to handle formatted printing
of integers, a common case. The replacement is faster and safer.
In dmangle.c, it also gets rid of a number of problematic casts, as seen
on powerpc64 targets.
Reviewed-on: https://github.com/dlang/dmd/pull/12174
gcc/d/ChangeLog:
PR d/98921
* dmd/MERGE: Merge upstream dmd 5e2a81d9c.
This patch moves the vrshrn* intrinsics to builtins away from inline
asm.
It's a bit of code, but it's very similar to the recent vshrn*
reimplementation except that we use an unspec rather than standard RTL
codes for the functionality.
gcc/ChangeLog:
* config/aarch64/aarch64-simd-builtins.def (rshrn, rshrn2):
Define builtins.
* config/aarch64/aarch64-simd.md (aarch64_rshrn<mode>_insn_le):
Define.
(aarch64_rshrn<mode>_insn_be): Likewise.
(aarch64_rshrn<mode>): Likewise.
(aarch64_rshrn2<mode>_insn_le): Likewise.
(aarch64_rshrn2<mode>_insn_be): Likewise.
(aarch64_rshrn2<mode>): Likewise.
* config/aarch64/aarch64.md (unspec): Add UNSPEC_RSHRN.
* config/aarch64/arm_neon.h (vrshrn_high_n_s16): Reimplement
using builtin.
(vrshrn_high_n_s32): Likewise.
(vrshrn_high_n_s64): Likewise.
(vrshrn_high_n_u16): Likewise.
(vrshrn_high_n_u32): Likewise.
(vrshrn_high_n_u64): Likewise.
(vrshrn_n_s16): Likewise.
(vrshrn_n_s32): Likewise.
(vrshrn_n_s64): Likewise.
(vrshrn_n_u16): Likewise.
(vrshrn_n_u32): Likewise.
(vrshrn_n_u64): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/narrow_high-intrinsics.c: Adjust rshrn2
assembly scan.
PR analyzer/98918 reports various false positives and state explosions
on correct code that frees nodes and other pointers in a singly-linked
list.
The issue is that state-merger in the loop leads to UNKNOWN_VALUEs,
and these are then erroneously used to form compound symbolic values
and regions, such as:
INIT_VAL((*UNKNOWN(struct marker *)).ref)
and:
(*INIT_VAL((*UNKNOWN(struct marker * *))))
The malloc state machine then treats these symbolic values as
identifying specific pointers, and thus e.g. erroneously reports a
double-free when
INIT_VAL((*UNKNOWN(struct marker *)).ref)
is freed twice (on subsequent iterations of the loop).
Similarly, the increasingly complex compound symbolic values have
sm-state which prevents state merging, and eventually lead to the
analysis hitting safety limits and stopping.
This patch makes various compound values involving UNKNOWN be
themselves UNKNOWN, resolving both the false positives and the state
explosions.
gcc/analyzer/ChangeLog:
PR analyzer/98918
* region-model-manager.cc
(region_model_manager::get_or_create_initial_value):
Fold the initial value of *UNKNOWN_PTR to an UNKNOWN value.
(region_model_manager::get_field_region): Fold the value
of UNKNOWN_PTR->FIELD to *UNKNOWN_PTR_OF_&FIELD_TYPE.
gcc/testsuite/ChangeLog:
PR analyzer/98918
* gcc.dg/analyzer/pr98918.c: New test.
N3644 implies that operator- can be used on value-init iterators. We now return
0 if both iterators are value initialized. If only one is value initialized we
keep the UB by returning the result of a normal computation which is a meaningless
value.
libstdc++-v3/ChangeLog:
PR libstdc++/70303
* include/bits/stl_deque.h (std::deque<>::operator-(iterator, iterator)):
Return 0 if both iterators are value-initialized.
* testsuite/23_containers/deque/70303.cc: New test.
* testsuite/23_containers/vector/70303.cc: New test.
Before the change, RVO gimple statements were treated as local
stores by modref analysis. But in practice the RVO target escapes.
2021-02-01 Sergei Trofimovich <siarheit@google.com>
gcc/ChangeLog:
PR tree-optimization/98499
* ipa-modref.c (analyze_ssa_name_flags): Treat RVO
conservatively and assume all possible side-effects.
gcc/testsuite/ChangeLog:
PR tree-optimization/98499
* g++.dg/pr98499.C: New test.
The vmovl_high_* intrinsics map down to the SXTL2/UXTL2 instructions
that already have appropriately-named patterns and expanders,
so it's straightforward to wire them up.
gcc/ChangeLog:
* config/aarch64/aarch64-simd-builtins.def (vec_unpacks_hi,
vec_unpacku_hi_): Define builtins.
* config/aarch64/arm_neon.h (vmovl_high_s8): Reimplement using
builtin.
(vmovl_high_s16): Likewise.
(vmovl_high_s32): Likewise.
(vmovl_high_u8): Likewise.
(vmovl_high_u16): Likewise.
(vmovl_high_u32): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/simd/vmovl_high_1.c: New test.
This fixes accounting issues with using auto_vec and auto_bitmap
for -fmem-report.
2021-02-01 Richard Biener <rguenther@suse.de>
* vec.h (auto_vec::auto_vec): Add memory stat parameters
and pass them on.
* bitmap.h (auto_bitmap::auto_bitmap): Likewise.
In this testcase we're crashing during constexpr evaluation of the
ARRAY_REF b[0] as part of evaluation of the lambda's by-copy capture of b
(which is encoded as a VEC_INIT_EXPR<b>). Since A's constexpr default
constructor is not yet defined, b's initialization is not actually
constant, but because A is an empty type, evaluation of b from
cxx_eval_array_ref is successful and yields an empty CONSTRUCTOR.
And since this CONSTRUCTOR is empty, we {}-initialize the desired array
element, and end up crashing from verify_ctor_sanity during evaluation
of this initializer because we updated new_ctx.ctor without updating
new_ctx.object: the former now has type A[3] and the latter is still the
target of a TARGET_EXPR for b[0][0] created from cxx_eval_vec_init
(and so has type A).
This patch fixes this by setting new_ctx.object appropriately at the
same time that we set new_ctx.ctor from cxx_eval_array_reference.
gcc/cp/ChangeLog:
PR c++/98295
* constexpr.c (cxx_eval_array_reference): Also set
new_ctx.object when setting new_ctx.ctor.
gcc/testsuite/ChangeLog:
PR c++/98295
* g++.dg/cpp0x/constexpr-98295.C: New test.
__builtin_has_attribute doesn't work in templates yet (bug 92104), so
in r11-471 I added a sorry. But that only caught type-dependent
expressions and we also want to sorry on value-dependent expressions.
This patch uses uses_template_parms, but guarded with p_t_d, because
u_t_p sets p_t_d and then v_d_e_p considers variables with reference
types value-dependent, which breaks builtin-has-attribute-6.c.
This is a regression and I also plan to apply this to gcc-10.
gcc/cp/ChangeLog:
PR c++/98355
* parser.c (cp_parser_has_attribute_expression): Use
uses_template_parms instead of type_dependent_expression_p.
gcc/testsuite/ChangeLog:
PR c++/98355
* g++.dg/ext/builtin-has-attribute2.C: New test.
template_args_equal has handled dependent alias specializations for a while,
but in this testcase the actual template argument is a SCOPE_REF, so we
called cp_tree_equal, which doesn't handle aliases specially when we get to
them.
This patch generalizes this by setting a flag so structural_comptypes will
check for template alias equivalence (if we aren't doing partial ordering).
The existing flag, comparing_specializations, was too broad; in particular,
when we're doing decls_match, we want to treat corresponding parameters as
equivalent, so we need to separate that from alias comparison. So I
introduce the comparing_dependent_aliases flag.
From looking at other uses of comparing_specializations, it seems to me that
the new flag is what modules wants, as well.
The other use of comparing_specializations in structural_comptypes is a hack
to deal with spec_hasher::equal not calling push_to_top_level, which we
also don't want to tie to the alias comparison semantics.
This patch also changes how we get to structural comparison of aliases from
checking TYPE_CANONICAL in comptypes to marking the aliases as getting
structural comparison when they are built, which is more consistent with how
e.g. typename is handled.
As I mention in the comment for comparing_dependent_aliases, I think the
default should be to treat different dependent aliases for the same type as
distinct, only treating them as equal during deduction (particularly partial
ordering). But that's a matter for the C++ committee, to try in stage 1.
gcc/cp/ChangeLog:
PR c++/98570
* cp-tree.h: Declare it.
* pt.c (comparing_dependent_aliases): New flag.
(template_args_equal, spec_hasher::equal): Set it.
(dependent_alias_template_spec_p): Assert that we don't
get non-types other than error_mark_node.
(instantiate_alias_template): SET_TYPE_STRUCTURAL_EQUALITY
on complex alias specializations. Set TYPE_DEPENDENT_P here.
(tsubst_decl): Not here.
* module.cc (module_state::read_cluster): Set
comparing_dependent_aliases instead of
comparing_specializations.
* tree.c (cp_tree_equal): Remove comparing_specializations
module handling.
* typeck.c (structural_comptypes): Adjust.
(comptypes): Remove comparing_specializations handling.
gcc/testsuite/ChangeLog:
PR c++/98570
* g++.dg/cpp0x/alias-decl-targ1.C: New test.
Add tests for vmlal_high_* and vmlsl_high_* Neon intrinsics. Since
these intrinsics are only supported for AArch64, these tests are
restricted to only run on AArch64 targets.
gcc/testsuite/ChangeLog:
2021-01-31 Jonathan Wright <jonathan.wright@arm.com>
* gcc.target/aarch64/advsimd-intrinsics/vmlXl_high.inc:
New test template.
* gcc.target/aarch64/advsimd-intrinsics/vmlXl_high_lane.inc:
New test template.
* gcc.target/aarch64/advsimd-intrinsics/vmlXl_high_laneq.inc:
New test template.
* gcc.target/aarch64/advsimd-intrinsics/vmlXl_high_n.inc:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmlal_high.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmlal_high_lane.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmlal_high_laneq.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmlal_high_n.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmlsl_high.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmlsl_high_lane.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmlsl_high_laneq.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmlsl_high_n.c:
New test.
Add tests for vmull_high_* Neon intrinsics. Since these intrinsics
are only supported for AArch64, these tests are restricted to only
run on AArch64 targets.
gcc/testsuite/ChangeLog:
2021-01-29 Jonathan Wright <jonathan.wright@arm.com>
* gcc.target/aarch64/advsimd-intrinsics/vmull_high.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmull_high_lane.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmull_high_laneq.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmull_high_n.c:
New test.
g:87301e3956d44ad45e384a8eb16c79029d20213a and
g:ee4c4fe289e768d3c6b6651c8bfa3fdf458934f4 changed the intrinsics to be
proper RTL but accidentally ended up creating a regression because of the
ordering in the RTL pattern.
The existing RTL that combine should try to match to remove the vec_dup is
aarch64_vec_<su>mlal_lane<Qlane> and aarch64_vec_<su>mult_lane<Qlane> which
expects the select register to be the second operand of mult.
The pattern introduced has it as the first operand so combine was unable to
remove the vec_dup. This flips the order such that the patterns optimize
correctly.
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (aarch64_<su>mlal_n<mode>,
aarch64_<su>mlsl<mode>, aarch64_<su>mlsl_n<mode>): Flip mult operands.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/advsimd-intrinsics/smlal-smlsl-mull-optimized.c: New test.
This sets DF_RD_PRUNE_DEAD_DEFS like all other uses of the UD/DU
chain problems which makes the RD problem consume a lot less memory.
2021-02-01 Richard Biener <rguenther@suse.de>
PR rtl-optimization/98863
* config/i386/i386-features.c (convert_scalars_to_vector):
Set DF_RD_PRUNE_DEAD_DEFS.
BE ilp32 Linux generates extra stack stwu instructions which shouldn't
be counted. \m … \M is needed around each instruction, not just at the
beginning and end of the entire pattern.
gcc/testsuite/ChangeLog:
2021-02-01 Xionghu Luo <luoxhu@linux.ibm.com>
* gcc.target/powerpc/pr79251.p8.c: Update store count regex.
* gcc.target/powerpc/pr79251.p9.c: Likewise.
If the stdint.h system file follows the ISO C99 specification, it might
not define SIZE_MAX in C++ by default, so provide a local fallback.
gcc/
* system.h (SIZE_MAX): Define if not already defined.
A number of ELF-specific tests were introduced in r11-6140, one
of which fails on all Mach-O/Darwin platforms.
On examination, the tests have no meaningful parallel for Mach-O
which dead strips at the symbol level, and does not make use of
function sections (the fact that a used and an unused symbol are
placed in the same section will not affect dead stripping).
Given that the tests do not demonstrate anything useful on Darwin,
skip them.
gcc/testsuite/ChangeLog:
* c-c++-common/attr-used-5.c: Skip for Darwin.
* c-c++-common/attr-used-6.c: Likewise.
* c-c++-common/attr-used-7.c: Likewise.
* c-c++-common/attr-used-8.c: Likewise.
* c-c++-common/attr-used-9.c: Likewise.
With the recent changes to vector insert optimization, the number of
expected stores for the two testcases has changed.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr79251.p8.c: Update ilp32 store counts.
* gcc.target/powerpc/pr79251.p9.c: Same.
This patch adds a new function to genfusion.pl to generate patterns for
logical-logical fusion. They are enabled by default for power10 and can
be disabled by -mno-power10-fusion-2logical or -mno-power10-fusion.
gcc/ChangeLog
* config/rs6000/genfusion.pl (gen_2logical): New function to
generate patterns for logical-logical fusion.
* config/rs6000/fusion.md: Regenerated patterns.
* config/rs6000/rs6000-cpus.def: Add
OPTION_MASK_P10_FUSION_2LOGICAL.
* config/rs6000/rs6000.c (rs6000_option_override_internal):
Enable logical-logical fusion for p10.
* config/rs6000/rs6000.opt: Add -mpower10-fusion-2logical.
AIX only permits use of Altivec VSRs 20-31 in a Vector Extended ABI mode.
This patch explicitly enables use of the VSRs using the new -mabi=vec-extabi
command line option also implemented in LLVM for AIX.
Bootstrapped on powerpc-ibm-aix7.2.3.0 and powerpc64le-linux-gnu.
gcc/ChangeLog:
* config/rs6000/rs6000.opt (mabi=vec-extabi): New.
(mabi=vec-default): New.
* config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Define
__EXTABI__ for AIX Vector extended ABI.
* config/rs6000/rs6000.c (rs6000_debug_reg_global): Print AIX Vector
extabi info.
(conditional_register_usage): If AIX vec_extabi enabled, vs20-vs31
are non-volatile.
* doc/invoke.texi (PowerPC mabi): Add AIX vec-extabi and vec-default.
> rtl-optimization/98863 - tame i386 specific RPAD pass
>
> caused
>
> FAIL: gcc.c-torture/compile/20051216-1.c -O1 (internal compiler error)
> FAIL: gcc.c-torture/compile/20051216-1.c -O1 (test for excess errors)
The problem is that we don't revert the df flags back.
This patch fixes it by clearing DF_DEFER_INSN_RESCAN after
calling df_process_deferred_rescans, so that it doesn't leak into following
unprepared passes that expect non-deferred rescans.
2021-01-30 Jakub Jelinek <jakub@redhat.com>
* config/i386/i386-features.c (remove_partial_avx_dependency): Clear
DF_DEFER_INSN_RESCAN after calling df_process_deferred_rescans.
* gcc.target/i386/20051216-1.c: New test.
The test (intentionally) is not gcc.dg/vect/, as it needs -fopenmp and uses
OpenMP directives other than simd and therefore can't rely on default
VECTFLAGS and so I think can't safely use vect_int effective target
either. So, I'm just making sure it is vectorized on x86 and on aarch64 (the
latter as an example of a target that doesn't need any extra options to get
the vectorization).
2021-01-30 Jakub Jelinek <jakub@redhat.com>
PR testsuite/98243
* gcc.dg/gomp/simd-2.c: Add -msse2 on x86. Restrict
scan-tree-dump-times to x86 and aarch64 targets.
* gcc.dg/gomp/simd-3.c: Likewise.
This test started failing when I changed the mapping of IEEE 128-bit long
double built-in functions on 2021-01-28. This patch fixes the test so it
uses the correct name.
gcc/testsuite/
2021-01-29 Michael Meissner <meissner@linux.ibm.com>
PR testsuite/98870
* gcc.target/powerpc/ppc-fortran/ieee128-math.f90: Fix the
expected result.
Reload pseudos of ALL_REGS class did not narrow class from constraint
in insn (set (pseudo) (lo_sum ...)) because lo_sum is considered an
object (OBJECT_P) although the insn is not a classic move. To permit
narrowing we are starting to use MEM_P and REG_P instead of OBJECT_P.
gcc/ChangeLog:
PR target/97701
* lra-constraints.c (in_class_p): Don't narrow class only for REG
or MEM.
gcc/testsuite/ChangeLog:
PR target/97701
* gcc.target/aarch64/pr97701.c: New.
Hi,
Per PR91903, GCC ICEs when we attempt to pass a variable
(or out of range value) into the vec_ctf() builtin. Per
investigation, the parameter checking exists for this
builtin with the int types, but was missing for
the long long types. This problem also occurs for the
vec_cts() builtin, which is also fixed by this patch.
This patch adds the missing CODE_FOR_* entries to the
rs6000_expand_binop_builtin to cover that scenario.
This patch also updates some existing tests to remove
calls to vec_ctf() and vec_cts() that contain negative
values.
PR target/91903
2020-01-29 Will Schmidt <will_schmidt@vnet.ibm.com>
gcc/ChangeLog:
* config/rs6000/rs6000-call.c (rs6000_expand_binop_builtin): Add
clauses for CODE_FOR_vsx_xvcvuxddp_scale and
CODE_FOR_vsx_xvcvsxddp_scale to the parameter checking code.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr91903.c: New test.
* gcc.target/powerpc/builtins-1.fold.h: Update.
* gcc.target/powerpc/builtins-2.c: Update.
A couple of module invariants are that the modules are always
allocated in ascending order and appended to the module array. The
entity array is likewise ordered, with each module having spans in
that array in ascending order. Prior to header-units, this was
provided by the way import declarations were encountered. With
header-units we need to load the preprocessor state of header units
before we parse the C++, and this can lead to incorrect ordering of
the entity array. I had made the initialization of a module's
language state a little too lazy. This moves the allocation of entity
array spans into the initial read of a module, thus ensuring the
ordering of those spans. We won't be looking in them until we've
loaded the language portions of that particular module, and even if we
did, we'd find NULLs there and issue a diagnostic.
PR c++/98843
gcc/cp/
* module.cc (module_state_config): Add num_entities field.
(module_state::read_entities): The entity_ary span is
already allocated.
(module_state::write_config): Write num_entities.
(module_state::read_config): Read num_entities.
(module_state::write): Set config's num_entities.
(module_state::read_initial): Allocate the entity ary
span here.
(module_state::read_language): Do not set entity_lwm
here.
gcc/testsuite/
* g++.dg/modules/pr98843_a.C: New.
* g++.dg/modules/pr98843_b.H: New.
* g++.dg/modules/pr98843_c.C: New.
Don't track [1, +INF] for pointer types, treat them as invariant for caching
purposes as they cannot be further refined without evaluating to UNDEFINED.
PR tree-optimization/98866
* gimple-range-gori.h (gori_compute::set_range_invariant): New.
* gimple-range-gori.cc (gori_map::set_range_invariant): New.
(gori_map::m_maybe_invariant): Rename from all_outgoing.
(gori_map::gori_map): Rename all_outgoing to m_maybe_invariant.
(gori_map::is_export_p): Ditto.
(gori_map::calculate_gori): Ditto.
(gori_compute::set_range_invariant): New.
* gimple-range.cc (gimple_ranger::range_of_stmt): Set range
invariant for pointers evaluating to [1, +INF].
This removes analyzing DF with expensive problems which we do not
use at all and which somehow cause 5GB of memory to leak. Instead
just do a deferred rescan of added insns.
2021-01-29 Richard Biener <rguenther@suse.de>
PR rtl-optimization/98863
* config/i386/i386-features.c (remove_partial_avx_dependency):
Do not perform DF analysis.
(pass_data_remove_partial_avx_dependency): Remove
TODO_df_finish.
This patch reimplements the vabdl_high intrinsics using builtins.
It slightly cleans up the RTL pattern (the mode iterators) but nothing
interesting apart from that.
gcc/ChangeLog:
* config/aarch64/aarch64-simd-builtins.def (sabdl2, uabdl2):
Define builtins.
* config/aarch64/aarch64-simd.md (aarch64_<sur>abdl2<mode>_3):
Rename to...
(aarch64_<sur>abdl2<mode>): ... This.
(<sur>sadv16qi): Adjust use of above.
* config/aarch64/arm_neon.h (vabdl_high_s8): Reimplement using
builtin.
(vabdl_high_s16): Likewise.
(vabdl_high_s32): Likewise.
(vabdl_high_u8): Likewise.
(vabdl_high_u16): Likewise.
(vabdl_high_u32): Likewise.
This patch reimplements the vabal intrinsics with builtins.
The RTL pattern is cleaned up to emit the right .8b suffixes for the
inputs (though .16b is also accepted)
and iterate over the right modes. The pattern's only other use is
through the sadv16qi expander, which is adjusted.
I've verified that the codegen for sadv16qi is not worse off.
gcc/ChangeLog:
* config/aarch64/aarch64-simd-builtins.def (sabal): Define
builtin.
(uabal): Likewise.
* config/aarch64/aarch64-simd.md (aarch64_<sur>abal<mode>_4):
Rename to...
(aarch64_<sur>abal<mode>): ... This.
(<sur>sadv16qi): Adjust use of the above.
* config/aarch64/arm_neon.h (vabal_s8): Reimplement using
builtin.
(vabal_s16): Likewise.
(vabal_s32): Likewise.
(vabal_u8): Likewise.
(vabal_u16): Likewise.
(vabal_u32): Likewise.
This patch reimplements the vaddlv* intrinsics using builtins.
The vaddlv_s32 and vaddlv_u32 intrinsics actually perform a pairwise
SADDLP/UADDLP instead of a SADDLV/UADDLV but because they only use
two elements it has the same semantics.
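A scalar model may make the equivalence easier to see. The sketch below is illustrative only: the helper names addlv_s32x2 and addlp_s32x2 are hypothetical and exist neither in GCC nor in arm_neon.h. For a two-element vector, the across-vector widening sum and the pairwise widening add both reduce to the single 64-bit sum of the two lanes.

```c
#include <stdint.h>

/* Hypothetical scalar model of SADDLV on a 2 x int32 vector:
   widen every lane to 64 bits and sum across the whole vector.  */
int64_t
addlv_s32x2 (const int32_t v[2])
{
  int64_t sum = 0;
  for (int i = 0; i < 2; i++)
    sum += (int64_t) v[i];
  return sum;
}

/* Hypothetical scalar model of SADDLP on a 2 x int32 vector:
   add adjacent pairs of lanes into 64-bit lanes.  With two input
   lanes there is exactly one pair, hence one result lane.  */
int64_t
addlp_s32x2 (const int32_t v[2])
{
  return (int64_t) v[0] + (int64_t) v[1];
}
```

With wider vectors the two operations differ, because the pairwise form yields one result per pair rather than a single scalar.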
gcc/ChangeLog:
* config/aarch64/aarch64-simd-builtins.def (saddlv, uaddlv):
Define builtins.
* config/aarch64/aarch64-simd.md (aarch64_<su>addlv<mode>):
Define.
* config/aarch64/arm_neon.h (vaddlv_s8): Reimplement using
builtin.
(vaddlv_s16): Likewise.
(vaddlv_u8): Likewise.
(vaddlv_u16): Likewise.
(vaddlvq_s8): Likewise.
(vaddlvq_s16): Likewise.
(vaddlvq_s32): Likewise.
(vaddlvq_u8): Likewise.
(vaddlvq_u16): Likewise.
(vaddlvq_u32): Likewise.
(vaddlv_s32): Likewise.
(vaddlv_u32): Likewise.
* config/aarch64/iterators.md (VDQV_L): New mode iterator.
(unspec): Add UNSPEC_SADDLV, UNSPEC_UADDLV.
(Vwstype): New mode attribute.
(Vwsuf): Likewise.
(VWIDE_S): Likewise.
(USADDLV): New int iterator.
(su): Handle UNSPEC_SADDLV, UNSPEC_UADDLV.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/simd/vaddlv_1.c: New test.
This changes the unit of --param max-gcse-memory from bytes to kB, since
its value is limited to 2147483648.
2021-01-29 Richard Biener <rguenther@suse.de>
* doc/invoke.texi (--param max-gcse-memory): Document unit
of size.
* gcse.c (gcse_or_cprop_is_too_expensive): Adjust.
* params.opt (--param max-gcse-memory): Adjust default and
document unit of size.
This fixes overflow of the memory usage estimate in turn failing
to disable itself on WRF with LTO, causing a few GBs worth of
memory peak.
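To illustrate the kind of overflow involved, here is a minimal sketch, not the code from gcse.c. The function names, the three-factor shape of the estimate and the byte limit are assumptions made for the example. The narrow variant uses unsigned 32-bit arithmetic so that the wraparound is well defined.

```c
#include <stdint.h>

/* Estimate computed in 32 bits.  For large inputs the product wraps
   around and the result looks deceptively small.  */
uint32_t
estimate_bytes_32 (uint32_t n_blocks, uint32_t words_per_block,
                   uint32_t bytes_per_word)
{
  return n_blocks * words_per_block * bytes_per_word;
}

/* Same estimate with the first operand widened to 64 bits, so the
   whole product is computed in 64 bits.  */
uint64_t
estimate_bytes_64 (uint32_t n_blocks, uint32_t words_per_block,
                   uint32_t bytes_per_word)
{
  return (uint64_t) n_blocks * words_per_block * bytes_per_word;
}

/* Hypothetical guard: nonzero if the estimate exceeds LIMIT_BYTES.  */
int
too_expensive_p (uint64_t estimate, uint64_t limit_bytes)
{
  return estimate > limit_bytes;
}
```

A wrapped estimate passes the guard even though the real request is far above the limit, which is how a pass can fail to disable itself.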
2021-01-29 Richard Biener <rguenther@suse.de>
PR rtl-optimization/98863
* gcse.c (gcse_or_cprop_is_too_expensive): Use unsigned
HOST_WIDE_INT for the memory estimate.
This avoids computing niters information for fake edges.
2021-01-29 Bin Cheng <bin.cheng@linux.alibaba.com>
Richard Biener <rguenther@suse.de>
PR tree-optimization/97627
* tree-ssa-loop-niter.c (number_of_iterations_exit_assumptions):
Do not analyze fake edges.
* g++.dg/pr97627.C: New testcase.
This changes the REE dataflow to change the explicit all-ones
starting solution to be implicit via a visited flag, removing
the need to initially start with fully populated bitmaps for
all basic-blocks. That reduces peak memory use when compiling
the RTL checking enabled insn-extract.c testcase from PR98144
from 6GB to less than 2GB.
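The idea of an implicit all-ones starting solution can be sketched as follows. This is a toy model under stated assumptions, not the df-problems.c code: struct block_info, its 64-bit set and the confluence function are hypothetical, and only the visited flag mirrors the con_visited member named in the ChangeLog.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-block state: a 64-bit set plus a flag saying
   whether the set has been materialized.  While VISITED is false the
   set stands for "all ones" without storing any bits.  */
struct block_info
{
  uint64_t in;
  bool visited;
};

/* Meet (intersection) of DST with SRC.  Returns true if DST changed.  */
bool
confluence (struct block_info *dst, const struct block_info *src)
{
  /* SRC is still the implicit universe: intersecting is a no-op.  */
  if (!src->visited)
    return false;

  /* DST is still the implicit universe: the result is SRC itself.  */
  if (!dst->visited)
    {
      dst->in = src->in;
      dst->visited = true;
      return true;
    }

  uint64_t old = dst->in;
  dst->in &= src->in;
  return dst->in != old;
}
```

Nothing has to be allocated or filled for a block until a visited predecessor actually reaches it, which is where the memory saving would come from in this model.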
2021-01-29 Richard Biener <rguenther@suse.de>
PR rtl-optimization/98144
* df.h (df_mir_bb_info): Add con_visited member.
* df-problems.c (df_mir_alloc): Initialize con_visited,
do not fully populate IN and OUT.
(df_mir_reset): Likewise.
(df_mir_confluence_0): Set con_visited.
(df_mir_confluence_n): Properly handle implicitly
fully populated IN and OUT as designated by con_visited
and update con_visited accordingly.
The
https://gcc.gnu.org/r11-6707-g7432f255b70811dafaf325d94036ac580891de69
https://gcc.gnu.org/r11-6708-gbfab355012ca0f5219da8beb04f2fdaf757d34b7
changes moved the vashl/vashr/vlshr expanders from neon.md to vec-common.md
and changed their condition from TARGET_NEON to ARM_HAVE_<MODE>_ARITH,
so that they apply also for TARGET_HAVE_MVE. But, the ARM_HAVE_<MODE>_ARITH
macros are sometimes true also for TARGET_REALLY_IWMMXT, which at least
from quick skimming of former iwmmxt*.md doesn't have such instructions,
so it seems incorrect to enable them for iwmmxt. Furthermore, even if it
had them, iwmmxt doesn't support any way to broadcast values in those
modes (vec_duplicate and vec_init optabs), and the middle end assumes that
if the vector x vector shift/rotate patterns are supported, it can emit
vector x scalar shift/rotate by broadcasting the shift amount to a vector.
As the TARGET_NEON vs. TARGET_REALLY_IWMMXT vs. TARGET_HAVE_MVE never seem
to be enabled together, I think we can just write it the following way.
Note, seems iwmmxt actually does support vector x scalar shifts, but doesn't
really enable the optabs that would tell the middle-end code that it does
(and neon and mve don't seem to support those). I'll defer that to anybody
that cares about iwmmxt (if any).
2021-01-29 Jakub Jelinek <jakub@redhat.com>
PR target/98849
* config/arm/vec-common.md (mve_vshlq_<supf><mode>,
vashl<mode>3, vashr<mode>3, vlshr<mode>3): Add
&& !TARGET_REALLY_IWMMXT to conditions.
* gcc.c-torture/compile/pr98849.c: New test.
When expansion emits some control flow insns etc. inside of a former GIMPLE
basic block, find_bb_boundaries needs to split it into multiple basic
blocks.
The code needs to ignore debug insns in decisions how many splits to do or
where in between some non-debug insns the split should be done, but it can
decide where to put debug insns if they can be kept and otherwise throws
them away (they can't stay outside of basic blocks).
On the following testcase, we end up in the bb from expander with
control flow insn
debug insns
barrier
some other insn
(the some other insn is effectively dead after __builtin_unreachable and
we'll optimize that out later).
Without debug insns, we'd do the split when encountering some other insn
and split after PREV_INSN (some other insn), i.e. after barrier (and the
splitting code then moves the barrier in between basic blocks).
But if there are debug insns, we actually split before the first debug insn
that appeared after the control flow insn, so after control flow insn,
and get a basic block that starts with debug insns and then has a barrier
in the middle that nothing moves out of the bb. This leads to ICEs and,
even if it didn't, to different behavior from -g0.
The reason for treating debug insns that way is a different case, e.g.
control flow insn
debug insns
some other insn
or even
control flow insn
barrier
debug insns
some other insn
where splitting before the first such debug insn allows us to keep them
while otherwise we would have to drop them on the floor, and in those
situations we behave the same with -g0 and -g.
So, the following patch fixes it by resetting debug_insn not just when
splitting the blocks (it is set only after seeing a control flow insn and
before splitting for it if needed), but also when seeing a barrier,
which effectively means we always throw away debug insns after a control
flow insn and before following barrier if any, but there is no way around
that, control flow insn must be the last in the bb (BB_END) and BARRIER
after it, debug insns aren't allowed outside of bb.
We still handle the other cases fine (when there is no barrier or when
debug insns appear only after the barrier).
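The cases discussed above can be replayed on a toy model. The sketch below is not the cfgbuild.c code: the enum, the function split_after and its reset_on_barrier switch are hypothetical, and an insn stream is reduced to an array of kinds. It returns the index of the insn after which the block would be split.

```c
#include <stdbool.h>
#include <stddef.h>

enum insn_kind { CONTROL_FLOW, DEBUG, BARRIER, OTHER };

/* Toy model: return the index of the insn after which the block is
   split, or -1 if no split happens.  RESET_ON_BARRIER selects the
   behavior described by the fix.  */
int
split_after (const enum insn_kind *insns, size_t n, bool reset_on_barrier)
{
  int flow_insn = -1;   /* Last control flow insn seen.  */
  int debug_insn = -1;  /* First debug insn after FLOW_INSN.  */

  for (size_t i = 0; i < n; i++)
    switch (insns[i])
      {
      case CONTROL_FLOW:
        flow_insn = (int) i;
        break;
      case DEBUG:
        if (flow_insn >= 0 && debug_insn < 0)
          debug_insn = (int) i;
        break;
      case BARRIER:
        /* Debug insns between a control flow insn and its barrier
           cannot be kept, so forget them.  */
        if (reset_on_barrier)
          debug_insn = -1;
        break;
      case OTHER:
        if (flow_insn >= 0)
          /* Split before the remembered debug insn if there is one,
             otherwise right before this insn.  */
          return (debug_insn >= 0 ? debug_insn : (int) i) - 1;
        break;
      }
  return -1;
}
```

In this model the problematic sequence splits after the barrier once the reset is enabled, while the two sequences that motivated the debug insn handling still split where they did before.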
2021-01-29 Jakub Jelinek <jakub@redhat.com>
PR debug/98331
* cfgbuild.c (find_bb_boundaries): Reset debug_insn when seeing
a BARRIER.
* gcc.dg/pr98331.c: New test.
My r11-86 adjusted cp_parser_class_name to do
- scope = parser->scope;
+ scope = parser->scope ? parser->scope : parser->context->object_type;
if (scope == error_mark_node)
return error_mark_node;
but that caused endless looping in cp_parser_type_specifier_seq (the
while (true) loop) in this invalid test, because we never set a parser
error, therefore cp_parser_type_specifier returned error_mark_node
instead of NULL_TREE, and we never issued the "expected type-specifier"
error.
At first I thought I'd just add cp_parser_simulate_error right before
the return, but that regresses crash81.C -- we'd emit multiple errors
for "T::X". So the next best thing seemed to revert to pre-r11-86
behavior: return early when parser->scope is bad, otherwise proceed to
get the parser error.
gcc/cp/ChangeLog:
PR c++/96137
* parser.c (cp_parser_class_name): If parser->scope is
error_mark_node, return it, otherwise continue.
gcc/testsuite/ChangeLog:
PR c++/96137
* g++.dg/parse/error63.C: New test.
The go1 compiler always turns on debugging, to support Go stack traces
and functions like runtime.Callers. With the recent switch to turn on
DWARF 5 by default, this caused failures with some versions of gas,
such as 2.35.1, because the assembly code would assume DWARF 5 but the
driver would not pass --gdwarf-5 to gas. gas would then give an
error: "file number less than one".
This change avoids that problem by having the gccgo driver spec add a
-g option to the command line if no other -g option is present. The
newly added -g option is passed to the assembler as --gdwarf-5.
* gospec.c (lang_specific_driver): Add -g if no debugging options
were passed.
We emit a bogus warning on the following testcase, suggesting that the
operator should return *this even when it does that already.
The problem is that normally cp_build_indirect_ref_1 ensures that *this
is folded as current_class_ref, but in templates (if return type is
non-dependent, otherwise check_return_expr doesn't check it) it didn't
go through cp_build_indirect_ref_1, but just built another INDIRECT_REF.
Which means it then doesn't compare pointer-equal to current_class_ref.
The following patch fixes it by doing in build_x_indirect_ref for
*this what cp_build_indirect_ref_1 would do.
2021-01-28 Jakub Jelinek <jakub@redhat.com>
PR c++/98841
* typeck.c (build_x_indirect_ref): For *this, return current_class_ref.
* g++.dg/warn/effc5.C: New test.
A year ago I submitted this patch:
~~
Here we trip on the TYPE_USER_ALIGN (t) assert in strip_typedefs: it
gets "const d[0]" with TYPE_USER_ALIGN=0 but the result built by
build_cplus_array_type is "const char[0]" with TYPE_USER_ALIGN=1.
When we strip_typedefs the element of the array "const d", we see it's
a typedef_variant_p, so we look at its DECL_ORIGINAL_TYPE, which is
char, but we need to add the const qualifier, so we call
cp_build_qualified_type -> build_qualified_type
where get_qualified_type checks to see if we already have such a type
by walking the variants list, which in this case is:
char -> c -> const char -> const char -> d -> const d
Because check_base_type only checks TYPE_ALIGN and not TYPE_USER_ALIGN,
we choose the first const char, which has TYPE_USER_ALIGN set. If the
element type of an array has TYPE_USER_ALIGN, the array type gets it too.
So we can make check_base_type stricter. I was afraid that it might make
us reuse types less often, but measuring showed that we build the same
amount of types with and without the patch, while bootstrapping.
~~
However, the patch broke a few tests on STRICT_ALIGNMENT platforms and
had to be reverted. This is another try. The original patch is kept
unchanged, but I added the finalize_type_size hunk that ought to fix the
STRICT_ALIGNMENT issues.
The problem is that finalize_type_size can clear TYPE_USER_ALIGN on the
main variant of a type, but doesn't clear it on any of the variants.
Then we end up with types which share the same TYPE_MAIN_VARIANT, but
their TYPE_CANONICAL differs and then the usual "canonical types differ
for identical types" follows.
I've created alignas19.C to exercise this scenario. What happens is:
- when parsing the class S we create a type S in xref_tag,
- we see alignas(8) so common_handle_aligned_attribute sets T_U_A in S,
- we parse the member function fn and build_memfn_type creates a copy
of S to add const; this variant has T_U_A set,
- we finish_struct S which calls layout_class_type -> finish_record_type
-> finalize_type_size where we reset T_U_A in S (but const S keeps it),
- finish_non_static_data_member for arr calls maybe_dummy_object with
type = S,
- maybe_dummy_object calls same_type_ignoring_top_level_qualifiers_p
to check if S and TREE_TYPE (current_class_ref), which is const S,
are the same,
- same_type_ignoring_top_level_qualifiers_p creates cv-unqualified
versions of the passed types. Previously we'd use our main variant
S when stripping "const S" of const, but since the T_U_A flags don't
match (check_base_type), we create a new variant S'. Then we crash in
comptypes because S and S' have the same TYPE_MAIN_VARIANT but
different TYPE_CANONICALs.
With my patch we'll clear T_U_A for S's variants too, and then instead
of S' we'll just use S.
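As a rough illustration of why the flag has to be cleared on every variant, consider the following toy model. It is a sketch only: struct type_node and both functions are hypothetical stand-ins, far simpler than GCC's tree types, finalize_type_size or check_base_type.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much simplified stand-in for a type node.  */
struct type_node
{
  bool user_align;                 /* Models TYPE_USER_ALIGN.  */
  struct type_node *next_variant;  /* Models the variant chain.  */
};

/* Clear the flag on the main variant and on every variant chained
   from it, so that later flag comparisons between them agree.  */
void
clear_user_align (struct type_node *main_variant)
{
  for (struct type_node *t = main_variant; t != NULL; t = t->next_variant)
    t->user_align = false;
}

/* Toy version of the stricter comparison: two variants are only
   interchangeable when their flags match.  */
bool
same_user_align_p (const struct type_node *a, const struct type_node *b)
{
  return a->user_align == b->user_align;
}
```

If only the main variant is cleared, the stricter comparison rejects the existing variant and a fresh one would have to be built, which corresponds to the S' situation described above.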
gcc/ChangeLog:
PR c++/94775
* stor-layout.c (finalize_type_size): If we reset TYPE_USER_ALIGN in
the main variant, maybe reset it in its variants too.
* tree.c (check_base_type): Return true only if TYPE_USER_ALIGN match.
(check_aligned_type): Check if TYPE_USER_ALIGN match.
gcc/testsuite/ChangeLog:
PR c++/94775
* g++.dg/cpp0x/alignas19.C: New test.
* g++.dg/warn/Warray-bounds15.C: New test.
Neon vector comparisons have a dedicated version for comparing with
constant zero, which means the zero operand costs nothing.
Adjust the cost in arm_rtx_costs_internal accordingly, for Neon only,
since MVE does not support this.
2021-01-28 Christophe Lyon <christophe.lyon@linaro.org>
gcc/
PR target/98730
* config/arm/arm.c (arm_rtx_costs_internal): Adjust cost of vector
of constant zero for comparisons.
gcc/testsuite/
PR target/98730
* gcc.target/arm/simd/vceqzq_p64.c: Update expected result.
The PowerPC has two different 128-bit long double types, one that uses a pair
of doubles to get more mantissa range, and the other using the IEEE 128-bit
754R binary floating point format. The pair of doubles has been used as the
traditional format, and we are in the process of moving to allow an
implementation to switch to using IEEE 128-bit floating point. The GLIBC and
LIBSTDC++ libraries have been modified to have functions using the two
different formats in their libraries with different names.
This patch goes through all of the built-in functions that either take long
double arguments or return long double, and changes the name from the
traditional name to the IEEE 128-bit name. The minimum GLIBC version to
support IEEE 128-bit floating point is 2.32.
The names changed are:
* <name>l is usually mapped to __<name>ieee128;
* <extra>printf is mapped to __<extra>printfieee128; (and)
* <extra>scanf is mapped to __isoc99_<extra>scanfieee128.
A few functions have different mappings:
* dreml => __remainderieee128;
* gammal => __lgammaieee128;
* gammal_r => __lgammaieee128_r;
* lgammal_r => __lgammaieee128_r;
* nexttoward => __nexttoward_to_ieee128;
* nexttowardf => __nexttowardf_to_ieee128;
* nexttowardl => __nexttowardl_to_ieee128;
* pow10l => __exp10ieee128;
* scalbl => __scalbieee128;
* significandl => __significandieee128; (and)
* sincosl => __sincosieee128.
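The mapping rules listed above can be sketched as a small lookup. This is an illustration, not the code in rs6000_mangle_decl_assembler_name: the function map_ieee128_name, its signature and the table layout are hypothetical, and the caller is assumed to pass only names of long double built-in functions (a name such as ceil, which merely ends in l, must not be passed).

```c
#include <stdio.h>
#include <string.h>

/* Exceptions copied from the list above.  */
static const char *const special[][2] = {
  { "dreml", "__remainderieee128" },
  { "gammal", "__lgammaieee128" },
  { "gammal_r", "__lgammaieee128_r" },
  { "lgammal_r", "__lgammaieee128_r" },
  { "nexttoward", "__nexttoward_to_ieee128" },
  { "nexttowardf", "__nexttowardf_to_ieee128" },
  { "nexttowardl", "__nexttowardl_to_ieee128" },
  { "pow10l", "__exp10ieee128" },
  { "scalbl", "__scalbieee128" },
  { "significandl", "__significandieee128" },
  { "sincosl", "__sincosieee128" },
};

static int
ends_with (const char *s, const char *suffix)
{
  size_t n = strlen (s), m = strlen (suffix);
  return n >= m && strcmp (s + n - m, suffix) == 0;
}

/* Write the IEEE 128-bit name for NAME into OUT.  Returns 1 if a
   mapping was applied, 0 if NAME was copied unchanged.  */
int
map_ieee128_name (const char *name, char *out, size_t size)
{
  size_t len = strlen (name);

  for (size_t i = 0; i < sizeof special / sizeof special[0]; i++)
    if (strcmp (name, special[i][0]) == 0)
      {
        snprintf (out, size, "%s", special[i][1]);
        return 1;
      }

  if (ends_with (name, "printf"))
    snprintf (out, size, "__%sieee128", name);
  else if (ends_with (name, "scanf"))
    snprintf (out, size, "__isoc99_%sieee128", name);
  else if (len > 1 && name[len - 1] == 'l')
    /* Drop the trailing 'l' of <name>l.  */
    snprintf (out, size, "__%.*sieee128", (int) (len - 1), name);
  else
    {
      snprintf (out, size, "%s", name);
      return 0;
    }
  return 1;
}
```

The exception table is checked first so that names like gammal do not fall through to the generic rule for <name>l.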
gcc/
2021-01-28 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/rs6000.c (rs6000_mangle_decl_assembler_name): Add
support for mapping built-in function names for long double
built-in functions if long double is IEEE 128-bit.
gcc/testsuite/
2021-01-28 Michael Meissner <meissner@linux.ibm.com>
* gcc.target/powerpc/float128-longdouble-math.c: New test.
* gcc.target/powerpc/float128-longdouble-stdio.c: New test.
* gcc.target/powerpc/float128-math.c: Adjust test for new name
being generated. Add support for running test on power10. Add
support for running if long double defaults to 64-bits.
As the testcase shows, for vars appearing in templates, we don't attach
the asm spec string to the pattern decls, nor pass it back to cp_finish_decl
during instantiation.
The following patch does that.
2021-01-28 Jakub Jelinek <jakub@redhat.com>
PR c++/33661
PR c++/98847
* decl.c (cp_finish_decl): For register vars with asmspec in templates
call set_user_assembler_name and set DECL_HARD_REGISTER.
* pt.c (tsubst_expr): When instantiating DECL_HARD_REGISTER vars,
pass asmspec_tree to cp_finish_decl.
* g++.target/i386/pr98847.C: New test.
Typedefs are streamed by streaming the underlying type, and then
recreating the typedef. But this breaks checking a duplicate is the
same as the original when it is a template alias -- we end up checking
a template alias (eg __void_t) against the underlying type (void).
And those are not the same template alias. This stops pretending that
the underlying type is the typedef for that checking and tells
is_matching_decl 'you have a typedef', so it knows what to do. (We do
not want to recreate the typedef of the duplicate, because that whole
set of nodes is going to go away.)
PR c++/98770
gcc/cp/
* module.cc (trees_out::decl_value): Swap is_typedef & TYPE_NAME
check order.
(trees_in::decl_value): Do typedef frobbing only when installing
a new typedef, adjust is_matching_decl call. Swap is_typedef
& TYPE_NAME check.
(trees_in::is_matching_decl): Add is_typedef parm. Adjust variable
names and deal with typedef checking.
gcc/testsuite/
* g++.dg/modules/pr98770_a.C: New.
* g++.dg/modules/pr98770_b.C: New.
This patch reimplements the vshrn_high_n* intrinsics that generate the
SHRN2 instruction.
It is a vec_concat of the narrowing shift with the bottom part of the
destination register, so we need a little-endian and a big-endian
version and an expander to pick between them.
gcc/ChangeLog:
* config/aarch64/aarch64-simd-builtins.def (shrn2): Define
builtin.
* config/aarch64/aarch64-simd.md (aarch64_shrn2<mode>_insn_le):
Define.
(aarch64_shrn2<mode>_insn_be): Likewise.
(aarch64_shrn2<mode>): Likewise.
* config/aarch64/arm_neon.h (vshrn_high_n_s16): Reimplement
using builtins.
(vshrn_high_n_s32): Likewise.
(vshrn_high_n_s64): Likewise.
(vshrn_high_n_u16): Likewise.
(vshrn_high_n_u32): Likewise.
(vshrn_high_n_u64): Likewise.
This patch reimplements the vshrn_n* intrinsics to use RTL builtins.
These perform a narrowing right shift.
Although the intrinsic generates the half-width mode (e.g. V8HI ->
V8QI), the new pattern generates a full 128-bit mode (V8HI -> V16QI)
by representing the fill-with-zeroes semantics of the SHRN instruction.
The narrower (V8QI) result is extracted with a lowpart subreg.
I found this allows the RTL optimisers to do a better job at optimising
redundant moves away in frequently-occurring SHRN+SHRN2 pairs, like in:
#include <arm_neon.h>

uint8x16_t
foo (uint16x8_t in1, uint16x8_t in2)
{
  uint8x8_t tmp = vshrn_n_u16 (in2, 7);
  uint8x16_t tmp2 = vshrn_high_n_u16 (tmp, in1, 4);
  return tmp2;
}
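For readers without an AArch64 toolchain, the lane behavior of such a pair can be modelled in portable C. This is a scalar sketch with hypothetical helper names (shrn_model, shrn2_model), not the RTL pattern: the first helper fills the low half of a 16-byte result and zeroes the high half, the second fills the high half and leaves the low half alone.

```c
#include <stdint.h>

/* Scalar model of SHRN on 8 x uint16: narrow each shifted lane into
   the low 8 bytes of a 16-byte result and zero the high 8 bytes.
   SHIFT is assumed to be in the range 1..8.  */
void
shrn_model (const uint16_t in[8], unsigned shift, uint8_t out[16])
{
  for (int i = 0; i < 8; i++)
    {
      out[i] = (uint8_t) (in[i] >> shift);
      out[i + 8] = 0;
    }
}

/* Scalar model of SHRN2: write the narrowed lanes into the high
   8 bytes and leave the low 8 bytes of OUT untouched.  */
void
shrn2_model (const uint16_t in[8], unsigned shift, uint8_t out[16])
{
  for (int i = 0; i < 8; i++)
    out[i + 8] = (uint8_t) (in[i] >> shift);
}
```

Calling shrn_model and then shrn2_model on the same 16-byte buffer mirrors the foo example: no separate move is needed between the two steps because the second one only touches the high half.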
gcc/ChangeLog:
* config/aarch64/aarch64-simd-builtins.def (shrn): Define
builtin.
* config/aarch64/aarch64-simd.md (aarch64_shrn<mode>_insn_le):
Define.
(aarch64_shrn<mode>_insn_be): Likewise.
(aarch64_shrn<mode>): Likewise.
* config/aarch64/arm_neon.h (vshrn_n_s16): Reimplement using
builtins.
(vshrn_n_s32): Likewise.
(vshrn_n_s64): Likewise.
(vshrn_n_u16): Likewise.
(vshrn_n_u32): Likewise.
(vshrn_n_u64): Likewise.
* config/aarch64/iterators.md (vn_mode): New mode attribute.
The latest fix introduced a comparison of executables and this cannot
directly work on Windows because they are timestamped. Moreover nobody
sets $(exeext) at top level, at least on MinGW, so you get weird behavior
because some tools add the implicit .exe suffix and others do not.
contrib/
PR lto/85574
* compare-lto: Deal with PE-COFF executables specifically.
gfc_call_malloc should malloc an area of size 1 if no size given.
gcc/fortran/ChangeLog:
PR fortran/86470
* trans.c (gfc_call_malloc): Allocate area of size 1 if passed
size is NULL (as documented).
gcc/testsuite/ChangeLog:
PR fortran/86470
* gfortran.dg/gomp/pr86470.f90: New test.
I've noticed we still refer to C++20 as draft standard, and there is a pasto
in C++23 description.
2021-01-28 Jakub Jelinek <jakub@redhat.com>
* c.opt (-std=c++2a, -std=c++20, -std=gnu++2a, -std=gnu++20): Remove
draft from description.
(-std=c++2b): Fix a pasto, 2020 -> 2023.
The following avoids repeatedly turning VALUE RTXen into
something useful and re-applying a constant offset through get_addr
via DSE check_mem_read_rtx. Instead perform this once for
all stores to be visited in check_mem_read_rtx. This avoids
allocating 1.6GB of garbage PLUS RTXen on the PR80960
testcase, fixing the memory usage regression from old GCC.
2021-01-27 Richard Biener <rguenther@suse.de>
PR rtl-optimization/80960
* dse.c (check_mem_read_rtx): Call get_addr on the
offsetted address.
UNSPEC_SI_FROM_SF is not supported when TARGET_DIRECT_MOVE_64BIT
is false for -m32, so don't generate VIEW_CONVERT_EXPR(ARRAY_REF) for
variable vector insert. Remove rs6000_expand_vector_set_var helper
function, adjust the p8 and p9 definitions position and make them
static.
The previous commit r11-6858 missed the -m32 check. This patch was tested
and passes on P7BE{m32,m64}/P8BE{m32,m64}/P8LE/P9LE with
RUNTESTFLAGS="--target_board=unix'{-m32,-m64}'" for BE targets.
gcc/ChangeLog:
2021-01-27 Xionghu Luo <luoxhu@linux.ibm.com>
David Edelsohn <dje.gcc@gmail.com>
PR target/98799
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
Don't generate VIEW_CONVERT_EXPR for fcode ALTIVEC_BUILTIN_VEC_INSERT
when -m32.
* config/rs6000/rs6000-protos.h (rs6000_expand_vector_set_var):
Delete.
* config/rs6000/rs6000.c (rs6000_expand_vector_set): Remove the
wrapper call rs6000_expand_vector_set_var for cleanup. Call
rs6000_expand_vector_set_var_p9 and rs6000_expand_vector_set_var_p8
directly.
(rs6000_expand_vector_set_var): Delete.
(rs6000_expand_vector_set_var_p9): Make static.
(rs6000_expand_vector_set_var_p8): Make static.
gcc/testsuite/ChangeLog:
2021-01-27 Xionghu Luo <luoxhu@linux.ibm.com>
PR target/98827
* gcc.target/powerpc/fold-vec-insert-char-p8.c: Adjust ilp32.
* gcc.target/powerpc/fold-vec-insert-char-p9.c: Likewise.
* gcc.target/powerpc/fold-vec-insert-double.c: Likewise.
* gcc.target/powerpc/fold-vec-insert-float-p8.c: Likewise.
* gcc.target/powerpc/fold-vec-insert-float-p9.c: Likewise.
* gcc.target/powerpc/fold-vec-insert-int-p8.c: Likewise.
* gcc.target/powerpc/fold-vec-insert-int-p9.c: Likewise.
* gcc.target/powerpc/fold-vec-insert-longlong.c: Likewise.
* gcc.target/powerpc/fold-vec-insert-short-p8.c: Likewise.
* gcc.target/powerpc/fold-vec-insert-short-p9.c: Likewise.
* gcc.target/powerpc/pr79251.p8.c: Likewise.
* gcc.target/powerpc/pr79251.p9.c: Likewise.
* gcc.target/powerpc/vsx-builtin-7.c: Likewise.
* gcc.target/powerpc/pr79251-run.c: Build and run with vsx
option.
This patch fixes -march option parsing when `p` extension exists,
e.g., -march=rv64imafdcp should produce
.attribute arch, "rv64i2p0_m2p0_a2p0_f2p0_d2p0_c2p0_p"
rather than
.attribute arch, "rv64i2p0_m2p0_a2p0_f2p0_d2p0_c_p"
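The ambiguity is that the letter p serves both as the separator inside a version (as in 2p0) and as an extension name. One way to resolve it is sketched below. This is a toy parser under stated assumptions, not the code in riscv-common.c: struct version, parse_version and the fixed 2.0 default are hypothetical and chosen only to match the expected output quoted above.

```c
#include <ctype.h>
#include <stddef.h>

struct version { unsigned major, minor; };

/* Hypothetical default used when an extension carries no explicit
   version.  */
static const struct version default_version = { 2, 0 };

/* Parse an optional "<major>[p<minor>]" at P, store the result in
   *V and return the number of characters consumed.  A 'p' counts as
   the version separator only when it sits between two digits;
   otherwise it is left alone so the caller can treat it as the
   start of the `p` extension.  */
size_t
parse_version (const char *p, struct version *v)
{
  size_t i = 0;

  *v = default_version;
  if (!isdigit ((unsigned char) p[i]))
    return 0;                   /* No version: keep the default.  */

  v->major = 0;
  v->minor = 0;
  while (isdigit ((unsigned char) p[i]))
    v->major = v->major * 10 + (unsigned) (p[i++] - '0');

  if (p[i] == 'p' && isdigit ((unsigned char) p[i + 1]))
    {
      i++;
      while (isdigit ((unsigned char) p[i]))
        v->minor = v->minor * 10 + (unsigned) (p[i++] - '0');
    }
  return i;
}
```

For the text following c in rv64imafdcp, this model consumes nothing and reports the default version, so c would be printed as c2p0 and the p is left for the next extension.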
gcc/ChangeLog:
* common/config/riscv/riscv-common.c
(riscv_subset_list::parsing_subset_version): Fix -march option parsing
when `p` extension exists.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/attribute-18.c: New test.
update_equiv_regs can use reg classes of pseudos and they are set up in
register pressure sensitive scheduling and loop invariant motion and in
live range shrinking. This info can become obsolete if we add new pseudos
after the last setup. Recalculate it if new pseudos were added.
gcc/ChangeLog:
PR rtl-optimization/97684
* ira.c (ira): Call ira_set_pseudo_classes before
update_equiv_regs when it is necessary.
gcc/testsuite/ChangeLog:
PR rtl-optimization/97684
* gcc.target/i386/pr97684.c: New.
The handling of dependent scopes and unsuitable scopes in lookup_using_decl
was a bit convoluted; I tweaked it for a while and then eventually
reorganized much of the function to hopefully be clearer. Along the way I
noticed a couple of ways we were mishandling inherited constructors.
The local binding for a dependent using is the USING_DECL.
Implement instantiation of a dependent USING_DECL at function scope.
gcc/cp/ChangeLog:
PR c++/97874
* name-lookup.c (lookup_using_decl): Clean up handling
of dependency and inherited constructors.
(finish_nonmember_using_decl): Handle DECL_DEPENDENT_P.
* pt.c (tsubst_expr): Handle DECL_DEPENDENT_P.
gcc/testsuite/ChangeLog:
PR c++/97874
* g++.dg/lookup/using4.C: No error in C++20.
* g++.dg/cpp0x/decltype37.C: Adjust message.
* g++.dg/template/crash75.C: Adjust message.
* g++.dg/template/crash76.C: Adjust message.
* g++.dg/cpp0x/inh-ctor36.C: New test.
* g++.dg/cpp1z/inh-ctor39.C: New test.
* g++.dg/cpp2a/using-enum-7.C: New test.
The https://gcc.gnu.org/legacy-ml/gcc-patches/2018-07/msg01895.html
patch that introduced this pattern claimed:
Would generate:
combine_balanced_int:
bfxil w0, w1, 0, 16
uxtw x0, w0
ret
But with this patch generates:
combine_balanced_int:
bfxil w0, w1, 0, 16
ret
and it is indeed what it should generate, but it doesn't do that,
it emits bfxil x0, x1, 0, 16
instead which doesn't zero extend from 32 to 64 bits, but preserves
the bits from the destination register.
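The difference between the two forms can be checked with a small model of the register contents. The functions below are a hypothetical sketch in portable C, not anything from aarch64.md; they assume lsb + width stays within the register size. The W form models the architectural rule that writing a 32-bit register zeroes bits 32-63 of the 64-bit register.

```c
#include <stdint.h>

/* Model of "bfxil wd, wn, lsb, width" acting on the full 64-bit
   register: the low WIDTH bits of the 32-bit destination are
   replaced, and writing a W register zeroes bits 32-63.  */
uint64_t
bfxil_w (uint64_t xd, uint64_t xn, unsigned lsb, unsigned width)
{
  uint32_t mask = width >= 32 ? UINT32_MAX : (UINT32_C (1) << width) - 1;
  uint32_t wd = (uint32_t) xd, wn = (uint32_t) xn;
  return (wd & ~mask) | ((wn >> lsb) & mask);
}

/* Model of "bfxil xd, xn, lsb, width": every destination bit above
   WIDTH, including bits 32-63, is preserved.  */
uint64_t
bfxil_x (uint64_t xd, uint64_t xn, unsigned lsb, unsigned width)
{
  uint64_t mask = width >= 64 ? UINT64_MAX : (UINT64_C (1) << width) - 1;
  return (xd & ~mask) | ((xn >> lsb) & mask);
}
```

Only the W form gives the zero-extended result the pattern promises; the X form leaks whatever was in the upper half of the destination.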
2021-01-27 Jakub Jelinek <jakub@redhat.com>
PR target/98853
* config/aarch64/aarch64.md (*aarch64_bfxilsi_uxtw): Use
%w0, %w1 and %2 instead of %0, %1 and %2.
* gcc.c-torture/execute/pr98853-1.c: New test.
* gcc.c-torture/execute/pr98853-2.c: New test.
This patch adds the first batch of patterns to support p10 fusion. These
will allow combine to create a single insn for a pair of instructions
that power10 can fuse and execute. These particular fusion pairs have the
requirement that only cr0 can be used when fusing a load with a compare
immediate of -1/0/1 (if signed) or 0/1 (if unsigned), so we want combine
to put that requirement in, and if it doesn't work out the splitter
can change it back into 2 insns so scheduling can move them apart.
The patterns are generated by a script genfusion.pl and live in new file
fusion.md. This script will be expanded to generate more patterns for
fusion.
This also adds option -mpower10-fusion which defaults on for power10 and
will gate all these fusion patterns. In addition I have added an
undocumented option -mpower10-fusion-ld-cmpi (which may be removed later)
that just controls the load+compare-immediate patterns. I have made
these default on for power10 but they are not disallowed for earlier
processors because it is still valid code. This allows us to test the
correctness of fusion code generation by turning it on explicitly.
gcc/ChangeLog:
* config/rs6000/genfusion.pl: New script to generate
define_insn_and_split patterns so combine can arrange fused
instructions next to each other.
* config/rs6000/fusion.md: New file, generated fused instruction
patterns for combine.
* config/rs6000/predicates.md (const_m1_to_1_operand): New predicate.
(non_update_memory_operand): New predicate.
* config/rs6000/rs6000-cpus.def: Add OPTION_MASK_P10_FUSION and
OPTION_MASK_P10_FUSION_LD_CMPI to ISA_3_1_MASKS_SERVER and
POWERPC_MASKS.
* config/rs6000/rs6000-protos.h (address_is_non_pfx_d_or_x): Add
prototype.
* config/rs6000/rs6000.c (rs6000_option_override_internal):
Automatically set OPTION_MASK_P10_FUSION and
OPTION_MASK_P10_FUSION_LD_CMPI if target is power10.
(rs6000_opt_masks): Allow -mpower10-fusion
in function attributes.
(address_is_non_pfx_d_or_x): New function.
* config/rs6000/rs6000.h: Add MASK_P10_FUSION.
* config/rs6000/rs6000.md: Include fusion.md.
* config/rs6000/rs6000.opt: Add -mpower10-fusion
and -mpower10-fusion-ld-cmpi.
* config/rs6000/t-rs6000: Add dependencies involving fusion.md.
Bash and GNU echo do not interpret backslash escapes by default, so use
printf when printing \n or \t in strings.
libstdc++-v3/ChangeLog:
* testsuite/experimental/simd/generate_makefile.sh: Use printf
instead of echo when printing escape characters.
Add a new check-simd target to the testsuite. The new target creates a
subdirectory, generates the necessary Makefiles, and spawns submakes to
build and run the tests. Running this testsuite with defaults on my
machine takes half of the time the dejagnu testsuite required to only
determine whether to run tests. Since the simd testsuite integrated in
dejagnu increased the time of the whole libstdc++ testsuite by ~100%
this approach is a compromise for speed while not sacrificing coverage
too much. Since the test driver is invoked individually per test
executable from a Makefile, make's jobserver (-j) trivially parallelizes
testing.
Testing different flags and with simulator (or remote execution) is
possible. E.g. `make check-simd DRIVEROPTS=-q
target_list="unix{-m64,-m32}{-march=sandybridge,-march=skylake-avx512}{,-ffast-math}"`
runs the testsuite 8 times in different subdirectories, using 8
different combinations of compiler flags, only outputs failing tests
(-q), and prints all summaries at the end. It skips most ABI tags by
default unless --run-expensive is passed to DRIVEROPTS or
GCC_TEST_RUN_EXPENSIVE is not empty.
To use a simulator, the CHECK_SIMD_CONFIG variable needs to point to a
shell script which calls `define_target <name> <flags> <simulator>` and
set target_list as needed. E.g.:
case "$target_triplet" in
x86_64-*)
target_list="unix{-march=sandybridge,-march=skylake-avx512}"
;;
powerpc64le-*)
define_target power8 "-static -mcpu=power8" "/usr/bin/qemu-ppc64le -cpu power8"
define_target power9 -mcpu=power9 "$HOME/bin/run_on_gcc135"
target_list="power8 power9{,-ffast-math}"
;;
esac
libstdc++-v3/ChangeLog:
* scripts/check_simd: New file. This script is called from the
check-simd target. It determines a set of compiler flags and
simulator setups for calling generate_makefile.sh and passes the
information back to the check-simd target, which recurses to the
generated Makefiles.
* scripts/create_testsuite_files: Remove files below simd/tests/
from testsuite_files and place them in testsuite_files_simd.
* testsuite/Makefile.am: Add testsuite_files_simd. Add
check-simd target.
* testsuite/Makefile.in: Regenerate.
* testsuite/experimental/simd/driver.sh: New file. This script
compiles and runs a given simd test, logging its output and
status. It uses the timeout command to implement compile and
test timeouts.
* testsuite/experimental/simd/generate_makefile.sh: New file.
This script generates a Makefile which uses driver.sh to compile
and run the tests and collect the logs into a single log file.
* testsuite/experimental/simd/tests/abs.cc: New file. Tests
abs(simd).
* testsuite/experimental/simd/tests/algorithms.cc: New file.
Tests min/max(simd, simd).
* testsuite/experimental/simd/tests/bits/conversions.h: New
file. Contains functions to support tests involving conversions.
* testsuite/experimental/simd/tests/bits/make_vec.h: New file.
Support functions make_mask and make_vec.
* testsuite/experimental/simd/tests/bits/mathreference.h: New
file. Support functions to supply precomputed math function
reference data.
* testsuite/experimental/simd/tests/bits/metahelpers.h: New
file. Support code for SFINAE testing.
* testsuite/experimental/simd/tests/bits/simd_view.h: New file.
* testsuite/experimental/simd/tests/bits/test_values.h: New
file. Test functions to easily drive a test with simd objects
initialized from a given list of values and a range of random
values.
* testsuite/experimental/simd/tests/bits/ulp.h: New file.
Support code to determine the ULP distance of simd objects.
* testsuite/experimental/simd/tests/bits/verify.h: New file.
Test framework for COMPARE'ing simd objects and instantiating
the test templates with value_type and ABI tag.
* testsuite/experimental/simd/tests/broadcast.cc: New file. Test
simd broadcasts.
* testsuite/experimental/simd/tests/casts.cc: New file. Test
simd casts.
* testsuite/experimental/simd/tests/fpclassify.cc: New file.
Test floating-point classification functions.
* testsuite/experimental/simd/tests/frexp.cc: New file. Test
frexp(simd).
* testsuite/experimental/simd/tests/generator.cc: New file. Test
simd generator constructor.
* testsuite/experimental/simd/tests/hypot3_fma.cc: New file.
Test 3-arg hypot(simd,simd,simd) and fma(simd,simd,simd).
* testsuite/experimental/simd/tests/integer_operators.cc: New
file. Test integer operators.
* testsuite/experimental/simd/tests/ldexp_scalbn_scalbln_modf.cc:
New file. Test ldexp(simd), scalbn(simd), scalbln(simd), and
modf(simd).
* testsuite/experimental/simd/tests/loadstore.cc: New file. Test
(converting) simd loads and stores.
* testsuite/experimental/simd/tests/logarithm.cc: New file. Test
log*(simd).
* testsuite/experimental/simd/tests/mask_broadcast.cc: New file.
Test simd_mask broadcasts.
* testsuite/experimental/simd/tests/mask_conversions.cc: New
file. Test simd_mask conversions.
* testsuite/experimental/simd/tests/mask_implicit_cvt.cc: New
file. Test simd_mask implicit conversions.
* testsuite/experimental/simd/tests/mask_loadstore.cc: New file.
Test simd_mask loads and stores.
* testsuite/experimental/simd/tests/mask_operator_cvt.cc: New
file. Test simd_mask operators convert as specified.
* testsuite/experimental/simd/tests/mask_operators.cc: New file.
Test simd_mask compares, subscripts, and negation.
* testsuite/experimental/simd/tests/mask_reductions.cc: New
file. Test simd_mask reductions.
* testsuite/experimental/simd/tests/math_1arg.cc: New file. Test
1-arg math functions on simd.
* testsuite/experimental/simd/tests/math_2arg.cc: New file. Test
2-arg math functions on simd.
* testsuite/experimental/simd/tests/operator_cvt.cc: New file.
Test implicit conversions on simd binary operators behave as
specified.
* testsuite/experimental/simd/tests/operators.cc: New file. Test
simd compares, subscripts, not, unary minus, plus, minus,
multiplies, divides, increment, and decrement.
* testsuite/experimental/simd/tests/reductions.cc: New file.
Test reduce(simd).
* testsuite/experimental/simd/tests/remqo.cc: New file. Test
remquo(simd).
* testsuite/experimental/simd/tests/simd.cc: New file. Basic
sanity checks of simd types.
* testsuite/experimental/simd/tests/sincos.cc: New file. Test
sin(simd) and cos(simd).
* testsuite/experimental/simd/tests/split_concat.cc: New file.
Test split(simd) and concat(simd, simd).
* testsuite/experimental/simd/tests/splits.cc: New file. Test
split(simd_mask).
* testsuite/experimental/simd/tests/trigonometric.cc: New file.
Test remaining trigonometric functions on simd.
* testsuite/experimental/simd/tests/trunc_ceil_floor.cc: New
file. Test trunc(simd), ceil(simd), and floor(simd).
* testsuite/experimental/simd/tests/where.cc: New file. Test
masked operations using where.
Adds <experimental/simd>.
This implements the simd and simd_mask class templates via
[[gnu::vector_size(N)]] data members. It implements overloads for all of
<cmath> for simd. Explicit vectorization of the <cmath> functions is not
finished.
The majority of functions are marked as [[gnu::always_inline]] to enable
quasi-ODR-conforming linking of TUs with different -m flags.
Performance optimization was done for x86_64. ARM, AArch64, and POWER
rely on the compiler to recognize reduction, conversion, and shuffle
patterns.
Besides verification using many different machine flags, the code was
also verified with different fast-math flags.
libstdc++-v3/ChangeLog:
* doc/xml/manual/status_cxx2017.xml: Add implementation status
of the Parallelism TS 2. Document implementation-defined types
and behavior.
* include/Makefile.am: Add new headers.
* include/Makefile.in: Regenerate.
* include/experimental/simd: New file. New header for
Parallelism TS 2.
* include/experimental/bits/numeric_traits.h: New file.
Implementation of P1841R1 using internal naming. Addition of
missing IEC559 functionality query.
* include/experimental/bits/simd.h: New file. Definition of the
public simd interfaces and general implementation helpers.
* include/experimental/bits/simd_builtin.h: New file.
Implementation of the _VecBuiltin simd_abi.
* include/experimental/bits/simd_converter.h: New file. Generic
simd conversions.
* include/experimental/bits/simd_detail.h: New file. Internal
macros for the simd implementation.
* include/experimental/bits/simd_fixed_size.h: New file. Simd
fixed_size ABI specific implementations.
* include/experimental/bits/simd_math.h: New file. Math
overloads for simd.
* include/experimental/bits/simd_neon.h: New file. Simd NEON
specific implementations.
* include/experimental/bits/simd_ppc.h: New file. Implement bit
shifts to avoid invalid results for integral types smaller than
int.
* include/experimental/bits/simd_scalar.h: New file. Simd scalar
ABI specific implementations.
* include/experimental/bits/simd_x86.h: New file. Simd x86
specific implementations.
* include/experimental/bits/simd_x86_conversions.h: New file.
x86 specific conversion optimizations. The conversion patterns
work around missing conversion patterns in the compiler and
should be removed as soon as PR85048 is resolved.
* testsuite/experimental/simd/standard_abi_usable.cc: New file.
Test that all (not all fixed_size<N>, though) standard simd and
simd_mask types are usable.
* testsuite/experimental/simd/standard_abi_usable_2.cc: New
file. As above but with -ffast-math.
* testsuite/libstdc++-dg/conformance.exp: Don't build simd tests
from the standard test loop. Instead use
check_vect_support_and_set_flags to build simd tests with the
relevant machine flags.
This avoids cases of PHI node vectorization that just causes us
to insert vector CTORs inside loops for values only required
outside of the loop.
2021-01-27 Richard Biener <rguenther@suse.de>
PR tree-optimization/98854
* tree-vect-slp.c (vect_build_slp_tree_2): Also build
PHIs from scalars when the number of CTORs matches the
number of children.
* gcc.dg/vect/bb-slp-pr98854.c: New testcase.
This reuses the code from std::string::find, which was improved by
r244225, but string_view was not changed to match.
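A rough sketch of that kind of search, assuming the strategy is to locate the first character of the needle and only then compare the remainder. find_sketch is a hypothetical free function for illustration, not the libstdc++ implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical sketch: find the first character with memchr, then confirm
// the whole needle with memcmp, instead of comparing at every position.
inline std::size_t find_sketch(std::string_view hay, std::string_view needle,
                               std::size_t pos = 0)
{
  const std::size_t npos = std::string_view::npos;
  if (needle.empty())
    return pos <= hay.size() ? pos : npos;
  if (pos >= hay.size() || needle.size() > hay.size() - pos)
    return npos;

  const char *first = hay.data() + pos;
  const char *const last = hay.data() + hay.size();
  std::size_t len = hay.size() - pos;  // characters not yet searched
  while (len >= needle.size())
    {
      // Only the first len - needle.size() + 1 positions can start a match.
      const void *p = std::memchr(first, needle[0], len - needle.size() + 1);
      if (!p)
        return npos;
      first = static_cast<const char *>(p);
      if (std::memcmp(first, needle.data(), needle.size()) == 0)
        return static_cast<std::size_t>(first - hay.data());
      ++first;
      len = static_cast<std::size_t>(last - first);
    }
  return npos;
}
```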
libstdc++-v3/ChangeLog:
PR libstdc++/66414
* include/bits/string_view.tcc
(basic_string_view::find(const CharT*, size_type, size_type)):
Optimize.
This implements WG21 P1679R3, adding contains member functions to
basic_string_view and basic_string.
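For illustration, the semantics of the new members can be sketched in terms of find. contains_sketch is a hypothetical free function, used here so the example builds without C++23 library support; the real additions are member functions.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical stand-ins for the member overloads: a string contains x
// exactly when find(x) does not return npos.
inline bool contains_sketch(std::string_view s, std::string_view x) noexcept
{
  return s.find(x) != std::string_view::npos;
}

inline bool contains_sketch(std::string_view s, char c) noexcept
{
  return s.find(c) != std::string_view::npos;
}
```

Note that the empty string is contained in every string, including the empty one.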
libstdc++-v3/ChangeLog:
* include/bits/basic_string.h (basic_string::contains): New
member functions.
* include/std/string_view (basic_string_view::contains):
Likewise.
* include/std/version (__cpp_lib_string_contains): Define.
* testsuite/21_strings/basic_string/operations/starts_with/char/1.cc:
Remove trailing whitespace.
* testsuite/21_strings/basic_string/operations/starts_with/wchar_t/1.cc:
Likewise.
* testsuite/21_strings/basic_string/operations/contains/char/1.cc: New test.
* testsuite/21_strings/basic_string/operations/contains/wchar_t/1.cc: New test.
* testsuite/21_strings/basic_string_view/operations/contains/char/1.cc: New test.
* testsuite/21_strings/basic_string_view/operations/contains/char/2.cc: New test.
* testsuite/21_strings/basic_string_view/operations/contains/wchar_t/1.cc: New test.
2021-01-27 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/93924
PR fortran/93925
* trans-expr.c (gfc_conv_procedure_call): Suppress the call to
gfc_conv_intrinsic_to_class for unlimited polymorphic procedure
pointers.
(gfc_trans_assignment_1): Similarly suppress class assignment
for class valued procedure pointers.
gcc/testsuite/
PR fortran/93924
PR fortran/93925
* gfortran.dg/proc_ptr_52.f90: New test.
On Linux, GCC emits .note.GNU-stack sections when compiling code to mark
the code as not needing or needing executable stack; a missing section means
unknown. But assembly files need to be marked manually. We already
mark various *.S files in libgcc manually, but the
avx_resms64f.o
avx_resms64fx.o
avx_resms64.o
avx_resms64x.o
avx_savms64f.o
avx_savms64.o
sse_resms64f.o
sse_resms64fx.o
sse_resms64.o
sse_resms64x.o
sse_savms64f.o
sse_savms64.o
files aren't marked, so when something links it in, it will require
executable stack. Nothing in the assembly requires executable stack though.
2021-01-27 Jakub Jelinek <jakub@redhat.com>
* config/i386/savms64.h: Add .note.GNU-stack section on Linux.
* config/i386/savms64f.h: Likewise.
* config/i386/resms64.h: Likewise.
* config/i386/resms64f.h: Likewise.
* config/i386/resms64x.h: Likewise.
* config/i386/resms64fx.h: Likewise.
This patch drops the no-strict-aliasing hack in m128-check.h and instead
ensures the tests read objects with the right dynamic type.
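The general technique, sketched outside the testsuite: copy the bytes into an object of the type the checker reads, rather than reading them through a pointer of another type. Bits128 and matches_floats are hypothetical names, and the sketch assumes a 32-bit IEEE float.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a 128-bit value whose stored elements are
// integers, so its dynamic type is not float.
struct Bits128 { std::uint32_t w[4]; };

// Compare against expected floats by copying the bytes into a real
// float[4] object instead of reading the integers through a float pointer.
inline bool matches_floats(const Bits128 &value, const float (&expected)[4])
{
  static_assert(sizeof(float[4]) == sizeof(Bits128),
                "sketch assumes a 32-bit float");
  float got[4];
  std::memcpy(got, &value, sizeof got);
  for (int i = 0; i < 4; ++i)
    if (got[i] != expected[i])
      return false;
  return true;
}
```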
2021-01-27 Jakub Jelinek <jakub@redhat.com>
* gcc.target/i386/m128-check.h (CHECK_EXP): Remove
optimize ("no-strict-aliasing") attribute.
* gcc.target/i386/sse-andnps-1.c (TEST): Copy e into float[4]
array to avoid violating TBAA.
* gcc.target/i386/sse2-andpd-1.c (TEST): Copy e.d into double[2]
array to avoid violating TBAA.
* gcc.target/i386/sse-andps-1.c (TEST): Copy e.f into float[4]
array to avoid violating TBAA.
* gcc.target/i386/sse2-andnpd-1.c (TEST): Copy e into double[2]
array to avoid violating TBAA.
2021-01-27 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/98472
* trans-array.c (gfc_conv_expr_descriptor): Include elemental
procedure pointers in the assert under the comment 'elemental
function' and eliminate the second, spurious assert.
gcc/testsuite/
PR fortran/98472
* gfortran.dg/elemental_function_5.f90: New test.
PROP_trees actually means GIMPLE IL, rather than GENERIC, so better
not to confuse users.
2021-01-27 Jakub Jelinek <jakub@redhat.com>
* tree-pass.h (PROP_trees): Rename to ...
(PROP_gimple): ... this.
* cfgexpand.c (pass_data_expand): Replace PROP_trees with PROP_gimple.
* passes.c (execute_function_dump, execute_function_todo,
execute_one_ipa_transform_pass, execute_one_pass): Likewise.
* varpool.c (ctor_for_folding): Likewise.
In 4.8 and earlier we used to fold the following to 0 during GENERIC folding,
but we don't do that anymore because ctor_for_folding etc. has been turned into a
GIMPLE centric API, but as the testcase shows, it is invoked even during
GENERIC folding and there the automatic vars still should have meaningful
initializers. I've verified that the C++ FE drops TREE_READONLY on
automatic vars with const qualified types if they require non-constant
(runtime) initialization.
2021-01-27 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/97260
* varpool.c: Include tree-pass.h.
(ctor_for_folding): In GENERIC return DECL_INITIAL for TREE_READONLY
non-TREE_SIDE_EFFECTS automatic variables.
* gcc.dg/tree-ssa/pr97260.c: New test.
Derived from the changes that added C++2a support in 2017.
r8-3237-g026a79f70cf33f836ea5275eda72d4870a3041e5
No C++23 features are added here.
Use of -std=c++23 sets __cplusplus to 202100L.
$ g++ -std=c++23 -dM -E -x c++ - < /dev/null | grep cplusplus
#define __cplusplus 202100L
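A small sketch of how code might classify the macro value. dialect_for is a hypothetical helper; the thresholds are the published __cplusplus values for C++11 through C++20, and anything above 202002L, such as the 202100L used here, is simply reported as newer than C++20.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper mapping a __cplusplus value to a dialect name.
inline std::string_view dialect_for(long cplusplus)
{
  if (cplusplus > 202002L) return "newer than C++20";
  if (cplusplus == 202002L) return "C++20";
  if (cplusplus >= 201703L) return "C++17";
  if (cplusplus >= 201402L) return "C++14";
  if (cplusplus >= 201103L) return "C++11";
  return "C++98";
}
```

Calling dialect_for(__cplusplus) in a translation unit built with -std=c++23 would then take the first branch.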
gcc/
* doc/cpp.texi (__cplusplus): Document value for -std=c++23
or -std=gnu++23.
* doc/invoke.texi: Document -std=c++23 and -std=gnu++23.
* dwarf2out.c (highest_c_language): Recognise C++20 and C++23.
(gen_compile_unit_die): Recognise C++23.
gcc/c-family/
* c-common.h (cxx_dialect): Add cxx23 as a dialect.
* c.opt: Add options for -std=c++23, -std=c++2b, -std=gnu++23
and -std=gnu++2b.
* c-opts.c (set_std_cxx23): New.
(c_common_handle_option): Set options when -std=c++23 is enabled.
(c_common_post_options): Adjust comments.
(set_std_cxx20): Likewise.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_c++2a):
Check for C++2a or C++23.
(check_effective_target_c++20_down): New.
(check_effective_target_c++23_only): New.
(check_effective_target_c++23): New.
* g++.dg/cpp23/cplusplus.C: New.
libcpp/
* include/cpplib.h (c_lang): Add CXX23 and GNUCXX23.
* init.c (lang_defaults): Add rows for CXX23 and GNUCXX23.
(cpp_init_builtins): Set __cplusplus to 202100L for C++23.
In this testcase, we refer to the a parameter through a reference in its own
member, which we asserted couldn't happen by marking the parameter as
'restrict'. This assumption could also be broken if the address escapes
from the constructor.
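A sketch of the kind of code involved, not the actual testcase: a class with a reference member bound to its own data member, passed by value. SelfRef and store_through_member are hypothetical names.

```cpp
#include <cassert>

struct SelfRef
{
  int value;
  int &ref;  // always refers to this object's own value member
  SelfRef(int v) : value(v), ref(value) {}
  SelfRef(const SelfRef &o) : value(o.value), ref(value) {}
};

// The non-trivial copy constructor means the argument is passed by
// invisible reference.  Inside the function, s.ref is a second path to
// s.value that does not go through the parameter itself.
inline int store_through_member(SelfRef s)
{
  s.value = 0;
  s.ref = 1;       // aliases s.value
  return s.value;  // must be 1, not a cached 0
}
```

If the parameter were treated as 'restrict', the compiler could assume the store through s.ref cannot modify s.value and return 0.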
gcc/cp/ChangeLog:
PR c++/97474
* call.c (type_passed_as): Don't mark invisiref restrict.
gcc/testsuite/ChangeLog:
PR c++/97474
* g++.dg/torture/pr97474.C: New test.
In the discussion of PR98463, Jakub pointed out that in C++17 and up,
cxx_fold_indirect_ref_1 could use the field we build for an empty base. I
tried implementing that, but it broke one of the tuple tests, so I did some
more digging.
To start with, I generalized the PR98463 patch to handle the case where we
do have a field, for an empty base or [[no_unique_address]] member. This is
enough also for the no-field case because the member of the empty base must
itself be an empty field; if it weren't, the base would not be empty.
I looked for related PRs and found 97566, which was also fixed by the patch.
After some poking around to figure out why, I noticed that the testcase had
been breaking because E, though an empty class, has an ABI nvsize of one
byte, and we were giving the [[no_unique_address]] FIELD_DECL that
DECL_SIZE, whereas in build_base_field_1 empty base fields always get
DECL_SIZE zero, and various places were relying on that to recognize empty
fields. So I adjusted both the size and the checking. When I adjusted
check_bases I wondered if we were correctly handling bases with only empty
data members, but it appears we do.
I'm deferring the cxx_fold_indirect_ref_1 change until stage 1, as I don't
think it actually fixes anything.
gcc/cp/ChangeLog:
PR c++/97566
PR c++/98463
* class.c (layout_class_type): An empty field gets size 0.
(is_empty_field): New.
(check_bases): Check it.
* cp-tree.h (is_empty_field): Declare it.
* constexpr.c (cxx_eval_store_expression): Check it.
(cx_check_missing_mem_inits): Likewise.
* init.c (perform_member_init): Likewise.
* typeck2.c (process_init_constructor_record): Likewise.
gcc/testsuite/ChangeLog:
PR c++/97566
* g++.dg/cpp2a/no_unique_address10.C: New test.
* g++.dg/cpp2a/no_unique_address9.C: New test.
This patch drops the no-strict-aliasing hack in m128-check.h and instead
ensures the tests read objects with the right dynamic type.
2021-01-26 Jakub Jelinek <jakub@redhat.com>
* gcc.target/powerpc/m128-check.h (CHECK_EXP): Remove
optimize ("no-strict-aliasing") attribute.
* gcc.target/powerpc/sse-andnps-1.c (TEST): Copy e into float[4]
array to avoid violating TBAA.
* gcc.target/powerpc/sse2-andpd-1.c (TEST): Copy e.d into double[2]
array to avoid violating TBAA.
* gcc.target/powerpc/sse-andps-1.c (TEST): Copy e.f into float[4]
array to avoid violating TBAA.
* gcc.target/powerpc/sse2-andnpd-1.c (TEST): Copy e into double[2]
array to avoid violating TBAA.
This is the profiled bootstrap failure for s390x/Linux on the mainline,
which was introduced by the modref pass but actually exposes an
existing issue in the maybe_pad_type function that is visible on s390x.
The issue is too weak a test for the addressability of the inner component.
gcc/ada/
Marius Hillenbrand <mhillen@linux.ibm.com>
PR ada/98228
* gcc-interface/utils.c (maybe_pad_type): Test the size of the new
packable type instead of its alignment for addressability's sake.
My recent dwarf2asm.c patch broke powerpc*-*-* bootstrap, while most targets
define POINTER_SIZE to (cond ? cst1 : cst2) or constant, rs6000 defines
it to a variable, and the arbitrarily chosen type of that variable determines
whether we get warnings on comparison of that against signed or unsigned
ints.
Fixed by adding a cast.
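A sketch of the pattern, under the assumption that the variable is unsigned. pointer_size_bits and ADDR_SIZE_SKETCH are hypothetical stand-ins for the target macros, not the real definitions.

```cpp
#include <cassert>

// Hypothetical stand-in for a target that defines the pointer size as a
// variable rather than a constant; the unsigned type is an assumption.
static unsigned int pointer_size_bits = 64;
#define ADDR_SIZE_SKETCH (pointer_size_bits / 8)

inline bool is_twice_addr_size(int size)
{
  // Without the cast this compares int against unsigned int, which draws
  // a -Wsign-compare warning and so fails a -Werror bootstrap stage.
  return size == 2 * (int) ADDR_SIZE_SKETCH;
}
```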
2021-01-26 Jakub Jelinek <jakub@redhat.com>
PR bootstrap/98839
* dwarf2asm.c (dw2_assemble_integer): Cast DWARF2_ADDR_SIZE to int
in comparison.
The testcase in the patch doesn't assemble, because the instruction requires
that the penultimate operand (lsb) range is [0, 32] (or [0, 64]) and the last
operand's range is [1, 32 - lsb] (or [1, 64 - lsb]).
The INTVAL (shft_amnt) < GET_MODE_BITSIZE (mode) will accept the lsb operand
to be in range [MIN, 32] (or [MIN, 64]) and then we invoke UB in the
compiler and sometimes it will make it through.
The patch changes all the INTVAL uses in that function to UINTVAL,
which isn't strictly necessary, but can be done (e.g. after the
UINTVAL (shft_amnt) < GET_MODE_BITSIZE (mode) check we know it is not
negative and thus INTVAL (shft_amnt) and UINTVAL (shft_amnt) then behave the
same). But I had to add an INTVAL (mask) > 0 check in that case, otherwise we
risk (hypothetically) emitting an instruction that doesn't assemble.
The problem is with masks that have the MSB bit set, while the instruction
can handle those, e.g.
ubfiz w1, w0, 13, 19
will do
(w0 << 13) & 0xffffe000
in RTL we represent SImode constants with MSB set as negative HOST_WIDE_INT,
so it will actually be HOST_WIDE_INT_C (0xffffffffffffe000), and
the instruction uses %P3 to print the last operand, which calls
asm_fprintf (f, "%u", popcount_hwi (INTVAL (x)))
to print that. But that will not print 19, but 51 instead, because it also
counts all the copies of the sign bit.
Not supporting those masks with MSB set isn't a big loss though, they really
shouldn't appear normally, as both GIMPLE and RTL optimizations should
optimize those away (one isn't masking any bits off with such masks, so
just w0 << 13 will do too).
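The arithmetic can be checked with a short sketch. The helpers are hypothetical and only model how a 32-bit mask is held in a 64-bit host integer; they are not the aarch64 backend code.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Set bits when the 32-bit mask is held sign extended in a 64-bit host
// integer, which is how SImode constants with the MSB set are represented.
inline unsigned bits_sign_extended(std::uint32_t mask)
{
  std::uint64_t hwi = mask;
  if (mask & 0x80000000u)
    hwi |= 0xffffffff00000000ull;  // copies of the sign bit
  return static_cast<unsigned>(std::bitset<64>(hwi).count());
}

// Set bits of the mask itself, which is the width the instruction needs.
inline unsigned bits_zero_extended(std::uint32_t mask)
{
  return static_cast<unsigned>(std::bitset<64>(mask).count());
}
```

For 0xffffe000 the first helper gives 51 (bits 13..63) and the second gives 19 (bits 13..31).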
2021-01-26 Jakub Jelinek <jakub@redhat.com>
PR target/98681
* config/aarch64/aarch64.c (aarch64_mask_and_shift_for_ubfiz_p):
Use UINTVAL (shft_amnt) and UINTVAL (mask) instead of INTVAL (shft_amnt)
and INTVAL (mask). Add && INTVAL (mask) > 0 condition.
* gcc.c-torture/execute/pr98681.c: New test.
This avoids dumping them as <<< ??? >>>.
2021-01-26 Richard Biener <rguenther@suse.de>
* gimple-pretty-print.c (dump_binary_rhs): Handle
VEC_WIDEN_{PLUS,MINUS}_{LO,HI}_EXPR.
This fixes VECTOR_CST element access with POLY_INT elements and
allows producing dump files of the PR98726 testcase without
ICEing.
2021-01-26 Richard Biener <rguenther@suse.de>
PR middle-end/98726
* tree.h (vector_cst_int_elt): Remove.
* tree.c (vector_cst_int_elt): Use poly_wide_int for computations,
make static.
libgcc/ChangeLog:
PR gcov-profile/98739
* libgcov.h (gcov_topn_add_value): Do not train when
we have a merged profile with a negative number of total
value.
I don't know why these were disabled. There're no direct min/max DPP
instructions for this mode, but the "use moves" strategy works fine.
gcc/ChangeLog:
* config/gcn/gcn.c (gcn_expand_reduc_scalar): Use move instructions
for V64DFmode min/max reductions.
D front-end changes:
- Contracts for pre- and postconditions are now implicitly "this"
const, so that state can no longer be altered in these functions.
- Inside a constructor scope, assigning to aggregate declaration
members is done by considering the first assignment as initialization
and subsequent assignments as modifications of the constructed
object. For const/immutable fields the initialization is accepted in
the constructor but subsequent modifications are not. However this
rule did not apply when inside a constructor scope there is a call to
a different constructor. This has been changed so it is now an error
when there's a double initialization of immutable fields inside a
constructor.
Phobos changes:
- Don't run unit-tests for unsupported clocks in std.datetime. The
phobos and phobos_shared tests now add -fversion=Linux_Pre_2639 if
required.
- Deprecate public extern(C) bindings for getline and getdelim in
std.stdio. The correct module for bindings is core.sys.posix.stdio.
Reviewed-on: https://github.com/dlang/dmd/pull/12153
https://github.com/dlang/phobos/pull/7768
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd 609c3ce2d.
* d-compiler.cc (Compiler::loadModule): Rename to ...
(Compiler::onParseModule): ... this.
(Compiler::onImport): New function.
* d-lang.cc (d_parse_file): Remove call to Compiler::loadModule.
libphobos/ChangeLog:
* src/MERGE: Merge upstream phobos 3dd5df686.
* testsuite/libphobos.phobos/phobos.exp: Add compiler flag
-fversion=Linux_Pre_2639 if target is linux_pre_2639.
* testsuite/libphobos.phobos_shared/phobos_shared.exp: Likewise.
The new testcase FAILs on i686-linux with:
gcc/testsuite/gcc.dg/pr98807.c: In function 'foo0':
gcc/testsuite/gcc.dg/pr98807.c:20:1: warning: SSE vector return without SSE enabled changes the ABI [-Wpsabi]
gcc/testsuite/gcc.dg/pr98807.c:19:1: note: the ABI for passing parameters with 16-byte alignment has changed in GCC 4.6
gcc/testsuite/gcc.dg/pr98807.c:19:1: warning: SSE vector argument without SSE enabled changes the ABI [-Wpsabi]
FAIL: gcc.dg/pr98807.c (test for excess errors)
Adding usual testcase treatment for such cases.
2021-01-26 Jakub Jelinek <jakub@redhat.com>
PR middle-end/98807
* gcc.dg/pr98807.c: Add -Wno-psabi -w to dg-options.
For the 32-bit targets the limitations of the object
file format (e.g. 32-bit ELF) will not allow > 2GiB debug info anyway,
and as I've just tested, e.g. on x86_64 with -m32 -gdwarf64 will not work
even on tiny testcases:
as: pr64716.o: unsupported relocation type: 0x1
pr64716.s: Assembler messages:
pr64716.s:6013: Error: cannot represent relocation type BFD_RELOC_64
as: pr64716.o: unsupported relocation type: 0x1
pr64716.s:6015: Error: cannot represent relocation type BFD_RELOC_64
as: pr64716.o: unsupported relocation type: 0x1
pr64716.s:6017: Error: cannot represent relocation type BFD_RELOC_64
So yes, we can either do a sorry, error, or could just avoid 64-bit
relocations: depending on endianity, instead of emitting
.quad expression_that_needs_relocation
emit
.long expression_that_needs_relocation, 0
or
.long 0, expression_that_needs_relocation
This patch implements that last option, dunno if we need also configure tests
for that or not, maybe some 32-bit targets use 64-bit ELF and can handle such
relocations.
> 64bit relocs are not required here? That is, can one with
> dwarf64 choose 32bit forms for select offsets (like could
> dwz exploit this?)?
I guess it depends on whether for 32-bit target and -gdwarf64, when
calling dw2_assemble_integer with non-CONST_INT argument we only
need positive values or might need negative ones too.
Because positive ones can be easily emulated through that
.long expression, 0
or
.long 0, expression
depending on endianity, but I'm afraid there is no way to emit
0 or -1 depending on the sign of expression, when it needs relocations.
Looking through dw2_asm_output_delta calls, at least the vast majority
of the calls seem to guarantee being positive, not 100% sure about
one case in .debug_line views, but I'd hope it is ok too.
In most cases, the deltas are between two labels where the first one
in the arguments is later in the same section than the other one,
or where the second argument is the start of a section or another section
base.
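The two-halves emulation can be sketched on plain integers. split_for_emission and reassemble are hypothetical helpers that model the order of the two .long words; they are not the dwarf2asm.c code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Order of the two 32-bit words emitted for a non-negative 32-bit value
// that has to occupy a 64-bit field: value first on little endian,
// zero first on big endian.
inline std::array<std::uint32_t, 2>
split_for_emission(std::uint32_t value, bool big_endian)
{
  return big_endian ? std::array<std::uint32_t, 2>{0u, value}
                    : std::array<std::uint32_t, 2>{value, 0u};
}

// How a consumer reading the 64-bit field would see those two words.
inline std::uint64_t
reassemble(const std::array<std::uint32_t, 2> &words, bool big_endian)
{
  std::uint64_t lo = big_endian ? words[1] : words[0];
  std::uint64_t hi = big_endian ? words[0] : words[1];
  return (hi << 32) | lo;
}
```

The upper word is always zero, which is why this only works when the value is known to be non-negative.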
2021-01-26 Jakub Jelinek <jakub@redhat.com>
* dwarf2asm.c (dw2_assemble_integer): Handle size twice as large
as DWARF2_ADDR_SIZE if x is not a scalar int by emitting it as
two halves, one with x and the other with const0_rtx, ordered
depending on endianity.
GNAT may create temporaries to hold return values of function calls.
If such a temporary is created as part of a dynamic initializer of a
variable in a unit other than the one being compiled, the initializer
is dropped, including the temporary and its binding block.
Don't issue asan mark calls for such variables, they are gone.
for gcc/ChangeLog
* gimplify.c (gimplify_decl_expr): Skip asan marking calls for
temporaries not seen in binding block, and not about to be
added as gimple variables.
for gcc/testsuite/ChangeLog
* gnat.dg/asan1.adb: New test.
* gnat.dg/asan1_pkg.ads: New additional source.
Check for initialization of substrings beyond bounds in DATA statements.
gcc/fortran/ChangeLog:
PR fortran/70070
* data.c (create_character_initializer): Check substring indices
against bounds.
(gfc_assign_data_value): Catch error returned from
create_character_initializer.
gcc/testsuite/ChangeLog:
PR fortran/70070
* gfortran.dg/pr70070.f90: New test.
In this testcase, cxx_eval_store_expression got confused trying to build up
CONSTRUCTORs for initializing a subobject because the subobject is a member
of an empty base. In C++14 mode and below we don't build FIELD_DECLs for
empty bases, so the CONSTRUCTOR skipped the empty base, and treated the
member as a member of the derived class, which breaks.
Fixed by recognizing this situation and giving up on trying to build a
CONSTRUCTOR for the inner target at that point; since it doesn't have any
data, we don't need to actually store anything.
gcc/cp/ChangeLog:
PR c++/98463
* constexpr.c (get_or_insert_ctor_field): Add check.
(cxx_eval_store_expression): Handle discontinuity of refs.
gcc/testsuite/ChangeLog:
PR c++/98463
* g++.dg/cpp2a/no_unique_address8.C: New test.
binutils since https://sourceware.org/bugzilla/show_bug.cgi?id=25612
changes from March last year until the
https://sourceware.org/pipermail/binutils/2020-August/112684.html
fix in early August emits incorrect .debug_info when assembling files
with --gdwarf-5. Instead of emitting proper DWARF 5 .debug_info header,
it emits DWARF 4 .debug_info header with 5 as the dwarf version instead of
4. This results e.g. in libgcc.a (morestack.o) having garbage in its
.debug_info sections and e.g. libbacktrace during pretty much all libgo
tests fails miserably.
The following patch adds a workaround for that, don't set
HAVE_AS_GDWARF_5_DEBUG_FLAG if readelf can't read the .debug_info back.
Build tested on x86_64-linux against both binutils 2.35 (buggy ones) and
latest binutils trunk, the former with the patch now has DWARF 3
.debug_line and DWARF 2 .debug_info in morestack.o, while the latter
as before correct DWARF 5 .debug_line and .debug_info.
2021-01-25 Jakub Jelinek <jakub@redhat.com>
PR debug/98811
* configure.ac (HAVE_AS_GDWARF_5_DEBUG_FLAG): Only define if
readelf -wi is able to read the emitted .debug_info back.
* configure: Regenerated.
This simplifies vector_element_bits further, avoiding any mode
dependence and instead relying on boolean vector construction
to populate element precision accordingly.
2021-01-25 Richard Biener <rguenther@suse.de>
PR middle-end/98807
* tree.c (vector_element_bits): Always use precision of
the element type for boolean vectors.
* gcc.dg/pr98807.c: New testcase.
We have to use ENDFILE_SPEC for the default linker script and not
STARTFILE_SPEC, since STARTFILE_SPEC is placed before the user provided
library search paths.
gcc/
* config/rtems.h (STARTFILE_SPEC): Remove qnolinkcmds.
(ENDFILE_SPEC): Evaluate qnolinkcmds.
This is a regression present on the mainline, 10 and 9 branches, in the
form of an internal error with the Ada compiler when a covariant-only
thunk is inlined into its caller.
gcc/ada/
* gcc-interface/trans.c (make_covariant_thunk): Set the DECL_CONTEXT
of the parameters and do not set TREE_PUBLIC on the thunk.
(maybe_make_gnu_thunk): Pass the alias to the covariant thunk.
* gcc-interface/utils.c (finish_subprog_decl): Set the DECL_CONTEXT
of the parameters here...
(begin_subprog_body): ...instead of here.
gcc/testsuite/
* gnat.dg/thunk2.adb, gnat.dg/thunk2.ads: New test.
* gnat.dg/thunk2_pkg.ads: New helper.
2021-01-25 Steve Kargl <kargl@gcc.gnu.org>
gcc/fortran
PR fortran/98517
* resolve.c (resolve_charlen): Check that length expression is
present before testing for scalar/integer.
gcc/testsuite/
PR fortran/98517
* gfortran.dg/charlen_18.f90: New test.
The use of -nostdlib and -nodefaultlibs disables the processing of
LIB_SPEC (%L) as specified by LINK_COMMAND_SPEC and thus disables the
default linker script for RTEMS. Move the linker script to
STARTFILE_SPEC which is controlled by -nostdlib and -nostartfiles. This
fits better since the linker script defines the platform start file
provided by the board support package in RTEMS.
gcc/
* config/rtems.h (STARTFILE_SPEC): Remove nostdlib and
nostartfiles handling since this is already done by
LINK_COMMAND_SPEC. Evaluate qnolinkcmds.
(ENDFILE_SPEC): Remove nostdlib and nostartfiles handling since this
is already done by LINK_COMMAND_SPEC.
(LIB_SPECS): Remove nostdlib and nodefaultlibs handling since
this is already done by LINK_COMMAND_SPEC. Remove qnolinkcmds
evaluation.
As mentioned in the PR, the compiler behaves differently during strncmp
and strncasecmp folding between 32-bit and 64-bit hosts targeting 64-bit
target. I think that is highly undesirable.
The culprit is the host_size_t_cst_p predicate that is used by
fold_const_call, which punts if the target size_t constants don't fit into
host size_t. This patch gets rid of that behavior, instead it punts the
same when it doesn't fit into uhwi.
The predicate was used for strncmp and strncasecmp folding and for bcmp, memcmp and
memchr folding.
The constant is in all cases compared to 0, we can do that whether it fits
into size_t or unsigned HOST_WIDE_INT, then it is used in s2 <= s0 or
s2 <= s1 comparisons where s0 and s1 already have uhwi type and represent
the sizes of the objects.
The important difference is for strn{,case}cmp folding, we pass that s2
value as the last argument to the host functions comparing the c_getstr
results. If s2 fits into size_t, then my patch makes no difference,
but if it is larger, we know the 2 c_getstr objects need to fit into the
host address space, so larger s2 should just act essentially as strcmp
or strcasecmp; as none of those objects can occupy 100% of the address
space, using MIN (SIZE_MAX, s2) achieves that.
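The clamping step can be sketched as follows. This is a minimal
illustration, not the fold-const-call.c code: the helper name
compare_with_target_length is made up, and a 64-bit unsigned integer
stands in for unsigned HOST_WIDE_INT.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not GCC code: compare two host strings using a
// length that is a target size_t value held in a 64-bit integer.
int compare_with_target_length(const char *s0, const char *s1,
                               std::uint64_t s2)
{
  // Clamp to what the host strncmp can accept.  Neither string can span
  // the whole host address space, so a clamped length still covers both
  // strings and the call behaves like strcmp.
  const std::uint64_t host_max = SIZE_MAX;
  const std::size_t n = static_cast<std::size_t>(std::min(s2, host_max));
  return std::strncmp(s0, s1, n);
}
```

On a 64-bit host the clamp is a no-op; on a 32-bit host it turns an
oversized length into SIZE_MAX instead of refusing to fold.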
2021-01-25 Jakub Jelinek <jakub@redhat.com>
PR testsuite/98771
* fold-const-call.c (host_size_t_cst_p): Renamed to ...
(size_t_cst_p): ... this. Check and store unsigned HOST_WIDE_INT
value rather than host size_t.
(fold_const_call): Change type of s2 from size_t to
unsigned HOST_WIDE_INT. Use size_t_cst_p instead of
host_size_t_cst_p. For strncmp calls, pass MIN (s2, SIZE_MAX)
instead of s2 as last argument.
The dynamic section on MIPS is read-only, but this was not properly
handled in the runtime library. The segfault only occurred for programs
that linked to the shared libphobos library.
libphobos/ChangeLog:
PR d/98806
* libdruntime/gcc/sections/elf_shared.d (MIPS_Any): Declare version
for MIPS32 and MIPS64.
(getDependencies): Adjust dlpi_addr on MIPS_Any.
This patch fixes PR17314. Previously, when class C attempted
to access member a declared in class A through class B, where class B
privately inherits from A and class C inherits from B, GCC would correctly
report an access violation, but would erroneously report that the reason was
because a was "protected", when in fact, from the point of view of class C,
it was really "private". This patch updates the diagnostics code to generate
more correct errors in cases of failed inheritance such as these.
The reason this bug happened was because GCC was examining the
declared access of decl, instead of looking at it in the
context of class inheritance.
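A small sketch of the class layout described above, assuming that a is
declared protected in A (the accessor get_a and the initial value are
made up for illustration). The rejected access is left in a comment so
the example still compiles.

```cpp
#include <cassert>

class A
{
protected:
  int a = 42;
};

class B : private A
{
protected:
  // Inside B the inherited member is still reachable.
  int get_a() const { return a; }
};

class C : public B
{
public:
  int value() const
  {
    // Writing "return a;" here is ill-formed: A is a private base of B,
    // so from C's point of view "a" is private, even though it was
    // declared protected in A.
    return get_a();
  }
};
```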
gcc/cp/ChangeLog:
2021-01-21 Anthony Sharp <anthonysharp15@gmail.com>
* call.c (complain_about_access): Altered function.
* cp-tree.h (complain_about_access): Changed parameters of function.
(get_parent_with_private_access): Declared new function.
* search.c (get_parent_with_private_access): Defined new function.
* semantics.c (enforce_access): Modified function.
* typeck.c (complain_about_unrecognized_member): Updated function
arguments in complain_about_access.
gcc/testsuite/ChangeLog:
2021-01-21 Anthony Sharp <anthonysharp15@gmail.com>
* g++.dg/lookup/scoped1.C: Modified testcase to run successfully
with changes.
* g++.dg/tc1/dr142.C: Same as above.
* g++.dg/tc1/dr52.C: Same as above.
* g++.old-deja/g++.brendan/visibility6.C: Same as above.
* g++.old-deja/g++.brendan/visibility8.C: Same as above.
* g++.old-deja/g++.jason/access8.C: Same as above.
* g++.old-deja/g++.law/access4.C: Same as above.
* g++.old-deja/g++.law/visibility12.C: Same as above.
* g++.old-deja/g++.law/visibility4.C: Same as above.
* g++.old-deja/g++.law/visibility8.C: Same as above.
* g++.old-deja/g++.other/access4.C: Same as above.
The x86 __m64 type is defined as:
/* The Intel API is flexible enough that we must allow aliasing with other
vector types, and their scalar components. */
typedef int __m64 __attribute__ ((__vector_size__ (8), __may_alias__));
and so matches the comment above it in that reads and stores through
pointers to __m64 can alias anything.
But in the rs6000 headers that is the case only for __m128, but not __m64.
The following patch adds that attribute, which fixes the
FAIL: gcc.target/powerpc/sse-movhps-1.c execution test
FAIL: gcc.target/powerpc/sse-movlps-1.c execution test
regressions that appeared when Honza improved ipa-modref.
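To illustrate what the attribute permits, here is a minimal sketch using
the GNU vector extension. The typedef name m64_like and the helper
first_lane are assumptions for the example, not names from the headers.

```cpp
#include <cassert>
#include <cstring>

// GNU extension: an 8-byte vector of int that may alias any other type,
// mirroring the x86 typedef quoted above (the name is made up).
typedef int m64_like __attribute__ ((__vector_size__ (8), __may_alias__));

// Read the first 32-bit lane of storage whose declared type is double.
// Without __may_alias__ this access would violate strict aliasing.
int first_lane(const double *p)
{
  const m64_like *v = reinterpret_cast<const m64_like *>(p);
  return (*v)[0];
}
```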
2021-01-23 Jakub Jelinek <jakub@redhat.com>
PR testsuite/97301
* config/rs6000/mmintrin.h (__m64): Add __may_alias__ attribute.
In the testcase pr97399.C below, finish_qualified_id_expr at parse time
adds an implicit 'this->' to the expression tmp::integral<T> (because
it's type-dependent, and also current_class_ptr is set at this point)
within the trailing return type. Later when substituting into this
trailing return type we crash because we can't resolve the 'this', since
tsubst_function_type does inject_this_parm only for non-static member
functions, which tmp::func is not.
This patch fixes this issue by removing the type-dependence check
in finish_qualified_id_expr added by r9-5972, and instead relaxes
shared_member_p to handle dependent USING_DECLs:
> I think I was wrong in my assertion around Alex's patch that
> shared_member_p should abort on a dependent USING_DECL; it now seems
> appropriate for it to return false if we don't know, we just need to
> adjust the comment to say that.
And when parsing a friend function declaration, we shouldn't be setting
current_class_ptr at all, so this patch additionally suppresses
inject_this_parm in this case.
Finally, the self-contained change to cp_parser_init_declarator is so
that we properly communicate static-ness to cp_parser_direct_declarator
when parsing a member function template. This lets us reject the
explicit use of 'this' in the testcase this2.C below.
gcc/cp/ChangeLog:
PR c++/97399
* cp-tree.h (shared_member_p): Adjust declaration.
* parser.c (cp_parser_init_declarator): If the storage class
specifier is sc_static, pass true for static_p to
cp_parser_declarator.
(cp_parser_direct_declarator): Don't do inject_this_parm when
the declarator is a friend.
* search.c (shared_member_p): Change return type to bool and
adjust function body accordingly. Return false for a dependent
USING_DECL instead of aborting.
* semantics.c (finish_qualified_id_expr): Rely on shared_member_p
even when type-dependent.
gcc/testsuite/ChangeLog:
PR c++/88548
PR c++/97399
* g++.dg/cpp0x/this2.C: New test.
* g++.dg/template/pr97399.C: New test.
The recent vec insert code generation changes were not reflected in the
expected output for ilp32 targets. This patch updates the expected
instructions and counts.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/fold-vec-insert-char-p9.c: Adjust ilp32.
* gcc.target/powerpc/fold-vec-insert-float-p9.c: Same.
* gcc.target/powerpc/fold-vec-insert-int-p9.c: Same.
* gcc.target/powerpc/fold-vec-insert-longlong.c: Same.
* gcc.target/powerpc/fold-vec-insert-short-p9.c: Same.
* gcc.target/powerpc/pr79251.p9.c: Same.
I discovered very strange code in inject_parm_decls:
if (args && is_this_parameter (args))
{
gcc_checking_assert (current_class_ptr == NULL_TREE);
current_class_ptr = NULL_TREE;
We are tripping up on the assert because when we call inject_parm_decls,
current_class_ptr is set to 'A'. It was set by inject_this_parameter
after we've parsed the parameter-declaration-clause of the member
function foo. It seems correct to set ccp/ccr to A::B when we're
late parsing the noexcept-specifiers of bar* functions in B, so that
this-> does the right thing. Since inject_parm_decls doesn't expect
to see non-null ccp/ccr, reset it before calling inject_parm_decls.
gcc/cp/ChangeLog:
PR c++/96623
* parser.c (inject_parm_decls): Remove a redundant assignment.
(cp_parser_class_specifier_1): Clear current_class_{ptr,ref}
before calling inject_parm_decls.
gcc/testsuite/ChangeLog:
PR c++/96623
* g++.dg/cpp0x/noexcept64.C: New test.
This testcase was disabled in the distant past when AIX did not have
support for DWARF and the testcase explicitly invokes DWARF debugging.
This patch re-enables the testcase.
gcc/testsuite/ChangeLog:
* g++.dg/eh/spbp.C: Remove skip on AIX.
Spotted while fixing the rs6000 aliasing issue.
2021-01-22 Jakub Jelinek <jakub@redhat.com>
* gcc.target/powerpc/m128-check.h (CHECK_EXP, CHECK_FP_EXP): Fix a
typo, UINON_TYPE to UNION_TYPE.
On Mon, Sep 21, 2020 at 10:12:20AM +0200, Richard Biener wrote:
> On Mon, 21 Sep 2020, Jan Hubicka wrote:
> > these testcases now fail because they contain an invalid type punning
> > that happens via const VALUE_TYPE *v pointer. Since the check function
> > is noinline, modref is needed to trigger the wrong code.
> > I think it is easiest to fix it by no-strict-aliasing.
> >
> > Regtested x86_64-linux, OK?
>
> OK.
>
> > * gcc.target/i386/m128-check.h: Add no-strict aliasing to
> > CHECK_EXP macro.
> >
> > diff --git a/gcc/testsuite/gcc.target/i386/m128-check.h b/gcc/testsuite/gcc.target/i386/m128-check.h
> > index 48b23328539..6f414b07be7 100644
> > --- a/gcc/testsuite/gcc.target/i386/m128-check.h
> > +++ b/gcc/testsuite/gcc.target/i386/m128-check.h
> > @@ -78,6 +78,7 @@ typedef union
> >
> > #define CHECK_EXP(UINON_TYPE, VALUE_TYPE, FMT) \
> > static int \
> > +__attribute__((optimize ("no-strict-aliasing"))) \
> > __attribute__((noinline, unused)) \
> > check_##UINON_TYPE (UINON_TYPE u, const VALUE_TYPE *v) \
> > { \
On powerpc64le the tests suffer from the exact same issue.
2021-01-22 Jakub Jelinek <jakub@redhat.com>
* gcc.target/powerpc/m128-check.h (check_##UINON_TYPE): Add
optimize ("no-strict-aliasing") attribute.
When GCC is emitting .debug_line or .gnu.debuglto_.debug_line section by
itself (happens either with too old or non-GNU assembler, with
-gno-as-loc-support or with -flto) on empty translation units, it violates
the DWARF 5 requirements.
The standard says:
"The first entry is the current directory of the compilation."
and a few lines later:
"The first entry in the sequence is the primary source file whose file name
exactly matches that given in the DW_AT_name attribute in the compilation
unit debugging information entry."
GCC emits 4 zeros (directory entry format count, directories count,
filename entry format count and filename count), which would be ok if the
spec said "The first entry may be" rather than "is".
I had a brief look at whether I could just fall through into the rest of the
function, but it makes so many assumptions that there is at least one
normal file that it can't be done that way easily.
So this patch instead extends the early out code to emit the required
minimum, which is 15 bytes more than we used to emit before.
2021-01-22 Jakub Jelinek <jakub@redhat.com>
PR debug/98796
* dwarf2out.c (output_file_names): For -gdwarf-5, if there are no
filenames to emit, still emit the required 0 index directory and
filename entries that match DW_AT_comp_dir and DW_AT_name of the
compilation unit.
As Jakub points out in the PR, I was mixing up
DECL_HAS_IN_CHARGE_PARM_P (which is true for the abstract maybe-in-charge
constructor) and DECL_HAS_VTT_PARM_P (which is true for a base constructor
that needs to handle virtual bases).
gcc/cp/ChangeLog:
PR c++/98744
* call.c (make_base_init_ok): Use DECL_HAS_VTT_PARM_P.
gcc/testsuite/ChangeLog:
PR c++/98744
* g++.dg/init/elide7.C: New test.
Alex' 2 years old change to build_zero_init_1 to return NULL pointer with
reference type for references breaks the sanitizers, the assignment of NULL
to a reference typed member is then instrumented before it is overwritten
with a non-NULL address later on.
That change has been done to fix error recovery ICE during
process_init_constructor_record, where we:
if (TYPE_REF_P (fldtype))
{
if (complain & tf_error)
error ("member %qD is uninitialized reference", field);
else
return PICFLAG_ERRONEOUS;
}
a few lines earlier, but then continue and ICE when build_zero_init returns
NULL.
The following patch reverts the build_zero_init_1 change and instead creates
the NULL with reference type constants during the error recovery.
The pr84593.C testcase Alex' change was fixing still works as before.
2021-01-22 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/95693
* init.c (build_zero_init_1): Revert the 2018-03-06 change to
return build_zero_cst for reference types.
* typeck2.c (process_init_constructor_record): Instead call
build_zero_cst here during error recovery instead of build_zero_init.
* g++.dg/ubsan/pr95693.C: New test.
r11-6301 added some asserts in mangle.c, and now we trip over one of
them. In particular, it's the one asserting that we didn't get
IDENTIFIER_ANY_OP_P when mangling an expression with a dependent name.
As this testcase shows, it's possible to get that, so turn the assert
into an if and write "on". That changes the mangling in the following
way:
With this patch:
$ c++filt _ZN1i1hIJ1adS1_EEEDTcldtdefpTonclspcvT__EEEDpS2_
decltype (((*this).(operator()))((a)(), (double)(), (a)())) i::h<a, double, a>(a, double, a)
G++10:
$ c++filt _ZN1i1hIJ1adS1_EEEDTcldtdefpTclspcvT__EEEDpS2_
decltype (((*this).(operator()))((a)(), (double)(), (a)())) i::h<a, double, a>(a, double, a)
clang++/icc:
$ c++filt _ZN1i1hIJ1adS1_EEEDTclonclspcvT__EEEDpS2_
decltype ((operator())((a)(), (double)(), (a)())) i::h<a, double, a>(a, double, a)
This is now tracked in PR98756.
gcc/cp/ChangeLog:
PR c++/98545
* mangle.c (write_member_name): Emit abi_warn_or_compat_version_crosses
warnings regardless of abi_version_at_least.
(write_expression): When the expression is a dependent name
and an operator name, write "on" before writing its name.
gcc/ChangeLog:
PR c++/98545
* doc/invoke.texi: Update C++ ABI Version 15 description.
gcc/testsuite/ChangeLog:
PR c++/98545
* g++.dg/abi/mangle76.C: New test.
2021-01-22 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/98565
* trans-intrinsic.c (gfc_conv_associated): Do not add a _data
component for scalar class function targets. Instead, fix the
function result and access the _data from that.
gcc/testsuite/
PR fortran/98565
* gfortran.dg/associated_target_7.f90: New test.
I stumbled across PR 47059 from 2010 which has been addressed by
store-merging. I am going to close it but would like to add its
testcase too.
gcc/testsuite/ChangeLog:
2021-01-08 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/47059
* gcc.dg/tree-ssa/pr47059.c: New test.
We ICE here because we end up comparing a poly_int64 with a scalar using
<= rather than maybe_le.
This patch fixes that in the way rich suggests in the PR.
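The difference between a plain <= and a "maybe" comparison can be shown
with a toy model. The struct poly2 and both helpers below are
assumptions made for illustration and are far simpler than GCC's
poly_int; the unknown runtime factor x is assumed to be non-negative.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of a two-coefficient size c0 + c1 * x, where x is an
// unknown runtime value with x >= 0.  This is not GCC's poly_int.
struct poly2
{
  std::int64_t c0;
  std::int64_t c1;
};

// True if "p <= limit" can hold for at least one x >= 0.
bool maybe_le_model(poly2 p, std::int64_t limit)
{
  return p.c0 <= limit || p.c1 < 0;
}

// True if "p <= limit" holds for every x >= 0.
bool known_le_model(poly2 p, std::int64_t limit)
{
  return p.c0 <= limit && p.c1 <= 0;
}
```

For a size of 16 + 16x compared against 32, the "maybe" form is true
(x = 0 or x = 1 satisfy it) while the "known" form is false.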
gcc/ChangeLog:
PR tree-optimization/98766
* tree-ssa-math-opts.c (convert_mult_to_fma): Use maybe_le when
comparing against type size with param_avoid_fma_max_bits.
gcc/testsuite/ChangeLog:
PR tree-optimization/98766
* gcc.dg/pr98766.c: New test.
Header unit names come from the path the preprocessor determines, and
thus can be absolute. This tweaks the testsuite to elide that
absoluteness when embedded in a CMI name. We were also not
distinguishing link and execute tests by the $std flags, so append
them when necessary.
PR testsuite/98795
gcc/testsuite/
* g++.dg/modules/modules.exp (module_cmi_p): Avoid
embedded absolute paths.
(module_do_it): Append $std to test name.
The previous change made AVX512 mask vectors correct but disregarded
the possibility of generic (BLKmode) boolean vectors which are exposed
by the frontends already.
2021-01-22 Richard Biener <rguenther@suse.de>
PR middle-end/98793
* tree.c (vector_element_bits): Key single-bit bool vector on
integer mode rather than not vector mode.
* gcc.dg/pr98793.c: New testcase.
vec_insert accepts 3 arguments, arg0 is input vector, arg1 is the value
to be inserted, arg2 is the position at which arg1 is inserted into arg0.
The current expander generates stxv+stwx+lxv if arg2 is variable instead of
constant, which causes a serious store-hit-load performance issue on Power.
This patch tries
1) Build VIEW_CONVERT_EXPR for vec_insert (i, v, n) like v[n&3] = i to
unify the gimple code, then expander could use vec_set_optab to expand.
2) Expand the IFN VEC_SET to fast instructions: lvsr+insert+lvsl.
In this way, "vec_insert (i, v, n)" and "v[n&3] = i" won't be expanded too
early in gimple stage if arg2 is variable, avoid generating store hit load
instructions.
For Power9 V4SI:
addi 9,1,-16
rldic 6,6,2,60
stxv 34,-16(1)
stwx 5,9,6
lxv 34,-16(1)
=>
rlwinm 6,6,2,28,29
mtvsrwz 0,5
lvsr 1,0,6
lvsl 0,0,6
xxperm 34,34,33
xxinsertw 34,0,12
xxperm 34,34,32
Though instructions increase from 5 to 7, the performance is improved
60% in typical cases.
Tested with V2DI, V2DF, V4SI, V4SF, V8HI, V16QI on Power9-LE.
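The semantics that both spellings share can be modelled in scalar code.
This is only a sketch: v4si is a std::array stand-in for a 4 x 32-bit
vector and insert_model is a made-up name, not an intrinsic.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using v4si = std::array<int, 4>;  // stand-in for a 4 x 32-bit vector

// Scalar model of vec_insert (i, v, n) for a variable n: the index is
// reduced modulo the element count, i.e. v[n & 3] = i.
v4si insert_model(int i, v4si v, unsigned n)
{
  v[n & 3u] = i;
  return v;
}
```

For example, insert_model (9, {1, 2, 3, 4}, 6) writes lane 2, because
6 & 3 is 2.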
2021-01-22 Xionghu Luo <luoxhu@linux.ibm.com>
gcc/ChangeLog:
PR target/79251
PR target/98065
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
Adjust variable index vec_insert from address dereference to
ARRAY_REF(VIEW_CONVERT_EXPR) tree expression.
* config/rs6000/rs6000-protos.h (rs6000_expand_vector_set_var):
New declaration.
* config/rs6000/rs6000.c (rs6000_expand_vector_set_var): New function.
2021-01-22 Xionghu Luo <luoxhu@linux.ibm.com>
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr79251.p9.c: New test.
* gcc.target/powerpc/pr79251-run.c: New test.
* gcc.target/powerpc/pr79251.h: New header.
The driver checks whether OPT_SPECIAL_input_file options are readable.
There's no need, the compiler proper will do that anyway.
gcc/
* gcc.c (process_command): Don't check OPT_SPECIAL_input_file
existence here.
The previous change exposed a miscompile when trying to interpret
CHREC_RIGHT correctly which in fact it already was to the extent
it is used. The following reverts this part of the change, only
retaining the singling out of HOST_WIDE_INT_MIN.
2021-01-22 Richard Biener <rguenther@suse.de>
PR middle-end/98773
* tree-data-ref.c (initialize_matrix_A): Revert previous
change, retaining failing on HOST_WIDE_INT_MIN CHREC_RIGHT.
* gcc.dg/torture/pr98773.c: New testcase.
In the PR Andrew said he has implemented a simplification that has been
added to LLVM, but that actually is not true, what is in there are
X * (X cmp 0.0 ? +-1.0 : -+1.0) simplifications into +-abs(X)
but what has been added into GCC are (X cmp 0.0 ? +-1.0 : -+1.0)
simplifications into copysign(1, +-X) and then
X * copysign (1, +-X) into +-abs (X).
The problem is with the (X cmp 0.0 ? +-1.0 : -+1.0) simplifications,
they don't work correctly when X is zero.
E.g.
(X > 0.0 ? 1.0 : -1.0)
is -1.0 when X is either -0.0 or 0.0, but copysign will make it return
1.0 for 0.0 and -1.0 only for -0.0.
(X >= 0.0 ? 1.0 : -1.0)
is 1.0 when X is either -0.0 or 0.0, but copysign will make it return
still 1.0 for 0.0 and -1.0 for -0.0.
The simplifications were guarded on !HONOR_SIGNED_ZEROS, but as discussed in
the PR, that option doesn't mean that -0.0 will not ever appear as operand
of some operation, it is hard to guarantee that without compiler adding
canonicalizations of -0.0 to 0.0 after most of the operations and thus
making it very slow, but that the user asserts that he doesn't care if the result
of operations will be 0.0 or -0.0. Not to mention that some of the
transformations are incorrect even for positive 0.0.
So, instead of those simplifications this patch recognizes patterns where
those ?: expressions are multiplied by X, directly into +-abs.
That works fine even for 0.0 and -0.0 (as long as we don't care about
whether the result is exactly 0.0 or -0.0 in those cases), because
whether the result of copysign is -1.0 or 1.0 doesn't matter when it is
multiplied by 0.0 or -0.0.
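The zero cases can be checked directly. The function names below are
made up for the illustration; only std::copysign and std::fabs are real
library calls.

```cpp
#include <cassert>
#include <cmath>

// The conditional from the text: -1.0 for both +0.0 and -0.0.
double sign_by_compare(double x) { return x > 0.0 ? 1.0 : -1.0; }

// The removed replacement: looks only at the sign bit of x.
double sign_by_copysign(double x) { return std::copysign(1.0, x); }

// The new pattern: the product compares equal to fabs (x), zeros included.
double times_sign(double x) { return x * sign_by_compare(x); }
```

sign_by_compare (0.0) and sign_by_copysign (0.0) disagree, while
times_sign (x) and std::fabs (x) compare equal for every non-NaN x,
because -0.0 == 0.0.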
As a follow-up, maybe we should add the simplification mentioned in the PR,
in particular doing copysign by hand through
VIEW_CONVERT_EXPR <int, float_X> < 0 ? -float_constant : float_constant
into copysign (float_constant, float_X). But I think that would need to be
done in phiopt.
2021-01-22 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/90248
* match.pd (X cmp 0.0 ? 1.0 : -1.0 -> copysign(1, +-X),
X cmp 0.0 ? -1.0 : +1.0 -> copysign(1, -+X)): Remove
simplifications.
(X * (X cmp 0.0 ? 1.0 : -1.0) -> +-abs(X),
X * (X cmp 0.0 ? -1.0 : 1.0) -> +-abs(X)): New simplifications.
* gcc.dg/tree-ssa/copy-sign-1.c: Don't expect any copysign
builtins.
* gcc.dg/pr90248.c: New test.
As discussed in the PR, the problem here is that the routines changed in
this patch sign extend the difference of index and low_bound from the
precision of the index, so e.g. when index is unsigned int and contains
value -2U, we treat it as index -2 rather than 0x00000000fffffffeU on 64-bit
arches.
On the other hand, get_inner_reference which is used during expansion, does:
if (! integer_zerop (low_bound))
index = fold_build2 (MINUS_EXPR, TREE_TYPE (index),
index, low_bound);
offset = size_binop (PLUS_EXPR, offset,
size_binop (MULT_EXPR,
fold_convert (sizetype, index),
unit_size));
which effectively requires that either low_bound is constant 0 and then
index in ARRAY_REFs can be arbitrary type which is then sign or zero
extended to sizetype, or low_bound is something else and then index and
low_bound must have compatible types and it is still converted afterwards to
sizetype and from there then a few lines later:
expr.c- if (poly_int_tree_p (offset))
expr.c- {
expr.c: poly_offset_int tem = wi::sext (wi::to_poly_offset (offset),
expr.c- TYPE_PRECISION (sizetype));
The following patch makes those routines match what get_inner_reference is
doing.
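The two extension orders give different offsets for the -2U example. The
sketch below assumes a zero low_bound, a 32-bit unsigned index and a
64-bit sizetype; both function names are hypothetical. The narrowing
cast in the first function relies on modular conversion, which GCC
provides and C++20 guarantees.

```cpp
#include <cassert>
#include <cstdint>

// Old behaviour in the model: sign extend from the 32-bit index precision.
std::int64_t extend_from_index_precision(std::uint32_t index)
{
  return static_cast<std::int64_t>(static_cast<std::int32_t>(index));
}

// get_inner_reference-like behaviour: convert the unsigned index to a
// 64-bit sizetype first (zero extension), then sign extend from 64 bits.
std::int64_t extend_from_sizetype_precision(std::uint32_t index)
{
  return static_cast<std::int64_t>(static_cast<std::uint64_t>(index));
}
```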
2021-01-22 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/98255
* tree-dfa.c (get_ref_base_and_extent): For ARRAY_REFs, sign
extend index - low_bound from sizetype's precision rather than index
precision.
(get_addr_base_and_unit_offset_1): Likewise.
* tree-ssa-sccvn.c (ao_ref_init_from_vn_reference): Likewise.
* gimple-fold.c (fold_const_aggregate_ref_1): Likewise.
* gcc.dg/pr98255.c: New test.
This fixes factor_out_conditional_conversion to avoid creating overlapping
lifetimes for abnormals. It also makes sure we do deal with a conditional
conversion (at least for one PHI arg def) - for the testcase that wasn't the case.
2021-01-22 Richard Biener <rguenther@suse.de>
PR tree-optimization/98786
* tree-ssa-phiopt.c (factor_out_conditional_conversion): Avoid
adding new uses of abnormals. Verify we deal with a conditional
conversion.
* gcc.dg/torture/pr98786.c: New testcase.
gcc/ChangeLog:
PR target/96891
PR target/98348
* config/i386/sse.md (VI_128_256): New mode iterator.
(*avx_cmp<mode>3_1, *avx_cmp<mode>3_2, *avx_cmp<mode>3_3,
*avx_cmp<mode>3_4, *avx2_eq<mode>3, *avx2_pcmp<mode>3_1,
*avx2_pcmp<mode>3_2, *avx2_gt<mode>3): New
define_insn_and_split to lower avx512 vector comparison to avx
version when dest is vector.
(*<avx512>_cmp<mode>3,*<avx512>_cmp<mode>3,*<avx512>_ucmp<mode>3):
define_insn_and_split for negating the comparison result.
* config/i386/predicates.md (float_vector_all_ones_operand):
New predicate.
* config/i386/i386-expand.c (ix86_expand_sse_movcc): Use
general NOT operator without UNSPEC_MASKOP.
gcc/testsuite/ChangeLog:
PR target/96891
PR target/98348
* gcc.target/i386/avx512bw-pr96891-1.c: New test.
* gcc.target/i386/avx512f-pr96891-1.c: New test.
* gcc.target/i386/avx512f-pr96891-2.c: New test.
* gcc.target/i386/avx512f-pr96891-3.c: New test.
* g++.target/i386/avx512f-pr96891-1.C: New test.
* gcc.target/i386/bitwise_mask_op-3.c: Adjust testcase.
Another ICE with delayed noexcept parsing, but a bit gnarlier.
A function definition marked with __attribute__((used)) ought to be
emitted even when it is not referenced in the TU. For a member function
template marked with __attribute__((used)) this means that it will
be instantiated: in instantiate_class_template_1 we have
/* Instantiate members marked with attribute used. */
if (r != error_mark_node && DECL_PRESERVE_P (r))
mark_used (r);
It is not so surprising that this doesn't work well with delayed
noexcept parsing: when we're processing the function template we delay
the parsing, so the member "foo" is found, but then when we're
instantiating it, "foo" hasn't yet been seen, which creates a
discrepancy and a crash ensues. "foo" hasn't yet been seen because
instantiate_class_template_1 just loops over the class members and
instantiates right away.
To make it work, this patch uses a vector to keep track of members
marked with attribute used and uses it to instantiate such members
only after we're done with the class; in particular, after we have
called finish_member_declaration for each member. And we ought to
be verifying that we did emit such members, so I've added a bunch
of dg-finals.
gcc/cp/ChangeLog:
PR c++/97966
* pt.c (instantiate_class_template_1): Instantiate members
marked with attribute used only after we're done instantiating
the class.
gcc/testsuite/ChangeLog:
PR c++/97966
* g++.dg/cpp0x/noexcept63.C: New test.
Both lambda-uneval1.C and lambda-uneval5.C test that a symbol is not
declared global by looking for "globl" assembler directive. The testcases
generate the "lglobl" directive in AIX XCOFF, which is a false positive.
This patch restricts the regex to ignore a prepended "l". The patch
also tightens the regex to specifically look for space, tab or period
between the "globl" and the symbol.
Tested on powerpc-ibm-aix7.2.3.0 and powerpc64le-linux-gnu.
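The matching rule can be sketched with std::regex. The pattern below is
a hypothetical equivalent written for this illustration, not the regex
used in the testsuite, and it assumes the symbol contains no regex
metacharacters.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical pattern, not the one in the testsuite: "globl" that is not
// preceded by "l", followed by a space, tab or period and then the symbol.
bool declares_global(const std::string &line, const std::string &symbol)
{
  const std::regex re("(^|[^l])globl[ \t.]+" + symbol);
  return std::regex_search(line, re);
}
```

A ".globl" line matches, while the AIX ".lglobl" line for the same
symbol does not.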
* g++.dg/cpp2a/lambda-uneval1.C: Ignore preceding "l" and
intervening period.
* g++.dg/cpp2a/lambda-uneval5.C: Ignore preceding "l" and
explicitly check for intervening space, tab or period.
LRA did not extend ira_reg_equiv after generation of a pseudo in
eliminate_regs_in_insn, which might result in an LRA crash. It is better not
to extend ira_reg_equiv but to use a pseudo generated in advance. The
patch implements it.
gcc/ChangeLog:
PR rtl-optimization/98777
* lra-int.h (lra_pmode_pseudo): New extern.
* lra.c (lra_pmode_pseudo): New global.
(lra): Set it up.
* lra-eliminations.c (eliminate_regs_in_insn): Use it.
gcc/testsuite/ChangeLog:
PR rtl-optimization/98777
* gcc.target/riscv/pr98777.c: New.
Suppose we have:
(set (reg/v:TF 63) (mem/c:TF (reg/v:DI 62)))
(set (reg:FPRX2 66) (subreg:FPRX2 (reg/v:TF 63) 0))
It is clearly profitable to propagate the first insn into the second
one and get:
(set (reg:FPRX2 66) (mem/c:FPRX2 (reg/v:DI 62)))
fwprop actually manages to perform this, but doesn't think the result is
worth it, which results in unnecessary store/load sequences on s390.
Improve the situation by classifying SUBREG -> MEM changes as
profitable.
gcc/ChangeLog:
2021-01-15 Ilya Leoshkevich <iii@linux.ibm.com>
* fwprop.c (fwprop_propagation::classify_result): Allow
(subreg (mem)) simplifications.
Here after resolving the address of a template-id inside decltype, we
end up instantiating the chosen specialization (from the call to
mark_used in resolve_nondeduced_context), even though only its type is
needed.
This patch sets cp_unevaluated_operand throughout finish_decltype_type,
so that in particular it's set during the call to
resolve_nondeduced_context within.
gcc/cp/ChangeLog:
PR c++/71879
* semantics.c (finish_decltype_type): Set up a cp_unevaluated
sentinel at the start of the function. Remove a now-redundant
manual adjustment of cp_unevaluated_operand.
gcc/testsuite/ChangeLog:
PR c++/71879
* g++.dg/cpp0x/decltype-71879.C: New test.
One may not use a null this pointer to invoke a static member
function. This fixes the remaining ubsan errors found with an
ubsan bootstrap.
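The kind of call ubsan complains about, and the usual way around it, can
be sketched like this. The type depset_like and both functions are
invented for the example and do not correspond to module.cc.

```cpp
#include <cassert>

struct depset_like
{
  // A static member function needs no object at all.
  static int answer() { return 42; }
};

int call_without_object(depset_like *maybe_null)
{
  // "maybe_null->answer ()" names the function through the pointer,
  // which is what ubsan reports when the pointer is null.  The qualified
  // name reaches the same function without touching the pointer.
  (void) maybe_null;
  return depset_like::answer();
}
```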
PR c++/98624
gcc/cp/
* module.cc (depset::hash::find_dependencies): Add
module arg.
(trees_out::core_vals): Check state before calling
write_location.
(sort_cluster, module_state::write): Adjust
find_dependencies call.
The following testcase is rejected even when it is valid.
The problem is that potential_constant_expression_1 doesn't have the
accurate *jump_target tracking cxx_eval_* has, and when the loop has
a condition that isn't guaranteed to be always true, the body isn't walked
at all. That is mostly a correct conservative behavior, except that it
doesn't detect if there are any return statements in the body, which means
the loop might return instead of falling through to the next statement.
We already have code for return stmt discovery in code snippets we don't
try to evaluate for switches, so this patch reuses that for FOR_STMT
and WHILE_STMT bodies.
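A loop of the shape discussed here, written as a sketch and not taken
from the PR testcase: the condition is not a constant true, and the body
contains a return, so control can leave the function from inside the
loop.

```cpp
#include <cassert>

// The loop condition depends on the arguments, and the body can return,
// so the function may exit from inside the loop or fall through it.
constexpr int first_multiple_of(int k, int limit)
{
  for (int i = 1; i < limit; ++i)
    if (i % k == 0)
      return i;
  return -1;
}

static_assert(first_multiple_of(3, 10) == 3, "loop returned early");
static_assert(first_multiple_of(7, 5) == -1, "fell through the loop");
```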
Note, I haven't touched FOR_EXPR, with statement expressions it could
have return stmts in it too, or it could have break or continue statements
that wouldn't bind to the current loop but to something outer. That
case is clearly mishandled by potential_constant_expression_1 even
when the condition is missing or is always true, and it wouldn't surprise me
if cxx_eval_* didn't handle it right either, so I'm deferring that to
separate PR for later. We'd need proper test coverage for all of that.
> Hmm, IF_STMT probably also needs to check the else clause, if the condition
> isn't a known constant.
You're right, I thought it was ok because it recurses with tf_none, but
if the then branch is potentially constant and only else returns, continues
or breaks, then as the enhanced testcase shows we were mishandling it too.
2021-01-21 Jakub Jelinek <jakub@redhat.com>
PR c++/98672
* constexpr.c (check_for_return_continue_data): Add break_stmt member.
(check_for_return_continue): Also look for BREAK_STMT. Handle
SWITCH_STMT by ignoring break_stmt from its body.
(potential_constant_expression_1) <case FOR_STMT>,
<case WHILE_STMT>: If the condition isn't constant true, check if
the loop body can contain a return stmt.
<case SWITCH_STMT>: Adjust check_for_return_continue_data initializer.
<case IF_STMT>: If recursion with tf_none is successful,
merge *jump_target from the branches - returns with highest priority,
breaks or continues lower. If then branch is potentially constant and
doesn't return, check the else branch if it could return, break or
continue.
* g++.dg/cpp1y/constexpr-98672.C: New test.
The aarch64_sqdml<SBINQOPS:as>l patterns are of the form:
[(set (match_operand:<VWIDE> 0 "register_operand" "=w")
(SBINQOPS:<VWIDE>
(match_operand:<VWIDE> 1 "register_operand" "0")
(ss_ashift:<VWIDE>
(mult:<VWIDE>
(sign_extend:<VWIDE>
(match_operand:VSD_HSI 2 "register_operand" "w"))
(sign_extend:<VWIDE>
(match_operand:VSD_HSI 3 "register_operand" "w")))
(const_int 1))))]
where SBINQOPS is ss_plus and ss_minus. The problem is that for the
ss_plus case the RTL
is not canonical: the (match_operand 1) should be the second arm of the
PLUS.
I've seen this manifest in combine missing some legitimate
simplifications because it generates
the canonical ss_plus form and fails to match the pattern.
This patch splits the patterns into the ss_plus and ss_minus forms with
the canonical form for each.
I've seen this improve my testcase (which I can't include as it's too
large and not easy to test reliably).
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (aarch64_sqdml<SBINQOPS:as>l<mode>):
Split into...
(aarch64_sqdmlal<mode>): ... This...
(aarch64_sqdmlsl<mode>): ... And this.
(aarch64_sqdml<SBINQOPS:as>l_lane<mode>): Split into...
(aarch64_sqdmlal_lane<mode>): ... This...
(aarch64_sqdmlsl_lane<mode>): ... And this.
(aarch64_sqdml<SBINQOPS:as>l_laneq<mode>): Split into...
(aarch64_sqdmlsl_laneq<mode>): ... This...
(aarch64_sqdmlal_laneq<mode>): ... And this.
(aarch64_sqdml<SBINQOPS:as>l_n<mode>): Split into...
(aarch64_sqdmlsl_n<mode>): ... This...
(aarch64_sqdmlal_n<mode>): ... And this.
(aarch64_sqdml<SBINQOPS:as>l2<mode>_internal): Split into...
(aarch64_sqdmlal2<mode>_internal): ... This...
(aarch64_sqdmlsl2<mode>_internal): ... And this.
The following traits can now access non-public members:
- hasMember
- getMember
- getOverloads
- getVirtualMethods
- getVirtualFunctions
This fixes a long-standing issue in D where the allMembers trait would
correctly return non-public members but those non-public members would
be inaccessible to other traits.
Reviewed-on: https://github.com/dlang/dmd/pull/12135
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd 3a7ebef73.
Like all vcmp intrinsics, __arm_vcmpneq_s8 should return a mve_pred16_t.
2021-01-21 Christophe Lyon <christophe.lyon@linaro.org>
gcc/
* config/arm/arm_mve.h (__arm_vcmpneq_s8): Fix return type.
This was a header file that deployed the stat-hack inside a class
(both a member-class and a [non-static data] member had the same
name). Due to the way that's represented in name lookup we missed the
class. Sadly just changing the representation globally has
detrimental effects elsewhere, and this is a rare case, so just
creating a new overload on the fly shouldn't be a problem.
PR c++/98530
gcc/cp/
* name-lookup.c (lookup_class_binding): Rearrange a stat-hack.
gcc/testsuite/
* g++.dg/modules/stat-mem-1.h: New.
* g++.dg/modules/stat-mem-1_a.H: New.
* g++.dg/modules/stat-mem-1_b.C: New.
This removes a trivial whitespace difference between the currently
committed file and the one regenerated by autotools.
libstdc++-v3/ChangeLog:
* src/c++17/Makefile.in: Regenerate.
2021-01-21 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/96320
* decl.c (gfc_match_modproc): It is not an error to find a
module procedure declaration within a contains block.
* expr.c (gfc_check_vardef_context): Pure procedure result is
assignable. Change 'own_scope' accordingly.
* resolve.c (resolve_typebound_procedure): A procedure that
has the module procedure attribute is almost certainly a
module procedure, whatever its interface.
gcc/testsuite/
PR fortran/96320
* gfortran.dg/module_procedure_5.f90 : New test.
* gfortran.dg/module_procedure_6.f90 : New test.
This adds more guards to the VEC_PERM_EXPR scan, namely that
we also could end up with load-lanes and of course no vectorization
at all. Need dependent scans (scan-if-scan-X PASSed ...).
2021-01-21 Richard Biener <rguenther@suse.de>
PR testsuite/97299
* gcc.dg/vect/slp-reduc-3.c: Amend target selectors.
If SRC had been assigned a mode narrower than the copy, we can't
always link DEST into the chain even if they have the same
hard_regno_nregs (e.g. HImode/SImode in the i386 backend).
For example:
kmovw %k0, %edi
vmovd %edi, %xmm2
vpshuflw $0, %xmm2, %xmm0
kmovw %k0, %r8d
kmovd %k0, %r9d
...
- movl %r9d, %r11d
+ vmovd %xmm2, %r11d
gcc/ChangeLog:
PR rtl-optimization/98694
* regcprop.c (copy_value): If SRC had been assigned a mode
narrower than the copy, we can't link DEST into the chain even
if they have the same hard_regno_nregs (e.g. HImode/SImode in the
i386 backend).
gcc/testsuite/ChangeLog:
PR rtl-optimization/98694
* gcc.target/i386/pr98694.c: New test.
g++.dg/warn/Wstringop-overflow-6.C tests for a bogus overflow warning in
system headers. This testcase was generating a -Wchar-subscript warning
on AIX because ctype_inline.h was subscripting AIX _OBJ_DATA using a char.
The _M_table case cast the subscript to unsigned char, but the _OBJ_DATA
case did not.
The investigation also exposed that AIX has added a thread-safe variant
of access to __lc_type that had not been applied to the libstdc++
implementation.
This patch casts the subscript to unsigned char and adds the THREAD_SAFE
variant. libstdc++ is always compiled with pthreads, but it is good
to make the situation explicit and to document the appropriate usage.
Bootstrapped on powerpc-ibm-aix7.2.3.0.
libstdc++-v3/ChangeLog:
* config/os/aix/ctype_inline.h (bool ctype<char>:: is): Cast
_OBJ_DATA subscript to unsigned char. Add _THREAD_SAFE access to
__lc_type.
(const char* ctype<char>:: is): Same.
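A minimal sketch of the subscript issue, not the AIX code itself: the
table, the HYP_UPPER bit and both helper names below are hypothetical
stand-ins for the platform classification data. It shows only why the
subscript is converted to unsigned char before indexing.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

enum { HYP_UPPER = 1 };

/* Fill a hypothetical classification table with one entry per
   unsigned char value (assumes 8-bit char and an ASCII-like
   contiguous 'A'..'Z').  */
static void
hyp_init_table (unsigned short table[UCHAR_MAX + 1])
{
  assert (table != NULL);
  for (int i = 0; i <= UCHAR_MAX; i++)
    table[i] = 0;
  for (int c = 'A'; c <= 'Z'; c++)
    table[c] = HYP_UPPER;
  /* Pretend one high-bit character is upper case too.  */
  table[0xC4] = HYP_UPPER;
}

static int
hyp_is_upper (const unsigned short *table, char c)
{
  /* Where plain char is signed, c may be negative; converting it to
     unsigned char first keeps the subscript within 0..UCHAR_MAX.  */
  return (table[(unsigned char) c] & HYP_UPPER) != 0;
}
```

Without the cast, a negative char would index before the start of the
table, which is what -Wchar-subscripts warns about.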
Adjust the testcase so that the ADD that is expected to overflow cannot
be optimized away.
gcc/testsuite
* gcc.dg/torture/ftrapv-2.c: Make overflow instruction unremovable.
On Wed, Jan 20, 2021 at 05:04:39PM +0100, Florian Weimer wrote:
> Sorry, this appears to cause OpenMP task state corruption in RPM. We
> have only seen this on s390x.
Haven't actually verified it, but my suspicion is that this is a caller
stack corruption.
We play with fire with the GOMP_task API/ABI extensions, the GOMP_task
function used to be:
void
GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) (void *, void *),
long arg_size, long arg_align, bool if_clause, unsigned flags);
and later:
void
GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) (void *, void *),
long arg_size, long arg_align, bool if_clause, unsigned flags,
void **depend);
and later:
void
GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) (void *, void *),
long arg_size, long arg_align, bool if_clause, unsigned flags,
void **depend, int priority);
and now:
void
GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) (void *, void *),
long arg_size, long arg_align, bool if_clause, unsigned flags,
void **depend, int priority, void *detach)
and which of the depend, priority and detach arguments are present depends
on the bits in flags.
I'm afraid the compiler just decided to spill the detach = NULL store in
if ((flags & GOMP_TASK_FLAG_DETACH) == 0)
detach = NULL;
on s390x into the argument stack slot. Not a problem if the caller passes
all those 10 arguments, but if not, it can clobber a random stack location.
This hack should fix it up. Priority doesn't need changing, but I've
changed it anyway just to be safe. With the patch none of the 3 arguments
are ever modified, so I'd hope gcc doesn't decide to spill something
unrelated there.
2021-01-20 Jakub Jelinek <jakub@redhat.com>
* task.c (GOMP_task): Rename priority argument to priority_arg,
add priority automatic variable and modify that variable. Instead of
clearing detach argument when GOMP_TASK_FLAG_DETACH bit is not set,
check flags for that bit.
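The shape of the workaround can be sketched as follows. This is an
illustration under assumptions, not the libgomp code: the flag values
and both function names are hypothetical, and the sketch always passes
every argument, so it does not reproduce the ABI trick itself. The
point is only that the optional parameters are read but never written.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits; the real GOMP_TASK_FLAG_* values differ.  */
enum { HYP_FLAG_PRIORITY = 1 << 0, HYP_FLAG_DETACH = 1 << 1 };

/* Copy the optional argument into an automatic variable instead of
   assigning to the parameter, so the compiler has no reason to store
   into an argument slot the caller may not have allocated.  */
static int
hyp_effective_priority (unsigned flags, int priority_arg)
{
  int priority = 0;
  if (flags & HYP_FLAG_PRIORITY)
    priority = priority_arg;
  return priority;
}

/* Test the flag bit at each use instead of clearing the detach
   parameter up front.  */
static bool
hyp_has_detach (unsigned flags)
{
  return (flags & HYP_FLAG_DETACH) != 0;
}
```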
I'd forgotten that left shifting a negative value is UB until C++20.
Insert some casts to do unsigned shifts.
PR c++/98625
gcc/cp/
* module.cc (bytes_in::i, bytes_in::wi): Avoid left shift of
signed type.
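A small sketch of the cast-to-unsigned approach, written in C, where
left-shifting a negative value is likewise undefined. The helper below
is hypothetical and is not how bytes_in::i or bytes_in::wi decode their
input; it only shows shifts being done on an unsigned type before
converting back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: assemble a signed 32-bit value from four
   big-endian bytes.  All shifts happen on uint32_t, so no negative
   value is ever shifted left.  */
static int32_t
hyp_read_i32_be (const unsigned char *p)
{
  assert (p != NULL);
  uint32_t v = 0;
  for (int i = 0; i < 4; i++)
    v = (v << 8) | p[i];
  /* Converting an out-of-range value to int32_t is
     implementation-defined in older C standards; GCC wraps modulo
     2**32.  */
  return (int32_t) v;
}
```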
In certain intrinsics use cases GCC leaves SETs of a bottom-element vec
select lying around:
(vec_select:DI (reg:V2DI 34 v2 [orig:128 __o ] [128])
(parallel [
(const_int 0 [0])
])))
This can be treated as a simple move in aarch64 when done between SIMD
registers for all normal widths.
These go through the aarch64_get_lane pattern.
This patch adds a splitter there to simplify these extracts to a move
that can, perhaps, be optimised away.
Another benefit is if the destination is memory we can use a simpler STR
instruction rather than ST1-lane.
gcc/
* config/aarch64/aarch64-simd.md (aarch64_get_lane<mode>):
Convert to define_insn_and_split. Split into simple move when moving
bottom element.
gcc/testsuite/
* gcc.target/aarch64/vdup_lane_2.c: Scan for fmov rather than
dup.
One of the advantages of LRA is that you can create new pseudos from it
just fine. The code in rs6000_emit_le_vsx_store was not aware of this.
This patch changes that, in the process fixing PR98549 (where it is
shown that we do call rs6000_emit_le_vsx_store during LRA, which we
used to assert can not happen).
2021-01-20 Segher Boessenkool <segher@kernel.crashing.org>
* config/rs6000/rs6000.c (rs6000_emit_le_vsx_store): Change assert.
Adjust comment. Simplify code.
As mentioned in the PR, with -gdwarf-5 (or -g now) -flto -ffat-lto-objects,
users can't strip the LTO sections with
strip -p -R .gnu.lto_* -R .gnu.debuglto_* -N __gnu_lto_v1
anymore when GCC is configured against recent binutils.
The problem is that in that case .gnu.debuglto_.debug_line_str section is
then used, which is fine for references to strings in .gnu.debuglto_.*
sections, but not when those references are in .debug_info section too;
those should really reference separate strings in .debug_line_str section.
For .gnu.debuglto_.debug_str vs. .debug_str we handle it right, we
reset_indirect_string the strings and thus force creation of new labels for
the second time.
But for DW_FORM_line_strp as the patch shows, there were multiple problems.
First one was that reset_indirect_string, even when called through traverse
on debug_line_str_hash, didn't do anything at all (fixed by first hunk).
The second bug was that the DW_FORM_line_strp strings, which were supposed
to be only visible through debug_line_str_hash, leaked into debug_str_hash
(second hunk).
And the third thing is that when we reset debug_line_str_hash, we should
still make those strings DW_FORM_line_strp if they are accessed.
One could do it by reinstantiating DW_FORM_line_strp right away in
reset_indirect_string and not clear debug_line_str_hash, but that has the
disadvantage that we then force emitting .debug_line_str strings that aren't
really needed - we need those from the CU DIEs' DW_AT_name and
DW_AT_comp_dir attributes, but when emitting .debug_line section through
assembler, we don't need to emit the strings we only needed for
.gnu.debuglto_.debug_line which is always emitted by the compiler.
2021-01-20 Jakub Jelinek <jakub@redhat.com>
PR debug/98765
* dwarf2out.c (reset_indirect_string): Also reset indirect strings
with DW_FORM_line_strp form.
(prune_unused_types_update_strings): Don't add into debug_str_hash
indirect strings with DW_FORM_line_strp form.
(adjust_name_comp_dir): New function.
(dwarf2out_finish): Call it on CU DIEs after resetting
debug_line_str_hash.
Patch cf2ac1c30a, which fixed PR97969, was intended only for targets
that lack a 3-op add insn, but the original patch did not check for
this. This patch adds the check.
gcc/ChangeLog:
PR rtl-optimization/98722
* lra-eliminations.c (eliminate_regs_in_insn): Check that target
has no 3-op add insn to transform insns containing two pluses.
gcc/testsuite/ChangeLog:
PR rtl-optimization/98722
* g++.target/s390/pr98722.C: New.
The following tries to handle overflow in the integer computations
done by lambda ops of dependence analysis by failing instead of
silently continuing with overflowed values.
It also avoids treating large unsigned CHREC_RIGHT as negative
unless the chrec is of pointer type and avoids the most negative
integer value to avoid excessive overflow checking (with this
the fix for PR98758 can be partly simplified as seen).
I've added add_hwi and mul_hwi functions computing the HOST_WIDE_INT
signed sum and product while indicating overflow; they hopefully
get matched to the appropriate internal functions.
I don't have any testcases triggering overflow in any of the
guarded computations.
2021-01-20 Richard Biener <rguenther@suse.de>
* hwint.h (add_hwi): New function.
(mul_hwi): Likewise.
* tree-data-ref.c (initialize_matrix_A): Properly translate
tree constants and avoid HOST_WIDE_INT_MIN.
(lambda_matrix_row_add): Avoid undefined integer overflow
and return true on such overflow.
(lambda_matrix_right_hermite): Handle overflow from
lambda_matrix_row_add gracefully. Simplify previous fix.
(analyze_subscript_affine_affine): Likewise.
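For illustration, overflow-reporting helpers of this kind can be
sketched with the GCC/Clang __builtin_*_overflow built-ins. This is an
assumption about one possible shape, not the add_hwi/mul_hwi
definitions from hwint.h: the names are hypothetical and long long
stands in for HOST_WIDE_INT.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Return the wrapped sum and set *overflow if the mathematical result
   does not fit in long long.  */
static long long
hyp_add_checked (long long a, long long b, bool *overflow)
{
  long long result;
  assert (overflow != NULL);
  *overflow = __builtin_add_overflow (a, b, &result);
  return result;
}

/* Likewise for the product.  */
static long long
hyp_mul_checked (long long a, long long b, bool *overflow)
{
  long long result;
  assert (overflow != NULL);
  *overflow = __builtin_mul_overflow (a, b, &result);
  return result;
}
```

A caller would check the flag after each step and give up on the
analysis instead of continuing with a wrapped value.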
This patch adds patterns for optimizing
x < y || y == XXX_MIN to x <= y-1
x >= y && y != XXX_MIN to x > y-1
if y is an integer with TYPE_OVERFLOW_WRAPS.
This fixes pr96674.
Tested on x86_64-pc-linux-gnu.
For this function
bool f(unsigned a, unsigned b)
{
return (b == 0) | (a < b);
}
the code without the patch is
test esi,esi
sete al
cmp esi,edi
seta dl
or eax,edx
ret
the code with the patch is
sub esi,0x1
cmp esi,edi
setae al
ret
PR tree-optimization/96674
gcc/
* match.pd: New patterns: x < y || y == XXX_MIN --> x <= y - 1
x >= y && y != XXX_MIN --> x > y - 1
gcc/testsuite
* gcc.dg/pr96674.c: New tests.
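The two equivalences can be checked exhaustively for a small wrapping
type. The sketch below uses uint8_t, where XXX_MIN is 0 and y - 1
wraps to 255; the function name is hypothetical and the check is
independent of the match.pd implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Count the (x, y) pairs for which either rewrite disagrees with the
   original expression, over all 8-bit unsigned values.  */
static int
hyp_count_mismatches (void)
{
  int mismatches = 0;
  for (unsigned x = 0; x <= UINT8_MAX; x++)
    for (unsigned y = 0; y <= UINT8_MAX; y++)
      {
        uint8_t a = (uint8_t) x, b = (uint8_t) y;
        uint8_t b1 = (uint8_t) (b - 1);	/* wraps to 255 when b == 0 */
        /* x < y || y == MIN   versus   x <= y - 1  */
        if (((a < b) || (b == 0)) != (a <= b1))
          mismatches++;
        /* x >= y && y != MIN   versus   x > y - 1  */
        if (((a >= b) && (b != 0)) != (a > b1))
          mismatches++;
      }
  return mismatches;
}
```

When y is the minimum, y - 1 wraps to the maximum, so x <= y - 1 is
always true and x > y - 1 is always false, matching the original
expressions.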
Here, during partial instantiation of the generic lambda, we do
tsubst_copy on the CLASS_PLACEHOLDER_TEMPLATE for U{0} which yields a
(level-lowered) TEMPLATE_TEMPLATE_PARM rather than the corresponding
TEMPLATE_DECL. This later confuses do_class_deduction which expects
that a CLASS_PLACEHOLDER_TEMPLATE is always a TEMPLATE_DECL.
gcc/cp/ChangeLog:
PR c++/95434
* pt.c (tsubst) <case TEMPLATE_TYPE_PARM>: If tsubsting
CLASS_PLACEHOLDER_TEMPLATE yields a TEMPLATE_TEMPLATE_PARM,
adjust to its TEMPLATE_TEMPLATE_PARM_TEMPLATE_DECL.
gcc/testsuite/ChangeLog:
PR c++/95434
* g++.dg/cpp2a/lambda-generic9.C: New test.
When parsing the base-clause of a class declaration, we need to defer
access checking until the entire base-clause has been seen, so that
access can be properly checked relative to the scope of the class with
all its bases attached. This allows us to accept the declaration of
struct D from Example 2 of [class.access.general] (access12.C below).
Similarly when substituting into the base-clause of a class template,
which is the subject of PR82613.
gcc/cp/ChangeLog:
PR c++/82613
* parser.c (cp_parser_class_head): Defer access checking when
parsing the base-clause until all bases are seen and attached
to the class type.
* pt.c (instantiate_class_template): Likewise when substituting
into dependent bases.
gcc/testsuite/ChangeLog:
PR c++/82613
* g++.dg/parse/access12.C: New test.
* g++.dg/template/access35.C: New test.
duplicate_and_interleave is the main fallback way of loading
a repeating sequence of elements into variable-length vectors.
The code handles cases in which the number of elements in the
sequence is potentially several times greater than the number
of elements in a vector.
Let:
- NE be the (compile-time) number of elements in the sequence
- NR be the (compile-time) number of vector results and
- VE be the (run-time) number of elements in each vector
The basic approach is to duplicate each element into a
separate vector, giving NE vectors in total, then use
log2(NE) rows of NE permutes to generate NE results.
In the worst case — when VE has no known compile-time factor
and NR >= NE — all of these permutes are necessary. However,
if VE is known to be a multiple of 2**F, then each of the
first F permute rows produces duplicate results; specifically,
the high permute for a given pair is the same as the low permute.
The code dealt with this by reusing the low result for the
high result. This part was OK.
However, having duplicate results from one row meant that the
next row did duplicate work. The redundancies would be optimised
away by later passes, but the code tried to avoid generating them
in the first place. This is the part that went wrong.
Specifically, NR is typically less than NE when some permutes are
redundant, so the code tried to use NR to reduce the amount of work
performed. The problem was that, although it correctly calculated
a conservative bound on how many results were needed in each row,
it chose the wrong results for anything other than the final row.
This doesn't usually matter for fully-packed SVE vectors. We first
try to coalesce smaller elements into larger ones, so normally
VE ends up being 2**VQ (where VQ is the number of 128-bit blocks
in an SVE vector). In that situation we'd only apply the faulty
optimisation to the final row, i.e. the case it handled correctly.
E.g. for things like:
void
f (long *x)
{
for (int i = 0; i < 100; i += 8)
{
x[i] += 1;
x[i + 1] += 2;
x[i + 2] += 3;
x[i + 3] += 4;
x[i + 4] += 5;
x[i + 5] += 6;
x[i + 6] += 7;
x[i + 7] += 8;
}
}
(already tested by the testsuite), we'd have 3 rows of permutes
producing 4 vector results. The scheme produced:
1st row: 8 results from 4 permutes, highs duplicates of lows
2nd row: 8 results from 8 permutes (half of which are actually redundant)
3rd row: 4 results from 4 permutes
However, coalescing elements is trickier for unpacked vectors,
and at the moment we don't try to do it (see the GET_MODE_SIZE
check in can_duplicate_and_interleave_p). Unpacked vectors
therefore stress the code in ways that packed vectors didn't.
The patch fixes this by removing the redundancies from each row,
rather than trying to work around them later. This also removes
the redundant work in the second row of the example above.
gcc/
PR tree-optimization/98535
* tree-vect-slp.c (duplicate_and_interleave): Use quick_grow_cleared.
If the high and low permutes are the same, remove the high permutes
from the working set and only continue with the low ones.
The following patch fixes two bugs in the access_ref::inform_access function
(plus some formatting nits).
The first problem is that ref can be various things, e.g. *_DECL, or
SSA_NAME, or IDENTIFIER_NODE. And allocfn is non-NULL only if ref is
(at least originally) an SSA_NAME initialized to the result of some
allocator function (but not e.g. __builtin_alloca_with_align which is
handled differently).
A few lines above the last hunk of this patch in builtins.c, the code uses
if (mode == access_read_write || mode == access_write_only)
{
if (allocfn == NULL_TREE)
{
if (*offstr)
inform (loc, "at offset %s into destination object %qE of size %s",
offstr, ref, sizestr);
else
inform (loc, "destination object %qE of size %s", ref, sizestr);
return;
}
if (*offstr)
inform (loc,
"at offset %s into destination object of size %s "
"allocated by %qE", offstr, sizestr, allocfn);
else
inform (loc, "destination object of size %s allocated by %qE",
sizestr, allocfn);
return;
}
so if allocfn is NULL, it prints whatever ref is, if it is non-NULL,
it prints instead the allocation function. But strangely the hunk
a few lines below wasn't consistent with that and instead printed the
first form only if DECL_P (ref) and would ICE if ref wasn't a decl but
still allocfn was NULL. Fixed by making it consistent with what the code does
earlier.
Another bug is that the code earlier contains an ugly hack for VLAs and was
assuming that SSA_NAME_IDENTIFIER must be non-NULL on the lhs of
__builtin_alloca_with_align. While that is likely true for the cases where
the compiler emits this builtin for VLAs (and it will also be true that
the name of the VLA in that case can be taken from that identifier up to the
first .), the builtin is user accessible as the testcase shows, so one can
have any other SSA_NAME in there. I think it would be better to add some
more reliable way to identify VLA names corresponding to
__builtin_alloca_with_align allocations, perhaps internal fn or whatever,
but that is beyond the scope of this patch.
2021-01-20 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/98721
* builtins.c (access_ref::inform_access): Don't assume
SSA_NAME_IDENTIFIER must be non-NULL. Print messages about
object whenever allocfn is NULL, rather than only when DECL_P
is true. Use %qE instead of %qD for that. Formatting fixes.
* gcc.dg/pr98721-1.c: New test.
* gcc.dg/pr98721-2.c: New test.
This fixes some int arithmetic issues and a bogus truncation.
2021-01-20 Richard Biener <rguenther@suse.de>
PR tree-optimization/98758
* tree-data-ref.c (int_divides_p): Use lambda_int arguments.
(lambda_matrix_right_hermite): Avoid undefinedness with
signed integer abs and multiplication.
(analyze_subscript_affine_affine): Use lambda_int.
* gcc.dg/torture/pr98758.c: New testcase.
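As a sketch of the abs problem only: negating the most negative signed
value overflows, so one common way to avoid the undefined behaviour is
to compute the magnitude in the unsigned type. The helper name is
hypothetical and this is not the code used in tree-data-ref.c.

```c
#include <assert.h>
#include <limits.h>

/* Magnitude of x as an unsigned value.  The negation is done in
   unsigned arithmetic, which wraps, so LLONG_MIN is handled without
   signed overflow.  */
static unsigned long long
hyp_abs_u (long long x)
{
  unsigned long long ux = (unsigned long long) x;
  return x < 0 ? 0ULL - ux : ux;
}
```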
Similarly to how we handle erroneous operands to e.g. allocate clause,
this change just removes those clauses instead of accessing TYPE_MAIN_VARIANT
of its type, which doesn't work on error_mark_node. Also, just for good
measure, bails out if TYPE_NAME is NULL.
2021-01-20 Jakub Jelinek <jakub@redhat.com>
PR c++/98742
* semantics.c (finish_omp_clauses) <case OMP_CLAUSE_DETACH>: If
error_operand_p, remove clause without further checking. Check
for non-NULL TYPE_NAME.
* c-c++-common/gomp/task-detach-2.c: New test.
PR debug/98751 reports an issue in which most of libgccjit's tests
fail in DWARF 5 handling with
`.Ldebug_loc2' is already defined"
asm errors.
The bogus label is being emitted at the 3rd in-process iteration, at:
31673 ASM_OUTPUT_LABEL (asm_out_file, loc_section_label);
which on the initial iteration emits:
145 │ .Ldebug_loc0:
on the 2nd iteration:
145 │ .Ldebug_loc1:
and on the 3rd iteration:
145 │ .Ldebug_loc2:
which is a duplicate of a label emitted earlier:
138 │ .section .debug_loclists,"",@progbits
139 │ .long .Ldebug_loc3-.Ldebug_loc2
140 │ .Ldebug_loc2:
141 │ .value 0x5
142 │ .byte 0x8
143 │ .byte 0
144 │ .long 0
145 │ .Ldebug_loc2:
The issue seems to be that init_sections_and_labels creates the label
ASM_GENERATE_INTERNAL_LABEL (loc_section_label, DEBUG_LOC_SECTION_LABEL,
generation);
where "generation" is a static local to init_sections_and_labels that
increments, and thus eventually hits the duplicate value.
It appears that this value is intended to be either 0 or 1, but in
the libgccjit case the compilation code can be invoked an arbitrary
number of times in-process, and hence can eventually lead to a
label name collision.
This patch adds code to dwarf2out_c_finalize (called by
toplev::finalize in libgccjit) to reset the generation counts,
fixing the issue.
gcc/ChangeLog:
PR debug/98751
* dwarf2out.c (output_line_info): Rename static variable
"generation", moving it out of the function to...
(output_line_info_generation): New.
(init_sections_and_labels): Likewise, renaming the variable to...
(init_sections_and_labels_generation): New.
(dwarf2out_c_finalize): Reset the new variables.
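A simplified sketch of why the counter was moved out of the function:
a function-local static cannot be reset from a finalize hook, whereas a
file-scope variable can. All names and the label format below are
hypothetical; the real label scheme in dwarf2out.c is more involved.

```c
#include <assert.h>
#include <stdio.h>

/* File scope rather than a static local, so that the finalize hook
   can reach it.  */
static unsigned hyp_generation;

/* Write a label name for the current generation into BUF and return
   the generation number that was used.  */
static unsigned
hyp_make_label (char *buf, size_t size)
{
  assert (buf != NULL && size > 0);
  unsigned used = hyp_generation++;
  snprintf (buf, size, ".Lhyp_loc%u", used);
  return used;
}

/* Called between in-process compilations so that each one starts
   numbering from zero again.  */
static void
hyp_finalize (void)
{
  hyp_generation = 0;
}
```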
This patch re-enables the DWARF5 tests that seem to be functioning again.
It adds a comment to pr41445-7.c that any changes in lines need to be
reflected in the expected output.
The patch also allows for additional failures in ucs.c and reflects that
builtin-sprintf-warn-20.c requires 4 byte wide char support.
gcc/testsuite/ChangeLog:
* gcc.dg/cpp/ucs.c: Expect Invalid warning for 2byte wchar.
* gcc.dg/debug/dwarf2/inline6.c: Remove skip AIX.
* gcc.dg/debug/dwarf2/lang-c11.c: Remove skip AIX.
* gcc.dg/debug/dwarf2/pr41445-7.c: Remove skip AIX.
* gcc.dg/debug/dwarf2/pr41445-8.c: Remove skip AIX.
* gcc.dg/tree-ssa/builtin-sprintf-warn-20.c: Require 4byte wchar.
maybe_instantiate_noexcept doesn't expect to see error_mark_node, but
the new callsite I introduced in r11-6476 can pass error_mark_node to
it. So cope.
gcc/cp/ChangeLog:
PR c++/98659
* pt.c (maybe_instantiate_noexcept): Return false if FN is
error_mark_node.
gcc/testsuite/ChangeLog:
PR c++/98659
* g++.dg/template/deduce8.C: New test.
My recent patch that introduced push_using_decl_bindings didn't
handle USING_DECL redeclaration, therefore things broke. This patch
amends that by breaking a part of finish_nonmember_using_decl
out into a separate function, push_using_decl_bindings, and calling it.
It needs an overload, because name_lookup is only available inside
of name-lookup.c.
gcc/cp/ChangeLog:
PR c++/98687
* name-lookup.c (push_using_decl_bindings): New, broken out of...
(finish_nonmember_using_decl): ...here.
* name-lookup.h (push_using_decl_bindings): Update declaration.
* pt.c (tsubst_expr): Update the call to push_using_decl_bindings.
gcc/testsuite/ChangeLog:
PR c++/98687
* g++.dg/lookup/using64.C: New test.
* g++.dg/lookup/using65.C: New test.
gcc/ChangeLog:
PR middle-end/98664
* tree-ssa-live.c (remove_unused_scope_block_p): Keep scopes for
all functions, even if they're not declared artificial or inline.
* tree.c (tree_inlined_location): Use macro expansion location
only if scope traversal fails to expose one.
gcc/testsuite/ChangeLog:
PR middle-end/98664
* gcc.dg/Wvla-larger-than-4.c: Adjust expected output.
* gcc.dg/plugin/diagnostic-test-inlining-3.c: Same.
* g++.dg/warn/Wfree-nonheap-object-5.C: New test.
* gcc.dg/Wfree-nonheap-object-4.c: New test.
This patch removes a vestigial use of dk_no_check from
cp_parser_late_parsing_for_member, which ideally should have been
removed as part of the PR41437 patch that improved access checking
inside templates. This allows us to correctly reject f1 and f2 in
the testcase access34.C below (whereas before we'd only reject f3).
Additional testing revealed a new access issue when late-parsing a hidden
friend within a class template. In the testcase friend68.C below, we're
tripping over the checking assert from friend_accessible_p(f, S::j, S, S)
during lookup of j in x.j (for which type_dependent_object_expression_p
returns false, which is why we're doing the lookup at parse time). The
reason for the assert failure is that DECL_FRIENDLIST(S) contains f but
DECL_BEFRIENDING_CLASSES(f) is empty, and so friend_accessible_p (which
looks at DECL_BEFRIENDING_CLASSES) wants to return false, but is_friend
(which looks at DECL_FRIENDLIST) returns true.
For sake of symmetry one would expect that DECL_BEFRIENDING_CLASSES(f)
contains S, but add_friend avoids updating DECL_BEFRIENDING_CLASSES when
the class type (S in this case) is dependent, for some reason.
This patch works around this issue by making friend_accessible_p
consider the DECL_FRIEND_CONTEXT of the access scope. Thus we sidestep
the DECL_BEFRIENDING_CLASSES / DECL_FRIENDLIST asymmetry issue while
correctly validating the x.j access at parse time.
An earlier version of this patch checked friend_accessible_p instead of
protected_accessible_p in the DECL_FRIEND_CONTEXT hunk below, but this
had the side effect of making us accept the ill-formed testcase friend69.C
below (ill-formed because the hidden friend g is not actually a member
of A, so g doesn't have access to B's members despite B befriending A).
gcc/cp/ChangeLog:
PR c++/41437
PR c++/58993
* search.c (friend_accessible_p): If scope is a hidden friend
defined inside a dependent class, consider access from the
class.
* parser.c (cp_parser_late_parsing_for_member): Don't push a
dk_no_check access state.
gcc/testsuite/ChangeLog:
PR c++/41437
PR c++/58993
* g++.dg/opt/pr87974.C: Adjust.
* g++.dg/template/access34.C: New test.
* g++.dg/template/friend68.C: New test.
* g++.dg/template/friend69.C: New test.
Since certain members of a class are a complete-class context
[class.mem.general]p7, we delay their parsing until the whole class has
been parsed. For instance, NSDMIs and noexcept-specifiers. The order
in which we perform this delayed parsing matters; we were first parsing
NSDMIs and only then did we parse noexcept-specifiers. That turns out
to be wrong: since NSDMIs may use noexcept-specifiers, we must process
noexcept-specifiers first. Otherwise we'll ICE in code that doesn't
expect to see DEFERRED_PARSE.
This doesn't just shift the problem, noexcept-specifiers can use members
with a NSDMI just fine, and I've also tested a similar test with this
member function:
bool f() { return __has_nothrow_constructor (S<true>); }
and that compiled fine too.
gcc/cp/ChangeLog:
PR c++/98333
* parser.c (cp_parser_class_specifier_1): Perform late-parsing
of noexcept-specifiers before late-parsing of NSDMIs.
gcc/testsuite/ChangeLog:
PR c++/98333
* g++.dg/cpp0x/noexcept62.C: New test.
There's no need for this function to have an object, so make it
static and avoid UB.
PR c++/98624
gcc/cp/
* module.cc (trees_out::write_location): Make static.
memrefs_conflict_p assumes that:
[XB + XO, XB + XO + XS)
does not alias
[YB + YO, YB + YO + YS)
whenever:
[XO, XO + XS)
does not intersect
[YO, YO + YS)
In other words, the accesses can alias only if XB == YB at runtime.
However, this doesn't cope correctly with section anchors.
For example, if XB is an anchor symbol and YB is at offset
XO from the anchor, then:
[XB + XO, XB + XO + XS)
overlaps
[YB, YB + YS)
whatever the value of XO is. In other words, when doing the
alias check for two symbols whose local definitions are in
the same block, we should apply the known difference between
their block offsets to the intersection test above.
gcc/
PR rtl-optimization/92294
* alias.c (compare_base_symbol_refs): Take an extra parameter
and add the distance between two symbols to it. Enshrine in
comments that -1 means "either 0 or 1, but we can't tell
which at compile time".
(memrefs_conflict_p): Update call accordingly.
(rtx_equal_for_memref_p): Likewise. Take the distance between symbols
into account.
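The adjusted intersection test can be sketched as below. This is an
illustration under assumptions, not the alias.c code: both function
names are hypothetical, sizes are assumed positive, and the distance
between the two symbols is assumed to be a known compile-time constant.

```c
#include <assert.h>
#include <stdbool.h>

/* Do the half-open ranges [xo, xo + xs) and [yo, yo + ys)
   intersect?  */
static bool
hyp_ranges_overlap (long xo, long xs, long yo, long ys)
{
  assert (xs > 0 && ys > 0);
  return xo < yo + ys && yo < xo + xs;
}

/* The same test for accesses based on two symbols in one block, where
   y_minus_x is the known distance YB - XB.  Expressing Y's offset
   relative to XB makes the two ranges comparable.  */
static bool
hyp_may_overlap_same_block (long xo, long xs, long yo, long ys,
			    long y_minus_x)
{
  return hyp_ranges_overlap (xo, xs, yo + y_minus_x, ys);
}
```

With XB an anchor, YB placed 16 bytes after it, and 4-byte accesses at
XB + 16 and YB + 0, the unadjusted offsets 16 and 0 look disjoint,
while the adjusted test reports the overlap.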
Hi,
This is a follow-up fix to clean up pr91799. Per review of test results,
it appears that the combination of target and dg-require stanzas is
not sufficient to properly limit the test to 64-bit only on darwin.
This adds an additional dg-require clause to limit the test to 64-bit
environments.
Tested on power7 and power8 using assorted variations of
make -k check-gcc-c "RUNTESTFLAGS=powerpc.exp=pr88233.c
--target_board=unix/'{-mcpu=power7,-mcpu=power6,-mcpu=power8}''{-m32,-m64}'"
PR target/91799
2021-01-19 Will Schmidt <will_schmidt@vnet.ibm.com>
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr88233.c: Update dg- stanzas.
The following avoids ICEing on indirect calls with a fnspec
in modref analysis.
2021-01-19 Richard Biener <rguenther@suse.de>
PR ipa/98330
* ipa-modref.c (analyze_stmt): Only record a summary for a
direct call.
* g++.dg/pr98330.C: New testcase.
* gcc.dg/pr98330.c: Likewise.
Since SSA names do leak into global tree data structures like
TYPE_SIZE or in this case GFC_DECL_SAVED_DESCRIPTOR because of
frontend bugs we have to be careful to wipe references to the
CFG when we deconstruct SSA form because we now do ggc_free that.
2021-01-19 Richard Biener <rguenther@suse.de>
PR middle-end/98638
* tree-ssanames.c (fini_ssanames): Zero SSA_NAME_DEF_STMT.
Enable a define FIX_LEON3FT_TN0018 for the LEON3FT targets affected
by the GRLIB-TN-0018 errata described here:
https://www.gaisler.com/notes
gcc/
* config/sparc/rtemself.h (TARGET_OS_CPP_BUILTINS): Add
built-in define __FIX_LEON3FT_TN0018.
This fixes input_location leaking with an invalid BLOCK from
expand_call_inline to tree_function_versioning via clone
materialization.
2021-01-19 Richard Biener <rguenther@suse.de>
PR ipa/97673
* tree-inline.c (tree_function_versioning): Set input_location
to UNKNOWN_LOCATION throughout the function.
* gfortran.dg/pr97673.f90: New testcase.
IPA-SRA already contains a check to figure out that an otherwise dead
parameter is actually required because of non-call exceptions, but it
is not present at the equivalent spot where SRA figures out whether
the return statement is used for anything useful. This patch adds
that condition there.
Unfortunately, even though this patch should be good enough for any
normal (I'd even say reasonable) use of the compiler, it hints that
when the user manually switches all sorts of DCE, IPA-SRA would
probably leave behind problematic statements manipulating what
originally were return values, just like it does for parameters (PR
93385). Fixing this properly might unfortunately be a separate issue
from the mentioned bug because the LHS of a call is changed during
call redirection and the caller often is not a clone. But I'll see
what I can do.
Meanwhile, the patch below has been bootstrapped and tested on x86_64.
gcc/ChangeLog:
2021-01-18 Martin Jambor <mjambor@suse.cz>
PR ipa/98690
* ipa-sra.c (ssa_name_only_returned_p): New parameter fun. Check
whether non-call exceptions allow removal of a statement.
(isra_analyze_call): Pass the appropriate function to
ssa_name_only_returned_p.
gcc/testsuite/ChangeLog:
2021-01-18 Martin Jambor <mjambor@suse.cz>
PR ipa/98690
* g++.dg/ipa/pr98690.C: New test.
Think about this case:
./multilib-generator rv32imc-ilp32-rv32imac,rv32imacxthead-f
Here are 2 problems:
1. An unexpected 'xtheadf' extension was made.
2. The arch 'rv32imac' was not created.
This modification fixes these two, and also sorts 'multi-letter'.
gcc/ChangeLog:
* config/riscv/arch-canonicalize (longext_sort): New function for
sorting 'multi-letter'.
* config/riscv/multilib-generator: Adjust the loop of 'alt' in
'alts'. The 'arch' may not be the first of 'alts'.
(_expand_combination): Add underscore for the 'ext' without '*'.
This is because a single-letter extension can always be handled
correctly with a '_' prefix, but it cannot be separated out if it is
appended to a multi-letter extension.
This change reads go:embed directives and attaches them to variables.
We still don't do anything with the directives.
This change also reads the file passed in the -fgo-embedcfg option.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/281533
PR debug/98716
* dwarf.c (read_v2_paths): Allocate zero entry for dirs and
filenames.
(read_line_program): Remove parameter u, change caller. Don't
subtract one from dirs and filenames index.
(read_function_entry): Don't subtract one from filenames index.
PPC64 can generate jumps with clobbered pseudo-regs, and a BB with
such a jump can have abnormal output edges. IRA hits an assert when trying
to split an abnormal critical edge to deal with asm goto output reloads
later. The patch just skips splitting abnormal edges. It is assumed
that an asm goto with output reloads cannot be in a BB with abnormal output edges.
gcc/ChangeLog:
PR target/97847
* ira.c (ira): Skip abnormal critical edge splitting.
After r11-6614 made cp_walk_subtrees walk into the template of a CTAD
placeholder, we now correctly accept the below testcase. We used to
reject it because find_parameter_packs_r would fail to find the
parameter pack Ts inside the CTAD placeholder within the pack expansion.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/class-deduction77.C: New test.
I forgot one line, which means that if the second operand of the multiplication
isn't constant, it would be just the same as the first one.
2021-01-18 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/98727
* tree-ssa-math-opts.c (match_arith_overflow): Fix up computation of
second .MUL_OVERFLOW operand for signed multiplication with overflow
checking if the second operand of multiplication is not constant.
* gcc.c-torture/execute/pr98727.c: New test.
In dce6c58db8 msebor extended the
"malloc" attribute to support user-defined allocator/deallocator
pairs.
This patch extends the "malloc" checker within -fanalyzer to use
these attributes. It is based on an earlier patch:
'RFC: add "deallocated_by" attribute for use by analyzer'
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555544.html
which added a different attribute. The patch needed a lot of reworking
to support multiple deallocators per allocator.
My hope was that this would provide a minimal level of markup that would
support library-checking without requiring lots of further markup.
I attempted to use this to detect a memory leak within a Linux
driver (CVE-2019-19078), by adding the attribute to mark these fns:
extern struct urb *usb_alloc_urb(int iso_packets, gfp_t mem_flags);
extern void usb_free_urb(struct urb *urb);
where there is a leak of a "urb" on an error-handling path.
Unfortunately I ran into the problem that there are various other fns
that take "struct urb *" and the analyzer conservatively assumes that a
urb passed to them might or might not be freed and thus stops tracking
state for them.
Hence this will only detect issues for the simplest cases (without
adding another attribute).
gcc/analyzer/ChangeLog:
* analyzer.h (is_std_named_call_p): New decl.
* diagnostic-manager.cc (path_builder::get_sm): New.
(state_change_event_creator::state_change_event_creator): Add "pb"
param.
(state_change_event_creator::on_global_state_change): Don't consider
state changes affecting other state_machines.
(state_change_event_creator::on_state_change): Likewise.
(state_change_event_creator::m_pb): New field.
(diagnostic_manager::add_events_for_eedge): Pass pb to visitor
ctor.
* region-model-impl-calls.cc
(region_model::impl_deallocation_call): New.
* region-model.cc: Include "attribs.h".
(region_model::on_call_post): Handle fndecls referenced by
__attribute__((deallocated_by(FOO))).
* region-model.h (region_model::impl_deallocation_call): New decl.
* sm-malloc.cc: Include "stringpool.h" and "attribs.h". Add
leading comment.
(class api): Delete.
(enum resource_state): Update comment for change from api to
deallocator and deallocator_set.
(allocation_state::allocation_state): Drop api param. Add
"deallocators" and "deallocator".
(allocation_state::m_api): Drop field in favor of...
(allocation_state::m_deallocators): New field.
(allocation_state::m_deallocator): New field.
(enum wording): Add WORDING_DEALLOCATED.
(struct deallocator): New.
(struct standard_deallocator): New.
(struct custom_deallocator): New.
(struct deallocator_set): New.
(struct custom_deallocator_set): New.
(struct standard_deallocator_set): New.
(struct deallocator_set_map_traits): New.
(malloc_state_machine::m_malloc): Drop field
(malloc_state_machine::m_scalar_new): Likewise.
(malloc_state_machine::m_vector_new): Likewise.
(malloc_state_machine::m_free): New field
(malloc_state_machine::m_scalar_delete): Likewise.
(malloc_state_machine::m_vector_delete): Likewise.
(malloc_state_machine::deallocator_map_t): New typedef.
(malloc_state_machine::m_deallocator_map): New field.
(malloc_state_machine::deallocator_set_cache_t): New typedef.
(malloc_state_machine::m_custom_deallocator_set_cache): New field.
(malloc_state_machine::custom_deallocator_set_map_t): New typedef.
(malloc_state_machine::m_custom_deallocator_set_map): New field.
(malloc_state_machine::m_dynamic_sets): New field.
(malloc_state_machine::m_dynamic_deallocators): New field.
(api::api): Delete.
(deallocator::deallocator): New ctor.
(deallocator::hash): New.
(deallocator::dump_to_pp): New.
(deallocator::cmp): New.
(deallocator::cmp_ptr_ptr): New.
(standard_deallocator::standard_deallocator): New ctor.
(deallocator_set::deallocator_set): New ctor.
(deallocator_set::dump): New.
(custom_deallocator_set::custom_deallocator_set): New ctor.
(custom_deallocator_set::contains_p): New.
(custom_deallocator_set::maybe_get_single): New.
(custom_deallocator_set::dump_to_pp): New.
(standard_deallocator_set::standard_deallocator_set): New ctor.
(standard_deallocator_set::contains_p): New.
(standard_deallocator_set::maybe_get_single): New.
(standard_deallocator_set::dump_to_pp): New.
(start_p): New.
(class mismatching_deallocation): Update for conversion from api
to deallocator_set and deallocator.
(double_free::emit): Use %qs.
(class use_after_free): Update for conversion from api to
deallocator_set and deallocator.
(malloc_leak::describe_state_change): Only emit "allocated here" on
a start->nonnull transition, rather than on other transitions to
nonnull.
(allocation_state::dump_to_pp): Update for conversion from api to
deallocator_set.
(allocation_state::get_nonnull): Likewise.
(malloc_state_machine::malloc_state_machine): Likewise.
(malloc_state_machine::~malloc_state_machine): New.
(malloc_state_machine::add_state): Update for conversion from api
to deallocator_set.
(malloc_state_machine::get_or_create_custom_deallocator_set): New.
(malloc_state_machine::maybe_create_custom_deallocator_set): New.
(malloc_state_machine::get_or_create_deallocator): New.
(malloc_state_machine::on_stmt): Update for conversion from api
to deallocator_set. Handle "__attribute__((malloc(FOO)))", and
the special attribute set on FOO.
(malloc_state_machine::on_allocator_call): Update for conversion
from api to deallocator_set. Add "returns_nonnull" param and use
it to affect which state to transition to.
(malloc_state_machine::on_deallocator_call): Update for conversion
from api to deallocator_set.
gcc/ChangeLog:
* attribs.h (fndecl_dealloc_argno): New decl.
* builtins.c (call_dealloc_argno): Split out second half of
function into...
(fndecl_dealloc_argno): New.
* doc/extend.texi (Common Function Attributes): Document the
interaction between the analyzer and the malloc attribute.
* doc/invoke.texi (Static Analyzer Options): Likewise.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/attr-malloc-1.c: New test.
* gcc.dg/analyzer/attr-malloc-2.c: New test.
* gcc.dg/analyzer/attr-malloc-4.c: New test.
* gcc.dg/analyzer/attr-malloc-5.c: New test.
* gcc.dg/analyzer/attr-malloc-6.c: New test.
* gcc.dg/analyzer/attr-malloc-CVE-2019-19078-usb-leak.c: New test.
* gcc.dg/analyzer/attr-malloc-misuses.c: New test.
libstdc++-v3/ChangeLog:
PR libstdc++/98725
* testsuite/20_util/unique_ptr/io/lwg2948.cc: Do not try to
write to a wide character stream if wide character support is
disabled in the library.
Support for loop SLP splitting exposed that slp-11b.c has
folding that breaks SLP discovery which isn't what was intended
when the testcase was written. The following makes it SLP-able so
that it "only" runs into the issue that a load permutation is required,
and tries to adjust the target selectors accordingly.
2021-01-18 Richard Biener <rguenther@suse.de>
PR testsuite/97494
* gcc.dg/vect/slp-11b.c: Adjust.
These two tests need:
dg-require-effective-target arm_crypto_ok
dg-add-options arm_crypto
because they use intrinsics that need -mfpu=crypto-neon-fp-armv8.
2021-01-18 Christophe Lyon <christophe.lyon@linaro.org>
gcc/testsuite/
PR target/71233
* gcc.target/arm/simd/vceqz_p64.c: Use arm_crypto options.
* gcc.target/arm/simd/vceqzq_p64.c: Likewise.
This avoids looking for permute optimization when SLP cannot be applied.
2021-01-18 Richard Biener <rguenther@suse.de>
PR testsuite/97299
* gcc.dg/vect/slp-reduc-3.c: Guard VEC_PERM_EXPR scan.
As mentioned in the PR, since the switch to DWARF5 by default instead of
DWARF4, gcc fails to build when configured against recent binutils.
The problem is that cxx11-ios_failure* is built in separate steps,
-S compilation (with -g -O2) followed by some sed and followed by
-c -g -O2 -g0 assembly. When gcc is configured against recent binutils
and DWARF5 is the default, we emit .file 0 "..." directive on which the
assembler then fails (unless --gdwarf-5 is passed to it, but we don't want
that generally because on the other side older assemblers don't like -g*
passed to them when invoked on a *.s file with compiler-generated debug info).
I hope the bug will be fixed soon on the binutils side, but it would be nice
to have a workaround.
The following patch is one of the possibilities, another one is to do that
but add configure check for whether it is needed,
essentially
echo 'int main () { return 0; }' > conftest.c
${CXX} ${CXXFLAGS} -g -O2 -S conftest.c -o conftest.s
${CXX} ${CXXFLAGS} -g -O2 -g0 -c conftest.s -o conftest.o
and if the last command fails, we need that -gno-as-loc-support.
Yet another option, I think, would be to do a different check: whether
${CXX} ${CXXFLAGS} -g -O2 -S conftest.c -o conftest.s
${CXX} ${CXXFLAGS} -g -O2 -c conftest.s -o conftest.o
works and if yes, don't add the -g0 to cxx11-ios_failure*.s assembly.
2021-01-18 Jakub Jelinek <jakub@redhat.com>
PR debug/98708
* src/c++11/Makefile.am (cxx11-ios_failure-lt.s, cxx11-ios_failure.s):
Compile with -gno-as-loc-support.
* src/c++11/Makefile.in: Regenerated.
This patch introduces gomp_sem_getcount wrapper, which uses sem_getvalue
for POSIX and atomic loads for linux futex and accel. rtems for now
remains broken.
2021-01-18 Jakub Jelinek <jakub@redhat.com>
* config/linux/sem.h (gomp_sem_getcount): New function.
* config/posix/sem.h (gomp_sem_getcount): New function.
* config/posix/sem.c (gomp_sem_getcount): New function.
* config/accel/sem.h (gomp_sem_getcount): New function.
* task.c (task_fulfilled_p): Use gomp_sem_getcount.
(omp_fulfill_event): Likewise.
Recent code generation changes have affected the count of some instructions.
This patch updates the instruction count for fold-vec-extract on P7 and P8.
Also, some of SSE emulation intrinsics only work on LE systems.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/fold-vec-extract-char.p7.c: Adjust addi count.
* gcc.target/powerpc/fold-vec-extract-double.p7.c: Same.
* gcc.target/powerpc/fold-vec-extract-float.p7.c: Same.
* gcc.target/powerpc/fold-vec-extract-float.p8.c: Same.
* gcc.target/powerpc/fold-vec-extract-int.p7.c: Same.
* gcc.target/powerpc/fold-vec-extract-int.p8.c: Same.
* gcc.target/powerpc/fold-vec-extract-short.p7.c: Same.
* gcc.target/powerpc/fold-vec-extract-short.p8.c: Same.
* gcc.target/powerpc/sse-andnps-1.c: Restrict to LE.
* gcc.target/powerpc/sse-movhps-1.c: Restrict to LE.
* gcc.target/powerpc/sse-movlps-1.c: Restrict to LE.
* gcc.target/powerpc/sse2-andnpd-1.c: Restrict to LE.
AIX does not support DWARF 5.
This patch skips the DWARF 5-specific testcases.
gcc/testsuite/ChangeLog:
* g++.dg/debug/dwarf2/inline-ns-2.C: Skip on AIX.
* g++.dg/debug/dwarf2/inline-var-2.C: Skip on AIX.
* g++.dg/debug/dwarf2/inline-var-3.C: Skip on AIX.
* g++.dg/debug/dwarf2/lang-cpp11.C: Skip on AIX.
* g++.dg/debug/dwarf2/lang-cpp14.C: Skip on AIX.
* g++.dg/debug/dwarf2/lang-cpp17.C: Skip on AIX.
* g++.dg/debug/dwarf2/lang-cpp20.C: Skip on AIX.
* gcc.dg/debug/dwarf2/inline6.c: Skip on AIX.
* gcc.dg/debug/dwarf2/lang-c11.c: Skip on AIX.
* gcc.dg/debug/dwarf2/pr41445-7.c: Skip on AIX.
* gcc.dg/debug/dwarf2/pr41445-8.c: Skip on AIX.
GCC now defaults to DWARF 5. AIX only supports DWARF 4 (3.5).
This patch overrides the default DWARF version to 4 unless explicitly
stated.
gcc/ChangeLog:
* config/rs6000/aix71.h (SUBTARGET_OVERRIDE_OPTIONS): Override
dwarf_version to 4.
* config/rs6000/aix72.h (SUBTARGET_OVERRIDE_OPTIONS): Same.
After switching to materialization of clones on demand, the verifier
can happen to see edges leading to a clone of a materialized clone.
This means its clone_of is NULL and former_clone_of needs to be
checked in order to verify that the callee is a clone of the original
decl, which it did not do and so reported edges pointing to a wrong
place.
Fixed with the following patch, which has been pre-approved by Honza.
Bootstrapped and tested on x86_64-linux, pushed to master.
Martin
gcc/ChangeLog:
2021-01-15 Martin Jambor <mjambor@suse.cz>
PR ipa/98222
* cgraph.c (clone_of_p): Check also former_clone_of as we climb
the clone tree.
gcc/testsuite/ChangeLog:
2021-01-15 Martin Jambor <mjambor@suse.cz>
PR ipa/98222
* gcc.dg/ipa/pr98222.c: New test.
2021-01-16 Jakub Jelinek <jakub@redhat.com>
* gfortran.dg/iso_fortran_binding_uint8_array_driver.c: Include
../../../libgfortran/ISO_Fortran_binding.h rather than
ISO_Fortran_binding.h.
This multilib supports Nios II configurations with the "Nios II Floating
Point Hardware 2 Component".
gcc/
* config/nios2/t-rtems: Reset all MULTILIB_* variables. Shorten
multilib directory names. Use MULTILIB_REQUIRED instead of
MULTILIB_EXCEPTIONS. Add -mhw-mul -mhw-mulx -mhw-div
-mcustom-fpu-cfg=fph2 multilib.
The new -mcustom-fpu-cfg=fph2 option variant is useful to build a
multilib for the "Nios II Floating Point Hardware 2 Component":
https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/ug/ug_nios2_custom_instruction.pdf
Directly using the corresponding -mcustom-insn=N options for this
floating-point unit leads to a combinatorial explosion in the potential
count of multilibs which may break the build.
gcc/
* config/nios2/nios2.c (NIOS2_FPU_CONFIG_NUM): Adjust value.
(nios2_init_fpu_configs): Provide register values for new
-mcustom-fpu-cfg=fph2 option variant.
* doc/invoke.texi (-mcustom-fpu-cfg=fph2): Document new option
variant.
Do not warn if custom instructions are not used due to missing
optimization flags. This prevents build errors with -Werror which
cannot be disabled via a dedicated warning option.
One reason to remove these warnings is to enable a multilib for the
"Nios II Floating Point Hardware 2 Component". For example, the
libatomic target library in GCC is built with -Werror and the warnings
removed by this patch resulted in errors like:
cc1: error: switch '-mcustom-fmins' has no effect unless '-ffinite-math-only' is specified [-Werror]
cc1: error: switch '-mcustom-fmaxs' has no effect unless '-ffinite-math-only' is specified [-Werror]
cc1: error: switch '-mcustom-round' has no effect unless '-fno-math-errno' is specified [-Werror]
gcc/
* config/nios2/nios2.c (nios2_custom_check_insns): Remove
custom instruction warnings.
While we already had a ((1 << x) & 1) != 0 to x == 0 optimization,
this patch adds a ((cst << x) & 1) optimization too. This time the
second constant must be 1, not just any power of two, but the first
one can be any constant. If it is even, the result is false; if it is
odd, the result is x == 0.
2021-01-16 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/96669
* match.pd ((CST << x) & 1 -> x == 0): New simplification.
* gcc.dg/tree-ssa/pr96669-1.c: Adjust regexp.
* gcc.dg/tree-ssa/pr96669-2.c: New test.
On the following testcase, handle_builtin_memcmp in the strlen pass folds
the memcmp into comparison of two MEM_REFs. But nothing triggers updating
of addressable vars afterwards, so even when the parameters are no longer
address taken, we force the parameters to stack and back anyway.
This patch causes TODO_update_address_taken to happen right before last forwprop
pass (at the end of last cd_dce), so after strlen1 too.
2021-01-16 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/96271
* passes.def: Pass false argument to first two pass_cd_dce
instances and true to last instance. Add comment that
last instance rewrites no longer addressed locals.
* tree-ssa-dce.c (pass_cd_dce): Add update_address_taken_p member and
initialize it.
(pass_cd_dce::set_pass_param): New method.
(pass_cd_dce::execute): Return TODO_update_address_taken from
last cd_dce instance.
* gcc.target/i386/pr96271.c: New test.
-fcf-protection is automatically enabled in libstdc++ on Linux/x86.
Starting from
commit 77d372abec
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Thu Jan 14 05:56:46 2021 -0800
x86: Error on -fcf-protection with incompatible target
GCC issues an error on -fcf-protection with incompatible target:
... -fcf-protection ... libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit-hle.cc -m32 -O2 -g0 -fno-exceptions -fno-asynchronous-unwind-tables -march=i486 ...
cc1plus: error: '-fcf-protection' is not compatible with this target
FAIL: 29_atomics/atomic_flag/test_and_set/explicit-hle.cc (test for excess errors)
Add -fcf-protection=none to -march=i486 to compile explicit-hle.cc.
* testsuite/29_atomics/atomic_flag/test_and_set/explicit-hle.cc:
Add -fcf-protection=none to -march=i486.
Unlike the other global variables, it is not reset at the beginning of a
function so can leak into the next one.
gcc/ChangeLog:
* final.c (final_start_function_1): Reset force_source_line.
libgfortran/ChangeLog:
* runtime/ISO_Fortran_binding.c (CFI_establish): Fixed signed
char arrays. Signed char or uint8_t arrays would cause
crashes unless an element size is specified.
gcc/testsuite/ChangeLog:
* gfortran.dg/iso_fortran_binding_uint8_array.f90: New test.
* gfortran.dg/iso_fortran_binding_uint8_array_driver.c: New test.
This was an assert that was too picky. The reason I had to alter
array construction was that on stream in, we cannot dynamically determine
a type's dependentness. Thus on stream out of the 'problematic' types,
we save the dependentness for reconstruction. Fortunately the paths into
cp_build_qualified_type_real from streamin with arrays do have the array's
dependentness set as needed.
PR c++/98538
gcc/cp/
* tree.c (cp_build_qualified_type_real): Propagate an array's
dependentness to the copy, if known.
gcc/testsuite/
* g++.dg/template/pr98538.C: New.
I missed some testsuite fallout with my patch to fix mkdeps file
mangling.
PR preprocessor/95253
gcc/testsuite/
* g++.dg/modules/dep-1_a.C: Adjust expected output.
* g++.dg/modules/dep-1_b.C: Likewise.
* g++.dg/modules/dep-2.C: Likewise.
The following patch generalizes the PR64309 simplifications, so that instead
of working only with constants 1 and 1 it works with any two power of two
constants, and works also for right shift (in that case it rules out the
first one being negative, as it is arithmetic shift then).
2021-01-15 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/96669
* match.pd (((1 << A) & 1) != 0 -> A == 0,
((1 << A) & 1) == 0 -> A != 0): Generalize for 1s replaced by
possibly different power of two constants and to right shift too.
* gcc.dg/tree-ssa/pr96669-1.c: New test.
This patch simplifies comparisons that test the sign bit xored together.
If the comparisons are both < 0 or both >= 0, then we should xor the operands
together and compare the result to < 0, if the comparisons are different,
we should compare to >= 0.
2021-01-15 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/96681
* match.pd ((x < 0) ^ (y < 0) to (x ^ y) < 0): New simplification.
((x >= 0) ^ (y >= 0) to (x ^ y) < 0): Likewise.
((x < 0) ^ (y >= 0) to (x ^ y) >= 0): Likewise.
((x >= 0) ^ (y < 0) to (x ^ y) >= 0): Likewise.
* gcc.dg/tree-ssa/pr96681.c: New test.
The -dumpbase and -dumpdir options are excluded from the producer
string output in debug information, but -dumpbase-ext was not. This
patch excludes it as well.
for gcc/ChangeLog
* opts.c (gen_command_line_string): Exclude -dumpbase-ext.
Here, initializing from { } implies a call to the default constructor for
base. We were then seeing that we're initializing a base subobject, so we
tried to copy the result of that call. This is clearly wrong; we should
initialize the base directly from its default constructor.
This patch does a lot of refactoring of unsafe_copy_elision_p and adds
make_safe_copy_elision that will also try to do the base constructor
rewriting from the last patch.
gcc/cp/ChangeLog:
PR c++/98642
* call.c (unsafe_return_slot_p): Return int.
(init_by_return_slot_p): Split out from...
(unsafe_copy_elision_p): ...here.
(unsafe_copy_elision_p_opt): New name for old meaning.
(build_over_call): Adjust.
(make_safe_copy_elision): New.
* typeck2.c (split_nonconstant_init_1): Elide copy from safe
list-initialization.
* cp-tree.h: Adjust.
gcc/testsuite/ChangeLog:
PR c++/98642
* g++.dg/cpp1z/elide5.C: New test.
While working on PR98642 I noticed that in this testcase we were eliding the
copy, calling the complete default constructor to initialize the B base
subobject, and therefore wrongly initializing the non-existent A subobject
of B. The test doesn't care whether the copy is elided or not, but checks
that we are actually calling a base constructor for B.
The patch preserves the elision, but changes the initializer to call the
base constructor instead of the complete constructor.
gcc/cp/ChangeLog:
* call.c (base_ctor_for, make_base_init_ok): New.
(build_over_call): Use make_base_init_ok.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/elide4.C: New test.
build_vec_init_elt models initialization from some arbitrary object of the
type, i.e. copy, but in the case of list-initialization we don't do a copy
from the elements, we initialize them directly.
gcc/cp/ChangeLog:
PR c++/63707
* tree.c (build_vec_init_expr): Don't call build_vec_init_elt
if we got a CONSTRUCTOR.
gcc/testsuite/ChangeLog:
PR c++/63707
* g++.dg/cpp0x/initlist-array13.C: New test.
Use __builtin_alloca. Some systems don't have alloca.h or alloca.
Co-Authored-By: Olivier Hainque <hainque@adacore.com>
for gcc/testsuite/ChangeLog
* gcc.dg/analyzer/alloca-leak.c: Drop alloca.h, use builtin.
* gcc.dg/analyzer/data-model-1.c: Likewise.
* gcc.dg/analyzer/malloc-1.c: Likewise.
* gcc.dg/analyzer/malloc-paths-8.c: Likewise.
The fix for this PR didn't come with any test coverage, I've added
tests that make sure we optimize it no matter what order of the x ^ y ^ z
operands is used.
2021-01-15 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/96671
* gcc.dg/tree-ssa/pr96671-1.c: New test.
* gcc.dg/tree-ssa/pr96671-2.c: New test.
In one of the selftests in g:f10960558540636800cf5d3d6355969621fbc17e
I didn't consider that paths can contain backslashes, which happens
for the tempfiles on Windows hosts.
gcc/ChangeLog:
PR bootstrap/98696
* diagnostic.c
(selftest::test_print_parseable_fixits_bytes_vs_display_columns):
Escape the tempfile name when constructing the expected output.
Ok, here is an updated patch which fixes what I found, and implements what
has been discussed on the mailing list and on IRC, i.e. if the types
are compatible as well as alias sets are same, then it prints
what c_fold_indirect_ref_for_warn managed to create, otherwise it uses
that info for printing offsets using offsetof (except when it starts
with ARRAY_REFs, because one can't have offsetof (struct T[2][2], [1][0].x.y)).
The uninit-38.c test (which was the only one I believe which had tests on the
exact spelling of MEM_REF printing) contains mainly changes to have space
before * for pointer types (as that is how the C pretty-printers normally
print types, int * rather than int*), plus what might be considered a
regression from what Martin printed, but it is actually a correctness fix.
When the arg is a pointer with type pointer to VLA with char element type
(let's say the pointer is p), which is what happens in several of the
uninit-38.c tests, omitting the (char *) cast is incorrect, as p + 1
is not the 1 byte after p, but pointer to the end of the VLA.
It only happened to work because of the hacks (which I don't like at all
and are dangerous, DECL_ARTIFICIAL var names with dot inside can be pretty
much anything, e.g. a lot of passes construct their helper vars from some
prefix that designates intended use of the var plus numeric suffix), where
the a.1 pointer to VLA is printed as a which if one is lucky happens to be
a variable with VLA type (rather than pointer to it), and for such vars
a + 1 is indeed &a[0] + 1 rather than &a + 1. But if we want to do this
reliably, we'd need to make sure it comes from VLA (e.g. verify that the
SSA_NAME is defined to __builtin_alloca_with_align and that there exists
a corresponding VAR_DECL with DECL_VALUE_EXPR that has the a.1 variable
in it).
2021-01-15 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/98597
* c-pretty-print.c: Include options.h.
(c_fold_indirect_ref_for_warn): New function.
(print_mem_ref): Use it. If it returns something that has compatible
type and is TBAA compatible with zero offset, print it and return,
otherwise print it using offsetof syntax or array ref syntax. Fix up
printing if MEM_REFs first operand is ADDR_EXPR, or when the first
argument has pointer to array type. Print pointers using the standard
formatting.
* gcc.dg/uninit-38.c: Expect a space in between type name and asterisk.
Expect for now a (char *) cast for VLAs.
* gcc.dg/uninit-40.c: New test.
The PR98597 patch regresses on _Atomic-3.c, as in the C FE building an
array type with qualified elements results in a type incompatible with
the one obtained when an array type with unqualified elements is
qualified afterwards.
This patch adds a workaround for that.
2021-01-15 Jakub Jelinek <jakub@redhat.com>
* c-typeck.c (c_finish_omp_clauses): For reduction build array with
unqualified element type and then call c_build_qualified_type on the
ARRAY_TYPE.
This patch reimplements some more intrinsics using RTL builtins in the
straightforward way.
Thankfully most of the RTL infrastructure is already in place for it.
gcc/
* config/aarch64/aarch64-simd.md (*aarch64_<su>mlsl_hi<mode>):
Rename to...
(aarch64_<su>mlsl_hi<mode>): ... This.
(aarch64_<su>mlsl_hi<mode>): Define.
(*aarch64_<su>mlsl<mode>): Rename to...
(aarch64_<su>mlsl<mode>): ... This.
* config/aarch64/aarch64-simd-builtins.def (smlsl, umlsl,
smlsl_hi, umlsl_hi): Define builtins.
* config/aarch64/arm_neon.h (vmlsl_high_s8, vmlsl_high_s16,
vmlsl_high_s32, vmlsl_high_u8, vmlsl_high_u16, vmlsl_high_u32,
vmlsl_s8, vmlsl_s16, vmlsl_s32, vmlsl_u8,
vmlsl_u16, vmlsl_u32): Reimplement with builtins.
-fsyntax-only is handled specially in the driver and causes it to add
'-o /dev/null' (or a suitable OS-specific variant thereof). PCH is
handled in the language driver. I'd not sufficiently protected the
-fmodule-only action of adding a dummy assembler from the actions of
-fsyntax-only, so we ended up with two -o options.
PR c++/98591
gcc/cp/
* lang-specs.h: Fix handling of -fmodule-only with -fsyntax-only.
This patch adds a small target-specific pass to remove redundant SVE
PTEST instructions. There are two important uses of this:
- Removing PTESTs after WHILELOs (PR88836). The original testcase
no longer exhibits the problem due to more recent optimisations,
but it can still be seen in simple cases like the one in the patch.
It also shows up in 450.soplex.
- Removing PTESTs after RDFFRs in ACLE code.
This is just an interim “solution” for GCC 11. I hope to replace
it with something generic and target-independent for GCC 12.
However, the use cases above are very important for performance,
so I'd rather not leave the bug unfixed for yet another release cycle.
Since the pass is intended to be short-lived, I've not added
a command-line option for it. The pass can be disabled using
-fdisable-rtl-cc_fusion if necessary.
Although what the pass does is independent of SVE, it's motivated
only by SVE cases and doesn't trigger for any non-SVE test I've seen.
I've therefore gated it on TARGET_SVE and restricted it to PTEST
patterns.
gcc/
PR target/88836
* config.gcc (aarch64*-*-*): Add aarch64-cc-fusion.o to extra_objs.
* Makefile.in (RTL_SSA_H): New variable.
* config/aarch64/t-aarch64 (aarch64-cc-fusion.o): New rule.
* config/aarch64/aarch64-protos.h (make_pass_cc_fusion): Declare.
* config/aarch64/aarch64-passes.def: Add pass_cc_fusion after
pass_combine.
* config/aarch64/aarch64-cc-fusion.cc: New file.
gcc/testsuite/
PR target/88836
* gcc.target/aarch64/sve/acle/general/ldff1_8.c: New test.
* gcc.target/aarch64/sve/ptest_1.c: Likewise.
Noticed while working on something else that the insn_change_watermark
destructor could call cancel_changes for changes that no longer exist.
The loop in cancel_changes is a nop in that case, but:
num_changes = num;
can mess things up.
I think this would only affect nested uses of insn_change_watermark.
gcc/
* recog.h (insn_change_watermark::~insn_change_watermark): Avoid
calling cancel_changes for changes that no longer exist.
One of the test cases failed to link because of missing paths to
libatomic. Reuse procedures in lib/atomic-dg.exp to gather these paths.
gcc/testsuite/ChangeLog:
2021-01-15 Marius Hillenbrand <mhillen@linux.ibm.com>
* gcc.target/s390/s390.exp: Call lib atomic-dg.exp to link
libatomic into testcases in gcc.target/s390/md.
* gcc.target/s390/md/atomic_exchange-1.c: Remove now unnecessary
-latomic.
This patch adds implementations for vceqq_p64, vceqz_p64 and
vceqzq_p64 intrinsics.
vceqq_p64 uses the existing vceq_p64 after splitting the input vectors
into their high and low halves.
vceqz[q] simply call the vceq and vceqq with a second argument equal
to zero.
The added (executable) testcases make sure that the poly64x2_t
variants have results with one element of all zeroes (false) and the
other element with all bits set to one (true).
2021-01-15 Christophe Lyon <christophe.lyon@linaro.org>
gcc/
PR target/71233
* config/arm/arm_neon.h (vceqz_p64, vceqq_p64, vceqzq_p64): New.
gcc/testsuite/
PR target/71233
* gcc.target/aarch64/advsimd-intrinsics/p64_p128.c: Add tests for
vceqz_p64, vceqq_p64 and vceqzq_p64.
* gcc.target/arm/simd/vceqz_p64.c: New test.
* gcc.target/arm/simd/vceqzq_p64.c: New test.
The testcases show that we fail to disregard alignment for invariant
loads. The patch handles them like we handle gather and scatter.
2021-01-15 Richard Biener <rguenther@suse.de>
PR tree-optimization/96376
* tree-vect-stmts.c (get_load_store_type): Disregard alignment
for VMAT_INVARIANT.
gcc/ChangeLog:
* doc/install.texi: Document that some tests need pytest module.
* doc/sourcebuild.texi: Likewise.
gcc/testsuite/ChangeLog:
* lib/gcov.exp: Use 'env python3' for execution of pytests.
Check that pytest accepts all needed options first.
Improve formatting of PASS/FAIL lines.
This aligns p so that the testcase is meaningful for targets
without a hw misaligned access.
2021-01-15 Richard Biener <rguenther@suse.de>
PR testsuite/96147
* gcc.dg/vect/bb-slp-32.c: Align p.
This changes gcc.dg/vect/bb-slp-9.c to scan for a vectorized load
instead of a vectorized BB which then correctly captures the
unaligned load we try to test and not some intermediate built
from scalar vector.
2021-01-15 Richard Biener <rguenther@suse.de>
PR testsuite/96147
* gcc.dg/vect/bb-slp-9.c: Scan for a vector load transform.
gcc.dg/vect/slp-45.c failed to key the vectorization capability
scanning on vect_hw_misalign. Since the stores are strided
they cannot be (all) analyzed to be aligned.
2021-01-15 Richard Biener <rguenther@suse.de>
PR testsuite/96147
* gcc.dg/vect/slp-45.c: Key scanning on
vect_hw_misalign.
This removes scanning that's too difficult to get correct for all
targets, leaving the correctness test for them and keeping the
vectorization capability check to vect_hw_misalign targets.
2021-01-15 Richard Biener <rguenther@suse.de>
PR testsuite/96147
* gcc.dg/vect/slp-43.c: Remove ! vect_hw_misalign scan.
The testcase has morphed so that it no longer tests what it was originally
supposed to, and slightly altering it shows the original issue isn't fixed
(anymore).
The limit set as a result of PR91403 (and dups) prevents the issue for larger
arrays but the testcase has
double a[128][128];
which results in a group size of "just" 512 (the limit is 4096). Avoiding
the 'BB vectorization with gaps at the end of a load is not supported'
by altering it to do
void foo(void)
{
b[0] = a[0][0];
b[1] = a[1][0];
b[2] = a[2][0];
b[3] = a[3][127];
}
shows that costing has improved further to not account the dead loads making
the previous test inefficient. In fact the underlying issue isn't fixed
(we do code-generate dead loads).
In fact the vector permute load is even profitable, just the excessive
code-generation issue exists (and is "fixed" by capping it at a constant
boundary, just too high for this particular testcase).
The testcase now has "dups", so I'll simply remove it.
2021-01-15 Richard Biener <rguenther@suse.de>
PR testsuite/96098
* gcc.dg/vect/bb-slp-pr68892.c: Remove.
The recent changes to error on mixing -march=i386 and -fcf-protection broke
bootstrap. This patch changes lib{atomic,gomp,itm} configury, so that it
only adds -march=i486 to flags if really needed (i.e. when 486 or later isn't
on by default already). Similarly, it will not use ifuncs if -mcx16
(or -march=i686 for 32-bit) is on by default.
2021-01-15 Jakub Jelinek <jakub@redhat.com>
PR target/70454
libatomic/
* configure.tgt: For i?86 and x86_64 determine if -march=i486 needs to
be added through preprocessor check on
__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4. Determine if try_ifunc is needed
based on preprocessor check on __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
or __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8.
libgomp/
* configure.tgt: For i?86 and x86_64 determine if -march=i486 needs to
be added through preprocessor check on
__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4.
libitm/
* configure.tgt: For i?86 and x86_64 determine if -march=i486 needs to
be added through preprocessor check on
__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4.
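The preprocessor check can be pictured with the small C sketch below. It is only an illustration of the idea; the actual configure.tgt logic runs the preprocessor from shell, and the helper names have_default_cas4 and needs_march_i486 are hypothetical.

```c
/* Report whether the compiler's default target already provides a
   4-byte compare-and-swap.  GCC predefines the macro below when it does.  */
static int
have_default_cas4 (void)
{
#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
  return 1;
#else
  return 0;
#endif
}

/* Hypothetical helper mirroring the configure decision: -march=i486 is
   only worth adding when the default target lacks the 4-byte CAS.  */
static int
needs_march_i486 (void)
{
  return !have_default_cas4 ();
}
```

The result depends on the default flags of the compiler that builds the snippet, which is the point: the decision follows what is already enabled rather than being hard-coded.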
This patch enables MVE vshr instructions for auto-vectorization. New
MVE patterns are introduced that take a vector of constants as second
operand, all constants being equal.
The existing mve_vshrq_n_<supf><mode> is kept, as it takes a single
immediate as second operand, and is used by arm_mve.h.
The vashr<mode>3 and vlshr<mode>3 expanders are moved from neon.md to
vec-common.md, updated to rely on the normal expansion scheme to
generate shifts by immediate.
2020-12-03 Christophe Lyon <christophe.lyon@linaro.org>
gcc/
* config/arm/mve.md (mve_vshrq_n_s<mode>_imm): New entry.
(mve_vshrq_n_u<mode>_imm): Likewise.
* config/arm/neon.md (vashr<mode>3, vlshr<mode>3): Move to ...
* config/arm/vec-common.md: ... here.
gcc/testsuite/
* gcc.target/arm/simd/mve-vshr.c: Add tests for vshr.
This patch enables MVE vshlq instructions for auto-vectorization.
The existing mve_vshlq_n_<supf><mode> is kept, as it takes a single
immediate as second operand, and is used by arm_mve.h.
We move the vashl<mode>3 insn from neon.md to an expander in
vec-common.md, and the mve_vshlq_<supf><mode> insn from mve.md to
vec-common.md, adding the second alternative from neon.md.
mve_vshlq_<supf><mode> will be used by a later patch enabling
vectorization for vshr, as a unified version of
ashl3<mode3>_[signed|unsigned] from neon.md. Keeping the use of unspec
VSHLQ makes it possible to generate both 's' and 'u' variants.
It is not clear whether the neon_shift_[reg|imm]<q> attribute is still
suitable, since this insn is also used for MVE.
I kept the mve_vshlq_<supf><mode> naming instead of renaming it to
ashl3_<supf>_<mode> as discussed because the reference in
arm_mve_builtins.def automatically inserts the "mve_" prefix and I
didn't want to make a special case for this.
I haven't yet found why the v16qi and v8hi tests are not vectorized.
With dest[i] = a[i] << b[i] and:
{
int i;
unsigned int i.24_1;
unsigned int _2;
int16_t * _3;
short int _4;
int _5;
int16_t * _6;
short int _7;
int _8;
int _9;
int16_t * _10;
short int _11;
unsigned int ivtmp_42;
unsigned int ivtmp_43;
<bb 2> [local count: 119292720]:
<bb 3> [local count: 954449105]:
i.24_1 = (unsigned int) i_23;
_2 = i.24_1 * 2;
_3 = a_15(D) + _2;
_4 = *_3;
_5 = (int) _4;
_6 = b_16(D) + _2;
_7 = *_6;
_8 = (int) _7;
_9 = _5 << _8;
_10 = dest_17(D) + _2;
_11 = (short int) _9;
*_10 = _11;
i_19 = i_23 + 1;
ivtmp_42 = ivtmp_43 - 1;
if (ivtmp_42 != 0)
goto <bb 5>; [87.50%]
else
goto <bb 4>; [12.50%]
<bb 5> [local count: 835156386]:
goto <bb 3>; [100.00%]
<bb 4> [local count: 119292720]:
return;
}
the vectorizer says:
mve-vshl.c:37:96: note: ==> examining statement: _5 = (int) _4;
mve-vshl.c:37:96: note: vect_is_simple_use: operand *_3, type of def: internal
mve-vshl.c:37:96: note: vect_is_simple_use: vectype vector(8) short int
mve-vshl.c:37:96: missed: conversion not supported by target.
mve-vshl.c:37:96: note: vect_is_simple_use: operand *_3, type of def: internal
mve-vshl.c:37:96: note: vect_is_simple_use: vectype vector(8) short int
mve-vshl.c:37:96: note: vect_is_simple_use: operand *_3, type of def: internal
mve-vshl.c:37:96: note: vect_is_simple_use: vectype vector(8) short int
mve-vshl.c:37:117: missed: not vectorized: relevant stmt not supported: _5 = (int) _4;
mve-vshl.c:37:96: missed: bad operation or unsupported loop bound.
mve-vshl.c:37:96: note: ***** Analysis failed with vector mode V8HI
2020-12-03 Christophe Lyon <christophe.lyon@linaro.org>
gcc/
* config/arm/mve.md (mve_vshlq_<supf><mode>): Move to
vec-common.md.
* config/arm/neon.md (vashl<mode>3): Delete.
* config/arm/vec-common.md (mve_vshlq_<supf><mode>): New.
(vashl<mode>3): New expander.
gcc/testsuite/
* gcc.target/arm/simd/mve-vshl.c: Add tests for vshl.
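The GIMPLE quoted in this entry reflects C's integer promotions: both int16_t operands are widened to int before the shift and the result is narrowed back. A reduced sketch of such a loop, with the corresponding GIMPLE statements as comments, might look as follows. The function name shift_left_i16 is hypothetical and this is not the actual mve-vshl.c source.

```c
#include <stdint.h>

/* Hypothetical reduced form of the int16_t test loop.  */
static void
shift_left_i16 (int16_t *dest, const int16_t *a, const int16_t *b, int n)
{
  for (int i = 0; i < n; i++)
    {
      int wide_a = a[i];               /* _5 = (int) _4;  */
      int wide_b = b[i];               /* _8 = (int) _7;  */
      int shifted = wide_a << wide_b;  /* _9 = _5 << _8;  */
      dest[i] = (int16_t) shifted;     /* _11 = (short int) _9;  */
    }
}
```

The first widening statement is the one the vectorizer dump reports as "conversion not supported by target".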
Avoid advancing to the next stmt when inserting at region boundary
and deal with a vector def being not the only child.
2021-01-15 Richard Biener <rguenther@suse.de>
PR tree-optimization/98685
* tree-vect-slp.c (vect_schedule_slp_node): Refactor handling
of vector extern defs.
* gcc.dg/vect/bb-slp-pr98685.c: New testcase.
I ran a sed script over the tests at a late stage, which accidentally
introduced a syntax error in them.
This fixes it.
Committed under the obvious rule.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/complex/complex-mla-template.c: Fix sed.
* gcc.dg/vect/complex/complex-mls-template.c: Likewise.
This is the code that parses an embedcfg file, which is a JSON file
created by the go command when it sees go:embed directives. This code
is not yet called, and does not yet do anything. It's being sent as a
separate CL to isolate just the JSON parsing code.
* Make-lang.in (GO_OBJS): Add go/embed.o.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/281532
I removed the "Alpha" warning from the JIT wiki page on
2020-05-18:
https://gcc.gnu.org/wiki/JIT?action=diff&rev1=47&rev2=48
but forgot to remove it from the documentation, which this
patch does.
gcc/jit/ChangeLog:
* docs/cp/index.rst: Remove "Alpha" warning.
* docs/index.rst: Likewise.
* docs/_build/texinfo/libgccjit.texi: Regenerate
This function had two different local variables for TREE_TYPE (field), one
of which shadowed a parameter, and wasn't using them consistently.
gcc/cp/ChangeLog:
* typeck2.c (process_init_constructor_record): Use fldtype
variable consistently.
If fancy_abort is called before the diagnostic subsystem is initialized,
internal_error will crash internally in a way that prevents a useful
message reaching the user.
This can happen with libgccjit in the case of gcc_assert failures
that occur outside of the libgccjit mutex that guards the rest of
gcc's state, including global_dc (when global_dc may not be
initialized yet, or might be in use by another thread).
I tried a few approaches to fixing this as noted in PR jit/98586
e.g. using a temporary diagnostic_context and initializing it for
the call to internal_error, however the more code that runs, the
more chance there is for other errors to occur.
The best fix appears to be to simply fall back to a minimal abort
implementation that only relies on i18n, as implemented by this
patch.
gcc/ChangeLog:
PR jit/98586
* diagnostic.c (diagnostic_kind_text): Break out this array
from...
(diagnostic_build_prefix): ...here.
(fancy_abort): Detect when diagnostic_initialize has not yet been
called and fall back to a minimal implementation of printing the
ICE, rather than segfaulting in internal_error.
GCC has had the ability to emit fix-it hints in machine-readable form
since GCC 7 via -fdiagnostics-parseable-fixits and
-fdiagnostics-generate-patch.
The former emits additional specially-formatted lines to stderr; the
option and its format were directly taken from a pre-existing option
in clang.
Ideally this could be used by IDEs so that the user can select specific
fix-it hints and have the IDE apply them to the user's source code
(perhaps turning them into clickable elements, perhaps with an
"Apply All" option, etc). Eclipse CDT has supported this option in
this way for a few years:
https://bugs.eclipse.org/bugs/show_bug.cgi?id=497670
As a user of Emacs I would like Emacs to support such a feature.
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=25987 tracks supporting
GCC fix-it output in Emacs. The discussion there identifies two issues
with the existing option:
(a) columns in the output are specified as byte-offsets within the
line (for exact compatibility with the option in clang), whereas emacs
would prefer to consume them as what GCC 11 calls "display columns".
https://gcc.gnu.org/onlinedocs/gcc/Diagnostic-Message-Formatting-Options.html#index-fdiagnostics-column-unit
(b) injecting a command-line option into the build is a fiddly manual
step, varying between build systems. It's far easier for the
user if Emacs simply sets an environment variable when compiling,
GCC uses this to enable the option if it recognizes the value, and
the emacs compilation buffer decodes the additional lines of output
and adds appropriate widgets. In some ways it is a workaround for
not having a language server. Doing it this way means that, across the
various combinations of older and newer GCC and older and newer Emacs,
a sufficiently modern combination of both can automatically
support the rich fix-it UI, whereas other combinations will either
not provide the envvar, or silently ignore it, gracefully doing
nothing extra.
Hence this patch adds a new GCC_EXTRA_DIAGNOSTIC_OUTPUT environment
variable to GCC which enables output of machine-parseable fix-it hints.
GCC_EXTRA_DIAGNOSTIC_OUTPUT=fixits-v1 is equivalent to the existing
-fdiagnostics-parseable-fixits option.
GCC_EXTRA_DIAGNOSTIC_OUTPUT=fixits-v2 is the same, but changes the
column output mode to "display columns" rather than bytes, as
required by Emacs.
The discussion in that Emacs bug has some concerns about the encoding
of these lines, and, indeed, the encoding of GCC's stderr in general:
currently we emit a mixture of bytes and UTF-8; I believe we emit
filenames as bytes, diagnostic messages as UTF-8, and quote source code
in the original encoding (PR other/93067 covers converting it to UTF-8 on
output). This patch prints octal-escaped bytes for bytes within
filenames and replacement text that aren't printable (as per
-fdiagnostics-parseable-fixits).
gcc/ChangeLog:
* diagnostic.c (diagnostic_initialize): Eliminate
parseable_fixits_p in favor of initializing extra_output_kind from
GCC_EXTRA_DIAGNOSTIC_OUTPUT.
(convert_column_unit): New function, split out from...
(diagnostic_converted_column): ...this.
(print_parseable_fixits): Add "column_unit" and "tabstop" params.
Use them to call convert_column_unit on the column values.
(diagnostic_report_diagnostic): Eliminate conditional on
parseable_fixits_p in favor of a switch statement on
extra_output_kind, passing the appropriate values to the new
params of print_parseable_fixits.
(selftest::test_print_parseable_fixits_none): Update for new
params of print_parseable_fixits.
(selftest::test_print_parseable_fixits_insert): Likewise.
(selftest::test_print_parseable_fixits_remove): Likewise.
(selftest::test_print_parseable_fixits_replace): Likewise.
(selftest::test_print_parseable_fixits_bytes_vs_display_columns):
New.
(selftest::diagnostic_c_tests): Call it.
* diagnostic.h (enum diagnostics_extra_output_kind): New.
(diagnostic_context::parseable_fixits_p): Delete field in favor
of...
(diagnostic_context::extra_output_kind): ...this new field.
* doc/invoke.texi (Environment Variables): Add
GCC_EXTRA_DIAGNOSTIC_OUTPUT.
* opts.c (common_handle_option): Update handling of
OPT_fdiagnostics_parseable_fixits for change to diagnostic_context
fields.
gcc/testsuite/ChangeLog:
* gcc.dg/plugin/diagnostic-test-show-locus-GCC_EXTRA_DIAGNOSTIC_OUTPUT-fixits-v1.c:
New file.
* gcc.dg/plugin/diagnostic-test-show-locus-GCC_EXTRA_DIAGNOSTIC_OUTPUT-fixits-v2.c:
New file.
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add them.
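A minimal sketch of how the environment variable could be mapped to an output kind is shown below. It assumes only the two values named in this entry; the enumerator and function names are hypothetical and are not taken from diagnostic.h or diagnostic.c.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical enumerator names; the patch adds
   enum diagnostics_extra_output_kind to diagnostic.h.  */
enum extra_output_kind
{
  EXTRA_OUTPUT_NONE,
  EXTRA_OUTPUT_FIXITS_V1,   /* columns as byte offsets */
  EXTRA_OUTPUT_FIXITS_V2    /* columns as display columns */
};

/* Map the variable's value to a kind; a missing or unrecognized value
   is silently ignored.  */
static enum extra_output_kind
parse_extra_output (const char *value)
{
  if (value == NULL)
    return EXTRA_OUTPUT_NONE;
  if (strcmp (value, "fixits-v1") == 0)
    return EXTRA_OUTPUT_FIXITS_V1;
  if (strcmp (value, "fixits-v2") == 0)
    return EXTRA_OUTPUT_FIXITS_V2;
  return EXTRA_OUTPUT_NONE;
}

/* Read the setting from the environment of the running process.  */
static enum extra_output_kind
extra_output_from_env (void)
{
  return parse_extra_output (getenv ("GCC_EXTRA_DIAGNOSTIC_OUTPUT"));
}
```

Falling back to "none" for unknown values matches the goal stated in the message: mismatched GCC and Emacs versions gracefully do nothing extra.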
This adds the initial tests for the complex mul, mls and mla.
These will be enabled in the commits that add the optabs.
Committed as obvious variations of existing tests.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/complex/complex-mla-template.c: New test.
* gcc.dg/vect/complex/complex-mls-template.c: New test.
* gcc.dg/vect/complex/complex-mul-template.c: New test.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c: New test.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c: New test.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-half-float.c: New test.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c: New test.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-float.c: New test.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-half-float.c: New test.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-double.c: New test.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-float.c: New test.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-half-float.c: New test.
* gcc.dg/vect/complex/fast-math-complex-mla-double.c: New test.
* gcc.dg/vect/complex/fast-math-complex-mla-float.c: New test.
* gcc.dg/vect/complex/fast-math-complex-mla-half-float.c: New test.
* gcc.dg/vect/complex/fast-math-complex-mls-double.c: New test.
* gcc.dg/vect/complex/fast-math-complex-mls-float.c: New test.
* gcc.dg/vect/complex/fast-math-complex-mls-half-float.c: New test.
* gcc.dg/vect/complex/fast-math-complex-mul-double.c: New test.
* gcc.dg/vect/complex/fast-math-complex-mul-float.c: New test.
* gcc.dg/vect/complex/fast-math-complex-mul-half-float.c: New test.
This introduces a common class complex_operations_pattern which encapsulates
the complex add, mul, fma and fms pattern in such a way so that the first match
is shared.
gcc/ChangeLog:
* tree-vect-slp-patterns.c (class complex_operations_pattern,
complex_operations_pattern::matches,
complex_operations_pattern::recognize,
complex_operations_pattern::build): New.
(slp_patterns): Use it.
This introduces a post processing step for the pattern matcher to flatten
permutes introduced by the complex multiplications patterns.
This performs a blend early such that SLP is not cancelled by the LOAD_LANES
permute. This is a temporary workaround to the fact that loads are not CSEd
during building and is required to produce efficient code.
gcc/ChangeLog:
* tree-vect-slp.c (optimize_load_redistribution_1): New.
(optimize_load_redistribution, vect_is_slp_load_node): New.
(vect_match_slp_patterns): Use it.
This applies the same feedback received for MUL and the rest to
ADD which was already committed. In short it elides the intermediate
nodes vec and avoids the use of truncate on the SLP child.
gcc/ChangeLog:
* tree-vect-slp-patterns.c (complex_add_pattern::build):
Elide nodes.
I've been implementing a PyGTK viewer for the output of
-fdump-analyzer-json, to help me debug analyzer issues:
https://github.com/davidmalcolm/gcc-analyzer-viewer
The viewer is very much just a work in progress.
This patch adds some fields that were missing from the dump, and
fixes some mistakes I spotted whilst working on the viewer.
gcc/analyzer/ChangeLog:
* engine.cc (strongly_connected_components::to_json): New.
(worklist::to_json): New.
(exploded_graph::to_json): JSON-ify the worklist.
* exploded-graph.h (strongly_connected_components::to_json): New
decl.
(worklist::to_json): New decl.
* store.cc (store::to_json): Fix comment.
* supergraph.cc (supernode::to_json): Fix reference to
"returning_call" in comment. Add optional "fun" to JSON.
(edge_kind_to_string): New.
(superedge::to_json): Add "kind" to JSON.
This test was failing in C++11 because variable templates are only
available in C++14.
gcc/testsuite/ChangeLog:
* g++.dg/template/pr98372.C: Only run in C++14 and up.
Substrings were not reduced early enough for use in initializations,
such as DATA statements. Add an early simplification for substrings
with constant starting and ending points.
gcc/fortran/ChangeLog:
* gfortran.h (gfc_resolve_substring): Add prototype.
* primary.c (match_string_constant): Simplify substrings with
constant starting and ending points.
* resolve.c: Rename resolve_substring to gfc_resolve_substring.
(gfc_resolve_ref): Use renamed function gfc_resolve_substring.
gcc/testsuite/ChangeLog:
* substr_10.f90: New test.
* substr_9.f90: New test.
We get occasional failures of 30_threads/future/members/poll.cc
on some platforms whose high resolution clock doesn't have such a high
resolution; wait_for_0 ends up as 0, and then some asserts fail as
intervals measured as longer than zero are tested for less than
several times zero.
This patch adds some calibration in the iteration count to set a
measurable base time interval with some additional margin.
for libstdc++-v3/ChangeLog
* testsuite/30_threads/future/members/poll.cc: Calibrate
iteration count.
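The calibration idea can be sketched in C as follows. This is not the code added to poll.cc (which is C++); the callback type measure_fn, the function calibrate_iterations and the example clock coarse_measure are all hypothetical.

```c
/* Hypothetical callback type: run ITERATIONS of the workload and return
   the elapsed time in clock ticks.  */
typedef long (*measure_fn) (long iterations);

/* Double the iteration count until the measured interval reaches at
   least MIN_TICKS, so later comparisons are never made against zero.
   Returns 0 if no count up to MAX_ITERATIONS is enough.  */
static long
calibrate_iterations (measure_fn measure, long min_ticks, long max_iterations)
{
  for (long n = 1; ; n *= 2)
    {
      if (measure (n) >= min_ticks)
        return n;
      if (n > max_iterations / 2)   /* next doubling would exceed the cap */
        return 0;
    }
}

/* Example clock with coarse resolution: 1 tick per 1000 iterations.  */
static long
coarse_measure (long iterations)
{
  return iterations / 1000;
}
```

With the coarse example clock, a single iteration measures as zero ticks, which is the failure mode described in this entry; calibration raises the count until the base interval is measurable.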
The sigsetjmp analyzer tests use jmp_buf in sigsetjmp and siglongjmp
calls. Not every system that supports sigsetjmp uses the same data
structure for setjmp and sigsetjmp, which results in type mismatches.
This patch changes the tests to use sigjmp_buf, which is the
POSIX-specific type for use with sigsetjmp and siglongjmp.
for gcc/testsuite/ChangeLog
* gcc.dg/analyzer/sigsetjmp-5.c: Use sigjmp_buf.
* gcc.dg/analyzer/sigsetjmp-6.c: Likewise.
The getpass function is not available on all systems; and not
necessarily declared in unistd.h, as expected by the sensitive-1
analyzer test.
Since this is a compile-only test, it doesn't really matter if the
function is defined in the system libraries. All we need is a
declaration, to avoid warnings from calling an undeclared function.
This patch adds the declaration, in a way that is most unlikely to
conflict with any existing declaration.
for gcc/testsuite/ChangeLog
* gcc.dg/analyzer/sensitive-1.c: Declare getpass.
Similar to nvptx offloading, see PR65099 "nvptx offloading: hard-coded 64-bit
assumptions".
gcc/
* config/gcn/mkoffload.c (main): Create an offload image only in
64-bit configurations.
During error recovery after an invalid derived type specification it was
possible to try to resolve an invalid array specification. We now skip
this if the component has the ALLOCATABLE or POINTER attribute and the
shape is not deferred.
gcc/fortran/ChangeLog:
PR fortran/98661
* resolve.c (resolve_component): Derived type components with
ALLOCATABLE or POINTER attribute shall have a deferred shape.
gcc/testsuite/ChangeLog:
PR fortran/98661
* gfortran.dg/pr98661.f90: New test.
As recently again discussed in <https://gcc.gnu.org/PR97436> "[nvptx] -m32
support", nvptx offloading other than for 64-bit host has never been
implemented, tested, or supported. So we simply shouldn't build the nvptx libgomp
plugin in this case.
This avoids build problems if, for example, in a (standard) bi-arch
x86_64-pc-linux-gnu '-m64'/'-m32' build, libcuda is available only in a 64-bit
variant but not in a 32-bit one, which, for example, is the case if you build
GCC against the CUDA toolkit's 'stubs/libcuda.so' (see
<https://stackoverflow.com/a/52784819>).
This amends PR65099 commit a92defdab7 (r225560)
"[nvptx offloading] Only 64-bit configurations are currently supported" to
match the way we're doing this for the HSA/GCN plugins.
libgomp/
PR libgomp/65099
* plugin/configfrag.ac (PLUGIN_NVPTX): Restrict to supported
configurations.
* configure: Regenerate.
* plugin/plugin-nvptx.c (nvptx_get_num_devices): Remove 64-bit
check.
Fix ordering problem on Windows targets where filesystem_error was used
before being defined.
libstdc++-v3/ChangeLog:
PR libstdc++/98471
* include/bits/fs_path.h (__throw_conversion_error): New
function to throw or abort on character conversion errors.
(__wstr_from_utf8): Move definition after filesystem_error has
been defined. Use __throw_conversion_error.
(path::_S_convert<_EcharT>): Use __throw_conversion_error.
(path::_S_str_convert<_CharT, _Traits, _Allocator>): Likewise.
(path::u8string): Likewise.
This fixes an infinite loop one could see for:
git show b87ec922c4 | ./contrib/mklog.py
contrib/ChangeLog:
* mklog.py: Fix infinite loop for unsupported files.
-fcf-protection with CF_BRANCH inserts ENDBR32 at function entries.
ENDBR32 is NOP only on 64-bit processors and 32-bit TARGET_CMOV
processors. Issue an error for -fcf-protection with CF_BRANCH when
compiling for 32-bit non-TARGET_CMOV targets.
gcc/
PR target/98667
* config/i386/i386-options.c (ix86_option_override_internal):
Issue an error for -fcf-protection with CF_BRANCH when compiling
for 32-bit non-TARGET_CMOV targets.
gcc/testsuite/
PR target/98667
* gcc.target/i386/pr98667-1.c: New file.
* gcc.target/i386/pr98667-2.c: Likewise.
* gcc.target/i386/pr98667-3.c: Likewise.
This improves dependence analysis on refs that access the same
array but with differently typed but same-sized accesses. That's
obviously safe for the case of types that cannot have any
access function based off them. For the testcase this is
signed short vs. unsigned short.
2021-01-14 Richard Biener <rguenther@suse.de>
PR tree-optimization/98674
* tree-data-ref.c (base_supports_access_fn_components_p): New.
(initialize_data_dependence_relation): For two bases without
possible access fns resort to type size equality when determining
shape compatibility.
* gcc.dg/vect/pr98674.c: New testcase.
Also pass -mpreferred-stack-boundary=4 -mno-stackrealign to avoid
disabling STV by:
/* Disable STV if -mpreferred-stack-boundary={2,3} or
-mincoming-stack-boundary={2,3} or -mstackrealign - the needed
stack realignment will be extra cost the pass doesn't take into
account and the pass can't realign the stack. */
if (ix86_preferred_stack_boundary < 128
|| ix86_incoming_stack_boundary < 128
|| opts->x_ix86_force_align_arg_pointer)
opts->x_target_flags &= ~MASK_STV;
PR target/98676
* gcc.target/i386/pr95021-1.c: Add -mpreferred-stack-boundary=4
-mno-stackrealign.
* gcc.target/i386/pr95021-3.c: Likewise.
The patch adding these files was approved in 2020 but it wasn't
committed until 2021, so the copyright years were not updated along with
the years in all the existing files.
libstdc++-v3/ChangeLog:
* include/std/barrier: Update copyright years. Fix whitespace.
* include/std/version: Fix whitespace.
* testsuite/30_threads/barrier/1.cc: Update copyright years.
* testsuite/30_threads/barrier/2.cc: Likewise.
* testsuite/30_threads/barrier/arrive.cc: Likewise.
* testsuite/30_threads/barrier/arrive_and_drop.cc: Likewise.
* testsuite/30_threads/barrier/arrive_and_wait.cc: Likewise.
* testsuite/30_threads/barrier/completion.cc: Likewise.
I flubbed an application of De Morgan's law. Let's just express the
logic directly and let the compiler figure it out. This bug made it
look like pr52830 was fixed, but it is not.
PR c++/98372
gcc/cp/
* tree.c (cp_tree_equal): Correct map_context logic.
gcc/testsuite/
* g++.dg/cpp0x/constexpr-52830.C: Restore dg-ice.
* g++.dg/template/pr98372.C: New.
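As a generic reminder of the rule involved (this is not the cp_tree_equal logic, and all function names are hypothetical), the sketch below contrasts the direct form of a negated condition, its correct De Morgan rewrite, and a typical slip.

```c
#include <stdbool.h>

/* Direct form: stated exactly as intended.  */
static bool
not_both_direct (bool a, bool b)
{
  return !(a && b);
}

/* Correct De Morgan rewrite: !(a && b) == (!a || !b).  */
static bool
not_both_demorgan (bool a, bool b)
{
  return !a || !b;
}

/* A typical slip: the operands are negated but the operator is kept.  */
static bool
not_both_wrong (bool a, bool b)
{
  return !a && !b;
}
```

The wrong variant agrees with the direct form when both inputs are equal and only differs when they disagree, which is why such a mistake can go unnoticed for a while.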
I've made two mistakes in the *sse4_1_zero_extend* define_insn_and_split
patterns. One is that when it uses vector_operand, it should use Bm rather
than m constraint, and the other one is that because it is a post-reload
splitter it needs isa attribute to select which alternatives are valid for
which ISAs. Sorry for messing this up.
2021-01-14 Jakub Jelinek <jakub@redhat.com>
PR target/98670
* config/i386/sse.md (*sse4_1_zero_extendv8qiv8hi2_3,
*sse4_1_zero_extendv4hiv4si2_3, *sse4_1_zero_extendv2siv2di2_3):
Use Bm instead of m for non-avx. Add isa attribute.
* gcc.target/i386/pr98670.c: New test.
This patch optimizes two GIMPLE operations into just one.
As mentioned in the PR, there is some risk this might create more expensive
constants, but sometimes it will make them on the other side less expensive,
it really depends on the exact value.
And if it is an important issue, we should do it in md or during expansion.
2021-01-14 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/96688
* match.pd (~(X >> Y) -> ~X >> Y): New simplification if
~X can be simplified.
* gcc.dg/tree-ssa/pr96688.c: New test.
* gcc.dg/tree-ssa/reassoc-37.c: Adjust scan-tree-dump regex.
* gcc.target/i386/pr66821.c: Likewise.
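A small worked instance of the rewrite, assuming a signed operand and an arithmetic right shift (which is what GCC implements for negative values), is sketched below. The constant -8 and the function names are chosen for illustration only and are not taken from the new test.

```c
/* Original form: shift first, then complement (two operations).  */
static int
not_of_shift (int y)
{
  return ~(-8 >> y);
}

/* Rewritten form: ~(-8) folds to the constant 7, leaving one shift.  */
static int
shift_of_not (int y)
{
  return 7 >> y;
}
```

Both functions agree for every valid shift count; whether 7 is cheaper to materialize than -8 is the target-dependent trade-off mentioned in the message.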
At the moment, if we use only one vector of an LD4 result,
we'll treat the LD4 as having the cost of a single load.
But all 4 loads and any associated permutes take place
regardless of which results are actually used.
This patch therefore counts the cost of unused LOAD_LANES
results against the first statement in a group. An alternative
would be to multiply the ncopies of the first stmt by the group
size and treat other stmts in the group as having zero cost,
but I thought that might be more surprising when reading dumps.
gcc/
* tree-vect-stmts.c (vect_model_load_cost): Account for unused
IFN_LOAD_LANES results.
gcc/testsuite/
* gcc.target/aarch64/sve/cost_model_11.c: New test.
* gcc.target/aarch64/sve/mask_struct_load_5.c: Use
-fno-vect-cost-model.
Turns out __builtin_convertvector is not as good a fit for the widening
and narrowing intrinsics as I had hoped.
During the veclower phase we lower most of it to bitfield operations and
hope DCE cleans it back up into vector pack/unpack and extend operations.
I received reports that in more complex cases GCC fails to do that and
we're left with many vector extract operations that clutter the output.
I think veclower can be improved on that front, but for GCC 10 I'd like
to just implement these builtins with a good old RTL builtin rather than
inline asm.
gcc/
* config/aarch64/aarch64-simd.md (aarch64_<su>xtl<mode>):
Define.
(aarch64_xtn<mode>): Likewise.
* config/aarch64/aarch64-simd-builtins.def (sxtl, uxtl, xtn):
Define builtins.
* config/aarch64/arm_neon.h (vmovl_s8): Reimplement using
builtin.
(vmovl_s16): Likewise.
(vmovl_s32): Likewise.
(vmovl_u8): Likewise.
(vmovl_u16): Likewise.
(vmovl_u32): Likewise.
(vmovn_s16): Likewise.
(vmovn_s32): Likewise.
(vmovn_s64): Likewise.
(vmovn_u16): Likewise.
(vmovn_u32): Likewise.
(vmovn_u64): Likewise.
The vmovn_high* intrinsics are supposed to map to XTN2 instructions that
narrow their source vector and insert it into the top half of the
destination vector.
This patch reimplements them away from inline assembly to an RTL builtin
that performs a vec_concat with a truncate.
gcc/
* config/aarch64/aarch64-simd.md (aarch64_xtn2<mode>_le):
Define.
(aarch64_xtn2<mode>_be): Likewise.
(aarch64_xtn2<mode>): Likewise.
* config/aarch64/aarch64-simd-builtins.def (xtn2): Define
builtins.
* config/aarch64/arm_neon.h (vmovn_high_s16): Reimplement using
builtins.
(vmovn_high_s32): Likewise.
(vmovn_high_s64): Likewise.
(vmovn_high_u16): Likewise.
(vmovn_high_u32): Likewise.
(vmovn_high_u64): Likewise.
gcc/testsuite/
* gcc.target/aarch64/narrow_high-intrinsics.c: Adjust
scan-assembler-times for xtn2.
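The lane-level effect of the narrowing-high operation can be modelled in plain C as below. This is a scalar sketch of the semantics only, not the RTL pattern or the arm_neon.h code; the function name model_vmovn_high_s16 is hypothetical, and the final conversion to int8_t relies on GCC's wrapping behaviour for out-of-range values.

```c
#include <stdint.h>

/* Hypothetical scalar model of vmovn_high_s16: DST receives the 8 lanes
   of LOW unchanged in its bottom half and the 8 lanes of WIDE, each
   truncated to 8 bits, in its top half.  */
static void
model_vmovn_high_s16 (int8_t dst[16], const int8_t low[8],
                      const int16_t wide[8])
{
  for (int i = 0; i < 8; i++)
    {
      dst[i] = low[i];                          /* concat: bottom half */
      dst[8 + i] = (int8_t) (uint8_t) wide[i];  /* truncate: low 8 bits */
    }
}
```

The two assignments correspond to the two ingredients named in the message: a concatenation with the existing low half and a truncation of the wide source.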
While running glibc tests several *-textrel tests failed showing that
relocations remained against read only sections. It turned out this was
related to exception headers data encoding being wrong.
By default pointer encoding will always use the DW_EH_PE_absptr format.
This patch uses format DW_EH_PE_pcrel and DW_EH_PE_sdata4. Optionally
DW_EH_PE_indirect is included for global symbols. This eliminates the
relocations.
gcc/ChangeLog:
* config/or1k/or1k.h (ASM_PREFERRED_EH_DATA_FORMAT): New macro.
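How the three encodings combine can be sketched with the standard DW_EH_PE_* constant values. The helper preferred_eh_data_format is hypothetical and only approximates what the new macro is described as doing; it is not the or1k.h definition.

```c
/* Pointer-encoding constants used in exception-handling frame data.  */
#define DW_EH_PE_absptr   0x00
#define DW_EH_PE_sdata4   0x0b
#define DW_EH_PE_pcrel    0x10
#define DW_EH_PE_indirect 0x80

/* Hypothetical helper: PC-relative signed 4-byte data, optionally
   indirect for global symbols.  */
static int
preferred_eh_data_format (int is_global)
{
  return (is_global ? DW_EH_PE_indirect : 0)
         | DW_EH_PE_pcrel | DW_EH_PE_sdata4;
}
```

Because the encoded value is an offset from the current location rather than an absolute address, no dynamic relocation is needed in the read-only section.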
Define TARGET_ASM_FILE_END as file_end_indicate_exec_stack to allow
generation of the ".note.GNU-stack" section note. This allows binutils
to properly set PT_GNU_STACK in the program header.
This fixes a glibc execstack testsuite test failure found while working
on the OpenRISC glibc port.
gcc/ChangeLog:
* config/or1k/linux.h (TARGET_ASM_FILE_END): Define macro.
This allows the openrisc softfloat implementation to set exceptions.
This also sets the correct tininess after rounding value to be
consistent with hardware and simulator implementations.
libgcc/ChangeLog:
* config/or1k/sfp-machine.h (FP_RND_NEAREST, FP_RND_ZERO,
FP_RND_PINF, FP_RND_MINF, FP_RND_MASK, FP_EX_OVERFLOW,
FP_EX_UNDERFLOW, FP_EX_INEXACT, FP_EX_INVALID, FP_EX_DIVZERO,
FP_EX_ALL): New constant macros.
(_FP_DECL_EX, FP_ROUNDMODE, FP_INIT_ROUNDMODE,
FP_HANDLE_EXCEPTIONS): New macros.
(_FP_TININESS_AFTER_ROUNDING): Change to 1.
This is used in libgcc and now glibc to detect when hardware floating
point operations are supported by the target.
gcc/ChangeLog:
* config/or1k/or1k.h (TARGET_CPU_CPP_BUILTINS): Add builtin
define for __or1k_hard_float__.
Define this so that it no longer aborts, a problem found while running
tests in the glibc test suite.
We implement this with a call to _mcount with no arguments. The required
return addresses will be pulled from the stack. Passing the LR (r9) as
an argument had problems as sometimes r9 is clobbered by the GOT logic
in the prologue before the call to _mcount.
gcc/ChangeLog:
* config/or1k/or1k.h (NO_PROFILE_COUNTERS): Define as 1.
(PROFILE_HOOK): Define to call _mcount.
(FUNCTION_PROFILER): Change from abort to no-op.
In r11-4690 we removed the call to finish_nonmember_using_decl in
tsubst_expr/DECL_EXPR in the USING_DECL block. This was done not
to perform name lookup twice for a non-dependent using-decl, which
sounds sensible.
However, finish_nonmember_using_decl also pushes the decl's bindings
which we still have to do so that we can find the USING_DECL's name
later. In this case, we've got a USING_DECL N::operator<< that we are
tsubstituting. We already looked it up while parsing the template
"foo", and lookup_using_decl stashed the OVERLOAD it found into
USING_DECL_DECLS. Now we just have to update the IDENTIFIER_BINDING of
the identifier for operator<< with the overload the name is bound to.
I didn't want to export push_local_binding so I've introduced a new
wrapper.
gcc/cp/ChangeLog:
PR c++/98231
* name-lookup.c (push_using_decl_bindings): New.
* name-lookup.h (push_using_decl_bindings): Declare.
* pt.c (tsubst_expr): Call push_using_decl_bindings.
gcc/testsuite/ChangeLog:
PR c++/98231
* g++.dg/lookup/using63.C: New test.
These simplifications are only simplifications if the (~D ^ C) or (D ^ C)
expressions fold into gimple vals, but in that case they decrease number of
operations by 1.
2021-01-13 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/96691
* match.pd ((~X | C) ^ D -> (X | C) ^ (~D ^ C),
(~X & C) ^ D -> (X & C) ^ (D ^ C)): New simplifications if
(~D ^ C) or (D ^ C) can be simplified.
* gcc.dg/tree-ssa/pr96691.c: New test.
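The two identities can be checked bit by bit in plain C. The sketch below is
only an illustration: the function names are made up here, and the loop
bounds are an arbitrary choice that keeps the exhaustive check small.

```c
#include <stdint.h>

/* Left-hand and right-hand sides of the two simplifications. */
uint32_t or_before(uint32_t x, uint32_t c, uint32_t d)  { return (~x | c) ^ d; }
uint32_t or_after(uint32_t x, uint32_t c, uint32_t d)   { return (x | c) ^ (~d ^ c); }
uint32_t and_before(uint32_t x, uint32_t c, uint32_t d) { return (~x & c) ^ d; }
uint32_t and_after(uint32_t x, uint32_t c, uint32_t d)  { return (x & c) ^ (d ^ c); }

/* Compare both forms for every combination of small inputs.
   Returns 1 if the identities hold everywhere, 0 otherwise. */
int identities_hold(void)
{
    for (uint32_t x = 0; x < 64; x++)
        for (uint32_t c = 0; c < 64; c++)
            for (uint32_t d = 0; d < 64; d++) {
                if (or_before(x, c, d) != or_after(x, c, d))
                    return 0;
                if (and_before(x, c, d) != and_after(x, c, d))
                    return 0;
            }
    return 1;
}
```

When C and D are constants, (~D ^ C) and (D ^ C) fold to a single constant,
so the rewritten form drops the bitwise NOT of X.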
contrib/ChangeLog:
* gcc-changelog/git_commit.py: Support wrapping of functions
in parentheses that can take multiple lines.
* gcc-changelog/test_email.py: Add tests for it.
* gcc-changelog/test_patches.txt: Add 2 patches.
This avoids canonicalizing BIT_FIELD_REF <T1> (a, <sz>, 0) to
(T1)a on integer typed a. This confuses the vectorizer SLP matching.
With this delayed to after vector lowering the testcase in PR92645
from Skia is now finally optimized to reasonable assembly.
2021-01-13 Richard Biener <rguenther@suse.de>
PR tree-optimization/92645
* match.pd (BIT_FIELD_REF to conversion): Delay canonicalization
until after vector lowering.
* gcc.target/i386/pr92645-7.c: New testcase.
* gcc.dg/tree-ssa/ssa-fre-54.c: Adjust.
* gcc.dg/pr69047.c: Likewise.
I misunderstood the cp_build_function_call_vec API, thinking a NULL
vector was an acceptable way of passing no arguments. You need to
pass a vector of no elements.
PR c++/98626
gcc/cp/
* module.cc (module_add_import_initializers): Pass a
zero-element argument vector.
This patch extends the MLS/MSB patterns to support unpacked
integer vectors. The type suffix could be either the element
size or the container size, but using the element size should
be more efficient.
gcc/
* config/aarch64/aarch64-sve.md (fnma<mode>4): Extend from SVE_FULL_I
to SVE_I.
(@aarch64_pred_fnma<mode>, cond_fnma<mode>, *cond_fnma<mode>_2)
(*cond_fnma<mode>_4, *cond_fnma<mode>_any): Likewise.
gcc/testsuite/
* gcc.target/aarch64/sve/mls_2.c: New test.
* g++.target/aarch64/sve/cond_mls_1.C: Likewise.
* g++.target/aarch64/sve/cond_mls_2.C: Likewise.
* g++.target/aarch64/sve/cond_mls_3.C: Likewise.
* g++.target/aarch64/sve/cond_mls_4.C: Likewise.
* g++.target/aarch64/sve/cond_mls_5.C: Likewise.
This patch extends the MLA/MAD patterns to support unpacked
integer vectors. The type suffix could be either the element
size or the container size, but using the element size should
be more efficient.
gcc/
* config/aarch64/aarch64-sve.md (fma<mode>4): Extend from SVE_FULL_I
to SVE_I.
(@aarch64_pred_fma<mode>, cond_fma<mode>, *cond_fma<mode>_2)
(*cond_fma<mode>_4, *cond_fma<mode>_any): Likewise.
gcc/testsuite/
* gcc.target/aarch64/sve/mla_2.c: New test.
* g++.target/aarch64/sve/cond_mla_1.C: Likewise.
* g++.target/aarch64/sve/cond_mla_2.C: Likewise.
* g++.target/aarch64/sve/cond_mla_3.C: Likewise.
* g++.target/aarch64/sve/cond_mla_4.C: Likewise.
* g++.target/aarch64/sve/cond_mla_5.C: Likewise.
This improves SLP discovery in the face of existing vectors allowing
punning of the vector shape (or even punning from an integer type).
For punning from integer types this does not yet handle lane zero
extraction being represented as conversion rather than BIT_FIELD_REF.
2021-01-13 Richard Biener <rguenther@suse.de>
PR tree-optimization/92645
* tree-vect-slp.c (vect_build_slp_tree_1): Relax supported
BIT_FIELD_REF argument.
(vect_build_slp_tree_2): Record the desired vector type
on the external vector def.
(vectorizable_slp_permutation): Handle required punning
of existing vector defs.
* gcc.target/i386/pr92645-6.c: New testcase.
Noticed while testing on a different machine that the sve/sel_*.c
tests require .variant_pcs support but don't test for it.
.variant_pcs post-dates SVE so there shouldn't be a need to test
for both.
gcc/testsuite/
* gcc.target/aarch64/sve/sel_1.c: Require aarch64_variant_pcs.
* gcc.target/aarch64/sve/sel_2.c: Likewise.
* gcc.target/aarch64/sve/sel_3.c: Likewise.
Noticed while looking at something else that the comment above
def_lookup got the description of the comparisons the wrong way
round.
gcc/
* rtl-ssa/accesses.h (def_lookup): Fix order of comparison results.
This patch fixes a regression on sh4 introduced by the rtl-ssa stuff.
The port had a pattern:
(define_insn "movsf_ie"
[(set (match_operand:SF 0 "general_movdst_operand"
"=f,r,f,f,fy, f,m, r, r,m,f,y,y,rf,r,y,<,y,y")
(match_operand:SF 1 "general_movsrc_operand"
" f,r,G,H,FQ,mf,f,FQ,mr,r,y,f,>,fr,y,r,y,>,y"))
(use (reg:SI FPSCR_MODES_REG))
(clobber (match_scratch:SI 2 "=X,X,X,X,&z, X,X, X, X,X,X,X,X, y,X,X,X,X,X"))]
"TARGET_SH2E
&& (arith_reg_operand (operands[0], SFmode)
|| fpul_operand (operands[0], SFmode)
|| arith_reg_operand (operands[1], SFmode)
|| fpul_operand (operands[1], SFmode)
|| arith_reg_operand (operands[2], SImode))"
But recog can generate this pattern from something that matches:
[(set (match_operand:SF 0 "general_movdst_operand")
(match_operand:SF 1 "general_movsrc_operand"))
(use (reg:SI FPSCR_MODES_REG))]
with recog adding the (clobber (match_scratch:SI)) automatically.
recog tests the C condition before adding the clobber, so there might
not be an operands[2] to test.
Similarly, gen_movsf_ie takes only two arguments, with operand 2
being filled in automatically. The only way to create this pattern
with a REG operands[2] before RA would be to generate it directly
from RTL. AFAICT the only things that do this are the secondary
reload patterns, which are generated during RA and come with
pre-vetted operands.
arith_reg_operand rejects 6 specific registers:
return (regno != T_REG && regno != PR_REG
&& regno != FPUL_REG && regno != FPSCR_REG
&& regno != MACH_REG && regno != MACL_REG);
The fpul_operand tests allow FPUL_REG, leaving 5 invalid registers.
However, in all alternatives of movsf_ie, either operand 0 or
operand 1 is a register that belongs to r, f or y, none of which
include any of the 5 rejected registers. This means that any
post-RA pattern would satisfy the operands[0] or operands[1]
condition without the operands[2] test being necessary.
gcc/
* config/sh/sh.md (movsf_ie): Remove operands[2] test.
contrib/ChangeLog:
* gcc-changelog/git_commit.py: Allow modifications of older
ChangeLog (or specific) files without need to make a ChangeLog
entry.
* gcc-changelog/test_email.py: Test it.
* gcc-changelog/test_patches.txt: Add new patch.
When the application sets SA_SIGINFO, the signal trampoline parameters
are different to follow POSIX.
libgcc/
* config/i386/gnu-unwind.h (x86_gnu_fallback_frame_state): Add the
posix siginfo case to struct handler_args. Detect between legacy
and siginfo from the second parameter, which is a small sigcode in
the legacy case, and a pointer in the siginfo case.
The following patch implements what I've talked about, i.e. to no longer
force operands of vec_perm_const into registers in the generic code, but let
each of the (currently 8) targets force it into registers individually,
giving the targets better control on if it does that and when and allowing
them to do something special with some particular operands.
And then defines the define_insn_and_split for the 256-bit and 512-bit
permutations into vpmovzx* (only the bw, wd and dq cases, in theory we could
add define_insn_and_split patterns also for the bd, bq and wq).
2021-01-13 Jakub Jelinek <jakub@redhat.com>
PR target/95905
* optabs.c (expand_vec_perm_const): Don't force v0 and v1 into
registers before calling targetm.vectorize.vec_perm_const, only after
that.
* config/i386/i386-expand.c (ix86_vectorize_vec_perm_const): Handle
two argument permutation when one operand is zero vector and only
after that force operands into registers.
* config/i386/sse.md (*avx2_zero_extendv16qiv16hi2_1): New
define_insn_and_split pattern.
(*avx512bw_zero_extendv32qiv32hi2_1): Likewise.
(*avx512f_zero_extendv16hiv16si2_1): Likewise.
(*avx2_zero_extendv8hiv8si2_1): Likewise.
(*avx512f_zero_extendv8siv8di2_1): Likewise.
(*avx2_zero_extendv4siv4di2_1): Likewise.
* config/mips/mips.c (mips_vectorize_vec_perm_const): Force operands
into registers.
* config/arm/arm.c (arm_vectorize_vec_perm_const): Likewise.
* config/sparc/sparc.c (sparc_vectorize_vec_perm_const): Likewise.
* config/ia64/ia64.c (ia64_vectorize_vec_perm_const): Likewise.
* config/aarch64/aarch64.c (aarch64_vectorize_vec_perm_const): Likewise.
* config/rs6000/rs6000.c (rs6000_vectorize_vec_perm_const): Likewise.
* config/gcn/gcn.c (gcn_vectorize_vec_perm_const): Likewise. Use std::swap.
* gcc.target/i386/pr95905-2.c: Use scan-assembler-times instead of
scan-assembler. Add tests with zero vector as first __builtin_shuffle
operand.
* gcc.target/i386/pr95905-3.c: New test.
* gcc.target/i386/pr95905-4.c: New test.
This header was removed recently, so Doxygen shouldn't try to process
it.
libstdc++-v3/ChangeLog:
* doc/doxygen/user.cfg.in (INPUT): Remove include/debug/array.
VN tried to express a sign extension from int to long of
a truncated quantity with a plain conversion but that loses the
truncation. Since there's no single operand doing truncate plus
sign extend (there was a proposed SEXT_EXPR to do that at some
point mapping to RTL sign_extract) don't bother to appropriately
model this with two ops (which the VN insert machinery doesn't
handle and which is unlikely to CSE fully).
2021-01-13 Richard Biener <rguenther@suse.de>
PR tree-optimization/98640
* tree-ssa-sccvn.c (visit_nary_op): Do not try to
handle plus or minus from a truncated operand to be
sign-extended.
* gcc.dg/torture/pr98640.c: New testcase.
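To illustrate why the truncation cannot be dropped, here is a small sketch
that models "truncate to 32 bits, then sign-extend to 64 bits" with portable
arithmetic and contrasts it with a conversion that keeps every bit. The
function names are hypothetical and the 32/64-bit widths are an assumption
matching the int-to-long case described above.

```c
#include <stdint.h>

/* Truncate v to its low 32 bits, then sign-extend the result to 64 bits. */
int64_t trunc_then_sext32(int64_t v)
{
    int64_t low = v & INT64_C(0xffffffff);   /* truncation: 0 .. 2^32 - 1 */
    /* Flip bit 31 and subtract 2^31: maps 0x80000000.. to negative values. */
    return (low ^ INT64_C(0x80000000)) - INT64_C(0x80000000);
}

/* A plain same-width conversion keeps all 64 bits unchanged. */
int64_t plain_conversion(int64_t v)
{
    return v;
}
```

For v = 0x80000000 the first function yields -2147483648 while the second
keeps +2147483648, which is the difference the fix avoids losing.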
In the following testcase we only optimize f2 and f7 to btrl, although we
should optimize all of the functions that way. The problem is the type
demotion/narrowing (which is performed solely during the generic folding and
not later), without it we see the AND performed in SImode and match it as
btrl, but with it while the shifts are still performed in SImode, the
AND is already done in QImode or HImode low part of the shift.
2021-01-13 Jakub Jelinek <jakub@redhat.com>
PR target/96938
* config/i386/i386.md (*btr<mode>_1, *btr<mode>_2): New
define_insn_and_split patterns.
(splitter after *btr<mode>_2): New splitter.
* gcc.target/i386/pr96938.c: New test.
The following patch adds patterns (so far 128-bit only) for permutations
like { 0 16 1 17 2 18 3 19 4 20 5 21 6 22 7 23 } where the second
operand is CONST0_RTX CONST_VECTOR to be emitted as pmovzx.
2021-01-13 Jakub Jelinek <jakub@redhat.com>
PR target/95905
* config/i386/predicates.md (pmovzx_parallel): New predicate.
* config/i386/sse.md (*sse4_1_zero_extendv8qiv8hi2_3): New
define_insn_and_split pattern.
(*sse4_1_zero_extendv4hiv4si2_3): Likewise.
(*sse4_1_zero_extendv2siv2di2_3): Likewise.
* gcc.target/i386/pr95905-1.c: New test.
* gcc.target/i386/pr95905-2.c: New test.
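The reason such a permutation is a zero extension can be seen with a scalar
model. This sketch assumes byte elements and little-endian lane order; the
helper names (shuffle16, pmovzx_demo) are invented for the illustration and
are not part of the patch.

```c
#include <stdint.h>

/* Two-input byte permutation: selector values 0..15 pick from v0,
   16..31 pick from v1. */
void shuffle16(const uint8_t v0[16], const uint8_t v1[16],
               const uint8_t sel[16], uint8_t out[16])
{
    for (int i = 0; i < 16; i++) {
        uint8_t s = sel[i] & 31;
        out[i] = s < 16 ? v0[s] : v1[s - 16];
    }
}

/* Interleave the low 8 bytes of a test vector with a zero vector and
   check that each little-endian 16-bit lane equals the zero-extended
   source byte.  Returns 1 on success, 0 otherwise. */
int pmovzx_demo(void)
{
    static const uint8_t zero[16] = { 0 };
    static const uint8_t sel[16] =
        { 0, 16, 1, 17, 2, 18, 3, 19, 4, 20, 5, 21, 6, 22, 7, 23 };
    uint8_t v[16], out[16];

    for (int i = 0; i < 16; i++)
        v[i] = (uint8_t)(0xF0 + i);          /* bytes with the top bit set */
    shuffle16(v, zero, sel, out);
    for (int i = 0; i < 8; i++) {
        uint16_t lane = (uint16_t)(out[2 * i] | (out[2 * i + 1] << 8));
        if (lane != v[i])
            return 0;
    }
    return 1;
}
```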
This patch removes code to fix the v0 register in
gcn_conditional_register_usage that was missed out of the previous patch
removing the need for that:
https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534284.html
2021-01-13 Julian Brown <julian@codesourcery.com>
gcc/
* config/gcn/gcn.c (gcn_conditional_register_usage): Remove dead code
to fix v0 register.
This patch fixes a corner case in the AMD GCN md-reorg pass when the
EXEC register is live on entry to a BB, and could be clobbered by code
inserted by the pass before a use in (e.g.) a different BB.
2021-01-13 Julian Brown <julian@codesourcery.com>
gcc/
* config/gcn/gcn.c (gcn_md_reorg): Fix case where EXEC reg is live
on entry to a BB.
GCN has a reciprocal-approximation instruction but no
hardware divide. This patch adjusts the open-coded reciprocal
approximation/Newton-Raphson refinement steps to use fused multiply-add
instructions as is necessary to obtain a properly-rounded result, and
adds further refinement steps to correctly round the full division result.
The patterns in question are still guarded by a flag_reciprocal_math
condition, and do not yet support denormals.
2021-01-13 Julian Brown <julian@codesourcery.com>
gcc/
* config/gcn/gcn-valu.md (recip<mode>2<exec>, recip<mode>2): Use unspec
for reciprocal-approximation instructions.
(div<mode>3): Use fused multiply-accumulate operations for reciprocal
refinement and division result.
* config/gcn/gcn.md (UNSPEC_RCP): New unspec constant.
gcc/testsuite/
* gcc.target/gcn/fpdiv.c: New test.
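A scalar sketch of the refinement scheme, for illustration only. It uses
ordinary double arithmetic rather than fused multiply-add, so unlike the
patterns above it does not guarantee a correctly rounded result; the function
names and the caller-supplied initial estimate x0 are assumptions.

```c
/* One Newton-Raphson step for x ~= 1/d:  x' = x + x * (1 - d * x).
   In hardware each of the two lines below would be a single fused
   multiply-add. */
double recip_refine(double d, double x)
{
    double e = 1.0 - d * x;   /* residual of the current estimate */
    return x + x * e;         /* corrected estimate */
}

/* n / d computed from a rough reciprocal estimate x0, followed by one
   correction step on the quotient itself. */
double div_via_recip(double n, double d, double x0, int steps)
{
    double x = x0;
    for (int i = 0; i < steps; i++)
        x = recip_refine(d, x);
    double q = n * x;         /* first quotient estimate */
    double r = n - d * q;     /* remainder left by that estimate */
    return q + r * x;         /* corrected quotient */
}
```

The residual squares with every step, so a starting estimate with a relative
error of 0.1 reaches the limits of double precision after about four steps.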
This patch fixes a typo in the subdf3 pattern that meant it had a
non-standard name and thus the compiler would emit a libcall rather than
the proper hardware instruction for DFmode subtraction.
2021-01-13 Julian Brown <julian@codesourcery.com>
gcc/
* config/gcn/gcn-valu.md (subdf): Rename to...
(subdf3): This.
On powerpc64le, this caused a failure in TestUnshareUidGidMapping
due to stack corruption which resulted in a bogus execve syscall.
Use the existing C wrapper to ensure we respect the PPC ABI for
variadic functions.
Fixes PR go/98610
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/282717
This patch adds new movmisalign<mode>_mve_load and store patterns for
MVE to help vectorization. They are very similar to their Neon
counterparts, but use different iterators and instructions.
Indeed MVE supports fewer vector modes than Neon, so we use the
MVE_VLD_ST iterator where Neon uses VQX.
Since the supported modes are different from the ones valid for
arithmetic operators, we introduce two new sets of macros:
ARM_HAVE_NEON_<MODE>_LDST
true if Neon has vector load/store instructions for <MODE>
ARM_HAVE_<MODE>_LDST
true if any vector extension has vector load/store instructions for <MODE>
We move the movmisalign<mode> expander from neon.md to vec-common.md, and
replace the TARGET_NEON enabler with ARM_HAVE_<MODE>_LDST.
The patch also updates the mve-vneg.c test to scan for the better code
generation when loading and storing the vectors involved: it checks
that no 'orr' instruction is generated to cope with misalignment at
runtime.
This test was chosen among the other mve tests, but any other should
be OK. Using a plain vector copy loop (dest[i] = a[i]) is not a good
test because the compiler chooses to use memcpy.
For instance we now generate:
test_vneg_s32x4:
vldrw.32 q3, [r1]
vneg.s32 q3, q3
vstrw.32 q3, [r0]
bx lr
instead of:
test_vneg_s32x4:
orr r3, r1, r0
lsls r3, r3, #28
bne .L15
vldrw.32 q3, [r1]
vneg.s32 q3, q3
vstrw.32 q3, [r0]
bx lr
.L15:
push {r4, r5}
ldrd r2, r3, [r1, #8]
ldrd r5, r4, [r1]
rsbs r2, r2, #0
rsbs r5, r5, #0
rsbs r4, r4, #0
rsbs r3, r3, #0
strd r5, r4, [r0]
pop {r4, r5}
strd r2, r3, [r0, #8]
bx lr
2021-01-12 Christophe Lyon <christophe.lyon@linaro.org>
PR target/97875
gcc/
* config/arm/arm.h (ARM_HAVE_NEON_V8QI_LDST): New macro.
(ARM_HAVE_NEON_V16QI_LDST, ARM_HAVE_NEON_V4HI_LDST): Likewise.
(ARM_HAVE_NEON_V8HI_LDST, ARM_HAVE_NEON_V2SI_LDST): Likewise.
(ARM_HAVE_NEON_V4SI_LDST, ARM_HAVE_NEON_V4HF_LDST): Likewise.
(ARM_HAVE_NEON_V8HF_LDST, ARM_HAVE_NEON_V4BF_LDST): Likewise.
(ARM_HAVE_NEON_V8BF_LDST, ARM_HAVE_NEON_V2SF_LDST): Likewise.
(ARM_HAVE_NEON_V4SF_LDST, ARM_HAVE_NEON_DI_LDST): Likewise.
(ARM_HAVE_NEON_V2DI_LDST): Likewise.
(ARM_HAVE_V8QI_LDST, ARM_HAVE_V16QI_LDST): Likewise.
(ARM_HAVE_V4HI_LDST, ARM_HAVE_V8HI_LDST): Likewise.
(ARM_HAVE_V2SI_LDST, ARM_HAVE_V4SI_LDST, ARM_HAVE_V4HF_LDST): Likewise.
(ARM_HAVE_V8HF_LDST, ARM_HAVE_V4BF_LDST, ARM_HAVE_V8BF_LDST): Likewise.
(ARM_HAVE_V2SF_LDST, ARM_HAVE_V4SF_LDST, ARM_HAVE_DI_LDST): Likewise.
(ARM_HAVE_V2DI_LDST): Likewise.
* config/arm/mve.md (*movmisalign<mode>_mve_store): New pattern.
(*movmisalign<mode>_mve_load): New pattern.
* config/arm/neon.md (movmisalign<mode>): Move to ...
* config/arm/vec-common.md: ... here.
PR target/97875
gcc/testsuite/
* gcc.target/arm/simd/mve-vneg.c: Update test.
LRA can loop infinitely on targets without `reg + imm` insns. Register elimination
on such targets can increase register pressure resulting in permanent
stack size increase and changing elimination offset. To avoid such a situation, a simple
transformation can be done to avoid register pressure increase after
generating reload insns containing eliminated hard regs.
gcc/ChangeLog:
PR target/97969
* lra-eliminations.c (eliminate_regs_in_insn): Add transformation
of pattern 'plus (plus (hard reg, const), pseudo)'.
gcc/testsuite/ChangeLog:
PR target/97969
* gcc.target/arm/pr97969.c: New.
This patch teaches cp_walk_subtrees to visit the template represented
by a CTAD placeholder, which would otherwise be not visited during
find_template_parameters. The template may be a template template
parameter (as in the first testcase), or it may implicitly use the
template parameters of an enclosing class template (as in the second
testcase), and in either case we need to visit this tree to record the
template parameters used therein for later satisfaction.
gcc/cp/ChangeLog:
PR c++/98611
* tree.c (cp_walk_subtrees) <case TEMPLATE_TYPE_PARM>: Visit
the template of a CTAD placeholder.
gcc/testsuite/ChangeLog:
PR c++/98611
* g++.dg/cpp2a/concepts-ctad1.C: New test.
* g++.dg/cpp2a/concepts-ctad2.C: New test.
This fixes the check that disqualifies BB vectorization because of
required unrolling to match up with the later exact_div we do. To
not disable the ability to split groups that do not match up
exactly with a chosen vector type this also introduces a soft-fail
mechanism to vect_build_slp_tree_1 which delays failing to after
the matches[] array is populated from other checks and only then
determines the split point according to the vector type.
2021-01-12 Richard Biener <rguenther@suse.de>
PR tree-optimization/98550
* tree-vect-slp.c (vect_record_max_nunits): Check whether
the group size is a multiple of the vector element count.
(vect_build_slp_tree_1): When we need to fail because
the vector type chosen causes unrolling do so lazily
without affecting matches only at the end to guide group splitting.
* g++.dg/opt/pr98550.C: New testcase.
Similarly to 7f967bd2a7, we need to
compare strings with strcmp.
gcc/ChangeLog:
PR c++/97284
* optc-save-gen.awk: Compare also n_target_save vars with
strcmp.
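As a reminder of why pointer equality is not enough for string-valued
settings, here is a small sketch. The helper names are hypothetical and do
not correspond to the generated code; only the use of strcmp reflects the
entry above.

```c
#include <string.h>

/* Compare two optional strings by content; NULL is equal only to NULL. */
int str_option_equal(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    return strcmp(a, b) == 0;
}

/* Two equal strings held in distinct buffers: the pointers differ, yet
   the contents match.  Returns 1 when both observations hold. */
int strcmp_demo(void)
{
    char s1[] = "generic";
    char s2[] = "generic";
    const char *p = s1;
    const char *q = s2;
    return p != q && str_option_equal(p, q);
}
```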
gcc/ChangeLog:
* gcov.c (source_info::debug): New.
(print_usage): Add --debug (-D) option.
(process_args): Likewise.
(generate_results): Call src->debug after
accumulate_line_counts.
(read_graph_file): Properly assign id for EXIT_BLOCK.
* profile.c (branch_prob): Dump function body before it is
instrumented.
As the testcase shows, my latest changes caused an ICE on that testcase.
The problem is that arith_overflow_check_p now can change the use_stmt
argument (has a reference), so that if it succeeds (returns non-zero),
it points it to the GIMPLE_COND or EQ/NE or COND_EXPR assignment from the
TRUNC_DIV_EXPR assignment.
The problem was that it would change use_stmt also if it returned 0 in some
cases, such as multiple imm uses of the division, and in one of the callers
if arith_overflow_check_p returns 0 it looks at use_stmt again and performs
other checks, which of course assumes that use_stmt is the one passed
to arith_overflow_check_p and not e.g. NULL instead or some other unrelated
stmt.
The following patch fixes that by only changing use_stmt when we are about
to return non-zero (for the MULT_EXPR case, which is the only one with the
need to use different use_stmt).
2021-01-12 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/98629
* tree-ssa-math-opts.c (arith_overflow_check_p): Don't update use_stmt
unless returning non-zero.
* gcc.c-torture/compile/pr98629.c: New test.
We already had x != 0 && y != 0 to (x | y) != 0 and
x != -1 && y != -1 to (x & y) != -1 and
x < 32U && y < 32U to (x | y) < 32U, this patch adds signed
x < 0 && y < 0 to (x | y) < 0. In that case, the low/high seem to be
always the same and just in_p indicates whether it is >= 0 or < 0,
also, all types in the same bucket (same precision) should be type
compatible, but we can have some >= 0 and some < 0 comparison mixed,
so the patch handles that by using the right BIT_IOR_EXPR or BIT_AND_EXPR
and doing one set of < 0 or >= 0 first, then BIT_NOT_EXPR and then the other
one. I had to move optimize_range_tests_var_bound before this optimization
because that one deals with signed a >= 0 && a < b, and limited it to the
last reassoc pass as reassoc itself can't virtually undo this optimization
yet (and not sure if vrp would be able to).
2021-01-12 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/95731
* tree-ssa-reassoc.c (optimize_range_tests_cmp_bitwise): Also optimize
x < 0 && y < 0 && z < 0 into (x | y | z) < 0 for signed x, y, z.
(optimize_range_tests): Call optimize_range_tests_cmp_bitwise
only after optimize_range_tests_var_bound.
* gcc.dg/tree-ssa/pr95731.c: New test.
* gcc.c-torture/execute/pr95731.c: New test.
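The sign-bit reasoning behind the new case can be sketched in C as follows.
The function names are invented for the illustration, and the mixed
"x < 0 && y >= 0" form is just one way to combine the two kinds of test with
a bitwise NOT, not necessarily the exact shape the pass emits.

```c
#include <stdint.h>

/* Original and merged forms; the sign bit of x | y is the OR of the
   sign bits, and the sign bit of x & ~y is set only if x < 0 and y >= 0. */
int both_negative(int32_t x, int32_t y)         { return x < 0 && y < 0; }
int both_negative_merged(int32_t x, int32_t y)  { return (x & y) < 0; }
int both_nonneg(int32_t x, int32_t y)           { return x >= 0 && y >= 0; }
int both_nonneg_merged(int32_t x, int32_t y)    { return (x | y) >= 0; }
int neg_and_nonneg(int32_t x, int32_t y)        { return x < 0 && y >= 0; }
int neg_and_nonneg_merged(int32_t x, int32_t y) { return (x & ~y) < 0; }

/* Compare every pair drawn from a few boundary values.
   Returns 1 if all merged forms agree with the originals. */
int range_merge_demo(void)
{
    static const int32_t v[] = { INT32_MIN, -2, -1, 0, 1, 2, INT32_MAX };
    const int n = (int)(sizeof v / sizeof v[0]);

    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            if (both_negative(v[i], v[j]) != both_negative_merged(v[i], v[j]))
                return 0;
            if (both_nonneg(v[i], v[j]) != both_nonneg_merged(v[i], v[j]))
                return 0;
            if (neg_and_nonneg(v[i], v[j]) != neg_and_nonneg_merged(v[i], v[j]))
                return 0;
        }
    return 1;
}
```

Note that a conjunction of "< 0" tests merges with AND on the sign bits,
while a conjunction of ">= 0" tests merges with OR; which bitwise operator
applies depends on whether the combined tests are joined by && or ||.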
As reported by Matthias, --enable-link-serialization=1 can currently start
two concurrent links first (e.g. gnat1 and cc1).
The problem is that make var = value values seem to work differently between
dependencies and actual rules (where it was tested).
As the language make fragments can be in different order, we can have:
ada.prev = ... magic that will become $(c.serial) under --enable-link-serialization=1
gnat1$(exe): ..... $(ada.prev)
...
c.serial = cc1$(exe)
and while if I add echo $(ada.prev) in the gnat1 rule's command, it prints
cc1, the dependencies are actually evaluated during reading of the goal or
when.
The configure creates (and puts into Makefile) some serialization order of
the languages and in that order c always comes first, and the rest is
actually sorted the way the all_lang_makefrags are already sorted,
so just by forcing c/Make-lang.in first we achieve that X.serial variable
is always defined before some other Y.prev will use it in its goal
dependencies.
2021-01-12 Jakub Jelinek <jakub@redhat.com>
* configure.ac: Ensure c/Make-lang.in comes first in @all_lang_makefrags@.
* configure: Regenerated.
This PR wants us not to warn about missing field initializers when
the code in question takes place in decltype and similar. Fixed
thus.
gcc/cp/ChangeLog:
PR c++/98620
* typeck2.c (process_init_constructor_record): Don't emit
-Wmissing-field-initializers warnings in unevaluated contexts.
gcc/testsuite/ChangeLog:
PR c++/98620
* g++.dg/warn/Wmissing-field-initializers-2.C: New test.
Use a dtor to automatically remove ITER from IMM_USE list in
FOR_EACH_IMM_USE_STMT.
for gcc/ChangeLog
* ssa-iterators.h (end_imm_use_stmt_traverse): Forward
declare.
(auto_end_imm_use_stmt_traverse): New struct.
(FOR_EACH_IMM_USE_STMT): Use it.
(BREAK_FROM_IMM_USE_STMT, RETURN_FROM_IMM_USE_STMT): Remove,
along with uses...
* gimple-ssa-strength-reduction.c: ... here, ...
* graphite-scop-detection.c: ... here, ...
* ipa-modref.c, ipa-pure-const.c, ipa-sra.c: ... here, ...
* tree-predcom.c, tree-ssa-ccp.c: ... here, ...
* tree-ssa-dce.c, tree-ssa-dse.c: ... here, ...
* tree-ssa-loop-ivopts.c, tree-ssa-math-opts.c: ... here, ...
* tree-ssa-phiprop.c, tree-ssa.c: ... here, ...
* tree-vect-slp.c: ... and here, ...
* doc/tree-ssa.texi: ... and the example here.
This patch adds support for both conditional and unconditional unpacked
ASRD. This meant adding a new define_insn for the unconditional form,
instead of reusing the conditional instructions. It also meant
extending the current conditional patterns to support merging with
any independent value, not just zero.
gcc/
* config/aarch64/aarch64-sve.md (sdiv_pow2<mode>3): Extend from
SVE_FULL_I to SVE_I. Generate an UNSPEC_PRED_X.
(*sdiv_pow2<mode>3): New pattern.
(@cond_<sve_int_op><mode>): Extend from SVE_FULL_I to SVE_I.
Wrap the ASRD in an UNSPEC_PRED_X.
(*cond_<sve_int_op><mode>_2): Likewise. Replace the UNSPEC_PRED_X
predicate with a constant PTRUE, if it isn't already.
(*cond_<sve_int_op><mode>_z): Replace with...
(*cond_<sve_int_op><mode>_any): ...this new pattern.
gcc/testsuite/
* gcc.target/aarch64/sve/asrdiv_4.c: New test.
* gcc.target/aarch64/sve/cond_asrd_1.c: Likewise.
* gcc.target/aarch64/sve/cond_asrd_1_run.c: Likewise.
* gcc.target/aarch64/sve/cond_asrd_2.c: Likewise.
* gcc.target/aarch64/sve/cond_asrd_2_run.c: Likewise.
* gcc.target/aarch64/sve/cond_asrd_3.c: Likewise.
* gcc.target/aarch64/sve/cond_asrd_3_run.c: Likewise.
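For reference, a scalar sketch of what a signed divide by a power of two
computes (the operation ASRD performs per element). The function name is
reused from the pattern name only for readability; the code assumes
0 <= n <= 30 and that >> on a negative int32_t is an arithmetic shift, which
ISO C leaves implementation-defined but GCC guarantees.

```c
#include <stdint.h>

/* x / 2^n rounding toward zero, without a divide instruction:
   negative inputs are biased by 2^n - 1 before the arithmetic shift. */
int32_t sdiv_pow2(int32_t x, unsigned n)
{
    int32_t bias = x < 0 ? (int32_t)((1u << n) - 1) : 0;
    return (x + bias) >> n;
}
```

Without the bias, -7 >> 1 would give -4 rather than the -3 that C division
requires.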
This patch adds support for unpacked conditional BIC. The type suffix
could be taken from the element size or the container size, so the
patch continues to use the element size. This is consistent with
the existing support for unconditional BIC.
gcc/
* config/aarch64/aarch64-sve.md (*cond_bic<mode>_2): Extend from
SVE_FULL_I to SVE_I.
(*cond_bic<mode>_any): Likewise.
gcc/testsuite/
* g++.target/aarch64/sve/cond_bic_1.C: New test.
* g++.target/aarch64/sve/cond_bic_2.C: Likewise.
* g++.target/aarch64/sve/cond_bic_3.C: Likewise.
* g++.target/aarch64/sve/cond_bic_4.C: Likewise.
This patch extends the SMULH and UMULH support to unpacked vectors.
The type suffix must be taken from the element size rather than the
container size.
The main use of these patterns is to support division and modulus
by a constant. The conditional forms would be hard to trigger from
non-ACLE code, and ACLE code needs fully-packed vectors only.
gcc/
* config/aarch64/aarch64-sve.md (<su>mul<mode>3_highpart)
(@aarch64_pred_<MUL_HIGHPART:optab><mode>): Extend from SVE_FULL_I
to SVE_I.
gcc/testsuite/
* gcc.target/aarch64/sve/mul_highpart_3.c: New test.
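As an illustration of the division-by-constant use case, here is a scalar
sketch for 32-bit elements. The helper names are hypothetical; the constant
0xAAAAAAAB is ceil(2^33 / 3), the usual magic multiplier for unsigned
division by 3.

```c
#include <stdint.h>

/* High 32 bits of the 64-bit product of two 32-bit unsigned values,
   i.e. what UMULH computes for each 32-bit element. */
uint32_t umulh32(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* x / 3 for any 32-bit unsigned x: high-part multiply, then shift by 1. */
uint32_t udiv3(uint32_t x)
{
    return umulh32(x, 0xAAAAAAABu) >> 1;
}
```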
This patch adds support for unpacked SVE SABD and UABD.
It also rewrites the patterns so that they match as combine
patterns without the need for REG_EQUAL notes. Finally,
there was no pattern for merging with the second input,
which can be handled by reversing the operands.
The type suffix needs to be taken from the element size rather
than the container size.
gcc/
* config/aarch64/aarch64-sve.md (<su>abd<mode>_3): Extend from
SVE_FULL_I to SVE_I.
(*aarch64_cond_<su>abd<mode>_2): Likewise.
(*aarch64_cond_<su>abd<mode>_any): Likewise.
(@aarch64_pred_<su>abd<mode>): Likewise. Use UNSPEC_PRED_X
for the max and min but not for the minus.
(*aarch64_cond_<su>abd<mode>_3): New pattern.
gcc/testsuite/
* g++.target/aarch64/sve/abd_1.C: New test.
* g++.target/aarch64/sve/cond_abd_1.C: Likewise.
* g++.target/aarch64/sve/cond_abd_2.C: Likewise.
* g++.target/aarch64/sve/cond_abd_3.C: Likewise.
* g++.target/aarch64/sve/cond_abd_4.C: Likewise.
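The "max minus min" formulation mentioned above can be modelled per element
as follows. This is a sketch for 8-bit elements with invented names; the
signed variant returns the result as an unsigned byte, on the assumption
that the lane simply holds the difference modulo 2^8.

```c
#include <stdint.h>

/* Unsigned absolute difference: max(a, b) - min(a, b). */
uint8_t uabd8(uint8_t a, uint8_t b)
{
    uint8_t hi = a > b ? a : b;
    uint8_t lo = a < b ? a : b;
    return (uint8_t)(hi - lo);
}

/* Signed absolute difference; the true value (up to 255) does not fit
   in int8_t, so it is returned as the raw 8-bit lane contents. */
uint8_t sabd8(int8_t a, int8_t b)
{
    int hi = a > b ? a : b;
    int lo = a < b ? a : b;
    return (uint8_t)(hi - lo);
}
```

The comparison must use the element type: the same bit pattern 0x80 is the
largest value for uabd8 but the smallest for sabd8.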
This patch extends the ADR patterns to handle unpacked vectors.
They would work with both elements and containers, but since
the instructions only support .s and .d, we get more coverage
by using containers.
gcc/
* config/aarch64/iterators.md (SVE_24I): New iterator.
* config/aarch64/aarch64-sve.md (*aarch64_adr<mode>_shift): Extend from
SVE_FULL_SDI to SVE_24I. Use containers rather than elements.
gcc/testsuite/
* gcc.target/aarch64/sve/adr_6.c: New test.
This patch adds support for conditional binary ADD, SUB, MUL, SMAX,
UMAX, SMIN, UMIN, LSL, LSR, ASR, AND, ORR and EOR. It's not really
possible to split it up further given how the patterns are written.
Min, max and right-shift need the element size rather than the container
size. The others would work with both, although MUL should be more
efficient when applied to elements instead of containers.
gcc/
* config/aarch64/aarch64-sve.md (@cond_<SVE_INT_BINARY:optab><mode>)
(*cond_<SVE_INT_BINARY:optab><mode>_2): Extend from SVE_FULL_I
to SVE_I.
(*cond_<SVE_INT_BINARY:optab><mode>_3): Likewise.
(*cond_<SVE_INT_BINARY:optab><mode>_any): Likewise.
(*cond_<SVE_INT_BINARY:optab><mode>_2_const): Likewise.
(*cond_<SVE_INT_BINARY:optab><mode>_any_const): Likewise.
gcc/testsuite/
* g++.target/aarch64/sve/cond_arith_1.C: New test.
* g++.target/aarch64/sve/cond_arith_2.C: Likewise.
* g++.target/aarch64/sve/cond_arith_3.C: Likewise.
* g++.target/aarch64/sve/cond_arith_4.C: Likewise.
* g++.target/aarch64/sve/cond_shift_1.C: New test.
* g++.target/aarch64/sve/cond_shift_2.C: Likewise.
* g++.target/aarch64/sve/cond_shift_3.C: Likewise.
* g++.target/aarch64/sve/cond_shift_4.C: Likewise.
This patch makes the SVE_INT_BINARY_IMM patterns support
unpacked arithmetic, covering MUL, SMAX, SMIN, UMAX and UMIN.
For min and max, the type suffix must be taken from the element
size rather than the container size.
The XFAILs are due to PR98602.
gcc/
* config/aarch64/aarch64-sve.md (<SVE_INT_BINARY_IMM:optab><mode>3)
(@aarch64_pred_<SVE_INT_BINARY_IMM:optab><mode>)
(*post_ra_<SVE_INT_BINARY_IMM:optab><mode>3): Extend from SVE_FULL_I
to SVE_I.
gcc/testsuite/
PR testsuite/98602
* g++.target/aarch64/sve/max_1.C: New test.
* g++.target/aarch64/sve/min_1.C: Likewise.
* gcc.target/aarch64/sve/mul_2.c: Likewise.
This patch adds support for unpacked SVE LSL, ASR and LSR.
For right shifts, the type suffix needs to be taken from the
element size rather than the container size.
gcc/
* config/aarch64/aarch64-sve.md (<ASHIFT:optab><mode>3)
(v<ASHIFT:optab><mode>3, @aarch64_pred_<optab><mode>)
(*post_ra_v<ASHIFT:optab><mode>3): Extend from SVE_FULL_I to SVE_I.
gcc/testsuite/
* gcc.target/aarch64/sve/shift_2.c: New test.
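Why right shifts need the element size while left shifts do not can be shown
with a scalar model. The sketch assumes an 8-bit element stored in the low
byte of a 32-bit container whose upper 24 bits hold unspecified data; the
function names are made up for the example.

```c
#include <stdint.h>

/* Logical right shift performed at element width: only the low byte
   takes part, so zeros are shifted in from the top of the element. */
uint8_t lsr_element(uint32_t container, unsigned n)
{
    return (uint8_t)((uint8_t)container >> n);
}

/* The same shift performed at container width: bits from the upper,
   unspecified part of the container leak into the element. */
uint8_t lsr_container(uint32_t container, unsigned n)
{
    return (uint8_t)(container >> n);
}

/* Left shifts only move bits upwards, so both widths agree on the
   low byte (for n < 8). */
uint8_t lsl_element(uint32_t container, unsigned n)
{
    return (uint8_t)((uint8_t)container << n);
}

uint8_t lsl_container(uint32_t container, unsigned n)
{
    return (uint8_t)(container << n);
}
```

With container 0xABCDEF80 and a shift of 4, the element-width result is 0x08
but the container-width result is 0xF8.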
In GCC10 cp_walk_subtrees has been changed to walk template arguments.
As the following testcase shows, that changed the mangling of some functions.
I believe the previous behavior that find_abi_tags_r doesn't recurse into
template args has been the correct one, but setting *walk_subtrees = 0
for the types and handling the types subtree walking manually in
find_abi_tags_r looks too hard, there are a lot of subtrees and details what
should and shouldn't be walked, both in tree.c (walk_type_fields there,
which is static) and in cp_walk_subtrees itself.
The following patch abuses the fact that *walk_subtrees is an int to
tell cp_walk_subtrees it shouldn't walk the template args.
Co-authored-by: Jason Merrill <jason@redhat.com>
gcc/cp/ChangeLog:
PR c++/98481
* class.c (find_abi_tags_r): Set *walk_subtrees to 2 instead of 1
for types.
(mark_abi_tags_r): Likewise.
* decl2.c (min_vis_r): Likewise.
* tree.c (cp_walk_subtrees): If *walk_subtrees_p is 2, look through
typedefs.
gcc/testsuite/ChangeLog:
PR c++/98481
* g++.dg/abi/abi-tag24.C: New test.
The vectorizer, for large permuted grouped loads, generates
inefficient intermediate code (cleaned up only later) that runs
into complexity issues in SCEV analysis and elsewhere. For the
non-single-element interleaving case we already put a hard limit
in place, this applies the same limit to the missing case.
2021-01-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/91403
* tree-vect-data-refs.c (vect_analyze_group_access_1): Cap
single-element interleaving group size at 4096 elements.
* gcc.dg/vect/pr91403.c: New testcase.
The .ld1_args file is not created when HAVE_GNU_LD is false.
The ltrans0.ltrans_arg file is not created when the make jobserver
is available, so remove the MAKEFLAGS variable.
Add an exception for *.gcc_args files similar to the
exception for *.cdtor.* files.
Limit both exceptions to targets that define EH_FRAME_THROUGH_COLLECT2.
That means although the test case does not use C++ constructors
or destructors it is still using dwarf2 frame info.
2021-01-11 Bernd Edlinger <bernd.edlinger@hotmail.de>
PR testsuite/98225
* gcc.misc-tests/outputs.exp: Unset MAKEFLAGS.
Expect .ld1_args only when GNU LD is used.
Add an exception for *.gcc_args files.
This fixes a double-counting in the reduction cost when vectorizing
the reduction through the regular vectorizable_* functions.
2021-01-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/98526
* tree-vect-loop.c (vect_model_reduction_cost): Remove costing
of the actual reduction op for the regular case.
(vectorizable_reduction): Cost the stmts
vect_transform_reduction produces here.
The deprecation phase for access checks is finished.
The `-ftransition=import` and `-ftransition=checkimports` switches no
longer have an effect and are now removed. Symbols that are not visible
in a particular scope will no longer be found by the compiler.
Reviewed-on: https://github.com/dlang/dmd/pull/12124
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd 2d3d13748.
* d-lang.cc (d_handle_option): Remove OPT_ftransition_checkimports and
OPT_ftransition_import.
* gdc.texi (Warnings): Remove documentation for -ftransition=import
and -ftransition=checkimports.
* lang.opt (ftransition=checkimports): Remove.
(ftransition=import): Remove.
The vec-abi-varargs-1.c testcase on IBM Z currently fails.
While adding an SI mode vector to a DI mode vector the first is unpacked using:
_28 = BIT_INSERT_EXPR <{ 0, 0, 0, 0 }, _2, 0>;
_34 = [vec_unpack_lo_expr] _28;
However, on big endian targets lo refers to the right hand side of the vector - in this case the zeroes.
2021-01-11 Andreas Krebbel <krebbel@linux.ibm.com>
* tree-ssa-forwprop.c (simplify_vector_constructor): For
big-endian, use UNPACK[_FLOAT]_HI.
This fixes a memory leak in complex_add_pattern because I was not calling
vect_free_slp_tree when dissolving one side of the TWO_OPERANDS nodes.
Secondly, it also upgrades the class to the new interface required by the other
patterns.
gcc/ChangeLog:
* tree-vect-slp-patterns.c (class complex_pattern,
class complex_add_pattern): Add parameters to matches.
(complex_add_pattern::build): Free memory.
(complex_add_pattern::matches): Move validation to end of match.
(complex_add_pattern::recognize): Likewise.
This fixes a bug with externals and linear_loads_p where I forgot to save the
value before returning.
It also fixes handling of nodes with multiple children on a non-VEC_PERM node.
There the child iteration would already resolve the kind, and the loads are all
expected to be the same if valid, so just return one.
gcc/ChangeLog:
* tree-vect-slp-patterns.c (linear_loads_p): Fix externals.
This fixes an issue where is_linear_load_p could return the incorrect
permutation kind because it is single pass.
This arranges the candidates in such a way that there won't be any ambiguity, so
that the function can still be linear but give correct values.
gcc/ChangeLog:
* tree-vect-slp-patterns.c (is_linear_load_p): Fix ambiguity.
For floating point multiply, we have nice code in reassoc to reassociate
multiplications to almost optimal sequence of as few multiplications as
possible (or library call), but for integral types we just give up
because there is no __builtin_powi* for those types.
As there is no library routine we could use, instead of adding new internal
call just to hold it temporarily and then lower to multiplications again,
this patch for the integral types calls into the sincos pass routine that
expands it into multiplications right away.
2021-01-11 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/95867
* tree-ssa-math-opts.h: New header.
* tree-ssa-math-opts.c: Include tree-ssa-math-opts.h.
(powi_as_mults): No longer static. Use build_one_cst instead of
build_real. Formatting fix.
* tree-ssa-reassoc.c: Include tree-ssa-math-opts.h.
(attempt_builtin_powi): Handle multiplication reassociation without
powi_fndecl using powi_as_mults.
(reassociate_bb): For integral types don't require
-funsafe-math-optimizations to call attempt_builtin_powi.
* gcc.dg/tree-ssa/pr95867.c: New test.
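As a rough illustration of expanding an integral power straight into
multiplications, the sketch below uses plain square-and-multiply with
wrapping unsigned arithmetic. The helper name ipow_as_mults and its
interface are hypothetical, and the multiplication chain chosen by
powi_as_mults in GCC may well differ; the sketch only shows that a product
of n copies of x needs far fewer than n multiplications.

```c
#include <stdint.h>

/* Hypothetical helper, not GCC code: compute x**n with wrapping
   unsigned arithmetic using square-and-multiply.  *mults receives the
   number of multiplications performed (including the first, trivial
   multiplication of 1 by x).  */
uint64_t
ipow_as_mults (uint64_t x, unsigned n, unsigned *mults)
{
  uint64_t result = 1;

  *mults = 0;
  while (n)
    {
      if (n & 1)
        {
          /* Current bit of the exponent is set: fold x in.  */
          result *= x;
          ++*mults;
        }
      n >>= 1;
      if (n)
        {
          /* More bits to go: square the base.  */
          x *= x;
          ++*mults;
        }
    }
  return result;
}
```

For example, calling ipow_as_mults (3, 8, &m) walks the four bits of the
exponent instead of multiplying eight times.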
On top of the previous widening_mul patch, this one recognizes also
(non-perfect) signed multiplication with overflow, like:
int
f5 (int x, int y, int *res)
{
*res = (unsigned) x * y;
return x && (*res / x) != y;
}
The problem with such checks is that they invoke UB if x is -1 and
y is INT_MIN during the division, but perhaps the code knows that
those values won't appear. As that case is UB, we can do for that
case whatever we want and handling that case as signed overflow
is the best option. If x is a constant not equal to -1, then the checks
are 100% correct though.
Haven't tried to pattern match bullet-proof checks, because I really don't
know if users would write it in real-world code like that,
perhaps
*res = (unsigned) x * y;
return x && (x == -1 ? (*res / y) != x : (*res / x) != y);
?
https://wiki.sei.cmu.edu/confluence/display/c/INT32-C.+Ensure+that+operations+on+signed+integers+do+not+result+in+overflow
suggests to use twice as wide multiplication (perhaps we should handle that
too, for both signed and unsigned), or some very large code
with 4 different divisions nested in many conditionals, no way one can
match all the possible variants thereof.
2021-01-11 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/95852
* tree-ssa-math-opts.c (maybe_optimize_guarding_check): Change
mul_stmts parameter type to vec<gimple *> &. Before cond_stmt
allow in the bb any of the stmts in that vector, div_stmt and
up to 3 cast stmts.
(arith_cast_equal_p): New function.
(arith_overflow_check_p): Add cast_stmt argument, handle signed
multiply overflow checks.
(match_arith_overflow): Adjust caller. Handle signed multiply
overflow checks.
* gcc.target/i386/pr95852-3.c: New test.
* gcc.target/i386/pr95852-4.c: New test.
The following patch pattern recognizes some forms of multiplication followed
by overflow check through division by one of the operands compared to the
other one, with optional removal of guarding non-zero check for that operand
if possible. The patterns are replaced with effectively
__builtin_mul_overflow or __builtin_mul_overflow_p. The testcases cover 64
different forms of that.
2021-01-11 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/95852
* tree-ssa-math-opts.c (maybe_optimize_guarding_check): New function.
(uaddsub_overflow_check_p): Renamed to ...
(arith_overflow_check_p): ... this. Handle also multiplication
with overflow check.
(match_uaddsub_overflow): Renamed to ...
(match_arith_overflow): ... this. Add cfg_changed argument. Handle
also multiplication with overflow check. Adjust function comment.
(math_opts_dom_walker::after_dom_children): Adjust callers. Call
match_arith_overflow also for MULT_EXPR.
* gcc.target/i386/pr95852-1.c: New test.
* gcc.target/i386/pr95852-2.c: New test.
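One of the unsigned forms described above might look like the following
sketch; the function names are hypothetical and this is not taken from the
testcases. The first function multiplies and then checks for overflow by
dividing by one operand behind a non-zero guard; the second expresses the
same result with __builtin_mul_overflow, which needs neither the division
nor the guard.

```c
#include <stdbool.h>

/* Multiply, then detect overflow by dividing the product by x.
   The "x &&" guard avoids division by zero.  */
bool
umul_overflows_div (unsigned x, unsigned y, unsigned *res)
{
  *res = x * y;
  return x && (*res / x) != y;
}

/* Equivalent formulation through the builtin: no division, no guard.  */
bool
umul_overflows_builtin (unsigned x, unsigned y, unsigned *res)
{
  return __builtin_mul_overflow (x, y, res);
}
```

Because unsigned arithmetic wraps, the two functions are equivalent for
all inputs, which is what makes the replacement safe.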
__builtin_convertvector seems well-suited to implementing the vmovl and
vmovn intrinsics that widen and narrow the integer elements in a vector.
This removes some more inline assembly from the intrinsics.
gcc/
* config/aarch64/arm_neon.h (vmovl_s8): Reimplement using
__builtin_convertvector.
(vmovl_s16): Likewise.
(vmovl_s32): Likewise.
(vmovl_u8): Likewise.
(vmovl_u16): Likewise.
(vmovl_u32): Likewise.
(vmovn_s16): Likewise.
(vmovn_s32): Likewise.
(vmovn_s64): Likewise.
(vmovn_u16): Likewise.
(vmovn_u32): Likewise.
(vmovn_u64): Likewise.
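The widening and narrowing behaviour can be sketched without arm_neon.h
by using GCC's generic vector extensions. The type and function names
below are hypothetical stand-ins rather than the definitions from the
header; the sketch only shows that __builtin_convertvector converts each
element as a C cast would.

```c
#include <stdint.h>

/* Generic vector types standing in for int8x8_t and int16x8_t.  */
typedef int8_t v8qi_t __attribute__ ((vector_size (8)));
typedef int16_t v8hi_t __attribute__ ((vector_size (16)));

/* Widen each signed 8-bit element to 16 bits (vmovl-like).  */
v8hi_t
widen_s8 (v8qi_t a)
{
  return __builtin_convertvector (a, v8hi_t);
}

/* Narrow each 16-bit element to 8 bits, keeping the low byte
   (vmovn-like).  */
v8qi_t
narrow_s16 (v8hi_t a)
{
  return __builtin_convertvector (a, v8qi_t);
}
```

Since the conversion is expressed in terms the compiler understands, it
can be optimized like any other vector operation, unlike inline assembly.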
gcc/testsuite/ChangeLog:
PR gcov-profile/98273
* lib/gcov.exp: Add run-gcov-pytest function which runs pytest.
* g++.dg/gcov/pr98273.C: New test.
* g++.dg/gcov/gcov.py: New test.
* g++.dg/gcov/test-pr98273.py: New test.
gcc/ChangeLog:
* gimple-if-to-switch.cc (struct condition_info): Use auto_var.
(if_chain::is_beneficial): Delete clusters.
(find_conditions): Make second argument of conditions_in_bbs a
pointer so that we have control over its lifetime.
(pass_if_to_switch::execute): Delete them.
This patch makes move_unallocated_pseudos consistent
with what we have in function find_moveable_pseudos, where we
record the original pseudo into pseudo_replaced_reg only if
validate_change succeeds with newreg. To ensure every
unallocated pseudo in move_unallocated_pseudos has expected
information, it's better to add a check and skip it if it's
unexpected. This avoids possible ICEs in future.
gcc/ChangeLog:
* ira.c (move_unallocated_pseudos): Check other_reg and skip if
it isn't set.
Adds TEST_OUTPUT directives and reduces the verbosity of many tests.
Reviewed-on: https://github.com/dlang/dmd/pull/12112
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd cb1106ad5.
Expression-based contract syntax has been added. Contracts that consist
of a single assertion can now be written more succinctly and multiple
`in` or `out` contracts can be specified for the same function.
Reviewed-on: https://github.com/dlang/dmd/pull/12106
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd e598f69c0.
Remove fallout from commit 0bd675183d ("match.pd: Add ~(X - Y) -> ~X
+ Y simplification [PR96685]") and paper over the regression caused, as
it is not the subject of the test cases affected.
Previously assembly like this:
.text
.align 1
.globl eq_notsi
.type eq_notsi, @function
eq_notsi:
.word 0 # 35 [c=0] procedure_entry_mask
subl2 $4,%sp # 46 [c=32] *addsi3
mcoml 4(%ap),%r0 # 32 [c=16] *one_cmplsi2_ccz
jeql .L1 # 34 [c=26] *branch_ccz
addl2 $2,%r0 # 31 [c=32] *addsi3
.L1:
ret # 40 [c=0] return
.size eq_notsi, .-eq_notsi
was produced. Now this:
.text
.align 1
.globl eq_notsi
.type eq_notsi, @function
eq_notsi:
.word 0 # 36 [c=0] procedure_entry_mask
subl2 $4,%sp # 48 [c=32] *addsi3
movl 4(%ap),%r0 # 33 [c=16] *movsi_2
cmpl %r0,$-1 # 34 [c=8] *cmpsi_ccz/1
jeql .L3 # 35 [c=26] *branch_ccz
subl3 %r0,$1,%r0 # 32 [c=32] *subsi3/1
ret # 27 [c=0] return
.L3:
clrl %r0 # 31 [c=2] *movsi_2
ret # 41 [c=0] return
.size eq_notsi, .-eq_notsi
is, which cannot work with post-reload comparison elimination, due to
the comparison against -1 rather than 0.
Use subtraction from a constant then rather than addition as the former
operation is not transformed, removing these regressions:
FAIL: gcc.target/vax/cmpelim-eq-notsi.c -O1 scan-rtl-dump-times cmpelim "deleting insn with uid" 1
FAIL: gcc.target/vax/cmpelim-eq-notsi.c -O1 scan-assembler-not \t(bit|cmpz?|tst).
FAIL: gcc.target/vax/cmpelim-eq-notsi.c -O1 scan-assembler one_cmplsi[^ ]*_ccz(/[0-9]+)?\n
FAIL: gcc.target/vax/cmpelim-lt-notsi.c -O1 scan-rtl-dump-times cmpelim "deleting insn with uid" 1
FAIL: gcc.target/vax/cmpelim-lt-notsi.c -O1 scan-assembler-not \t(bit|cmpz?|tst).
FAIL: gcc.target/vax/cmpelim-lt-notsi.c -O1 scan-assembler one_cmplsi[^ ]*_ccn(/[0-9]+)?\n
and likewise across some of the other optimization levels verified.
The LE variant appears unaffected as the new transformation produces
slightly different although still suboptimal code:
.text
.align 1
.globl le_notsi
.type le_notsi, @function
le_notsi:
.word 0 # 27 [c=0] procedure_entry_mask
subl2 $4,%sp # 34 [c=32] *addsi3
movl 4(%ap),%r1 # 23 [c=16] *movsi_2
mcoml %r1,%r0 # 24 [c=8] *one_cmplsi2_ccnz
jleq .L1 # 26 [c=26] *branch_ccnz
subl3 %r1,$1,%r0 # 22 [c=32] *subsi3/1
.L1:
ret # 32 [c=0] return
.size le_notsi, .-le_notsi
but update the test case too, for consistency with the other two.
gcc/testsuite/
* gcc.target/vax/cmpelim-eq-notsi.c: Use subtraction from a
constant then rather than addition.
* gcc.target/vax/cmpelim-le-notsi.c: Likewise.
* gcc.target/vax/cmpelim-lt-notsi.c: Likewise.
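The identity behind the match.pd simplification named above can be
checked in plain C. This is a sketch with hypothetical function names and
is not the content of the VAX test cases: in wrapping arithmetic ~a equals
-a - 1, so ~(x - y) = y - x - 1 = ~x + y.

```c
#include <stdint.h>

/* Left-hand side of the simplification: complement of a difference.  */
uint32_t
not_of_difference (uint32_t x, uint32_t y)
{
  return ~(x - y);
}

/* Right-hand side: complement of x plus y.  */
uint32_t
not_plus (uint32_t x, uint32_t y)
{
  return ~x + y;
}
```

Because both forms compute the same value, source code written as an
addition to a complemented operand is a candidate for the same
canonicalization, which is why the tests switch to subtraction from a
constant.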
For predictable semantics propagate the mode from operands referred by
the FP substitution to the `const_double_zero' expressions used with the
associated condition code calculation. Use an iterator to make copies
of the FP substitution across the FP modes supported as the substitution
now has to match the mode of the operands.
gcc/
* config/vax/vax.md (subst_f<cc>): Add mode to operands and
`const_double_zero'.
For predictable semantics propagate the mode from operands referred by
FP substitutions to the `const_double_zero' expressions used with the
associated condition code calculation, resulting in the following update
to insn-emit.c code produced for the `pdp11-aout' target (with machine
description line numbering change noise removed):
@@ -1514,7 +1514,7 @@
gen_rtx_COMPARE (CCmode,
gen_rtx_ABS (DFmode,
operand1),
- CONST_DOUBLE_ATOF ("0", VOIDmode))),
+ CONST_DOUBLE_ATOF ("0", DFmode))),
gen_rtx_SET (operand0,
gen_rtx_ABS (DFmode,
copy_rtx (operand1)))));
@@ -1555,7 +1555,7 @@
gen_rtx_COMPARE (CCmode,
gen_rtx_NEG (DFmode,
operand1),
- CONST_DOUBLE_ATOF ("0", VOIDmode))),
+ CONST_DOUBLE_ATOF ("0", DFmode))),
gen_rtx_SET (operand0,
gen_rtx_NEG (DFmode,
copy_rtx (operand1)))));
@@ -1790,7 +1790,7 @@
gen_rtx_MULT (DFmode,
operand1,
operand2),
- CONST_DOUBLE_ATOF ("0", VOIDmode))),
+ CONST_DOUBLE_ATOF ("0", DFmode))),
gen_rtx_SET (operand0,
gen_rtx_MULT (DFmode,
copy_rtx (operand1),
@@ -1942,7 +1942,7 @@
gen_rtx_DIV (DFmode,
operand1,
operand2),
- CONST_DOUBLE_ATOF ("0", VOIDmode))),
+ CONST_DOUBLE_ATOF ("0", DFmode))),
gen_rtx_SET (operand0,
gen_rtx_DIV (DFmode,
copy_rtx (operand1),
Provide a new iterator to provide copies of FP substitutions across the
FP modes supported as the substitutions now need to match the mode of
the operands.
gcc/
* config/pdp11/pdp11.md (PDPfp): New mode iterator.
(fcc_cc, fcc_ccnz): Use it. Add mode to `const_double_zero' and
operands.
As conversions between signed integers and signed enums with the same
precision are useless in GIMPLE, it seems strange that we require that
POINTER_DIFF_EXPR result must be INTEGER_TYPE.
If we really wanted to require that, we'd need to change the gimplifier
to ensure that, which isn't the case for the following testcase.
What is going on during the gimplification is that when we have the
(enum T) (p - q) cast, it is stripped through
/* Strip away as many useless type conversions as possible
at the toplevel. */
STRIP_USELESS_TYPE_CONVERSION (*expr_p);
and when the MODIFY_EXPR is gimplified, the *to_p has enum T type,
while *from_p has intptr_t type and as there is no conversion in between,
we just create GIMPLE_ASSIGN from that.
2021-01-09 Jakub Jelinek <jakub@redhat.com>
PR c++/98556
* tree-cfg.c (verify_gimple_assign_binary): Allow lhs of
POINTER_DIFF_EXPR to be any integral type.
* c-c++-common/pr98556.c: New test.
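The kind of source that leads to a POINTER_DIFF_EXPR with an enum-typed
left-hand side might look like the sketch below. It is not the actual
testcase; the enum and function names are hypothetical, and the enumerator
value relies on the GNU extension that lets an enum be as wide as
ptrdiff_t.

```c
/* An enum forced to the width of ptrdiff_t via a GNU extension
   (__PTRDIFF_MAX__ is predefined by GCC and Clang).  */
enum byte_distance
{
  BYTE_DISTANCE_MIN = -__PTRDIFF_MAX__ - 1
};

/* The cast from the pointer difference to the enum has the same
   precision and signedness, so it is a useless conversion in GIMPLE
   and the subtraction ends up assigned directly to an enum value.  */
enum byte_distance
ptr_distance (const char *p, const char *q)
{
  return (enum byte_distance) (p - q);
}
```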
If an asm insn fails constraint checking during vregs, it is just deleted.
We don't delete asm goto though because of the edges to the labels, so
instantiate_virtual_regs_in_insn would just remove the inputs and their
constraints, the pattern etc.
This worked fine when asm goto couldn't have output operands, but causes
ICEs later on when it has more than one output (and furthermore doesn't
really remove the problematic outputs). The problem is that
for multiple outputs we have a PARALLEL with multiple ASM_OPERANDS, but
those must use the same ASM_OPERANDS_INPUT_VEC etc., but the code was
adjusting just one.
The following patch turns invalid asm goto into a bare
asm goto ("" : : : : lab, lab2, lab3);
i.e. no inputs/outputs/clobbers, just the labels.
2021-01-09 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/98603
* function.c (instantiate_virtual_regs_in_insn): For asm goto
with impossible constraints, drop all SETs, CLOBBERs, drop PARALLEL
if any, set ASM_OPERANDS mode to VOIDmode and change
ASM_OPERANDS_OUTPUT_CONSTRAINT and ASM_OPERANDS_OUTPUT_IDX.
* gcc.target/i386/pr98603.c: New test.
* gcc.target/aarch64/pr98603.c: New test.
Back when I introduced debug markers, I seem to have been under the
impression that location line 0 would only ever occur for unknown and
builtin locations.
Though line 0 never comes up in normal processing of source files, and
debug info formats often cannot represent them, I suppose there's no
need to preemptively discard them during final.
for gcc/ChangeLog
PR debug/97714
* final.c (notice_source_line): Narrow down the condition to
skip a line-0 marker.
for gcc/testsuite/ChangeLog
PR debug/97714
* gcc.dg/debug/pr97714.c: New.
The destination register is only partially overwritten, so + should be
used instead of =.
gcc/ChangeLog:
2021-01-08 Ilya Leoshkevich <iii@linux.ibm.com>
* config/s390/vector.md (*tf_to_fprx2_0): Rename from
"*mov_tf_to_fprx2_0" for consistency, fix constraint.
(*tf_to_fprx2_1): Rename from "*mov_tf_to_fprx2_1" for
consistency, fix constraint.
Give end users the opportunity to find out whether long doubles are
stored in floating-point register pairs or in vector registers, so that
they could fine-tune their asm statements.
gcc/ChangeLog:
2020-12-14 Ilya Leoshkevich <iii@linux.ibm.com>
* config/s390/s390-c.c (s390_def_or_undef_macro): Accept
callables instead of mask values.
(struct target_flag_set_p): New predicate.
(s390_cpu_cpp_builtins_internal): Define or undefine
__LONG_DOUBLE_VX__ macro.
2020-12-14 Ilya Leoshkevich <iii@linux.ibm.com>
gcc/testsuite/ChangeLog:
* gcc.target/s390/vector/long-double-vx-macro-off-on.c: New test.
* gcc.target/s390/vector/long-double-vx-macro-on-off.c: New test.
We shouldn't do replace_result_decl after evaluating a call that returns
a PMF because PMF temporaries aren't wrapped in a TARGET_EXPR (and so we
can't trust ctx->object), and PMF initializers can't be self-referential
anyway, so replace_result_decl would always be a no-op.
To that end, this patch changes the relevant AGGREGATE_TYPE_P test to
CLASS_TYPE_P, which should rule out PMFs (as well as arrays, which we
can't return and therefore won't see here). This fixes an ICE from the
sanity check in replace_result_decl in the below testcase during
constexpr evaluation of the call f() in the initializer g(f()).
gcc/cp/ChangeLog:
PR c++/98551
* constexpr.c (cxx_eval_call_expression): Check CLASS_TYPE_P
instead of AGGREGATE_TYPE_P before calling replace_result_decl.
gcc/testsuite/ChangeLog:
PR c++/98551
* g++.dg/cpp0x/constexpr-pmf2.C: New test.
In the first testcase below, we incorrectly reject the use of the
protected non-static member A::var0 from C<int>::g() because
check_accessibility_of_qualified_id, at template parse time, determines
that the access doesn't go through 'this'. (This happens because the
dependent base B<T> of C<T> doesn't have a binfo object, so it appears
to DERIVED_FROM_P that A is not an indirect base of C<T>.) From there
we create the corresponding deferred access check, which we then
perform at instantiation time and which (expectedly) fails.
The problem ultimately seems to be that we can't in general determine
whether a use of a scoped non-static member goes through 'this' until
instantiation time, as the second testcase below illustrates. So this
patch makes check_accessibility_of_qualified_id punt in such situations
to avoid creating a bogus deferred access check.
gcc/cp/ChangeLog:
PR c++/98515
* semantics.c (check_accessibility_of_qualified_id): Punt if
we're checking access of a scoped non-static member inside a
class template.
gcc/testsuite/ChangeLog:
PR c++/98515
* g++.dg/template/access32.C: New test.
* g++.dg/template/access33.C: New test.
For NO_PROFILE_COUNTERS targets, R11 is a scratch register. We can use
R10 and R11 to call mcount in large model with PIC.
gcc/
PR target/98482
* config/i386/i386.c (x86_function_profiler): Use R10 and R11
to call mcount in large model with PIC for NO_PROFILE_COUNTERS
targets.
gcc/testsuite/
PR target/98482
* gcc.target/i386/pr98482-2.c: Updated.
When running FRE in the loop pipeline (as part of the conditionally
scheduled scalar cleanups) we have to reset the SCEV hashtable as
otherwise we can end up with stale entries and all sorts of problems.
Caught by my out-of-tree verifier for this problem.
2021-01-08 Richard Biener <rguenther@suse.de>
* tree-ssa-sccvn.c (pass_fre::execute): Reset the SCEV hash table.
This plugs two memleaks in the vectorizer.
2021-01-08 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (scalar_stmts_to_slp_tree_map_t): Fix.
(vect_build_slp_tree): On cache hit release the matched
scalar stmts vector.
* tree-vect-stmts.c (vectorizable_store): Properly free
vec_oprnds before possibly gathering them again.
Permute nodes are not transparent to the permute of their children.
Instead we have to materialize child permutes always and in future
may treat permute nodes as the source of arbitrary permutes as
we can permute the lane permutation vector at will (as the target
supports in the end).
2021-01-08 Richard Biener <rguenther@suse.de>
PR tree-optimization/98544
* tree-vect-slp.c (vect_optimize_slp): Always materialize
permutes at a permute node.
* gcc.dg/vect/bb-slp-pr98544.c: New testcase.
R10 is caller-saved. Although it can be used as a static chain register,
it is preserved when calling mcount for nested functions. Use R10 as a
scratch register to call mcount in large model.
gcc/
PR target/98482
* config/i386/i386.c (x86_function_profiler): Use R10 to call
mcount in large model. Sorry for large model with PIC.
gcc/testsuite/
PR target/98482
* gcc.target/i386/pr98482-1.c: New test.
* gcc.target/i386/pr98482-2.c: Likewise.
My patch to save/restore opts_set rather than essentially treating
global_options_set as a logical OR of whether some option has ever been
explicitly set somewhere apparently broke -mcmodel= vs. target attribute
(and, as the patch shows, some other options too).
The thing is, at least for options for which we ever test opts_set->x_*
or global_options_set.x_*, we need to save/restore them next to the
saving/restoring of the actual option values.
If an option has the Save keyword, or in the case of TargetVariable, it is the
generic code that handles the saving and restoring of both the option
and corresponding opts_set flag automatically, for other variables
(TargetSave, or Target without Save) the backend needs to do that in the
target hook manually and in that case should save/restore both the option
values (the hooks mostly did that) and opts_set (they didn't).
As it seems much easier to let the automatic saving/restoring do the work
for us unless the saving/restoring of the option needs some specific magic,
the following patch is a result of grepping through the backend for
opts_set->x_ and global_options_set.x_ and for all such referenced
variables, grepping whether it is saved/restored including opts_set properly
in the generated options-save.c or not.
2021-01-08 Jakub Jelinek <jakub@redhat.com>
PR target/98585
* config/i386/i386.opt (ix86_cmodel, ix86_incoming_stack_boundary_arg,
ix86_pmode, ix86_preferred_stack_boundary_arg, ix86_regparm,
ix86_veclibabi_type): Remove x_ prefix, use TargetVariable instead of
TargetSave and initialize for variables with enum types.
(mfentry, mstack-protector-guard-reg=, mstack-protector-guard-offset=,
mstack-protector-guard-symbol=): Add Save.
* config/i386/i386-options.c (ix86_function_specific_save,
ix86_function_specific_restore): Don't save or restore x_ix86_cmodel,
x_ix86_incoming_stack_boundary_arg, x_ix86_pmode,
x_ix86_preferred_stack_boundary_arg, x_ix86_regparm,
x_ix86_veclibabi_type.
* gcc.target/i386/pr98585.c: New test.
This patch adds unpacked support for unconditional and
conditional CNOT. The type suffix has to be taken from
the element size rather than the container size.
gcc/
* config/aarch64/aarch64-sve.md (*cnot<mode>): Extend from
SVE_FULL_I to SVE_I.
(*cond_cnot<mode>_2, *cond_cnot<mode>_any): Likewise.
gcc/testsuite/
* gcc.target/aarch64/sve/cnot_2.c: New test.
* gcc.target/aarch64/sve/cond_cnot_4.c: Likewise.
* gcc.target/aarch64/sve/cond_cnot_4_run.c: Likewise.
* gcc.target/aarch64/sve/cond_cnot_5.c: Likewise.
* gcc.target/aarch64/sve/cond_cnot_5_run.c: Likewise.
* gcc.target/aarch64/sve/cond_cnot_6.c: Likewise.
* gcc.target/aarch64/sve/cond_cnot_6_run.c: Likewise.
This patch extends the conditional UXT patterns from SVE_FULL_I
to SVE_I. It doesn't matter in this case whether the type suffix
is taken from the element size or the container size.
gcc/
* config/aarch64/aarch64-sve.md (*cond_uxt<mode>_2): Extend from
SVE_FULL_I to SVE_I.
(*cond_uxt<mode>_any): Likewise.
gcc/testsuite/
* gcc.target/aarch64/sve/cond_uxt_5.c: New test.
* gcc.target/aarch64/sve/cond_uxt_5_run.c: Likewise.
* gcc.target/aarch64/sve/cond_uxt_6.c: Likewise.
* gcc.target/aarch64/sve/cond_uxt_6_run.c: Likewise.
* gcc.target/aarch64/sve/cond_uxt_7.c: Likewise.
* gcc.target/aarch64/sve/cond_uxt_7_run.c: Likewise.
* gcc.target/aarch64/sve/cond_uxt_8.c: Likewise.
* gcc.target/aarch64/sve/cond_uxt_8_run.c: Likewise.
This fixes a logical inconsistency with the SVE2 ACLE tests where the SVE2 tests
are checking for SVE support in the assembler instead of SVE2.
This makes all these tests fail when the user has an SVE enabled assembler but
not an SVE2 one.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp
(check_effective_target_aarch64_asm_sve2_ok): New.
* g++.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp: Use it.
* gcc.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp: Likewise.
This patch reimplements most of the vpadal intrinsics to use RTL
builtins in the normal way.
The ones that aren't converted are the int32x2_t -> int64x1_t ones as
the RTL pattern doesn't currently handle
these modes. We don't have a V1DI mode so it would need to return a
DImode value or a V2DI one with the first lane
being the result. It's not hard to do, but it would require a bit more
refactoring so we can do it separately later.
This patch hopefully improves the status quo.
The new Vwhalf mode attribute is created because the existing Vwtype
attribute maps V8QI wrongly (for this pattern) to "8h" as the
suffix rather than "4h" as needed.
gcc/
* config/aarch64/iterators.md (Vwhalf): New mode attribute.
* config/aarch64/aarch64-simd.md (aarch64_<sur>adalp<mode>_3):
Rename to...
(aarch64_<sur>adalp<mode>): ... This. Make more
builtin-friendly.
(<sur>sadv16qi): Adjust callsite of the above.
* config/aarch64/aarch64-simd-builtins.def (sadalp, uadalp): New
builtins.
* config/aarch64/arm_neon.h (vpadal_s8): Reimplement using
builtins.
(vpadal_s16): Likewise.
(vpadal_u8): Likewise.
(vpadal_u16): Likewise.
(vpadalq_s8): Likewise.
(vpadalq_s16): Likewise.
(vpadalq_s32): Likewise.
(vpadalq_u8): Likewise.
(vpadalq_u16): Likewise.
(vpadalq_u32): Likewise.
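A scalar model may help explain why the 64-bit byte form needs a "4h"
suffix: eight 8-bit inputs are added pairwise and accumulated into only
four 16-bit lanes. The function below is a hypothetical sketch of that
behaviour for the signed 8-bit case, not code from arm_neon.h or from the
RTL pattern.

```c
#include <stdint.h>

/* Scalar model of a pairwise add-accumulate-long on signed bytes:
   lane i of the accumulator receives b[2*i] + b[2*i + 1], widened to
   16 bits before the addition.  */
void
padal_s8_model (int16_t acc[4], const int8_t b[8])
{
  for (int i = 0; i < 4; i++)
    acc[i] = (int16_t) (acc[i] + b[2 * i] + b[2 * i + 1]);
}
```

Two maximal inputs such as 127 and 127 sum to 254, which fits a 16-bit
lane but not an 8-bit one; this is why the result lanes are widened.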
This patch reimplements the vaba* arm_neon.h intrinsics using RTL
builtins that expand to proper RTL patterns
rather than using inline asm.
The implementation is fairly straightforward by defining new builtins
and using them in the header.
gcc/
* config/aarch64/aarch64-simd-builtins.def (saba, uaba): Define
builtins.
* config/aarch64/arm_neon.h (vaba_s8): Implement using builtin.
(vaba_s16): Likewise.
(vaba_s32): Likewise.
(vaba_u8): Likewise.
(vaba_u16): Likewise.
(vaba_u32): Likewise.
(vabaq_s8): Likewise.
(vabaq_s16): Likewise.
(vabaq_s32): Likewise.
(vabaq_u8): Likewise.
(vabaq_u16): Likewise.
(vabaq_u32): Likewise.
Some time ago we changed the RTL representation of the (SU)ABD
instructions to a (MINUS (MAX) (MIN)) rather than a (MINUS (ABS) (ABS))
as it more correctly models the semantics.
We should do the same for the accumulation forms of these instructions:
UABA/SABA.
This patch does that and allows the new pattern to generate the unsigned
UABA form as well.
The new form also allows it to more easily be re-used to implement the
relevant arm_neon.h intrinsics in the future.
The testcase takes an -fno-tree-reassoc to work around a side-effect of
PR98581.
gcc/
* config/aarch64/aarch64-simd.md (aba<mode>_3): Rename to...
(aarch64_<su>aba<mode>): ... This. Handle uaba as well.
Change RTL pattern to match.
gcc/testsuite/
* gcc.target/aarch64/usaba_1.c: New test.
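The MINUS (MAX) (MIN) formulation and its accumulating variant can be
modelled per lane in scalar C. This is a hypothetical sketch on 8-bit
lanes with made-up function names, not the RTL pattern itself; results
are returned as the 8-bit pattern that would be stored in the lane.

```c
#include <stdint.h>

/* Unsigned absolute difference of one lane as max - min.  */
uint8_t
uabd_lane (uint8_t a, uint8_t b)
{
  uint8_t hi = a > b ? a : b;
  uint8_t lo = a > b ? b : a;
  return (uint8_t) (hi - lo);
}

/* Signed absolute difference of one lane as max - min.  The true
   difference can be as large as 255, so it is returned as the 8-bit
   lane pattern rather than as a signed value.  */
uint8_t
sabd_lane (int8_t a, int8_t b)
{
  int8_t hi = a > b ? a : b;
  int8_t lo = a > b ? b : a;
  return (uint8_t) (hi - lo);
}

/* Accumulating form: add the absolute difference to the accumulator
   lane, wrapping at 8 bits.  */
uint8_t
aba_lane (uint8_t acc, uint8_t abd)
{
  return (uint8_t) (acc + abd);
}
```

The signed case with inputs 127 and -128 shows the benefit of max - min:
the difference 255 is representable as a lane pattern without ever taking
the absolute value of an out-of-range intermediate.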
2021-01-05 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/93794
* trans-expr.c (gfc_conv_component_ref): Remove the condition
that deferred character length components only be allocatable.
gcc/testsuite/
PR fortran/93794
* gfortran.dg/deferred_character_35.f90: New test.
2021-01-08 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/98458
* simplify.c (is_constant_array_expr): If an array constructor
expression has elements other than constants or structures, try
fixing the expression with gfc_reduce_init_expr. Also, if shape
is NULL, obtain the array size and set it.
gcc/testsuite/
PR fortran/98458
* gfortran.dg/implied_do_3.f90: New test.
- This patch introduces a new set of architecture extension test macros,
which was recently accepted into riscv-c-api-doc.
- https://github.com/riscv/riscv-c-api-doc/blob/master/riscv-c-api.md#architecture-extension-test-macro
- We will also mark the legacy architecture extension test macros as
deprecated in GCC 11, but still support them for 1 or 2 release cycles.
gcc/ChangeLog:
* common/config/riscv/riscv-common.c (riscv_current_subset_list): New.
* config/riscv/riscv-c.c (riscv-subset.h): New.
(INCLUDE_STRING): Define.
(riscv_cpu_cpp_builtins): Add new style architecture extension
test macros.
* config/riscv/riscv-subset.h (riscv_subset_list::begin): New.
(riscv_subset_list::end): New.
(riscv_current_subset_list): New.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/predef-10.c: New.
* gcc.target/riscv/predef-11.c: New.
* gcc.target/riscv/predef-12.c: New.
* gcc.target/riscv/predef-13.c: New.
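User code might consume the new-style macros roughly as sketched below.
The sketch assumes, based on the riscv-c-api-doc linked above rather than
on this patch, that __riscv_arch_test is defined when the macros are
available, that each extension has a macro such as __riscv_a, and that its
value encodes the version as major * 1000000 + minor * 1000 + revision.
The struct and function names are hypothetical.

```c
/* Decoded extension version (field names are illustrative).  */
struct ext_version
{
  unsigned ver_major;
  unsigned ver_minor;
  unsigned ver_revision;
};

/* Split a macro value, assumed to be encoded as
   major * 1000000 + minor * 1000 + revision.  */
struct ext_version
decode_ext_version (unsigned long value)
{
  struct ext_version v;

  v.ver_major = (unsigned) (value / 1000000);
  v.ver_minor = (unsigned) ((value / 1000) % 1000);
  v.ver_revision = (unsigned) (value % 1000);
  return v;
}

/* Return 1 if compiled for a RISC-V target that advertises the A
   extension through the new-style macros, 0 otherwise (including on
   non-RISC-V hosts).  */
int
has_atomic_extension (void)
{
#if defined (__riscv_arch_test) && defined (__riscv_a)
  return 1;
#else
  return 0;
#endif
}
```

On a RISC-V target, decode_ext_version (__riscv_a) would then yield the
version of the A extension the compiler was configured for.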
As pre-work for the new style of architecture extension test macros, we need
the list used in `config/riscv/riscv-c.c`, so those struct/class declarations
must move to a header file rather than a local C file.
gcc/ChangeLog
* common/config/riscv/riscv-common.c (RISCV_DONT_CARE_VERSION):
Move to riscv-subset.h.
(struct riscv_subset_t): Ditto.
(class riscv_subset_list): Ditto.
* config/riscv/riscv-subset.h (RISCV_DONT_CARE_VERSION): Move
from riscv-common.c.
(struct riscv_subset_t): Ditto.
(class riscv_subset_list): Ditto.
* config/riscv/t-riscv ($(common_out_file)): Add file
dependency.
As the testcase shows, calling cp_build_bit_cast in tsubst_copy doesn't seem
to be a good idea, because tsubst_copy might not really make the operand
non-dependent, but as processing_template_decl can be 0,
type_dependent_expression_p will return false and then cp_build_bit_cast
assumes the type is non-NULL and non-dependent.
So, this patch just follows what is done e.g. for NOP_EXPR etc. and just
builds some tree in tsubst_copy, and only calls the semantics.c function
from tsubst_copy_and_build.
2021-01-07 Jakub Jelinek <jakub@redhat.com>
PR c++/98329
* pt.c (tsubst_copy) <case BIT_CAST_EXPR>: Don't call
cp_build_bit_cast here, instead just build_min a BIT_CAST_EXPR and set
its location.
(tsubst_copy_and_build): Handle BIT_CAST_EXPR.
* g++.dg/cpp2a/bit-cast10.C: New test.
This fixes a thinko in my r11-2085 patch: when I said "But only give the
!late_return_type errors when funcdecl_p, to accept e.g. auto (*fp)() = f;
in C++11" I should've done this, otherwise we give bogus errors mentioning
"function with trailing return type" when there is none.
gcc/cp/ChangeLog:
PR c++/98441
* decl.c (grokdeclarator): Move the !funcdecl_p check inside the
!late_return_type block.
gcc/testsuite/ChangeLog:
PR c++/98441
* g++.dg/cpp0x/auto55.C: New test.
Discussing the 98469 patch and class prvalues with Jakub led me to
double-check our handling of TARGET_EXPR in constexpr.c, and add a note
about why we don't strip them in parameter initialization. And another to
clarify that we're handling an INIT_EXPR in a place we do strip them.
gcc/cp/ChangeLog:
* constexpr.c (cxx_bind_parameters_in_call): Add comment.
(cxx_eval_store_expression): Add comment.
Another change I was working on revealed that for complex numbers we were
building a ck_identity with build_conv, leading to the wrong active member
in the union being set. Rather than add another enumeration of the
appropriate conversion codes, I factored that out.
gcc/cp/ChangeLog:
* call.c (has_next): Factor out from...
(next_conversion): ...here.
(strip_standard_conversion): And here.
(is_subseq): And here.
(build_conv): Check it.
(standard_conversion): Don't call build_conv
for ck_identity.
lto-streamer-out.c's get_symbol_initial_value can return error_mark_node
rather than DECL_INITIAL as an optimization to avoid extra sections for
simple scalar values.
Add a check to the analyzer to handle such cases gracefully.
gcc/analyzer/ChangeLog:
PR analyzer/98580
* region.cc (decl_region::get_svalue_for_initializer): Gracefully
handle when LTO writes out DECL_INITIAL as error_mark_node.
gcc/testsuite/ChangeLog:
PR analyzer/98580
* gcc.dg/analyzer/pr98580-a.c: New test.
* gcc.dg/analyzer/pr98580-b.c: New test.
The testcase was failing to compile on some targets due to its use of
the non-standard functions nextupl and nextdownl. This patch makes the
testcase instead use the C99 function nexttowardl in an equivalent way.
libstdc++-v3/ChangeLog:
PR libstdc++/98384
* testsuite/20_util/to_chars/long_double.cc: Use nexttowardl
instead of the non-standard nextupl and nextdownl.
2021-01-07 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/93701
* resolve.c (find_array_spec): Put static prototype for
resolve_assoc_var before this function and call for associate
variables.
gcc/testsuite/
PR fortran/93701
* gfortran.dg/associate_54.f90: New test.
* gfortran.dg/associate_55.f90: New test.
* gfortran.dg/associate_56.f90: New test.
Adds support for using user-defined attributes on function arguments and
single-parameter alias declarations. These attributes behave analogously
to existing UDAs.
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd 9038e64c5.
* d-builtins.cc (build_frontend_type): Update call to
Parameter::create.
We do not tolerate "growing" a vector to a lower size.
2021-01-07 Richard Biener <rguenther@suse.de>
gcc/c/
* gimple-parser.c (c_parser_gimple_compound_statement): Only
reallocate loop array if it is too small.
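The "only reallocate if too small" rule above can be illustrated with a minimal sketch. The ptr_vec type and ptr_vec_grow_to function are hypothetical names invented for this illustration; they are not the vec.h API or the gimple-parser.c code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable array of pointers, for illustration only.  */
struct ptr_vec
{
  void **data;
  size_t len;
};

/* Make sure V holds at least LEN entries, setting new slots to NULL.
   A request that does not exceed the current length is a no-op, so the
   array is never "grown" to a lower size.
   Returns 0 on success and -1 if the allocation fails.  */
int
ptr_vec_grow_to (struct ptr_vec *v, size_t len)
{
  if (len <= v->len)
    return 0;

  void **p = realloc (v->data, len * sizeof *p);
  if (p == NULL)
    return -1;

  for (size_t i = v->len; i < len; i++)
    p[i] = NULL;

  v->data = p;
  v->len = len;
  return 0;
}
```

With this guard a caller can request any length unconditionally; only requests beyond the current length touch the allocation.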
The BLSI instruction sets SF and ZF based on the result and clears OF.
CF is set to something unrelated.
The following patch optimizes BLSI followed by comparison, so we don't need
to emit a TEST insn in between.
2021-01-07 Jakub Jelinek <jakub@redhat.com>
PR target/98567
* config/i386/i386.md (*bmi_blsi_<mode>_cmp, *bmi_blsi_<mode>_ccno):
New define_insn patterns.
* gcc.target/i386/pr98567-1.c: New test.
* gcc.target/i386/pr98567-2.c: New test.
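As a rough illustration of the source shape this targets (the function names are assumptions, not taken from the new tests), BLSI computes x & -x, and a comparison of that result against zero only needs the flags the instruction already sets:

```c
#include <stdint.h>

/* Isolate the lowest set bit of X, which is the value BLSI computes.
   Unsigned negation is well defined and wraps modulo 2^32.  */
uint32_t
lowest_set_bit (uint32_t x)
{
  return x & (0u - x);
}

/* Comparing the isolated bit against zero is the kind of test that can
   use ZF from BLSI directly instead of needing a separate TEST.  */
int
has_any_bit_set (uint32_t x)
{
  return lowest_set_bit (x) != 0;
}
```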
This patch extends the conditional unary integer operations
from SVE_FULL_I to SVE_I. In each case the type suffix is
taken from the element size rather than the container size:
this matters for ABS and NEG, but doesn't matter for NOT.
gcc/
* config/aarch64/aarch64-sve.md (@cond_<SVE_INT_UNARY:optab><mode>)
(*cond_<SVE_INT_UNARY:optab><mode>_2): Extend from SVE_FULL_I to SVE_I.
(*cond_<SVE_INT_UNARY:optab><mode>_any): Likewise.
gcc/testsuite/
* gcc.target/aarch64/sve/cond_unary_5.c: New test.
* gcc.target/aarch64/sve/cond_unary_5_run.c: Likewise.
* gcc.target/aarch64/sve/cond_unary_6.c: Likewise.
* gcc.target/aarch64/sve/cond_unary_6_run.c: Likewise.
* gcc.target/aarch64/sve/cond_unary_7.c: Likewise.
* gcc.target/aarch64/sve/cond_unary_7_run.c: Likewise.
* gcc.target/aarch64/sve/cond_unary_8.c: Likewise.
* gcc.target/aarch64/sve/cond_unary_8_run.c: Likewise.
This patch follows on from the previous one for the PR and
makes sure that we can handle == as well as <. Previously
we assumed without checking that IFN_VCONDEQ was available
if IFN_VCOND or IFN_VCONDU wasn't.
The patch also fixes the definition of the IFN_VCOND* functions.
The optabs are convert optabs in which the first mode is the
data mode and the second mode is the comparison or mask mode.
gcc/
PR tree-optimization/98560
* internal-fn.def (IFN_VCONDU, IFN_VCONDEQ): Use type vec_cond.
* internal-fn.c (vec_cond_mask_direct): Get the data mode from
argument 1.
(vec_cond_direct): Likewise argument 2.
(vec_condu_direct, vec_condeq_direct): Delete.
(expand_vect_cond_optab_fn): Rename to...
(expand_vec_cond_optab_fn): ...this, replacing old macro.
(expand_vec_condu_optab_fn, expand_vec_condeq_optab_fn): Delete.
(expand_vect_cond_mask_optab_fn): Rename to...
(expand_vec_cond_mask_optab_fn): ...this, replacing old macro.
(direct_vec_cond_mask_optab_supported_p): Treat the optab as a
convert optab.
(direct_vec_cond_optab_supported_p): Likewise.
(direct_vec_condu_optab_supported_p): Delete.
(direct_vec_condeq_optab_supported_p): Delete.
* gimple-isel.cc: Include internal-fn.h.
(gimple_expand_vec_cond_expr): Check that IFN_VCONDEQ is supported
before using it.
gcc/testsuite/
PR tree-optimization/98560
* gcc.dg/vect/pr98560-2.c: New test.
PR98560 is about a case in which the vectoriser initially generates:
mask_1 = a < 0;
mask_2 = mask_1 & ...;
res = VEC_COND_EXPR <mask_2, b, c>;
The vectoriser thus expects res to be calculated using vcond_mask.
However, we later manage to fold mask_2 to mask_1, leaving:
mask_1 = a < 0;
res = VEC_COND_EXPR <mask_1, b, c>;
gimple-isel then required a combined vcond to exist.
On most targets, it's not too onerous to provide all possible
(compare x select) combinations. For each data mode, you just
need to provide unsigned comparisons, signed comparisons, and
floating-point comparisons, with the data mode and type of
comparison uniquely determining the mode of the compared values.
But for targets like SVE that support “unpacked” vectors,
it's not that simple: the level of unpacking adds another
degree of freedom.
Rather than insist that the combined versions exist, I think
we should be prepared to fall back to using separate comparisons
and vcond_masks. I think that makes more sense on targets like
AArch64 and AArch32 in which compares and selects are fundamentally
separate operations anyway.
gcc/
PR tree-optimization/98560
* gimple-isel.cc (gimple_expand_vec_cond_expr): If we fail to use
IFN_VCOND{,U,EQ}, fall back on IFN_VCOND_MASK.
gcc/testsuite/
PR tree-optimization/98560
* gcc.dg/vect/pr98560-1.c: New test.
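The difference between a combined vcond and the fallback described above can be sketched in scalar C. This is only an analogy: the function names are made up for illustration, and the real transformation operates on GIMPLE internal functions, not on loops like these.

```c
#include <stddef.h>
#include <stdint.h>

/* Combined form: the comparison and the select happen in one step,
   which corresponds to requiring a vcond pattern.  */
void
select_combined (const int32_t *a, const int32_t *b, const int32_t *c,
                 int32_t *res, size_t n)
{
  for (size_t i = 0; i < n; i++)
    res[i] = a[i] < 0 ? b[i] : c[i];
}

/* Fallback, step 1: a separate comparison that produces a mask.  */
void
compare_lt_zero (const int32_t *a, uint8_t *mask, size_t n)
{
  for (size_t i = 0; i < n; i++)
    mask[i] = a[i] < 0;
}

/* Fallback, step 2: a select driven only by the mask, which corresponds
   to vcond_mask.  */
void
select_by_mask (const uint8_t *mask, const int32_t *b, const int32_t *c,
                int32_t *res, size_t n)
{
  for (size_t i = 0; i < n; i++)
    res[i] = mask[i] ? b[i] : c[i];
}
```

Both routes produce the same result; the second only needs a comparison and a mask-driven select to exist independently.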
As the testcase shows, bswap can match even byte-swapping or identity
from low part of some wider SSA_NAME.
For bswap replacement other than for vector CONSTRUCTOR the code has been
using NOP_EXPR casts if the types weren't compatible, but for vectors
we need to use VIEW_CONVERT_EXPR. The problem with the latter is that
we require that it has the same size, which isn't guaranteed, so this patch
in those cases first adds a narrowing NOP_EXPR cast and only afterwards
does a VIEW_CONVERT_EXPR.
2021-01-07 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/98568
* gimple-ssa-store-merging.c (bswap_view_convert): New function.
(bswap_replace): Use it.
* g++.dg/torture/pr98568.C: New test.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr92658-avx512bw.c: Add
-mprefer-vector-width=512 to avoid impact of different default
mtune which gcc is built with.
* gcc.target/i386/pr92658-avx512bw-2.c: Ditto.
gcc/analyzer/ChangeLog:
PR analyzer/97074
* store.cc (binding_cluster::can_merge_p): Add "out_store" param
and pass to calls to binding_cluster::make_unknown_relative_to.
(binding_cluster::make_unknown_relative_to): Add "out_store"
param. Use it to mark base regions that are pointed to by
pointers that become unknown as having escaped.
(store::can_merge_p): Pass out_store to
binding_cluster::can_merge_p.
* store.h (binding_cluster::can_merge_p): Add "out_store" param.
(binding_cluster::make_unknown_relative_to): Likewise.
* svalue.cc (region_svalue::implicitly_live_p): New vfunc.
* svalue.h (region_svalue::implicitly_live_p): New vfunc decl.
gcc/testsuite/ChangeLog:
PR analyzer/97074
* gcc.dg/analyzer/pr97074.c: New test.
This pulls in the toplevel portions of these binutils-gdb commits:
1ff6de031241c59d0ff bfd, ld: add CTF section linking
87279e3cef5b2c54f4a libctf: installable libctf as a shared library
c59e30ed1727135f8ef libctf: new testsuite
* Makefile.def: Sync with binutils-gdb:
(dependencies): all-ld depends on all-libctf.
(host_modules): libctf is no longer no_install.
No longer no_check. Checking depends on all-ld.
* Makefile.in: Regenerated.
LRA can crash when a hard register was split and the same hard register
was assigned on the previous assignment sub-pass. The following
patch fixes this problem.
gcc/ChangeLog:
PR rtl-optimization/97978
* lra-int.h (lra_hard_reg_split_p): New external.
* lra.c (lra_hard_reg_split_p): New global.
(lra): Set up lra_hard_reg_split_p after splitting a hard reg.
* lra-assigns.c (lra_assign): Don't check allocation correctness
after hard reg splitting.
gcc/testsuite/ChangeLog:
PR rtl-optimization/97978
* gcc.target/i386/pr97978.c: New.
Where possible (i.e. where that doesn't alter the intent of a test) we
use a suspend_always as the final suspend and a test that the coroutine
was 'done' to check that the state machine had terminated correctly.
Sometimes, filed PRs have 'suspend_never' as the final suspend expression
and that needs to be changed to match the testsuite style. This is one
I missed and means that the call to 'done()' on the handle is made to an
already-destructed coroutine. Surprisingly, that didn't actually trigger
a failure until glibc 2.32.
Fixed by changing the final suspend to be 'suspend_always'.
gcc/testsuite/ChangeLog:
PR c++/96504
* g++.dg/coroutines/torture/pr95519-05-gro.C: Use suspend_always
as the final suspend point so that we can check that the state
machine has reached the expected point.
The error recovery after an invalid reference to an undefined CLASS
during a TYPE declaration led to an invalid access. Add a check.
gcc/fortran/ChangeLog:
* resolve.c (resolve_component): Add check for valid CLASS
reference before trying to access CLASS data.
C++ sized deallocation only came in C++14, so this test wasn't
working properly in C++11, which isn't tested by default. Fixed
thus by constraining the dg-errors to C++14 only.
gcc/testsuite/ChangeLog:
PR testsuite/98566
* g++.dg/warn/Wmismatched-dealloc.C: Use target c++14 in
dg-error.
2021-01-06 John David Anglin <danglin@gcc.gnu.org>
libcody/ChangeLog:
PR bootstrap/98506
* resolver.cc: Only use fstatat when _POSIX_C_SOURCE >= 200809L.
In g++.dg/opt/store-merging-2.C, the natural alignment of types T and
S is a single byte, so we shouldn't expect store merging on
strict-alignment platforms. Indeed, without something like the
adjust-alignment pass to bump up the alignment of the automatic
variable, as in GCC 10, the optimization does not occur.
This patch adjusts the test so that the required alignment is
expressly stated, and so we don't rely on its accidentally being there
to get the desired optimization.
for gcc/testsuite/ChangeLog
* g++.dg/opt/store-merging-2.C: Add the required alignment.
The glimits.h overriding used in gcc/config/t-vxworks was fragile: the
intermediate file would already be there in a rebuild, and so the
adjustments would not be made, so the generated limits.h would miss
them, causing limits-width-[12] tests to fail on that target.
While changing it, I also replaced the modern $(cmd) shell syntax with
the more portable `cmd` construct.
for gcc/ChangeLog
* Makefile.in (T_GLIMITS_H): New.
(stmp-int-hdrs): Depend on it, use it.
* config/t-vxworks (T_GLIMITS_H): Override it.
(vxw-glimits.h): New.
This adds __attribute__((signed_bool_precision(precision))) to be able
to construct nonstandard boolean types which for the included testcase
is needed to simulate Ada and LTO interaction (Ada uses an 8-bit
precision boolean_type_node). This will also be useful for vector
unit testcases where we need to produce vector types with
non-standard precision signed boolean type components.
2021-01-06 Richard Biener <rguenther@suse.de>
PR tree-optimization/95582
gcc/c-family/
* c-attribs.c (c_common_attribute_table): Add entry for
signed_bool_precision.
(handle_signed_bool_precision_attribute): New.
gcc/testsuite/
* gcc.dg/pr95582.c: New testcase.
This fixes a premature optimization in the range intersection code
which assumes earlier branches have to be taken, not taking into
account that for symbolic ranges we cannot always compare endpoints.
The fix is to instantiate the compare deemed redundant (which then
fails as undecidable for the testcase).
2021-01-06 Richard Biener <rguenther@suse.de>
PR tree-optimization/98513
* value-range.cc (intersect_ranges): Compare the upper bounds
for the expected relation.
* gcc.dg/tree-ssa/pr98513.c: New testcase.
contrib/ChangeLog:
* gcc-changelog/git_commit.py: Add decode_path function.
* gcc-changelog/git_email.py: Use it in order to solve
utf8 encoding filename issues.
* gcc-changelog/git_repository.py: Likewise.
* gcc-changelog/test_email.py: Test it.
gcc/analyzer/ChangeLog:
PR analyzer/97072
* region-model-reachability.cc (reachable_regions::init_cluster):
Convert symbolic region handling to a switch statement. Add cases
to handle SK_UNKNOWN and SK_CONJURED.
gcc/testsuite/ChangeLog:
PR analyzer/97072
* gcc.dg/analyzer/pr97072.c: New test.
This ICE was fixed by r11-2694-g808f4dfeb3a95f50 (aka the big state
rewrite for GCC 11).
gcc/testsuite/ChangeLog:
PR analyzer/98073
* gcc.dg/analyzer/pr98073.c: New test.
The bogus leak message went away after
fcae512115 (aka "Hybrid EVRP and
testcases") due to that patch improving a phi node in the gimple input
to the analyzer.
gcc/testsuite/ChangeLog:
PR analyzer/98223
* gcc.dg/analyzer/pr94851-1.c: Remove xfail.
The HSAIL web server has reappeared after weeks, so restore the standard
reference for now while we consider further deprecation.
This reverts commit 7e999bd84f.
gcc/
2021-01-06 Gerald Pfeifer <gerald@pfeifer.com>
Revert:
2020-12-28 Gerald Pfeifer <gerald@pfeifer.com>
* doc/standards.texi (HSAIL): Remove section.
Commit 2f473f4b06 ("IBM Z: Do not run long double tests on old
machines") introduced a predicate for tests that must run only on z14+.
However, due to a syntax error, the predicate always returns false.
gcc/testsuite/ChangeLog:
2020-12-10 Ilya Leoshkevich <iii@linux.ibm.com>
* gcc.target/s390/s390.exp: Replace %% with %.
We don't use them, since we always call the C library functions which do
the right thing anyhow. And they aren't defined on all GNU/Linux variants.
Fixes PR go/98510
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/281473
Some files were missing from the libgo copy of internal/cpu, because they
used to only declare CacheLinePadSize which libgo gets from goarch.sh.
Now they also declare doinit, so copy them over. Adjust cpu_other.go.
Fix the amd64p32 build by adding a build constraint to cpu_no_name.go.
Fixes PR go/98493
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/281472
Jonathan mentioned on IRC that ISO/IEC 14882:2020 was published
yesterday (and indeed it appears on www.iso.org for sale).
I think we should reflect that in our documentation and in cxx-status.html,
patches attached.
I understand we want to keep C++20 support experimental even in GCC 11,
though not sure if we should still talk about "almost certainly change in
incompatible ways" rather than that it might change in incompatible ways.
2021-01-05 Jakub Jelinek <jakub@redhat.com>
* doc/invoke.texi (-std=c++20): Adjust for the publication of
ISO 14882:2020 standard.
* doc/standards.texi: Likewise.
Adds the following new `__traits' to the D language.
- isDeprecated: used to detect if a function is deprecated.
- isDisabled: used to detect if a function is marked with @disable.
- isFuture: used to detect if a function is marked with @__future.
- isModule: used to detect if a given symbol represents a module, this
enhancement also adds support using `is(sym == module)'.
- isPackage: used to detect if a given symbol represents a package,
this enhancement also adds support using `is(sym == package)'.
- child: takes two arguments. The first must be a symbol or expression
and the second must be a symbol, such as an alias to a member of the
first 'parent' argument. The result is the second 'member' argument
interpreted with its 'this' context set to 'parent'. This is the
inverse of `__traits(parent, member)'.
- isReturnOnStack: determines if a function's return value is placed on
the stack, or is returned via registers.
- isZeroInit: used to detect if a type's default initializer has no
non-zero bits.
- getTargetInfo: used to query features of the target being compiled
for, the back-end can expand this to register any key to handle the
given argument, however a reliable subset exists which includes
"cppRuntimeLibrary", "cppStd", "floatAbi", and "objectFormat".
- getLocation: returns a tuple whose entries correspond to the
filename, line number, and column number of where the argument was
declared.
- hasPostblit: used to detect if a type is a struct with a postblit.
- isCopyable: used to detect if a type allows copying its value.
- getVisibility: an alias for the getProtection trait.
Reviewed-on: https://github.com/dlang/dmd/pull/12093
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd a5c86f5b9.
* d-builtins.cc (d_eval_constant_expression): Handle ADDR_EXPR trees
created by build_string_literal.
* d-frontend.cc (retStyle): Remove function.
* d-target.cc (d_language_target_info): New variable.
(d_target_info_table): Likewise.
(Target::_init): Initialize d_target_info_table.
(Target::isReturnOnStack): New function.
(d_add_target_info_handlers): Likewise.
(d_handle_target_cpp_std): Likewise.
(d_handle_target_cpp_runtime_library): Likewise.
(Target::getTargetInfo): Likewise.
* d-target.h (struct d_target_info_spec): New type.
(d_add_target_info_handlers): Declare.
Use unsigned short to compute the zero-extended pextrw result.
PR target/98495
* gcc.target/i386/sse2-mmx-pextrw.c (compute_correct_result): Use
unsigned short to compute pextrw result.
In the testcase nontype-auto17.C below, the calls to f and g are invalid
because neither deduction nor defaulting of the template parameter T
yields a valid specialization. Deducing T doesn't work because T is
used only in a non-deduced context, and defaulting T doesn't work
because its default argument makes the type of M invalid.
But with -std=c++17 or later, we incorrectly accept both calls.
Starting with C++17 (specifically P0127R2), during deduction we're
allowed to try to deduce T from the argument '42' that's been
tentatively deduced for M. The problem is that when unify walks into
the type of M (a TYPENAME_TYPE), it immediately gives up without
performing any new unifications (so the type of M is still unknown) --
and then we go on to unify M with '42' anyway. Later in
type_unification_real, we complete the template argument vector using
T's default template argument, and end up forming the bogus
specializations f<void, 42> and g<S, 42>.
This patch fixes this issue by checking whether the type of an NTTP is
still dependent after walking into its type during unification. If it
is, it means we couldn't deduce all the template parameters used in its
type, and so we shouldn't yet unify the NTTP.
(The new testcase ttp33.C demonstrates the need for the TEMPLATE_PARM_LEVEL
check; without it, we would ICE on this testcase from the call to tsubst.)
gcc/cp/ChangeLog:
* pt.c (unify) <case TEMPLATE_PARM_INDEX>: After walking into
the type of the NTTP, substitute into the type again. If the
type is still dependent, don't unify the NTTP.
gcc/testsuite/ChangeLog:
* g++.dg/template/partial5.C: Adjust directives to expect the
same errors across all dialects.
* g++.dg/cpp1z/nontype-auto17.C: New test.
* g++.dg/cpp1z/nontype-auto18.C: New test.
* g++.dg/template/ttp33.C: New test.
My earlier patch to simplify x - y < 0 etc. for signed subtraction
with undefined overflow into x < y in match.pd regressed some tests,
even when it was guarded to be post-IPA. The following patch thus
attempts to optimize that during expansion instead (which is the last
time we can do it; afterwards we lose the information whether it was
x - y < 0 or (int) ((unsigned) x - y) < 0, for which we couldn't
optimize it).
2021-01-05 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94802
* expr.h (maybe_optimize_sub_cmp_0): Declare.
* expr.c: Include tree-pretty-print.h and flags.h.
(maybe_optimize_sub_cmp_0): New function.
(do_store_flag): Use it.
* cfgexpand.c (expand_gimple_cond): Likewise.
* gcc.target/i386/pr94802.c: New test.
* gcc.dg/Wstrict-overflow-25.c: Remove xfail.
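A small sketch of why the two forms mentioned above must be distinguished (function names are illustrative only). With signed subtraction, overflow is undefined, so x - y < 0 may be treated as x < y. With the subtraction done in unsigned arithmetic, wraparound is defined and the result can differ from x < y:

```c
#include <limits.h>

/* Signed form: callers must not let x - y overflow (undefined behaviour),
   which is exactly what allows the compiler to treat it as x < y.  */
int
sub_lt_zero (int x, int y)
{
  return x - y < 0;
}

/* The comparison the signed form may be simplified to.  */
int
plain_lt (int x, int y)
{
  return x < y;
}

/* Wrapping form: the subtraction is done modulo 2^N.  Converting an
   out-of-range unsigned value back to int is implementation-defined;
   GCC reduces it modulo 2^N.  */
int
wrapping_sub_lt_zero (int x, int y)
{
  return (int) ((unsigned) x - (unsigned) y) < 0;
}
```

For x = INT_MIN and y = 1 the wrapping form yields INT_MAX, which is not negative, while x < y is true, so that form cannot be rewritten.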
Tweak a couple of comments added in the RTL-SSA series in response
to reviewer feedback.
gcc/
* mux-utils.h (pointer_mux::m_ptr): Tweak description of contents.
* rtlanal.c (simple_regno_set): Tweak description to clarify the
RMW condition.
Richi complained on IRC that cc1 is linked against libcody.a.
From my understanding, it is just the cc1plus and cc1objplus binaries
that need it, so this patch links only those against it.
> this is already part of my Solaris libcody patch
The following updated patch contains the incremental changes between what Rainer
has committed and what I've posted.
2021-01-05 Jakub Jelinek <jakub@redhat.com>
gcc/cp/
* Make-lang.in (cc1plus-checksum, cc1plus$(exeext)): Add
$(CODYLIB) after $(BACKEND).
gcc/objcp/
* Make-lang.in (cc1objplus-checksum, cc1objplus$(exeext)): Add
$(CODYLIB) after $(BACKEND).
When materializing on a VEC_PERM node we have to permute the
incoming vectors, not the outgoing one.
2021-01-05 Richard Biener <rguenther@suse.de>
PR tree-optimization/98516
* tree-vect-slp.c (vect_optimize_slp): Permute the incoming
lanes when materializing on a VEC_PERM node.
(vectorizable_slp_permutation): Dump the permute properly.
* gcc.dg/vect/bb-slp-pr98516-1.c: New testcase.
* gcc.dg/vect/bb-slp-pr98516-2.c: Likewise.
On the following testcase we ICE during constexpr evaluation (for warnings),
because the IL has ADDR_EXPR of BIT_CAST_EXPR and ADDR_EXPR case asserts
the result is not a CONSTRUCTOR.
The patch punts on lval BIT_CAST_EXPR folding.
> This change is OK, but part of the problem is that we're trying to do
> overload resolution for an S copy/move constructor, which we shouldn't be
> because bit_cast is a prvalue, so in C++17 and up we should use it to
> directly initialize the target without any implied constructor call.
This version therefore wraps it into a TARGET_EXPR. That alone fixes
the bug, but I've kept the constexpr.c change too.
2021-01-05 Jakub Jelinek <jakub@redhat.com>
PR c++/98469
* constexpr.c (cxx_eval_constant_expression) <case BIT_CAST_EXPR>:
Punt if lval is true.
* semantics.c (cp_build_bit_cast): Call get_target_expr_sfinae on
the result if it has a class type.
* g++.dg/cpp2a/bit-cast8.C: New test.
* g++.dg/cpp2a/bit-cast9.C: New test.
In this test we ICE in type_throw_all_p because it got a deferred
noexcept which it shouldn't. Here's the story:
In noexcept61.C, we call bar, so we perform overload resolution. When
adding the (only) candidate, we need to deduce template arguments, so
call fn_type_unification as usual. That deduces U to
void (*) (int &, int &)
which is correct, but its noexcept-spec is deferred_noexcept. Then
we call add_function_candidate (bar), wherein we try to create an
implicit conversion sequence for every argument. Since baz<int> is
of unknown type, we instantiate_type it; it is a TEMPLATE_ID_EXPR
so that calls resolve_address_of_overloaded_function. But we crash
there, because target_type contains the deferred_noexcept.
So we need to maybe_instantiate_noexcept before we can compare types.
resolve_overloaded_unification seemed like the appropriate spot; now
fn_type_unification produces the function type with its noexcept-spec
instantiated. This shouldn't go against CWG 1330 because here we
really need to instantiate the noexcept-spec.
This also fixes class-deduction76.C, a dg-ice test I recently added,
therefore this fix also fixes c++/90799, yay.
gcc/cp/ChangeLog:
PR c++/82099
* pt.c (resolve_overloaded_unification): Call
maybe_instantiate_noexcept after instantiating the function
decl.
gcc/testsuite/ChangeLog:
PR c++/82099
* g++.dg/cpp1z/class-deduction76.C: Remove dg-ice.
* g++.dg/cpp0x/noexcept61.C: New test.
This moves the debug counter so that it covers individual SLP subgraphs.
2021-01-05 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (vect_slp_region): Move debug counter
to cover individual subgraphs.
It wasn't supposed to be enabled, and apparently copying around the
checking messed up the condition.
2021-01-05 Richard Biener <rguenther@suse.de>
PR tree-optimization/98428
* tree-vect-slp.c (vect_build_slp_tree_1): Properly reject
vector lane extracts for loop vectorization.
Apparently reassoc ICEs on large functions (more than 32767 basic blocks
with something to reassociate in those).
The problem is that the pass uses long type to store the ranks, and
the bb ranks are (number of SSA_NAMEs with default defs + 2 + bb->index) << 16,
so with many basic blocks we overflow the ranks and then trip the assertions
that the rank is not negative.
The following patch just uses int64_t instead of long in the pass.
Yes, it means slightly higher memory consumption (one array indexed by
bb->index is twice as large, and one hash_map from trees to the ranks
will grow by 50%), but I think it is better than punting on the
reassociation of large functions on 32-bit hosts and making it inconsistent
e.g. when cross-compiling. Given vec.h uses unsigned for vect element counts,
we don't really support more than 4G of SSA_NAMEs or more than 2G of basic
blocks in a function, so even with the << 16 we can't really overflow the
int64_t rank counters.
2021-01-05 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/98514
* tree-ssa-reassoc.c (bb_rank): Change type from long * to
int64_t *.
(operand_rank): Change type from hash_map<tree, long> to
hash_map<tree, int64_t>.
(phi_rank): Change return type from long to int64_t.
(loop_carried_phi): Change block_rank variable type from long to
int64_t.
(propagate_rank): Change return type, rank parameter type and
op_rank variable type from long to int64_t.
(find_operand_rank): Change return type from long to int64_t
and change slot variable type from long * to int64_t *.
(insert_operand_rank): Change rank parameter type from long to
int64_t.
(get_rank): Change return type and rank variable type from long to
int64_t. Use PRId64 instead of ld to print the rank.
(init_reassoc): Change rank variable type from long to int64_t
and adjust correspondingly bb_rank and operand_rank initialization.
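The overflow threshold can be checked with a little arithmetic. The sketch below uses hypothetical helper names and simulates a 32-bit long with a range check. It evaluates the rank formula quoted above in 64 bits and tests whether the value would still fit in 32 bits:

```c
#include <stdint.h>

/* Rank of a basic block, following the formula from the description:
   (number of SSA_NAMEs with default defs + 2 + bb->index) << 16.
   Evaluated in 64 bits, so the shift cannot overflow for realistic
   (non-negative) inputs.  */
int64_t
bb_rank_value (int64_t num_default_defs, int64_t bb_index)
{
  return (num_default_defs + 2 + bb_index) << 16;
}

/* Nonzero if RANK is representable in a 32-bit signed long.  */
int
fits_in_32bit_long (int64_t rank)
{
  return rank >= INT32_MIN && rank <= INT32_MAX;
}
```

Even with no default definitions, block index 32766 gives 32768 << 16 = 2^31, one past the largest 32-bit signed value. That is consistent with the limit of roughly 32767 basic blocks mentioned above.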
As requested in the PR, the one's complement abs can be done more
efficiently without cmov or branching.
I had to change the ifcvt-onecmpl-abs-1.c testcase, as we no longer optimize
it in ifcvt. On x86_64 with -m32 we generate in the end the exact same
code, but with -m64:
movl %edi, %eax
- notl %eax
- cmpl %edi, %eax
- cmovl %edi, %eax
+ sarl $31, %eax
+ xorl %edi, %eax
ret
2021-01-05 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/96928
* tree-ssa-phiopt.c (xor_replacement): New function.
(tree_ssa_phiopt_worker): Call it.
* gcc.dg/tree-ssa/pr96928.c: New test.
* gcc.target/i386/ifcvt-onecmpl-abs-1.c: Remove -fdump-rtl-ce1,
instead of scanning rtl dump for ifcvt message check assembly
for xor instruction.
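In C terms, the before and after assembly shown above correspond roughly to the two functions below. This is a sketch with made-up names; it assumes that right-shifting a negative value is an arithmetic shift, which is implementation-defined in ISO C but is what GCC does.

```c
#include <stdint.h>

/* One's complement absolute value, written with a conditional.
   This is the shape that used to end up as not/cmp/cmov.  */
int32_t
onecmpl_abs_cond (int32_t x)
{
  return x < 0 ? ~x : x;
}

/* Branch-free equivalent: x >> 31 is all ones for negative x and zero
   otherwise, so the xor either inverts x or leaves it alone.  */
int32_t
onecmpl_abs_xor (int32_t x)
{
  return x ^ (x >> 31);
}
```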
The following patch improves the A / (1 << B) -> A >> B simplification,
as seen in the testcase, if there is unnecessary widening for the division,
we just optimize it into a shift on the widened type, but if the lshift
is widened too, there is no reason to do that, we can just shift it in the
original type and convert after. The tree_nonzero_bits & wi::mask check
already ensures it is fine even for signed values.
I've split the vr-values optimization into a separate patch as it causes
a small regression on two testcases, but this patch fixes what has been
reported in the PR alone.
2021-01-05 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/96930
* match.pd ((A / (1 << B)) -> (A >> B)): If A is extended
from narrower value which has the same type as 1 << B, perform
the right shift on the narrower value followed by extension.
* g++.dg/tree-ssa/pr96930.C: New test.
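A sketch of the kind of source this affects, assuming unsigned 32-bit operands widened to 64 bits for the division. The function names are illustrative and not taken from the testcase.

```c
#include <assert.h>
#include <stdint.h>

/* Division by a power of two where both the dividend and the shifted
   one are needlessly widened to 64 bits first.  */
uint32_t
div_pow2_widened (uint32_t a, uint32_t b)
{
  assert (b < 32);   /* shifting by >= 32 would be undefined */
  return (uint32_t) ((uint64_t) a / (uint64_t) (1u << b));
}

/* What it can be simplified to: a right shift in the narrow type,
   with any extension applied afterwards.  */
uint32_t
div_pow2_shift (uint32_t a, uint32_t b)
{
  assert (b < 32);
  return a >> b;
}
```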
I've tried to add such a helper, but handing over just the analysis and letting
each pass handle it differently seems complicated given the limitations of
the bswap infrastructure.
So, this patch just hooks the optimization also into store-merging so that
the original testcase from the PR can be fixed.
2021-01-05 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/96239
* gimple-ssa-store-merging.c (maybe_optimize_vector_constructor): New
function.
(get_status_for_store_merging): Don't return BB_INVALID for blocks
with potential bswap optimizable CONSTRUCTORs.
(pass_store_merging::execute): Optimize vector CONSTRUCTORs with bswap
if possible.
* gcc.dg/tree-ssa/pr96239.c: New test.
Descriptions of options should be terminated with a full stop; the
FAIL: compiler driver --help=go option(s): "^ +-.*[^:.]$" absent from output: " -fgo-embedcfg=<file> List embedded files via go:embed"
test even reports that.
2021-01-05 Jakub Jelinek <jakub@redhat.com>
* lang.opt (fgo-embedcfg=): Add full stop at the end of description.
This fixes extraction of live bool vector results for the case of
integer mode vectors.
2021-01-05 Richard Biener <rguenther@suse.de>
PR tree-optimization/98381
* tree.c (vector_element_bits): Properly compute bool vector
element size.
* tree-vect-loop.c (vectorizable_live_operation): Properly
compute the last lane bit offset.
Prevent spurious FP exceptions with _mm_cvt{,t}ps_pi32 for TARGET_MMX_WITH_SSE
by clearing the top 64 bits of the input XMM register.
2021-01-05 Uroš Bizjak <ubizjak@gmail.com>
gcc/
PR target/98522
* config/i386/sse.md (sse_cvtps2pi): Redefine as define_insn_and_split.
Clear the top 64 bits of the input XMM register.
(sse_cvttps2pi): Ditto.
gcc/testsuite
PR target/98522
* gcc.target/i386/pr98522.c: New test.
The diagnostic for a misplaced module decl was essentially 'computer
says no', which isn't the most helpful. This adjusts it to indicate
what would be acceptable.
gcc/cp/
* parser.c (cp_parser_module_declaration): Alter diagnostic
text to say where it is permissible.
gcc/testsuite/
* g++.dg/modules/mod-decl-1.C: Adjust.
* g++.dg/modules/p0713-2.C: Adjust.
* g++.dg/modules/p0713-3.C: Adjust.
_mm_extract_pi16 is intrinsic for pextrw, which should be zero-extended,
not sign-extended.
gcc/
PR target/98495
* config/i386/xmmintrin.h (_mm_extract_pi16): Cast to unsigned
short first.
gcc/testsuite/
PR target/98495
* gcc.target/i386/pr98495-1.c: New test.
* gcc.target/i386/pr98495-2.c: New test.
* gcc.target/i386/pr98495-3.c: New test.
* gcc.target/i386/pr98495-4.c: New test.
* gcc.target/i386/pr98495-5.c: New test.
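The difference between the two extensions can be shown without any intrinsics. The sketch below models the 64-bit MMX value as a uint64_t and uses made-up function names. It relies on GCC's modulo behaviour when converting an out-of-range value to int16_t, which is implementation-defined in ISO C.

```c
#include <assert.h>
#include <stdint.h>

/* Extract 16-bit word N of V and sign-extend it: the old, incorrect
   behaviour for a pextrw-style extract.  */
int
extract_word_sign_extended (uint64_t v, int n)
{
  assert (n >= 0 && n < 4);
  return (int16_t) (v >> (n * 16));
}

/* Extract 16-bit word N of V and zero-extend it, as pextrw does.
   Going through an unsigned 16-bit type first is the essence of the fix.  */
int
extract_word_zero_extended (uint64_t v, int n)
{
  assert (n >= 0 && n < 4);
  return (uint16_t) (v >> (n * 16));
}
```

For a word with the top bit set, such as 0x8000, the first function returns -32768 while the second returns 32768.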
The following patch adds define_insn_and_split to optimize
vpmovmskb %xmm0, %eax
- movzwl %ax, %eax
notl %eax
and combine splitter to optimize
pmovmskb %xmm0, %eax
- notl %eax
- movzwl %ax, %eax
+ xorl $65535, %eax
gcc/ChangeLog
PR target/98461
* config/i386/sse.md (*sse2_pmovskb_zexthisi): New
define_insn_and_split for zero_extend of subreg HI of pmovmskb
result.
(*sse2_pmovskb_zexthisi): Add new combine splitters for
zero_extend of not of subreg HI of pmovmskb result.
gcc/testsuite/ChangeLog
* gcc.target/i386/sse2-pr98461-2.c: New test.
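The second rewrite rests on a simple identity: a 128-bit pmovmskb result only has its low 16 bits set, so inverting it and then zero-extending the low half equals a single xor with 65535. A minimal sketch with illustrative names, modelling the mask as a plain integer:

```c
#include <stdint.h>

/* notl followed by movzwl: invert, then keep only the low 16 bits.  */
uint32_t
not_then_zext16 (uint32_t mask)
{
  return (uint16_t) ~mask;
}

/* xorl $65535: flip just the low 16 bits.  This equals the function
   above whenever MASK has no bits set above bit 15.  */
uint32_t
xor_low16 (uint32_t mask)
{
  return mask ^ 0xffffu;
}
```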
This patch fixes a mode/rtx mismatch for ILP32 targets in:
mem = force_const_mem (ptr_mode, imm);
where imm can be Pmode rather than ptr_mode.
The patch uses convert_memory_address to convert the Pmode address
to ptr_mode before the call. However, immediate addresses can in
general contain unspecs, and convert_memory_address wasn't set up
to handle those.
The patch therefore adds some generic unspec handling to
convert_memory_address_addr_space_1. As the comment says, we can add
a target hook if this behaviour turns out to be wrong for some targets.
But I think what the patch does is a strict improvement over the status
quo: without it, we would try to force the unspec into a register,
but nevertheless wrap the result in a (const ...). That in turn
would be invalid rtl and seems bound to generate an ICE later.
I tested the explow.c part using -fstack-protector with local hacks
to force SYMBOL_FORCE_TO_MEM for UNSPEC_SALT_ADDR.
Fixes c-c++-common/torture/pr57945.c and various other tests.
gcc/
PR target/97269
* explow.c (convert_memory_address_addr_space_1): Handle UNSPECs
nested in CONSTs.
* config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Use
convert_memory_address to convert symbolic immediates to ptr_mode
before forcing them to memory.
aarch64's *add<mode>3_poly_1 has a pattern with the constraints:
"=...,r,&r"
"...,0,rk"
"...,Uai,Uat"
i.e. the penultimate alternative requires operands 0 and 1 to match,
but the final alternative does not allow them to match.
The register allocators dealt with this correctly, and so used
different input and output registers for instructions with Uat
operands. However, constrain_operands carried the penultimate
alternative's matching rule over to the final alternative,
so it would essentially ignore the earlyclobber. This in turn
allowed postreload to convert a correct Uat pairing into an
incorrect one.
The fix is simple: recompute the matching information for each
alternative.
gcc/
PR rtl-optimization/97144
* recog.c (constrain_operands): Initialize matching_operand
for each alternative, rather than only doing it once.
gcc/testsuite/
PR rtl-optimization/97144
* gcc.c-torture/compile/pr97144.c: New test.
* gcc.target/aarch64/sve/pr97144.c: Likewise.
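The per-alternative reset described above can be sketched with a toy model. This is only an illustration, not the recog.c code: the names toy_alternative and toy_matching_operand are hypothetical, and a constraint is reduced to a string whose leading digit means "must match operand N".

```c
#include <ctype.h>

#define TOY_OPERANDS 2

/* One alternative: one constraint string per operand.  A leading digit
   means "this operand must be the same register as operand <digit>".  */
struct toy_alternative
{
  const char *constraint[TOY_OPERANDS];
};

/* Walk alternatives 0..WHICH and return the operand that OPNO must
   match in alternative WHICH, or -1 if it has no matching constraint.
   MATCHING is reinitialized for every alternative, so a matching rule
   from an earlier alternative cannot leak into a later one.  */
int
toy_matching_operand (const struct toy_alternative *alts, int which, int opno)
{
  int result = -1;
  for (int alt = 0; alt <= which; alt++)
    {
      int matching[TOY_OPERANDS];
      for (int i = 0; i < TOY_OPERANDS; i++)
        matching[i] = -1;
      for (int i = 0; i < TOY_OPERANDS; i++)
        {
          const char *c = alts[alt].constraint[i];
          if (isdigit ((unsigned char) c[0]))
            matching[i] = c[0] - '0';
        }
      result = matching[opno];
    }
  return result;
}
```

With the alternatives { "r", "0" } and { "&r", "rk" }, operand 1 must match operand 0 in the first alternative only. Hoisting the initialization of matching out of the loop would carry that rule over to the earlyclobber alternative, which is the kind of stale state the fix removes.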
In the PR, fwprop was changing a call instruction and tripped
an assert when trying to update a list of call clobbers.
There are two ways we could handle this: remove the call clobber
and then add it back, or assume that the clobber will stay in its
current place.
At the moment we don't have enough information to safely move
calls around, so the second approach seems simpler and more
efficient.
gcc/
PR rtl-optimization/98403
* rtl-ssa/changes.cc (function_info::finalize_new_accesses): Explain
why we don't remove call clobbers.
(function_info::apply_changes_to_insn): Don't attempt to add
call clobbers here.
gcc/testsuite/
PR rtl-optimization/98403
* g++.dg/opt/pr98403.C: New test.
On AArch64, the vectoriser tries various ways of vectorising with both
SVE and Advanced SIMD and picks the best one. All other things being
equal, it prefers earlier attempts over later attempts.
The way this works currently is that, once it has a successful
vectorisation attempt A, it analyses all other attempts as epilogue
loops of A:
/* When pick_lowest_cost_p is true, we should in principle iterate
over all the loop_vec_infos that LOOP_VINFO could replace and
try to vectorize LOOP_VINFO under the same conditions.
E.g. when trying to replace an epilogue loop, we should vectorize
LOOP_VINFO as an epilogue loop with the same VF limit. When trying
to replace the main loop, we should vectorize LOOP_VINFO as a main
loop too.
However, autovectorize_vector_modes is usually sorted as follows:
- Modes that naturally produce lower VFs usually follow modes that
naturally produce higher VFs.
- When modes naturally produce the same VF, maskable modes
usually follow unmaskable ones, so that the maskable mode
can be used to vectorize the epilogue of the unmaskable mode.
This order is preferred because it leads to the maximum
epilogue vectorization opportunities. Targets should only use
a different order if they want to make wide modes available while
disparaging them relative to earlier, smaller modes. The assumption
in that case is that the wider modes are more expensive in some
way that isn't reflected directly in the costs.
There should therefore be few interesting cases in which
LOOP_VINFO fails when treated as an epilogue loop, succeeds when
treated as a standalone loop, and ends up being genuinely cheaper
than FIRST_LOOP_VINFO. */
However, the vectoriser can normally elide alias checks for epilogue
loops, on the basis that the main loop should do them instead.
Converting an epilogue loop to a main loop can therefore cause the alias
checks to be skipped. (It probably also unfairly penalises the original
loop in the cost comparison, given that one loop will have alias checks
and the other won't.)
As the comment says, we should in principle analyse each vector mode
twice: once as a main loop and once as an epilogue. However, doing
that up-front would be quite expensive. This patch instead goes for a
compromise: if an epilogue loop for mode M2 seems better than a main
loop for mode M1, re-analyse with M2 as the main loop.
The patch fixes dg.torture.exp=pr69719.c when testing with
-msve-vector-bits=128.
gcc/
PR tree-optimization/98371
* tree-vect-loop.c (vect_reanalyze_as_main_loop): New function.
(vect_analyze_loop): If an epilogue loop appears to be cheaper
than the main loop, re-analyze it as a main loop before adopting
it as a main loop.
With the introduction of C++20 modules and libcody, cc1plus and
cc1objplus gained a dependency on the socket functions. Before those
were merged into libc in Solaris 11.4, one needed to link with -lsocket -lnsl
on Solaris, so that merge broke the Solaris 11.3 build.
While we already have 4 different checks for those libraries in the
tree, I decided to import autoconf-archive's AX_LIB_SOCKET_NSL macro
instead. At the same time, the patch only links libcody and the
networking libs where needed (cc1plus, cc1objplus).
Bootstrapped without regressions on i386-pc-solaris2.11 (Solaris 11.3
and 11.4), sparc-sun-solaris2.11, and x86_64-pc-linux-gnu.
2020-12-16 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
c++tools:
PR c++/98316
* configure.ac: Include ../config/ax_lib_socket_nsl.m4.
(NETLIBS): Determine using AX_LIB_SOCKET_NSL.
* configure: Regenerate.
* Makefile.in (NETLIBS): Define.
(g++-mapper-server$(exeext)): Add $(NETLIBS).
gcc/objcp:
PR c++/98316
* Make-lang.in (cc1objplus$(exeext)): Add $(CODYLIB), $(NETLIBS).
gcc/cp:
PR c++/98316
* Make-lang.in (cc1plus$(exeext)): Add $(CODYLIB), $(NETLIBS).
gcc:
PR c++/98316
* configure.ac (NETLIBS): Determine using AX_LIB_SOCKET_NSL.
* aclocal.m4, configure: Regenerate.
* Makefile.in (NETLIBS): Define.
(BACKEND): Remove $(CODYLIB).
config:
PR c++/98316
* ax_lib_socket_nsl.m4: Import from autoconf-archive.
For signed x and y, we don't currently optimize (int) (x - 1U) * y + y
into x * y. We can't do that with a signed x * y, because the former
is well defined for INT_MIN and -1, while the latter is not.
We could perhaps optimize it during isel or some very late optimization
where we'd magically turn on flag_wrapv, but we don't do that yet.
This patch optimizes it in simplify-rtx.c, such that we can optimize
it during combine.
2021-01-05 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/98334
* simplify-rtx.c (simplify_context::simplify_binary_operation_1):
Optimize (X - 1) * Y + Y to X * Y or (X + 1) * Y - Y to X * Y.
* gcc.target/i386/pr98334.c: New test.
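A small sketch of the arithmetic behind the (X - 1) * Y + Y simplification above. The function names are hypothetical. The unsigned variants show that the identity holds in wrapping arithmetic. The signed variant is the expression from the commit message and assumes GCC's modulo behaviour when converting an out-of-range unsigned value to int.

```c
#include <limits.h>

/* Long form in wrapping (unsigned) arithmetic.  */
unsigned int
mul_long_form_wrapping (unsigned int x, unsigned int y)
{
  return (x - 1u) * y + y;
}

/* Short form; equal to the long form modulo 2^N for all inputs.  */
unsigned int
mul_short_form_wrapping (unsigned int x, unsigned int y)
{
  return x * y;
}

/* The source-level expression from the commit message.  The subtraction
   is unsigned, so x == INT_MIN does not overflow; the multiplication
   and addition are signed.  */
int
mul_long_form_signed (int x, int y)
{
  return (int) (x - 1U) * y + y;
}
```

For x == INT_MIN and y == -1 the signed long form evaluates to INT_MAX * -1 + -1, which is INT_MIN and involves no overflow, whereas a signed x * y would overflow. That is why the rewrite is only safe where multiplication wraps, as it does at the RTL level.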
This is just a precautionary fix.
2021-01-05 Bernd Edlinger <bernd.edlinger@hotmail.de>
* tree-inline.c (expand_call_inline): Restore input_location.
Return result from recursive call.
The constexpr iteration dereferenced an array element past the end of
the array.
for gcc/testsuite/ChangeLog
* g++.dg/cpp1y/constexpr-66093.C: Fix bounds issue.
This option will be used by the go command to implement go:embed directives,
which are new with the upcoming Go 1.16 release.
* lang.opt (fgo-embedcfg): New option.
* go-c.h (struct go_create_gogo_args): Add embedcfg field.
* go-lang.c (go_embedcfg): New static variable.
(go_langhook_init): Set go_create_gogo_args embedcfg field.
(go_langhook_handle_option): Handle OPT_fgo_embedcfg_.
* gccgo.texi (Invoking gccgo): Document -fgo-embedcfg.
-fsanitize=undefined with calls to nonnull functions
creates struct __ubsan_nonnull_arg_data instances
with CONSTRUCTORs for RECORD_TYPEs with NULL index values.
The analyzer was mistakenly using INTEGER_CST for these
fields, leading to ICEs.
Fix the issue by iterating through the fields in the type
for such cases, imitating similar logic in varasm.c's
output_constructor.
gcc/analyzer/ChangeLog:
PR analyzer/98293
* store.cc (binding_map::apply_ctor_to_region): When "index" is
NULL, iterate through the fields for RECORD_TYPEs, rather than
creating an INTEGER_CST index.
gcc/testsuite/ChangeLog:
PR analyzer/98293
* gcc.dg/analyzer/pr98293.c: New test.
The IFN_MASK* functions take two leading arguments: a load or
store pointer and a “cookie”. The type of the cookie is the
type of the access for TBAA purposes (like for MEM_REFs)
while the value of the cookie is the alignment of the access.
This PR was caused by a disagreement about whether the alignment
is measured in bits or bytes.
It looks like this goes back to PR68786, which made the
vectoriser create its own cookie argument rather than reusing
the one created by ifcvt. The alignment value of the new cookie
was measured in bytes (as needed by set_ptr_info_alignment)
while the existing code expected it to be measured in bits.
The folds I added for IFN_MASK_LOAD and STORE then made
things worse.
gcc/
PR tree-optimization/95401
* config/aarch64/aarch64-sve-builtins.cc
(gimple_folder::load_store_cookie): Use bits rather than bytes
for the alignment argument to IFN_MASK_LOAD and IFN_MASK_STORE.
* gimple-fold.c (gimple_fold_mask_load_store_mem_ref): Likewise.
* tree-vect-stmts.c (vectorizable_store): Likewise.
(vectorizable_load): Likewise.
gcc/testsuite/
PR tree-optimization/95401
* g++.dg/vect/pr95401.cc: New test.
* g++.dg/vect/pr95401a.cc: Likewise.
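To make the bits-versus-bytes disagreement above concrete, here is a minimal sketch with hypothetical helper names. It uses CHAR_BIT from the host C library, where GCC itself would use BITS_PER_UNIT.

```c
#include <limits.h>

/* Convert a byte alignment into the bit alignment that the cookie is
   expected to carry.  */
unsigned int
toy_align_bytes_to_bits (unsigned int bytes)
{
  return bytes * CHAR_BIT;
}

/* Convert a bit alignment back into bytes.  */
unsigned int
toy_align_bits_to_bytes (unsigned int bits)
{
  return bits / CHAR_BIT;
}
```

If a producer stores a 16-byte alignment as the raw value 16 and a consumer reads that value as bits, the consumer sees an alignment of only 16 / CHAR_BIT bytes. The reverse mistake overstates the alignment, which is the more dangerous direction.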
This makes sure to set the vector type on an invariant mask argument
for a masked load and SLP.
2021-01-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/98308
* tree-vect-stmts.c (vectorizable_load): Set invariant mask
SLP vectype.
* gcc.dg/vect/pr98308.c: New testcase.
As the testcase shows, we punt unnecessarily on popcount loop idioms if
the type is smaller than int or larger than long long.
Types smaller than int can be handled by zero-extending the argument to
unsigned int, and types twice as long as long long by doing
__builtin_popcountll on both halves of the __int128.
2020-01-04 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/95771
* tree-ssa-loop-niter.c (number_of_iterations_popcount): Handle types
with precision smaller than int's precision and types with precision
twice as large as long long. Formatting fixes.
* gcc.target/i386/pr95771.c: New test.
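The two cases described above can be sketched at the source level as follows. The function names are hypothetical, and the code assumes a GCC-compatible compiler on a target that provides unsigned __int128 and a 64-bit long long.

```c
#include <stdint.h>

/* The loop idiom on a type narrower than int.  */
int
popcount_loop_u8 (uint8_t x)
{
  int n = 0;
  while (x)
    {
      x &= x - 1;   /* Clear the lowest set bit.  */
      n++;
    }
  return n;
}

/* Narrow type: zero-extend to unsigned int, then use the builtin.  */
int
popcount_u8 (uint8_t x)
{
  return __builtin_popcount ((unsigned int) x);
}

/* Double-width type: sum the popcounts of the two 64-bit halves.  */
int
popcount_u128 (unsigned __int128 x)
{
  return __builtin_popcountll ((unsigned long long) x)
         + __builtin_popcountll ((unsigned long long) (x >> 64));
}
```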
This does VN replacement in loop nb_iterations consistent with
the rest of the IL by using availability at the definition site
of uses.
2021-01-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/98464
* tree-ssa-sccvn.c (vn_valueize_for_srt): Rename from ...
(vn_valueize_wrapper): ... this. Temporarily adjust vn_context_bb.
(process_bb): Adjust.
* g++.dg/opt/pr98464.C: New testcase.
The original documentation added to mention the clash between
-fsanitize=address and -fsanitize=hwaddress used confusing wording trying
to say that -fsanitize=hwaddress is only available on AArch64.
It read as if -fsanitize=address were only supported on AArch64.
This patch fixes that wording by being more explicit.
gcc/ChangeLog:
PR other/98437
* doc/invoke.texi (-fsanitize=address): Fix wording describing
clash with -fsanitize=hwaddress.
This avoids running into memory reference code in compute_avail by
properly classifying unfolded reference trees on constants.
2021-01-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/98282
* tree-ssa-sccvn.c (vn_get_stmt_kind): Classify tcc_reference on
invariants as VN_NARY.
* g++.dg/opt/pr98282.C: New testcase.
This patch fixes a codegen regression in the handling of things like:
__temp.val[0] \
= vcombine_##funcsuffix (__b.val[0], \
vcreate_##funcsuffix (__AARCH64_UINT64_C (0))); \
in the 64-bit vst[234] functions. The zero was forced into a
register at expand time, and we relied on combine to fuse the
zero and combine back together into a single combinez pattern.
The problem is that the zero could be hoisted before combine
gets a chance to do its thing.
gcc/
PR target/89057
* config/aarch64/aarch64-simd.md (aarch64_combine<mode>): Accept
aarch64_simd_reg_or_zero for operand 2. Use the combinez patterns
to handle zero operands.
gcc/testsuite/
PR target/89057
* gcc.target/aarch64/pr89057.c: New test.
The expansions of the svprf[bhwd] instructions weren't taking
advantage of the immediate addressing mode.
gcc/
* config/aarch64/aarch64.c (offset_6bit_signed_scaled_p): New function.
(offset_6bit_unsigned_scaled_p): Fix typo in comment.
(aarch64_sve_prefetch_operand_p): Accept MUL VLs in the range
[-32, 31].
gcc/testsuite/
* gcc.target/aarch64/sve/acle/asm/prfb.c: Test for a MUL VL range of
[-32, 31].
* gcc.target/aarch64/sve/acle/asm/prfh.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/prfw.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/prfd.c: Likewise.
This zeroes matches when failing SLP discovery because of the
work limit.
2021-01-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/98393
* tree-vect-slp.c (vect_build_slp_tree): Properly zero matches
when hitting the limit.
When the VF is one, an SLP reduction is in-order and thus we can
vectorize even when the reduction op is not associative.
2021-01-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/98291
* tree-vect-loop.c (vectorizable_reduction): Bypass
associativity check for SLP reductions with VF 1.
* gcc.dg/vect/slp-reduc-11.c: New testcase.
* gcc.dg/vect/vect-reduc-in-order-4.c: Adjust.
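As a reminder of why associativity matters for reductions, the following sketch (hypothetical function names, default floating-point options assumed, so no -ffast-math) sums the same values in two different orders.

```c
/* Sum in source order, as an in-order reduction would.  */
float
sum_left_to_right (const float *v, int n)
{
  float s = 0.0f;
  for (int i = 0; i < n; i++)
    s += v[i];
  return s;
}

/* Sum in the opposite order, standing in for a reassociated
   reduction.  */
float
sum_right_to_left (const float *v, int n)
{
  float s = 0.0f;
  for (int i = n - 1; i >= 0; i--)
    s += v[i];
  return s;
}
```

For the inputs { 1e30f, -1e30f, 1.0f } the first function returns 1.0f and the second returns 0.0f, because 1.0f is absorbed when added to -1e30f. An in-order reduction keeps the source order and therefore cannot change the result.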
x is never equal to ~x, so we can fold such comparisons to constants.
2021-01-04 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/96782
* match.pd (x == ~x -> false, x != ~x -> true): New simplifications.
* gcc.dg/tree-ssa/pr96782.c: New test.
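A minimal illustration of the comparison being folded, using hypothetical function names. Every bit of ~x differs from the corresponding bit of x, so the two values can never be equal.

```c
/* Always 0: x and ~x differ in every bit.  */
int
eq_complement (unsigned int x)
{
  return x == ~x;
}

/* Always 1, for the same reason.  */
int
ne_complement (unsigned int x)
{
  return x != ~x;
}
```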
When linking with -flto and -save-temps, various
temporary files are created in /tmp.
The same happens when invoking the driver with @file
parameter, and using -L or -I options.
gcc:
2021-01-04 Bernd Edlinger <bernd.edlinger@hotmail.de>
* collect-utils.c (collect_execute): Check dumppfx.
* collect2.c (maybe_run_lto_and_relink, do_link): Pass atsuffix
to collect_execute.
(do_link): Add new parameter atsuffix.
(main): Handle -dumpdir option. Skip one argument for
-o, -isystem and -B options.
* gcc.c (make_at_file): New helper function.
(close_at_file): Use it.
gcc/testsuite:
2021-01-04 Bernd Edlinger <bernd.edlinger@hotmail.de>
* gcc.misc-tests/outputs.exp: Adjust testcase.
We use this in the sim tree currently. Rather than require people to
have pkg-config installed, include it in the config/ dir.
config/ChangeLog:
* pkg.m4: New file from pkg-config-0.29.2.
Ideally, the linker will be queried for its version and that will be
used to determine capabilities that cannot be discovered from
reasonable configuration testing.
When building cross tools, this might not be possible, and we have
strategies for providing useful defaults. These are adjusted here to
reflect current choices.
gcc/ChangeLog:
* config/darwin.h (MIN_LD64_NO_COAL_SECTS): Adjust.
Amend handling for LD64_VERSION fallback defaults.
The darwinN.h headers (with the sole exception of darwin7.h,
which contains a target macro definition) now only contain
values that set fall-backs for cross-compilations. These can
be provided from the config.gcc script, which means we no longer
need the darwinN.h headers - so delete them.
gcc/ChangeLog:
* config.gcc: Compute default version information
from the configured target. Likewise defaults for
ld64.
* config/darwin10.h: Removed.
* config/darwin12.h: Removed.
* config/darwin9.h: Removed.
* config/rs6000/darwin8.h: Removed.
Darwin defines ASM_OUTPUT_ALIGNED_DECL_COMMON which is used in
preference to ASM_OUTPUT_ALIGNED_COMMON, which makes the latter
definition dead code. Remove this.
gcc/ChangeLog:
* config/darwin9.h (ASM_OUTPUT_ALIGNED_COMMON): Delete.
We now need a modern (C++11) toolchain to bootstrap GCC, so there's no
need to skip the stack protect for Darwin < 9.
gcc/ChangeLog:
* config/darwin9.h (STACK_CHECK_STATIC_BUILTIN): Move from here..
* config/darwin.h (STACK_CHECK_STATIC_BUILTIN): .. to here.
There is no need to make the LINK_GCC_C_SEQUENCE_SPEC conditional on
configuration parameters; it is adequately conditionalized on the
macosx-version-min.
gcc/ChangeLog:
* config/darwin10.h (LINK_GCC_C_SEQUENCE_SPEC): Move from
here...
* config/darwin.h (LINK_GCC_C_SEQUENCE_SPEC): ... to here.
The darwinN.h headers were (presumably) introduced to allow specs to be
adjusted when there was no mmacosx-version-min handling, or that was
considered unreliable.
We have version-specific specs for the values that have configuration
data, and the version is set in the driver (so may be considered
reliably present).
Some of the 'darwinN.h' content has become dead code, and the remainder
is either conditionalised on version information or is setting values
used as fall-backs in cross-compilations.
With the changes needed for Darwin20 / macOS 11 the 'darwinN.h' headers
are now too unwieldy to be useful - so this series moves the relevant
specs definitions to the common 'darwin.h' header and then finally uses
the config.gcc script to supply the fall-back defaults for cross-
compilations.
We can then delete all but the main header, since the darwinN.h are
unused.
This change moves a spec from darwin10.h to the main darwin.h
target header.
gcc/ChangeLog:
* config/darwin10.h (LINK_GCC_C_SEQUENCE_SPEC): Move the spec
for the Darwin10 unwinder stub from here ...
* config/darwin.h (LINK_COMMAND_SPEC_A): ... to here.
The toolchain now requires a C++11 compiler to bootstrap and
none of the older Darwin toolchains which were based on stabs
debugging are suitable. We can simplify the debug setup now.
gcc/ChangeLog:
* config/darwin.h (DSYMUTIL_SPEC): Default to DWARF.
(ASM_DEBUG_SPEC): Only define if the assembler supports
stabs.
(PREFERRED_DEBUGGING_TYPE): Default to DWARF.
(DARWIN_PREFER_DWARF): Define.
* config/darwin9.h (PREFERRED_DEBUGGING_TYPE): Remove.
(DARWIN_PREFER_DWARF): Likewise.
(DSYMUTIL_SPEC): Likewise.
(COLLECT_RUN_DSYMUTIL): Likewise.
(ASM_DEBUG_SPEC): Likewise.
(ASM_DEBUG_OPTION_SPEC): Likewise.
An invalid declaration of a CLASS instance can lead to an internal state
with inconsistent attributes during parsing that needs to be handled with
sufficient care when processing subsequent statements. Avoid a lookup of
the vtab entry for such cases.
gcc/fortran/ChangeLog:
* class.c (gfc_find_vtab): Add check on attribute is_class.
The tests use -mfp16-format=alternative, and so should not be run
if that option isn't supported.
for gcc/testsuite/ChangeLog
* lib/target-supports.exp
(check_effective_target_arm_fp16_alternative_ok_nocache):
Return zero for *-*-vxworks7r* targets.
* gcc.target/arm/aapcs/vfp22.c: Require arm_fp16_alternative_ok.
* gcc.target/arm/aapcs/vfp23.c: Likewise.
* gcc.target/arm/aapcs/vfp24.c: Likewise.
* gcc.target/arm/aapcs/vfp25.c: Likewise.
This test fails during execution on VxWorks 7 when using
C++14 and C++17.
for gcc/testsuite/ChangeLog
* g++.dg/init/new26.C: Fix overriding of the delete operator
for c++14 profile.
The only TLS model supported in VxWorks kernel mode is local-exec.
for gcc/testsuite/ChangeLog
* g++.dg/tls/pr79288.C: Skip on vxworks_kernel (TLS model
not supported).
If the target is configured such that -mlong-call is passed
by default, the function calls these tests are trying to detect
by scanning the assembly file are performed using long calls,
like so:
| foo:
| @ memset-inline-2.c:12: memset (a, -1, 14);
| mov r2, #14 @,
| mvn r1, #0 @,
| ldr r0, .L2 @,
| ldr r3, .L2+4 @ tmp112,
| bx r3 @ tmp112
Looking at .L2 (and in particular at .L2+4):
| .L2:
| .word a
| .word memset <<<---
This change adds -mno-long-calls to the list of compiler options
to make sure we generate short call code, allowing the assembly
matching to pass.
This is added unconditionally to the dg-options (as opposed to using
dg-additional-options) because this test is already specific to ARM
targets, and -mno-long-calls is available on all ARM targets.
for gcc/testsuite/ChangeLog
* gcc.target/arm/memset-inline-2.c: Add -mno-long-calls to
the test's dg-options.
* gcc.target/arm/pr78255-2.c: Likewise.
The conflicting definition of OK is present in VxWorks RTP headers too.
for gcc/testsuite/ChangeLog
* g++.old-deja/g++.mike/p658.C: Also undefine OK on VxWorks RTP.
In VxWorks 7, UINT32 is defined in both modes, kernel and rtp. Adjust
the work around accordingly.
for gcc/testsuite/ChangeLog
* g++.dg/opt/20050511-1.C: Work around UINT32 in vxworks rtp
headers too.
Linking in vxworks kernel-mode is partial linking, so missing symbols
are not detected.
for gcc/testsuite/ChangeLog
* g++.old-deja/g++.pt/const2.C: Skip on vxworks kernel.
VxWorks headers define ERROR as a macro, which conflicts with the use
in the test.
for gcc/testsuite/ChangeLog
* g++.dg/tree-ssa/copyprop.C: Undefine ERROR if defined.
The vxworks kernel-mode linking is partial linking, so it cannot
detect missing symbols.
for gcc/testsuite/ChangeLog
* g++.dg/other/anon5.C: Skip on vxworks kernel.
Adjust vxworks initpri expectations, given that vxworks7 has switched
to .init_array.
for gcc/testsuite/ChangeLog
* gcc.dg/vxworks/initpri1.c: Tighten VxWorks version check.
* gcc.dg/vxworks/initpri2.c: Likewise.
This test currently fails on VxWorks 7 SR06x0 targets when in kernel
mode, because it expects a discrepancy between built-in and system
intmax_t for all VxWorks targets when in kernel mode. Fortunately,
this has now been fixed when targeting VxWorks 7 SR06x0, so this
commit adjusts the "dg-error" condition to exclude newer versions of
VxWorks 7.
for gcc/testsuite/ChangeLog
* gcc.dg/intmax_t-1.c: Do not expect an error on *-*-vxworks7r*
targets.
Match xfail on kernel instead of rtp mode.
for gcc/testsuite/ChangeLog
* gcc.dg/pthread-init-1.c: Fix the VxWorks xfail filters.
* gcc.dg/pthread-init-2.c: Ditto.
Explicitly disable some vxworks-missing features in the testsuite, that
the current feature tests detect as present.
for gcc/testsuite/ChangeLog
* lib/target-supports.exp (check_weak_available,
check_fork_available, check_effective_target_lto,
check_effective_target_mempcpy): Add vxworks filters.
The implicit -mlong-calls used in our vxworks configurations changes
the call sequences from those expected in the mve_libcall testcases.
This patch brings the test output in line with the expectations, with
an explicit -mno-long-calls.
for gcc/testsuite/ChangeLog
* gcc.target/arm/mve/intrinsics/mve_libcall1.c: Pass an
explicit -mno-long-calls.
* gcc.target/arm/mve/intrinsics/mve_libcall2.c: Likewise.
The implicit -mlong-calls from our vxworks configurations makes the
tail-call instructions differ from those expected by the
no_unique_address tests in gcc.target/arm.
This patch adds -mno-long-calls to the compilation commands, so that
we generate the expected sequences.
for gcc/testsuite/ChangeLog
* g++.target/arm/no_unique_address_1.C: Add -mno-long-calls.
* g++.target/arm/no_unique_address_2.C: Likewise.
The headmerge tests pass a constant to conditional calls, so that the
same constant is always passed to a function, though it's a different
function depending on which path is taken.
The test checks that the constant appears only once in the assembly
output, as a means to verify that the insns setting up the argument
are unified: they appear as separate insns up to jump2, where
crossjump identifies a common prefix to all conditional paths and
unifies them.
Alas, with -mlong-calls, which we enable in our arm-vxworks
configurations, the argument register is loaded after loading the
callee address into another register. Since each path calls a
different function, there's no common initial code sequence for
crossjump to unify, and the argument register set up remains separate,
so the test fails.
Though it would surely be desirable for the compiler to perform the
unification of the argument register setting up, this patch merely
avoids the effects of -mlong-calls, with an explicit -mno-long-calls.
for gcc/testsuite/ChangeLog
* gcc.target/arm/headmerge-1.c: Add -mno-long-calls.
* gcc.target/arm/headmerge-2.c: Likewise.
The implicit -mlong-calls used in our arm-vxworks configurations
changes the register allocation patterns in the arm/fp16-aapcs-2.c
test: r3 ends up used in the long-call sequence, and we end up using
ip as a temporary, which doesn't match the expected mov patterns.
This patch adds an explicit -mno-long-calls for the generated code to
match the expectation.
for gcc/testsuite/ChangeLog
* gcc.target/arm/fp16-aapcs-2.c: Use -mno-long-calls.
On some targets, there are no < 8191; and >= 8191; strings,
but < 8191) and >= 8191), so just remove the ; from the regexps.
2021-01-01 Jakub Jelinek <jakub@redhat.com>
PR testsuite/98489
PR tree-optimization/56719
* gcc.dg/tree-ssa/pr56719.c: Remove semicolon from
scan-tree-dump-times regexps.
In this testcase we end up with:
unsigned long long x = ...;
char y = (char) (x << 37);
The overwidening pattern realised that only the low 8 bits
of x << 37 are needed, but then tried to turn that into:
unsigned long long x = ...;
char y = (char) x << 37;
which gives an out-of-range shift. In this case y can simply
be replaced by zero, but as the comment in the patch says,
it's kind-of awkward to do that in the middle of vectorisation.
Most of the overwidening stuff is about keeping operations
as narrow as possible, which is important for vectorisation
but could be counter-productive for scalars (especially on
RISC targets). In contrast, optimising y to zero in the above
feels like an independent optimisation that would benefit scalar
code and that should happen before vectorisation.
gcc/
PR tree-optimization/98302
* tree-vect-patterns.c (vect_determine_precisions_from_users): Make
sure that the precision remains greater than the shift count.
gcc/testsuite/
PR tree-optimization/98302
* gcc.dg/vect/pr98302.c: New test.
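The observation that the narrowed result is simply zero can be checked with a short sketch. The function name is hypothetical, and unsigned char is used instead of char to avoid implementation-defined conversions. The shift is done in the wide type, which is valid for any count below 64.

```c
/* Shift in the wide type first, then keep only the low 8 bits.
   COUNT must be less than 64.  */
unsigned char
low_byte_of_shift (unsigned long long x, unsigned int count)
{
  return (unsigned char) (x << count);
}
```

Whenever the count is 8 or more, the low 8 bits of the shifted value are all zero, so the function returns 0 regardless of x. Narrowing x to 8 bits before shifting by 37 would instead be an out-of-range shift, which is what the patch prevents.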
This PR is about a case in which the vectoriser was feeding
incorrect alignment information to tree-data-ref.c, leading
to incorrect runtime alias checks. The alignment was taken
from the TREE_TYPE of the DR_REF, which in this case was a
COMPONENT_REF with a normally-aligned type. However, the
underlying MEM_REF was only byte-aligned.
This patch uses dr_alignment to calculate the (byte) alignment
instead, just like we do when creating vector MEM_REFs.
gcc/
PR tree-optimization/94994
* tree-vect-data-refs.c (vect_vfa_align): Use dr_alignment.
gcc/testsuite/
PR tree-optimization/94994
* gcc.dg/vect/pr94994.c: New test.
The static GET_MODE_MASKs for SVE vectors are based on the
static precisions, which in turn are based on 128-bit SVE.
The precisions are later updated based on -msve-vector-bits
(usually to become variable length), but the GET_MODE_MASK
stayed the same. This caused combine to fold:
(*_extract:DI (subreg:DI (reg:VNxMM R) 0) ...)
to zero because the extracted bits appeared to be insignificant.
gcc/
PR rtl-optimization/98214
* genmodes.c (emit_insn_modes_h): Emit a definition of CONST_MODE_MASK.
(emit_mode_mask): Treat mode_mask_array as non-constant if adj_nunits.
(emit_mode_adjustments): Update GET_MODE_MASK when updating
GET_MODE_NUNITS.
* machmode.h (mode_mask_array): Use CONST_MODE_MASK.
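The relationship between a mode's precision and its mask can be sketched with a toy model. The struct and function names are hypothetical, and a 64-bit mask stands in for a HOST_WIDE_INT. This is not the genmodes.c code.

```c
#include <stdint.h>

/* Toy stand-in for one entry of the mode tables.  */
struct toy_mode
{
  unsigned int precision;   /* Precision in bits.  */
  uint64_t mask;            /* Mask of the significant low bits.  */
};

/* Mask with the low PRECISION bits set, saturating at 64 bits.  */
uint64_t
toy_mask_for_precision (unsigned int precision)
{
  if (precision >= 64)
    return UINT64_MAX;
  return (UINT64_C (1) << precision) - 1;
}

/* Update the precision and recompute the mask at the same time, so
   the two can never disagree.  */
void
toy_adjust_precision (struct toy_mode *mode, unsigned int new_precision)
{
  mode->precision = new_precision;
  mode->mask = toy_mask_for_precision (new_precision);
}
```

Updating only the precision field would leave a stale mask behind. A consumer that trusts the mask could then wrongly conclude that some bits are insignificant, which is the failure mode described above.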
This adds a set of calls to name lookup that are needed by modules.
These generally install imported bindings or walk the current TU's
bindings. One note about template instantiations though. When we're
about to instantiate a template we have to know about all the
maybe-partial specializations that exist. These can be in any
imported module -- not necessarily the module defining the template.
Thus we key such foreign templates to the innermost namespace and
identifier of the containing entity -- that's the only thing we have
a handle on. That's why we note and load pending specializations here.
gcc/cp/
* module.cc (lazy_specializations_p): Stub.
* name-lookup.h (append_imported_binding_slot)
(mergeable_namespace_slots, lookup_class_binding)
(walk_module_binding, import_module_binding, set_module_binding)
(note_pending_specializations, load_pending_specializations)
(add_module_decl, add_imported_namespace): Declare.
(get_cxx_dialect_name): Declare.
(enum WMB_flags): New.
* name-lookup.c (append_imported_binding_slot)
(mergeable_namespace_slots, lookup_class_binding)
(walk_module_binding, import_module_binding, set_module_binding)
(note_pending_specializations, load_pending_specializations)
(add_module_decl, add_imported_namespace): New.
(get_cxx_dialect_name): Make extern.
With modules, we need the ability to name 'foos' in different modules.
The idiom for that is a trailing '@modulename' suffix. This adds that
to the error printing routines. I also augment the tree dumping
machinery to show module-specific metadata.
gcc/cp/
* error.c (dump_module_suffix): New.
(dump_aggr_type, dump_simple_decl, dump_function_name): Call it.
* ptree.c (cxx_print_decl): Print module information.
* module.cc (module_name, get_importing_module): Stubs.
Name-lookup is the most changed piece of the front end for modules.
Here are some preparatory cleanups and API extensions.
gcc/cp/
* name-lookup.h (set_class_bindings): Return vector, take signed
'extra' parm.
* name-lookup.c (maybe_lazily_declare): Break out ...
(get_class_binding): .. of here, call it.
(find_member_slot): Adjust get_class_bindings call.
(set_class_bindings): Allow -ve extra. Return the vector.
(set_identifier_type_value_with_scope): Remove checking assert.
(lookup_using_decl): Set decl's context.
(do_pushtag): Adjust set_identifier_type_value_with_scope handling.
gcc/cp/
* module.cc (trees_{in,out}::decl_container): Stream template as
well as container. Do not strip (no nudity!)
gcc/testsuite/
* g++.dg/modules/noexcept-1{.h,_[ab].[HC]}: New.
gcc/cp/
* cp-lang.c (LANG_HOOKS_PREPROCESS_UNDEF): Do not override here.
* cp-tree.h (module_cpp_undef): Do not declare.
* module.cc (module_cpp_undef): Delete.
(handle_module_option): Store cpp_main_search code in flag_header_unit.
(module_preprocess_options): Set main_search option here.
Conditionally set preprocess_undef lang hook here.
gcc/cp/
* module.cc (trees_{in,out}::key_mergeable): Variable templates
can be partially specialized and constrained.
(finish_module_processing): Point at module decl, so ICEs blame
that.
gcc/testsuite/
* g++.dg/modules/var-tpl-concept-1{.h,_[ab].C}: New.
libcpp/
* lex.c (cpp_maybe_module_directive): Add asserts and comments
about initial and final state. Remove __builtin_expects.
(_cpp_lex_token): Remove __builtin_expect.
gcc/cp/
* module.cc (trees_{in,out}::core_vals): Stream using_decl decls.
(trees_out::lang_decl_vals): Do not stream DECL_ACCESS that is
actually an access alterer.
(trees_in::read_class_def): Reconstruct DECL_ACCESS.
gcc/testsuite/
* g++.dg/modules/access-1_[abc].C: New.
gcc/cp/
* cp-tree.h (maybe_check_all_macros): Declare.
* module.cc (slurping::release_macros): New.
(slurping::~slurping): Use it.
(module_state::maybe_completed_reading): Use it.
(maybe_check_all_macros): New, broken out of ...
(finish_module_processing): ... here. DO NOT CALL.
* parser.c (cp_lexer_new_main): Call it here.
gcc/cp/
* module.cc (trees_{in,out}::key_mergeable): Stream MK_enum
underlying type decl.
(check_mergeable_decl): The enum itself is on the ovl list.
* name-lookup.c (init_global_partition): Copy the enum for the
first member of an anonymous enum.
gcc/testsuite/
* g++.dg/modules/enum-8_[abcd].[CH]: New.
gcc/cp/
* name-lookup.h (add_imported_namespace): Drop anon name parm.
* name-lookup.c (anon_name): Delete.
(make_namespace): Nothing special for anonymous namespace.
(add_imported_namespace): Likewise.
* module.cc (module_state::{read,write}_namespaces): Anon
namespace does not need naming.
libstdc++: Rebase include/pstl to current upstream
gcc/testsuite/
PR 97549 workaround
* g++.dg/modules/xtream-header{,-2}.h: Don't include <exception>.
gcc/cp/
* mapper-client.cc: Enable networking only when CODY_NETWORKING
* mapper-server.cc: Likewise.
libcody/
Rebase b26a54f | Enable networking only on known-good systems
gcc/cp/
* name-lookup.c (get_fixed_binding_slot): Promote internal GM
entities to global slot.
* module.cc (has_definition): Add internal linkage fns/vars in GM.
(depset::hash::make_dependency): Internal linkage in GM is ok.
gcc/cp/
* module.cc (trees_out::decl_value): Assert no pmfs.
(trees_in::decl_value): Ensure existing duplicate class has member
vector.
(trees_out::write_class_def): Ensure there's a method vector.
* name-lookup.c (set_class_bindings): Extra can be negative,
meaning always. Return the member vector.
(mergeable_class_entities): Simplify.
(lookup_field_ident): Simplify.
My change to namespace-scope spell corrections ignored the issue that
different targets might have different builtins, and therefore perturb
iteration order. This fixes it by using an intermediate array of
identifiers, which we sort before considering.
gcc/cp/
* name-lookup.c (maybe_add_fuzzy_decl): New.
(maybe_add_fuzzy_binding): New.
(consider_binding_level): Use intermediate sortable vector for
namespace bindings.
gcc/testsuite/
* c-c++-common/spellcheck-reserved.c: Restore diagnostic.
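The idea of sorting candidates before considering them, so that the outcome does not depend on hash-table iteration order, can be sketched as follows. The function names are hypothetical, and plain C strings stand in for identifier nodes. The actual sort key used in name-lookup.c is not stated in the commit message, so ordering by name here is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator: order two C strings lexicographically.  */
static int
compare_names (const void *a, const void *b)
{
  const char *const *lhs = a;
  const char *const *rhs = b;
  return strcmp (*lhs, *rhs);
}

/* Sort the collected candidate names so that later processing sees
   them in the same order on every target.  */
void
sort_candidate_names (const char **names, size_t count)
{
  qsort (names, count, sizeof names[0], compare_names);
}
```

After sorting, ties between equally good spelling suggestions are broken the same way regardless of which builtins a target happens to declare.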
gcc/cp/
* module.cc (trees_out::chained_decls): Also mark local fns for
by-value walking.
(trees_out::decl_node): Assert we don't meet a local var or fn.
(trees_out::get_merge_kind): Local fns are also unique.
* pt.c (push_template_decl_real): Local fns also lack a header.
(tsubst_function_decl): Cope with local fns.
(tsubst_decl): Adjust VAR_DECL tsubsting.
gcc/testsuite/
* g++.dg/modules/tpl-extern-{var,fn}-1_{a.H,b.C}: New.
gcc/cp/
* cp-tree.h (TINFO_VAR_DECLARED_CONSTINIT): Replace with ...
(DECL_DECLARED_CONSTINIT_P): ... here, decl_lang_flag 7.
* decl.c (start_decl): Set DECL_DECLARED_CONSTINIT_P as necessary.
(cp_finish_decl): Likewise.
* pt.c (push_template_decl_real): Don't add a header for
DECL_LOCAL_DECL_P VAR_DECLS.
(tsubst_decl): Check for VAR_DECLS lacking template info are
local. No need to handle TINFO_VAR_DECLARED_CONSTINIT specially.
(tsubst_expr): Likewise.
(instantiate_decl): Likewise.
gcc/cp/
* mapper-client.cc (module_client::open_module_client): Use
Client::PC_PATHNAME.
* mapper-resolver.cc (module_resolver::ModuleRepoRequest): Use
PathnameResponse.
(module_resolver::cmi_response): Likewise.
(module_resolver::IncludeTranslateRequest): Use BoolResponse and
PathnameResponse.
* module.cc (module_state::set_filename): Use Client::PC_PATHNAME.
(module_translate_include): Use Client::PC_BOOL and
Client::PC_PATHNAME.
gcc/cp/
* module.cc (module_state::read_prepare_maps): New.
(module_state::write_{ordinary,macro}_maps): Adjust.
(struct module_state_config): Record number of locations needed.
(module_state::read_location): Deal with lack of locations.
(module_state::{read,write}_config): Adjust.
(module_state::read_initial): Adjust.
gcc/cp/
* name-lookup.c (get_fixed_binding_slot): Don't stat-hack a
namespace.
* module.cc (trees_in::assert_definition): Header units are
module_purview, but ok.
gcc/testsuite/
* g++.dg/modules/ns-dup-1_[ab].C: New.
gcc/cp/
* module.cc (depset::hash_add_namespace): Don't mark bindings
special.
(struct add_binding_data): Record finding a namespace.
(depset::hash::add_binding_entity): Make namespaces idempotent.
(depset::hash::add_namespace_entities): Clear met_namespace.
(module_state::write): Zap partitions bitmap if empty.
* name-lookup.c (push_namespace): Ensure namespace is in
MODULE_SLOT_CURRENT.
* ptree.c (cxx_print_xnode): Show more detail on MODULE_VECTOR.
gcc/testsuite/
* g++.dg/modules/ns-imp-1_[abc].C: New.
* g++.dg/modules/ns-part-1_[abc].C: New.
This implements signed and unsigned integer-class types, whose width is
one bit larger than the widest supported signed and unsigned integral
type respectively. In our case this is either __int128 and unsigned
__int128, or long long and unsigned long long.
Internally, the two integer-class types are represented as a largest
supported unsigned integral type plus one extra bit. The signed
integer-class type is represented in two's complement form with the
extra bit acting as the sign bit.
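A rough sketch of that representation follows, assuming the widest unsigned type is 64 bits (the long long case). It is not the libstdc++ implementation; `wide_value`, `add`, `negate` and `is_negative` are hypothetical names chosen for illustration. The value is stored as the low 64 bits plus one extra most significant bit, and the signed interpretation treats that extra bit as the two's complement sign bit.

```cpp
#include <cstdint>

// Hypothetical 65-bit value: 64 low bits plus one extra top bit.
struct wide_value
{
  std::uint64_t low; // low 64 bits
  bool extra;        // bit 64, the most significant bit
};

// Addition modulo 2^65: add the low words, then fold the carry into
// the extra bit (a one-bit sum is just an exclusive or).
inline wide_value add(wide_value a, wide_value b)
{
  std::uint64_t low = a.low + b.low;
  bool carry = low < a.low;
  bool extra = (a.extra != b.extra) != carry;
  return {low, extra};
}

// Two's complement negation over all 65 bits: invert, then add one.
inline wide_value negate(wide_value a)
{
  return add({~a.low, !a.extra}, {1, false});
}

// Under the signed interpretation the extra bit is the sign bit.
inline bool is_negative(wide_value a)
{
  return a.extra;
}
```

In this sketch, adding one to the largest 64-bit value sets only the extra bit instead of wrapping to zero, and negating one yields a value with every bit set, including the sign bit.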
libstdc++-v3/ChangeLog:
* include/Makefile.am (bits_headers): Add new header
<bits/max_size_type.h>.
* include/Makefile.in: Regenerate.
* include/bits/iterator_concepts.h
(ranges::__detail::__max_diff_type): Remove definition, replace
with forward declaration of class __max_diff_type.
(__detail::__max_size_type): Remove definition, replace with
forward declaration of class __max_size_type.
(__detail::__is_unsigned_int128, __is_signed_int128)
(__is_int128): New concepts.
(__detail::__is_integer_like): Accept __int128 and unsigned
__int128.
(__detail::__is_signed_integer_like): Accept __int128.
* include/bits/max_size_type.h: New header.
* include/bits/range_access.h: Include <bits/max_size_type.h>.
(__detail::__to_unsigned_like): Two new overloads.
* testsuite/std/ranges/iota/difference_type.cc: New test.
* testsuite/std/ranges/iota/max_size_type.cc: New test.
gcc/cp/
* name-lookup.c (push_namespace): Do not create slot on first
lookup.
gcc/testsuite/
* g++.dg/modules/string-view1.C: New test.
* g++.dg/modules/string-view2.C: Ditto.
gcc/cp/
* lex.c (copy_lang_type): Split allocation & assignment to be
conditional-breakpoint friendly.
* module.cc (trees_in::read_class_def): Update variants if we
alter TYPE_LANG_SPECIFIC.
* name-lookup.c (maybe_lazily_declare): Look at main variant's
decl.
gcc/testsuite/
* g++.dg/modules/tdef-inst-1{.h,_[ab].C}: New.
gcc/cp/
* decl.c (poplevel): A local-binding tree list holds the name in
TREE_PURPOSE.
* name-lookup.c (update_local_overload): Add id to TREE_PURPOSE.
(lookup_name_1): Deal with local-binding error_mark_node marker.
(op_unqualified_lookup): Return error_mark_node for 'nothing
found'. Do other short circuiting here.
(maybe_save_operator_binding): Reimplement to always cache a
result.
(push_operator_bindings): Deal with 'ignore' marker.
gcc/testsuite/
* g++.dg/modules/operator-1_[ab].C: New.
gcc/cp/
* module.cc (trees_{in,out}::lang_decl_bools): Do not stream
anticipated_p or hidden_friend_p.
* name-lookup.c (name_lookup::adl_class_fns): DECL_ANTICIPATED is
not informative.
Hidden lambda entities only occur in block and class scopes, so there
is no need to check for them on every lookup. This moves that
piece of validation to lookup_name_1, which cares. It also reorders the
namespace and type checking, as that is also simpler.
gcc/cp/
* name-lookup.c (qualify_lookup): Drop lambda checking here.
Reorder namespace & type checking.
(lookup_name_1): Do hidden lambda checking here.
gcc/cp/
* module.cc (note_def_cache_hasher): GTY needs this even when
unused.
(not_defs_table_t, note_defs): Likewise.
gcc/
* gcc.c (driver::maybe_print_and_exit): Warn about disable-checking.
This patch deals with LOOKUP_HIDDEN, which originally meant 'find
hidden friends', but it's being pressed into service for not ignoring
lambda-relevant internals. However these two functions are different.
(a) hidden friends can occur in block scope (very uncommon) and (b) it
had the semantics of stopping after the innermost enclosing
namespace. That's really suspect for the lambda case, but not
relevant there because we never get to namespace scope (I think).
Anyway, I've split the flag into two and adjusted the lambda callers
to just search block scope. These two flags are added to the
LOOK_want enum class, which allows dropping another parameter from the
name lookup routines.
The remaining LOOKUP_$FOO flags in cp-tree.h are, I think, now all
related to features of overload resolution, conversion operators and
reference binding. Nothing to do with /name/ lookup.
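As a hedged illustration of the split, the following sketch models two independent hiddenness bits in a scoped flag enum. Only the HIDDEN_FRIEND and HIDDEN_LAMBDA names come from the entry below; the type name `want_flags`, the enumerator values, `entity_kind` and the helpers are hypothetical and do not reflect the actual definitions in name-lookup.h.

```cpp
// Hypothetical flag enum; the values are illustrative only.
enum class want_flags : unsigned
{
  NORMAL = 0,
  HIDDEN_FRIEND = 1u << 0, // also find hidden friends
  HIDDEN_LAMBDA = 1u << 1  // also find lambda-relevant internals
};

inline want_flags operator|(want_flags a, want_flags b)
{
  return static_cast<want_flags>(static_cast<unsigned>(a)
                                 | static_cast<unsigned>(b));
}

inline bool has_flag(want_flags want, want_flags flag)
{
  return (static_cast<unsigned>(want) & static_cast<unsigned>(flag)) != 0;
}

// Hypothetical classification of what a lookup has found.
enum class entity_kind { ordinary, hidden_friend, hidden_lambda };

// An entity is acceptable if it is not hidden, or if the caller asked
// for that particular kind of hidden entity.
inline bool acceptable(entity_kind kind, want_flags want)
{
  switch (kind)
    {
    case entity_kind::hidden_friend:
      return has_flag(want, want_flags::HIDDEN_FRIEND);
    case entity_kind::hidden_lambda:
      return has_flag(want, want_flags::HIDDEN_LAMBDA);
    default:
      return true;
    }
}
```

Because the two bits are independent, a caller interested in lambda internals no longer implicitly asks for hidden friends, and the request travels in the same argument as the other lookup preferences rather than in a separate flags parameter.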
gcc/cp/
* cp-tree.h (LOOKUP_HIDDEN): Delete.
(LOOKUP_PREFER_RVALUE): Adjust initializer.
* name-lookup.h (enum class LOOK_want): Add HIDDEN_FRIEND and
HIDDEN_LAMBDA flags.
(lookup_name_real): Drop flags parm.
(lookup_qualified_name): Drop find_hidden parm.
* name-lookup.c (class name_lookup): Drop hidden field, adjust
ctors.
(name_lookup::add_overload): Check want for hiddenness.
(name_lookup::process_binding): Likewise.
(name_lookup::search_unqualified): Likewise.
(identifier_type_value_1): Adjust lookup_name_real call.
(set_decl_namespace): Adjust name_lookup ctor.
(qualify_lookup): Drop flags parm, use want for hiddenness.
(lookup_qualified_name): Drop find_hidden parm.
(lookup_name_real_1): Drop flags parm, adjust qualify_lookup
calls.
(lookup_name_real): Drop flags parm.
(lookup_name_nonclass, lookup_name): Adjust lookup_name_real
calls.
(lookup_type_scope_1): Adjust qualify_lookup calls.
* call.c (build_operator_new_call): Adjust lookup_name_real call.
(add_operator_candidates): Likewise.
* coroutines.cc (morph_fn_to_coro): Adjust lookup_qualified_name
call.
* parser.c (cp_parser_lookup_name): Adjust lookup_name_real calls.
* pt.c (check_explicit_specialization): Adjust
lookup_qualified_name call.
(deduction_guides_for): Likewise.
(tsubst_friend_class): Adjust lookup_name_real call.
(lookup_init_capture_pack): Likewise.
(tsubst_expr): Likewise, don't look in namespaces.
* semantics.c (capture_decltype): Adjust lookup_name_real. Don't
look in namespaces.
libcc1/
* libcp1plugin.cc (plugin_build_dependent_exp): Adjust
lookup_name_real call.
gcc/cp/
* module.cc (module_state::read_cluster): Pay attention to hidden
decls even in same module.
* name-lookup.c (name_lookup::adl_namespace_fns): Drop unused parm.
libcpp/
* directives.c (do_include_common): Drop FIXME question.
* lex.c (cpp_maybe_module_directive): C++ keywords are not a thing
here.
(cpp_directive_only_process): Add assert.
gcc/cp/
* pt.c (tsubst_expr): Do not process using decls again.
gcc/testsuite/
* g++.dg/modules/using-6_a.C: Enable elided code.
* g++.dg/modules/using-8_[ab].C: New.
gcc/cp/
* module.cc (module_state::set_filename): New.
(module_state::do_import): Drop fname arg.
(module_state::read_imports): Set filename here.
(module_state::write_locations): Drop duplicate FIXME.
(module_state::read_macros): Drop out of date FIXME.
(direct_import): Adjust.
(module_translate): Set filename if we're told it.
(preprocess_module): Copy if filename already known.
(preprocessed_module, init_modules): Adjust.
gcc/cp/
* name-lookup.c (name_lookup::adl_namespace_fns): Last param is
unused, and bootstrap complains :( [Nathan left it like that so
he'd be reminded to remove it if it really turned out not needed]
c++: implicit operator== adjustments from P2002. [Jason Merrill]
gcc/cp/
* class.c (build_clone): Retain old version for the moment.
(DECL_NEEDS_VTT_PARM_P): Resurrect, before we kill it again.
libcpp/
* directives.c (_cpp_do_file_change): Optimize rewinding one line
to line zero.
gcc/c-family/
* c-opts.c (c_finish_options): Set locations to zero.
* c-ppoutput.c (cb_define): Always advance line number.
libcpp/
* files.c (cpp_push_include): Pass highest_line for loc.
(cpp_push_default): Likewise.
* line-map.c (linemap_add): Set range and column bits to what we
used to figure start location.
gcc/cp/
* except.c (declare_library_fn_1): Don't look at current binding.
Just make the decl and push it.
* name-lookup.h (get_global_module_decls): Delete decl.
* name-lookup.c (get_global_module_decls): Delete defn.
libcpp/
* macro.c (cpp_get_token_1): Pay attention to arg parsing mode,
and the existence of padding/comment tokens.
gcc/testsuite/
* g++.dg/modules/cpp-6_[abc].[CH]: New.
gcc/cp/
* module.cc (trees_in::unused): New field.
(trees_in::tree_node): Add defaulted off is_use arg, set TREE_USED
on streamed in object. Adjust callers to set it.
(trees_{in,out}::core_bools): Do not stream base.used_flag.
(trees_{in,out}::core_vals): Reorder BINFO fields. Increment
unused around vtbl pieces.
(trees_in::decl_value): Save and reset unused field.
(trees_in::read_var_def): Increment unused for vtbl initializers.
gcc/testsuite/
* g++.dg/modules/used-1_[abc].C: New.
gcc/cp/
* module.cc (module_state::write_cluster): Break out definition
streaming to separate loop.
gcc/testsuite/
* g++.dg/modules/member-def-[12]_c.C: Adjust scans, oh for CHECK-DAG.
gcc/cp/
* cp-tree.h (DECL_MODULE_PARTITION_P): Delete.
(lang_decl_base): Remove module_partition_p bitfield.
* decl.c (duplicate_decls): No need to check or reset it.
* lex.c (cxx_dup_lang_specific): No need to reset it.
* pt.c (build_template_decl): No need to check it.
(tsubst_template_decl): Likewise, or reset it.
* module.cc (trees_in::decl_value): No need to set it.
(trees_out::decl_node): No need to check it.
(depset::hash::make_dependency): Likewise, Adjust import marking
code.
(set_instantiating_module): No need to reset it.
gcc/cp/
* module.cc (trees_out::key_mergeable): Instantiations can have
requires. Not just the template.
gcc/testsuite/
* g++.dg/modules/concept-6{,_[ab]}.[hHC]: New.
gcc/cp/
* tree.c (cp_tree_equal): [CALL_EXPR] Directly check number of
arguments to a call. Assert they exist.
* typeck.c (structural_comptypes): [DECLTYPE_TYPE] Break apart if
conditional.
gcc/cp/
* modules.cc (trees_out::decl_node): Treat namespaces like other
decls.
(module_state::write): When checking, insert non-imported
namespace-decls into the entity map here.
gcc/cp/
* cp-tree.h (module_streaming): Declare.
* module.cc (module_streaming): Define.
(cluster_cmp): Do not strip template off an alias instantiation.
(module_state::read_cluster): Increment module_streaming around
the loading.
* typeck.c (structural_comptypes): Do not resolve typename types
when module_streaming.
WARNING: This breaks lots of modules tests because an inconsistent
invariant is now consistently different. This is intended.
gcc/cp/
* cp-tree.h (SET_TYPE_TEMPLATE_INFO): Not for aliases.
* pt.c (lookup_template_class_1): Type alias's template info
should already be correct.
(tsubst_template_decl): Don't reset TI_TEMPLATE of an alias.
gcc/cp/
* cp-tree.h (SET_TYPE_TEMPLATE_INFO): Only expand VAL once.
* pt.c (perform_typedefs_access_check)
(append_type_to_template_for_access_check): Move G_T_N_A_C call
out of loop.
(get_types_needing_access_check): Simplify.
gcc/cp/
* module.cc (trees_in::decl_value): No need to install constraints
for duplicate.
(trees_in::is_matching_decl): Open code comparison, don't use
decls_match.
gcc/cp/
* name-lookup.c (lookup_field_ident): Check TYPE_LANG_SPECIFIC
before getting the member vector.
gcc/testsuite/
* g++.dg/modules/pmf-1{,_[ab]}.[hHC]: New.
gcc/cp/
* pt.c (template_args_equal): Slight refactor to clarify control
flow.
* tree.c (cp_tree_equal): Use comp_template_args for TREE_VEC.
* typeck.c (structural_comptypes): Comment about exception specs.
gcc/cp/
* module.cc (trees_in::decl_value): Set and reset constraints on
duplicate decl.
(trees_in::is_matching_decl): Use decls_match to compare decls.
gcc/testsuite/
* g++.dg/modules/except-2{,_[ab]}.[hHC]: New.
* g++.dg/modules/builtin-4_[ab].[HC]: New.
gcc/cp/
* cp-gimplify.c (cp_genericize_r): Set DECL_CONTEXT of
IMPORTED_DECL.
* module.cc (trees_{in,out}::core_vals): Stream IMPORTED_DECL's
initial.
(trees_out::decl_node): IMPORTED_DECLs are always by value.
gcc/cp/
* module.cc (trees_{in,out}::vec_chained_decls): New.
(trees_{in,out}::{read,write}_class_def): Use for fields and vtables.
gcc/testsuite/
* g++.dg/modules/vtt-2{,_[ab]}.[hHC]: New.
gcc/cp/
* module.cc (depset::hash::add_dependency): Only unscoped enum
usings from the enum itself are mutual dependencies.
(cluster_cmp): Cope with using decls for the same entity.
gcc/testsuite/
* g++.dg/modules/using-7.C: New.
gcc/cp/
* module.cc (c_parse_final_c_cleanups): Don't complain about
unsynthesized defaulted members in a header unit, and pick a
privileged clone to complain about when we do.
gcc/testsuite/
* g++.dg/modules/imp-member-3.H: New.
gcc/cp/
* decl.c (build_typename_type): Refactor.
* module.cc (trees_out::decl_value): Class-scope using-decls are
like fields.
(trees_out::key_mergeable): Using-decls are like anonymous fields.
(trees_{in,out}::{read,write}_class_def): Beware of
typename_type's TYPE_DECL in the member vec.
(trees_out::mark_class_def): Mark using-decls.
(depset::hash::make_dependency): Assert not a class-scope
using-decl.
* name-lookup.c (mergeable_class_member): Using-decls can be
indexed.
({get,lookup}_field_ident): Using-decls are like fields.
gcc/testsuite/
* g++.dg/modules/indirect-2_c.C: Adjust scans.
* g++.dg/modules/merge-13{,_[ab]}.[hHC]: New.
PR c++/93761
gcc/cp/
* module.cc (begin_header_unit): New, broken out of ...
(module_begin_main_file): ... here. Call it when not
preprocessed.
(init_module_processing): Call it when preprocessed.
gcc/testsuite/
* g++.dg/modules/preproc-1.C: New.
gcc/c-family/
* c-opts.c (c_finish_options): Builtins & command-line starts at
1.
(cb_file_change): Alter detection of main file start.
* c-ppoutput.c (print_line_1): Don't nadger line zero.
gcc/testsuite/
* g++.dg/modules/dir-only-3.C: Update line nos.
* g++.dg/modules/dir-only-4.C: New.
libcpp/
* lex.c (do_peek_prev): New.
(cpp_directive_only_process): Peek backwards from 'R' to check it
is a raw literal.
gcc/testsuite/
* c-c++-common/cpp/dir-only-7.c: Extend.
libcpp/
* lex.c (cpp_maybe_module_directive): Use 'module control-line'
here ...
gcc/cp/
* parser.c (cp_parser_diagnose_invalid_type_name): ... here ...
(cp_parser_import_declaration): ... and here.
gcc/cp/
* module.cc (preprocess_module): Don't load header unit if not
preprocessing. Always close and reopen the line map spans.
libcpp/
* include/cpplib.h (cpp_get_options, cpp_get_callbacks)
(cpp_get_deps): Mark as PURE.
gcc/cp/
* module.cc (MODULE_UNKNOWN_PARTITION): Delete.
(module_state::read_imports): Drop special partition detection.
(module_state::read_partitions): Drop special partition setting.
(module_translate_include): Use main_source_loc for mapper.
({init,fini}_module_processing): Likewise.
gcc/cp/
* lex.c (module_token_cdtor): Call preprocessed_module on
teardown here ...
* parser.c (cp_lexer_new_main): ... not here.
* module.cc (struct module_state): Drop from_loc. Use loc in most
places.
(module_state::{maybe_create_loc,attach,is_detached}): Delete.
(module_state::is_rooted): New. Use it in place of is_detached.
(module_state::imported_from): New.
(module_state::do_import): Create module loc here ...
(direct_import): ... not here ...
(preprocess_module): ... or here.
(import_module): Reparent here.
(module_cpp_undef): Check flag_header_unit.
(module_begin_main_file): Do not declare_module here ...
(preprocessed_module): ... do it here.
gcc/cp/
* cp-tree.h (module_normal_import_p): Delete.
* module.cc (struct module_state): Rationalize flags.
(module_normal_import_p): Delete.
(module_state::direct_import): Don't save and restore line state
here.
(import_module): Just for language.
(preprocess_module): Save line state here.
* parser.c (cp_parser_import_declaration): Adjust.
(cp_parser_declaration_seq_opt): Drop superfluous pragma parsing.
gcc/cp/
* parser.c (cp_parser_import_declaration): Diagnose include
translation in purview. Emit note about module-directive control lines.
gcc/testsuite/
* g++.dg/modules/inc-xlate-1_e.C: New.
gcc/cp/
* parser.c (cp_lexer_alloc): Remove pch finalizing.
(cp_parser_skip_to_closing_parenthesis_1): Deal with falling into
a CPP_PRAGMA.
(cp_parser_skip_to_end_of_{,block_or_}statement): Likewise.
(cp_parser_skip_to_pragma_eol): Don't consume CPP_EOF.
(cp_parser_new): Take a lexer, don't create it here.
(cp_parser_declaration): Just point at peeked tokens, don't copy.
(cp_parser_block_declaration): Likewise.
(cp_parser_initial_pragma): Don't get the first token.
(c_parse_file): Do it here, and finalize pch.
gcc/
* langhooks.h (preprocess_token): Take a const cpp_token pointer.
gcc/c-family/
* c-ppoutput.c (scan_translation_unit): Preprocess lang hook
doesn't alter the token.
gcc/cp/
* modules.cc (struct module_state): Add
lazy_{preprocessor,language}_p fields.
(module_state::read): Delete, move into do_import.
(module_state::read_preprocessor): Be idempotent. Read direct
imports here. Set header bitmap bits.
(module_state::read_language): Likewise, set import/export bits.
(module_state::check_read): Adjust file closing condition.
(module_state::set_import): Do not deal with header bitmaps here.
(module_state::do_import): Do the initial read (only) here.
(module_state::direct_import): Do the preprocessor and language
reading here.
gcc/cp/
* module.cc (module_state::write_macros): Always write a set of
defs.
(module_state::read_macros): Adjust.
(module_state::read): Only read preprocessor for header unit.
(module_state::read_preprocessor): Assert am header.
gcc/cp/
* (slurping::slurping): Set current to ~0u.
(module_state::read_{initial,preprocessor,language}): New, broken
out of ...
(module_state::read): ... here. Call them.
gcc/cp/
* module.cc (trees_in::decl_value): Disambiguate entity_ary slots
from entity_hash slots.
(trees_out::get_merge_kind): Check of
uninstantiated_template_friend even when there's a depset.
From-SVN: r279867
gcc/cp/
* module.cc (nodel_decl_has): Rename to ...
(duplicate_hash): ... here. Allow binfos.
(trees_{in,out}::binfo_mergeable): New.
(trees_{in,out}::tree_node): Use it for tt_binfo.
(trees_{in,out}::tree_value): Use to dedup binfos.
gcc/testsuite/
* g++.dg/modules/merge-8{.h,_a.H,_b.C}: New.
From-SVN: r279843
gcc/cp/
* name-lookup.c (mergeable_class_member): Get to class template
correctly.
gcc/testsuite/
* g++.dg/modules/merge-4{.h,_a.H,_b.C}: New.
From-SVN: r279578
gcc/cp/
* cp-tree.h (set_defining_module): New.
* decl.c (xref_tag_1): Check can redeclare, set instantiating
module.
(start_enum): Likewise.
* module.c: Update doc preamble.
(module_may_redeclare): Get to the template.
(set_defining_module): New, broken out of ...
(set_instantiating_module): ... here. DO NOT CALL IT.
* rtti.c (init_rtti_processing): "type_info" has exportedness.
* semantics.c (begin_class_definition): Check can redeclare. Set
defining & instantiating.
gcc/testsuite/
* g++.dg/modules/friend-5_[ab].C: New.
* g++.dg/modules/tdef-4_[bc].C: Add comments.
From-SVN: r279457
gcc/cp/
* module.cc (enum tree_tag): Delete tt_friend_template.
(trees_out::decl_node): Uninstantiated template friends are not
special.
(trees_in::tree_node): Delete tt_friend_template handling.
(trees_{in,out}::{read,write}_class_def): No need to stream
template friends specially.
(trees_out::mark_class_def): No need to mark class_def members.
(depset::hash::make_dependency): Do not confuse uninstantiated
template friends with scope members.
gcc/testsuite/
* g++.dg/modules/tpl-friend-2_a.C: Adjust scan.
From-SVN: r279054
gcc/cp/
* decl.c (duplicate_decls): Just zap the template's module info.
* pt.c (build_template_decl): No need to copy false module info.
(tsubst_template_decl): Just zap the template's module info.
From-SVN: r279051
gcc/cp/
* module.cc (trees_out::decl_node): Get entity index & origin from
the depset.
(depset::hash::make_dependency): Stash the importing index &
origin in the depset.
(module_state::write_namespace): Get index & origin from the depset.
From-SVN: r279044
gcc/cp/
* module.cc (depset::hash::add_namespace_context): New.
(trees_out::decl_node): Use it for namespace.
(depset::hash::make_dependency): Likewise for discovered GMF.
(depset::hash::add_binding): Use it.
(depset::tarjan::connect): Make sure we don't wander into imports.
(depset::hash::connect): Don't add imports to the graph.
(module_state::write_namespace{,s}): Never meet global namespace.
(module_state::write_entries): Handle namespace entries
separately.
(module_state::write): Reorganize cluster traversals.
From-SVN: r279022
gcc/cp/
* cp-tree.h (lang_decl_base): Replace module_origin field with
module_import_p flag.
* modules.cc (MODULE_UNKNOWN{,_PARTITION}): Adjust.
(MODULE_LIMIT): Delete.
(module_state): Make mod & remap unsigneds.
(module_state::read): No need to check module overflow.
From-SVN: r279011
gcc/cp/
* decl.c (duplicate_decls): Pass the old decl to
module_may_redeclare.
* module.cc (module_may_redeclare): Expect the decl, not its
owner.
From-SVN: r278999
Namespaces are in the entity_ary
gcc/cp/
* module.cc (trees_out::decl_node): Namespace may be imported.
(depset::hash::make_dependency): Permit imported namespaces.
(module_state::{read,write}_namespace): New.
(module_state::{read,write}_namespaces): Adjust.
(module_state::{read,write}_bindings): Adjust.
(module_state::write): Don't count imported namespaces.
* name-lookup.c (add_imported_namespace): Mark namespace as
imported, if we made it.
gcc/testsuite/
* g++.dg/modules/indirect-[1234]_[bc].C: Adjust scans.
* g++.dg/modules/namespace-2_a.C: Likewise.
From-SVN: r278704
gcc/cp/
* module.cc (module_state::write_namespaces): Delete TABLE arg,
look at the dep[0]
(module_state::read_namespaces): Add num_spaces parm. Use it to
count iterations.
(module_state::write_bindings): Drop TABLE arg, look at dep[0].
(struct module_state_config): Add num_namespaces field, stream it.
(module_state::{read,write}): Don't count global namespace. Adjust
namespace streaming calls.
From-SVN: r278694
gcc/cp/
* module.cc (trees_out::decl_node): We should not meet imported
internal namespaces.
(module_state::{read,write}): Stream entities before namespaces.
* name-lookup.c (add_imported_namespace): Module number must not
be negative.
From-SVN: r278625
gcc/cp/
* module.cc (trees_out::decl_value): Ensure we stay in the
section.
(trees_out::mark_class_def): Do not mark non-friends on
CLASSTYPE_DECL_LIST.
From-SVN: r278615
gcc/cp/
* (class depset): Drop entity_num field. Adjust all users to use
cluster.
(module_state::write_cluster): Merge counting and marking loops.
From-SVN: r278508
gcc/cp/
* module.cc (unnamed_ary, unnamed_map_t, unnamed_map): Delete.
(module_state::read_cluster): Do not insert into unnamed_ary or
map.
(module_state::read_unnamed): Do not extend unnamed_ary.
(module_state::read): Do not set unnamed_lwm.
(module_for_unnamed): Delete.
({init,finish}_module_processing): Do not create/destroy unnamed
ary & map.
gcc/testsuite/
* g++.dg/modules/indirect-[23]_c.C: Adjust scans.
* g++.dg/modules/inst-[124]_b.C: Likewise.
* g++.dg/modules/late-ret-2_c.C: Likewise.
From-SVN: r278503
gcc/cp/
* module.cc (module_state::{read,write}_unnamed): Specializations
store entity index.
(lazy_load_specializations): Use the entity index.
From-SVN: r278502
gcc/cp/
* module.cc (enum tree_tag): Kill tt_anon. No longer emitted.
(import_entity_index): Do not assert imported.
(trees_out::decl_node): Always get the import_entity_index for
tt_entity.
(trees_in::tree_node): Drop tt_anon.
(module_state::write_cluster): Add the EK_DECLs to the entity_map.
From-SVN: r278462
gcc/cp/
* module.cc (trees_out): Clean up dead code, update instrumentation.
(class_members): New global.
(depset::hash::add_writables): Rename to ...
(depset::hash::add_namespace_entities): ... this.
(depset::hash::add_class_entities): New, incomplete.
(module_state::write): Add class entities.
(set_instantiating_module): Record on the class_member list if
necessary.
From-SVN: r277574
gcc/cp/
* module.cc (trees_out::core_vals): Check if template parm has a
canonical type before processing it.
(trees_in::tpl_parm_value): Likewise.
From-SVN: r277457
gcc/cp/
* module.cc (enum tree_tag): Delete tt_typename_decl.
(trees_out::decl_node): Do not handle TYPENAME_TYPE here ...
(trees_out::type_node): ... handle them here instead.
(trees_in::tree_node): Delete tt_typename_decl, handle
TYPENAME_TYPE as a derived_type.
From-SVN: r277456
gcc/cp/
* module.cc (trees_{in,out}::tpl_parm_value): No need to stream
bound ttp's TI here.
(trees_out::get_merge_kind): Refactor anon type determination.
From-SVN: r277423
gcc/cp/
* module.cc (get_clone_target): Assert more.
(trees_in::back_ref): Check the tree's not insane.
(trees_in::tree_node): Check the clone target is ok.
(module_state::lazy_load): Inhibit GC.
From-SVN: r277356
gcc/cp/
* pt.c (reduce_template_parm_level): Attach TPI to the type or
decl.
(convert_generic_types_to_packs): Pass new type to
reduce_template_parm_level.
From-SVN: r277332
gcc/cp/
* module.cc (trees_{in,out}::key_mergeable): Do not stream tpl
header of fn parms here.
(tree_{in,out}::decl_value): Stream them here ...
(depset::hash::find_dependencies): ... and here.
From-SVN: r277263
gcc/cp/
* module.cc (dumper::impl::nested_name): Check template_parm_p
directly.
(trees_out::core_vals): Check DECL_TEMPLATE_PARM_P.
(trees_out::decl_value): Never get a DECL_TEMPLATE_PARM_P.
(trees_in::decl_value): Likewise.
(trees_out::decl_node): Send DECL_TEMPLATE_PARM_P to tree_value.
(trees_out::type_node): Simplify name detection.
(trees_out::tree_value): Allow DECL_TEMPLATE_PARM_P, but no other
tmpls/type/var/fns.
* tree.c (bind_template_template_parm): Set DECL_TEMPLATE_PARM_P.
From-SVN: r277162
gcc/cp/
* module.cc (trees_out::decl_value): Stream thunks too.
(trees_out::decl_node): Forward all potentially mergeable decls to
decl_value.
(trees_out::tree_value): Make sure we don't get any potentially
mergeable decls.
(trees_{in,out}::tree_value): Stream template parms
via tpl_parms.
(trees_{in,out}::tpl_parms): The vector can be 0 length.
(trees_out::mark_declaration): Don't mark the template parms.
From-SVN: r277116
gcc/cp/
* decl.c (cp_make_fname_decl): Set context to global namespace,
outside functions.
(builtin_function_1): Merge into ...
(cxx_builtin_function): ... here. Nadger the decl before maybe
copying it. Set the context.
(cxx_builtin_function_ext_scope): Push to top level, then call
cxx_builtin_function.
From-SVN: r277081
gcc/cp/
* decl.c (fixup_anonymous_aggr): Clear LAZY flags, no need to
strip out fns.
* module.cc (module_state::lazy_load): Distinguish out of order
from failure to set slot.
* name-lookup.c (get_binding_or_decl): Fixme :(
gcc/testsuite/
* g++.dg/modules/anon-2{,_[ab]}.[hHC]: New.
From-SVN: r276005
gcc/cp/
* module.cc (trees_in::chained_decls): No need to deal with clones here.
(trees_{in::read,out::write}_class_def): Don't chain
fields until we know we're the definition.
From-SVN: r275732
gcc/cp/
* name-lookup.c (get_namespace_binding): Fish out global binding,
if it's a vector.
gcc/testsuite/
* g++.dg/modules/binding-2.H: New.
From-SVN: r275711
gcc/cp/
* module.cc (trees_out::core_vals): Never write template_decl's
type ...
(trees_in::tree_value): ... resurrect it here instead.
From-SVN: r275687
gcc/cp/
* module.cc (trees_in::finish{,_type}): Delete. Move into
trees_in::tree_value, removing type remapping etc.
(trees_out::start): Check not streaming an unexpected type.
(trees_{in,out}::core_vals): Pointers are not streamed.
(trees_in::tree_value): Move remnants of finish{,_type} here &
simplify handling.
From-SVN: r275225
gcc/cp/
* module.cc (friend_from_decl_list): Reimplement.
(trees_out::tree_decl): When streaming a local template friend
reference, make sure we find one.
From-SVN: r275179
gcc/cp/
* module.cc (trees_in::finish): Don't clear PENDING_TEMPLATE here.
Set IDENTIFIER_VIRTUAL_P if t is a vfunc.
(trees_out::core_bools): Write PENDING_TEMPLATE as false.
gcc/testsuite/
* g++.dg/modules/virt-1_[ab].C: New.
From-SVN: r274909
gcc/cp/
* module.cc (depset::hash::add_dependency): We should never find a
redirect.
(depset::hash::add_redirect): Rename to ...
(depset::hash::add_partial_redirect): ... here. Mark the redirect
as unreachable.
From-SVN: r274832
gcc/c-family/
* c.opt (Winclude-translate*): New family of options.
gcc/cp/
* module.cc (inform_includes): New var.
(module_translate_include): Inform of translations.
(init_module_processing): Canonicalize inform list.
(handle_module_option): Process inform options.
From-SVN: r274797
gcc/cp/
* module.cc (module_state::module_state): Header must be
.-relative if not absolute.
(get_module): Validate module name more.
gcc/testsuite/
* g++.dg/modules/map-2.{C,map}: New.
From-SVN: r274752
gcc/cp/
* module.cc (trees_out::tree_decl): Cope with using decls in the
binding list.
gcc/testsuite/
* g++.dg/modules/enum-6_[ab].[HC]: New.
From-SVN: r273762
gcc/cp/
* parser.c (cp_parser_class_specifier_1): Fixup a template's type
with a late exception specifier.
gcc/testsuite/
* g++.dg/modules/except-1.C: New.
From-SVN: r273701
gcc/cp/
* module.cc (trees_out::tree_type): Detect bound template template
parm.
(trees_{in,out}::tree_value): Stream type on any TYPE_DECL that
is its TYPE_STUB_DECL.
From-SVN: r272936
gcc/cp/
* module.cc (trees_in::read_function_def): Push the template for
post processing.
(module_state::read_cluster): Deal with abstract post processing.
gcc/testsuite/
* g++.dg/modules/tpl-friend-7_[ab].C: New.
From-SVN: r272895
gcc/cp/
* class.c (build_clone): Neaten and assert.
* cp-tree.h (lang_decl_u5.cloned_function): Fix comment.
* name-lookup.c (get_lookup_ident): Don't fall off end of overload.
From-SVN: r272761
gcc/cp/
* module.cc (depset::hash::add_redirect): New.
(depset::hash::add_specialization): Use it.
(depset::hash::add_mergeable): Use it.
(depset::hash::add_dependency): Never add a redirect here.
From-SVN: r272707
gcc/cp/
* modules.cc (depset): Add DB_OOT_SPEC_BIT.
(depset::~depset): Free the spec entry if we own it.
(trees_{in,out}::note_definition): Check template result isn't
there.
(depset::hash::add_dependency): Correctly insert discovered
non-member template instantiations.
From-SVN: r272562
gcc/cp/
* modules.cc (note_defs): New checking hash table.
(trees_{in,out}::note_definition): New checkers.
(trees_in::read_{function,class,var,enum}_def): Add maybe_template
arg, use it. Note definitions.
(member_owned_by_class): New, extracted from ...
(trees_out::mark_class_member): ... here. Call it.
(trees_out::write_class_def): Only write the owned definitions.
(trees_out::write_definition): Note definition.
(trees_in::read_definition): Pass maybe_template to readers.
(module_state::write): Reset note_defs hash.
(init_module_processing): Init it.
(finish_module_processing): Delete it.
From-SVN: r272557
gcc/cp/
* module.cc (specialization_cmp): Deal with more equivalencies.
(depset_cmp): New, cloned and adjusted from cluster_cmp.
(depset::hash::connect): Use it.
From-SVN: r272419
gcc/cp/
* module.cc (depset::hash::add_dependency): Inhibit internal
linkage setting on functions.
(cluster_cmp): We can meet matching using decls.
From-SVN: r272313
gcc/cp/
* module.cc (trees_{in,out}::core_vals): Stream original_type of type of
typedefs, not their type.
(trees_out::tree_type): Stream type_name of typedefs.
(trees_out::tree_value): Insert the type of a typedef.
(trees_in::tree_value): Reconstruct the type of a typedef.
gcc/testsuite/
* g++.dg/modules/tdef-4_[abc].C: New.
* g++.dg/modules/class-3_b.C: Adjust scan.
From-SVN: r272297
gcc/cp/
* module.cc (binding_cmp): There can be an implicit and
non-implicit type_decl.
* name-lookup.c (check_mergeable_decl): Check implicitness of
type_decl.
From-SVN: r272286
gcc/cp/
* module.cc (loc_spans::init): Add lmaps parm, separate main from
forced header locs.
(loc_spans::SPAN_FIRST): New, use it for first span.
(loc_spans::SPAN_MAIN): Just after the first span.
gcc/testsuite/
* g++.dg/modules/macro-5_[abc].[CH]: Adjust.
From-SVN: r272042
gcc/cp/
* module.cc (dumper::MAPPER): New flag. Use it on mapper things.
(dumper::push): Only do blank line when starting a new module
nest.
From-SVN: r271973
gcc/cp/
* module.cc (trees_out::core_vals): Template type parms are their
own canonical.
(trees_in::finish_type): Never subst a canonical type parm for the type.
gcc/testsuite/
* g++.dg/modules/ttp-2_[ab].C: New.
From-SVN: r271962
gcc/cp/
* module.cc (trees_{in,out}::{read,write}_class_def): Stream
friend lists and decl lists specially.
(trees_out::mark_class_def): Mark local friend decls.
(depset::hash::add_specializations): Don't add non-specializations
that are in the table.
* pt.c (push_template_decl_real): Mark non-pushed friend templates.
gcc/testsuite/
* g++.dg/modules/tpl-friend-1_[ab].C: New.
From-SVN: r271809
gcc/cp/
* decl.c (duplicate_decls): Remove duplicate assert.
* pt.c (build_template_decl): Set RESULT & TYPE of the template
here ...
(process_partial_specialization): ... not here ...
(add_inherited_template_parms): ... nor here ...
(push_template_decl_real): ... nor here. Refactor.
gcc/
* doc/invoke.texi (C++ Modules): Document atomicity.
gcc/fortran/
* cpp.c (gfc_cpp_add_dep, gfc_cpp_add_target, gfc_cpp_init):
Rename mrules to mkdeps.
From-SVN: r271745
Template template parms, and a bunch of other stuff
gcc/cp/
* cp-tree.h (DECL_TEMPLATE_INFO): Augment docs.
* module.cc (depset::clear_flag_bit): New.
(depset::is_unreached): Replace is_implicit_specialization.
(depset::is_marked): Replace is_first_dep_repurposed.
(dumper::impl::nested_name): Template args may be NULL.
(trees_{in,out}::core_vals): Template decl result & args streamed
with decl.
(trees_out::tree_decl): TTPs by value.
(trees_{in,out}::tree_value): Reorder body streaming, stream more
template bits.
(trees_out::tree_mergeable): Redo specialization tagging.
(trees_out::mark_class_def): Only mark decls on decl list.
(trees_out::mark_declaration): Simplify.
(depset::hash::add_dependency): Deal with reaching unreached.
(specialization_add): Grab all instantiations from this TU.
(depset::hash::add_specializations): Determine is_unreached.
(depset::hash::find_dependencies): Iterate until no more unreached
reached.
(module_state::write_unnamed): Adjust.
* pt.c (tsubst_function_decl): Set DECL_MODULE_OWNER.
gcc/testsuite/
* g++.dg/modules/ttp-1_[ab].C: New.
* g++.dg/modules/indirect-[234]_[bc].C: Adjust scans.
* g++.dg/modules/inst-[24]_ab.C: Likewise.
From-SVN: r271578
gcc/cp/
* module.cc (dumper::impl::nested_name): Cope with TTPs.
(depset::hash::connect): Return the vec.
(depset::tarjan): Create and return the vec.
(module_state::write_{bindings,unnamed}): SCCS are in a vec.
(module_state::write): Likewise.
From-SVN: r271551
gcc/cp/
* module.cc (trees_out::mark_node): Binfos may be marked.
(trees_{in,out}::start): Binfos may be streamed.
(trees_{in,out}::core_vals): Likewise.
(trees_{in,out}::tree_node): Reachable binfos may always be
inserted.
(trees_{in,out}::{read,write}_binfos): Delete.
(trees_out::mark_class_def): Mark the binfo hierarchy.
(trees_{in,out}::{read,write}_class_def): Stream binfos here.
From-SVN: r271518
gcc/cp/
* pt.c (get_specializations): Adjust type template checking.
(get_specializations_for_module): Get the type specializations
too.
gcc/testsuite/
* g++.dg/modules/tpl-spec-4_[ab].C: New.
From-SVN: r270781
gcc/cp/
* module.cc (tree_tag): Delete tt_inst.
(trees_out::tree_decl): All instantiations are depended. Never
tt_inst.
(trees_in::tree_node): Delete tt_inst handling.
(trees_in::tree_mergeable): Deal with type specializations.
* pt.c (lookup_template_class_1): Set instantiation owner to
current TU.
(match_mergable): Accept type.
gcc/testsuite/
* g++.dg/modules/indirect-[234]_[bc].C: Adjust scans.
* g++.dg/modules/inst-[34]_[bc].C: New.
From-SVN: r270543
gcc/cp/
* module.cc (loc_spans): Main span is all but reserved.
(slurping): Drop pre_early_ok.
(loc_spans::init): Don't push command line and forced header
spans.
(loc_spans::macro): Fix comparison thinko.
(module_state::read_location): No such thing as early.
(module_state::prepare_locations): Drop command line and forced
header handling.
(maybe_add_macro): Ignore lazy macros.
(canonicalize_header_name): Return the buffer.
gcc/testsuite/
* g++.dg/modules/macro-5_c.C: Adjust regexp.
From-SVN: r270281
gcc/cp/
* module.cc (slurping): Move {ordinary,macro}_locs to ...
(module_state): ... here.
(module_state::read_location): Adjust.
(module_state::read_locations): Adjust, recorded locations are for
current TU.
From-SVN: r270242
gcc/c-family/
* c-opts.c (c_common_post_options): Invert sense of
cpp_read_main_file arg.
gcc/cp/
* parser.c (cp_parser_translation_unit): Reject include
translation in module purview.
libcpp/
* files.c (_cpp_stack_file): Take include type parm. Drop line
parm. Do line-table adjusting here ...
(_cpp_stack_include): ... not here.
* include/cpplib.h (cpp_read_main_file): Invert final parm.
* init.c (cpp_read_main_file): Adjust.
* internal.h (include_type): Add more enumerations, and document.
(_cpp_stack_file): Adjust prototype.
gcc/testsuite/
* g++.dg/modules/alias-2_[ab].[CH]: New.
* g++.dg/modules/exp-xlate-1_[ab].[CH]: New.
* g++.dg/modules/legacy-3_c.H: Robustify testing.
From-SVN: r269955
libcpp/
* files.c (struct _cpp_file): Make bools bitfields, add
header_unit flag.
(_cpp_find_file): Don't write for (; !fake;) to mean if (!fake)
for (;;).
(_cpp_stack_file): Refactor.
From-SVN: r269844
gcc/cp/
* module.cc (trees_out::tree_node): Any enum can be tt_enum_int.
(trees_out::mark_enum_def): Only mark integer_cst inits.
gcc/testsuite/
* g++.dg/modules/enum-2_[ab].C: Add.
From-SVN: r269822
gcc/cp/
* class.c (maybe_add_class_template_decl): No need to add
CONST_DECLs.
* pt.c (instantiate_class_template_1): CONST_DECLs are no longer on
the template decl list.
* module.cc (node_template_info): Cope with enum members of
templates.
(trees_in::insert): Allow null when failing.
(trees_{in,out}::core_vals): Uninstantiated enums have no
underlying type.
(trees_out::tree_decl): Enums can be members too.
(trees_out::write_class_def): Only announce when streaming.
(trees_out::mark_class_def): Don't mark const_decl fields.
(trees_out::mark_enum_def): Always mark the enum here.
gcc/testsuite/
* g++.dg/modules/enum-2_[ab].C: New.
From-SVN: r269821
gcc/cp/
* module.cc (depset::add_dependency): Drop kind arg, return void.
(depset::hash::add_binding): Set current during addition.
(module_state::add_writables): Tweak.
From-SVN: r269397
gcc/cp/
* module.cc (declare_module): Deal with de-GMFing here, get name
of implicit header module.
* parser.c (enum module_preamble): New enum.
(cp_parser_translation_unit): Use it.
(cp_parser_{module,import}_declaration): Use it.
gcc/testsuite/
* g++.dg/modules/p0713-3.C: Adjust diags.
From-SVN: r269369
gcc/cp/
* module.cc (module_state::read_imports): Don't check partition
import/export here ...
(module_state::direct_import): ... or here ...
(module_state::write): ... but do it here.
gcc/testsuite/
* g++.dg/modules/part-2_[cd].C: Adjust.
* g++.dg/modules/part-2_e.C: New.
From-SVN: r269215
gcc/cp/
* module.cc (get_module_slot): New broken out of ...
(get_module): ... here. Call it.
(module_mapper::translate_include): Cope with file mapper.
gcc/testsuite/
* g++.dg/modules/map-2.{C,map}: New.
From-SVN: r268535
ADL part 2
gcc/cp/
* cp-tree.h (LOOKUP_FOUND_P): Allow enumeral type.
* module.cc (module_visible_instantiation_path): Protect against
not modules.
* name-lookup.c (name_lookup::adl_enum): New.
(name_lookup::adl_type): Use it.
(name_lookup::adl_namespace): INST_PATH may be null.
(name_lookup::search_adl): Add partitions of types.
gcc/testsuite/
* g++.dg/modules/adl-[23]_[abc].C: New.
From-SVN: r268401
gcc/cp/
* module.cc (trees_{in,out}::core_vals): Clean up enumeral type
handling.
(trees_out::tree_node, module_state::mark_enum_def): Use
SCOPED_ENUM_P.
From-SVN: r268206
gcc/cp/
* cp-objcp-common.c (cp_common_init_ts): Set TYPE_PACK_EXPANSION
and TYPE_ARGUMENT_PACK correctly.
* module.c (trees_in::finish_type): Special case pack types and
bound parms.
gcc/testsuite/
* g++.dg/modules/var-tpl-1_[ab].C: New.
From-SVN: r268171
gcc/cp/
* module.cc (finish_module_parse): Better location info on errors.
* name-lookup.c (name_lookup::search_namespace_only): Skip
partition slot too.
(check_module_override): Likewise.
(check_mergeable_decl): New broken out of ...
(match_mergeable_decl): ... here. Call it.
(reuse_namespace): Iterate over global slot.
* ptree.c (cxx_print_xnode): More cluster info.
From-SVN: r267807
gcc/cp/
* module.cc (module_state::direct_import): Add LAZY parm, use it
rather than legacy_p. Never expect a has_bmi file.
(declare_module): Always add to pending imports.
* parser.c (cp_parser_translation_unit): Expect a deferred import
for a legacy module.
From-SVN: r267729
Breaks leg-merge-7_c.C.
gcc/cp/
* module.c (tree_tag): Add tt_template, tt_implicit_template.
(node_template_info): Not a member fn.
(trees_out::{mark_node,mark_gme}): Mark the template.
(trees_{in,out}::tree_decl): Do implicit_template.
From-SVN: r267205
Function template deduping
gcc/cp/
* module.cc (trees_{in,out}::tpl_parms): New.
(trees_out::mark_gme): Mark template result too.
(trees_{in,out}::tree_gme): Do templates.
(module_state::write_cluster): Any decl can be a GME.
* name-lookup.c (match_global_decl): Do template matching.
gcc/testsuite/
* g++.dg/modules/leg-merge-7_[abc].[HC]: New.
From-SVN: r267110
gcc/cp/
* class.c (layout_class_type): Base type has same module as owner.
* module.c (module_state::{read,write,mark}_class): Take DECL.
(module_state::{read,write,mark}_definition): Adjust.
(topmost_decl, get_module_owner): Don't look at TYPE_CONTEXT.
From-SVN: r266790
gcc/cp/
* class.c (build_base_field_1): Refactor.
(layout_class_type): Give as_base type a name.
* decl.c (initialize_predefined_identifiers): Make as_base name
unpronounceable.
* module.cc (tree_tag): Delete tt_as_base.
(trees_out::mark_node): Don't expect fake base here.
(trees_out::tree_decl): Write fake base as pseudo-named.
(trees_out::tree_type): Don't handle fake base specially here.
(trees_in::tree_node): Read fake base as pseudo-named. Delete
tt_as_base handling.
(module_state::mark_class_def): Adjust.
From-SVN: r266789
gcc/cp/
* parser.c (cp_parser_translation_unit): Adjust GMF deferred
imports.
(cp_parser_tokenize): Only pay attention to module/export at start
of decl.
From-SVN: r265299
gcc/cp/
* module.cc (declare_module): Copy always.
* parser.c (cp_parser_translation_unit): Allow GMF under atom.
(cp_parser_module_name): Allow legacy name under ts.
(cp_parser_initial_pragma): Fix from trunk.
gcc/testsuite/
* g++.dg/modules: Many changes.
From-SVN: r265230
gcc/cp/
* module.cc (module_purview): New.
(module_purview_p, module_interface_p): Use it.
(module_maybe_interface_p): New.
(module_state::direct_import): Use it.
(process_deferred_imports): Likewise.
(declare_module): Set it.
* parser.c (cp_parser_translation_unit): Inform modules of GMF.
From-SVN: r265229
gcc/cp/
* cp-tree.h (process_deferred_imports): Drop location arg.
* module.cc ({import,declare}_module): Disable eager imports.
(process_deferred_imports): Drop location arg, take it from the
deferred imports.
* parser.c (cp_parser_translation_unit): Process deferred imports.
(cp_parser_tokenize): Correct close-brace handling. Process
deferred imports.
(c_parse_file): Adjust process_deferred_imports call.
gcc/testsuite/
* g++.dg/modules/atom-preamble-1.C: More workarounds.
From-SVN: r265201
gcc/cp/
* module.cc (loc_spans::close): Close the current last map.
(module_state::prepare_locations): Adjust.
(module_state::preamble_load): Adjust span closing.
(finish_module_parse): Likewise.
From-SVN: r265023
Expunge the spewer
gcc/cp/
* module.c (struct slurping): No need to tag.
(struct spewing): Delete.
(declare_module, module_state::atom_preamble)
(finish_module_parse): Don't deal with it.
From-SVN: r264311
New locations working
gcc/cp/
* module.c (module_state::prepare_locations): New, broken out of ...
(module_state::write_locations): Adjust.
(module_state::read_locations): Fix.
From-SVN: r264309
gcc/cp/
* cp-tree.h (import_module, declare_module, atom_module_preamble)
(finish_module_parse, maybe_atom_legacy_module): Drop line_map
arg.
* decl2.c (c_parse_final_cleanups): Adjust.
* parser.c (cp_parser_module_declaration)
(cp_parser_import_declaration, c_parse_file): Adjust.
* module.c (loc_spans): Drop lmaps member & adjust.
(module_state): Drop line_maps from some but not all members.
From-SVN: r264272
Fix more GC
gcc/cp/
* module.c (module_state): Tag for_user.
(module_state_hash): Derive from ggc_ptr_hash.
(init_module_processing): GGC alloc hash table. Get mapper when
not lazy, add ggc_collect.
(finish_module_parse): Don't zap hash table here ...
(finish_module_processing): ... do it here instead.
gcc/testsuite/
* g++.dg/modules/gc-2_a.C: New.
* g++.dg/modules/gc-2.map: New.
From-SVN: r263995
gcc/cp/
* cp-tree.h (declare_module, import_module): Separate name from location.
* module.c (module_state::attach): Drop maybe_vec_name arg.
(module_state::get_module): Flatten here.
(declare_module, import_module): Separate name from loc.
(maybe_atom_legacy_module): Adjust.
* parser.c (cp_parser_module_name): Return tree only.
(cp_parser_module_declaration, cp_parser_import_declaration): Adjust.
gcc/testsuite/
* g++.dg/modules/legacy-4.H: Adjust error.
From-SVN: r263748
gcc/cp/
* module.c (module_state::is_mapping): Rename to ...
(module_state::is_detached): ... here. Use from_loc.
(module_state::attach): New broken out of ...
(module_state::find_module): ... here. Delete, fold into ...
(module_state::read_imports, import_module, declare_module):
... these callers.
(module_state::read): Adjust module index setting.
From-SVN: r263705
Diverted include line numbering
libcpp/
* directives.c (do_include_common): Fixup diversion line
numbering.
(_cpp_pop_buffer): Free to_free even if not a file.
gcc/c-family/
* c-ppoutput.c (print_line_1): More C++y.
gcc/cp/
* module.c (module_mapper::divert_include): Two \n's.
gcc/testsuite/
* g++.dg/modules/legacy-4: New.
* g++.dg/modules/legacy-3_b.H
From-SVN: r262256
gcc/cp/
* module.c (module_state::{read,write}_imports): New.
(module_state::{read,write}_config): Don't do imports here ...
(module_state::{read,write}): ... do them here.
From-SVN: r261600
gcc/cp/
* module.c (struct slurping): New, broken out of ...
(struct module_state): ... here. Move loading data there and
adjust all users.
From-SVN: r261590
* module.c (elf_out::grow): Always define, always align.
(elf_out::write): Streaming buffers must be using our
allocator. No need to align here.
From-SVN: r261473
gcc/cp/
* module.c (data::printf): Moved from ...
(bytes_out::printf): ... here.
(elf): Replace sections pointer with section_vec local class.
Adjust all uses.
(elf_out): Remove strtab member class.
(elf_out::end): Adjust.
From-SVN: r261330
gcc/cp/
* module.c (relativize_import): New.
(module_state::write_readme): Call it.
(module_state::read_imports): Make import relative to importer,
query mapper if needed.
From-SVN: r261170
gcc/cp/
* module.c (depset::hash): Add a sneakoscope.
(module_state::find_dependencies): Turn it on.
(trees_out::tree_decl): Check it.
gcc/testsuite/
* g++.dg/modules/local-1_[ab].C: New.
From-SVN: r260607
Revert lazy definition patches. Once one gets into template land
it's all very brittle. We end up with an unmaintained difference
between 'we need the definition right now', and 'we need the
definition at some point'. I end up not being able to maintain the
SCC dependency graph, and my head melted, looking at the rat hole
that was appearing. Until proved otherwise, there are other
things to get on with.
gcc/cp/
* cp-tree.h (DECL_MODULE_LAZY_DEFN, HAS_DECL_MODULE_LAZY_DEFN_P)
(MAYBE_DECL_MODULE_LAZY_DEFN): Delete.
(struct lang_decl_min): Remove lazy_module_defn field.
(lazy_load_defn): Delete decl.
* constexpr.c (cxx_eval_call_expression): Remove lazy loading.
* decl2.c (decl_defined_p, mark_used): Likewise.
* module.c (depset::hash::add_definition): Remove DEFERRED parm.
(trees_in::{pre,}seed): Delete.
(trees_out::seeding{,_p}): Delete.
(trees_out::unmark_trees): Seeding not needed.
(trees_out::{begin,end}): Delete seeding variants.
(trees_out::seed): Remove.
(trees_out::insert): Return val, remove seeding.
(trees_{in,out}::lang_decl_vals): Remove seeding.
(has_definition): Return bool.
(ct_seed_decl, ct_self_used): Delete.
(module_state::{read,write}_cluster): Remove lazy handling.
(module_state::{find_dependencies,write}): Adjust.
(module_state::check_read): Adjust.
(module_state::lazy_load, lazy_load_defn): Delete.
* ptree.c (cxx_print_decl): Remove lazy defn index.
From-SVN: r260545
gcc/cp/
* cp-tree.h (HAS_DECL_MODULE_LAZY_DEFN_P): New.
(MAYBE_DECL_MODULE_LAZY_DEFN): Use it.
* module.c (trees_in::{pre,}seed): Adjust.
(trees_out::{begin,end}): New overloads.
(trees_out::unmark_trees): Adjust.
(trees_out::{preseed,insert}): Correct force marking.
(trees_in::tree_node): Robustify.
(module_state::write_cluster): Fix name, better messages.
(module_state::read_cluster): Use ct_no_decl.
(module_state::write): Use HAS_DECL_MODULE_LAZY_DEFN_P.
(module_state::check_read): Use decl if it makes sense.
From-SVN: r260484
gcc/cp/
* module.c (module_server): Make pointer to non-const.
(server_fini, handle_module_option): Add const_cast.
(server_init): Write into module_server. strdup if it came from
environment.
From-SVN: r260201
gcc/
* diagnostic-core.h (fullname): Declare.
* diagnostic.c (fullname): Define.
* toplev.c (general_init): Set it.
gcc/cp/
* cxx-module-oracle.sh: More messages.
* module.c (oracle_init): When defaulting, expect to be next to
cc1plus.
(oracle_stream): Always try and init the oracle.
gcc/testsuite/
* g++.dg/modules/main_a.C: Adjust for oracle use.
From-SVN: r260173
gcc/cp/
* module.c (depset::tarjan::connect): Use section==0 for done, not
top bit of cluster.
gcc/testsuite/
* g++.dg/modules/unnamed-1_b.C: Fix scan.
From-SVN: r260014
gcc/cp/
* module.c (trees_out::maybe_tag_decl_type): New, broken out of ...
(trees_out::tree_decl): ... here. Call it.
(trees_out::tree_ref): Use it.
gcc/testsuite/
* g++.dg/modules/unnamed-1_[ab].C: New.
From-SVN: r259956
gcc/cp/
* cp-tree.h (DECL_TEMPLATE_INFO): Correct comment.
* module.c: Update description, general format cleanups.
(elf_out::SECTION_ALIGN): New.
(elf_out::pad): Use it.
* pt.c (build_template_decl): Make static.
From-SVN: r259673
gcc/cp/
* module.c (maybe_get_template): Delete.
(trees_out::tree_decl): Move dependency building into named-decl
handling. Don't walk into namespaces.
From-SVN: r259653
gcc/cp/
* module.c (trees_out::core_vals): Check module of type context.
(trees_out::tree_decl): Assert we can find the named decl.
(module_state::read_config): Move defrosting to ...
(module_state::read): ... here.
From-SVN: r259615
gcc/cp/
* module.c (module_state::importing): New.
(module_state::init): Lazy_open is not just for laziness.
(module_state::read_config): Call maybe_defrost.
(module_state::maybe_defrost): New, broken out of ...
(module_state::load_section): ... here. Call it.
(module_state::freeze_an_elf): Look in importing stack too.
gcc/testsuite/
* g++.dg/modules/nest-1_[abc].C: New.
From-SVN: r259560
gcc/cp/
* module.c (elf_in::keep_sections): New.
(elf_in::read): Add type arg.
(elf_in::find): Remove type arg.
(elf_in::begin): Coalesce error messages.
(module_state::loading): New field.
(module_state::{read,write}_config): Serialize section range ...
(module_state::{read,write}_namespace): ... not here.
(module_state::read_decls): Do not read the actual decls.
(module_state::read): ... do them here.
From-SVN: r259176
gcc/cp/
* module.c (bytes_in::{use,i,u,wi,str}): Use set_overrun.
(module_state::tng_write_cluster): Sort cluster here ...
(module_state::tng_write_bindings): ... not here.
(module_state::tng_read_bindings): Set refs_tng.
(trees_{in,out}::define_class): Don't deal with refs_tng here.
(trees_out::tree_binfo): Protect from dep_walk_p. Force insert
new tag.
From-SVN: r259045
Fix enum types, more globals
gcc/cp/
* module.c (module_state::maybe_add_global): New.
(module_state::init): Use it.
(trees_{in,out}::core_vals): Special case TYPE of unscoped enum.
From-SVN: r259028
gcc/cp/
* module.c (trees_{in,out}::tree_binfo): Serialize entire path.
(trees_{in,out}::tree_node): Adjust.
From-SVN: r259005
gcc/cp/
* module.c (class depset): Make a class, add accessors. Implicitly
hold name as first decl.
(depset::table::{maybe_insert,find}): New.
(depset::table::{append,find_exports,find_dependencies}): Adjust.
(depset::table::{write_bindings,write_namespaces}): Adjust.
(module_state::write_cluster): Adjust.
From-SVN: r258855
gcc/cp/
* module.c: More commenting.
(trees_{in,out}::define_class): Don't do AsBase here.
(trees_in::finish_type, trees_out::tree_node): Do it here.
(get_module_owner): Cleanup.
From-SVN: r257733
gcc/cp/
* module.c (elf): Make more constants private.
(elf_in::find): Swap args, default to PROGBITS.
(elf_out::add): Replace type and flags with string_p arg.
(bytes_in::begin, bytes_out::end): Adjust.
(data::set_crc): Store zero for no-crc.
From-SVN: r257689
Keep module-file map in module hash
gcc/cp/
* module.c (module_state): Add empty_p, get_module members.
Rename set_name, delete set_location.
(module_file_map, module_file): Delete.
(module_state::do_import): Use get_module, set filename here.
(declare_module): No need to set filename here.
(add_module_file): Use module_state::get_module.
From-SVN: r257569
c-family/
* c-common.h (module_output, module_files_map, module_files): Move
to cp-tree.h.
* c-common.c (module_output, module_files): Move to module.c
* c-opts.c (add_module_file): Move to cp-objcp-common.c.
(c_common_handle_option): Move modules options to
cp-objcp-common.c.
cp/
* cp-objcp-common.c (add_module_file): Moved from c-opts.c.
(cp_handle_option): New, from c_common_handle_option.
* cp-objcp-common.h (LANG_HOOKS_HANDLE_OPTION): Point at
cp_handle_option.
* cp-tree.h (module_output, module_files_map, module_files): Moved
from c-common.h.
* cp-module.c (module_output, module_files): Moved from c-common.c.
From-SVN: r257560
Imports use special tags
gcc/cp/
* module.c (module_context): Cope with C++ anon types.
(trees_{in,out}::tree_node_special): Deal with imports here ...
(trees_{in,out}::tree_node): ... not here.
gcc/testsuite/
* g++.dg/modules/class-3_[bd].C: Adjust.
From-SVN: r257425
Kill namespace module slot hackery
gcc/cp/
* name-lookup.c (module_binding_slot): Make CREATE a bool. Remove
namespace hackery.
(find_namespace_partition): Delete.
(merge_global_decl): Do not expect a namespace. Remove such
handling.
(push_module_binding): Likewise.
From-SVN: r257423
gcc/cp/
* decl2.c (c_parse_final_cleanups): Only finish_module if we started
it.
* module.c (module_hash_state): Move earlier.
(module_state): Add hash and modules static members.
(module_state::set_import): Replace do_import.
(module_state::lazy_init): Create the hash table and default
module.
(trees_in::tree_node_raw): Don't remap a public namespace.
(module_purview_p, module_interface_p): Adjust.
(module_state::do_import): Move more stuff to lazy_init.
(import_module): Deal with setting current module flags.
(finish_module): Adjust.
From-SVN: r257396
gcc/cp/
* module.c (cpm_reader::wi): Promote before shifting.
(cpms_out::define_class): Maybe define all members.
(cpms_out::maybe_tag_definition): Only implicit typedefs are
defined.
(cpms_{in,out}::core_vals): Don't stream value cache.
gcc/testsuite/
* g++.dg/modules/nested-1_[abc].C: New.
From-SVN: r255014
gcc/cp/
* cp-tree.h (TS_CP_BINDING, TS_CP_WRAPPER, LAST_TS_CP_ENUM): Delete.
(union lang_tree_node): Adjust GTY desc.
(cp_tree_node_structure): Take a tree_code.
* decl.c (cp_tree_node_structure): Take a tree_code.
* module.c (cpms_{out,in}::core_vals): Order clauses from
tree-core.h, use cp_tree_node_structure for C++.
From-SVN: r253689
gcc/c-family/
* c.opt (-fmodule-output): New option.
(-fmodule-file): New option.
* c-common.h (module_output): Declare.
(module_files): Declare.
* c-common.c (module_output): New variable.
(module_files): New variable.
* c-opts.c (add_module_file): New function.
* c-opts.c: Handle -fmodule-output and -fmodule-file.
gcc/cp/
* module.c (finish_module): If specified, use module_output as output
file name.
* module.c (do_module_import): Check module_files for a module name
to file mapping before falling back to default.
gcc/
* doc/invoke.texi (C++ Dialect Options): Document -fmodule-output
and -fmodule-file.
From-SVN: r250389
Start moving class name lookup around
gcc/cp/
* class.c (count_fields, add_fields_to_record_type)
(add_enum_fields_to_record_type, sorted_fields_type_new,
create_classtype_sorted_fields,
insert_late_enum_def_into_classtype_sorted_fields): Move to ...
* name-lookup.c: ... here.
(lookup_class_member): New. Broken out of ...
* search.c (lookup_field_1): ... here. Call it.
* name-lookup.h (lookup_class_member)
(create_classtype_sorted_fields,
insert_late_enum_def_into_classtype_sorted_fields): Declare.
From-SVN: r249661
More PoD struct
gcc/cp/
* module.c (cpms_{out,in}::ident_imported_decl): New.
(cpms_{out,in}::lang_type_vals): New.
(cpms_{out,in}::core_vals): Blocks and statements.
(cpms_{out,in}::tree_node): More general import referencing.
* name-lookup.c (name_lookup::add_module_fns): New.
(name_lookup::adl_namespace_only): Use it.
gcc/testsuite/
* g++.dg/modules/class-3_[abc].C: New.
From-SVN: r248928
gcc/cp/
* config-lang.in: Update.
* name-lookup.c (add_using_directive): Rename to ...
(add_using_namespace): ... here. Add FROM arg.
(do_toplevel_using_directive): Fold into add_using_namespace.
(finish_namespace_using_directive, finish_local_using_directive)
(push_namespace): Adjust.
From-SVN: r248338
gcc/cp/
* cp-tree.h (lookup_add, lookup_maybe_add): Swap args.
* name-lookup.c (name_lookup::merge_binding)
(adl_lookup::add_functions): Adjust.
* pt.c (check_explicit_specialization, do_class_deduction):
Adjust.
* tree.c (lookup_add, lookup_maybe_add): Adjust.
From-SVN: r248196
Overload bindings
gcc/cp/
* module.c (cpms_in::tag_binding, cpms_out::tag_binding): Binding
can be overload.
(cpms_out::write_bindings): Write whole slot.
(cpms_in::finish): Look for import match.
(cpms_in::read_lang_decl_bools, cpms_out::write_lang_decl_bools)
(cpms_in::read_lang_decl_vals, cpms_out::write_lang_decl_vals):
Implement.
* name-lookup.c (push_module_binding): Binding can be overload.
(find_module_instance): New.
* name-lookup.h (find_module_instance): Declare.
gcc/testsuite/
* g++.dg/modules/mod-impl-1_[abcd]: Add overloads.
From-SVN: r247611
Kill rt_import popping madness
gcc/cp/
* module.c (cpm_serial::rt_export): New tag.
(cpms_in::tag_import): Do the importing right now.
(cpms_in::read_item): ... not here.
(read_module): Pass in dump file name, open it if needed.
From-SVN: r247348
Namespace stat hack via special overload
gcc/cp/
* name-lookup.c (STAT_HACK_P, STAT_TYPE, STAT_DECL)
(MAYBE_STAT_DECL, MAYBE_STAT_TYPE): New.
(stat_hack): New.
(find_namespace_value, name_lookup::search_namespace_only)
(update_binding, do_pushdecl, set_identifier_type_value_with_scope,
set_global_value, lookup_type_scope_1, do_pushtag): Adjust.
From-SVN: r247185
gcc/cp/
* name-lookup.c (add_local_decl): Delete.
(name_lookup::add): Fold into ...
(name_lookup::search_namespace_only): ... here.
(update_binding): Allow pushing the same artificial type.
(do_pushdecl): Explicitly set namespace type.
(push_local_binding): Use add_decl_to_level.
(set_identifier_type_value_with_scope): Use update_binding for
namespaces.
From-SVN: r247112
gcc/cp/
* name-lookup.c (do_nonmember_using_decl): Pass in pointers to
value and type nodes.
(do_local_using_directive): Use current_binding_level directly.
(do_toplevel_using_directive): Adjust.
(lookup_type_current_level): Delete.
From-SVN: r247106
Unifying binding push part 5
gcc/cp/
* name-lookup.c (find_local_binding): New. Broken out of ...
(find_local_value): ... this. Use it.
(update_local_overload): New. Broken out of ...
(replace_local_overload_binding): ... this. Use it.
(do_pushdecl): Keep local binding around.
(local_push_binding): Consume supplement_binding, add_local_decl.
From-SVN: r246912
Separating namespace bindings part 15
gcc/cp/
* cp-tree.h (PUSH_GLOBAL, PUSH_LOCAL, PUSH_USING): Delete.
* lex.c (unqualified_name_lookup_error): Adjust push_local_binding
call.
* name-lookup.h (push_local_binding): Flags is now bool.
* name-lookup.c (matching_using_decl_p): Rename to ...
(matching_fn_p): ... here. Update callers.
(set_local_extern_decl_linkage): New. Broken out of ...
(do_pushdecl): ... here. Call it.
(push_local_binding, augment_local_overload_binding)
(do_local_push_overload, do_local_using_decl): Flags is now bool.
gcc/testsuite/
* g++.dg/lookup/extern-redecl1.C: New.
* g++.dg/parse/ctor9.C: Tweak.
From-SVN: r246852
gcc/cp/
* cp-tree.h (lang_decl_ns): Replace tree list ns_using, ns_users
with tree vector usings & inlinees.
(DECL_NAMESPACE_USING, DECL_NAMESPACE_INLINEES): Update.
(TREE_INDIRECT_USING): Delete.
* decl.c (cxx_init_decl_processing): Tweak.
* name-lookup.h (cp_binding_level): using_directives is a vec.
* name-lookup.c (name_lookup::do_queue_usings, queue_usings)
(search_namespace, search_usings, queue_namespace,
search_unqualified, assoc_namespace_only): inlinees and usings are
vectors. Remove old TREE_LIST code.
(namespace_ancestor_1, namespace_ancestor, add_using_namespace_1)
add_using_namespace): Delete.
(qualified_namespace_lookup): Tweak.
(add_using_directive): New.
(do_toplevel_using_directive, do_local_using_directive): Adjust.
(push_namespace): Adjust.
* tree.c (decl_anon_ns_mem_p): Reimplement.
(cp_free_lang_data): Update.
From-SVN: r246647
gcc/cp/
* name-lookup.c (name_lookup::scopes): Make static. Adjust uses.
(name_lookup::search_namespace_only): Broken out of ...
(name_lookup::search_namespace): ... here. Call it.
* tree.c (ovl_cache): New.
(ovl_make, ovl_lookup_keep): Use it.
From-SVN: r246615
Cleanup using-directives
gcc/cp/
* cp-tree.h (LOOKUP_MARKED_P): Rename from NAME_MARKED_P, update
all.
* name-lookup.h (finish_toplevel_using_directive)
(finish_local_using_directive): Declare.
(do_using_directive, parse_using_directive): Delete.
* name-lookup.c (do_using_directive): Delete.
(do_toplevel_using_directive): Reimplement.
(do_local_using_directive, do_local_using_directive_1): New.
(parse_using_directive): Delete.
(push_using_directive, push_using_directive_1): Delete.
(finish_toplevel_using_directive, finish_local_using_directive):
New.
* pt.c (tsubst_expr): Adjust.
* parser.c (cp_parser_using_directive): Adjust.
From-SVN: r246535
gcc/cp/
* cp-tree.h (OVL_P, OVL_PLURAL_P): New.
* name-lookup.c (name_lookup::add, diagnose_name_conflict)
(pushdecl_maybe_friend_1, push_overloaded_decl_1)
(do_nonmember_using_decl, push_class_level_binding_1)
(set_decl_namespace, lookup_arg_dependent_1): Use OVL_P.
From-SVN: r246421
Keep hidden overloads at start of list
gcc/cp/
* cp-tree.h (OVL_HIDDEN_P): New.
(ovl_iterator::hidden_p, unhide): New.
(DECL_HIDDEN_P): New.
(hidden_name_p, remove_hidden_names): Delete.
(ovl_skip_hidden): Declare.
* decl.c (builtin_function_1): Set DECL_ANTICIPATED before
pushing.
(xref_tag_1): Replace hidden_name_p with DECL_HIDDEN_P.
* name-lookup.c (anticipated_builtin_p)
(skip_anticipated_builtins): New.
(supplement_binding_1): Use anticipated_builtin_p.
(replace_local_overload_binding): New. Broken out of
augment_local_overload_binding.
(fixup_unhidden_decl): New.
(pushdecl_maybe_friend_1): Deal with unhiding decl. Set
DECL_ANTICIPATED before really pushing.
(augment_local_overload_binding): Call
replace_local_overload_binding.
(push_overloaded_decl_1): Deal with unhiding decl.
(do_nonmember_using_decl): Use anticipated_builtin_p.
(ambiguous_decls): Use ovl_skip_hidden.
(lookup_name_real_1): Use DECL_HIDDEN_P, ovl_skip_hidden.
(arg_assoc_namespace): Use DECL_HIDDEN_P.
(lookup_arg_dependent_1): Use ovl_skip_hidden.
* pt.c (instantiate_class_template): Use DECL_HIDDEN_P.
* tree.c (ovl_move_unhidden): New.
(ovl_add): Deal with hiddenness.
(ovl_add_transient): Adjust.
(hidden_name_p, remove_hidden_names): Delete.
(ovl_skip_hidden): New.
(ovl_iterator::ovl_unhide): New.
gcc/testsuite/
* g++.dg/lookup/friend19.C: New.
* g++.dg/lookup/using56.C: New.
From-SVN: r246332
New OVERLOAD representation part 6
gcc/cp/
* cp-tree.h: Move ovl handling fns to original location.
* tree.c (ovl_add): Use ovl_make.
(ovl_add_transient): Use ovl_add.
From-SVN: r246198
New OVERLOAD representation part 1
gcc/cp/
* cp-tree.h (OVL_LENGTH, OVL_USINGS, OVL_FIRST, OVL_NAME)
(OVL_SINGLE, OVL_ELT, OVL_HAS_USING, OVL_HAS_HIDDEN): New.
(ovl_iterator): Implement new-style iterator.
* tree.c (tree_ovl_elt_check_failed): New.
(ovl_maybe_keep): Fixup bracing.
(ovl_scope): Add new smarts.
From-SVN: r246172
Vectorize OVERLOAD part 6
gcc/cp/
* cp-tree.h (OVL_TRANSIENT): Rename from OVL_ARG_DEPENDENT.
* tree.c (ovl_maybe_keep): Adjust.
* name-lookup.c (set_namespace_binding_1): Use OVL_SINGLE.
(lookup_name_real_1): Use OVL_FIRST.
(add_function): Set OVL_TRANSIENT.
(lookup_arg_dependent_1): Use OVL_FIRST.
(cp_emit_debug_info_for_using): Use iterator.
From-SVN: r246159
Add OVERLOAD iterator part 6
gcc/cp/
* name-lookup.c (pushdecl_maybe_friend_1): Use OVL_FIRST.
(push_overloaded_decl_1): Use iterators.
(consider_binding_level): Use OVL_FIRST.
From-SVN: r246157
Kill get_first_fn part 3
gcc/cp/
* cp-tree.h (get_first_fn): Delete.
* pt.c (iterative_hash_template_arg, tsubst_copy_and_build): Use
get_ovl.
(tsubst_baselink): Use OVL_NAME.
* typeck.c (invalid_nonstatic_memfn_p, build_x_unary_op)
(cp_build_addr_expr_1): Use get_ovl.
(finish_class_member_access_expr): Use OVL_NAME.
* tree.c (dependent_name): Recode.
(get_first_fn): Delete.
From-SVN: r246104
Kill get_first_fn part 2
gcc/cp/
* cp-tree.h (get_ovl): Add want_first parm, make pure.
* constexpr.c (potential_constant_expression_1): Adjust get_ovl.
* lambda.c (lambda_function): Likewise.
* error.c (dump_decl): Use identifier_p.
* mangle.c (write_expression): Use OVL_NAME.
* name-lookup.c (pushdecl_class_level): Likewise.
* parser.c (cp_parser_postfix_expression)
(cp_parser_expression_statement, cp_parser_direct_declarator)
(cp_parser_constructor_declarator_p): Use get_ovl.
* search.c (lookup_member): Likewise.
* typeck2.c (cxx_incomplete_type_diagnostic): Likewise.
* semantics.c (perform_koenig_lookup): Use OVL_NAME.
(finish_call_expr, finish_id_expression): Use get_ovl.
* tree.c (get_ovl): Add want_first arg, adjust.
From-SVN: r246102
Kill build_overload
gcc/cp/
* cp-tree.h (ovl_add): Declare.
(build_overload): Delete.
* tree.c (ovl_add): New.
(build_overload): Delete.
* class.c (add_method): Use ovl_add.
* constraint.cc (finish_shorthand_constraint): Likewise.
* name-lookup.c (do_nonmember_using_decl, merge_functions)
(remove_hidden_names, add_function): Likewise.
* pt.c (make_constrained_auto): Likewise.
From-SVN: r246094
Rename OVL_USED
gcc/cp/
* cp-tree.h (OVL_USED): Rename to ...
(OVL_VIA_USING): ... here.
* class.c (add_method): Update.
* tree.c (ovl_scope): Update.
* search.c (lookup_field_r): Update.
* name-lookup.c (push_overloaded_decl_1)
(do_nonmember_using_decl): Update.
From-SVN: r246093
Add OVERLOAD iterator part 6
gcc/cp/
* search.c (lookup_field_fuzzy_info::fuzzy_lookup_fn)
(lookup_conversion_operator, lookup_fnfields_idx_nolazy)
(lookup_conversions_r): Use new accessors.
From-SVN: r246057
Add OVERLOAD iterator part 5
gcc/cp/
* decl2.c (mark_used): Use new accessors.
* dump.c (cp_dump_tree): Likewise.
* error.c (dump_decl, dump_expr, location_of): Likewise.
* init.c (build_offset_ref): Likewise.
* parser.c (cp_parser_nested_name_specifier_opt)
(cp_parser_lookup_name): Likewise.
* pt.c (check_explicit_specialization, check_template_shadow)
(tsubst_baselink): Likewise.
* ptree.c (cxx_print_xnode): Likewise.
From-SVN: r246056
Add OVERLOAD iterator part 4
gcc/cp/
* cp-tree.h (OVL_FIRST, OVL_NAME): New accessors.
* call.c (build_user_type_conversion_1)
(print_error_for_call_failure, add_candidates): Use them.
* class.c (method_name_cmp, resort_method_name_cmp)
(resort_type_method_vec, finish_struct_methods, warn_hidden)
(resolve_address_of_overloaded_function)
(note_name_declared_in_class): Likewise.
* cxx-pretty-print.c (pp_cxx_unqualified_id, pp_cxx_qualified_id)
(cxx_pretty_printer::id_expression)
(cxx_pretty_printer::expression): Likewise.
* decl.c (poplevel): Likewise.
* mangle.c (write_member_name): Likewise.
* method.c (strip_inheriting_ctors): Likewise.
* typeck.c (cp_build_addr_expr_1): Likewise.
From-SVN: r246050
Add OVERLOAD iterator, part 1
Add MAYBE_BASELINK_FUNCTIONS
gcc/cp/
* cp-tree.h (struct ovl_iterator): New.
(MAYBE_BASELINK_FUNCTIONS): New.
* call.c (build_op_call_1, add_candidates)
(build_op_delete_call): Use new iterator.
* lambda.c (maybe_generic_this_capture): Likewise.
* method.c (inherited_ctor_binfo, binfo_inherited_from): Likewise.
* parser.c (lookup_literal_operator)
(cp_parser_template_name): Likewise.
* search.c (shared_member_p, look_for_overrides_here)
(lookup_conversions_r): Likewise.
* semantics.c (finish_call_expr)
(classtype_has_nothrow_assign_or_copy_p): Likewise.
* typeck.c (check_template_keyword): Likewise.
* tree.c (is_overloaded_fn, get_fns): Use MAYBE_BASELINK_FUNCTIONS.
From-SVN: r246031
gcc/cp/
* cp-tree.h (NAMESPACE_CHECK): Delete.
(MODULE_NAMESPACE_P, GLOBAL_MODULE_NAMESPACE)
(NAMESPACE_INLINE_P): Adjust.
From-SVN: r246030
Canonicalize canonical type hashing
gcc/
* tree.h (type_hash_default): Declare.
* tree.c (type_hash_list, attribute_hash_list): Move into
type_hash_default.
(build_type_attribute_qual_variant): Break out hash code calc into
type_hash_default.
(type_hash_default): New. Generic type hash computation.
(build_range_type_1, build_array_type_1, build_function_type)
(build_method_type_directly, build_offset_type, build_complex_type)
(make_vector_type): Call it.
gcc/c-family/
* c-common.c (complete_array_type): Use type_hash_default.
From-SVN: r245550
/* Modeling API uses and misuses via state machines.
Copyright (C) 2019-2020 Free Software Foundation, Inc.
Copyright (C) 2019-2021 Free Software Foundation, Inc.
Contributed by David Malcolm <dmalcolm@redhat.com>.
This file is part of GCC.