middle-end: Fix logical shift truncation (PR rtl-optimization/91838) (gcc-9 backport)

This fixes fallout from a patch I submitted two years ago which started
allowing simplify-rtx to fold two logical right shifts, by offsets a followed
by b, into a single shift by (a + b).

However, this can generate an out-of-range shift when the resulting shift
count ends up equal to the size of the shift mode, which is undefined
behavior on most platforms.
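To make the problem concrete, here is a minimal C++ sketch (not from the patch) contrasting the two forms:

```cpp
#include <cassert>
#include <cstdint>

// Two well-defined logical right shifts whose counts sum to the full
// 32-bit width; the two-step form always yields 0.
uint32_t two_shifts(uint32_t x) { return (x >> 16) >> 16; }

// The fold rewrites this as a single `x >> 32`, whose count equals the
// bit width -- undefined behavior in C/C++ and unpredictable at the RTL
// level on most targets:
//   uint32_t folded(uint32_t x) { return x >> 32; }  // count == width: UB
```

The two-step version is the semantics the fold must preserve, so an out-of-range combined count has to be resolved explicitly rather than emitted as-is.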

This patch changes the code to truncate the result to 0 if the shift amount
goes out of range.  Before my earlier patch this truncation used to happen in
combine when it saw the two shifts.  However, since we now fold them here,
combine never gets a chance to truncate them.
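The truncation rule can be sketched as follows; the helper name and the `mode_bits` constant are illustrative stand-ins, not GCC internals:

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned mode_bits = 32;  // stands in for GET_MODE_BITSIZE (mode)

// Sketch of the rule the patch applies when folding two logical right
// shifts into one: a combined count at or beyond the mode width is
// replaced by the constant 0 instead of being emitted as a shift.
uint32_t fold_lshiftrt(uint32_t x, unsigned a, unsigned b) {
  unsigned total = a + b;
  if (total >= mode_bits)
    return 0;           // out of range: zero the result immediately
  return x >> total;    // in range: one combined shift is safe
}
```

This mirrors what combine used to do for the two-shift form, just applied at the point where the fold now happens.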

The issue mostly affects GCC 8 and 9, since on GCC 10 the back end knows how
to deal with this shift constant; but it's better to do the right thing in
simplify-rtx regardless.

Note that this doesn't take care of the arithmetic shift case, where the
constant could instead be replaced with MODE_BITS (mode) - 1, but that's not
a regression, so I'm punting on it.
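As a sketch of that untaken direction (illustrative names, not the patch): for an arithmetic right shift, clamping an out-of-range count to MODE_BITS (mode) - 1 preserves the sign-fill result instead of zeroing it.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned mode_bits = 32;  // stands in for GET_MODE_BITSIZE (mode)

// Clamping to mode_bits - 1 keeps the arithmetic-shift semantics:
// the result is -1 for negative inputs and 0 for non-negative ones.
int32_t fold_ashiftrt(int32_t x, unsigned a, unsigned b) {
  unsigned total = a + b;
  if (total >= mode_bits)
    total = mode_bits - 1;  // clamp instead of zeroing
  return x >> total;
}
```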

gcc/ChangeLog:

	Backport from mainline
	2020-01-31  Tamar Christina  <tamar.christina@arm.com>

	PR rtl-optimization/91838
	* simplify-rtx.c (simplify_binary_operation_1): Update LSHIFTRT case
	to truncate if allowed or reject combination.

gcc/testsuite/ChangeLog:

	Backport from mainline
	2020-01-31  Tamar Christina  <tamar.christina@arm.com>
		    Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/91838
	* g++.dg/opt/pr91838.C: New test.
commit f6e9ae4da8
parent 9248369630
Tamar Christina, 2020-02-11 10:50:12 +00:00
4 changed files with 44 additions and 3 deletions

gcc/ChangeLog

@@ -1,3 +1,12 @@
+2020-02-11  Tamar Christina  <tamar.christina@arm.com>
+
+	Backport from mainline
+	2020-01-31  Tamar Christina  <tamar.christina@arm.com>
+
+	PR rtl-optimization/91838
+	* simplify-rtx.c (simplify_binary_operation_1): Update LSHIFTRT case
+	to truncate if allowed or reject combination.
+
 2020-02-07  H.J. Lu  <hongjiu.lu@intel.com>
 
 	Backport from mainline

gcc/simplify-rtx.c

@@ -3519,9 +3519,21 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
 	    {
 	      rtx tmp = gen_int_shift_amount
 		(inner_mode, INTVAL (XEXP (SUBREG_REG (op0), 1)) + INTVAL (op1));
-	      tmp = simplify_gen_binary (code, inner_mode,
-					 XEXP (SUBREG_REG (op0), 0),
-					 tmp);
+
+	      /* Combine would usually zero out the value when combining two
+		 local shifts and the range becomes larger or equal to the mode.
+		 However since we fold away one of the shifts here combine won't
+		 see it so we should immediately zero the result if it's out of
+		 range.  */
+	      if (code == LSHIFTRT
+		  && INTVAL (tmp) >= GET_MODE_BITSIZE (inner_mode))
+		tmp = const0_rtx;
+	      else
+		tmp = simplify_gen_binary (code,
+					   inner_mode,
+					   XEXP (SUBREG_REG (op0), 0),
+					   tmp);
 	      return lowpart_subreg (int_mode, tmp, inner_mode);
 	    }

gcc/testsuite/ChangeLog

@@ -1,3 +1,12 @@
+2020-02-11  Tamar Christina  <tamar.christina@arm.com>
+
+	Backport from mainline
+	2020-01-31  Tamar Christina  <tamar.christina@arm.com>
+		    Jakub Jelinek  <jakub@redhat.com>
+
+	PR rtl-optimization/91838
+	* g++.dg/opt/pr91838.C: New test.
+
 2020-02-10  H.J. Lu  <hongjiu.lu@intel.com>
 
 	Backport from mainline

gcc/testsuite/g++.dg/opt/pr91838.C

@@ -0,0 +1,11 @@
+/* { dg-do compile { target c++11 } } */
+/* { dg-additional-options "-O2 -Wno-psabi -w" } */
+/* { dg-additional-options "-masm=att" { target i?86-*-* x86_64-*-* } } */
+using T = unsigned char; // or ushort
+using V [[gnu::vector_size(8)]] = T;
+V f(V x) {
+  return x >> 8 * sizeof(T);
+}
+/* { dg-final { scan-assembler {pxor\s+%xmm0,\s+%xmm0} { target { { i?86-*-* x86_64-*-* } && lp64 } } } } */
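For reference, a scalar emulation (an assumption for illustration, not the generated code) of what the test expects the element-wise vector shift to compute: each 8-bit lane shifted by the full element width becomes 0, which the patched fold lets GCC emit as a single `pxor %xmm0, %xmm0`.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Scalar stand-in for the test's V >> n: the shift applies per element,
// and a count at or beyond the 8-bit element width zeroes the lane.
std::array<uint8_t, 8> shift_lanes(std::array<uint8_t, 8> v, unsigned n) {
  for (uint8_t &e : v)
    e = (n >= 8) ? 0 : uint8_t(e >> n);
  return v;
}
```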