a68: scanner fixes for bits denotations

This commit:

1. Fixes a bug in the scanner so that it now checks that the digits in
   a bits denotation are valid for its radix.

2. Disallows typographical display features between the digits of bits
   denotations in SUPPER stropping when the radix is 16.  This is to
   avoid confusing situations like the one described in the scanner
   comment below.

3. Adds a few tests.

4. Fixes an existing test that assumed that bits denotations with
   radix 10 are allowed.  The report allows radixes 2, 4, 8 and 16.

Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>

gcc/algol68/ChangeLog

	* a68-parser-scanner.cc (get_next_token): Bits denotation parsing
	fixes.
	* ga68.texi (SUPPER stropping): Document special rule for bits
	denotations with radix 16.

gcc/testsuite/ChangeLog

	* algol68/compile/error-radix-1.a68: New test.
	* algol68/compile/radix-hex-upper-1.a68: Likewise.
	* algol68/compile/radix-hex-supper-1.a68: Likewise.
	* algol68/compile/error-radix-4.a68: Likewise.
	* algol68/compile/error-radix-3.a68: Likewise.
	* algol68/compile/error-radix-2.a68: Likewise.
	* algol68/execute/environment-enquiries-6.a68: Do not use radix 10
	in bits denotations.
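
For illustration only, the per-radix digit check from item 1 can be sketched outside the scanner as below. This is not code from the patch: `digit_ok_for_radix`, `valid_digit_prefix` and `demo` are hypothetical names, and the sketch works on a plain string rather than on the scanner's `next_char` machinery, so it ignores stropping and typographical display features entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the patch: true if C is a valid digit
// for RADIX.  Only the radixes named in the commit message (2, 4, 8, 16)
// are accepted; anything else, including 10, yields false.
static bool
digit_ok_for_radix (int radix, char c)
{
  switch (radix)
    {
    case 2:
      return c == '0' || c == '1';
    case 4:
      return c >= '0' && c <= '3';
    case 8:
      return c >= '0' && c <= '7';
    case 16:
      return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f');
    default:
      return false;
    }
}

// Hypothetical helper: number of leading characters of DIGITS that are
// valid for RADIX.  This mirrors how a scanner loop stops consuming at
// the first character that does not belong to the denotation.
static std::size_t
valid_digit_prefix (int radix, const std::string &digits)
{
  std::size_t n = 0;
  while (n < digits.size () && digit_ok_for_radix (radix, digits[n]))
    ++n;
  return n;
}

// Inputs taken from the new error-radix-*.a68 tests.
static void
demo ()
{
  assert (valid_digit_prefix (2, "03") == 1);   // 2r03: '3' is rejected
  assert (valid_digit_prefix (8, "09") == 1);   // 8r09: '9' is rejected
  assert (valid_digit_prefix (8, "ab") == 0);   // 8rab: no valid digit
  assert (valid_digit_prefix (16, "fg") == 1);  // 16rfg: 'g' is rejected
}
```

In each of these cases the scan stops at the first digit that is out of range for the radix, so the offending characters are left for the following token, which is where an error would then be reported.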
2026-02-22 03:47:02 -05:00 · 2025-12-30 00:11:31 +01:00
parent cc515677b9
commit 6839de7f03
9 changed files with 64 additions and 7 deletions
--- a/gcc/algol68/a68-parser-scanner.cc
+++ b/gcc/algol68/a68-parser-scanner.cc
@@ -1663,13 +1663,37 @@ get_next_token (bool in_format,
 	}
      else if (is_radix_char (ref_l, ref_s, &c))
 	{
+	  /* Parse the radix, which is expressed in base 10.  */
 	  (sym++)[0] = c;
+	  char *end;
+	  int64_t radix = strtol (A68_PARSER (scan_buf), &end, 10);
+	  gcc_assert (end != A68_PARSER (scan_buf) && *end == 'r');
+
+	  /* Get the rest of the bits literal.  Typographical display features
+	     are allowed in the reference language between the digit symbols
+	     composing the denotation.  However, in SUPPER stropping this could
+	     lead to confusing situations like:
+
+	       while bitmask /= 16r0 do ~ od
+
+	     Where the scanner would recognize a bits denotation 16r0d and then
+	     the parser would complain about a missing 'do'.  This is not a
+	     problem in UPPER stropping since D is not a valid hexadecimal
+	     digit.
+
+	     To avoid confusing errors, in SUPPER stropping we do not allow
+	     typographical display features in bits denotations when the radix
+	     is 16.  */
 	  c = next_char (ref_l, ref_s, true);
-	  /* This is valid for both UPPER and SUPPER stropping.  */
-	  while (ISDIGIT (c) || strchr ("abcdef", c) != NO_TEXT)
+	  while (((radix == 2 && (c == '0' || c == '1'))
+		  || (radix == 4 && (c >= '0' && c <= '3'))
+		  || (radix == 8 && (c >= '0' && c <= '7'))
+		  || (radix == 16 && (ISDIGIT (c) || strchr ("abcdef", c) != NO_TEXT))))
 	    {
 	      (sym++)[0] = c;
-	      c = next_char (ref_l, ref_s, true);
+	      c = next_char (ref_l, ref_s,
+			     OPTION_STROPPING (&A68_JOB) != SUPPER_STROPPING
+			     || radix != 16);
 	    }
 	  *att = BITS_DENOTATION;
 	}
--- a/gcc/algol68/ga68.texi
+++ b/gcc/algol68/ga68.texi
@@ -1312,7 +1312,10 @@ that, when they appear between symbols, are of no significance and do
 not alter the meaning of the program.  However, when a space or a tab
 appear in string or character denotations, they represent the
@code{space symbol} and the @code{tab symbol}
-respectively@footnote{The @code{tab symbol} is a GNU extension}.
+respectively@footnote{The @code{tab symbol} is a GNU extension}.  The
+different stropping regimes, however, may impose specific restrictions
+on where typographical display features may or may not
+appear. @xref{Stropping regimes}.

@node Worthy characters
@section Worthy characters
@@ -1698,6 +1701,23 @@ The underscore characters are not really part of the tag, but part of
 the stropping.  For example, both @code{goto found_zero} and
@code{goto foundzero} jump to the same label.

+In general, typographical display features are allowed between any
+symbol in the written program.  In SUPPER stropping, however, it is
+not allowed to place spaces or tab characters between the constituent
+digits of bits denotations when the radix is 16.  This is to avoid
+confusing situations like the following invalid program:
+
+@example
+@B{while} bitmask /= 16r0 @B{do} ~ @B{od}
+@end example
+
+@noindent
+Where the bits denotation would be interpreted as @code{16r0d} rather
+than @code{16r0}, leading to a syntax error.  Note however that
+typographical display features are still allowed between the radix
+part and the digits, so @code{16r aabb} is valid also in SUPPER
+stropping.
+
 The @code{recsel output records} procedure, encoded in SUPPER
 stropping, looks like below.

--- a/gcc/testsuite/algol68/compile/error-radix-1.a68
+++ b/gcc/testsuite/algol68/compile/error-radix-1.a68
@@ -0,0 +1 @@
+(bits a = 2r03; skip) { dg-error "" }
--- a/gcc/testsuite/algol68/compile/error-radix-2.a68
+++ b/gcc/testsuite/algol68/compile/error-radix-2.a68
@@ -0,0 +1 @@
+(bits b = 8r09; skip)  { dg-error "" }
--- a/gcc/testsuite/algol68/compile/error-radix-3.a68
+++ b/gcc/testsuite/algol68/compile/error-radix-3.a68
@@ -0,0 +1 @@
+(bits c = 8rab, skip) { dg-error "" }
--- a/gcc/testsuite/algol68/compile/error-radix-4.a68
+++ b/gcc/testsuite/algol68/compile/error-radix-4.a68
@@ -0,0 +1 @@
+(bits d = 16rfg; skip) { dg-error "" }
--- a/gcc/testsuite/algol68/compile/radix-hex-supper-1.a68
+++ b/gcc/testsuite/algol68/compile/radix-hex-supper-1.a68
@@ -0,0 +1,3 @@
+{ Make sure that in supper stropping the bits
+  denotation is parsed as 16rab and not 16rabd  }
+while bits foo; foo /= 16r ab do ~ od
--- a/gcc/testsuite/algol68/compile/radix-hex-upper-1.a68
+++ b/gcc/testsuite/algol68/compile/radix-hex-upper-1.a68
@@ -0,0 +1,6 @@
+# { dg-options "-fstropping=upper" } #
+
+# Typographical display features are allowed between
+  digit symbols in hexadecimal bits denotations in UPPER
+  stropping.  #
+WHILE BITS foo; foo /= 16r a b DO ~ OD
--- a/gcc/testsuite/algol68/execute/environment-enquiries-6.a68
+++ b/gcc/testsuite/algol68/execute/environment-enquiries-6.a68
@@ -1,7 +1,7 @@
 # { dg-options "-fstropping=upper" }  #
 # Environment enquiries for SIZETY BITs #
-BEGIN ASSERT (max bits /= 10r0);
+BEGIN ASSERT (max bits /= 2r0);
      # XXX use LENG max bits below #
-      ASSERT (long max bits >= LONG 10r0);
-      ASSERT (long long max bits >= LONG LONG 10r0)
+      ASSERT (long max bits >= LONG 16r0);
+      ASSERT (long long max bits >= LONG LONG 4r0)
 END