poke-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2] libpoke: Enable octal and hex \-sequence in string litera


From: Jose E. Marchesi
Subject: Re: [PATCH v2] libpoke: Enable octal and hex \-sequence in string literals
Date: Tue, 03 Nov 2020 12:58:27 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

Hi Mohammad.

> 2020-11-03  Mohammad-Reza Nabipoor  <m.nabipoor@yahoo.com>
>
>       * libpoke/pkl-trans.c (pkl_trans1_ps_string): Add support for octal
>       and hexadecimal escape sequences.
>       * doc/poke.texi (String Literals): Document new escape sequences.
>       * testsuite/poke.pkl/strings-esc-1.pk: New test.
>       * testsuite/poke.pkl/strings-esc-2.pk: Likewise.
>       * testsuite/poke.pkl/strings-esc-diag-1.pk: Likewise.
>       * testsuite/poke.pkl/strings-esc-diag-2.pk: Likewise.
>       * testsuite/poke.pkl/strings-esc-diag-3.pk: Likewise.
>       * testsuite/poke.pkl/strings-esc-diag-4.pk: Likewise.
>       * testsuite/poke.pkl/string-diag-1.pk: `\x` is now a valid sequence;
>       use `\z` as an invalid escape sequence.
>       * testsuite/Makefile.am (EXTRA_DIST): Add new tests.
> ---
>
> Hi, Jose.
>
> Thanks for your answers and comments.
>
> Changes from version 1:
>   - Added documentation in `poke.texi`.
>   - Updateed tests to add more normal characters.
>   - Changed interpretation of octal escape sequences.
>     Old behavior: `"\400" == " 0"` (space character + 0 character).
>         Because 0o400 does not fit inside a byte it treats the last digit
>         as a separate character (not part of the octal number).
>       This behavior was incompatible with C compilers (gcc emits a warning
>         and clang emits an error). I droped this "feature" in favor of an
>         error.
>     New behavior: `"\400"` (error: octal escape sequence out of range).
>
>
> Now two more questions :)
>   - Is documentation understandable?

I think it is.  But see a few notes below :)

>   - Do you like the new behavior regarding octal escape sequences?

Makes sense to me, to error if the specified quantity overflows a byte.

>
> Regards,
> Mohammad-Reza
>
>
>  ChangeLog                                |  15 +++
>  doc/poke.texi                            |  11 ++
>  libpoke/pkl-trans.c                      | 126 ++++++++++++++++++-----
>  testsuite/Makefile.am                    |   6 ++
>  testsuite/poke.pkl/string-diag-1.pk      |   2 +-
>  testsuite/poke.pkl/strings-esc-1.pk      |  25 +++++
>  testsuite/poke.pkl/strings-esc-2.pk      |  25 +++++
>  testsuite/poke.pkl/strings-esc-diag-1.pk |   3 +
>  testsuite/poke.pkl/strings-esc-diag-2.pk |   3 +
>  testsuite/poke.pkl/strings-esc-diag-3.pk |   5 +
>  testsuite/poke.pkl/strings-esc-diag-4.pk |   5 +
>  11 files changed, 200 insertions(+), 26 deletions(-)
>  create mode 100644 testsuite/poke.pkl/strings-esc-1.pk
>  create mode 100644 testsuite/poke.pkl/strings-esc-2.pk
>  create mode 100644 testsuite/poke.pkl/strings-esc-diag-1.pk
>  create mode 100644 testsuite/poke.pkl/strings-esc-diag-2.pk
>  create mode 100644 testsuite/poke.pkl/strings-esc-diag-3.pk
>  create mode 100644 testsuite/poke.pkl/strings-esc-diag-4.pk
>
> diff --git a/ChangeLog b/ChangeLog
> index 9098ddd2..9cf32a73 100644
> --- a/ChangeLog
> +++ b/ChangeLog
> @@ -1,3 +1,18 @@
> +2020-11-03  Mohammad-Reza Nabipoor  <m.nabipoor@yahoo.com>
> +
> +     * libpoke/pkl-trans.c (pkl_trans1_ps_string): Add support for octal
> +     and hexadecimal escape sequences.
> +     * doc/poke.texi (String Literals): Document new escape sequences.
> +     * testsuite/poke.pkl/strings-esc-1.pk: New test.
> +     * testsuite/poke.pkl/strings-esc-2.pk: Likewise.
> +     * testsuite/poke.pkl/strings-esc-diag-1.pk: Likewise.
> +     * testsuite/poke.pkl/strings-esc-diag-2.pk: Likewise.
> +     * testsuite/poke.pkl/strings-esc-diag-3.pk: Likewise.
> +     * testsuite/poke.pkl/strings-esc-diag-4.pk: Likewise.
> +     * testsuite/poke.pkl/string-diag-1.pk: `\x` is now a valid sequence;
> +     use `\z` as an invalid escape sequence.
> +     * testsuite/Makefile.am (EXTRA_DIST): Add new tests.
> +
>  2020-11-01  Egeyar Bagcioglu  <egeyar@gmail.com>
>  
>       * libpoke/ios-dev-file.h: Include errno.h.
> diff --git a/doc/poke.texi b/doc/poke.texi
> index fb3aad87..16d81f5c 100644
> --- a/doc/poke.texi
> +++ b/doc/poke.texi
> @@ -6133,8 +6133,19 @@ Denotes an horizontal tab.
>  Denotes a backlash @code{\} character.
>  @item \"
>  Denotes a double-quote @code{"} character.
> +@item \o
> +@item \oo
> +@item \ooo
> +Denotes an octal number (@code{o} represents an octal digit).
> +The number should be less than 256 (0o400).

I would use @var{num} instead of these `o', `oo' and `ooo'.  Then you
can say that NUM is a number expressed with octal digits from 0 to 255.

> +@item \xX
> +@item \xXX
> +Denotes a hexadecimal number (@code{X} represents a hexadecimal digit).

Ditto for @var{X}.  Better to use \x@var{num} IMO.

>  @end table
>  
> +Due to the fact that strings in Poke are @code{NULL}-terminated, inserting
> +a @code{NULL} character using escape sequence leads to compilation error.

I would make it explicit that in this context a "NULL character" is the
character with code 0.

> +
>  @node String Types
>  @subsection String Types
>  
> diff --git a/libpoke/pkl-trans.c b/libpoke/pkl-trans.c
> index 18366ca0..6f414e8a 100644
> --- a/libpoke/pkl-trans.c
> +++ b/libpoke/pkl-trans.c
> @@ -20,6 +20,7 @@
>  
>  #include <gettext.h>
>  #define _(str) gettext (str)
> +#include <ctype.h>
>  #include <stdio.h>
>  #include <string.h>
>  #include <xalloc.h>
> @@ -304,6 +305,10 @@ PKL_PHASE_BEGIN_HANDLER (pkl_trans1_ps_string)
>    size_t string_length, i;
>    bool found_backslash = false;
>  
> +#define ISODIGIT(c) ((unsigned)(c) - '0' < 8) // is octal digit
> +#define XDIGIT(x) \
> +  ((unsigned)(x) - '0' < 10 ? (x) - '0' : ((x) | 0x20) - 'a' + 10)
> +

Please do not use C++ comments.

>    /* Please keep this code in sync with the string printer in
>       pvm-val.c:pvm_print_val.  */
>  
> @@ -311,27 +316,45 @@ PKL_PHASE_BEGIN_HANDLER (pkl_trans1_ps_string)
>       \-expansion, and report errors in the contents of the string.  */
>    for (p = string_pointer, string_length = 0; *p != '\0'; ++p)
>      {
> -      if (p[0] == '\\')
> +      string_length++;
> +      if (*p != '\\')
> +        continue;
> +
> +      found_backslash = true;
> +      ++p;
> +
> +      if (ISODIGIT (p[0]))
> +        {
> +          if (ISODIGIT (p[1]))
> +            p += ISODIGIT (p[2]) ? 2 : 1; // reason: 0377 == 0xff

Likewise.

> +          continue;
> +        }
> +
> +      switch (*p)
>          {
> -          switch (p[1])
> +        case '\\':
> +        case 'n':
> +        case 't':
> +        case '"':
> +          break;
> +        case 'x':
> +          ++p;
> +          if (!isxdigit (p[0]))
>              {
> -            case '\\':
> -            case 'n':
> -            case 't':
> -            case '"':
> -              string_length++;
> -              break;
> -            default:
>                PKL_ERROR (PKL_AST_LOC (string),
> -                         "invalid \\%c sequence in string", p[1]);
> +                         _ ("\\x used with no following hex digits"));
>                PKL_TRANS_PAYLOAD->errors++;
>                PKL_PASS_ERROR;
>              }
> -          p++;
> -          found_backslash = true;
> +          if (isxdigit (p[1]))
> +            ++p;
> +          break;
> +        default:
> +          PKL_ERROR (PKL_AST_LOC (string),
> +                     _ ("invalid \\%c sequence in string"), *p);
> +          PKL_TRANS_PAYLOAD->errors++;
> +          PKL_PASS_ERROR;
>          }
> -      else
> -        string_length++;
>      }
>  
>    if (!found_backslash)
> @@ -342,24 +365,77 @@ PKL_PHASE_BEGIN_HANDLER (pkl_trans1_ps_string)
>  
>    for (p = string_pointer, i = 0; *p != '\0'; ++p, ++i)
>      {
> -      if (p[0] == '\\')
> +      if (*p != '\\')
> +        {
> +          new_string_pointer[i] = *p;
> +          continue;
> +        }
> +      ++p;
> +
> +      // octal escape sequence


Likewise.

> +      if (ISODIGIT (p[0]))
>          {
> -          switch (p[1])
> +          unsigned int num = p[0] - '0';
> +
> +          if (ISODIGIT (p[1]))
>              {
> -            case '\\': new_string_pointer[i] = '\\'; break;
> -            case 'n':  new_string_pointer[i] = '\n'; break;
> -            case 't':  new_string_pointer[i] = '\t'; break;
> -            case '"':  new_string_pointer[i] = '"'; break;
> -            default:
> -              assert (0);
> +              ++p;
> +              num = (num << 3) | (p[0] - '0');
> +              if (ISODIGIT (p[1]))
> +                {
> +                  ++p;
> +                  num = (num << 3) | (p[0] - '0');
> +                }
>              }
> -          p++;
> +          if (num == '\0')
> +            {
> +              PKL_ERROR (PKL_AST_LOC (string),
> +                         _ ("string literal cannot contain NULL character"));
> +              PKL_TRANS_PAYLOAD->errors++;
> +              PKL_PASS_ERROR;
> +            }
> +          else if (num > 255)
> +            {
> +              PKL_ERROR (PKL_AST_LOC (string),
> +                         _ ("octal escape sequence out of range"));
> +              PKL_TRANS_PAYLOAD->errors++;
> +              PKL_PASS_ERROR;
> +            }
> +          new_string_pointer[i] = num;
> +          continue;
> +        }
> +
> +      switch (*p)
> +        {
> +        case '\\': new_string_pointer[i] = '\\'; break;
> +        case 'n': new_string_pointer[i] = '\n'; break;
> +        case 't': new_string_pointer[i] = '\t'; break;
> +        case '"': new_string_pointer[i] = '"'; break;
> +        case 'x':
> +          ++p;
> +          new_string_pointer[i] = XDIGIT (p[0]);
> +          if (isxdigit (p[1]))
> +            {
> +              new_string_pointer[i] = (XDIGIT (p[0]) << 4) | XDIGIT (p[1]);
> +              ++p;
> +            }
> +          if (new_string_pointer[i] == '\0')
> +            {
> +              PKL_ERROR (PKL_AST_LOC (string),
> +                         _ ("string literal cannot contain NULL character"));
> +              PKL_TRANS_PAYLOAD->errors++;
> +              PKL_PASS_ERROR;
> +            }
> +          break;
> +        default:
> +          assert (0);
>          }
> -      else
> -        new_string_pointer[i] = p[0];
>      }
>    new_string_pointer[i] = '\0';
>  
> +#undef ISODIGIT
> +#undef XDIGIT
> +
>    free (string_pointer);
>    PKL_AST_STRING_POINTER (string) = new_string_pointer;
>  }
> diff --git a/testsuite/Makefile.am b/testsuite/Makefile.am
> index 5ba9900a..7b0a7c7c 100644
> --- a/testsuite/Makefile.am
> +++ b/testsuite/Makefile.am
> @@ -1370,6 +1370,12 @@ EXTRA_DIST = \
>    poke.pkl/strings-2.pk \
>    poke.pkl/strings-3.pk \
>    poke.pkl/strings-4.pk \
> +  poke.pkl/strings-esc-1.pk \
> +  poke.pkl/strings-esc-2.pk \
> +  poke.pkl/strings-esc-diag-1.pk \
> +  poke.pkl/strings-esc-diag-2.pk \
> +  poke.pkl/strings-esc-diag-3.pk \
> +  poke.pkl/strings-esc-diag-4.pk \
>    poke.pkl/strings-index-1.pk \
>    poke.pkl/strings-index-2.pk \
>    poke.pkl/strings-index-diag-1.pk \
> diff --git a/testsuite/poke.pkl/string-diag-1.pk 
> b/testsuite/poke.pkl/string-diag-1.pk
> index 877bf6ad..1daf5c4e 100644
> --- a/testsuite/poke.pkl/string-diag-1.pk
> +++ b/testsuite/poke.pkl/string-diag-1.pk
> @@ -2,4 +2,4 @@
>  
>  /* Invalid \-sequence in string.  */
>  
> -defvar s = "foo\xbar"; /* { dg-error "" } */
> +defvar s = "foo\zbar"; /* { dg-error "" } */
> diff --git a/testsuite/poke.pkl/strings-esc-1.pk 
> b/testsuite/poke.pkl/strings-esc-1.pk
> new file mode 100644
> index 00000000..35dd6a11
> --- /dev/null
> +++ b/testsuite/poke.pkl/strings-esc-1.pk
> @@ -0,0 +1,25 @@
> +/* { dg-do run } */
> +
> +/* Octal escape sequence */
> +
> +defvar s = "\1\12\1234;\12==\n";
> +
> +/* { dg-command { .set obase 8 } } */
> +
> +/* { dg-command { s'length } } */
> +/* { dg-output "0o11UL" } */
> +
> +/* { dg-command {s[0]} } */
> +/* { dg-output "\n0o1UB" } */
> +
> +/* { dg-command {s[1]} } */
> +/* { dg-output "\n0o12UB" } */
> +
> +/* { dg-command {s[2]} } */
> +/* { dg-output "\n0o123UB" } */
> +
> +/* { dg-command {s[3]} } */
> +/* { dg-output "\n0o64UB" } */
> +
> +/* { dg-command {s[4:]} } */
> +/* { dg-output "\n\";\\\\n==\\\\n\"" } */
> diff --git a/testsuite/poke.pkl/strings-esc-2.pk 
> b/testsuite/poke.pkl/strings-esc-2.pk
> new file mode 100644
> index 00000000..35696ea0
> --- /dev/null
> +++ b/testsuite/poke.pkl/strings-esc-2.pk
> @@ -0,0 +1,25 @@
> +/* { dg-do run } */
> +
> +/* Hexadecimal escape sequence */
> +
> +defvar s = "He\x1llo \x12\x123, World!";
> +
> +/* { dg-command {.set obase 16} } */
> +
> +/* { dg-command {s'length} } */
> +/* { dg-output "0x12UL" } */
> +
> +/* { dg-command {s[2]} } */
> +/* { dg-output "\n0x1UB" } */
> +
> +/* { dg-command {s[7]} } */
> +/* { dg-output "\n0x12UB" } */
> +
> +/* { dg-command {s[8]} } */
> +/* { dg-output "\n0x12UB" } */
> +
> +/* { dg-command {s[9]} } */
> +/* { dg-output "\n0x33UB" } */
> +
> +/* { dg-command {s[0:1] + s[3:5] + s[10:]} } */
> +/* { dg-output "\n\"Hello, World!\"" } */
> diff --git a/testsuite/poke.pkl/strings-esc-diag-1.pk 
> b/testsuite/poke.pkl/strings-esc-diag-1.pk
> new file mode 100644
> index 00000000..c51cbccd
> --- /dev/null
> +++ b/testsuite/poke.pkl/strings-esc-diag-1.pk
> @@ -0,0 +1,3 @@
> +/* { dg-do compile } */
> +
> +defvar s = "a\0b"; /* { dg-error "" } */
> diff --git a/testsuite/poke.pkl/strings-esc-diag-2.pk 
> b/testsuite/poke.pkl/strings-esc-diag-2.pk
> new file mode 100644
> index 00000000..f4934e36
> --- /dev/null
> +++ b/testsuite/poke.pkl/strings-esc-diag-2.pk
> @@ -0,0 +1,3 @@
> +/* { dg-do compile } */
> +
> +defvar s = "x\x0z"; /* { dg-error "" } */
> diff --git a/testsuite/poke.pkl/strings-esc-diag-3.pk 
> b/testsuite/poke.pkl/strings-esc-diag-3.pk
> new file mode 100644
> index 00000000..96f5a513
> --- /dev/null
> +++ b/testsuite/poke.pkl/strings-esc-diag-3.pk
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +
> +/* Invalid \-sequence */
> +
> +defvar s = "a\8b"; /* { dg-error "" } */
> diff --git a/testsuite/poke.pkl/strings-esc-diag-4.pk 
> b/testsuite/poke.pkl/strings-esc-diag-4.pk
> new file mode 100644
> index 00000000..241f301e
> --- /dev/null
> +++ b/testsuite/poke.pkl/strings-esc-diag-4.pk
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +
> +/* Invalid octal \-sequence */
> +
> +defvar s = "\400"; /* { dg-error "" } */



reply via email to

[Prev in Thread] Current Thread [Next in Thread]