[bug-gettext] [PATCH v2] xgettext: Support message syntax checks
From: Daiki Ueno
Subject: [bug-gettext] [PATCH v2] xgettext: Support message syntax checks
Date: Wed, 04 Feb 2015 18:30:24 +0900
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux)
With this change, xgettext can report common syntactic problems
in strings to be extracted. The current built-in checks are
ellipsis-unicode, space-ellipsis, and quote-unicode. These checks
can be enabled with the --check option of xgettext and disabled with
a special "xgettext:" comment in source files.
Feature suggested by Philip Withnall in:
https://savannah.gnu.org/bugs/?44098
* gettext-tools/src/message.h (enum syntax_check_type): New enum.
(NSYNTAXCHECKS): New constant.
(enum is_syntax_check): New enum.
(struct message_ty): New field 'do_syntax_check'.
(syntax_check_name): New variable declaration.
* gettext-tools/src/message.c (syntax_check_name): New variable.
* gettext-tools/src/msgl-cat.c (catenate_msgdomain_list): Propagate
mp->do_syntax_check.
* gettext-tools/src/msgmerge.c (message_merge): Propagate
ref->do_syntax_check.
* gettext-tools/src/msgl-check.h (syntax_check_message_list): New
declaration.
* gettext-tools/src/msgl-check.c (syntax_check_ellipsis_unicode): New
function.
(syntax_check_space_ellipsis): New function.
(syntax_check_quote_unicode): New function.
(syntax_check_message): New function.
(syntax_check_message_list): New function.
* gettext-tools/src/read-catalog-abstract.h (po_parse_comment_special):
Adjust function declaration.
* gettext-tools/src/read-catalog-abstract.c (po_parse_comment_special):
Add new argument SCP for syntax checking; all callers changed.
* gettext-tools/src/read-catalog.h (DEFAULT_CATALOG_READER_TY): New
field 'do_syntax_check'.
* gettext-tools/src/read-catalog.c (default_constructor): Initialize
this->do_syntax_check.
(default_copy_comment_state): Propagate this->do_syntax_check.
* gettext-tools/src/xgettext.c (long_options): Add --check option.
(main): Handle --check option.
(usage): Document --check option.
(remember_a_message): Propagate do_syntax_check value.
* gettext-tools/tests/xgettext-13: New file.
* gettext-tools/tests/Makefile.am (TESTS): Add new test.
* gettext-tools/doc/xgettext.texi: Document --check option.
---
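For reviewers, here is a minimal standalone sketch of how a flag of the form "[no-]NAME-check" from an "xgettext:" comment can be mapped to a per-check state, mirroring the logic added to po_parse_comment_special() below. The names check_state, check_name, NCHECKS and parse_check_flag are hypothetical stand-ins for illustration and are not part of the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the patch's enum and name table.  */
enum check_state { CHECK_UNDECIDED, CHECK_YES, CHECK_NO };

#define NCHECKS 3
static const char *const check_name[NCHECKS] =
{
  "ellipsis-unicode",
  "space-ellipsis",
  "quote-unicode"
};

/* Parse one flag of the form "[no-]NAME-check" of length LEN.
   On success, store the new state in STATE[index] and return the index.
   Return -1 if the flag does not name a known syntax check.  */
static int
parse_check_flag (const char *flag, size_t len,
                  enum check_state state[NCHECKS])
{
  enum check_state value = CHECK_YES;
  size_t i;

  /* The flag must end in "-check".  */
  if (len < 6 || memcmp (flag + len - 6, "-check", 6) != 0)
    return -1;
  len -= 6;

  /* A leading "no-" negates the flag.  */
  if (len >= 3 && memcmp (flag, "no-", 3) == 0)
    {
      flag += 3;
      len -= 3;
      value = CHECK_NO;
    }

  /* Look the remaining name up in the table of known checks.  */
  for (i = 0; i < NCHECKS; i++)
    if (strlen (check_name[i]) == len
        && memcmp (check_name[i], flag, len) == 0)
      {
        state[i] = value;
        return (int) i;
      }

  return -1;
}
```

With every state initialized to CHECK_UNDECIDED, passing "no-space-ellipsis-check" sets the space-ellipsis slot to CHECK_NO and leaves the others untouched, while an unrelated flag such as "wrap" is rejected so that the caller can go on to its other handlers.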
gettext-tools/doc/ChangeLog | 4 +
gettext-tools/doc/xgettext.texi | 36 ++++++++
gettext-tools/src/ChangeLog | 39 ++++++++
gettext-tools/src/message.c | 12 +++
gettext-tools/src/message.h | 26 ++++++
gettext-tools/src/msgl-cat.c | 13 +++
gettext-tools/src/msgl-check.c | 144 ++++++++++++++++++++++++++++++
gettext-tools/src/msgl-check.h | 4 +-
gettext-tools/src/msgmerge.c | 3 +
gettext-tools/src/read-catalog-abstract.c | 35 +++++++-
gettext-tools/src/read-catalog-abstract.h | 3 +-
gettext-tools/src/read-catalog.c | 8 +-
gettext-tools/src/read-catalog.h | 1 +
gettext-tools/src/xgettext.c | 67 +++++++++++++-
gettext-tools/tests/ChangeLog | 5 ++
gettext-tools/tests/Makefile.am | 1 +
gettext-tools/tests/xgettext-13 | 99 ++++++++++++++++++++
17 files changed, 492 insertions(+), 8 deletions(-)
create mode 100755 gettext-tools/tests/xgettext-13
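The ellipsis-unicode check only looks at the end of each line of a msgid. As a rough standalone illustration of that idea, the sketch below counts the lines that end in an ASCII "...". It is an assumption-laden simplification, not the code from this patch: count_ascii_ellipsis is a hypothetical helper, it uses plain strchr() instead of the GNU strchrnul(), and it returns a count instead of reporting through po_xerror().

```c
#include <stddef.h>
#include <string.h>

/* Count the lines of MSGID that end in an ASCII "...".
   Hypothetical helper for illustration only.  */
static int
count_ascii_ellipsis (const char *msgid)
{
  const char *line = msgid;
  int count = 0;

  for (;;)
    {
      /* Find the end of the current line: a newline or the terminator.  */
      const char *end = strchr (line, '\n');
      if (end == NULL)
        end = line + strlen (line);

      /* Only the last three bytes of the line matter.  */
      if (end - line >= 3 && memcmp (end - 3, "...", 3) == 0)
        count++;

      /* Stop at the terminator without stepping past it.  */
      if (*end == '\0')
        break;
      line = end + 1;
    }

  return count;
}
```

For the multiline string used in the new test, "this is a multiline sentence\nand the second line...\nends with an ellipsis\n", only the second line ends in "...", so the sketch counts one occurrence.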
diff --git a/gettext-tools/doc/ChangeLog b/gettext-tools/doc/ChangeLog
index edac431..645c580 100644
--- a/gettext-tools/doc/ChangeLog
+++ b/gettext-tools/doc/ChangeLog
@@ -1,3 +1,7 @@
+2015-02-04 Daiki Ueno <address@hidden>
+
+ * xgettext.texi: Document --check option.
+
2015-02-03 Daiki Ueno <address@hidden>
* msgexec.texi, msgfilter.texi: Fix markup error caused by commit
diff --git a/gettext-tools/doc/xgettext.texi b/gettext-tools/doc/xgettext.texi
index 451e25f..1fb4bc1 100644
--- a/gettext-tools/doc/xgettext.texi
+++ b/gettext-tools/doc/xgettext.texi
@@ -144,6 +144,42 @@ gettext (
The second comment line will not be extracted, because there is one
blank line between the comment line and the keyword.
address@hidden address@hidden
address@hidden address@hidden
address@hidden address@hidden, @code{xgettext} option}
address@hidden address@hidden, @code{xgettext} option}
address@hidden supported syntax checks, @code{xgettext}
+Perform a syntax check on msgid and msgid_plural. The supported checks
+are:
+
+@table @samp
+@item ellipsis-unicode
+Prefer Unicode ellipsis character over ASCII @code{...}
+
+@item space-ellipsis
+Prohibit whitespace before an ellipsis character
+
+@item quote-unicode
+Prefer Unicode quotation marks over ASCII @code{"'`}
+
+@end table
+
+The option has an effect on all input files. To enable or disable a check
+for a single string, mark it with an @code{xgettext:} comment in the source
+file. For example, if you specify the @code{-Wspace-ellipsis} option but
+want to suppress the check on a particular string, add a special comment:
+
+@example
+/* xgettext: no-space-ellipsis-check */
+gettext ("We really really need to output ...");
+@end example
+
+The special @code{xgettext:} comment can be followed by flags separated
+by commas. The possible flags are of the form
address@hidden@var{name}-check}, where @var{name} is the name of one
+of the valid syntax checks. If a flag is prefixed by @code{no-}, the
+meaning is negated.
+
@end table
@subsection Language specific options
diff --git a/gettext-tools/src/ChangeLog b/gettext-tools/src/ChangeLog
index 633ec9e..7a542b9 100644
--- a/gettext-tools/src/ChangeLog
+++ b/gettext-tools/src/ChangeLog
@@ -1,3 +1,42 @@
+2015-02-04 Daiki Ueno <address@hidden>
+
+ xgettext: Support message syntax checks
+ With this change, xgettext can report common syntactic problems
+ in strings to be extracted. The current built-in checks are
+ ellipsis-unicode, space-ellipsis, and quote-unicode. These checks
+ can be enabled with the --check option of xgettext and disabled with
+ a special "xgettext:" comment in source files.
+ Feature suggested by Philip Withnall in:
+ https://savannah.gnu.org/bugs/?44098
+ * message.h (enum syntax_check_type): New enum.
+ (NSYNTAXCHECKS): New constant.
+ (enum is_syntax_check): New enum.
+ (struct message_ty): New field 'do_syntax_check'.
+ (syntax_check_name): New variable declaration.
+ * message.c (syntax_check_name): New variable.
+ * msgl-cat.c (catenate_msgdomain_list): Propagate
+ mp->do_syntax_check.
+ * msgmerge.c (message_merge): Propagate ref->do_syntax_check.
+ * msgl-check.h (syntax_check_message_list): New declaration.
+ * msgl-check.c (syntax_check_ellipsis_unicode): New function.
+ (syntax_check_space_ellipsis): New function.
+ (syntax_check_quote_unicode): New function.
+ (syntax_check_message): New function.
+ (syntax_check_message_list): New function.
+ * read-catalog-abstract.h (po_parse_comment_special): Adjust
+ function declaration.
+ * read-catalog-abstract.c (po_parse_comment_special): Add new
+ argument SCP for syntax checking; all callers changed.
+ * read-catalog.h (DEFAULT_CATALOG_READER_TY): New field
+ 'do_syntax_check'.
+ * read-catalog.c (default_constructor): Initialize
+ this->do_syntax_check.
+ (default_copy_comment_state): Propagate this->do_syntax_check.
+ * xgettext.c (long_options): Add --check option.
+ (main): Handle --check option.
+ (usage): Document --check option.
+ (remember_a_message): Propagate do_syntax_check value.
+
2015-02-03 Daiki Ueno <address@hidden>
msgfilter: Factor out quoted string handling
diff --git a/gettext-tools/src/message.c b/gettext-tools/src/message.c
index 586675f..2596887 100644
--- a/gettext-tools/src/message.c
+++ b/gettext-tools/src/message.c
@@ -104,6 +104,14 @@ possible_format_p (enum is_format is_format)
}
+const char *const syntax_check_name[NSYNTAXCHECKS] =
+{
+ /* sc_ellipsis_unicode */ "ellipsis-unicode",
+ /* sc_space_ellipsis */ "space-ellipsis",
+ /* sc_quote_unicode */ "quote-unicode"
+};
+
+
message_ty *
message_alloc (const char *msgctxt,
const char *msgid, const char *msgid_plural,
@@ -130,6 +138,8 @@ message_alloc (const char *msgctxt,
mp->range.min = -1;
mp->range.max = -1;
mp->do_wrap = undecided;
+ for (i = 0; i < NSYNTAXCHECKS; i++)
+ mp->do_syntax_check[i] = undecided;
mp->prev_msgctxt = NULL;
mp->prev_msgid = NULL;
mp->prev_msgid_plural = NULL;
@@ -235,6 +245,8 @@ message_copy (message_ty *mp)
result->is_format[i] = mp->is_format[i];
result->range = mp->range;
result->do_wrap = mp->do_wrap;
+ for (i = 0; i < NSYNTAXCHECKS; i++)
+ result->do_syntax_check[i] = mp->do_syntax_check[i];
for (j = 0; j < mp->filepos_count; ++j)
{
lex_pos_ty *pp = &mp->filepos[j];
diff --git a/gettext-tools/src/message.h b/gettext-tools/src/message.h
index bf2215a..8b9bc3f 100644
--- a/gettext-tools/src/message.h
+++ b/gettext-tools/src/message.h
@@ -114,6 +114,29 @@ enum is_wrap
#endif
+/* Kinds of syntax checks which apply to strings. */
+enum syntax_check_type
+{
+ sc_ellipsis_unicode,
+ sc_space_ellipsis,
+ sc_quote_unicode
+};
+#define NSYNTAXCHECKS 3
+extern DLL_VARIABLE const char *const syntax_check_name[NSYNTAXCHECKS];
+
+/* Is current msgid subject to a syntax check? */
+#if 0
+enum is_syntax_check
+{
+ undecided,
+ yes,
+ no
+};
+#else /* HACK - C's enum concept is so stupid */
+#define is_syntax_check is_format
+#endif
+
+
struct altstr
{
const char *msgstr;
@@ -175,6 +198,9 @@ struct message_ty
/* Do we want the string to be wrapped in the emitted PO file? */
enum is_wrap do_wrap;
+ /* Do we want to apply extra syntax checks on the string? */
+ enum is_syntax_check do_syntax_check[NSYNTAXCHECKS];
+
/* The prev_msgctxt, prev_msgid and prev_msgid_plural strings appearing
before the message, if present. Generated by msgmerge. */
const char *prev_msgctxt;
diff --git a/gettext-tools/src/msgl-cat.c b/gettext-tools/src/msgl-cat.c
index 0bd58d4..8502a64 100644
--- a/gettext-tools/src/msgl-cat.c
+++ b/gettext-tools/src/msgl-cat.c
@@ -308,6 +308,8 @@ domain \"%s\" in input file '%s' doesn't contain a header entry with a charset s
tmp->range.min = - INT_MAX;
tmp->range.max = - INT_MAX;
tmp->do_wrap = yes; /* may be set to no later */
+ for (i = 0; i < NSYNTAXCHECKS; i++)
+ tmp->do_syntax_check[i] = undecided; /* may be set to yes/no later */
tmp->obsolete = true; /* may be set to false later */
tmp->alternative_count = 0;
tmp->alternative = NULL;
@@ -535,6 +537,8 @@ UTF-8 encoded from the beginning, i.e. already in your source code files.\n"),
tmp->is_format[i] = mp->is_format[i];
tmp->range = mp->range;
tmp->do_wrap = mp->do_wrap;
+ for (i = 0; i < NSYNTAXCHECKS; i++)
+ tmp->do_syntax_check[i] = mp->do_syntax_check[i];
tmp->prev_msgctxt = mp->prev_msgctxt;
tmp->prev_msgid = mp->prev_msgid;
tmp->prev_msgid_plural = mp->prev_msgid_plural;
@@ -583,6 +587,9 @@ UTF-8 encoded from the beginning, i.e. already in your source code files.\n"),
}
if (tmp->do_wrap == undecided)
tmp->do_wrap = mp->do_wrap;
+ for (i = 0; i < NSYNTAXCHECKS; i++)
+ if (tmp->do_syntax_check[i] == undecided)
+ tmp->do_syntax_check[i] = mp->do_syntax_check[i];
tmp->obsolete = false;
}
else
@@ -635,6 +642,12 @@ UTF-8 encoded from the beginning, i.e. already in your source code files.\n"),
}
if (mp->do_wrap == no)
tmp->do_wrap = no;
+ for (i = 0; i < NSYNTAXCHECKS; i++)
+ if (mp->do_syntax_check[i] == yes)
+ tmp->do_syntax_check[i] = yes;
+ else if (mp->do_syntax_check[i] == no
+ && tmp->do_syntax_check[i] == undecided)
+ tmp->do_syntax_check[i] = no;
/* Don't fill tmp->prev_msgid in this case. */
if (!mp->obsolete)
tmp->obsolete = false;
diff --git a/gettext-tools/src/msgl-check.c b/gettext-tools/src/msgl-check.c
index d6f4a3d..30f178d 100644
--- a/gettext-tools/src/msgl-check.c
+++ b/gettext-tools/src/msgl-check.c
@@ -40,6 +40,7 @@
#include "plural-table.h"
#include "c-strstr.h"
#include "message.h"
+#include "quote.h"
#include "gettext.h"
#define _(str) gettext (str)
@@ -912,3 +913,146 @@ check_message_list (message_list_ty *mlp,
return seen_errors;
}
+
+
+static int
+syntax_check_ellipsis_unicode (const message_ty *mp, const char *msgid)
+{
+ const char *cp;
+ int seen_errors = 0;
+
+ for (cp = msgid; *cp != '\0'; cp += (*cp != '\0'))
+ {
+ cp = strchrnul (cp, '\n');
+ if (cp >= msgid + 3 && memcmp (cp - 3, "...", 3) == 0)
+ {
+ po_xerror (PO_SEVERITY_ERROR, mp, NULL, 0, 0, false,
+ _("ASCII ellipsis ('...') instead of Unicode"));
+ seen_errors++;
+ }
+ }
+
+ return seen_errors;
+}
+
+
+static int
+syntax_check_space_ellipsis (const message_ty *mp, const char *msgid)
+{
+ /* Coincidentally the lengths of bytes are same for UTF-8 and ASCII
+ ellipsis. */
+ const char *ellipsis
+ = mp->do_syntax_check[sc_ellipsis_unicode] == yes ? "\xE2\x80\xA6" : "...";
+ const char *cp;
+ int seen_errors = 0;
+
+ for (cp = msgid; *cp != '\0'; cp += (*cp != '\0'))
+ {
+ cp = strchrnul (cp, '\n');
+ if (cp >= msgid + 4 && memcmp (cp - 3, ellipsis, 3) == 0
+ && c_isspace (*(cp - 4)))
+ {
+ po_xerror (PO_SEVERITY_ERROR, mp, NULL, 0, 0, false,
+ _("space before ellipsis found in user visible strings"));
+ seen_errors++;
+ }
+ }
+
+ return seen_errors;
+}
+
+
+struct callback_arg
+{
+ const message_ty *mp;
+ int seen_errors;
+};
+
+static void
+syntax_check_quote_unicode_callback (char quote, const char *quoted,
+ size_t quoted_length, void *data)
+{
+ struct callback_arg *arg = data;
+
+ switch (quote)
+ {
+ case '"':
+ po_xerror (PO_SEVERITY_ERROR, arg->mp, NULL, 0, 0, false,
+ _("ASCII double quote used instead of Unicode"));
+ arg->seen_errors++;
+ break;
+
+ case '\'':
+ po_xerror (PO_SEVERITY_ERROR, arg->mp, NULL, 0, 0, false,
+ _("ASCII single quote used instead of Unicode"));
+ arg->seen_errors++;
+ break;
+
+ default:
+ break;
+ }
+}
+
+static int
+syntax_check_quote_unicode (const message_ty *mp, const char *msgid)
+{
+ struct callback_arg arg;
+
+ arg.mp = mp;
+ arg.seen_errors = 0;
+
+ scan_quoted (msgid, strlen (msgid),
+ syntax_check_quote_unicode_callback, &arg);
+
+ return arg.seen_errors;
+}
+
+
+typedef int (* syntax_check_function) (const message_ty *mp, const char *msgid);
+static const syntax_check_function sc_funcs[NSYNTAXCHECKS] =
+{
+ syntax_check_ellipsis_unicode,
+ syntax_check_space_ellipsis,
+ syntax_check_quote_unicode
+};
+
+/* Perform all syntax checks on a non-obsolete message.
+ Return the number of errors that were seen. */
+static int
+syntax_check_message (const message_ty *mp)
+{
+ int seen_errors = 0;
+ int i;
+
+ for (i = 0; i < NSYNTAXCHECKS; i++)
+ {
+ if (mp->do_syntax_check[i] == yes)
+ {
+ seen_errors += sc_funcs[i] (mp, mp->msgid);
+ if (mp->msgid_plural)
+ seen_errors += sc_funcs[i] (mp, mp->msgid_plural);
+ }
+ }
+
+ return seen_errors;
+}
+
+
+/* Perform all syntax checks on a message list.
+ Return the number of errors that were seen. */
+int
+syntax_check_message_list (message_list_ty *mlp)
+{
+ int seen_errors = 0;
+ size_t j;
+
+ for (j = 0; j < mlp->nitems; j++)
+ {
+ message_ty *mp = mlp->item[j];
+
+ if (!is_header (mp))
+ seen_errors += syntax_check_message (mp);
+ }
+
+ return seen_errors;
+}
diff --git a/gettext-tools/src/msgl-check.h b/gettext-tools/src/msgl-check.h
index f03300c..f9d9abd 100644
--- a/gettext-tools/src/msgl-check.h
+++ b/gettext-tools/src/msgl-check.h
@@ -28,7 +28,6 @@
extern "C" {
#endif
-
/* Check the values returned by plural_eval.
Signals the errors through po_xerror.
Return the number of errors that were seen.
@@ -60,6 +59,9 @@ extern int check_message_list (message_list_ty *mlp,
int check_compatibility,
int check_accelerators, char accelerator_char);
+/* Perform all syntax checks on a message list.
+ Return the number of errors that were seen. */
+extern int syntax_check_message_list (message_list_ty *mlp);
#ifdef __cplusplus
}
diff --git a/gettext-tools/src/msgmerge.c b/gettext-tools/src/msgmerge.c
index 0415b2a..71d8962 100644
--- a/gettext-tools/src/msgmerge.c
+++ b/gettext-tools/src/msgmerge.c
@@ -1330,6 +1330,9 @@ message_merge (message_ty *def, message_ty *ref, bool force_fuzzy,
result->do_wrap = ref->do_wrap;
+ for (i = 0; i < NSYNTAXCHECKS; i++)
+ result->do_syntax_check[i] = ref->do_syntax_check[i];
+
/* Insert previous msgid, commented out with "#|".
Do so only when --previous is specified, for backward compatibility.
Since the "previous msgid" represents the original msgid that led to
diff --git a/gettext-tools/src/read-catalog-abstract.c b/gettext-tools/src/read-catalog-abstract.c
index d4e98ee..0817cd7 100644
--- a/gettext-tools/src/read-catalog-abstract.c
+++ b/gettext-tools/src/read-catalog-abstract.c
@@ -262,7 +262,8 @@ po_callback_comment_special (const char *s)
void
po_parse_comment_special (const char *s,
bool *fuzzyp, enum is_format formatp[NFORMATS],
- struct argument_range *rangep, enum is_wrap *wrapp)
+ struct argument_range *rangep, enum is_wrap *wrapp,
+ enum is_syntax_check scp[NSYNTAXCHECKS])
{
size_t i;
@@ -272,6 +273,8 @@ po_parse_comment_special (const char *s,
rangep->min = -1;
rangep->max = -1;
*wrapp = undecided;
+ for (i = 0; i < NSYNTAXCHECKS; i++)
+ scp[i] = undecided;
while (*s != '\0')
{
@@ -405,6 +408,36 @@ po_parse_comment_special (const char *s,
continue;
}
+ /* Accept syntax check description. */
+ if (len >= 6 && memcmp (t + len - 6, "-check", 6) == 0)
+ {
+ const char *p;
+ size_t n;
+ enum is_syntax_check value;
+
+ p = t;
+ n = len - 6;
+
+ if (n >= 3 && memcmp (p, "no-", 3) == 0)
+ {
+ p += 3;
+ n -= 3;
+ value = no;
+ }
+ else
+ value = yes;
+
+ for (i = 0; i < NSYNTAXCHECKS; i++)
+ if (strlen (syntax_check_name[i]) == n
+ && memcmp (syntax_check_name[i], p, n) == 0)
+ {
+ scp[i] = value;
+ break;
+ }
+ if (i < NSYNTAXCHECKS)
+ continue;
+ }
+
/* Unknown special comment marker. It may have been generated
from a future xgettext version. Ignore it. */
}
diff --git a/gettext-tools/src/read-catalog-abstract.h b/gettext-tools/src/read-catalog-abstract.h
index c3fc84f..367584b 100644
--- a/gettext-tools/src/read-catalog-abstract.h
+++ b/gettext-tools/src/read-catalog-abstract.h
@@ -184,7 +184,8 @@ extern void po_callback_comment_dispatcher (const char *s);
extern void po_parse_comment_special (const char *s, bool *fuzzyp,
enum is_format formatp[NFORMATS],
struct argument_range *rangep,
- enum is_wrap *wrapp);
+ enum is_wrap *wrapp,
+ enum is_syntax_check scp[NSYNTAXCHECKS]);
#ifdef __cplusplus
diff --git a/gettext-tools/src/read-catalog.c b/gettext-tools/src/read-catalog.c
index 4642249..8c77df1 100644
--- a/gettext-tools/src/read-catalog.c
+++ b/gettext-tools/src/read-catalog.c
@@ -105,6 +105,8 @@ default_constructor (abstract_catalog_reader_ty *that)
this->range.min = -1;
this->range.max = -1;
this->do_wrap = undecided;
+ for (i = 0; i < NSYNTAXCHECKS; i++)
+ this->do_syntax_check[i] = undecided;
}
@@ -172,6 +174,8 @@ default_copy_comment_state (default_catalog_reader_ty *this, message_ty *mp)
mp->is_format[i] = this->is_format[i];
mp->range = this->range;
mp->do_wrap = this->do_wrap;
+ for (i = 0; i < NSYNTAXCHECKS; i++)
+ mp->do_syntax_check[i] = this->do_syntax_check[i];
}
@@ -205,6 +209,8 @@ default_reset_comment_state (default_catalog_reader_ty *this)
this->range.min = -1;
this->range.max = -1;
this->do_wrap = undecided;
+ for (i = 0; i < NSYNTAXCHECKS; i++)
+ this->do_syntax_check[i] = undecided;
}
@@ -299,7 +305,7 @@ default_comment_special (abstract_catalog_reader_ty *that, const char *s)
default_catalog_reader_ty *this = (default_catalog_reader_ty *) that;
po_parse_comment_special (s, &this->is_fuzzy, this->is_format, &this->range,
- &this->do_wrap);
+ &this->do_wrap, this->do_syntax_check);
}
diff --git a/gettext-tools/src/read-catalog.h b/gettext-tools/src/read-catalog.h
index f567d78..74e0fd7 100644
--- a/gettext-tools/src/read-catalog.h
+++ b/gettext-tools/src/read-catalog.h
@@ -113,6 +113,7 @@ struct default_catalog_reader_class_ty
enum is_format is_format[NFORMATS]; \
struct argument_range range; \
enum is_wrap do_wrap; \
+ enum is_syntax_check do_syntax_check[NSYNTAXCHECKS]; \
typedef struct default_catalog_reader_ty default_catalog_reader_ty;
struct default_catalog_reader_ty
diff --git a/gettext-tools/src/xgettext.c b/gettext-tools/src/xgettext.c
index f9156eb..12b3f54 100644
--- a/gettext-tools/src/xgettext.c
+++ b/gettext-tools/src/xgettext.c
@@ -58,6 +58,8 @@
#include "po-charset.h"
#include "msgl-iconv.h"
#include "msgl-ascii.h"
+#include "msgl-check.h"
+#include "po-xerror.h"
#include "po-time.h"
#include "write-catalog.h"
#include "write-po.h"
@@ -179,6 +181,9 @@ static bool recognize_format_kde;
/* If true, recognize Boost format strings. */
static bool recognize_format_boost;
+/* Syntax checks enabled by default. */
+static enum is_syntax_check default_syntax_check[NSYNTAXCHECKS];
+
/* Canonicalized encoding name for all input files. */
const char *xgettext_global_source_encoding;
@@ -204,6 +209,7 @@ static const struct option long_options[] =
{ "add-location", optional_argument, NULL, 'n' },
{ "boost", no_argument, NULL, CHAR_MAX + 11 },
{ "c++", no_argument, NULL, 'C' },
+ { "check", required_argument, NULL, 'W' },
{ "color", optional_argument, NULL, CHAR_MAX + 14 },
{ "copyright-holder", required_argument, NULL, CHAR_MAX + 1 },
{ "debug", no_argument, &do_debug, 1 },
@@ -346,7 +352,7 @@ main (int argc, char *argv[])
init_flag_table_vala ();
while ((optchar = getopt_long (argc, argv,
- "ac::Cd:D:eEf:Fhijk::l:L:m::M::no:p:sTVw:x:",
+ "ac::Cd:D:eEf:Fhijk::l:L:m::M::no:p:sTVw:W:x:",
long_options, NULL)) != EOF)
switch (optchar)
{
@@ -525,6 +531,17 @@ main (int argc, char *argv[])
}
break;
+ case 'W':
+ if (strcmp (optarg, "ellipsis-unicode") == 0)
+ default_syntax_check[sc_ellipsis_unicode] = yes;
+ else if (strcmp (optarg, "space-ellipsis") == 0)
+ default_syntax_check[sc_space_ellipsis] = yes;
+ else if (strcmp (optarg, "quote-unicode") == 0)
+ default_syntax_check[sc_quote_unicode] = yes;
+ else
+ error (EXIT_FAILURE, 0, _("syntax check '%s' unknown"), optarg);
+ break;
+
case 'x':
read_exclusion_file (optarg);
break;
@@ -836,6 +853,24 @@ warning: file '%s' extension '%s' is unknown; will try C"), filename, extension)
else if (sort_by_msgid)
msgdomain_list_sort_by_msgid (mdlp);
+ /* Check syntax of messages. */
+ {
+ int nerrors = 0;
+
+ for (i = 0; i < mdlp->nitems; i++)
+ {
+ message_list_ty *mlp = mdlp->item[i]->messages;
+ nerrors += syntax_check_message_list (mlp);
+ }
+
+ /* Exit with status 1 on any error. */
+ if (nerrors > 0)
+ error (EXIT_FAILURE, 0,
+ ngettext ("found %d fatal error", "found %d fatal errors",
+ nerrors),
+ nerrors);
+ }
+
/* Write the PO file. */
msgdomain_list_print (mdlp, file_name, output_syntax, force_po, do_debug);
@@ -921,6 +956,10 @@ Operation mode:\n"));
preceding keyword lines in output file\n\
-c, --add-comments place all comment blocks preceding keyword lines\n\
in output file\n"));
+ printf (_("\
+ -W, --check=NAME perform syntax check on messages\n\
+ (ellipsis-unicode, space-ellipsis,\n\
+ quote-unicode)\n"));
printf ("\n");
printf (_("\
Language specific options:\n"));
@@ -1644,8 +1683,8 @@ xgettext_record_flag (const char *optionstring)
flag += 5;
}
- /* Unlike po_parse_comment_special(), we don't accept "fuzzy" or "wrap"
- here - it has no sense. */
+ /* Unlike po_parse_comment_special(), we don't accept "fuzzy",
+ "wrap", or "check" here - it makes no sense. */
if (strlen (flag) >= 7
&& memcmp (flag + strlen (flag) - 7, "-format", 7) == 0)
{
@@ -2238,6 +2277,7 @@ remember_a_message (message_list_ty *mlp, char *msgctxt, char *msgid,
enum is_format is_format[NFORMATS];
struct argument_range range;
enum is_wrap do_wrap;
+ enum is_syntax_check do_syntax_check[NSYNTAXCHECKS];
message_ty *mp;
char *msgstr;
size_t i;
@@ -2264,6 +2304,8 @@ remember_a_message (message_list_ty *mlp, char *msgctxt, char *msgid,
range.min = -1;
range.max = -1;
do_wrap = undecided;
+ for (i = 0; i < NSYNTAXCHECKS; i++)
+ do_syntax_check[i] = undecided;
if (msgctxt != NULL)
CONVERT_STRING (msgctxt, lc_string);
@@ -2297,6 +2339,8 @@ meta information, not the empty string.\n")));
for (i = 0; i < NFORMATS; i++)
is_format[i] = mp->is_format[i];
do_wrap = mp->do_wrap;
+ for (i = 0; i < NSYNTAXCHECKS; i++)
+ do_syntax_check[i] = mp->do_syntax_check[i];
}
else
{
@@ -2376,12 +2420,13 @@ meta information, not the empty string.\n")));
enum is_format tmp_format[NFORMATS];
struct argument_range tmp_range;
enum is_wrap tmp_wrap;
+ enum is_syntax_check tmp_syntax_check[NSYNTAXCHECKS];
bool interesting;
t += strlen ("xgettext:");
po_parse_comment_special (t, &tmp_fuzzy, tmp_format, &tmp_range,
- &tmp_wrap);
+ &tmp_wrap, tmp_syntax_check);
interesting = false;
for (i = 0; i < NFORMATS; i++)
@@ -2400,6 +2445,12 @@ meta information, not the empty string.\n")));
do_wrap = tmp_wrap;
interesting = true;
}
+ for (i = 0; i < NSYNTAXCHECKS; i++)
+ if (tmp_syntax_check[i] != undecided)
+ {
+ do_syntax_check[i] = tmp_syntax_check[i];
+ interesting = true;
+ }
/* If the "xgettext:" marker was followed by an interesting
keyword, and we updated our is_format/do_wrap variables,
@@ -2525,6 +2576,14 @@ meta information, not the empty string.\n")));
mp->do_wrap = do_wrap == no ? no : yes; /* By default we wrap. */
+ for (i = 0; i < NSYNTAXCHECKS; i++)
+ {
+ if (do_syntax_check[i] == undecided)
+ do_syntax_check[i] = default_syntax_check[i] == yes ? yes : no;
+
+ mp->do_syntax_check[i] = do_syntax_check[i];
+ }
+
/* Warn about the use of non-reorderable format strings when the programming
language also provides reorderable format strings. */
warn_format_string (is_format, mp->msgid, pos, "msgid");
diff --git a/gettext-tools/tests/ChangeLog b/gettext-tools/tests/ChangeLog
index eec1586..9223edd 100644
--- a/gettext-tools/tests/ChangeLog
+++ b/gettext-tools/tests/ChangeLog
@@ -1,3 +1,8 @@
+2015-02-04 Daiki Ueno <address@hidden>
+
+ * xgettext-13: New file.
+ * Makefile.am (TESTS): Add new test.
+
2015-01-29 Daiki Ueno <address@hidden>
* msgexec-6: New file.
diff --git a/gettext-tools/tests/Makefile.am b/gettext-tools/tests/Makefile.am
index ee34655..32bc192 100644
--- a/gettext-tools/tests/Makefile.am
+++ b/gettext-tools/tests/Makefile.am
@@ -72,6 +72,7 @@ TESTS = gettext-1 gettext-2 gettext-3 gettext-4 gettext-5 gettext-6 gettext-7 \
recode-sr-latin-1 recode-sr-latin-2 \
xgettext-2 xgettext-3 xgettext-4 xgettext-5 xgettext-6 \
xgettext-7 xgettext-8 xgettext-9 xgettext-10 xgettext-11 xgettext-12 \
+ xgettext-13 \
xgettext-awk-1 xgettext-awk-2 \
xgettext-c-2 xgettext-c-3 xgettext-c-4 xgettext-c-5 \
xgettext-c-6 xgettext-c-7 xgettext-c-8 xgettext-c-9 xgettext-c-10 \
diff --git a/gettext-tools/tests/xgettext-13 b/gettext-tools/tests/xgettext-13
new file mode 100755
index 0000000..32107f2
--- /dev/null
+++ b/gettext-tools/tests/xgettext-13
@@ -0,0 +1,99 @@
+#!/bin/sh
+. "${srcdir=.}/init.sh"; path_prepend_ . ../src
+
+# Test for --check option.
+
+# --check=ellipsis-unicode
+cat <<\EOF > xg-ellipsis-u.c
+gettext ("this is a sentence...");
+
+ngettext ("this is a sentence", "these are sentences...", 2);
+
+/* xgettext: no-ellipsis-unicode-check */
+gettext ("this is another sentence...");
+
+gettext ("this is a multiline sentence\n"
+ "and the second line...\n"
+ "ends with an ellipsis\n");
+EOF
+
+: ${XGETTEXT=xgettext}
+LANGUAGE= LC_ALL=C ${XGETTEXT} --omit-header --add-comments --check=ellipsis-unicode -d xg-ellipsis-u.tmp xg-ellipsis-u.c 2>xg-ellipsis-u.err
+
+test `grep -c 'ASCII ellipsis' xg-ellipsis-u.err` = 3 || exit 1
+
+# --check=space-ellipsis
+cat <<\EOF > xg-space-e.c
+gettext ("this is a sentence ...");
+
+/* xgettext: no-space-ellipsis-check, no-ellipsis-unicode-check */
+gettext ("this is another sentence ...");
+
+gettext ("this is a multiline sentence\n"
+ "and the second line ...\n"
+ "ends with an ellipsis\n");
+EOF
+
+LANGUAGE= LC_ALL=C ${XGETTEXT} --omit-header --add-comments --check=space-ellipsis -d xg-space-e.tmp xg-space-e.c 2>xg-space-e.err
+
+test `grep -c 'space before ellipsis' xg-space-e.err` = 2 || exit 1
+
+# Combination of --check=space-ellipsis and --check=ellipsis-unicode.
+LANGUAGE= LC_ALL=C ${XGETTEXT} --omit-header --add-comments --check=ellipsis-unicode --check=space-ellipsis -d xg-space-eu.tmp xg-space-e.c 2>xg-space-eu.err
+
+test `grep -c 'ASCII ellipsis' xg-space-eu.err` = 2 || exit 1
+
+# --check=quote-unicode
+cat <<\EOF > xg-quote-u.c
+gettext ("\"double quoted\"");
+
+/* xgettext: no-quote-unicode-check */
+gettext ("\"double quoted but ignored\"");
+
+gettext ("double quoted but empty \"\"");
+
+gettext ("\"\" double quoted but empty");
+
+gettext ("\"foo\" \"bar\" \"baz\"");
+
+gettext ("'single quoted'");
+
+/* xgettext: no-quote-unicode-check */
+gettext ("'single quoted but ignored'");
+
+gettext ("'foo' 'bar' 'baz'");
+
+gettext ("prefix'single quoted without surrounding spaces'suffix");
+
+gettext ("prefix 'single quoted with surrounding spaces' suffix");
+
+gettext ("single quoted with apostrophe, empty '' ");
+
+gettext ("'single quoted at the beginning of string' ");
+
+gettext (" 'single quoted at the end of string'");
+
+gettext ("line 1\n"
+"'single quoted at the beginning of line' \n"
+"line 3");
+
+gettext ("line 1\n"
+" 'single quoted at the end of line'\n"
+"line 3");
+
+gettext ("`single quoted with grave'");
+
+/* xgettext: no-quote-unicode-check */
+gettext ("`single quoted with grave but ignored'");
+
+gettext ("single quoted with grave, empty `'");
+
+gettext ("`' single quoted with grave, empty");
+
+gettext ("`double grave`");
+EOF
+
+LANGUAGE= LC_ALL=C ${XGETTEXT} --omit-header --add-comments --check=quote-unicode -d xg-quote-u.tmp xg-quote-u.c 2>xg-quote-u.err
+
+test `grep -c 'ASCII double quote' xg-quote-u.err` = 4 || exit 1
+test `grep -c 'ASCII single quote' xg-quote-u.err` = 12 || exit 1
--
2.1.0