Thoughts on GLib regexes
From: Jean Abou Samra
Subject: Thoughts on GLib regexes
Date: Wed, 30 Nov 2022 01:06:37 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.5.0
Hi,
As shown by https://gitlab.com/lilypond/lilypond/-/issues/6463,
Guile regular expressions are a trap when it comes to Unicode.
Under a non-Unicode locale, characters that can't be expressed
in the locale encoding get converted to "?", both in the pattern
and the search string, before invoking the underlying POSIX regex
functions.
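To make the failure mode concrete, here is a rough simulation in Python. It is only a sketch, not Guile's actual code: the helper names `to_locale` and `lossy_search` are made up for illustration, ASCII stands in for an arbitrary non-Unicode locale encoding, and Python's `re` module stands in for the POSIX regex functions.

```python
import re


def to_locale(text: str, encoding: str = "ascii") -> str:
    """Simulate a lossy conversion to the locale encoding.

    Characters that cannot be encoded are replaced by "?".
    (Hypothetical helper; ASCII is an assumed stand-in locale.)
    """
    return text.encode(encoding, errors="replace").decode(encoding)


def lossy_search(pattern: str, subject: str, encoding: str = "ascii") -> bool:
    """Search after converting BOTH pattern and subject, as described above."""
    return re.search(to_locale(pattern, encoding),
                     to_locale(subject, encoding)) is not None


# The pattern "café" silently becomes "caf?", where "?" is a quantifier
# that makes the "f" optional.
print(to_locale("café"))            # caf?
print(lossy_search("café", "cat"))  # True: "caf?" matches the "ca" in "cat"
print(re.search("café", "cat"))     # None: the Unicode-aware search does not match
```

Note that the damage goes beyond two different non-ASCII characters comparing equal: once "?" lands in the pattern, it is read as regex syntax, so the pattern can change meaning entirely.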
I would like feedback on this approach:
https://gitlab.com/jeanas/lilypond/-/commits/regex-glib/
LilyPond requires GLib (for Pango), and GLib has a regex
API wrapping that of PCRE, which is fully Unicode-aware.
This branch wraps the GLib regex API in a Scheme API that
LilyPond would then use instead of Guile's regexes.
On the plus side, it means we no longer have to worry about Unicode,
eliminating the nasty trap that brought us a critical regression.
On the minus side, it is ~250 lines of code, and I don't
immediately see regexes in the current code base that
would be problematic with Unicode.
Thoughts?
Regards,
Jean