sed-devel

Re: Multi-byte character as delimiter


From: Assaf Gordon
Subject: Re: Multi-byte character as delimiter
Date: Sat, 7 Mar 2020 16:18:37 -0700
User-agent: Mutt/1.11.4 (2019-03-13)

Hello,

On Tue, Mar 03, 2020 at 01:49:16PM +0100, Haakon Storm Heen wrote:
> ### What I'm trying
> cat example.txt|gsed "s✋$(printf '\t')✋|✋"
> 
> ### Error
> gsed: -e expression #1, char 2: delimiter character is not a single-byte
> character
> 
> ### Workaround? Feature request?
> 
> - Any way around this?
> - Should I add multibyte delimiter characters as a feature request?

Currently, there is no way around it.
Enforcing a single-byte delimiter has been GNU sed's behaviour since at
least version 4.0a from 2003.
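
This is not a way around the limitation itself, but for reference, the
substitution from the original message (tab to "|") can be written with
a single-byte delimiter. A minimal sketch, assuming plain "sed" stands in
for "gsed" and using a made-up input line:

```shell
# Hypothetical input: two fields separated by a tab.
input=$(printf 'left\tright')

# ";" is a single-byte delimiter; the pattern is a literal tab.
output=$(printf '%s\n' "$input" | sed "s;$(printf '\t');|;")
printf '%s\n' "$output"
```

Any single-byte character that does not occur in the pattern or the
replacement would do in place of ";".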

You can of course ask for it as a feature, but personally I do not think
the benefits outweigh the costs of such an addition.

> The rationale behind this is:
> 
> - emoji/unicode are (IMHO) better visual indicators (than plain `ascii`)

That is only true if your terminal properly supports Unicode characters.
It would be very easy to assume all terminals behave as nicely as macOS's
terminal, but I suspect many do not.

They are also somewhat harder to type than regular characters.

> - many files I process are scripts that might contain the usual delimiter
> characters `/` `_` `|` ...

Do you mean that you are processing shell scripts using "gsed"?
Scripts containing these characters are only a problem if you need
to replace several of them in one sed command, aren't they?

For example, if you wanted to replace slashes AND pipes in the same sed
command, it would still be easy (and visually clear) to use ";" as the
delimiter, no? E.g.:

        gsed 's;/;FOO;'
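
To cover both characters in one command, a bracket expression can be
used with the same delimiter. This is only a sketch: the sample line is
made up, and plain "sed" is assumed to behave like "gsed" here.

```shell
# Hypothetical sample line containing both "/" and "|".
line='cat /etc/passwd | grep root'

# ";" is the delimiter, so "/" and "|" need no escaping in the pattern.
result=$(printf '%s\n' "$line" | sed 's;[/|];FOO;g')
printf '%s\n' "$result"
```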

> - replacing a 🤚 by mistake with something else is not as detrimental as
> replacing `/` or `|` if the file happens to be a shell script.

If you are using sed in a pipe (e.g. "cat FILE.sh | gsed ...") then you
still have the original file at hand if something detrimental happens.
If you are replacing in place, use the backup option (-i with a suffix)
to keep a previous version.

These two methods can help recover from any mistakes.
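
A minimal sketch of the in-place case, assuming a throwaway file (the
name demo.sh and its content are made up) and the attached-suffix form
"-i.bak":

```shell
# Hypothetical throwaway script to edit.
tmpdir=$(mktemp -d)
printf 'echo a/b | wc -c\n' > "$tmpdir/demo.sh"

# -i.bak edits in place and keeps the original as demo.sh.bak.
sed -i.bak 's;/;_;' "$tmpdir/demo.sh"

edited=$(cat "$tmpdir/demo.sh")
backup=$(cat "$tmpdir/demo.sh.bak")
printf 'edited: %s\nbackup: %s\n' "$edited" "$backup"
```

If the edit turns out to be wrong, copying demo.sh.bak back over demo.sh
restores the previous version.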


All in all, I'm not in favor of adding this as an option.
However, if you have other convincing use-cases please do send them.
And, since working code is worth a thousand emails, if you (or others)
want to try to implement this (including unit tests), that would make a
strong case in favor.

regards,
 - assaf


