bug#34524: wc: word count incorrect when words separated only by no-break space

bug-coreutils



From:	Pádraig Brady
Subject:	bug#34524: wc: word count incorrect when words separated only by no-break space
Date:	Sat, 9 Mar 2019 19:31:43 -0800
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0

On 09/03/19 05:52, Bruno Haible wrote:
> Hi Pádraig,
> 
>>>> In regard to options for enabling various behaviors for wc(1),
>>>> I'm thinking we might keep the strict POSIX isspace() behavior
>>>> with LC_CTYPE=C and/or POSIXLY_CORRECT=1, and use iswnbspace()
>>>> by default
> 
> Since you plan to add a --words=... option in the future (as suggested
> by Paul or me), it would make sense to add this option now, instead
> of testing POSIXLY_CORRECT. If you introduce POSIXLY_CORRECT dependent
> behaviour now (and need to keep it for backward-compatibility), you'll
> have a hard to understand interface: What will the following do?
> 
>   env POSIXLY_CORRECT=1 wc --words=unicode
>   wc --words=unicode

Well, until we actually support more contextual
Unicode word separation, the --words option
parameter would be a bit redundant.
Generally no one would need to set POSIXLY_CORRECT
directly for wc; rather, it would be set globally
on a system or in a script to minimize changes.

In the above example, --words=unicode would be
an explicit request to operate beyond POSIX,
and so POSIXLY_CORRECT would be ignored there.
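
To make the precedence concrete, here is a rough sketch in Python
rather than wc's actual C code. Everything in it is an assumption for
illustration: the names resolve_mode and count_words are hypothetical,
the --words option does not exist yet, the set of no-break characters
is only a guess at what iswnbspace() might cover, and real locale
handling (LC_ALL, LANG) is ignored.

```python
# Separators matched by isspace() in the C locale.
POSIX_SPACE = " \t\n\v\f\r"
# Assumed no-break characters an iswnbspace()-style check might add:
# NO-BREAK SPACE, FIGURE SPACE, NARROW NO-BREAK SPACE, WORD JOINER.
NO_BREAK_SPACE = "\u00a0\u2007\u202f\u2060"


def resolve_mode(words_option=None, environ=None):
    """Pick the word-splitting mode (hypothetical helper).

    An explicit --words value wins; otherwise POSIXLY_CORRECT or
    LC_CTYPE=C selects strict POSIX behavior; otherwise the
    extended behavior is the default.
    """
    environ = {} if environ is None else environ
    if words_option is not None:
        return words_option
    if "POSIXLY_CORRECT" in environ or environ.get("LC_CTYPE") == "C":
        return "posix"
    return "unicode"


def count_words(text, mode):
    """Count runs of non-separator characters, wc-style."""
    separators = set(POSIX_SPACE)
    if mode == "unicode":
        separators |= set(NO_BREAK_SPACE)
    count = 0
    in_word = False
    for ch in text:
        if ch in separators:
            in_word = False
        elif not in_word:
            # First character of a new word.
            in_word = True
            count += 1
    return count
```

With the input "one\u00a0two three", the "posix" mode sees two words
and the "unicode" mode sees three; an explicit "unicode" request gives
the same mode whether or not POSIXLY_CORRECT is set.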

cheers,
Pádraig
