Re: performance bug of `wc -m` on simulated macOS

From: Bruno Haible
Subject: Re: performance bug of `wc -m` on simulated macOS
Date: Sun, 20 May 2018 22:57:03 +0200
User-agent: KMail/5.1.3 (Linux/4.4.0-124-generic; KDE/5.18.0; x86_64; ; )

Pádraig Brady wrote:
> $ yes áááááááááááááááááááá | head -n100000 > mbc.txt
> $ yes 12345678901234567890 | head -n100000 > num.txt
> ===== Before ====
> $ time src/wc -Lm < mbc.txt
> 2100000      20
> real    0m0.186s
> $ time src/wc -m < mbc.txt
> 2100000
> real    0m0.186s

> Now I see we may be replacing wcwidth() on OSX as there are issues
> with OSX handling of combining characters in UTF-8.
> So maybe the slow down is with the gnulib wcwidth!?
> To test that I did:
>   $ gl_cv_func_wcwidth_works=no ./configure --quiet
>   $ time src/wc -Lm < mbc.txt
>   2100000      20
>   real        0m0.225s

When I profile this (on a glibc system, with
gl_cv_func_wcwidth_works=no) using valgrind + kcachegrind,
I obtain the attached output.

My interpretation:

  * rpl_wcwidth is 2.5 times slower than the native glibc wcwidth.

  * uc_width in itself is OK; it's the locale_charset call that
    consumes most of the time, which is silly, since the locale does
    not change while 'wc' is running.

    To improve this, it would be good if gnulib implemented a
    wcwidth_l function that takes a locale_t object as argument.
    The step from locale_t to the lookup table used by uc_width
    would be faster than the sequence of nl_langinfo_l and locale_charset.
    I'm not sure, though, that this can be realized:
      - the locale_t objects are not extensible.
      - #ifs are needed to accommodate platforms that don't have 'locale_t'
        at all.
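
To illustrate the idea of resolving the charset once instead of per
character, here is a minimal sketch. It is not gnulib code: struct
width_ctx, width_ctx_init, cached_width, toy_width and the
charset_lookups counter are hypothetical names, plain nl_langinfo
stands in for locale_charset, and toy_width is only a placeholder for
the real uc_width table. It assumes the locale stays fixed for the
lifetime of the context.

```c
#include <assert.h>
#include <langinfo.h>
#include <stdbool.h>
#include <string.h>
#include <wchar.h>

/* Counts how often the comparatively slow charset query runs.  */
static unsigned long charset_lookups = 0;

/* Hypothetical per-run cache; not a gnulib type.  */
struct width_ctx
{
  bool initialized;
  bool is_utf8;
};

/* Query the locale's charset once and remember the result.  */
static void
width_ctx_init (struct width_ctx *ctx)
{
  assert (ctx != NULL);
  const char *codeset = nl_langinfo (CODESET);
  charset_lookups++;
  ctx->is_utf8 = (strcmp (codeset, "UTF-8") == 0);
  ctx->initialized = true;
}

/* Toy stand-in for the real width table: NUL is 0 columns, printable
   ASCII is 1, control characters are -1.  Anything else is treated as
   1 column in UTF-8 and unknown otherwise, which is NOT correct for
   wide or combining characters.  */
static int
toy_width (wchar_t wc, bool is_utf8)
{
  if (wc == L'\0')
    return 0;
  if (wc >= 0x20 && wc < 0x7f)
    return 1;
  if (wc < 0x20 || wc == 0x7f)
    return -1;
  return is_utf8 ? 1 : -1;
}

/* Width lookup that pays for the charset query only on first use.  */
static int
cached_width (struct width_ctx *ctx, wchar_t wc)
{
  if (!ctx->initialized)
    width_ctx_init (ctx);
  return toy_width (wc, ctx->is_utf8);
}
```

A caller such as 'wc' would keep one struct width_ctx for the whole
run and pass it to every cached_width call. A real wcwidth_l would
attach the equivalent state to the locale_t object, which is exactly
the part that the two concerns above make doubtful.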


Attachment: callgrind.out.8022.png
Description: application/kcachegrind
