bug#20029: 'yes' surprisingly slow
From: Pádraig Brady
Subject: bug#20029: 'yes' surprisingly slow
Date: Sat, 07 Mar 2015 12:10:39 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0
On 07/03/15 11:49, Ole Tange wrote:
> These two commands give the same output:
>
> $ yes `echo {1..1000}` | head -c 2300M | md5sum
> a0241f2247e9a37db60e7def3e4f7038 -
>
> $ yes "`echo {1..1000}`" | head -c 2300M | md5sum
> a0241f2247e9a37db60e7def3e4f7038 -
>
> But the time to run is quite different:
>
> $ time yes "`echo {1..1000}`" | head -c 2300M >/dev/null
>
> real 0m0.897s
> user 0m0.384s
> sys 0m1.343s
>
> $ time yes `echo {1..1000}` | head -c 2300M >/dev/null
>
> real 0m11.352s
> user 0m10.571s
> sys 0m2.590s
>
> WTF?!
>
> I imagine 'yes' spends a lot of time collecting the 1000 args. But why
> does it do that more than once?
The stdio interactions dominate here.
The slow case has 1000 times more fputs_unlocked() calls.
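To make the call-count difference concrete, here is a minimal sketch of a per-argument output loop. It is an illustration only, not the actual yes source: the function name emit_line_per_word is hypothetical, and plain fputs()/putc() stand in for the unlocked variants.

```c
#include <assert.h>
#include <stdio.h>

/* Emit one output line the per-argument way: one fputs() per word plus
   one putc() for the separator that follows it.  Returns the number of
   stdio calls made, or -1 on a write error.
   Hypothetical helper for illustration; not taken from yes.c.  */
long
emit_line_per_word (FILE *out, int n, char *const *words)
{
  assert (out != NULL && n > 0);

  long calls = 0;
  for (int i = 0; i < n; i++)
    {
      if (fputs (words[i], out) == EOF
          || putc (i == n - 1 ? '\n' : ' ', out) == EOF)
        return -1;
      calls += 2;
    }
  return calls;
}
```

With the quoted form there is a single argument, so each line costs 2 calls; with the unquoted form there are 1000 arguments, so each line costs 2000 calls for exactly the same bytes.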
Yes, we could build the line up once and output that.
If doing that we could also build up a BUFSIZ of complete lines
to output at a time, in which case you'd probably avoid stdio altogether.
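A rough sketch of that idea, assuming POSIX write(2) and a BUFSIZ-sized buffer. All function names here (join_words, fill_with_lines, write_all, repeat_lines) are hypothetical and this is not the eventual coreutils change, just one way it could look.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Join N words with single spaces and a trailing newline.  Returns a
   malloc'd buffer (not NUL-terminated) and stores its length in *LEN,
   or returns NULL if allocation fails.  */
char *
join_words (int n, char *const *words, size_t *len)
{
  assert (n > 0);

  size_t total = 0;
  for (int i = 0; i < n; i++)
    total += strlen (words[i]) + 1;     /* word plus ' ' or '\n' */

  char *line = malloc (total);
  if (!line)
    return NULL;

  char *p = line;
  for (int i = 0; i < n; i++)
    {
      size_t l = strlen (words[i]);
      memcpy (p, words[i], l);
      p += l;
      *p++ = (i == n - 1) ? '\n' : ' ';
    }
  *len = total;
  return line;
}

/* Copy as many whole copies of LINE as fit into BUF (capacity CAP).
   Returns the number of bytes used; 0 if not even one copy fits.  */
size_t
fill_with_lines (char *buf, size_t cap, const char *line, size_t len)
{
  size_t used = 0;
  if (len == 0)
    return 0;
  while (len <= cap - used)
    {
      memcpy (buf + used, line, len);
      used += len;
    }
  return used;
}

/* Write all N bytes of BUF to FD, retrying on short writes and EINTR.
   Returns 0 on success, -1 on error.  */
int
write_all (int fd, const char *buf, size_t n)
{
  while (n > 0)
    {
      ssize_t w = write (fd, buf, n);
      if (w < 0)
        {
          if (errno == EINTR)
            continue;
          return -1;
        }
      buf += w;
      n -= (size_t) w;
    }
  return 0;
}

/* Build the line once, then write whole buffers of it until a write
   fails.  Falls back to writing the single line when it is longer
   than BUFSIZ.  Returns -1 once writing fails.  */
int
repeat_lines (int fd, int n, char *const *words)
{
  size_t len;
  char *line = join_words (n, words, &len);
  if (!line)
    return -1;

  char buf[BUFSIZ];
  size_t used = fill_with_lines (buf, sizeof buf, line, len);
  const char *out = used ? buf : line;
  size_t out_len = used ? used : len;

  int rc;
  while ((rc = write_all (fd, out, out_len)) == 0)
    continue;

  free (line);
  return rc;
}
```

Here the per-argument work happens once up front, and each subsequent write(2) hands the kernel up to BUFSIZ bytes of complete lines, so the cost per line no longer depends on how many arguments it was built from.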
BTW I noticed tee uses stdio calls, which is currently redundant overhead.
It wouldn't be redundant if we added a --buffered option to tee so that it
might honor stdbuf(1), though I'm not sure it's worth that flexibility in tee.
I'll look at improving these.
thanks,
Pádraig.