coreutils

Re: numfmt (=print 'human' sizes) updates


From: Pádraig Brady
Subject: Re: numfmt (=print 'human' sizes) updates
Date: Wed, 26 Dec 2012 21:25:59 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120615 Thunderbird/13.0.1

On 12/26/2012 03:28 PM, Assaf Gordon wrote:
Hello,

Pádraig Brady wrote, On 12/21/2012 12:42 PM:

I'm starting to think the original idea of having a --format option
would be a more concise, well-known, and extensible interface
than having --padding, --grouping, --base, ...
It wouldn't have to be passed directly to printf, and could
be parsed and preprocessed something like we do in seq(1).


Regarding 'format' option, there are some intricacies that are worth discussing:

1. Depending on the requested conversion, the output can be a string (e.g. 
"1.4Ki") or a long double (e.g. 1400000).

I would expect the numeric format applies to the number portion only,
so there should be little confusion there.

2. Internally, the program uses long doubles - so the real format is "%Lf" - regardless 
of what the user will give (e.g. "%f").

Yes. There is a long_double_format() function in seq.c that converts from
specified format to L.. internally.
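
As a rough illustration of the conversion mentioned above, here is a minimal
Python sketch that rewrites a user-supplied floating-point directive so that it
carries the 'L' length modifier. The name to_long_double_format and the regular
expression are assumptions made for this example; this is not the code from
seq.c, and it does not handle a literal "%%".

```python
import re

# Hypothetical pattern for one floating-point directive:
# flags, optional width, optional precision, then the conversion character.
_DIRECTIVE = re.compile(
    r"%(?P<flags>[-+ #0']*)(?P<width>\d*)(?P<prec>(?:\.\d*)?)(?P<conv>[aAeEfFgG])"
)


def to_long_double_format(fmt: str) -> str:
    """Return fmt with 'L' inserted before the conversion character."""
    match = _DIRECTIVE.search(fmt)
    if match is None:
        raise ValueError(f"no floating-point directive in {fmt!r}")
    pos = match.start("conv")
    # Text before and after the directive is preserved as-is.
    return fmt[:pos] + "L" + fmt[pos:]


print(to_long_double_format("%f"))
print(to_long_double_format("%0'14.5f"))
```

With this sketch, "%f" becomes "%Lf", and any text around the directive (such
as the brackets in "[%g]") is left untouched.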

3. printf accepts all sorts of options, some of which aren't relevant to 
numfmt, or only relevant when printing non-humanized values.
e.g.:
     $ LC_ALL=en_US.utf8 seq -f "%0'14.5f" 1000 1001
     0001,000.00000
     0001,001.00000

4. The assumption was that humanized numbers are always at most 4 characters
in SI/IEC (e.g. "1024" or "4.5M") or 5 characters with "iec-i" (e.g. "999Ti").
With the new 'format', if given "%'2.9f", should the output still be
4 characters (e.g. "4.5T"), or respect the ".9" precision (e.g. "4.500000000T")?
And does the suffix character count in the "2.9" format?

I would expect that specifying a precision after the '.' with
--to={si,iec,iec-i} would override any default precision, and that the
field width would be adjusted accordingly.
The suffix character would count toward the field/padding width rather
than the precision (similarly to the first point, that the format only
directly applies to the numeric portion).
So:

$ printf "%s\n" 94500 95000 | numfmt --to=iec
 94.5K
 95.0K
$ printf "%s\n" 94500 95000 | numfmt --to=iec-i
 94.5Ki
 95.0Ki
$ printf "%s\n" 94500 95000 | numfmt --to=iec-i --format='[%f]' # Keep default of %7.1Lf for --to=iec-i
[ 94.5Ki]
[ 95.0Ki]
$ printf "%s\n" 94500 95000 | numfmt --to=iec-i --format='[%g]' # %g -> %7Lg for --to=iec-i (adjusted for padding)
[ 94.5Ki]
[ 95  Ki]
$ printf "%s\n" 94500 95000 | numfmt --to=iec-i --format='[%10.f]' # Any padding or precision overrides defaults
[      94Ki]
[      95Ki]
$ printf "%s\n" 94500 95000 | numfmt --to=iec-i --format='%.f ' # Common would be to strip decimals and add a space
 94 Ki
 95 Ki
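
The width and precision rule proposed above (precision applies to the number
only, while the unit suffix counts toward the field width) can be sketched in
Python as follows. The function render and its parameters are hypothetical
names for this example; the value is assumed to be already scaled, and the
width of 7 for iec-i is taken from the "%7.1Lf" default mentioned in the
examples.

```python
def render(scaled: float, suffix: str, width: int = 0, precision: int = 1,
           prefix: str = "", trailer: str = "") -> str:
    """Format an already-scaled value followed by its unit suffix."""
    # The precision affects the numeric portion only.
    number = f"{scaled:.{precision}f}"
    # The suffix is appended before padding, so it counts toward the width.
    field = (number + suffix).rjust(width)
    # Literal text around the directive is added outside the padded field.
    return prefix + field + trailer


print(render(94.5, "Ki", width=7, prefix="[", trailer="]"))
print(render(95.0, "Ki", width=10, precision=0, prefix="[", trailer="]"))
```

Appending the suffix before padding is the design choice that keeps columns
aligned whether the suffix is one character ("K") or two ("Ki").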

My preference is to keep things simple, and accept just a limited subset of the 
"format" syntax:
1. grouping (the ' character)
2. padding (the number after '%' and before the 'f')
3. alignment (optional '-' after '%')
4. Any prefix/suffix before/after the '%' option.
5. Accept just "%f", but internally treat it as '%s' or '%Lf', depending on the 
output.

The above are consistent with my examples I think.

All other options will be silently ignored, or trigger errors.

Example:
     $ numfmt --format "xx%20fxx" --to=si 5000
     [[ internally, treats as "--padding 20" ]]
     xx                5.0Kxx

     $ numfmt --format "xx%'-10fxx" 5000
     [[ internally, treats as "--padding -10 --grouping" ]]
     xx5,000     xx

     $ numfmt --format "xx%0#'+010llfxx" 5000
     [[ reject as 'too complicated' / unsupported printf options ]]
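
A parser for the limited subset listed above might look like the following
Python sketch. FormatSpec, parse_format, and the regular expression are all
hypothetical names for this example, and a negative padding stands for left
alignment, mirroring the "--padding -10" notation used in the examples. This
version raises an error for anything outside the subset instead of silently
ignoring it.

```python
import re
from dataclasses import dataclass


@dataclass
class FormatSpec:
    prefix: str     # literal text before the directive
    suffix: str     # literal text after the directive
    padding: int    # field width; negative means left-aligned
    grouping: bool  # True when the ' flag was given


# Only "'" and "-" flags, an optional width without a leading zero, and "f".
_SUBSET = re.compile(
    r"(?P<prefix>[^%]*)%(?P<flags>['-]*)(?P<width>(?:[1-9]\d*)?)f(?P<suffix>[^%]*)"
)


def parse_format(fmt: str) -> FormatSpec:
    """Parse the restricted format, rejecting everything else."""
    match = _SUBSET.fullmatch(fmt)
    if match is None:
        raise ValueError(f"unsupported format: {fmt!r}")
    flags = match["flags"]
    width = int(match["width"] or 0)
    if "-" in flags:
        width = -width
    return FormatSpec(match["prefix"], match["suffix"], width, "'" in flags)


print(parse_format("xx%20fxx"))
print(parse_format("xx%'-10fxx"))
```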

Too complicated/unsupported formats might instead completely override our
defaults, padding, etc., and be passed through to printf?
That is, follow the "it's better to ask forgiveness than permission" idiom
and rely on as much of the underlying logic as possible.

$ echo 1111 | numfmt --format="xx%0#'+010llfxx\n"
xx+1,111.000000xx
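
The "ask forgiveness" approach can be sketched in Python as below: hand the
format to the underlying formatter and report an error only if that formatter
rejects it. Python's % operator stands in for C printf here purely as an
analogy; it supports a different set of flags (for example, it rejects the '
grouping flag), and format_with_fallback is a hypothetical name.

```python
def format_with_fallback(fmt: str, value: float) -> str:
    """Try the underlying formatter first; complain only if it fails."""
    try:
        # No validation up front: the formatter itself decides what is valid.
        return fmt % (value,)
    except (TypeError, ValueError) as err:
        raise ValueError(f"invalid format {fmt!r}: {err}") from err


print(format_with_fallback("xx%+.2fxx", 1111))
```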

cheers,
Pádraig.


