Re: Adding humanize_number to coreutiles?
From: Pádraig Brady
Subject: Re: Adding humanize_number to coreutiles?
Date: Tue, 14 Feb 2012 01:06:18 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:6.0) Gecko/20110816 Thunderbird/6.0
On 02/13/2012 03:45 AM, Peng Yu wrote:
> 2012/2/7 Pádraig Brady <address@hidden>:
>> On 02/07/2012 03:36 AM, Peng Yu wrote:
>>> Hi,
>>>
>>> Several commands in coreutils have the -h option. I'm wondering
>>> whether anybody on the development team also thinks it is worthwhile
>>> to export it as a standalone command. If so, I'd recommend adding such
>>> a convenient command to coreutils, as I haven't found it anywhere else
>>> as a standalone command.
>>>
>>> http://siarzhuk.dyndns.org/haiku/doxygen/coreutils_2lib_2human_8c_source.html#l00154
>>
>> I've needed such functionality many times.
>> I'm thinking a printf format would be best to expose this:
>> http://lists.gnu.org/archive/html/coreutils/2011-08/msg00029.html
>>
>> %H seems like it might cause compat problems in future.
>> %{human} is more descriptive and extensible, so I'm leaning towards that.
>> Any other suggestions appreciated.
>>
>> I'll work on it this week.
>
> I'm not sure %{human} is enough for configuring all the possible ways
> of printing a humanized number. See my code below, there are binary
> and decimal humanized numbers. Also, you'd better allow a space (or
> not) between the numbers and the letters (such as 'T', 'G').
>
> Also embedding it in printf will make it hard to be found, I'd
> recommend to create a new command like humanizenumber, just as I did.
>
> /tmp$ cat `which humanizenumber.sh `
> #!/usr/bin/env bash
>
> script_name=`basename "$0" .sh`
>
> TEMP=`getopt -o hbd:sn \
>   --long help,binary,number_of_decimal_places:,space,newline \
>   -n "${script_name}.sh" -- "$@"`
>
> if [ $? != 0 ] ; then printf "Terminating...\n" >&2 ; exit 1 ; fi
>
> eval set -- "$TEMP"
>
> abspath_script=`readlink -f -e "$0"`
> script_absdir=`dirname "$abspath_script"`
>
> number_of_decimal_places=0
> while true ; do
>   case "$1" in
>     -h|--help)
>       cat "$script_absdir"/${script_name}_help.txt
>       exit
>       ;;
>     -b|--binary)
>       binary=x
>       shift
>       ;;
>     -d|--number_of_decimal_places)
>       number_of_decimal_places="$2"
>       shift 2
>       ;;
>     -s|--space)
>       space=' '
>       shift
>       ;;
>     -n|--newline)
>       newline='\n'
>       shift
>       ;;
>     --)
>       shift
>       break
>       ;;
>     *)
>       printf "Internal error!\n">&2
>       exit 1
>       ;;
>   esac
> done
>
> if [ $# -ne 0 ]
> then
>   n="$1"
>   if [ -n "$binary" ]
>   then
>     awk -v sum=$n \
>       -v space="$space" \
>       -v newline="$newline" \
>       -v number_of_decimal_places=$number_of_decimal_places '
>     BEGIN{
>       hum[1024**5]="P"
>       hum[1024**4]="T"
>       hum[1024**3]="G"
>       hum[1024**2]="M"
>       hum[1024]="K"
>       for (x=1024**5; x>=1024; x/=1024) {
>         if (sum>=x) {
>           printf "%." number_of_decimal_places "f" space "%s" newline,
>             sum/x, hum[x]
>           break
>         }
>       }
>     }'
>   else
>     awk -v sum=$n \
>       -v space="$space" \
>       -v newline="$newline" \
>       -v number_of_decimal_places=$number_of_decimal_places '
>     BEGIN{
>       hum[1000**5]="P"
>       hum[1000**4]="T"
>       hum[1000**3]="G"
>       hum[1000**2]="M"
>       hum[1000]="k"
>       for (x=1000**5; x>=1000; x/=1000) {
>         if (sum>=x) {
>           printf "%." number_of_decimal_places "f" space "%s" newline,
>             sum/x, hum[x]
>           break
>         }
>       }
>     }'
>   fi
> fi
> /tmp$ humanizenumber.sh -h
> Description:
> Humanize number(s)
>
> Usage:
> humanizenumber.sh [Options] [NUMBER]
>
> NUMBER If not specified, then do nothing.
>
> Options:
> -h|--help Help message.
> -b|--binary Default: decimal.
> -s|--space Default: nospace.
>
> Examples:
> humanizenumber.sh 456456456
> humanizenumber.sh -d 2 456456456
> humanizenumber.sh -s 456456456
> humanizenumber.sh -b 456456456
> humanizenumber.sh -n 456456456
>
> Author:
> Peng Yu <address@hidden>
Looking more at this, you might be right.

Now printf already has related formatting functionality:

  $ env LANG=fa_IR.utf8 printf "%I'd\n" 1234
  ۱٬۲۳۴

I was thinking it would be appropriate to add "human" into the mix like

  $ env LANG=fa_IR.utf8 printf "%Hd\n" 1234
  1K
  $ env LANG=fa_IR.utf8 printf "%HId\n" 1234
  ۱K

But as you say there are options for humanizing.
So would there be enough cohesive functionality one could add to such a util?
I suppose so, since one could add field processing and
multiplier support for example.

Also what to call it? humanize_number is too long I think.
Perhaps we could use a more general name. Drats I was
thinking of `numconv`, but that's taken:
http://www.unixref.com/manPages/numconv.html
Maybe `convnum`, anyway...
A tentative design could be:

convnum [OPTIONS] [NUM]...

  Numbers are processed from stdin or the command options.

  --from={auto,SI,IEC}
    If not specified, suffixes are ignored
    auto => 1K -> 1000, 1Ki -> 1024
    SI   => 1K* -> 1000
    IEC  => 1K* -> 1024

  --from-unit=<NUMBER>
    Specify the unit size.
    --from-unit=1 is implied if not specified

  --to={SI,IEC,<NUMBER>}
    Auto scale the numbers to SI (powers of 1000),
    or IEC (powers of 1024), so at most 3 digits are output.
    Note output will be standard, without a B suffix.
    I.E. 123K or 123Ki for SI and IEC respectively.
    If <NUMBER> is specified use this as the scale.

  --to-unit=<NUMBER>
    Specify the output unit size.
    --to-unit=1 is implied if not specified

  --round={ceiling, floor, nearest}
    --round=ceiling is implied if not specified

  --number-format=FORMAT
    --number-format=%d is implied if not specified
    You can use this to specify a space after the number
    You can also use this to perform grouping (with %'d)
    You can also use this to select alternative number forms (with %Id) etc.

  --suffix=SUFFIX
    Example --suffix=B

  --field=NUM
    replace the number in the portion of the line delimited by whitespace.
    If the new number is narrower, then pad to the same field width with spaces.
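To make the --to and --round defaults above a bit more concrete, here is a rough sketch in shell and awk. Nothing here exists yet: `to_human` is a hypothetical name, only the SI and IEC modes are covered, only the implied --round=ceiling is implemented, and the output uses the implied %d format with no B suffix. It divides by the base until the value drops below it, so in IEC mode values from 1000 to 1023 still print four digits; treat that as an open question for the real design rather than a decision.

```shell
# Hypothetical helper (not part of coreutils): auto-scale a non-negative
# integer to SI (powers of 1000) or IEC (powers of 1024), rounding up.
# Usage: to_human NUMBER [SI|IEC]
to_human() {
  awk -v n="$1" -v mode="${2:-SI}" 'BEGIN {
    base = (mode == "IEC") ? 1024 : 1000
    tag  = (mode == "IEC") ? "i" : ""       # IEC output looks like 123Ki
    split("K M G T P E", suf, " ")
    i = 0
    # Divide until the value fits below one unit of the next prefix.
    while (n >= base && i < 6) { n /= base; i++ }
    # Ceiling: bump the integer part if anything was truncated.
    r = int(n)
    if (r < n) r++
    # Rounding up can reach the base itself, e.g. 999501 -> 1000K -> 1M.
    if (r >= base && i < 6) { r = 1; i++ }
    if (i == 0) printf "%d\n", r
    else        printf "%d%s%s\n", r, suf[i], tag
  }'
}
```

With this sketch, `to_human 1234` prints 2K (rounded up), `to_human 999` prints 999, and `to_human 1024 IEC` prints 1Ki.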
<NUMBER>s specified above can be numeric
with an optional suffix, like K, Ki.
(Note K should probably be SI here, unlike other coreutils).
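As an illustration of the suffix handling, the following is a minimal sketch of the --from=auto rule (a bare suffix means powers of 1000, a suffix followed by "i" means powers of 1024). The name `from_auto` is hypothetical, and the accepted suffix set (K, M, G, T, P, E, upper case only, integers only) is an assumption made for the example, not something settled in this thread.

```shell
# Hypothetical helper (not part of coreutils): expand NUMBER[SUFFIX[i]]
# to a plain integer following the proposed --from=auto semantics.
# Usage: from_auto 1K   or   from_auto 1Ki
from_auto() {
  awk -v s="$1" 'BEGIN {
    base = 1000
    iec = 0
    # A trailing "i" selects powers of 1024.
    if (s ~ /i$/) { base = 1024; iec = 1; sub(/i$/, "", s) }
    # Position of the last character in the suffix list gives the exponent.
    p = index("KMGTPE", substr(s, length(s)))
    if (p > 0) s = substr(s, 1, length(s) - 1)
    # Reject non-integers, and an "i" with no prefix letter (e.g. "5i").
    if (s !~ /^[0-9]+$/ || (iec && p == 0)) {
      print "invalid number" > "/dev/stderr"
      exit 1
    }
    printf "%.0f\n", s * base ^ p
  }'
}
```

Here `from_auto 1K` prints 1000 and `from_auto 1Ki` prints 1024, while a number without a suffix is passed through unchanged.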
cheers,
Pádraig