@c Language: Brazilian Portuguese, Encoding: iso-8859-1
@c /descriptive.texi/1.8/Mon Jul 24 10:59:45 2006//
@menu
* Introduction to descriptive::
* Definitions for data manipulation::
* Definitions for descriptive statistics::
* Definitions for specific multivariate descriptive statistics::
* Definitions for statistical graphs::
@end menu

@node Introduction to descriptive, Definitions for data manipulation, descriptive, descriptive
@section Introduction to descriptive

Package @code{descriptive} contains a set of functions for making descriptive statistical computations and graphing. Together with the source code there are three data sets in your Maxima tree: @code{pidigits.data}, @code{wind.data} and @code{biomed.data}. They can also be downloaded from the web site @code{www.biomates.net}.

Any statistics manual can be used as a reference to the functions in package @code{descriptive}.

For comments, bugs or suggestions, please contact me at @var{'mario AT edu DOT xunta DOT es'}.

Here is a simple example of how the functions in @code{descriptive} work, depending on whether their argument is a list or a matrix,

@c ===beg===
@c load (descriptive)$
@c /* univariate sample */   mean ([a, b, c]);
@c matrix ([a, b], [c, d], [e, f]);
@c /* multivariate sample */ mean (%);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) /* univariate sample */   mean ([a, b, c]);
                            c + b + a
(%o2)                       ---------
                                3
(%i3) matrix ([a, b], [c, d], [e, f]);
                            [ a  b ]
                            [      ]
(%o3)                       [ c  d ]
                            [      ]
                            [ e  f ]
(%i4) /* multivariate sample */ mean (%);
                      e + c + a  f + d + b
(%o4)                [---------, ---------]
                          3          3
@end example

Note that in multivariate samples the mean is calculated for each column.

When there are several samples, possibly of different sizes, the Maxima function @code{map} can be used to get the desired result for each sample,

@c ===beg===
@c load (descriptive)$
@c map (mean, [[a, b, c], [d, e]]);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) map (mean, [[a, b, c], [d, e]]);
                        c + b + a  e + d
(%o2)                  [---------, -----]
                            3        2
@end example

In this case, two samples of sizes 3 and 2 were stored into a list.

Univariate samples must be stored in lists like

@c ===beg===
@c s1 : [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5];
@c ===end===
@example
(%i1) s1 : [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5];
(%o1)           [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
@end example

and multivariate samples in matrices as in

@c ===beg===
@c s2 : matrix ([13.17, 9.29], [14.71, 16.88], [18.50, 16.88],
@c              [10.58, 6.63], [13.33, 13.25], [13.21,  8.12]);
@c ===end===
@example
(%i1) s2 : matrix ([13.17, 9.29], [14.71, 16.88], [18.50, 16.88],
             [10.58, 6.63], [13.33, 13.25], [13.21,  8.12]);
                        [ 13.17  9.29  ]
                        [              ]
                        [ 14.71  16.88 ]
                        [              ]
                        [ 18.5   16.88 ]
(%o1)                   [              ]
                        [ 10.58  6.63  ]
                        [              ]
                        [ 13.33  13.25 ]
                        [              ]
                        [ 13.21  8.12  ]
@end example

In this case, the number of columns equals the random variable dimension and the number of rows is the sample size.

Data can be entered by hand, but large samples are usually stored in plain text files. For example, file @code{pidigits.data} contains the first 100 digits of the number @code{%pi}:
@example
      3
      1
      4
      1
      5
      9
      2
      6
      5
      3 ...
@end example

In order to load these digits in Maxima,

@c ===beg===
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c length (s1);
@c ===end===
@example
(%i1) load (numericalio)$
(%i2) s1 : read_list (file_search ("pidigits.data"))$
(%i3) length (s1);
(%o3)                          100
@end example

On the other hand, file @code{wind.data} contains daily average wind speeds at 5 meteorological stations in the Republic of Ireland (this is part of a data set taken at 12 meteorological stations; the original file is freely downloadable from the StatLib Data Repository and its analysis is discussed in Haslett, J., Raftery, A. E. (1989) @var{Space-time Modelling with Long-memory Dependence: Assessing Ireland's Wind Power Resource, with Discussion}. Applied Statistics 38, 1-50). This loads the data:

@c ===beg===
@c load (numericalio)$
@c s2 : read_matrix (file_search ("wind.data"))$
@c length (s2);
@c s2 [%]; /* last record */
@c ===end===
@example
(%i1) load (numericalio)$
(%i2) s2 : read_matrix (file_search ("wind.data"))$
(%i3) length (s2);
(%o3)                          100
(%i4) s2 [%]; /* last record */
(%o4)            [3.58, 6.0, 4.58, 7.62, 11.25]
@end example

Some samples contain non-numeric data. As an example, file @code{biomed.data} (which is part of a larger file downloaded from the StatLib Data Repository) contains four blood measures taken from two groups of patients, @code{A} and @code{B}, of different ages,

@c ===beg===
@c load (numericalio)$
@c s3 : read_matrix (file_search ("biomed.data"))$
@c length (s3);
@c s3 [1]; /* first record */
@c ===end===
@example
(%i1) load (numericalio)$
(%i2) s3 : read_matrix (file_search ("biomed.data"))$
(%i3) length (s3);
(%o3)                          100
(%i4) s3 [1]; /* first record */
(%o4)            [A, 30, 167.0, 89.0, 25.6, 364]
@end example

The first individual belongs to group @code{A}, is 30 years old and his/her blood measures were 167.0, 89.0, 25.6 and 364.

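Care must be taken when working with categorical data. In the next example, symbol @code{a} was assigned a value at some earlier point, and a sample containing the categorical value @code{a} is entered afterwards. Because Maxima evaluates @code{a}, the category is silently replaced by the number 1,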

@c ===beg===
@c a : 1$
@c matrix ([a, 3], [b, 5]);
@c ===end===
@example
(%i1) a : 1$
(%i2) matrix ([a, 3], [b, 5]);
                            [ 1  3 ]
(%o2)                       [      ]
                            [ b  5 ]
@end example
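
One way to guard against this, sketched here on the assumption that the labels are typed in by hand, is to quote the symbol so that it is not evaluated, or to remove its value with @code{kill} before entering the sample:

@example
a : 1$
/* quoting prevents evaluation, so the label a is kept */
matrix (['a, 3], [b, 5]);
/* alternatively, unbind a first and enter the sample as usual */
kill (a)$
matrix ([a, 3], [b, 5]);
@end example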


@node Definitions for data manipulation, Definitions for descriptive statistics, Introduction to descriptive, descriptive
@section Definitions for data manipulation

@deffn {Function} continuous_freq (@var{list})
@deffnx {Function} continuous_freq (@var{list}, @var{m})
The argument of @code{continuous_freq} must be a list of numbers. The numbers are grouped into intervals, and the function counts how many of them fall in each interval. An optional second argument sets the number of classes; the default is 10,

@c ===beg===
@c load (numericalio)$
@c load (descriptive)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c continuous_freq (s1, 5);
@c ===end===
@example
(%i1) load (numericalio)$
(%i2) load (descriptive)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) continuous_freq (s1, 5);
(%o4) [[0, 1.8, 3.6, 5.4, 7.2, 9.0], [16, 24, 18, 17, 25]]
@end example

The first list contains the interval limits and the second the corresponding counts: there are 16 digits inside the interval @code{[0, 1.8]}, that is 0's and 1's, 24 digits in @code{(1.8, 3.6]}, that is 2's and 3's, and so on.
@end deffn


@deffn {Function} discrete_freq (@var{list})
Counts absolute frequencies in discrete samples, both numeric and categorical. Its only argument is a list,

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"));
@c discrete_freq (s1);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"));
(%o3) [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3, 2, 3, 8, 
4, 6, 2, 6, 4, 3, 3, 8, 3, 2, 7, 9, 5, 0, 2, 8, 8, 4, 1, 9, 7, 
1, 6, 9, 3, 9, 9, 3, 7, 5, 1, 0, 5, 8, 2, 0, 9, 7, 4, 9, 4, 4, 
5, 9, 2, 3, 0, 7, 8, 1, 6, 4, 0, 6, 2, 8, 6, 2, 0, 8, 9, 9, 8, 
6, 2, 8, 0, 3, 4, 8, 2, 5, 3, 4, 2, 1, 1, 7, 0, 6, 7]
(%i4) discrete_freq (s1);
(%o4) [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 
                             [8, 8, 12, 12, 10, 8, 9, 8, 12, 13]]
@end example

The first list gives the sample values and the second their absolute frequencies. Commands @code{? col} and @code{? transpose} should help you to understand the last input.
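
Functions @code{col} and @code{transpose} become useful when the categorical sample is one column of a data matrix, because @code{discrete_freq} expects a list. The following is only a sketch, assuming that @code{biomed.data} is loaded as in the introduction; it counts the patients in each group:

@example
load (descriptive)$
load (numericalio)$
s3 : read_matrix (file_search ("biomed.data"))$
/* col extracts the first column as a one-column matrix; transposing it
   gives a one-row matrix, whose first row is the plain list needed here */
discrete_freq (transpose (col (s3, 1))[1]);
@end example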
@end deffn


@deffn {Function} subsample (@var{data_matrix}, @var{logical_expression})
@deffnx {Function} subsample (@var{data_matrix}, @var{logical_expression}, @var{col_num}, @var{col_num}, ...)
This is a sort of variation of the Maxima @code{submatrix} function. The first argument is the name of the data matrix, the second is a quoted logical expression and optional additional arguments are the numbers of the columns to be taken. Its behaviour is better understood with examples,

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s2 : read_matrix (file_search ("wind.data"))$
@c subsample (s2, '(%c[1] > 18));
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s2 : read_matrix (file_search ("wind.data"))$
(%i4) subsample (s2, '(%c[1] > 18));
              [ 19.38  15.37  15.12  23.09  25.25 ]
              [                                   ]
              [ 18.29  18.66  19.08  26.08  27.63 ]
(%o4)         [                                   ]
              [ 20.25  21.46  19.95  27.71  23.38 ]
              [                                   ]
              [ 18.79  18.96  14.46  26.38  21.84 ]
@end example

These are the multivariate records in which the wind speed at the first meteorological station was greater than 18. Note that in the quoted logical expression the @var{i}-th component is referred to as @code{%c[i]}. Since symbol @code{%c[i]} is used inside function @code{subsample}, using @code{%c} as a category name in the data confuses Maxima. In the following example, we request only the first, second and fifth components of those records with wind speeds greater than or equal to 16 at station number 1 and less than 25 knots at station number 4,

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s2 : read_matrix (file_search ("wind.data"))$
@c subsample (s2, '(%c[1] >= 16 and %c[4] < 25), 1, 2, 5);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s2 : read_matrix (file_search ("wind.data"))$
(%i4) subsample (s2, '(%c[1] >= 16 and %c[4] < 25), 1, 2, 5);
                     [ 19.38  15.37  25.25 ]
                     [                     ]
                     [ 17.33  14.67  19.58 ]
(%o4)                [                     ]
                     [ 16.92  13.21  21.21 ]
                     [                     ]
                     [ 17.25  18.46  23.87 ]
@end example

Here is an example with the categorical variables of @code{biomed.data}. We want the records corresponding to those patients in group @code{B} who are older than 38 years,

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s3 : read_matrix (file_search ("biomed.data"))$
@c subsample (s3, '(%c[1] = B and %c[2] > 38));
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s3 : read_matrix (file_search ("biomed.data"))$
(%i4) subsample (s3, '(%c[1] = B and %c[2] > 38));
                [ B  39  28.0  102.3  17.1  146 ]
                [                               ]
                [ B  39  21.0  92.4   10.3  197 ]
                [                               ]
                [ B  39  23.0  111.5  10.0  133 ]
                [                               ]
                [ B  39  26.0  92.6   12.3  196 ]
(%o4)           [                               ]
                [ B  39  25.0  98.7   10.0  174 ]
                [                               ]
                [ B  39  21.0  93.2   5.9   181 ]
                [                               ]
                [ B  39  18.0  95.0   11.3  66  ]
                [                               ]
                [ B  39  39.0  88.5   7.6   168 ]
@end example


Probably, the statistical analysis will involve only the blood measures,

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s3 : read_matrix (file_search ("biomed.data"))$
@c subsample (s3, '(%c[1] = B and %c[2] > 38), 3, 4, 5, 6);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s3 : read_matrix (file_search ("biomed.data"))$
(%i4) subsample (s3, '(%c[1] = B and %c[2] > 38), 3, 4, 5, 6);
                   [ 28.0  102.3  17.1  146 ]
                   [                        ]
                   [ 21.0  92.4   10.3  197 ]
                   [                        ]
                   [ 23.0  111.5  10.0  133 ]
                   [                        ]
                   [ 26.0  92.6   12.3  196 ]
(%o4)              [                        ]
                   [ 25.0  98.7   10.0  174 ]
                   [                        ]
                   [ 21.0  93.2   5.9   181 ]
                   [                        ]
                   [ 18.0  95.0   11.3  66  ]
                   [                        ]
                   [ 39.0  88.5   7.6   168 ]
@end example


This is the multivariate mean of @code{s3},

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s3 : read_matrix (file_search ("biomed.data"))$
@c mean (s3);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s3 : read_matrix (file_search ("biomed.data"))$
(%i4) mean (s3);
       65 B + 35 A  317          6 NA + 8145.0
(%o4) [-----------, ---, 87.178, -------------, 18.123, 
           100      10                100
                                                    3 NA + 19587
                                                    ------------]
                                                        100
@end example
Here, the first component is meaningless, since @code{A} and @code{B} are categorical; the second component is the mean age of the individuals in rational form; and the fourth and last values exhibit some strange behaviour. This is because symbol @code{NA} is used here to indicate @var{not available} data, so these two means are of course nonsense. A possible solution is to remove from the matrix those rows containing @code{NA} symbols, although this entails some loss of information,

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s3 : read_matrix (file_search ("biomed.data"))$
@c mean (subsample (s3, '(%c[4] # NA and %c[6] # NA), 3, 4, 5, 6));
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s3 : read_matrix (file_search ("biomed.data"))$
(%i4) mean (subsample (s3, '(%c[4] # NA and %c[6] # NA), 3, 4, 5, 6));
(%o4) [79.4923076923077, 86.2032967032967, 16.93186813186813, 
                                                            2514
                                                            ----]
                                                             13
@end example
@end deffn


@node Definitions for descriptive statistics, Definitions for specific multivariate descriptive statistics, Definitions for data manipulation, descriptive
@section Definitions for descriptive statistics


@deffn {Function} mean (@var{list})
@deffnx {Function} mean (@var{matrix})
This is the sample mean, defined as
@ifhtml
@example
                       n
                     ====
             _   1   \
             x = -    >    x
                 n   /      i
                     ====
                     i = 1
@end example
@end ifhtml
@ifinfo
@example
                       n
                     ====
             _   1   \
             x = -    >    x
                 n   /      i
                     ====
                     i = 1
@end example
@end ifinfo
@tex
$${\bar{x}={1\over{n}}{\sum_{i=1}^{n}{x_{i}}}}$$
@end tex

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c mean (s1);
@c %, numer;
@c s2 : read_matrix (file_search ("wind.data"))$
@c mean (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) mean (s1);
                               471
(%o4)                          ---
                               100
(%i5) %, numer;
(%o5)                         4.71
(%i6) s2 : read_matrix (file_search ("wind.data"))$
(%i7) mean (s2);
(%o7)     [9.9485, 10.1607, 10.8685, 15.7166, 14.8441]
@end example
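
As a quick check of the definition, the mean can also be written out as the sum of the observations divided by the sample size. The list @code{x} below is a made-up illustrative sample, not one of the data files; both expressions should give the same value, mathematically 5/2.

@example
load (descriptive)$
x : [1, 2, 3, 4]$
/* sum of the observations divided by the sample size */
apply ("+", x) / length (x);
mean (x);
@end example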
@end deffn


@deffn {Function} var (@var{list})
@deffnx {Function} var (@var{matrix})
This is the sample variance, defined as
@ifhtml
@example
                     n
                   ====
           2   1   \          _ 2
          s  = -    >    (x - x)
               n   /       i
                   ====
                   i = 1
@end example
@end ifhtml
@ifinfo
@example
                     n
                   ====
           2   1   \          _ 2
          s  = -    >    (x - x)
               n   /       i
                   ====
                   i = 1
@end example
@end ifinfo
@tex
$${{1}\over{n}}{\sum_{i=1}^{n}{(x_{i}-\bar{x})^2}}$$
@end tex

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c var (s1), numer;
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) var (s1), numer;
(%o4)                   8.425899999999999
@end example

See also function @code{var1}.
@end deffn


@deffn {Function} var1 (@var{list})
@deffnx {Function} var1 (@var{matrix})
This is the sample variance with denominator @math{n-1}, defined as
@ifhtml
@example
                     n
                   ====
               1   \          _ 2
              ---   >    (x - x)
              n-1  /       i
                   ====
                   i = 1
@end example
@end ifhtml
@ifinfo
@example
                     n
                   ====
               1   \          _ 2
              ---   >    (x - x)
              n-1  /       i
                   ====
                   i = 1
@end example
@end ifinfo
@tex
$${{1\over{n-1}}{\sum_{i=1}^{n}{(x_{i}-\bar{x})^2}}}$$
@end tex

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c var1 (s1), numer;
@c s2 : read_matrix (file_search ("wind.data"))$
@c var1 (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) var1 (s1), numer;
(%o4)                    8.5110101010101
(%i5) s2 : read_matrix (file_search ("wind.data"))$
(%i6) var1 (s2);
(%o6) [17.39586540404041, 15.13912778787879, 15.63204924242424, 
                            32.50152569696971, 24.66977392929294]
@end example

See also function @code{var}.
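
The two variances differ only in their denominator, so multiplying @code{var} by @math{n/(n-1)} should reproduce @code{var1}. A minimal sketch with a made-up sample @code{x}, for which the formulas above give 5/4 and 5/3:

@example
load (descriptive)$
x : [1, 2, 3, 4]$
n : length (x)$
/* var divides by n, var1 by n - 1 */
var (x);
var1 (x);
/* rescaling var by n/(n - 1) should reproduce var1 */
var (x) * n / (n - 1);
@end example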
@end deffn


@deffn {Function} std (@var{list})
@deffnx {Function} std (@var{matrix})
This is the square root of function @code{var}, the variance with denominator @math{n}.

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c std (s1), numer;
@c s2 : read_matrix (file_search ("wind.data"))$
@c std (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) std (s1), numer;
(%o4)                   2.902740084816414
(%i5) s2 : read_matrix (file_search ("wind.data"))$
(%i6) std (s2);
(%o6) [4.149928523480858, 3.871399812729241, 3.933920277534866, 
                            5.672434260526957, 4.941970881136392]
@end example

See also functions @code{var} and @code{std1}.
@end deffn


@deffn {Function} std1 (@var{list})
@deffnx {Function} std1 (@var{matrix})
This is the square root of function @code{var1}, the variance with denominator @math{n-1}.

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c std1 (s1), numer;
@c s2 : read_matrix (file_search ("wind.data"))$
@c std1 (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) std1 (s1), numer;
(%o4)                   2.917363553109228
(%i5) s2 : read_matrix (file_search ("wind.data"))$
(%i6) std1 (s2);
(%o6) [4.17083509672109, 3.89090320978032, 3.953738641137555, 
                            5.701010936401517, 4.966867617451963]
@end example

See also functions @code{var1} and @code{std}.
@end deffn


@deffn {Function} noncentral_moment (@var{list}, @var{k})
@deffnx {Function} noncentral_moment (@var{matrix}, @var{k})
The non central moment of order @math{k}, defined as
@ifhtml
@example
                       n
                     ====
                 1   \      k
                 -    >    x
                 n   /      i
                     ====
                     i = 1
@end example
@end ifhtml
@ifinfo
@example
                       n
                     ====
                 1   \      k
                 -    >    x
                 n   /      i
                     ====
                     i = 1
@end example
@end ifinfo
@tex
$${{1\over{n}}{\sum_{i=1}^{n}{x_{i}^k}}}$$
@end tex

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c noncentral_moment (s1, 1), numer; /* the mean */
@c s2 : read_matrix (file_search ("wind.data"))$
@c noncentral_moment (s2, 5);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) noncentral_moment (s1, 1), numer; /* the mean */
(%o4)                         4.71
(%i6) s2 : read_matrix (file_search ("wind.data"))$
(%i7) noncentral_moment (s2, 5);
(%o7) [319793.8724761506, 320532.1923892463, 391249.5621381556, 
                            2502278.205988911, 1691881.797742255]
@end example

See also function @code{central_moment}.
@end deffn


@deffn {Function} central_moment (@var{list}, @var{k})
@deffnx {Function} central_moment (@var{matrix}, @var{k})
The central moment of order @math{k}, defined as
@ifhtml
@example
                    n
                  ====
              1   \          _ k
              -    >    (x - x)
              n   /       i
                  ====
                  i = 1
@end example
@end ifhtml
@ifinfo
@example
                    n
                  ====
              1   \          _ k
              -    >    (x - x)
              n   /       i
                  ====
                  i = 1
@end example
@end ifinfo
@tex
$${{1\over{n}}{\sum_{i=1}^{n}{(x_{i}-\bar{x})^k}}}$$
@end tex

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c central_moment (s1, 2), numer; /* the variance */
@c s2 : read_matrix (file_search ("wind.data"))$
@c central_moment (s2, 3);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) central_moment (s1, 2), numer; /* the variance */
(%o4)                   8.425899999999999
(%i6) s2 : read_matrix (file_search ("wind.data"))$
(%i7) central_moment (s2, 3);
(%o7) [11.29584771375004, 16.97988248298583, 5.626661952750102, 
                             37.5986572057918, 25.85981904394192]
@end example

See also functions @code{noncentral_moment} and @code{mean}.
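
The second central moment can also be obtained from the noncentral moments as @math{m_2 - m_1^2}. The sketch below uses a made-up sample @code{x}; both expressions should agree, mathematically 5/4 for this sample.

@example
load (descriptive)$
x : [1, 2, 3, 4]$
/* second central moment, the variance with denominator n */
central_moment (x, 2);
/* the same quantity from the noncentral moments of orders 2 and 1 */
noncentral_moment (x, 2) - noncentral_moment (x, 1)^2;
@end example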
@end deffn


@deffn {Function} cv (@var{list})
@deffnx {Function} cv (@var{matrix})
The variation coefficient is the quotient between the sample standard deviation (@code{std}) and the @code{mean},

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c cv (s1), numer;
@c s2 : read_matrix (file_search ("wind.data"))$
@c cv (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) cv (s1), numer;
(%o4)                   .6193977819764815
(%i5) s2 : read_matrix (file_search ("wind.data"))$
(%i6) cv (s2);
(%o6) [.4192426091090204, .3829365309260502, 0.363779605385983, 
                            .3627381836021478, .3346021393989506]
@end example

See also functions @code{std} and @code{mean}.
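
The quotient can be written out explicitly to see what @code{cv} computes. This is a sketch with a made-up sample @code{x}; the two results should be the same.

@example
load (descriptive)$
x : [1, 2, 3, 4]$
cv (x);
/* standard deviation with denominator n, divided by the mean */
std (x) / mean (x);
@end example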
@end deffn


@deffn {Function} mini (@var{list})
@deffnx {Function} mini (@var{matrix})
This is the minimum value of the sample @var{list},

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c mini (s1);
@c s2 : read_matrix (file_search ("wind.data"))$
@c mini (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) mini (s1);
(%o4)                           0
(%i5) s2 : read_matrix (file_search ("wind.data"))$
(%i6) mini (s2);
(%o6)             [0.58, 0.5, 2.67, 5.25, 5.17]
@end example

See also function @code{maxi}.
@end deffn


@deffn {Function} maxi (@var{list})
@deffnx {Function} maxi (@var{matrix})
This is the maximum value of the sample @var{list},

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c maxi (s1);
@c s2 : read_matrix (file_search ("wind.data"))$
@c maxi (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) maxi (s1);
(%o4)                           9
(%i5) s2 : read_matrix (file_search ("wind.data"))$
(%i6) maxi (s2);
(%o6)          [20.25, 21.46, 20.04, 29.63, 27.63]
@end example

See also function @code{mini}.
@end deffn


@deffn {Function} range (@var{list})
@deffnx {Function} range (@var{matrix})
The range is the difference between the extreme values.

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c range (s1);
@c s2 : read_matrix (file_search ("wind.data"))$
@c range (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) range (s1);
(%o4)                           9
(%i5) s2 : read_matrix (file_search ("wind.data"))$
(%i6) range (s2);
(%o6)          [19.67, 20.96, 17.37, 24.38, 22.46]
@end example
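
Equivalently, the range can be computed from the functions @code{maxi} and @code{mini} documented above. A small sketch with a made-up list @code{x}, whose extreme values are 9 and 1:

@example
load (descriptive)$
x : [3, 1, 4, 1, 5, 9, 2, 6]$
range (x);
/* difference between the largest and smallest observations */
maxi (x) - mini (x);
@end example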
@end deffn


@deffn {Function} quantile (@var{list}, @var{p})
@deffnx {Function} quantile (@var{matrix}, @var{p})
This is the @var{p}-quantile, with @var{p} a number in @math{[0, 1]}, of the sample @var{list}.
Although there are several definitions for the sample quantile (Hyndman, R. J., Fan, Y. (1996) @var{Sample quantiles in statistical packages}. American Statistician, 50, 361-365), the one based on linear interpolation is implemented in package @code{descriptive}.

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c /* 1st and 3rd quartiles */ [quantile (s1, 1/4), quantile (s1, 3/4)], numer;
@c s2 : read_matrix (file_search ("wind.data"))$
@c quantile (s2, 1/4);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) /* 1st and 3rd quartiles */ [quantile (s1, 1/4), quantile (s1, 3/4)], numer;
(%o4)                      [2.0, 7.25]
(%i5) s2 : read_matrix (file_search ("wind.data"))$
(%i6) quantile (s2, 1/4);
(%o6)    [7.2575, 7.477500000000001, 7.82, 11.28, 11.48]
@end example
@end deffn


@deffn {Function} median (@var{list})
@deffnx {Function} median (@var{matrix})
Once the sample is ordered, if the sample size is odd the median is the central value, otherwise it is the mean of the two central values.

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c median (s1);
@c s2 : read_matrix (file_search ("wind.data"))$
@c median (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) median (s1);
                                9
(%o4)                           -
                                2
(%i5) s2 : read_matrix (file_search ("wind.data"))$
(%i6) median (s2);
(%o6)         [10.06, 9.855, 10.73, 15.48, 14.105]
@end example

The median is the 1/2-quantile.

See also function @code{quantile}.
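
The relation between the median and the quantile of order 1/2 can be checked directly. In this sketch @code{x} is a made-up sample of even size, so the median is the mean of the two central values of the ordered sample, 3 and 4:

@example
load (descriptive)$
x : [3, 1, 4, 1, 5, 9, 2, 6]$
median (x);
/* the quantile of order 1/2 should give the same value */
quantile (x, 1/2);
@end example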
@end deffn


@deffn {Function} qrange (@var{list})
@deffnx {Function} qrange (@var{matrix})
The interquartile range is the difference between the third and first quartiles, @code{quantile(list,3/4) - quantile(list,1/4)},

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c qrange (s1);
@c s2 : read_matrix (file_search ("wind.data"))$
@c qrange (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) qrange (s1);
                               21
(%o4)                          --
                               4
(%i5) s2 : read_matrix (file_search ("wind.data"))$
(%i6) qrange (s2);
(%o6) [5.385, 5.572499999999998, 6.0225, 8.729999999999999, 
                                               6.650000000000002]
@end example

See also function @code{quantile}.
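
The definition above can be reproduced with two calls to @code{quantile}. This is only a sketch on a made-up sample @code{x}; both expressions should give the same value.

@example
load (descriptive)$
x : [3, 1, 4, 1, 5, 9, 2, 6]$
qrange (x);
/* third quartile minus first quartile */
quantile (x, 3/4) - quantile (x, 1/4);
@end example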
@end deffn


@deffn {Function} mean_deviation (@var{list})
@deffnx {Function} mean_deviation (@var{matrix})
The mean deviation, defined as
@ifhtml
@example
                     n
                   ====
               1   \          _
               -    >    |x - x|
               n   /       i
                   ====
                   i = 1
@end example
@end ifhtml
@ifinfo
@example
                     n
                   ====
               1   \          _
               -    >    |x - x|
               n   /       i
                   ====
                   i = 1
@end example
@end ifinfo
@tex
$${{1\over{n}}{\sum_{i=1}^{n}{|x_{i}-\bar{x}|}}}$$
@end tex

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c mean_deviation (s1);
@c s2 : read_matrix (file_search ("wind.data"))$
@c mean_deviation (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) mean_deviation (s1);
                               51
(%o4)                          --
                               20
(%i5) s2 : read_matrix (file_search ("wind.data"))$
(%i6) mean_deviation (s2);
(%o6) [3.287959999999999, 3.075342, 3.23907, 4.715664000000001, 
                                               4.028546000000002]
@end example

See also function @code{mean}.
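
Written out, the mean deviation is the mean of the absolute deviations from the sample mean. The sketch below uses a made-up sample @code{x} and relies on Maxima's default element-wise arithmetic between a list and a number; for this sample the formula gives 1.

@example
load (descriptive)$
x : [1, 2, 3, 4]$
mean_deviation (x);
/* subtract the mean from every observation, take absolute values,
   and average them */
mean (map (abs, x - mean (x)));
@end example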
@end deffn


@deffn {Function} median_deviation (@var{list})
@deffnx {Function} median_deviation (@var{matrix})
The median deviation, defined as
@ifhtml
@example
                 n
               ====
           1   \
           -    >    |x - med|
           n   /       i
               ====
               i = 1
@end example
@end ifhtml
@ifinfo
@example
                 n
               ====
           1   \
           -    >    |x - med|
           n   /       i
               ====
               i = 1
@end example
@end ifinfo
@tex
$${{1\over{n}}{\sum_{i=1}^{n}{|x_{i}-med|}}}$$
@end tex
where @code{med} is the median of @var{list}.

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c median_deviation (s1);
@c s2 : read_matrix (file_search ("wind.data"))$
@c median_deviation (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) median_deviation (s1);
                                5
(%o4)                           -
                                2
(%i5) s2 : read_matrix (file_search ("wind.data"))$
(%i6) median_deviation (s2);
(%o6)           [2.75, 2.755, 3.08, 4.315, 3.31]
@end example

See also function @code{median}.
@end deffn


@deffn {Function} harmonic_mean (@var{list})
@deffnx {Function} harmonic_mean (@var{matrix})
The harmonic mean, defined as
@ifhtml
@example
                  n
               --------
                n
               ====
               \     1
                >    --
               /     x
               ====   i
               i = 1
@end example
@end ifhtml
@ifinfo
@example
                  n
               --------
                n
               ====
               \     1
                >    --
               /     x
               ====   i
               i = 1
@end example
@end ifinfo
@tex
$${{n}\over{\sum_{i=1}^{n}{{{1}\over{x_{i}}}}}}$$
@end tex

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c y : [5, 7, 2, 5, 9, 5, 6, 4, 9, 2, 4, 2, 5]$
@c harmonic_mean (y), numer;
@c s2 : read_matrix (file_search ("wind.data"))$
@c harmonic_mean (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) y : [5, 7, 2, 5, 9, 5, 6, 4, 9, 2, 4, 2, 5]$
(%i4) harmonic_mean (y), numer;
(%o4)                   3.901858027632205
(%i5) s2 : read_matrix (file_search ("wind.data"))$
(%i6) harmonic_mean (s2);
(%o6) [6.948015590052786, 7.391967752360356, 9.055658197151745, 
                            13.44199028193692, 13.01439145898509]
@end example

See also functions @code{mean} and @code{geometric_mean}.
@end deffn


@deffn {Function} geometric_mean (@var{list})
@deffnx {Function} geometric_mean (@var{matrix})
The geometric mean, defined as
@ifhtml
@example
                 /  n      \ 1/n
                 | /===\   |
                 |  ! !    |
                 |  ! !  x |
                 |  ! !   i|
                 | i = 1   |
                 \         /
@end example
@end ifhtml
@ifinfo
@example
                 /  n      \ 1/n
                 | /===\   |
                 |  ! !    |
                 |  ! !  x |
                 |  ! !   i|
                 | i = 1   |
                 \         /
@end example
@end ifinfo
@tex
$$\left(\prod_{i=1}^{n}{x_{i}}\right)^{{{1}\over{n}}}$$
@end tex

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c y : [5, 7, 2, 5, 9, 5, 6, 4, 9, 2, 4, 2, 5]$
@c geometric_mean (y), numer;
@c s2 : read_matrix (file_search ("wind.data"))$
@c geometric_mean (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) y : [5, 7, 2, 5, 9, 5, 6, 4, 9, 2, 4, 2, 5]$
(%i4) geometric_mean (y), numer;
(%o4)                   4.454845412337012
(%i5) s2 : read_matrix (file_search ("wind.data"))$
(%i6) geometric_mean (s2);
(%o6) [8.82476274347979, 9.22652604739361, 10.0442675714889, 
                            14.61274126349021, 13.96184163444275]
@end example
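
The sample @code{y} is small enough to check the three means against their definitions. The following sketch assumes nothing beyond the formulas above; the helper names @code{n} and @code{yi} are introduced only for the illustration. The arithmetic mean is exactly 5, because the thirteen values sum to 65, and the results reported above (about 4.45 for the geometric mean and 3.90 for the harmonic mean) follow the usual ordering arithmetic >= geometric >= harmonic for positive data.

@example
load (descriptive)$
y : [5, 7, 2, 5, 9, 5, 6, 4, 9, 2, 4, 2, 5]$
n : length (y)$
mean (y);                          /* arithmetic mean: 65/13 = 5 */
apply ("*", y)^(1/n), numer;       /* geometric mean from its definition */
n / lsum (1/yi, yi, y), numer;     /* harmonic mean from its definition */
@end example

The last two values should agree with @code{geometric_mean (y)} and @code{harmonic_mean (y)} up to floating point rounding.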

See also functions @code{mean} and @code{harmonic_mean}.
@end deffn


@deffn {Function} kurtosis (@var{list})
@deffnx {Function} kurtosis (@var{matrix})
The kurtosis coefficient, defined as
@ifhtml
@example
                    n
                  ====
            1     \          _ 4
           ----    >    (x - x)  - 3
              4   /       i
           n s    ====
                  i = 1
@end example
@end ifhtml
@ifinfo
@example
                    n
                  ====
            1     \          _ 4
           ----    >    (x - x)  - 3
              4   /       i
           n s    ====
                  i = 1
@end example
@end ifinfo
@tex
$${{1\over{n s^4}}{\sum_{i=1}^{n}{(x_{i}-\bar{x})^4}}-3}$$
@end tex

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c kurtosis (s1), numer;
@c s2 : read_matrix (file_search ("wind.data"))$
@c kurtosis (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) kurtosis (s1), numer;
(%o4)                  - 1.273247946514421
(%i5) s2 : read_matrix (file_search ("wind.data"))$
(%i6) kurtosis (s2);
(%o6) [- .2715445622195385, 0.119998784429451, 
   - .4275233490482866, - .6405361979019522, - .4952382132352935]
@end example
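
The formula can be checked by hand on a small sample. The sketch below assumes that @math{s^2} in the definition is the variance returned by @code{var}, that is, the one with denominator @math{n}; this is an assumption of the illustration, not something stated above. The names @code{x}, @code{xbar}, @code{v} and @code{xi} are hypothetical.

@example
load (descriptive)$
x : [1, 2, 3, 4, 5]$                 /* hypothetical sample */
n : length (x)$
xbar : mean (x)$                     /* 3 */
v : var (x)$                         /* 2 under the 1/n definition */
/* sum of fourth powers of deviations is 16+1+0+1+16 = 34,
   so the definition gives 34/(5*4) - 3 = -13/10 when v = 2 */
lsum ((xi - xbar)^4, xi, x) / (n * v^2) - 3;
kurtosis (x);                        /* package function, for comparison */
@end example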

See also functions @code{mean}, @code{var} and @code{skewness}.
@end deffn


@deffn {Function} skewness (@var{list})
@deffnx {Function} skewness (@var{matrix})
The skewness coefficient, defined as
@ifhtml
@example
                    n
                  ====
            1     \          _ 3
           ----    >    (x - x)
              3   /       i
           n s    ====
                  i = 1
@end example
@end ifhtml
@ifinfo
@example
                    n
                  ====
            1     \          _ 3
           ----    >    (x - x)
              3   /       i
           n s    ====
                  i = 1
@end example
@end ifinfo
@tex
$${{1\over{n s^3}}{\sum_{i=1}^{n}{(x_{i}-\bar{x})^3}}}$$
@end tex

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c skewness (s1), numer;
@c s2 : read_matrix (file_search ("wind.data"))$
@c skewness (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) skewness (s1), numer;
(%o4)                  .009196180476450306
(%i5) s2 : read_matrix (file_search ("wind.data"))$
(%i6) skewness (s2);
(%o6) [.1580509020000979, .2926379232061854, .09242174416107717, 
                            .2059984348148687, .2142520248890832]
@end example

See also functions @code{mean}, @code{var} and @code{kurtosis}.
@end deffn


@deffn {Function} pearson_skewness (@var{list})
@deffnx {Function} pearson_skewness (@var{matrix})
Pearson's skewness coefficient, defined as 
@ifhtml
@example
                _
             3 (x - med)
             -----------
                  s
@end example
@end ifhtml
@ifinfo
@example
                _
             3 (x - med)
             -----------
                  s
@end example
@end ifinfo
@tex
$${{3\,\left(\bar{x}-med\right)}\over{s}}$$
@end tex
where @var{med} is the median of @var{list}.

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c pearson_skewness (s1), numer;
@c s2 : read_matrix (file_search ("wind.data"))$
@c pearson_skewness (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) pearson_skewness (s1), numer;
(%o4)                   .2159484029093895
(%i5) s2 : read_matrix (file_search ("wind.data"))$
(%i6) pearson_skewness (s2);
(%o6) [- .08019976629211892, .2357036272952649, 
         .1050904062491204, .1245042340592368, .4464181795804519]
@end example

See also functions @code{mean}, @code{var} and @code{median}.
@end deffn


@deffn {Function} quartile_skewness (@var{list})
@deffnx {Function} quartile_skewness (@var{matrix})
The quartile skewness coefficient, defined as 
@ifhtml
@example
               c    - 2 c    + c
                3/4      1/2    1/4
               --------------------
                   c    - c
                    3/4    1/4
@end example
@end ifhtml
@ifinfo
@example
               c    - 2 c    + c
                3/4      1/2    1/4
               --------------------
                   c    - c
                    3/4    1/4
@end example
@end ifinfo
@tex
$${{c_{{{3}\over{4}}}-2\,c_{{{1}\over{2}}}+c_{{{1}\over{4}}}}\over{c
 _{{{3}\over{4}}}-c_{{{1}\over{4}}}}}$$
@end tex
where @math{c_p} is the @var{p}-quantile of sample @var{list}.

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c quartile_skewness (s1), numer;
@c s2 : read_matrix (file_search ("wind.data"))$
@c quartile_skewness (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) quartile_skewness (s1), numer;
(%o4)                  .04761904761904762
(%i5) s2 : read_matrix (file_search ("wind.data"))$
(%i6) quartile_skewness (s2);
(%o6) [- 0.0408542246982353, .1467025572005382, 
       0.0336239103362392, .03780068728522298, 0.210526315789474]
@end example
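
The coefficient can also be assembled from the three quartiles with function @code{quantile}. This is a minimal sketch of the definition; the names @code{q1}, @code{q2} and @code{q3} are introduced only for the illustration, and the result is expected to match @code{quartile_skewness (s1)} above if both use the same quantile rule.

@example
load (descriptive)$
load (numericalio)$
s1 : read_list (file_search ("pidigits.data"))$
q1 : quantile (s1, 1/4)$             /* first quartile */
q2 : quantile (s1, 1/2)$             /* median */
q3 : quantile (s1, 3/4)$             /* third quartile */
(q3 - 2*q2 + q1) / (q3 - q1), numer;
@end example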

See also function @code{quantile}.
@end deffn


@node address@hidden@~oes para specific multivariate descriptive statistics, address@hidden@~oes para statistical graphs, address@hidden@~oes para estatistica descritiva, descriptive
@section Functions for specific multivariate descriptive statistics


@deffn {Function} cov (@var{matrix})
The covariance matrix of the multivariate sample, defined as
@ifhtml
@example
              n
             ====
          1  \           _        _
      S = -   >    (X  - X) (X  - X)'
          n  /       j        j
             ====
             j = 1
@end example
@end ifhtml
@ifinfo
@example
              n
             ====
          1  \           _        _
      S = -   >    (X  - X) (X  - X)'
          n  /       j        j
             ====
             j = 1
@end example
@end ifinfo
@tex
$${S={1\over{n}}{\sum_{j=1}^{n}{\left(X_{j}-\bar{X}\right)\,\left(X_{j}-\bar{X}\right)'}}}$$
@end tex
where @math{X_j} is the @math{j}-th row of the sample matrix.

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s2 : read_matrix (file_search ("wind.data"))$
@c fpprintprec : 7$  /* change precision for pretty output */
@c cov (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s2 : read_matrix (file_search ("wind.data"))$
(%i4) fpprintprec : 7$  /* change precision for pretty output */
(%i5) cov (s2);
      [ 17.22191  13.61811  14.37217  19.39624  15.42162 ]
      [                                                  ]
      [ 13.61811  14.98774  13.30448  15.15834  14.9711  ]
      [                                                  ]
(%o5) [ 14.37217  13.30448  15.47573  17.32544  16.18171 ]
      [                                                  ]
      [ 19.39624  15.15834  17.32544  32.17651  20.44685 ]
      [                                                  ]
      [ 15.42162  14.9711   16.18171  20.44685  24.42308 ]
@end example

See also function @code{cov1}.
@end deffn


@deffn {Function} cov1 (@var{matrix})
The covariance matrix of the multivariate sample, defined as
@ifhtml
@example
              n
             ====
         1   \           _        _
   S  = ---   >    (X  - X) (X  - X)'
    1   n-1  /       j        j
             ====
             j = 1
@end example
@end ifhtml
@ifinfo
@example
              n
             ====
         1   \           _        _
   S  = ---   >    (X  - X) (X  - X)'
    1   n-1  /       j        j
             ====
             j = 1
@end example
@end ifinfo
@tex
$${S_1={1\over{n-1}}{\sum_{j=1}^{n}{\left(X_{j}-\bar{X}\right)\,\left(X_{j}-\bar{X}\right)'}}}$$
@end tex
where @math{X_j} is the @math{j}-th row of the sample matrix.

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s2 : read_matrix (file_search ("wind.data"))$
@c fpprintprec : 7$ /* change precision for pretty output */
@c cov1 (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s2 : read_matrix (file_search ("wind.data"))$
(%i4) fpprintprec : 7$ /* change precision for pretty output */
(%i5) cov1 (s2);
      [ 17.39587  13.75567  14.51734  19.59216  15.5774  ]
      [                                                  ]
      [ 13.75567  15.13913  13.43887  15.31145  15.12232 ]
      [                                                  ]
(%o5) [ 14.51734  13.43887  15.63205  17.50044  16.34516 ]
      [                                                  ]
      [ 19.59216  15.31145  17.50044  32.50153  20.65338 ]
      [                                                  ]
      [ 15.5774   15.12232  16.34516  20.65338  24.66977 ]
@end example
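
The two definitions differ only in the denominator, so @math{S_1 = n/(n-1) S}. The following sketch checks this on a small made-up matrix @code{m} with three rows. Applying the definitions by hand, the column means are 2 and 5, so @math{S} has entries 2/3, 7/3, 7/3, 26/3 and @math{S_1} has entries 1, 7/2, 7/2, 13.

@example
load (descriptive)$
m : matrix ([1, 2], [2, 4], [3, 9])$   /* hypothetical sample, 3 rows */
n : length (m)$                        /* number of rows */
cov (m);
cov1 (m);
/* zero matrix if both functions follow the definitions above */
ratsimp (cov1 (m) - n / (n - 1) * cov (m));
@end example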

See also function @code{cov}.
@end deffn


@deffn {Function} global_variances (@var{matrix})
@deffnx {Function} global_variances (@var{matrix}, @var{logical_value})
Function @code{global_variances} returns a list of global variance measures:

@itemize @bullet
@item
@var{total variance}: @code{trace(S_1)},
@item
@var{mean variance}: @code{trace(S_1)/p},
@item
@var{generalized variance}: @code{determinant(S_1)},
@item
@var{generalized standard deviation}: @code{sqrt(determinant(S_1))},
@item
@var{effective variance}: @code{determinant(S_1)^(1/p)} (defined in: Pe@~na, D. (2002) @var{An@'alisis de datos multivariantes}; McGraw-Hill, Madrid),
@item
@var{effective standard deviation}: @code{determinant(S_1)^(1/(2*p))},
@end itemize
where @var{p} is the dimension of the multivariate random variable and @math{S_1} the covariance matrix returned by @code{cov1}.

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s2 : read_matrix (file_search ("wind.data"))$
@c global_variances (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s2 : read_matrix (file_search ("wind.data"))$
(%i4) global_variances (s2);
(%o4) [105.338342060606, 21.06766841212119, 12874.34690469686, 
         113.4651792608502, 6.636590811800794, 2.576158149609762]
@end example
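
The six measures can be rebuilt from @code{cov1} using the expressions in the list above. This is only a sketch of those expressions, not the package's implementation; the names @code{s}, @code{p}, @code{tr} and @code{d} are introduced for the illustration.

@example
load (descriptive)$
load (numericalio)$
s2 : read_matrix (file_search ("wind.data"))$
s : cov1 (s2)$
p : length (s)$                      /* s is a p x p matrix */
tr : sum (s[i,i], i, 1, p)$          /* trace of S_1 */
d : determinant (s)$
/* total, mean, generalized variance, generalized std. deviation,
   effective variance, effective standard deviation */
[tr, tr/p, d, sqrt (d), d^(1/p), d^(1/(2*p))];
@end example

The values in the output above are consistent with these expressions: for instance, the second entry is the first divided by 5, and the fourth is the square root of the third.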

Function @code{global_variances} has an optional logical argument: @code{global_variances(x,true)} tells Maxima that @code{x} is the data matrix, which is equivalent to @code{global_variances(x)}. On the other hand, @code{global_variances(x,false)} means that @code{x} is not the data matrix but the covariance matrix, which avoids recalculating it:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s2 : read_matrix (file_search ("wind.data"))$
@c s : cov1 (s2)$
@c global_variances (s, false);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s2 : read_matrix (file_search ("wind.data"))$
(%i4) s : cov1 (s2)$
(%i5) global_variances (s, false);
(%o5) [105.338342060606, 21.06766841212119, 12874.34690469686, 
         113.4651792608502, 6.636590811800794, 2.576158149609762]
@end example

See also @code{cov} and @code{cov1}.
@end deffn


@deffn {Function} cor (@var{matrix})
@deffnx {Function} cor (@var{matrix}, @var{logical_value})
The correlation matrix of the multivariate sample.

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c fpprintprec:7$
@c s2 : read_matrix (file_search ("wind.data"))$
@c cor (s2);
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) fpprintprec:7$
(%i4) s2 : read_matrix (file_search ("wind.data"))$
(%i5) cor (s2);
      [   1.0     .8476339  .8803515  .8239624  .7519506 ]
      [                                                  ]
      [ .8476339    1.0     .8735834  .6902622  0.782502 ]
      [                                                  ]
(%o5) [ .8803515  .8735834    1.0     .7764065  .8323358 ]
      [                                                  ]
      [ .8239624  .6902622  .7764065    1.0     .7293848 ]
      [                                                  ]
      [ .7519506  0.782502  .8323358  .7293848    1.0    ]
@end example

Function @code{cor} has an optional logical argument: @code{cor(x,true)} tells Maxima that @code{x} is the data matrix, which is equivalent to @code{cor(x)}. On the other hand, @code{cor(x,false)} means that @code{x} is not the data matrix but the covariance matrix, which avoids recalculating it:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c fpprintprec:7$
@c s2 : read_matrix (file_search ("wind.data"))$
@c s : cov1 (s2)$
@c cor (s, false); /* this is faster */
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) fpprintprec:7$
(%i4) s2 : read_matrix (file_search ("wind.data"))$
(%i5) s : cov1 (s2)$
(%i6) cor (s, false); /* this is faster */
      [   1.0     .8476339  .8803515  .8239624  .7519506 ]
      [                                                  ]
      [ .8476339    1.0     .8735834  .6902622  0.782502 ]
      [                                                  ]
(%o6) [ .8803515  .8735834    1.0     .7764065  .8323358 ]
      [                                                  ]
      [ .8239624  .6902622  .7764065    1.0     .7293848 ]
      [                                                  ]
      [ .7519506  0.782502  .8323358  .7293848    1.0    ]
@end example
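
The text above does not spell out the formula, so the following sketch assumes the usual definition of the correlation coefficient, @math{r_ij = s_ij / sqrt(s_ii s_jj)}, applied to the covariance matrix. The names @code{s} and @code{p} are introduced only for the illustration. With the @code{cov1} output shown earlier, the entry (1,2) is 13.75567 / sqrt (17.39587 * 15.13913), about 0.8476, which matches the matrix above.

@example
load (descriptive)$
load (numericalio)$
fpprintprec : 7$
s2 : read_matrix (file_search ("wind.data"))$
s : cov1 (s2)$
p : length (s)$
/* correlation matrix built entry by entry from the covariances */
genmatrix (lambda ([i, j], s[i,j] / sqrt (s[i,i] * s[j,j])), p, p);
@end example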

See also @code{cov} and @code{cov1}.
@end deffn


@deffn {Function} list_correlations (@var{matrix})
@deffnx {Function} list_correlations (@var{matrix}, @var{logical_value})
Function @code{list_correlations} returns a list of correlation measures:

@itemize @bullet

@item
@var{precision matrix}: the inverse of the covariance matrix @math{S_1},
@ifhtml
@example
       -1     ij
      S   = (s  )             
       1         i,j = 1,2,...,p
@end example
@end ifhtml
@ifinfo
@example
       -1     ij
      S   = (s  )             
       1         i,j = 1,2,...,p
@end example
@end ifinfo
@tex
$${S_{1}^{-1}}={\left(s^{ij}\right)_{i,j=1,2,\ldots, p}}$$
@end tex

@item
@var{multiple correlation vector}:  @math{(R_1^2, R_2^2, ..., R_p^2)}, with 
@ifhtml
@example
       2          1
      R  = 1 - -------
       i        ii
               s   s
                    ii
@end example
@end ifhtml
@ifinfo
@example
       2          1
      R  = 1 - -------
       i        ii
               s   s
                    ii
@end example
@end ifinfo
@tex
$${R_{i}^{2}}={1-{{1}\over{s^{ii}s_{ii}}}}$$
@end tex
being an indicator of the goodness of fit of the linear multivariate regression model on @math{X_i} when the remaining variables are used as regressors.

@item
@var{partial correlation matrix}: with element @math{(i, j)} being
@ifhtml
@example
                         ij
                        s
      r        = - ------------
       ij.rest     / ii  jj\ 1/2
                   |s   s  |
                   \       /
@end example
@end ifhtml
@ifinfo
@example
                         ij
                        s
      r        = - ------------
       ij.rest     / ii  jj\ 1/2
                   |s   s  |
                   \       /
@end example
@end ifinfo
@tex
$${r_{ij.rest}}={-{{s^{ij}}\over \sqrt{s^{ii}s^{jj}}}}$$
@end tex

@end itemize

Example:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s2 : read_matrix (file_search ("wind.data"))$
@c z : list_correlations (s2)$
@c fpprintprec : 5$ /* for pretty output */
@c z[1];  /* precision matrix */
@c z[2];  /* multiple correlation vector */
@c z[3];  /* partial correlation matrix */
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s2 : read_matrix (file_search ("wind.data"))$
(%i4) z : list_correlations (s2)$
(%i5) fpprintprec : 5$ /* for pretty output */
(%i6) z[1];  /* precision matrix */
      [  .38486   - .13856   - .15626   - .10239    .031179  ]
      [                                                      ]
      [ - .13856   .34107    - .15233    .038447   - .052842 ]
      [                                                      ]
(%o6) [ - .15626  - .15233    .47296    - .024816  - .10054  ]
      [                                                      ]
      [ - .10239   .038447   - .024816   .10937    - .034033 ]
      [                                                      ]
      [ .031179   - .052842  - .10054   - .034033   .14834   ]
(%i7) z[2];  /* multiple correlation vector */
(%o7)       [.85063, .80634, .86474, .71867, .72675]
(%i8) z[3];  /* partial correlation matrix */
       [  - 1.0     .38244   .36627   .49908   - .13049 ]
       [                                                ]
       [  .38244    - 1.0    .37927  - .19907   .23492  ]
       [                                                ]
(%o8)  [  .36627    .37927   - 1.0    .10911    .37956  ]
       [                                                ]
       [  .49908   - .19907  .10911   - 1.0     .26719  ]
       [                                                ]
       [ - .13049   .23492   .37956   .26719    - 1.0   ]
@end example
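
The three results can be traced back to the formulas in the list above. The sketch below applies them to the matrix returned by @code{cov1}; it is an illustration of the definitions rather than the package's own code, and the names @code{s}, @code{p} and @code{prec} are hypothetical.

@example
load (descriptive)$
load (numericalio)$
s2 : read_matrix (file_search ("wind.data"))$
s : cov1 (s2)$
p : length (s)$
fpprintprec : 5$
/* precision matrix: inverse of S_1 */
prec : invert (s);
/* multiple correlation vector: 1 - 1/(s^ii * s_ii) */
makelist (1 - 1 / (prec[i,i] * s[i,i]), i, 1, p);
/* partial correlation matrix: -s^ij / sqrt (s^ii * s^jj) */
genmatrix (lambda ([i, j], - prec[i,j] / sqrt (prec[i,i] * prec[j,j])), p, p);
@end example

Note that the last formula gives -1 for every diagonal entry, which is what the output above shows.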

Function @code{list_correlations} also has an optional logical argument: @code{list_correlations(x,true)} tells Maxima that @code{x} is the data matrix, which is equivalent to @code{list_correlations(x)}. On the other hand, @code{list_correlations(x,false)} means that @code{x} is not the data matrix but the covariance matrix, which avoids recalculating it.

See also @code{cov} and @code{cov1}.
@end deffn


@node address@hidden@~oes para statistical graphs,  , address@hidden@~oes para specific multivariate descriptive statistics, descriptive
@section Functions for statistical graphs


@deffn {Function} dataplot (@var{list})
@deffnx {Function} dataplot (@var{list}, @var{option_1}, @var{option_2}, ...)
@deffnx {Function} dataplot (@var{matrix})
@deffnx {Function} dataplot (@var{matrix}, @var{option_1}, @var{option_2}, ...)
Function @code{dataplot} permits direct visualization of sample data, both univariate (@var{list}) and multivariate (@var{matrix}). The following @var{options} control some aspects of the plot:

@itemize @bullet

@item
@code{'outputdev}, default @code{"x"}, indicates the output device; correct values are @code{"x"}, @code{"eps"} and @code{"png"}, for the screen, postscript and png format files, respectively.

@item
@code{'maintitle}, default @code{""}, is the main title between double quotes.

@item
@code{'axisnames}, default @code{["x","y","z"]}, is a list with the names of the @code{x}, @code{y} and @code{z} axes.

@item
@code{'joined}, default @code{false}, a logical value that selects whether points in 2D are joined by lines or left isolated.

@item
@code{'picturescales}, default @code{[1.0, 1.0]}, scaling factors for the size of the plot.

@item
@code{'threedim}, default @code{true}, tells Maxima whether to plot a three column matrix with a 3D diagram or a multivariate scatterplot. See examples below.

@item
@code{'axisrot}, default @code{[60, 30]}, changes the point of view when @code{'threedim} is set to @code{true} and data are stored in a three column matrix. The first number is the rotation angle of the @var{x}-axis, and the second number is the rotation angle of the @var{z}-axis, both measured in degrees.

@item
@code{'nclasses}, default @code{10}, is the number of classes for the histograms in the diagonal of multivariate scatterplots.

@item
@code{'pointstyle}, default @code{1}, is an integer to indicate how to display sample points.

@end itemize

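For example, the following input requests a simple plot of the first twenty digits of @code{%pi}. Since @code{'outputdev} is not set, the plot goes to the default device, the screen; pass @code{'outputdev = "eps"} to store it in an eps file instead.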

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c dataplot (makelist (s1[k], k, 1, 20), 'pointstyle = 3)$
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) dataplot (makelist (s1[k], k, 1, 20), 'pointstyle = 3)$
@end example
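
To send the same plot to a file rather than the screen, option @code{'outputdev} can be set to one of the values listed above. The following is a sketch only: the name and location of the generated file are not stated in this section, so they are not shown here.

@example
load (descriptive)$
load (numericalio)$
s1 : read_list (file_search ("pidigits.data"))$
/* same plot as above, written to a PostScript (eps) file */
dataplot (makelist (s1[k], k, 1, 20), 'pointstyle = 3,
 'outputdev = "eps")$
@end example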

Note that one-dimensional data are plotted as a time series. The next example plots more data with different settings:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c dataplot (makelist (s1[k], k, 1, 50), 'maintitle = "First pi digits",
@c  'axisnames = ["digit order", "digit value"], 'pointstyle = 2,
@c  'joined = true)$
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) dataplot (makelist (s1[k], k, 1, 50), 'maintitle = "First pi digits",
 'axisnames = ["digit order", "digit value"], 'pointstyle = 2,
 'joined = true)$
@end example

Function @code{dataplot} can be used to plot points in the plane. The next example is a scatterplot of the pairs of wind speeds corresponding to the first and fifth meteorological stations,

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s2 : read_matrix (file_search ("wind.data"))$
@c dataplot (submatrix (s2, 2, 3, 4), 'pointstyle = 2,
@c  'maintitle = "Pairs of wind speeds measured in knots",
@c  'axisnames = ["Wind speed in A", "Wind speed in E"])$
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s2 : read_matrix (file_search ("wind.data"))$
(%i4) dataplot (submatrix (s2, 2, 3, 4), 'pointstyle = 2,
 'maintitle = "Pairs of wind speeds measured in knots",
 'axisnames = ["Wind speed in A", "Wind speed in E"])$
@end example

If points are stored in a two column matrix, @code{dataplot} can plot them directly, but if they are formatted as a list of pairs, they must be transformed to a matrix, as in the following example.

@c ===beg===
@c load (descriptive)$
@c x : [[-1, 2], [5, 7], [5, -3], [-6, -9], [-4, 6]]$
@c dataplot (apply ('matrix, x), 'maintitle = "Points",
@c  'joined = true, 'axisnames = ["", ""], 'picturescales = [0.5, 1.0])$
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) x : [[-1, 2], [5, 7], [5, -3], [-6, -9], [-4, 6]]$
(%i3) dataplot (apply ('matrix, x), 'maintitle = "Points",
 'joined = true, 'axisnames = ["", ""], 'picturescales = [0.5, 1.0])$
@end example

Points in three-dimensional space can be seen as a projection on the plane. In this example, plots of wind speeds corresponding to three meteorological stations are requested, first in a 3D plot and then in a multivariate scatterplot.

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s2 : read_matrix (file_search ("wind.data"))$
@c /* 3D plot */ dataplot (submatrix (s2, 4, 5), 'pointstyle = 2,
@c  'maintitle = "Pairs of wind speeds measured in knots",
@c  'axisnames = ["Station A", "Station B", "Station C"])$
@c /* Multivariate scatterplot */ dataplot (submatrix (s2, 4, 5),
@c  'nclasses = 6, 'threedim = false)$
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s2 : read_matrix (file_search ("wind.data"))$
(%i4) /* 3D plot */ dataplot (submatrix (s2, 4, 5), 'pointstyle = 2,
 'maintitle = "Pairs of wind speeds measured in knots",
 'axisnames = ["Station A", "Station B", "Station C"])$
(%i5) /* Multivariate scatterplot */ dataplot (submatrix (s2, 4, 5),
 'nclasses = 6, 'threedim = false)$
@end example
Note that in the last example, the number of classes in the histograms of the diagonal is set to 6, and that option @code{'threedim} is set to @code{false}.

For more than three dimensions only multivariate scatterplots are possible, as in

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s2 : read_matrix (file_search ("wind.data"))$
@c dataplot (s2)$
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s2 : read_matrix (file_search ("wind.data"))$
(%i4) dataplot (s2)$
@end example
@end deffn


@deffn {Function} histogram (@var{list})
@deffnx {Function} histogram (@var{list}, @var{option_1}, @var{option_2}, ...)
@deffnx {Function} histogram (@var{one_column_matrix})
@deffnx {Function} histogram (@var{one_column_matrix}, @var{option_1}, @var{option_2}, ...)
This function plots a histogram. Sample data must be stored in a list of numbers or a one column matrix. The following @var{options} control some aspects of the plot:

@itemize @bullet

@item
@code{'outputdev}, default @code{"x"}, indicates the output device; correct values are @code{"x"}, @code{"eps"} and @code{"png"}, for the screen, postscript and png format files, respectively.

@item
@code{'maintitle}, default @code{""}, is the main title between double quotes.

@item
@code{'axisnames}, default @code{["x", "Fr."]}, is a list with the names of the @code{x} and @code{y} axes.

@item
@code{'picturescales}, default @code{[1.0, 1.0]}, scaling factors for the size of the plot.

@item
@code{'nclasses}, default @code{10}, is the number of classes or bars.

@item
@code{'relbarwidth}, default @code{0.9}, a decimal number between 0 and 1 that controls the bar width.

@item
@code{'barcolor}, default @code{1}, an integer indicating the bar color.

@item
@code{'colorintensity}, default @code{1}, a decimal number between 0 and 1 that sets the color intensity.

@end itemize

In the next example, histograms are requested for the first 100 digits of @code{%pi} and for the wind speeds at the third meteorological station.

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s1 : read_list (file_search ("pidigits.data"))$
@c histogram (s1, 'maintitle = "pi digits", 'axisnames = ["", "Absolute frequency"],
@c  'relbarwidth = 0.2, 'barcolor = 3, 'colorintensity = 0.6)$
@c s2 : read_matrix (file_search ("wind.data"))$
@c histogram (col (s2, 3), 'colorintensity = 0.3)$
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s1 : read_list (file_search ("pidigits.data"))$
(%i4) histogram (s1, 'maintitle = "pi digits", 'axisnames = ["", "Absolute frequency"],
 'relbarwidth = 0.2, 'barcolor = 3, 'colorintensity = 0.6)$
(%i5) s2 : read_matrix (file_search ("wind.data"))$
(%i6) histogram (col (s2, 3), 'colorintensity = 0.3)$
@end example
Note that in the first case, @code{s1} is a list and in the second example, @code{col(s2,3)} is a matrix.

See also function @code{barsplot}.
@end deffn


@deffn {Function} barsplot (@var{list})
@deffnx {Function} barsplot (@var{list}, @var{option_1}, @var{option_2}, ...)
@deffnx {Function} barsplot (@var{one_column_matrix})
@deffnx {Function} barsplot (@var{one_column_matrix}, @var{option_1}, @var{option_2}, ...)
Similar to @code{histogram}, but for discrete statistical variables, either numeric or categorical. These are the options:

@itemize @bullet

@item
@code{'outputdev}, default @code{"x"}, indicates the output device; correct values are @code{"x"}, @code{"eps"} and @code{"png"}, for the screen, postscript and png format files, respectively.

@item
@code{'maintitle}, default @code{""}, is the main title between double quotes.

@item
@code{'axisnames}, default @code{["x", "Fr."]}, is a list with the names of the @code{x} and @code{y} axes.

@item
@code{'picturescales}, default @code{[1.0, 1.0]}, scaling factors for the size of the plot.

@item
@code{'relbarwidth}, default @code{0.9}, a decimal number between 0 and 1 that controls the bar width.

@item
@code{'barcolor}, default @code{1}, an integer indicating the bar color.

@item
@code{'colorintensity}, default @code{1}, a decimal number between 0 and 1 that sets the color intensity.

@end itemize

This example plots the barchart for groups @code{A} and @code{B} of patients in sample @code{s3},

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s3 : read_matrix (file_search ("biomed.data"))$
@c barsplot (col (s3, 1), 'maintitle = "Groups of patients",
@c  'axisnames = ["Group", "# of individuals"], 'colorintensity = 0.2)$
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s3 : read_matrix (file_search ("biomed.data"))$
(%i4) barsplot (col (s3, 1), 'maintitle = "Groups of patients",
 'axisnames = ["Group", "# of individuals"], 'colorintensity = 0.2)$
@end example
The first column in sample @code{s3} stores the categorical values @code{A} and @code{B}, sometimes also called factors. The positive integers in the second column, on the other hand, are ages in years. Age recorded this way is a discrete variable, so the absolute frequencies of its values can be plotted as well:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s3 : read_matrix (file_search ("biomed.data"))$
@c barsplot (col (s3, 2), 'maintitle = "Ages",
@c  'axisnames = ["Years", "# of individuals"], 'colorintensity = 0.2,
@c  'relbarwidth = 0.6)$
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s3 : read_matrix (file_search ("biomed.data"))$
(%i4) barsplot (col (s3, 2), 'maintitle = "Ages",
 'axisnames = ["Years", "# of individuals"], 'colorintensity = 0.2,
 'relbarwidth = 0.6)$
@end example


See also function @code{histogram}.
@end deffn


@deffn {Function} boxplot (@var{data})
@deffnx {Function} boxplot (@var{data}, @var{option_1}, @var{option_2}, ...)
This function plots box diagrams. Argument @var{data} can be a list, which is of limited interest since these diagrams are mainly used for comparing different samples, or a matrix, which makes it possible to compare two or more components of a multivariate statistical variable. @var{data} may also be a list of samples with possibly different sample sizes; in fact, this is the only function in package @code{descriptive} that admits this type of data structure. See the example below. These are the options:

@itemize @bullet

@item
@code{'outputdev}, default @code{"x"}, indicates the output device; correct values are @code{"x"}, @code{"eps"} and @code{"png"}, for the screen, postscript and png format files, respectively.

@item
@code{'maintitle}, default @code{""}, is the main title between double quotes.

@item
@code{'axisnames}, default @code{["sample", "y"]}, is a list with the names of the @code{x} and @code{y} axes.

@item
@code{'picturescales}, default @code{[1.0, 1.0]}, scaling factors for the size of the plot.

@end itemize

Examples:

@c ===beg===
@c load (descriptive)$
@c load (numericalio)$
@c s2 : read_matrix (file_search ("wind.data"))$
@c boxplot (s2, 'maintitle = "Windspeed in knots",
@c  'axisnames = ["Seasons", ""])$
@c A :
@c  [[6, 4, 6, 2, 4, 8, 6, 4, 6, 4, 3, 2],
@c   [8, 10, 7, 9, 12, 8, 10],
@c   [16, 13, 17, 12, 11, 18, 13, 18, 14, 12]]$
@c boxplot (A)$
@c ===end===
@example
(%i1) load (descriptive)$
(%i2) load (numericalio)$
(%i3) s2 : read_matrix (file_search ("wind.data"))$
(%i4) boxplot (s2, 'maintitle = "Windspeed in knots",
 'axisnames = ["Seasons", ""])$
(%i5) A :
 [[6, 4, 6, 2, 4, 8, 6, 4, 6, 4, 3, 2],
  [8, 10, 7, 9, 12, 8, 10],
  [16, 13, 17, 12, 11, 18, 13, 18, 14, 12]]$
(%i6) boxplot (A)$
@end example
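
Since a list of samples such as @code{A} cannot be passed to the other functions of the package as a whole, numerical summaries to accompany the box diagram can be obtained by mapping over the samples. The sketch below collects the size, quartiles and median of each sample; whether @code{boxplot} draws its boxes from exactly these quantiles is an assumption, and the lambda variable @code{smp} is a name made up for the illustration.

@example
load (descriptive)$
A :
 [[6, 4, 6, 2, 4, 8, 6, 4, 6, 4, 3, 2],
  [8, 10, 7, 9, 12, 8, 10],
  [16, 13, 17, 12, 11, 18, 13, 18, 14, 12]]$
/* one list [n, first quartile, median, third quartile] per sample */
map (lambda ([smp], [length (smp), quantile (smp, 1/4),
                     median (smp), quantile (smp, 3/4)]), A);
@end example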
@end deffn