octave-bug-tracker
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Octave-bug-tracker] [bug #36372] Improved ranks.m included:


From: anonymous
Subject: [Octave-bug-tracker] [bug #36372] Improved ranks.m included:
Date: Thu, 10 May 2012 04:39:22 +0000
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.16) Gecko/20120421 Firefox/3.5.16

Follow-up Comment #7, bug #36372 (project octave):

Summary: New ranks.m included:

(1) Support for dense added for real this time.
(2) New support for reverse-ordinal sorting.
----

Greetings:

(A) The previous ranks.m I sent never supported one of the new advertized
methods: "dense ranking." I have now implemented it, and included as (A)
below.

(B) Incidentally, I found a much simpler way to implement (the original,
fractional) ranks with Kurt's original logic of using sort twice. The
difference is that it avoids any loops this time  But, note that this simpler
logic comes at this expense: (a) this is still about twice more expensive than
(A) above, because it uses sort twice; (b) It can't be used to implement
competition and modified ranking. So, if we want to stick with Kurt's original
logic, and forgo adding any new methods, we could simply switch to this
method.). I am including this as "sampleranksmy.m" below.

Either way (A) or (B) is still orders of magnitude faster than the current
version. 

-----

(A): ranks.m with support for dense and reverse ordinal methods.

## Copyright (C) 1995-2012 Kurt Hornik; Copyright (C) 2012 Dave Goel
##
## This file is part of Octave.
##
## Octave is free software; you can redistribute it and/or modify it
## under the terms of the GNU General Public License as published by the
## Free Software Foundation; either version 3 of the License, or (at
## your option) any later version.
##
## Octave is distributed in the hope that it will be useful, but WITHOUT
## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
## FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
## for more details.
##
## You should have received a copy of the GNU General Public License
## along with Octave; see the file COPYING.  If not, see
## <http://www.gnu.org/licenses/>.

## -*- texinfo -*-
##
## @deftypefn {Function File} {} ranks (@var{x}, @var{dim})
##
## Return the ranks of @var{x} along the first non-singleton dimension
## adjust for ties. If the optional argument @var{dim} is given, operate
## along this dimension. @var{x} can optionally be a structure. In that
## case, its field named x is interpreted as the matrix, and its field
## named rtype is interpreted as the rank-type. The field rtype then
## donates the type of ranking: 0 or "fractional" for fractional, which
## is the default, 1 or "competition" for competiton ranking, 2 or
## "modified" for modified competition ranking, 3 or "ordinal" for
## ordinal ranking, 4 or "revordinal" for reverse ordinal,  and 5 or
## "dense" for dense ranking.
##
address@hidden deftypefn


## This function returns a result identical to GNU Octave's built-in
## ranks.m but should be much faster (For example, 1.4s vs. "no result
## in 33mins" when the input was a=round(10*rand(100123,100));
## Additionally, we also handle several ranking schemes. This function
## has been submitted to GNU Octave's mailing list and may become part
## of GNU Octave. [[LOCAL NOTES 20120503 : To compare this file to the
## latest octave version, see the file tmpranksoctave.m in this
## directory.]]


## Authors: KH <address@hidden>, Dave Goel <address@hidden>
## Description: Compute ranks


function y = ranks (x, dim)

  if (nargin < 1);
    print_usage ();
  endif

  if isstruct(x);
    rtype=0;
    if isfield(x,"rtype");
      rtype=x.rtype;
      if isempty(rtype);
        rtype=0;
      endif
    endif
    x=x.x;
  else
    rtype=0;
  endif


  nd = ndims (x);
  sz = size (x);
  if (nargin!=2);
    ## Find the first non-singleton dimension.
    dim  = 1;
    while (dim < nd + 1 && sz(dim) == 1)
      dim = dim + 1;
    endwhile
    if (dim > nd)
      dim = 1;
    endif
  else
    if (! (isscalar (dim) && dim == round (dim))
        && dim > 0
        && dim < (nd + 1))
      error ("ranks: dim must be an integer and valid dimension");
    endif
  endif

  if (sz(dim) == 1)
    y = ones(sz);
  else
    ## The algorithm works only on dim = 1, so permute if necesary.
    if (dim != 1)
      perm = [1 : nd];
      perm(1) = dim;
      perm(dim) = 1;
      x = permute (x, perm);
    endif
    sz=size(x);
    switch rtype;
      case {4,"revordinal"};
        [sx ids]=sort(x,'descend');
        ids=flipdim(ids,1);
      otherwise
        [sx ids]=sort(x); ## sx is sorted x.
    endswitch
    lin=cumsum(ones(size(x)),1);  ## A linearly increasing array.

    switch rtype;
      case {0,"fractional"};
        lin=(_competition(lin,sx,sz)+_modified(lin,sx,sz))/2;
      case {1,"competition"};
        lin=_competition(lin,sx,sz);
      case {2,"modified"};
        lin=_modified(lin,sx,sz);
      case {3,"ordinal"};
        ## no processing needed here.
      case {4,"revordinal"};
        ## no processing needed here.
      case {5,"dense"};
        lin=_dense(lin,sx,sz);
      otherwise 
        rtype
        error("Illegal value of rtype specified.");
    endswitch
    y=NaN(size(lin));
    ## If input was a vector, we could have simply done this: 
    ## y(ids)=lin;
    y(_ids(ids,sz))=lin;
    if (dim != 1)
      y = permute (y, perm);
    endif
  endif
endfunction

function idf=_ids(ids,sz);
  oo=ones(sz);
  allids=[{ids-1}];
  nn=numel(sz);
  for ii=2:nn;
    allids=[allids,{cumsum(oo,ii)-1}];
  endfor
  idf=allids{end};
  for jj=(nn-1):-1:1;
    idf=idf*sz(jj)+allids{jj};
  endfor
  idf+=1;
endfunction



function linnew=_dense (lin,sx,sz)
  infvec = -Inf * ones ([1, sz(2 : end)]);
  fnewp= logical(diff([infvec;sx]));
  linnew=cumsum(fnewp,1);
  linnew
endfunction

function linnew=_competition (lin,sx,sz)
  ## Stop increasing lin when sx does not increase. Same as before
  ## otherwise.
  infvec = -Inf * ones ([1, sz(2 : end)]);
  fnewp= find(diff([infvec;sx]));
  linnew=zeros(size(lin));
  linnew(fnewp)=lin(fnewp);
  linnew=cummax(linnew,1);
endfunction


function linnew=_modified (lin,sx,sz)
  ## Traverse lin backwards. Stop decreasing it when sx doesn't
  ## decrease.
  infvec = Inf * ones ([1, sz(2 : end)]);
  fnewp= find(diff([sx;infvec]));
  linnew=Inf(size(lin));
  linnew(fnewp)=lin(fnewp);
  linnew=flipdim(cummin(flipdim(linnew,1)),1);
endfunction 


function linnew=_revordinal (lin,sx,sz)
  infvec = -Inf * ones ([1, sz(2 : end)]);
  fnewp= find(diff([infvec;sx]));
  linnew=zeros(size(lin));
  linnew(fnewp)=lin(fnewp);
  linnew=cummax(linnew,1);
endfunction


%!assert (ranks (1:2:10), [1:5]'')
%!assert (ranks (10:-2:1), [5:-1:1]'')
%!assert (ranks ([2, 1, 2, 4]), [2.5, 1, 2.5, 4])
%!assert (ranks (ones (1, 5)), 3*ones (1, 5)'')
%!assert (ranks (1e6*ones (1, 5)), 3*ones (1, 5)'')
%!assert (ranks (rand (1, 5), 1), ones (1, 5))

%% Test input validation
%!error ranks ()
%!error ranks (1, 2, 3)
%!error ranks ({1, 2})
%!error ranks (['A'; 'B'])
%!error ranks (1, 1.5)
%!error ranks (1, 0)
%!error ranks (1, 3)

----

(B) sampleranksmy.m - Kurt's original logic, but avoids any for loops:
## Copyright (C) 1995-2012 Kurt Hornik Copyright (C) 2012 Dave Goel
##
## This file is part of Octave.
##
## Octave is free software; you can redistribute it and/or modify it
## under the terms of the GNU General Public License as published by the
## Free Software Foundation; either version 3 of the License, or (at
## your option) any later version.
##
## Octave is distributed in the hope that it will be useful, but WITHOUT
## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
## FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
## for more details.
##
## You should have received a copy of the GNU General Public License
## along with Octave; see the file COPYING.  If not, see
## <http://www.gnu.org/licenses/>.


## -*- texinfo -*-
## @deftypefn {Function File} {} ranks2my (@var{x}, @var{dim})
## Return the ranks of @var{x} along the first non-singleton dimension
## adjust for ties. If the optional argument @var{dim} is given, operate
## along this dimension. @var{x} can optionally be a structure. In that
## case, its field named x is interpreted as the matrix, and its field
## named rtype is interpreted as the rank-type. The field rtype then
## donates the type of ranking: 0 or "fractional" for fractional, which
## is the default, 1 or "competition" for competiton ranking, 2 or
## "modified" for modified competition ranking, 3 or "ordinal" for
## ordinal ranking, 4 for "reverse ordinal ranking"  and 5 or "dense"
## for dense ranking.
## @end deftypefn


## Authors: KH <address@hidden>, Dave Goel <address@hidden>
## Description: Compute ranks

## This function returns a result identical to GNU Octave's built-in
## ranks.m but should be much faster (For example, 1.4s vs. "no result
## in 33mins" when the input was a=round(10*rand(100123,100));
## Additionally, we also handle several ranking schemes. This function
## has been submitted to GNU Octave's mailing list and may become part
## of GNU Octave. [[LOCAL NOTES 20120503 : To compare this file to the
## latest octave version, see the file tmpranksoctave.m in this
## directory.]]




function y = sampleranksmy(x, dim)

  if (nargin != 1 && nargin != 2)
    print_usage ();
  endif

  if isstruct(x);
    rtype=0;
    if isfield(x,"rtype");
      rtype=x.rtype;
      if isempty(rtype);
        rtype=0;
      endif
    endif
    x=x.x;
  else
    rtype=0;
  endif


  nd = ndims (x);
  sz = size (x);
  if (nargin != 2)
    ## Find the first non-singleton dimension.
    dim  = 1;
    while (dim < nd + 1 && sz(dim) == 1)
      dim = dim + 1;
    endwhile
    if (dim > nd)
      dim = 1;
    endif
  else
    if (! (isscalar (dim) && dim == round (dim))
        && dim > 0
        && dim < (nd + 1))
      error ("ranks: dim must be an integer and valid dimension");
    endif
  endif

  if (sz(dim) == 1)
    y = ones(sz);
  else
    ## The algorithm works only on dim = 1, so permute if necesary.
    if (dim != 1)
      perm = [1 : nd];
      perm(1) = dim;
      perm(dim) = 1;
      x = permute (x, perm);
    endif
    [xsorted, xidsa] = sort (x);
    [ignore,  xidsb] = sort (x,'descend');
    xidsb=flipdim(xidsb,1);
    [ig, y1]=sort(xidsa);
    [ig, y2]=sort(xidsb);
    ## Now, y1 is ordinal sorting. y2 is reverse ordinal sorting. We
    ## simply need the mean of the two.
    y=(y1+y2)/2;
    if (dim != 1)
      y = permute (y, perm);
    endif
  endif
endfunction



 

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?36372>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/




reply via email to

[Prev in Thread] Current Thread [Next in Thread]