[Octave-bug-tracker] [bug #42671] [PATCH] corr() does not have p-values


From: Philipp Kutin
Subject: [Octave-bug-tracker] [bug #42671] [PATCH] corr() does not have p-values output, returns 1.0 with one observation.
Date: Thu, 03 Jul 2014 12:36:46 +0000
User-agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:30.0) Gecko/20100101 Firefox/30.0

URL:
  <http://savannah.gnu.org/bugs/?42671>

                 Summary: [PATCH] corr() does not have p-values output,
returns 1.0 with one observation.
                 Project: GNU Octave
            Submitted by: pkutin
            Submitted on: Thu 03 Jul 2014 12:36:46 PM GMT
                Category: Octave Function
                Severity: 3 - Normal
                Priority: 5 - Normal
              Item Group: Matlab Compatibility
                  Status: None
             Assigned to: None
         Originator Name: 
        Originator Email: 
             Open/Closed: Open
         Discussion Lock: Any
                 Release: dev
        Operating System: Any

    _______________________________________________________

Details:

The current corr.m lags behind MATLAB's corr in several ways. First, it has no
second output argument ('PVAL') for p-values. Consequently, there is also no
option for choosing which alternative hypothesis to test ('both', 'left' or
'right').

The attached patch adds the PVAL output for the two-sided case. As stated in
the MATLAB docs, MATLAB obtains these p-values by transforming r into values
that are t-distributed (assuming the input variables are uncorrelated
bivariate Gaussian); the patch does the same.
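
For illustration, the r-to-t transformation can be sketched in Octave as
follows. This is not the code from the attached patch: the function name
corr_pval_sketch is hypothetical, it handles two vectors only, and it
evaluates the Student t tail probability through the core function betainc
(an assumption made here so that the sketch does not depend on tcdf being
available).

```octave
1;  # keeps Octave from treating this as a function file when saved as a script

## Hypothetical helper, not the attached patch: Pearson r and two-sided
## p-value for two vectors of equal length.
function [r, p] = corr_pval_sketch (x, y)
  n = numel (x);
  xc = x(:) - mean (x(:));           # centered observations
  yc = y(:) - mean (y(:));
  r = (xc' * yc) / sqrt ((xc' * xc) * (yc' * yc));
  df = n - 2;                        # degrees of freedom of the t statistic
  if (df < 1 || isnan (r))
    p = NaN;                         # too few observations or zero variance
    return;
  endif
  ## Under the null hypothesis, t = r * sqrt (df / (1 - r^2)) follows a
  ## Student t distribution with df degrees of freedom.  Its two-sided tail
  ## probability equals the regularized incomplete beta function
  ## I_z (df/2, 1/2) with z = df / (df + t^2) = 1 - r^2.
  z = min (max (1 - r^2, 0), 1);     # guard against rounding outside [0, 1]
  p = betainc (z, df / 2, 1 / 2);
endfunction

[r, p] = corr_pval_sketch ([1 2 3 4]', [2 3 4 7]');
printf ("r = %.4f, p = %.4f\n", r, p);   # prints "r = 0.9562, p = 0.0438"
```

With a single observation both centered vectors are zero, so r evaluates to
0/0 = NaN and the sketch returns NaN for p as well.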

Additionally, when correlating data sets with only one observation, the
patched corr() returns NaN instead of 1 -- the Pearson correlation coefficient
is not defined in this case because neither variable has a defined variance.

*Patch message:*

corr.m: obtain p-values from r-to-t transformation; return NaN for 1
observation.

* when correlating data sets with one observation, return NaN instead of 1.
* use a transformation into a t-distributed variable (assuming the input
  variables are uncorrelated bivariate Gaussian) to obtain both-sided
  p-values


*Future directions:*
For corr() to accept key/value pairs like 'KIND', it would be nice to have a
shared helper that extracts them from the varargin passed to a function.
Searching the Octave sources for the key/value pattern suggests that this is
currently done by hand in each function.
The remaining measures of association -- spearman() and kendall() -- already
exist, so corr() could then dispatch to them, too. Estimating p-values for
them is a different story.
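
A factored key/value extractor of the kind suggested here might look roughly
like the sketch below. Everything in it is an assumption for illustration:
the name parse_kv_sketch, the convention that accepted keys are the
(lowercase) field names of a defaults struct, and the example keys "kind"
and "tail". It is not an existing Octave function.

```octave
1;  # keeps Octave from treating this as a function file when saved as a script

## Hypothetical helper: read 'KEY', VALUE pairs from an argument list.
## DEFAULTS is a struct whose field names are the accepted lowercase keys.
function opts = parse_kv_sketch (defaults, varargin)
  if (mod (numel (varargin), 2) != 0)
    error ("parse_kv_sketch: options must come in KEY, VALUE pairs");
  endif
  opts = defaults;
  for i = 1:2:numel (varargin)
    key = varargin{i};
    if (! ischar (key) || ! isfield (defaults, tolower (key)))
      error ("parse_kv_sketch: unknown option");
    endif
    opts.(tolower (key)) = varargin{i+1};   # keys are matched case-insensitively
  endfor
endfunction

defaults = struct ("kind", "pearson", "tail", "both");
opts = parse_kv_sketch (defaults, "Tail", "right");
printf ("kind = %s, tail = %s\n", opts.kind, opts.tail);
```

A function such as corr() could then call the helper on its trailing
arguments and branch on opts.kind to decide which measure to compute.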

Tests on MATLAB R2013a:

>> corr(1,2)
ans =
   NaN
>> [c,p]=corr([1 2]',[2 3]')
c =
    1.0000
p =
   NaN
>> [c,p]=corr([1 2 3]',[2 3 4]')
c =
    1.0000
p =
   9.4864e-09

>> [c,p]=corr([1 2 3 4]',[2 3 4 7]', 'tail','right')
c =
    0.9562
p =
    0.0219
>> [c,p]=corr([1 2 3 4]',[2 3 4 7]', 'tail','left')
c =
    0.9562
p =
    0.9781
>> [c,p]=corr([1 2 3 4]',[2 3 4 7]', 'tail','both')
c =
    0.9562
p =
    0.0438
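
The three 'tail' results above are consistent with deriving the one-sided
p-values from the two-sided one via the symmetry of the t distribution. The
sketch below reproduces them in Octave; it assumes that 'right' tests for a
correlation greater than zero and 'left' for one less than zero, and it uses
betainc for the t tail probability. It illustrates the relationship only and
is not part of the attached patch, which covers the both-sided case.

```octave
x = [1 2 3 4]';  y = [2 3 4 7]';
n = numel (x);
xc = x - mean (x);  yc = y - mean (y);       # centered observations
r = (xc' * yc) / sqrt ((xc' * xc) * (yc' * yc));
df = n - 2;

## Two-sided p-value of t = r * sqrt (df / (1 - r^2)) with df degrees of freedom.
p_both = betainc (1 - r^2, df / 2, 1 / 2);

## The t density is symmetric, so each tail holds half of the two-sided mass.
if (r >= 0)
  p_right = p_both / 2;
else
  p_right = 1 - p_both / 2;
endif
p_left = 1 - p_right;

printf ("r = %.4f  both = %.4f  right = %.4f  left = %.4f\n", ...
        r, p_both, p_right, p_left);
## prints "r = 0.9562  both = 0.0438  right = 0.0219  left = 0.9781"
```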




    _______________________________________________________

File Attachments:


-------------------------------------------------------
Date: Thu 03 Jul 2014 12:36:46 PM GMT  Name: corr-pval-1.patch  Size: 2kB  
By: pkutin

<http://savannah.gnu.org/bugs/download.php?file_id=31670>

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?42671>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/



