pspp-dev

Re: category.c


From: Jason Stover
Subject: Re: category.c
Date: Tue, 21 Mar 2006 16:05:48 -0500
User-agent: Mutt/1.5.10i

On Tue, Mar 21, 2006 at 08:35:52AM +0800, John Darrington wrote:
> On Mon, Mar 20, 2006 at 10:03:27AM -0500, Jason Stover wrote:
> 
>      
>      > 3. cat_value_update seems to do nothing for numeric variables.  Why is
>      >    this?  A numeric variable can be used as a categorical variable
>      >    just as easily as an alpha one.
>      
>      Good point. Encoding numeric data as categorical is usually a mistake
>      from a statistical standpoint, but there are circumstances when
>      treating a numeric variable as categorical makes perfect sense, so
>      maybe cat_value_update() shouldn't care what type of variable it is
>      looking at. This is where the question 'should we protect the user?'
>      comes up. Someone with a numeric variable that has, say, 10^5 distinct
>      values and inadvertently treats that variable as categorical could
>      wind up running a procedure with 0 or negative degrees of freedom;
>      slowing the machine down to a crawl; or, worst of all, finding bugs
>      we'd rather not know about. But users should probably have the ability
>      to treat numeric data as categorical if they want to.
> 
> I'm not a statistician, so I can't comment on whether numeric
> variables "ought" to be used as categorical ones.  But I've
> seen *many* examples where this is done.  Most demonstrations of
> T-TEST do something like 0 = Male, 1 = Female.  I've even seen reports
> telling me that a person's average sex is 0.54.  Maybe we could have a
> very mild warning if a categorical variable is numeric.

Yeah. And I don't think the warning is necessary. (I was thinking users
should enter a '0' or '1' but make the type categorical, but that doesn't 
happen, and often shouldn't happen, as in the case where 'average sex is
.54' just means '54% female.')
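
To make the arithmetic concrete: with a 0/1 coding, the mean of the
variable is just the fraction of cases coded 1. Here is a minimal sketch
in C. It is not taken from PSPP; the function name mean_of_indicator and
its interface are made up for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of PSPP.
   Returns the arithmetic mean of N values, each coded 0 or 1.
   Under that coding the mean equals the proportion of cases
   coded 1. */
double
mean_of_indicator (const int *values, size_t n)
{
  size_t i;
  double sum = 0.0;

  assert (n > 0);
  for (i = 0; i < n; i++)
    {
      /* Anything other than 0 or 1 breaks the interpretation
         of the mean as a proportion. */
      assert (values[i] == 0 || values[i] == 1);
      sum += values[i];
    }
  return sum / (double) n;
}
```

For example, five cases coded 0, 1, 1, 0, 1 give a mean of 3/5, that is,
60% of the cases are coded 1.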

-Jason




