bug-gnu-pspp

Re: PSPP-BUG: Logistic Regression bugs


From: Renan Levine
Subject: Re: PSPP-BUG: Logistic Regression bugs
Date: Tue, 13 Nov 2012 20:28:31 -0500
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1

Dear Mr. Darrington,

I understand that logistic regression is in development, and am happy to help in any way I can. When I teach social statistics, we end with a very basic introduction to logistic regression, so a logistic regression routine in PSPP will enable me to cover almost the entire course (and then some!) with PSPP.

The problem with the error message concerns only the dichotomous dependent variable, not the predictor variables. Missing values on the predictor variables do not pose any problems: cases with missing values on any independent variable are dropped, just as in OLS regression.

I think that, unequivocally, the routine should ignore all missing values on the dependent variable and consider only the non-missing categories. For example, Stata's manual says: "logit fits a maximum-likelihood logit model. depvar=0 indicates a negative outcome; depvar!=0 & depvar!=. (typically depvar=1) indicate a positive outcome."
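
To make the rule concrete, here is a minimal Python sketch of the behaviour described above. It is only an illustration, not PSPP's or Stata's actual code: the function names `is_dichotomous` and `code_outcome` are hypothetical, and `None` stands in for a system-missing value.

```python
from typing import Iterable, List, Optional


def is_dichotomous(values: Iterable[Optional[float]]) -> bool:
    """Return True if the non-missing values take exactly two distinct values."""
    distinct = {v for v in values if v is not None}  # missing is not a category
    return len(distinct) == 2


def code_outcome(values: Iterable[Optional[float]]) -> List[Optional[int]]:
    """Stata-style coding: 0 -> 0, any other non-missing value -> 1, missing stays missing."""
    return [None if v is None else int(v != 0) for v in values]


# A dependent variable coded 0, 1 and missing, as is common in survey data.
depvar = [0, 1, None, 1, 0, None]
print(is_dichotomous(depvar))
print(code_outcome(depvar))
```

Under this rule, a variable coded 0, 1 and missing passes the dichotomy check, which is the behaviour I would expect from the logistic regression routine.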

The way I understand that SPSS statement (if it's not a typo) is that the SPSS routine will generate a predicted value for any observation with a missing value on the dependent variable, provided that none of the independent variables is missing for that observation. This is one way that some people use maximum likelihood techniques to impute missing values.
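
As a rough sketch of that reading (my interpretation, not SPSS's or PSPP's documented algorithm), the cases can be split into those used to estimate the model and those that can receive a predicted value. The name `split_cases` and the data layout are hypothetical, and `None` again stands in for a missing value.

```python
from typing import List, Optional, Sequence, Tuple

# A case is (dependent value, list of predictor values); None means missing.
Case = Tuple[Optional[int], Sequence[Optional[float]]]


def split_cases(cases: Sequence[Case]) -> Tuple[List[int], List[int]]:
    """Return (indices used for estimation, indices eligible for a predicted value)."""
    estimation: List[int] = []
    predictable: List[int] = []
    for i, (y, xs) in enumerate(cases):
        if all(x is not None for x in xs):
            # Complete predictors: a predicted value can be computed.
            predictable.append(i)
            if y is not None:
                # The outcome is also observed: the case enters the model fit.
                estimation.append(i)
    return estimation, predictable


cases = [
    (1, [2.0, 0.5]),     # complete case
    (None, [1.0, 3.0]),  # missing outcome, complete predictors
    (0, [None, 1.0]),    # missing predictor: dropped entirely
    (0, [4.0, 2.0]),     # complete case
]
est, pred = split_cases(cases)
print(est, pred)
```

Here the second case does not contribute to the estimates but would still get a predicted value, while the third case is dropped from both sets.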

Yours,
Renan



On 13-Nov-12 3:29 PM, John Darrington wrote:
You are right.

As you may have noticed, the logistic regression feature is still in 
development.

As of the version you mentioned, the /CATEGORICAL subcommand was not
implemented, but it has been since.  Try the most recent HEAD.

I think there may still be a problem with missing values in the categorical
variables, and I am working on that.  Missing values in the non-categorical
predictor variables should be ok though.

I'm unsure exactly how it should behave in response to missing values on the
dependent variable.  The SPSS documentation says:
  "For a case with a missing value on the dependent variable, predicted values
   are calculated if it has non-missing values on all independent variables."
This statement doesn't make sense to me.  Why would a predicted value be
calculated?  From what?  I'm still thinking about that ...  Have you any
idea?

So thanks for your feedback.  I appreciate you taking the time to test and
report these things.

If you can do some similar tests with a very recent version
(3cd65292e3cc6bd6532214dcc8c8ddc65bdc2972 or later), I would appreciate it.

Particularly, more tests with missing values and with weighted values would
be great.

Regards,

John

On Tue, Nov 13, 2012 at 01:08:19PM -0500, Renan Levine wrote:
Dear PSPP users and programmers,

Thank you for your hard work in developing new capabilities for PSPP. I'm using the most recent version of PSPP, psppire.exe 0.7.9-gaef7f5.

I recently encountered some problems while running logistic regressions:

1) There appears to be a bug in the logistic regression routine that causes it to recognize missing values in the dependent variable as a value category. So, even when a variable is coded 0, 1 and [system] missing (common in public opinion data), PSPP gives an error message: "Dependent variable's values are not dichotomous."

I've run logit analyses on three different .por and one .sav datasets, tried to see if user-missing is treated differently than system-missing, and if declaring missing values works any differently than a recode statement. The only way I manage to run a logistic regression is if I recode the dependent variable to be two integers with no missing values.

2) Less critically, I'm not sure the syntax /CATEGORICAL=var is working correctly. When I include that line, letting the computer know that an independent variable is dichotomous, I get an error message: .3-13: error: Syntax error at 'categorical'. HOWEVER, just including the variable on the initial line with the other independent variables seems to work (I can't be certain because I did not cross-reference my results with another statistics program).

Yours,
Renan


--
Renan Levine
Department of Political Science
University of Toronto - Scarborough
address@hidden
http://individual.utoronto.ca/renan
(416) 208-2651



