pspp-users

## Re: dummy coding of categorical variables results in zero coefficients and standard errors

From: Alan Mead

Subject: Re: dummy coding of categorical variables results in zero coefficients and standard errors

Date: Wed, 20 Dec 2023 04:16:44 -0600

User-agent: Mozilla Thunderbird

Tim,

NaN looks like a numerical error. I'm curious, how many levels does the variable have and how many dummy variables are you using?

If the original variable has K levels, you should have K-1 dummy variables. For example, if your variable were location (1=rural, 2=suburban, 3=urban) then you would pick one level to be the reference and create two dummy variables, perhaps:

recode location (1=1) (else=0) into dum1.

recode location (2=1) (else=0) into dum2.

Then the coefficients of dum1 and dum2 tell you how living in a rural (dum1) or suburban (dum2) area compares to living in an urban area.
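For readers who want to see the same K-1 coding outside PSPP, here is a minimal Python sketch. The `dummy_code` helper and the sample `location` values are made up for illustration and are not part of PSPP or of Tim's data; the helper simply mirrors the two RECODE statements above, with urban (3) as the reference level.

```python
def dummy_code(values, reference):
    """Return K-1 indicator columns, omitting the reference level."""
    levels = sorted(set(values))
    return {
        f"dum{level}": [1 if v == level else 0 for v in values]
        for level in levels
        if level != reference
    }

# Hypothetical data: 1=rural, 2=suburban, 3=urban
location = [1, 2, 3, 3, 2, 1, 3]

# Urban (3) is the reference, so only dum1 and dum2 are created.
dummies = dummy_code(location, reference=3)
for name, column in dummies.items():
    print(name, column)
```

Urban cases are 0 on both columns, which is why the two coefficients are read as contrasts against urban.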

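The model won't be identified if you use K dummy variables for K levels alongside the constant: the K dummies sum to 1 for every case, so together they are perfectly collinear with the intercept.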
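A small Python sketch of why K dummies fail, under the assumption that the model includes a constant. The data and the `matrix_rank` and `design` helpers are hypothetical and only for illustration; the rank is computed by plain Gaussian elimination with exact fractions, not by whatever method PSPP uses internally.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank by Gaussian elimination using exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry here.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

# Hypothetical data: 1=rural, 2=suburban, 3=urban
location = [1, 2, 3, 3, 2, 1, 3]

def design(levels_to_code):
    """Design matrix: a constant column plus one dummy per listed level."""
    return [[1] + [1 if v == lvl else 0 for lvl in levels_to_code]
            for v in location]

full = design([1, 2, 3])   # constant + K dummies: 4 columns
reduced = design([1, 2])   # constant + K-1 dummies: 3 columns

print("K dummies:  rank", matrix_rank(full), "of", len(full[0]), "columns")
print("K-1 dummies: rank", matrix_rank(reduced), "of", len(reduced[0]), "columns")
```

With all K dummies the rank is one less than the number of columns, so one coefficient cannot be estimated; dropping the reference dummy restores full column rank.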

I notice that both of the zeros are for xxx_1 variables, which suggests the categorical variable may not have been coded correctly. But I don't know if that's what you are seeing. You could also get zeros if there were no instances of that dummy code, but you shouldn't see NaN values. It could also be another problem, or a bug. In fact, I think it's probably a bug to see NaNs...
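To rule out the "no instances" case, it is enough to count the cases at each level before building the dummies. A minimal Python sketch with made-up data (in PSPP itself, running FREQUENCIES on the original variable gives the same information):

```python
from collections import Counter

# Hypothetical data in which no case is urban (3)
location = [1, 2, 2, 1, 2, 1, 2]
counts = Counter(location)

for level in (1, 2, 3):
    n = counts.get(level, 0)
    note = "  <- empty: its dummy would be 0 for every case" if n == 0 else ""
    print(f"level {level}: {n} cases{note}")
```

A dummy that is 0 for every case carries no information, so its coefficient cannot be estimated either.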

-Alan

On 12/20/23 3:46 AM, tim.goodspeed@btinternet.com wrote:

A basic stats question and a specific PSPP query, please.  Any help gratefully received.  I can’t see this in the archives anywhere (searching for ‘categorical’ and ‘dummy’).

For a linear regression, some variables are categorical and so included using dummy coding (Coding Systems for Categorical Variables in Regression Analysis (ucla.edu)).

Basic stats question: This results in a zero coefficient and zero standard error for some variables, as shown in the example below.  Is this correct?  Does it mean there is little or no linear relationship to be found?

Specific PSPP query: If there is little relationship, so that the coefficient is very small, is there a way to tell PSPP to show the very small value instead of zero?

Table: Model Summary (adjRA1SR1)

| R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|
| 0.55723 | 0.310505 | 0.302797 | 0.8359 |

Table: ANOVA (adjRA1SR1)

| | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 619.25791 | 22 | 28.148087 | 40.284698 | 0 |
| Residual | 1375.0987 | 1968 | 0.698729 | | |
| Total | 1994.3566 | 1990 | | | |

Table: Coefficients (adjRA1SR1)

| | B (unstandardized) | Std. Error (unstandardized) | Beta (standardized) | t | Sig. | 95% CI for B, Lower Bound | 95% CI for B, Upper Bound |
|---|---|---|---|---|---|---|---|
| (Constant) | 8.163407 | 0.310014 | 0 | 26.332394 | 0 | 7.555417 | 8.771397 |
| lnSTINC | -0.036745 | 0.011677 | -0.088107 | -3.146888 | 0.002 | -0.059645 | -0.013845 |
| RA1PKHSIZ | -0.011834 | 0.016218 | -0.020561 | -0.729708 | 0.466 | -0.043639 | 0.019971 |
| RA1PRAGE | -0.039326 | 0.011175 | -0.550388 | -3.519082 | 0 | -0.061242 | -0.01741 |
| sqPRAGE | 0.000464 | 0.000109 | 0.666977 | 4.258349 | 0 | 0.00025 | 0.000678 |
| RA1PRSEX | 0.13709 | 0.03935 | 0.068446 | 3.483888 | 0.001 | 0.059918 | 0.214261 |
| RA1PB19_1 | 0 | 0 | 0 | NaN | NaN | 0 | 0 |
| RA1PB19_2 | -0.485628 | 0.170694 | -0.054029 | -2.845015 | 0.004 | -0.820389 | -0.150867 |
| RA1PB19_3 | -0.324574 | 0.058981 | -0.109094 | -5.503011 | 0 | -0.440246 | -0.208902 |
| RA1PB19_4 | -0.333625 | 0.089807 | -0.074169 | -3.714896 | 0 | -0.509752 | -0.157497 |
| RA1PB1 | -0.002888 | 0.008407 | -0.007002 | -0.343559 | 0.731 | -0.019376 | 0.0136 |
| RA1SG17A_1 | 0 | 0 | 0 | NaN | NaN | 0 | 0 |
| RA1SG17A_2 | -0.061221 | 0.053837 | -0.021822 | -1.137147 | 0.256 | -0.166804 | 0.044363 |
| RA1PA1 | -0.15082 | 0.022182 | -0.160102 | -6.7991 | 0 | -0.194324 | -0.107317 |
| RA1PA2 | -0.248882 | 0.024367 | -0.243609 | -10.214077 | 0 | -0.29667 | -0.201095 |
| RA1SC1 | -0.328042 | 0.073134 | -0.08782 | -4.485512 | 0 | -0.471469 | -0.184614 |
| RA1PF3bin | 0.003064 | 0.041159 | 0.001422 | 0.074435 | 0.941 | -0.077655 | 0.083783 |
| RA1PF7A_2 | 0.009538 | 0.086914 | 0.002111 | 0.109735 | 0.913 | -0.160917 | 0.179992 |
| RA1PF7A_3 | 0.14177 | 0.166844 | 0.016081 | 0.849712 | 0.396 | -0.18544 | 0.468979 |
| RA1PF7A_4 | -0.104009 | 0.155971 | -0.01266 | -0.666848 | 0.505 | -0.409894 | 0.201877 |
| RA1PF7A_5 | 0.173309 | 0.59246 | 0.005486 | 0.292525 | 0.77 | -0.988606 | 1.335224 |
| RA1PF7A_6 | 0.064264 | 0.080864 | 0.01504 | 0.794712 | 0.427 | -0.094325 | 0.222853 |
| RA1PG2 | -0.350528 | 0.030049 | -0.233421 | -11.66509 | 0 | -0.40946 | -0.291597 |

```
--

President, Talent Algorithms Inc.

science + technology = better workers

https://talalg.com

Linus' Law: Given enough eyeballs, all bugs are shallow.

```