From: Alan Mead
Subject: Re: Pspp-users Digest, Vol 128, Issue 3
Date: Thu, 19 Jan 2017 17:20:48 -0600
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.6.0

I agree about adding analyses to PSPPIRE but I have to disagree about merging files (adding variables). I find it far easier to use syntax:

http://www.spss-tutorials.com/spss-match-files-command/

Just a few lines for most matches!

The problem is that there are several choices (e.g., a table merge vs. a "normal" merge) and it makes the dialog too complex (i.e., I don't always get the results I was expecting and I cannot tell what went wrong). I won't blame you if you learned the right way to do the merges you need to do in the dialog and now find that easy, but for a new dialog/GUI user it's not simple. The complexity of the underlying operations defies a trivially simple dialog interface. How many novice SPSS/PSPP users know what a "table match" is?

And from the perspective of adding the functionality to PSPPIRE, I think it would be a lot of work to support all the variations in merging. I added a tiny feature to PSPP and I didn't find the C impossible (although I needed help from John Darrington) but I have yet to figure out how to add things to PSPPIRE.

So, anyway, merging two files to add variables is simple:

    match files file = 'c:\whatever\file1.sav'
      /file = 'c:\whatever\file2.sav'
      /by MyKeyVariable
      /map .

file1.sav and file2.sav need to be sorted by MyKeyVariable (which, obviously, has to exist and be compatible in both files) so the whole syntax I use is just a little longer:

    get file = 'c:\whatever\file1.sav'.
    execute.
    sort cases by MyKeyVariable.
    compute dum1=1.
    execute.
    save /outfile = 'c:\whatever\file1.sav'.

    get file = 'c:\whatever\file2.sav'.
    sort cases by MyKeyVariable.
    compute dum2=1.
    execute.
    save /outfile = 'c:\whatever\file2.sav'.

    match files file = 'c:\whatever\file1.sav'
      /file = 'c:\whatever\file2.sav'
      /by MyKeyVariable
      /map .

    recode dum1 dum2 (SYSMIS=0).
    execute.
    compute dum=dum1+dum2.
    freq / dum1 dum2 dum.

    temporary.
    select if( dum < 2).
    print / dum dum1 dum2 MyKeyVariable lname fname .
    execute.

The business about DUM, DUM1 and DUM2 will help you identify mismatches. DUM1 will be 1 for all cases from file1, and DUM2 will be 1 for all cases from file2.
After recoding missing to zero and adding DUM1+DUM2, if DUM <> 2 then the case is the result of a mismatch. The PRINT statement will print these variables, plus lname and fname, for every mismatched case so that you can try to match the cases up.

I learned this method around 1993... I know that a built-in feature that works like dum has since been added (and MAP visually shows in the output where variables came from). I'd be curious if there's a better way of doing this today.

Of course, if you work with clean data that never have mismatches then you don't need to worry about this. I have never been so fortunate.

-Alan

On 1/19/2017 4:52 PM, Dr. Oliver Walter wrote:

-- 
Alan D. Mead, Ph.D.
President, Talent Algorithms Inc.

science + technology = better workers

http://www.alanmead.org

I've... seen things you people wouldn't believe... functions on fire in a copy of Orion. I watched C-Sharp glitter in the dark near a programmable gate. All those moments will be lost in time, like Ruby... on... Rails... Time for Pi.

--"The Register" user Alister, applying the famous "Blade Runner" speech to software development
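
The built-in feature mentioned in the message is presumably the IN subcommand of MATCH FILES, which creates a 0/1 flag for each input file. What follows is only a minimal sketch of how it could replace the dum1/dum2 bookkeeping. It assumes both files are already sorted by MyKeyVariable and reuses the placeholder paths from the message. The names in1, in2 and nsrc are made up for illustration.

```spss
* Sketch only: placeholder paths, both files assumed sorted by MyKeyVariable.
match files file = 'c:\whatever\file1.sav' /in = in1
  /file = 'c:\whatever\file2.sav' /in = in2
  /by MyKeyVariable.

* in1 and in2 are 1 when that file contributed to the case and 0 otherwise.
compute nsrc = in1 + in2.
frequencies /variables = in1 in2 nsrc.

* List only the cases that were found in just one of the two files.
temporary.
select if (nsrc < 2).
list variables = MyKeyVariable in1 in2.
```

Because the IN flags are set to 0 rather than left missing, the RECODE step from the longer syntax should not be needed here, and the input files do not have to be re-saved with an extra variable.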