pspp-dev

Re: naming data sets


From: Ben Pfaff
Subject: Re: naming data sets
Date: Fri, 16 Dec 2005 08:39:02 -0800
User-agent: Gnus/5.110004 (No Gnus v0.4) Emacs/21.4 (gnu/linux)

Jason Stover <address@hidden> writes:

> It's time for me to write a routine that saves the residuals of
> a regression model to the data. I have avoided doing this until
> now because I want the user to be able to have a choice of saving
> residuals to different data sets.
>
> Right now there is only one data set we can refer to, and that
> is a serious limitation. How difficult would it be to make PSPP
> able to recognize different data sets? If there is only one
> dictionary? 

I've thought about this for a while and I think I have a
proposal.

First, I think that my suggestion that all the data sets have the
same dictionary was flawed.  The problem is that the dictionary
for whatever data set you're working with can change without the
other dictionaries changing in a similar manner.  For example,
COMPUTE can add a variable to the current data set's dictionary,
but if we wanted to add that variable to the other data sets'
dictionaries, what should its values be?  We'd essentially have
to perform all transformations on all the existing data sets,
and I don't think that really makes sense (if it does to you,
please explain).
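
To make the problem concrete, here is a toy C sketch of two data
sets sharing one dictionary.  Everything in it (struct shared_dict,
struct data_set, compute_new_variable) is invented for illustration
and is not PSPP's actual code; a data set is reduced to a count of
values per case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a dictionary: just a variable count. */
struct shared_dict { size_t n_vars; };

/* Hypothetical data set: points at a dictionary it does not own
   and records how many values each of its cases holds. */
struct data_set {
    const struct shared_dict *dict;
    size_t values_per_case;
};

/* Returns nonzero if every case has one value per variable. */
int data_set_matches_dict(const struct data_set *ds)
{
    assert(ds != NULL && ds->dict != NULL);
    return ds->values_per_case == ds->dict->n_vars;
}

/* Models COMPUTE creating a new variable: the dictionary grows and
   only the active data set gets a computed value for each case. */
void compute_new_variable(struct shared_dict *dict,
                          struct data_set *active)
{
    assert(dict != NULL && active != NULL);
    dict->n_vars++;
    active->values_per_case++;
}
```

After one call to compute_new_variable() on the active data set,
any other data set pointing at the same dictionary no longer
matches it, which is the "what should its values be?" question in
code form.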

Thus, I propose that we support multiple data sets, each of which
has an independent dictionary.  We introduce a new type of file
handle for these data sets.  Tentatively I'll call these
"temporary" file handles and "temporary" data sets (but better
terminology is welcome).  Access to temporary data sets is
through temporary file handles, using the usual commands for
accessing system files (GET, SAVE, XSAVE).

Here's what I'd add to the PSPP language:

        * Some extra syntax on FILE HANDLE for declaring
          temporary file handles (MODE=TEMPORARY perhaps).

        * A new command for destroying temporary file handles
          (e.g. CLOSE FILE HANDLE or DELETE FILE HANDLE), so that
          the memory or disk space used to store them can be
          freed up.

        * GET, SAVE, and XSAVE would be extended to read from and
          write to temporary file handles.  I'd introduce some
          kind of syntactic sugar so that it isn't strictly
          necessary to declare temporary file handles in advance,
          e.g. something like XSAVE OUTFILE=TEMPORARY
          <HANDLENAME> would work properly.

Internally, a temporary file handle would be represented by a
dictionary plus a casefile, I think.  When PSPP terminates, the
data in temporary file handles would automatically disappear,
just like data in the active file.
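
A rough C sketch of that representation follows.  It is only a
sketch under assumptions: struct sketch_dict and struct
sketch_casefile are placeholders for the real dictionary and
casefile types, and the temp_handle_* names are hypothetical.
temp_handle_open() creates a handle on first use, which is the
kind of "no advance declaration" behavior suggested above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Placeholders; the real dictionary and casefile types differ. */
struct sketch_dict { size_t n_vars; };
struct sketch_casefile { size_t n_cases; };

/* A temporary file handle: a name, its own dictionary, its cases. */
struct temp_handle {
    char name[65];
    struct sketch_dict dict;
    struct sketch_casefile cases;
    struct temp_handle *next;
};

static struct temp_handle *temp_handles;  /* All live handles. */

/* Returns the handle called NAME, or NULL if there is none. */
struct temp_handle *temp_handle_lookup(const char *name)
{
    assert(name != NULL);
    for (struct temp_handle *h = temp_handles; h != NULL; h = h->next)
        if (strcmp(h->name, name) == 0)
            return h;
    return NULL;
}

/* Returns the handle called NAME, creating an empty one if needed.
   Returns NULL if NAME is too long or memory is exhausted. */
struct temp_handle *temp_handle_open(const char *name)
{
    struct temp_handle *h = temp_handle_lookup(name);
    if (h != NULL)
        return h;
    if (strlen(name) >= sizeof h->name)
        return NULL;
    h = calloc(1, sizeof *h);
    if (h == NULL)
        return NULL;
    strcpy(h->name, name);
    h->next = temp_handles;
    temp_handles = h;
    return h;
}

/* Destroys the handle called NAME.  Returns 1 if it existed. */
int temp_handle_close(const char *name)
{
    assert(name != NULL);
    for (struct temp_handle **pp = &temp_handles; *pp != NULL;
         pp = &(*pp)->next)
        if (strcmp((*pp)->name, name) == 0) {
            struct temp_handle *dead = *pp;
            *pp = dead->next;
            free(dead);
            return 1;
        }
    return 0;
}

/* Destroys every handle, as would happen when PSPP terminates. */
void temp_handle_close_all(void)
{
    while (temp_handles != NULL) {
        struct temp_handle *dead = temp_handles;
        temp_handles = dead->next;
        free(dead);
    }
}
```

Because each handle owns its dictionary, changing one handle's
variables leaves every other handle untouched.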

This would be pretty easy to implement.  Would it allow you to do
what you want?  Does it seem like a good idea?
-- 
On Perl: "It's as if H.P. Lovecraft, returned from the dead and speaking by
seance to Larry Wall, designed a language both elegant and terrifying for his
Elder Things to write programs in, and forgot that the Shoggoths didn't turn
out quite so well in the long run." --Matt Olson
