pspp-dev
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Directory restructuring


From: Ben Pfaff
Subject: Re: Directory restructuring
Date: Sat, 04 Feb 2006 13:47:13 -0800
User-agent: Gnus/5.110004 (No Gnus v0.4) Emacs/21.4 (gnu/linux)

First a little commentary on oddities I noticed:

        vfmP.h -> delete it

                It's empty now, so it can go away.
                I don't know why it still exists.

        language/file-handle.[hq] -> break out fh_parse()

                The fh_parse() function is used all over the
                language/ directory but it's not part of the FILE
                HANDLE command.

        data/format.h -> break out parse_format_specifier*()

                parse_format_specifier() and
                parse_format_specifier_name() are defined in
                language/lexer/format-prs.c, so their prototypes
                should be in language/lexer/format-prs.h.

        data/var.h -> break out parsing functions

                Similarly, the parsing functions should have
                prototypes in language/lexer.

        language/stats/sort-prs.? -> break out parsing functions

                sort_parse_criteria() and related functions
                should be moved into language/lexer, I think.
                Then sort-prs.? can be renamed sort.?.

        libpspp/magic.? -> merge into data/val.?

                Everything in this file has to do with data
                values, and so it should be in the same place.
                Besides, the name "magic" is horrible.

        libpspp/misc.? -> ?

                Not sure what to do about this, but it's not
                good.

        math/sort.? -> move to data/sort.?

                I'd argue that sorting is a fundamental operation
                on data.

        output/htmlP.h -> delete

                Only two things get declared here.  struct
                html_driver_ext is only used in html.c so it
                should be moved there.  The html_class
                declaration isn't needed anywhere, except in
                output.c which already has its own extern
                declaration, so it can be deleted.

        data/data-out.h -> need this header

                For some reason data_out() is prototyped in
                format.h, but it should have its own header.

Now, to propose some renamings.  At top level:

        vfm.? -> procedure.?

                This code executes procedures.  "vfm" is too
                grandiose and obfuscatory.

In the `data' directory:

        sfm-read.? -> sys-file-reader.?
        sfm-write.? -> sys-file-writer.?
        pfm-read.? -> por-file-reader.?
        pfm-write.? -> por-file-writer.?

                These make it obvious what they actually do.

        sfmP.h -> sfm-private.h

                I picked up the P-suffix convention from some
                other project as meaning "private", but it's
                probably better to be explicit these days when
                even MS operating systems support long file
                names.

        file-handle-def.? -> file-handle.?

                The directory name makes the -def suffix
                unnecessary now.

        lex-def.? -> tokens.?

                Ditto.  But this is really about the definition
                of a token now; no actual lexical analysis is
                happening, so I think that "tokens" is a better
                name.

        vars-atr.c -> var.c

                Ditto.

        data-in.? -> ???

                This code converts data from external
                representation (raw bytes) to internal
                representation (struct value).  "data-in" isn't
                the most straightforward name.  Maybe
                "value-parser"?

        data-out.? -> ???

                This code converts the opposite direction.  Maybe
                "value-formatter"?

In language/data-io:

        dfm-read.? -> data-reader.?
        dfm-write.? -> data-writer.?

                These names better explain what the files do.

In language/lexer:

        vars-prs.c -> vars.c (or variables.c?)

                The directory name makes the -prs suffix
                unnecessary.  We need a corresponding header too.

In language/stats:

        descript.c -> descriptives.c

                No reason to abbreviate.

        sort-prs.? -> sort.?

                The parsing functions should get moved to
                language/lexer, and then there's no reason to
                keep "-prs" in the name.

In language/xforms:

        sel-if.c -> select-if.c

                No reason to abbreviate.

In libpspp:

        algorithm.? -> array.?

                These are in fact algorithms for use on arrays.
                I'm planning to introduce a similar set for use
                on linked lists, so it'd be nice to distinguish
                clearly.

In output:

        som.? -> manager.?

                It manages output.

        tab.? -> tables.?

                No reason to abbreviate.

John Darrington <address@hidden> writes:

> 1. I left src/data as one directory instead of src/data and
> src/data-io no particular reason except that it seemed an
> unecessary distinction.

Okay.  I liked the idea of splitting out i/o because that allows
the "core" data source files to be easily spotted.  Eventually we
may add support for more data formats too--for example you've
proposed supporting OpenOffice's spreadsheet format, I believe.
But I'm not prepared to argue about it; not a big deal.

> 2. The files which I've left in src are those which are
> currently in the "too hard" basket.  This means:
>
>       error.c : Should really be in libpspp, but it has too many 
> dependencies.  
>                 IMHO this interface needs to be rethought anyway.

I understand that it's not the right interface for the GUI to
use.  Can you explain other objections?  Do you have suggestions?

>       main.[ch]: I don't know why we need a main.h ?

main.c exports some interfaces?

>       glob.[ch]: This seems to be the rubbish bin for things that don't fit 
>                  elsewhere.

At one point I was convinced that all global data should be
declared in a single source file.  Later I decided that was a
stupid idea and I've been moving variables out of here since
then.  It's probably time to finish the job.

>       getl.c:    This is just a pain, which keeps biting me whenever I kick 
> it.

Maybe you can help me to understand.

>       vfm* :     You suggested this ought to go into data.  It certainly 
> can't go 
>                  there in its current form.  It just depends on too many 
> things. 
>                    The code in there frightens me ....

I'm not surprised that it's a bit scary.  The code in here
touches practically everything else.  I'm not sure what to do
about it.  Some of it could probably be pushed into the
dictionary, other parts could be split out: perhaps the
stream-related bits could be in their own files and perhaps
transformation handling and SPLIT FILE related code is separable
as well.

It might be best to basically postpone dealing with vfm.

> 3.  The dependency diagram shows a couple of nasty circular
>     dependencies involving src/math, src/output and src/output/chart.  I
>     need to look at these.  It seems clear to me that the charting API
>     needs work.

A few greps don't turn up any direct dependency from src/output
to src/math for me.  Can you elaborate on that?

I've never looked at the charting API with any amount of
scrutiny.

> 4.  linked-list.* was unused so I deleted them.

OK.

> 5.  You suggested title.c and temporary.c go in
>     src/language/dictionary  .  This didn't make sense to me, so I've
>     put them in src/language/utilities and src/language/control
>     respectively.

OK.

> 6.  src/output depends on src/language which seems wrong to me.  In
>     fact src/language/ contains only command.* and it's a very common
>     dependency, so perhaps this needs to be split or something.

I think src/output only wants command.h because it wants
cur_proc, the name of the current command, to include it in the
output.  We could reverse the dependency by making command.c call
a function in src/output.  src/language definitely needs to call
into src/output in any case, so how's that sound to you?

[...]

> 10. src/output is depending on src/data which seems wrong.  I think
>     this is because the ascii driver needs a filename or something.

src/output directly uses the following from src/data according to
a little bit of grepping:

        filename.?
        mkfile.?
        settings.?

                These are general-purpose code, not data-specific,
                so I'd move them to libpspp.

        format.?

                tab_value() takes a fmt_spec and formats the
                table cell accordingly.  tab_float() does
                something related.  

                We could use a helper routine in src/language
                instead, if you'd prefer.

        var.h

                This dependency is unneeded.  You can just delete
                the #include for it from tab.c without problems.

> 11. Although it has a similar name, dfm-* is a quite different thing
>     to sfm-* and pfm-*, and dfm-* has a lot of other dependencies.  So
>     I've put it in language/data-io which seems to work ok.

OK.  Also all of those are bad names so I've suggested new ones
above.

> I think the most urgent issues are to sort out the remaining files in
> src (especially vfm*) and to sort out the output/chart business.  I'm
> sure we'll have to go several more iterations, but it's a start.

I'm willing to help, let me know what you'd like help with.
-- 
"Mon peu de succès près des femmes est toujours venu de les trop aimer."
--Jean-Jacques Rousseau




reply via email to

[Prev in Thread] Current Thread [Next in Thread]