bug-gnu-pspp

From: anonymous
Subject: PSPP-BUG: [bug #40864] When a variable has no label, DISPLAY DICTIONARY puts the "Format:" line in it's place
Date: Wed, 11 Dec 2013 13:32:28 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:25.0) Gecko/20100101 Firefox/25.0

Follow-up Comment #8, bug #40864 (project pspp):

@comment5, adding "Label:" to the varlabel:

This sounds like a good idea. Easy and unambiguous can only help.
OTOH, this 
sed 's/^\([a-zA-Z0-9]\+\t\)\(Format: [AF][0-9.]\+\)\(\t[0-9]\+\)/\1\3\n\t\2 /'
is all it takes to "fix" the empty varlabel for me, converting
<Var>   "Format: foo"  <Position>
to
<Var>           <Position>
        "Format: foo"
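
For anyone without GNU sed at hand, the same rewrite can be sketched in Python. This is only an illustration under the same assumptions as the one-liner above: fields are separated by single tabs, the variable name is alphanumeric, and the misplaced entry looks like "Format: F1.0" or "Format: A8". The function name move_format_line is made up for this sketch.

```python
import re

# Assumed layout (same as the sed one-liner): <name> TAB "Format: <fmt>" TAB <position>
PATTERN = re.compile(
    r"^([a-zA-Z0-9]+\t)(Format: [AF][0-9.]+)(\t[0-9]+)",
    re.MULTILINE,
)

def move_format_line(text: str) -> str:
    """Move a "Format:" entry out of the label column onto its own line.

    Lines whose second field is a real label do not match and stay unchanged.
    """
    # \1 = name + tab, \3 = tab + position, \2 = the "Format: ..." text
    return PATTERN.sub(r"\1\3\n\t\2 ", text)

if __name__ == "__main__":
    print(repr(move_format_line("v1\tFormat: F1.0\t7")))
```

A line that already carries a label in the second column is left alone, which is why the pattern is not ambiguous in practice.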

Ambiguity has a hard time with this (unless fabricated).


@DDI-XML:

Yes, we will use the DDI-XML as our archive format, and intend to distribute
it with the .csv as well.
That said, there are several DDI dialects, as in DDI-2(.x), DDI2-Workbook
(which we use for in-house transfers), DDI-3 and DDI-3 Lifecycle.

While I dare rule out DDI Lifecycle, as it is overly complex for this task, I
intend to develop and propose DDI2- and DDI-3 compliant formats for this.
This will not come to pass this year, but I will happily provide you with the
definition we arrive at once I get the green light from either a member of the
technical committee or the committee itself. Let's see where this goes.


Digression:

But XML being XML, I developed a format for humans to read really easily and
do ad-hoc parsing/filtering on.

Some properties:
- Variable name shall be prominently visible
- never use more than one whitespace char between things
- No quoting, ever. Some studies use quotes, brackets and whatnot even in
value labels.
- All content is located after the first tab
- A space is only used for offsetting variable attributes (and in content,
obviously)
- One attribute per line, so filtering can always operate on a line basis
(think grep)
- Attributes are prefixed with "V_", again to allow easy filters/parsers
- A blank line as record separator, again, both for readability and allowing
more methods to separate records.

In effect, it introduces redundant unambiguity all around and looks like
this:

Var:    caseno
 V_Label:       original respondent number
 V_Index:       6
 V_Format:      F4.0
 V_Measure:     Scale

Var:    v1
 V_Label:       how important in your life: work (Q1A)
 V_Index:       7
 V_Format:      F1.0
 V_Measure:     Scale
 V_Missing:     -5 THRU -1
        -5      other missing
        -4      question not asked
        -3      not applicable
        -2      no answer
        -1      don't know
        1       very important
        2       quite important
        3       not important
        4       not at all important
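
To show how little is needed for the ad-hoc parsing mentioned above, here is a minimal Python sketch of a reader for this layout. It assumes what the property list states: a single tab separates the key from the content, attribute lines start with one space followed by "V_", value-label lines start with a tab, and a blank line separates records. The function name parse_records and the dictionary keys are invented for this example and are not part of any PSPP or DDI tool.

```python
def parse_records(text):
    """Parse the Var:/V_ record layout into a list of dicts.

    Assumes tab-separated key and content, and blank lines between records.
    """
    records = []
    for block in text.strip("\n").split("\n\n"):
        record = {"name": None, "attributes": {}, "value_labels": {}}
        for line in block.split("\n"):
            if not line.strip():
                continue  # tolerate stray blank lines
            key, _, content = line.partition("\t")
            if key == "Var:":
                record["name"] = content
            elif key.startswith(" V_"):
                # " V_Label:" -> "V_Label"
                record["attributes"][key.strip().rstrip(":")] = content
            elif key == "":
                # value-label line: TAB value TAB label
                value, _, label = content.partition("\t")
                record["value_labels"][value] = label
        if record["name"] is not None:
            records.append(record)
    return records

sample = (
    "Var:\tcaseno\n"
    " V_Label:\toriginal respondent number\n"
    " V_Index:\t6\n"
    "\n"
    "Var:\tv1\n"
    " V_Label:\thow important in your life: work (Q1A)\n"
    " V_Missing:\t-5 THRU -1\n"
    "\t-5\tother missing\n"
    "\t1\tvery important\n"
)

if __name__ == "__main__":
    for rec in parse_records(sample):
        print(rec["name"], rec["attributes"].get("V_Label"))
```

Because every attribute sits on its own line, the line-based filtering (think grep) still works on the raw file without any parser at all.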

Yes, so what, another newly invented format. But I consulted a lot of
different outputs, and none of them was _that_ friendly.
To my eyes :-) 
End of commercial.

/Digression.

@comment6, recutils:

Thanks for pointing out recutils, this looks rather sexy, and has escaped me
so far. Will go play :-)



    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?40864>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/