pspp-dev
Re: good sample data sets for use in documentation


From: Ben Pfaff
Subject: Re: good sample data sets for use in documentation
Date: Mon, 27 Oct 2008 09:55:26 -0700
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.2 (gnu/linux)

Jason Stover <address@hidden> writes:

> On Sun, Oct 26, 2008 at 10:49:52AM -0700, Ben Pfaff wrote:
>> I would like to start including examples in the PSPP
>> documentation that work with realistic, interesting data sets
>> that we also include with PSPP.  To do this, I need some freely
>> distributable (ideally, public domain) data sets.  I have found
>> some of these on the web, but none seems really perfect, and I
>> wonder whether any of you have data sets to suggest?
>
> Do you mean data sets posted by organizations that collected data as
> part of a designed experiment or observational study, or just anything
> we cobbled together?
>
> I have some of the latter.

It's probably good to have a mix of both.  Yesterday, I was
looking around for the former.  Based on my web searches, a few
other properties are nice to have, though not strictly necessary:

        - Not too specific to any particular country or region,
          so that they will be more likely to be interesting to
          users throughout the world.

        - Formatted to be easily imported.  Notably, Excel
          spreadsheets are not particularly easy at the moment,
          and there are lots of websites with HTML tables that
          don't provide any other format.

        - At least mildly interesting to me, and about something
          I understand.  (Obviously this is highly subjective.)

I guess that's all that comes to mind.

I found some that meet my criteria at the US Census Bureau: world
birth rates, infant mortality rates, population, etc., by year,
country, and region.  The data is broken up into several tables,
which meets my immediate need for data sets to combine with
MATCH FILES, ADD FILES, and UPDATE.

> R contains some publicly distributable data, too.

I didn't know that.  I should have a look.
-- 
I love deadlines.
I love the whooshing noise they make as they go by.
--Douglas Adams



