Re: [Anarchdb-devel] Database population


From: Francois Gombault
Subject: Re: [Anarchdb-devel] Database population
Date: Mon, 12 May 2003 22:44:12 +0200
User-agent: KMail/1.5

[redirecting the mail to anarchdb-devel]

Andrew P Vanpernis wrote:
> I placed two text files into the cards directory of ARDB's CVS
> repository.
> One is the complete (minus Anarchs) card list from the White Wolf
> webpage. Unfortunately this file lacks any sort of rarity information.
> The second is a complete (minus Anarchs) list of rarities that can be
> found at http://www.thelasombra.com/cardlists.htm.

Fine. This is the best material we can find, so we'll see what we can do from 
this.

> The problem seems to be combining the information in these files and
> organizing them into a CSV file. I was just trying to think of some
> ideas for how to do this, and what the output should look like. Do we
> want multiple listings for a card within a given set? For example, with
> the card Academic Hunting Ground, do we want this?
>
> Academic Hunting Ground, Jyhad, Uncommon, ...
> Academic Hunting Ground, VtES, Uncommon, ...
> Academic Hunting Ground, Camarilla, Preconstructed Tremere, ...
> Academic Hunting Ground, Camarilla, Uncommon, ...
>
> Or this?
>
> Academic Hunting Ground, Jyhad, Uncommon, ...
> Academic Hunting Ground, VtES, Uncommon, ...
> Academic Hunting Ground, Camarilla, Preconstructed Tremere | Uncommon,
> ...

I think I prefer the second one, as we can handle multiple rarities pretty 
well with SQL queries. I see no need for duplicating the entries.
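To illustrate how a combined rarity field could still be queried, here is a minimal sketch using Python's sqlite3 module. The table name, column names and the helper sets_with_rarity are assumptions made for the example, not ARDB's actual schema; the rows come from the listing quoted above.

```python
import sqlite3

# Hypothetical schema: one row per (card, set), rarities joined with " | ".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cards (name TEXT, card_set TEXT, rarity TEXT)")
conn.executemany(
    "INSERT INTO cards VALUES (?, ?, ?)",
    [
        ("Academic Hunting Ground", "Jyhad", "Uncommon"),
        ("Academic Hunting Ground", "VtES", "Uncommon"),
        ("Academic Hunting Ground", "Camarilla",
         "Preconstructed Tremere | Uncommon"),
    ],
)


def sets_with_rarity(conn, rarity):
    """Return the sets where some card's rarity field contains `rarity`."""
    rows = conn.execute(
        "SELECT DISTINCT card_set FROM cards "
        "WHERE rarity LIKE ? ORDER BY card_set",
        (f"%{rarity}%",),
    )
    return [row[0] for row in rows]


uncommon_sets = sets_with_rarity(conn, "Uncommon")
precon_sets = sets_with_rarity(conn, "Preconstructed")
```

A substring match like this works only because the rarities are spelled out in full; short abbreviations would be more likely to collide with each other.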

> Another question is should all of the fields be spelled out (like I did
> above), or should we come up with a set of abbreviations, similar to
> those used in the text files?

I'd vote against abbrev., because 
1) right now, they're already a pain to learn for newbies
2) we might some day have so many of them that they'll become unmanageable 
(from an SQL substring search point of view too).

Now, for the parsing machine:

We have a strong constraint: populating the database assigns indexes to card 
names etc. These indexes will be used to build decks, export inventory, etc.

Rebuilding the database later on, after the release of a new set, for example, 
should keep the _same_ indexes for older cards, and assign new ones to new 
cards, thus ensuring compatibility.

So, here's the algorithm I imagined; let me know what you think about it:

1. Have a "sets.history" file, indicating the order of publication of the 
sets. It will look like:
set: Jyhad J
set: VTES VTES
promo: Marianna Gilbert
promo: Dan Murdock
set: Dark Sovereign DS
...
Or something equivalent. 
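A reader for such a file might look like the sketch below. It assumes that on a "set:" line the last whitespace-separated token is the abbreviation and everything before it is the set name, and that a "promo:" line carries only a card name. HistoryEntry and parse_history are hypothetical names chosen for the example.

```python
from typing import List, NamedTuple, Optional


class HistoryEntry(NamedTuple):
    kind: str               # "set" or "promo"
    name: str               # set name, or promo card name
    abbrev: Optional[str]   # set abbreviation; None for promos


def parse_history(text: str) -> List[HistoryEntry]:
    """Parse sets.history content, keeping the entries in file order."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        kind, sep, rest = line.partition(":")
        kind, rest = kind.strip(), rest.strip()
        if not sep or kind not in ("set", "promo") or not rest:
            raise ValueError(f"unrecognised line: {raw!r}")
        if kind == "set":
            # Assumption: the abbreviation is the last token on the line.
            parts = rest.rsplit(None, 1)
            if len(parts) != 2:
                raise ValueError(f"set line needs an abbreviation: {raw!r}")
            entries.append(HistoryEntry("set", parts[0], parts[1]))
        else:
            entries.append(HistoryEntry("promo", rest, None))
    return entries


entries = parse_history(
    "set: Jyhad J\n"
    "set: VTES VTES\n"
    "promo: Marianna Gilbert\n"
    "set: Dark Sovereign DS\n"
)
```

Keeping the entries in a list rather than a dictionary preserves the publication order, which is what the rest of the scheme relies on.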

2. For each line in "sets.history", parse the card list and the rarity list, 
and build two CSV files, like Jyhad_crypt.csv and Jyhad_library.csv. In these 
files, cards are ordered by name, A->Z.
Unless it's a promo entry; in that case we don't generate files, we just store 
its data.
Feed the CSV files into the database, or insert the promo card data.

3. Loop for the next set/promo.
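As a rough sketch of the CSV generation in step 2, the function below writes one set's cards sorted by name, with multiple rarities joined by " | " as discussed earlier. The column layout (name, set, rarity), the function name write_set_csv and the "Placeholder Card" entry are assumptions for illustration; the real files would carry more fields and be split into crypt and library.

```python
import csv
import io


def write_set_csv(set_name, cards, out):
    """Write one row per card, ordered by name A->Z.

    `cards` maps a card name to the list of its rarities in this set.
    """
    writer = csv.writer(out, lineterminator="\n")
    for name in sorted(cards):
        writer.writerow([name, set_name, " | ".join(cards[name])])


# "Placeholder Card" is made-up data used only to show the ordering.
camarilla = {
    "Placeholder Card": ["Common"],
    "Academic Hunting Ground": ["Preconstructed Tremere", "Uncommon"],
}

buf = io.StringIO()
write_set_csv("Camarilla", camarilla, buf)
csv_text = buf.getvalue()
```

Writing to a file-like object keeps the sketch testable; passing an open file for something like Camarilla_library.csv would work the same way.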

This way, as long as entries in "sets.history" stay in the same order, and as 
long as cards don't get drastically renamed, we should end up with a 
compatible database.
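The compatibility argument can be checked with a small sketch. It assumes indexes are handed out sequentially, in publication order and then by name within a set, and that a card already seen in an earlier set keeps its first index. The function assign_indexes and the "Placeholder" card names are hypothetical.

```python
def assign_indexes(history):
    """Assign a stable integer index to every card name.

    `history` is a list of (kind, names) pairs in publication order,
    where kind is "set" or "promo" and names is a list of card names.
    """
    index_of = {}
    next_index = 1
    for _kind, names in history:
        for name in sorted(names):      # A->Z within each entry
            if name not in index_of:    # reprints keep their old index
                index_of[name] = next_index
                next_index += 1
    return index_of


# Placeholder names stand in for real card lists.
old_history = [
    ("set", ["Placeholder B", "Academic Hunting Ground"]),
    ("promo", ["Marianna Gilbert"]),
]
new_history = old_history + [
    ("set", ["Academic Hunting Ground", "Placeholder A"]),
]

before = assign_indexes(old_history)
after = assign_indexes(new_history)
```

Because the new set is appended at the end of the history, every card in `before` gets the same index in `after`, and only the genuinely new card receives a fresh one. Inserting an entry in the middle of the history, or renaming a card, would break that property.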

Comments? Suggestions?

-- 
Francois
I WILL NOT SNAP BRAS
        Bart Simpson on chalkboard in episode 8F22




