phpgroupware-developers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Phpgroupware-developers] SQL Style Guide available


From: Kai Hofmann
Subject: [Phpgroupware-developers] SQL Style Guide available
Date: Wed, 7 May 2003 10:52:43 +0200

The SQL Style Guide is available at
http://www.probusiness.de/projekte/phpgroupware/

I also include it here for discussion - would be nice if the project would
accept it and include it into the wiki.

Greetings

    Kai




SQL Style Guide
Kai Hofmann
07.05.2003


Table of Content

 1. Style guide background
 2. Naming tables and columns
 3. Using datatypes
 4. Defining primary keys
 5. Relations and foreign keys
 6. Defining indexes
 7. 3rd normal form (3NF)
 8. Database access from the application
 9. How to write select statements with multiple tables
10. Connections to other sources like files, ldap and imap
11. References


1. Style guide background

This style guide describes how you should name and design the database for
you 
application module.
The base idea behind all this is to improve the usability, performance,
maintainance 
and understandability of you database part and application.
This guide tries to be database independent. So using it should make your
scheme more
portable to different dbms.


2. Naming tables and columns

Tables and columns should always be named in

- lower case
- english language

and a name should not use more than 64 characters.

Each table name should describe the tables content in one or two words - 
use the plural form.

Examples:

events
accounts
bookmarks
servers
timetrack_jobs
log_events

Each column of a table must start with a shortened version of the table name
with 
a maximum of 6 characters. For example use

msg   for messages
cal   for calendars
adr   for addresses
conf  for configurations
cat   for categories
log   for log entries
proj  for projects
cont  for contacts
inv   for invoices/inventories

Please be careful to use each label only in one table.
So if you have an invoices table as well as an inventory table use
inv for the invoices and invt for the inventory table.

After the label you should use "_" and then the singular form of a subject
that 
best describes the column:

cal_date   : table calendar colum with a date
cal_name   : table calendar column with calendar event name
cal_description : table calendar column with event description

for long fieldnames like description you can use short ones like "descr".

Try using the same vocabulary in all tables.
This nameing scheme will also shorten your select statemtens when using
multiple 
tables, because the column names are more
unique and so you only need to use the table name for the primary/foreign
key 
where condition! 


3. Using datatypes

Datatypes for the columns should be used wisely and with care.
Try to use the datatype that fits best and avoid the usage of blob
datatypes, 
because they are NOT portable
between different dbms. A varchar has a maximum length of 255 characters.
When using 
DATETIME (prefered over DATE and TIME because DATETIME is supported by more
dbms) you 
should read/write to this datatype in ISO8601 format [4]
"yyyy-mm-ddThh:mm:ss".

languages
---------
lang_code  CHAR(2)
lang_name  VARCHAR(32)

Foreign keys for relations must always have the same datatype as the primary
of the 
referenced table.


4. Defining primary keys

For primary keys always use an integer datatype (for performance reasons).
Only use another type when this is a real good alternative like for a
languages table 
when using the iso 2 character code as primary key (like in chapter 3).
Primary keys should always be named as "id" if possible:

cal_id
event_id
msg_id

except for good reasons like in the language example from chapter 3.


5. Relations and foreign keys

Relations between tables should be used whereever possible.
A database is optimized for handling complex data structures - so let the
database 
do its job instead of rewriting the whole thing within your application.

Foreign keys should be "copied" from the parent table with the same name and
the same datatype.
Using different datatypes for primary and foreign key will not work on all
database systems 
and also it will cost a lot of performance.
So when referencing the languages table from chapter 3 your messages table
should look as follows:

messages
--------
msg_id      INT
msg_subject VARCHAR(80)
msg_body    VARCHAR(255)
lang_code   CHAR(2)

this allows an easier identification of the referenced tables.

A special trick that is very useful sometime are self referencing tables
(for example nested categories).
In such a case you must rename the foreign key because it can not have the
same name as the primary key.

categories
----------
cat_id     INT
cat_name   VARCHAR(32)
cat_parent INT         (reference to cat_id)


6. Defining indexes

Indexes will speed up the processing of select statements when they are set
correctly.
In database systems without real "foreign keys/relations" (like MySQL 3.x)
you should create 
an index for each key that references another tables.
Otherwise MySQL will do a cost intensive full table scan each time.
You also should create indexes for often used select statements.


7. 3rd normal form (3NF)

For a good database design you should always use the 3rd normal form. This
will avoid double data 
within your database and has
many more advantages. For more about database design and the 3rd normal form
please read [2], [3].

Interesting online resources can be found under [5].


8. Database access from the application

Within your application you should NOT try to do the databases job - for
example 
don't associate the data from two tables with each other within your
application - 
the database is always faster and better in doing this job.

Try to move your database access code into extra classes or as a minimum
move it 
to an extra method.
This will allow better maintanance later in the lifecycle of your software.

Write a SQL Statement into one line and don't split it into several lines. 
This will allow searching via grep.
Use dynamic parts (variables) only within the WHERE clause.

Write the SQL-Keywords in uppercase - all other things in lowercase.


9. How to write select statements with multiple tables

There are several ways how to join tables via a select statement. 
Use the following form:

SELECT * FROM events,languages,calendar WHERE event.lang_code =
languages.lang_code AND event.cal_id = calendar.cal_id
SELECT * FROM events LEFT JOIN languages LEFT JOIN calendar WHERE
event.lang_code = languages.lang_code AND event.cal_id = calendar.cal_id

That's because this code is more portable between database systems and 
the dbms will convert other forms into this one.
Also this form is shorter, more readable, performs better (because the dbms 
must not transform it) and is less error prone.

Please do NOT use the following version, because it is not so readable (only
when using brackets) 
and more error prone, because multiple joins are only readable when
splitting them into several lines. 
Expanding an SQL-statement is not as simple as the above form.

SELECT * FROM events JOIN languages ON (event.lang_code =
languages.lang_code) JOIN calendar ON event.cal_id = calendar.cal_id


10. Connections to other sources like files, ldap and imap

Instead of using blobs you should save large data as files and reference
these files 
by filename and/or relative path from a varchar column.
Referencing emails via IMAP etc. would work via the unique message id.
For LDAP you can use the DN (distinguished name) as long as it is not larger
than 
255 characters (varchar limit).
Otherwise it might be better to work with UIDs or add your own unique
identifier 
to each object for referencing it from the database.


11. References

[1] A Guide to the SQL Standard. 
    A user's guide to the standard database language SQL.
    C. J. Date, Hugh Darwen

[2] An Introduction to Database Systems
    C. J. Date

[3] Fundamentals of Database Systems.
    Ramez Elmasri, Shamkant B. Navathe

[4] http://www.w3.org/TR/NOTE-datetime
 
http://www.mcs.vuw.ac.nz/technical/software/SGML/doc/iso8601/ISO8601.html
    http://www.cl.cam.ac.uk/~mgk25/iso-time.html
    
[5] http://www.devshed.com/Server_Side/MySQL/Normal/Normal1/page4.html
    http://support.microsoft.com/?scid=/support/kb/articles/q209/5/34.asp
    http://home.earthlink.net/~billkent/Doc/simple5.htm
    http://www.databasejournal.com/sqletc/article.php/26861_1428511_4
    http://www.sqlmag.com/Articles/Index.cfm?ArticleID=4887&pg=1
    http://databases.about.com/library/weekly/aa080501a.htm
    http://databases.about.com/library/weekly/aa081901a.htm
    http://databases.about.com/library/weekly/aa090201a.htm
    http://databases.about.com/library/weekly/aa091601a.htm

-- 
*****    Open Source und Linux im professionellen Einsatz    *****
**  komplexe Mailserver, Groupware, Office: sprechen Sie uns an **
Dipl.-Inform. Kai Hofmann                    Team Softwarelösungen
pro|business AG, EXPO Plaza 1 (Deutscher Pavillon), 30539 Hannover
E-Mail: address@hidden,   Tel.: 0511/60066-332, Fax: -355
WWW: http://www.probusiness.de/
 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]