
Re: [Maposmatic-dev] Features from the Code Sprint?


From: Maxime Petazzoni
Subject: Re: [Maposmatic-dev] Features from the Code Sprint?
Date: Wed, 15 Sep 2010 21:35:03 +0200
User-agent: Mutt/1.5.20 (2009-06-14)

Hi,

* Jeroen van Rijn <address@hidden> [2010-09-15 18:19:38]:

> 1. pgsql stores the database in e.g. /var/pgsql/ocitysmap
> 2. at 00:00, finish any current job if running, then shut down postgres
> 3. mv /var/pgsql/ocitysmap /var/pgsql/ocityold
> 4. mv /var/pgsql/ocitynew /var/pgsql/ocitysmap
> 5. restart pgsql and start processing jobs again
> 6. meanwhile at low io prio: rsync address@hidden:/somewhere /var/pgsql/ocityold
> 7. mv /var/pgsql/ocityold /var/pgsql/ocitynew
> 
> The other machine 'somehost' can keep the database up to date, just making
> sure it's in a known good state by the time the public server comes for its
> rsync.
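
[Steps 2-5 above can be sketched as a small shell script. The directory
layout and the `somehost:/somewhere` source are the ones from the steps;
the PostgreSQL stop/start commands and the cron wiring are assumptions
for illustration, not a tested setup:]

```shell
#!/bin/sh
# Hypothetical sketch of steps 2-5: rotate the ocitysmap database
# directory under /var/pgsql. Only the directory swap itself is a
# function; stopping/starting PostgreSQL and the low-priority rsync
# (steps 6-7) are shown as comments, since the exact service commands
# depend on the distribution.

# Swap the live database directory for the freshly synced one:
#   ocitysmap -> ocityold, ocitynew -> ocitysmap
swap_dbs() {
    base="$1"                        # e.g. /var/pgsql
    mv "$base/ocitysmap" "$base/ocityold"
    mv "$base/ocitynew"  "$base/ocitysmap"
}

# At 00:00 (cron), once the last render job has finished:
#   service postgresql stop          # step 2 (command is an assumption)
#   swap_dbs /var/pgsql              # steps 3-4
#   service postgresql start         # step 5
#   ionice -c3 rsync -a somehost:/somewhere/ /var/pgsql/ocityold/  # step 6
#   mv /var/pgsql/ocityold /var/pgsql/ocitynew                     # step 7
```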
> 
> That or you could have this other server be the initiator of the rsync
> update, putting a semaphore file on the maposmatic server when it's done;
> this public server would then upon seeing this file finish any running job,
> move those directories around, and restart postgres.
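
[On the MapOsMatic side, the semaphore variant could look roughly like
this; the flag-file name and location are made up for illustration:]

```shell
#!/bin/sh
# Hypothetical sketch of the semaphore-file variant: 'somehost' runs the
# rsync itself and drops a flag file when the copy is complete; the
# public server polls for that file before finishing jobs and swapping.

# Flag-file path is an assumption, overridable for testing.
SEMAPHORE=${SEMAPHORE:-/var/pgsql/ocitynew/.sync-complete}

# True once the remote side has finished its rsync and dropped the flag.
sync_is_complete() {
    [ -f "$SEMAPHORE" ]
}

# Poll loop (would run from cron or a daemon):
#   if sync_is_complete; then
#       rm -f "$SEMAPHORE"     # consume the flag
#       # ...finish running jobs, swap directories, restart postgres...
#   fi
```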

Any solution based on using two databases has several problems:

  - the planet OSM database already takes *a lot* of space, and is
    growing very fast. It will soon be hard to afford having two of them
    at any point in time;
  - swapping databases means a very low update rate, every couple of
    days at best, instead of our current daily updates and our plan to
    move to hourly updates with Osmosis.
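
[For reference, the hourly-update path with Osmosis would roughly be the
following two-step cycle. The flags are the standard osmosis/osm2pgsql
ones, but the working directory and database name are placeholders, and
the invocation is a sketch, not our actual configuration:]

```shell
#!/bin/sh
# Rough sketch of an hourly Osmosis + osm2pgsql update cycle. The
# working directory and database name ("gis") are placeholders; this
# illustrates the flow, not the real MapOsMatic setup.

run() {
    # With DRY_RUN=1, print commands instead of executing them, so the
    # flow can be inspected without osmosis/osm2pgsql installed.
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$@"; else "$@"; fi
}

apply_hourly_diff() {
    workdir="$1"   # osmosis replication working directory (placeholder)

    # 1. Fetch the latest replication diff into a .osc change file.
    run osmosis --read-replication-interval workingDirectory="$workdir" \
                --write-xml-change file="$workdir/changes.osc.gz"

    # 2. Apply it to the rendering database in append (slim) mode.
    run osm2pgsql --append --slim -d gis "$workdir/changes.osc.gz"
}
```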

> Another thought is to ask the authors of the update tool if it can delay the
> actual updating to near the end of the process and write all the updates at
> once in a single transaction. The database would be doing mostly reads until
> then and keep more rows around in cache that any rendering job at the time
> might make use of as well.

This is already what osm2pgsql does, IIRC, but committing the
transaction is very I/O-intensive and far from immediate.

> Also... have you thought to ask the guys at geofabrik how they keep things
> up to date? Is it solely throwing hardware at the problem, or did they do
> something clever with configuring both postgres and the update jobs too.

The guys at Geofabrik threw *a lot* of hardware at the problem.
Fredrik's numbers came from a quad-Xeon machine with > 40 GB of RAM and
SSD hard drives (with that setup it could do a full planet import in
less than 5 hours...).


So it is mainly a hardware problem: we need a server with fast disks
and a great deal of RAM to handle the planet import and the
hourly/daily updates, as well as enough disk space to accommodate the
database and its constant growth.

- Maxime
-- 
Maxime Petazzoni <http://www.bulix.org>
 ``One by one, the penguins took away my sanity.''
Linux kernel and software developer at MontaVista Software


