Re: large data sets


From: Gregory Bourassa
Subject: Re: large data sets
Date: Mon, 16 Jun 2008 14:04:37 -0400

Martin,

I was hoping to see someone more familiar with the GNU Prolog internals step in
to respond.

A couple of more general observations:

1) Memory is pretty cheap nowadays. If this is mission-critical, consider
adding memory.

2) The swap-storm issue depends more on how many applications you are running
and on the architecture of your OS. I would not really expect a "storm" --
more likely some measurable slowdown from swapping if your facts/rules exceed
physical memory AND your application jumps somewhat randomly from one area of
the facts to another. However, Unix-style operating systems should be able to
handle the virtual-memory operations in appropriately-sized pieces -- plus
their file handling is really fast. Even the more recent incarnations of
Windows should be OK for a couple of hours/days before the memory usage sticks
too high. (Pain from old war wounds is showing here a bit; sorry -- maybe they
have solved all that in the latest Windows. I still observed memory issues in
Windows Server 2003 for all languages, not particularly Prolog.)

3) Remember that Prolog's default way of matching facts in the database is
linear. If your traversal of the large data set is mostly linear anyway, this
is a non-issue. If it needs lots of random access, then you might want to
consider using more than one flavour of "fact" -- multiple different predicate
names -- so that each search runs over a smaller list rather than one giant
linear one. Consider meta-programming predicates to generate these predicate
names, as in the sketch below.
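
For instance, something along these lines would do the partitioning. This is
a minimal sketch in plain ISO Prolog, assuming key/value facts with atom keys;
the fact_N names, the ten-bucket split, and the cheap code-sum hash are all
just illustrations:

    % Instead of one giant fact/2 relation, store each fact under one
    % of ten predicates fact_0/2 .. fact_9/2, picked by hashing the key.
    key_bucket(Key, Bucket) :-
        atom_codes(Key, Codes),
        sum_codes(Codes, 0, Sum),
        Bucket is Sum mod 10.          % bucket count chosen arbitrarily

    sum_codes([], Sum, Sum).
    sum_codes([C|Cs], Acc, Sum) :-
        Acc1 is Acc + C,
        sum_codes(Cs, Acc1, Sum).

    bucket_pred(Key, Name) :-          % Key must be bound to pick a bucket
        key_bucket(Key, Bucket),
        number_codes(Bucket, BCodes),
        atom_codes(BAtom, BCodes),
        atom_concat(fact_, BAtom, Name).

    store_fact(Key, Value) :-
        bucket_pred(Key, Name),
        Fact =.. [Name, Key, Value],   % builds e.g. fact_3(Key, Value)
        assertz(Fact).

    lookup_fact(Key, Value) :-
        bucket_pred(Key, Name),
        Goal =.. [Name, Key, Value],
        call(Goal).

Each lookup then scans only one bucket's clauses instead of the whole
database.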

I have done fact-indexing systems in Prolog before. They are not part of the
base language, but they can be built. Of course, indices are a two-edged tool
in any case -- you have to weigh the cost of insertion and index maintenance.
If you are asserting and retracting a lot, they sometimes do not pay for
themselves. If you are traversing and analysing more than asserting, they can
be a real help.
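
To make that trade-off concrete, here is a minimal sketch of a hand-rolled
secondary index, assuming facts like person(Id, Name, City) that are queried
mostly by City (all names here are hypothetical). The wrapper predicates are
exactly the maintenance cost I mean:

    :- dynamic(person/3).
    :- dynamic(city_idx/2).            % the index: City -> Id

    add_person(Id, Name, City) :-
        assertz(person(Id, Name, City)),
        assertz(city_idx(City, Id)).

    del_person(Id) :-
        retract(person(Id, _, City)),
        retract(city_idx(City, Id)).

    % Query through the index: enumerate matching Ids from the small
    % index relation first, then fetch each full fact.
    person_in_city(City, Id, Name) :-
        city_idx(City, Id),
        person(Id, Name, City).

Every assert or retract now touches two predicates, which is why the index
only pays for itself when reads dominate writes.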

My own bias, as a multi-lingual developer, would be to give Prolog a try. You
can generate a giant data set pretty quickly to smoke-test your ideas, and you
will probably save enough on ease of development to justify buying some memory
if and when it becomes necessary.
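
For example, a fail-driven loop like this will synthesise test facts quickly
(datum/2 and gen_facts/1 are made-up names; adjust the term shape to
approximate your real records):

    :- dynamic(datum/2).

    gen_facts(N) :-
        between(1, N, I),
        number_codes(I, Codes),
        atom_codes(Suffix, Codes),
        atom_concat(key_, Suffix, Key),
        assertz(datum(Key, I)),
        fail.                          % backtrack to the next I
    gen_facts(_).

Then try ?- gen_facts(1000000). and watch the process size. Under GNU Prolog
you may also need to enlarge the runtime stacks (e.g. via the GLOBALSZ
environment variable) for experiments this big.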

I hope this helps.

Gregory Bourassa   

On Sat 14/06/08 1:52 AM, "Martin McDermott" address@hidden sent:
> Hello
> 
> Kind of an odd question. I'm interested in using Prolog to work on
> large data sets (a couple of gigs). I'm wondering if this is a good
> idea or not, and if there's anything in particular I should look
> into. My major concern is that while my data set will never grow too
> large, I won't have enough RAM to load the entire file. I'm not sure
> how this will affect fact checking (possible swap storm?).
> 
> Thanks
> 
> _______________________________________________
> Users-prolog mailing list
> address@hidden
> http://lists.gnu.org/mailman/listinfo/users-prolog