[Therion] Dataset structure including survex files
From: Wookey
Subject: [Therion] Dataset structure including survex files
Date: Tue, 27 Jun 2006 02:59:51 +0100
User-agent: Mutt/1.5.11+cvs20060403
I have been trying to rationalise the Mulu dataset I first created in 2003,
to be able to process individual caves as well as the whole dataset.
I have also been trying to make the 2003 data compatible with the 2005 data
which Andrew Atkinson created.
Andrew has greatly improved the layout of the dataset over my initial efforts,
and has a hierarchical setup that works. However his dataset has used some
slightly different conventions to mine. By using his ideas I too have a
hierarchical dataset that works (in part; see below), but merging data from
the two schemes doesn't work properly.
I have been trying to understand exactly what is going on, and with a lot of
testing I have got some very peculiar results which make it clear to me that
I do not understand what Therion is doing, and why some things are
happening, nor what the correct solution to the problem is.
I'm afraid this is a very long post but I want to explain what I have done
so far and why, in what I hope is a clear way; not least because we have had
parts of this conversation before on the list and there has been some
confusion.
The dataset is available from svn://wookware.org/mulusurvey
A tarball with the scans removed to keep it small is here:
http://wookware.org/files/benarat.tgz (4MB)
I am using therion 0.3.10
Before I get into too much detail let me explain what we want to achieve:
* centreline data stored in survex form. Processed to top-level .3d file
* drawings stored in therion format
* ability to process the whole dataset to produce maps of everything
* ability to process individual caves to produce maps of them
* ideally mixing scraps with therion-style and survex-style station naming
syntaxes
Andrew's 2005 Layout looks like this:
-------
(example taken from api/Whiterock/)
There is one thconfig file for each cave:
top-level thconfig has:
source Whiterock_only.th
Whiterock_only has:
survey whiterock
import Whiterock.3d -surveys use -filter whiterock.the_ashes_series
input Whiterock.th
endsurvey
Whiterock.svx has:
*begin Whiterock
*include Api_Birthday.svx
.....
equates
*end Whiterock
Whiterock.th has:
input Api_Birthday.th
input Api_Chamber.th
....
<joins between caves>
...
Api_Birthday.th has:
input Api_Birthday.th2
map Api_BirthdayScPl1
map Api_BirthdayScPl2
joins if needed
stations notation is:
station -name api_birthday.16
thconfig in Api_Birthday dir has:
source Api_Birthday_only.th
exports ...
layout benarat
local changes to layout
endlayout
Api_Birthday_only.th has:
survey Api_Birthday
import Api_Birthday.3d -surveys use
input Api_Birthday.th
endsurvey
Api_Birthday.svx has:
*begin Api_Birthday
<data>
*end Api_Birthday
---------
This scheme works nicely, but has the disadvantage that you have to generate a
3d file for each processable subdirectory - you cannot use the top-level 3d
file that contains all the info.
The layout I have used is slightly different - the main differences being the
station syntax and the import syntax.
Wook scheme:
--------
top-level thconfig has:
source benarat_only.th
<layouts>
benarat_only.th has:
survey benarat
import benarat.3d -surveys use
input benarat.th
endsurvey benarat
benarat.th has:
<maps>
input terikan/terikan.th
input davids/davids.th
input menagerie/menagerie.th
...
<joins between caves>
davids.th:
input davids.th2
map
<davids scraps>
endmap
thconfig in davids dir:
survey davids
import ../benarat.3d -surveys use
input davids.th
endsurvey
stations (in davids.th2) are:
station -name davids.a15
thconfig in terikan subdir has:
survey terikan
import ../benarat.3d
input terikan.th
endsurvey
stations are:
station -name address@hidden
--------
This scheme also works (or at least it did, but I now seem to have broken it
and have become hopelessly confused about how things work). It has the
advantage of allowing you to use one consistent top-level .3d data file, and
it seems to be possible to mix the different station notations that the 2003
(wook) and 2005 (andy) datasets use.
The thing I do not understand is how the survey hierarchy in 'import'ed .3d
files relates to the survey hierarchy in the therion files, and the
significance of the address@hidden syntax versus the survey.station syntax.
I started by reading the thbook entry:
import
------
Description: Reads survey data in different formats (currently processed
centreline in *.3d, *.plt, *.xyz formats). Survey stations may be referenced in
scraps etc. When importing Survex' 3D file, stations are inserted in survey
hierarchy, if there exists identical hierarchy to that in 3D file.
Syntax: import <file-name> [OPTIONS]
Context: survey / all (only with .3d files where survey structure is specified)
Options:
* filter <prefix> -> if specified, only stations with given prefix and shots
between them will be imported. Prefix will be removed from station names.
* surveys (create)/use/ignore -> specifies how to import survey structure
(works only with .3d files).
create -> split stations into subsurveys, if subsurveys do not exist, create
them
use -> split stations into existing subsurveys
ignore -> do not split stations into sub-surveys
This didn't really enlighten me enough to understand why some of my data
works and some doesn't.
What does 'split stations into sub-surveys' actually mean (surveys in the
.th files or in the .3d file, or both?). Examples are needed here to make
this clear. Does the top level of the .3d file have to match the current
survey in the therion hierarchy or the next level down? How can you tell
whether the stations have 'fitted' into existing subsurveys or not?
And what happens if you don't specify any of create/use/ignore. Do you get a
fourth behaviour, or one of the above? This needs specifying.
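For what it is worth, here is a minimal Python sketch of one literal reading of
the -filter wording quoted above: keep only stations under the prefix, keep
only shots with both ends under the prefix, and strip the prefix from what
remains. It is a model of the documentation text only, not of what Therion
actually does. The function name apply_filter, the station names and the
assumption that the prefix matches whole dot-separated components are all
invented for illustration.

```python
def apply_filter(stations, shots, prefix):
    """Model the thbook wording for 'import -filter <prefix>'.

    stations: full dotted names as they might appear in a .3d file
    shots:    (from_name, to_name) pairs using the same names
    Returns (kept_stations, kept_shots) with the prefix removed.
    Assumption: the prefix matches whole dot-separated components.
    """
    lead = prefix.rstrip(".") + "."

    def strip(name):
        return name[len(lead):]

    kept_stations = [strip(s) for s in stations if s.startswith(lead)]
    kept_shots = [
        (strip(a), strip(b))
        for a, b in shots
        if a.startswith(lead) and b.startswith(lead)
    ]
    return kept_stations, kept_shots


# Hypothetical names, not taken from the real dataset.
stations = ["terikan.entrance.1", "terikan.entrance.2", "davids.a15"]
shots = [
    ("terikan.entrance.1", "terikan.entrance.2"),
    ("terikan.entrance.2", "davids.a15"),  # crosses the prefix boundary
]
result = apply_filter(stations, shots, "terikan")
```

Under this reading the shot that crosses out of the prefix is dropped along
with the station on the far side, which is at least consistent with the
"unable to import N shots outside survey" style of warning, but whether
Therion matches prefixes this way is exactly the open question.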
In an attempt to work it out for myself I tried some things and found this:
if we have davids.a15 style names then we need -surveys use on import to work
if we have address@hidden style names then it works with or without -surveys
use
so we can mix these two styles with -surveys use. (confirmed)
Why is this? I don't understand what's going on.
I also noticed (see davids.th2) that you can do "-station-names davids." in
the scrap line, then station -name a15 to get davids.a15 style names. Is
there equivalent syntax for address@hidden style?
The big problem here is that I don't want to have to rename hundreds of
stations in existing datasets from therion-style <station>@<survey> syntax
to survex-style survey.station syntax unless I really have to. And not
mentioning the survey-name in the .th2 scrap definitions is not always an
option, because scraps often cross survey joins.
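If renaming did turn out to be unavoidable, the name conversion itself is
mechanical. A minimal Python sketch, assuming only a single survey level (as
in davids.a15) and no dots inside the station or survey name; the function
names are invented here, and nested surveys are deliberately rejected rather
than guessed at:

```python
def survex_to_therion(name):
    """'davids.a15' -> 'a15@davids' (single survey level only)."""
    survey, sep, station = name.partition(".")
    if not sep or not survey or not station or "." in station:
        raise ValueError(f"expected <survey>.<station>, got {name!r}")
    return f"{station}@{survey}"


def therion_to_survex(name):
    """'a15@davids' -> 'davids.a15' (single survey level only)."""
    station, sep, survey = name.partition("@")
    if not sep or not station or not survey:
        raise ValueError(f"expected <station>@<survey>, got {name!r}")
    if "." in station or "." in survey:
        raise ValueError(f"nested or dotted names not handled: {name!r}")
    return f"{survey}.{station}"


converted = survex_to_therion("davids.a15")
round_trip = therion_to_survex(converted)
```

Applying this to the station lines in a .th2 file would still need care about
which lines to touch, so it is only the naming rule that is sketched here.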
It seems to be possible to mix syntaxes, but it is tricky to get right, and
when it doesn't work it seems to be difficult to work out what you should do
to fix it.
I have caves with various layouts. Some work and some don't:
Bluemoon, Moon_cave, davids and terikan
* terikan is quite big with a load of subsurveys. Stations are therion-style
* bluemoon is all one survey 'fake.bluemoon'. stations are just "-name 10"
- no survey specified.
* Moon_Cave is a 2005 survey, so it has survex-style names, except
mainline_side, which I have changed to therion-style station names
* davids is a simple one-survey cave, with survex-style names, but using
"-station-names davids." in the scrap
Trying to process these individually I have found:
* terikan:
1) with import ../benarat.3d, and import command outside survey terikan
works, with these warnings:
therion: warning -- unable to import 25 stations outside survey
therion: warning -- unable to import 43 shots outside survey
2) with import ../benarat.3d, and import command outside survey terikan _and_
inside survey terikan
works, no warnings. (this is a very odd result!). Having two instances of
survey terikan is obviously 'wrong'
but it works - in fact it works better than above.
3) with import ../benarat -survey use
fails:
therion: warning -- unable to import 3895 stations outside survey
therion: error -- faketerikan.th2 [95] -- survey does not exist -- fake --
invalid station reference -- address@hidden
4) with import ../benarat -survey create
fails:
therion: error -- faketerikan.th2 [95] -- survey does not exist -- fake --
invalid station reference -- address@hidden
5) with import ../benarat -filter benarat, and import command outside survey
terikan
works, with these warnings:
therion: warning -- unable to import 25 stations outside survey
therion: warning -- unable to import 43 shots outside survey
6) with import ../benarat -filter terikan, and import command outside survey
terikan
fails:
therion: warning -- unable to import 25 stations outside survey
therion: warning -- unable to import 71 shots outside survey
therion: error -- faketerikan.th2 [95] -- survey does not exist -- fake --
invalid station reference -- address@hidden
7) with import ../benarat -filter terikan, and import command inside survey
terikan
works, no warnings.
From this lot I infer that 7) is the 'right' way to do it, but why does having
two levels of survey terikan (2) seem to work? And why doesn't 5) work - it
seems to me that it should?
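Comparing runs like 1) to 7) by eye is error-prone, so the messages can be
tallied mechanically. A minimal Python sketch, assuming the output has been
captured as text, that each message sits on one line (the wrapped lines above
would need rejoining first), and that every message has the
"therion: warning -- ..." or "therion: error -- ..." shape quoted in this
mail. The function name summarise is invented for illustration.

```python
import re
from collections import Counter

# Matches the message shape quoted in this mail; other shapes are ignored.
LINE = re.compile(r"^therion: (warning|error) -- (.*)$")


def summarise(output):
    """Return (counts, messages) for therion warning/error lines."""
    counts = Counter()
    messages = []
    for line in output.splitlines():
        match = LINE.match(line.strip())
        if match:
            level, text = match.groups()
            counts[level] += 1
            messages.append((level, text))
    return counts, messages


# The two warnings reported for case 1) above.
sample = """\
therion: warning -- unable to import 25 stations outside survey
therion: warning -- unable to import 43 shots outside survey
"""
counts, messages = summarise(sample)
```

Running each variant of the import line and keeping the (counts, messages)
pair per variant gives a small table that is easier to compare than raw
terminal output.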
And I have previously determined above that we need -surveys use to allow
mixing of station-styles. But I can't get it to work with -surveys use. I tried
8) import ../benarat -surveys use -filter terikan, and import command inside
survey terikan
fails:
therion: error -- faketerikan.th2 [95] -- survey does not exist -- fake --
invalid station reference -- address@hidden
But I thought it would work. Why doesn't it?
* bluemoon
I can only get this to work with import ../benarat.3d -surveys use -filter
bluemoon, or import ../benarat.3d -filter bluemoon.
Everything else fails. I don't understand why I get different results from
those with terikan - it may be something to do with the fact that the scraps
have no survey names specified?
* davids
Works fine with -surveys use, doesn't work without it.
* Moon_Cave
I have never got this to work - whatever I do I always get:
therion: error -- Benerat_Mainline/Benerat_Mainline.th2 [396] -- survey does
not exist -- mainline_side --
invalid station reference -- address@hidden
or
therion: error -- Darkside/Darkside.th2 [158] -- invalid station reference --
darkside.13
* Overall benarat survey
The other aspect of this is processing the whole benarat dataset. I did have
this working with both terikan and davids (mixed station name syntaxes, which
needed -surveys use), but I can't get it to do it now (after a huge amount of
chopping and changing things to write this mail). <aaargghh>
But I have never managed to get Moon_Cave to be included as well.
Now I can get terikan and bluemoon (and menagerie) to work together - i.e.
caves that have therion-style names, _or_ I can get davids to process (needs
-surveys use), but not both together.
I tried looking at therion.logs to try and better understand what is going on.
The logs look rather odd.
You can have survey terikan in terikan_only, _and_ survey terikan in terikan.th
that it includes. It still processes fine! therion.log shows an extra .terikan
on top of everything:
address@hidden
address@hidden
address@hidden
If you remove the survey terikan in the .th files then it fails to process!
(either from the terikan dir (using terikan_only.th), or from the benarat dir
above it, where benarat.th has survey terikan/endsurvey round the input
terikan/terikan.th line). This makes no sense!
The error is
processing references ...
therion: error -- faketerikan.th2 [95] -- survey does not exist -- fake --
invalid station reference -- address@hidden
where address@hidden is the first station referenced in terikan.th2 - via
faketerikan.th2
therion.log still has
address@hidden
but no scrap references as it barfs before then
The log files seem to show the stations that were not used, which can be a
clue, but we can't see the stations that _were_ used, which might actually
be more useful.
Given how hard it has been for me to make sense of this, having spent some
10 hours or so getting to my current state of confusion, I think we need a
better way of debugging these sorts of errors, as well as a very clear
description of how it is supposed to work.
Apologies for the length of this - but I hope you can understand both what
we are trying to do, and the problems we are having in achieving it.
Wookey
--
Aleph One Ltd, Bottisham, CAMBRIDGE, CB5 9BA, UK Tel +44 (0) 1223 811679
work: http://www.aleph1.co.uk/ play: http://wookware.org/