Re: [Texmacs-dev] More comments on David's document

From: Joris van der Hoeven
Subject: Re: [Texmacs-dev] More comments on David's document
Date: Thu, 11 Apr 2002 17:49:25 +0200 (MET DST)

> I think those methods must be part of stree and not of editor for the 
> following reasons:
>   1. A classical object oriented design principle: classes implement 
> concepts.  The stree class implements the parsed representation of the 
> document main file, so it should define all the low-level methods which 
> operate on the file structure.
>   2. When you have a class with two very different usage patterns, one of 
> these requiring a lot of additional code, a subclass must be defined to 
> encapsulate additional behaviour.
>   3. There is a fundamental feedback loop in the system:
>      stree->rewriter->typesetter->editor->stree
>      In that loop, rewriters are perfectly decoupled from "stree" and 
> "typesetter" by the use of a simple Facade-Observer design pattern. The 
> typesetter/editor interface is out of the scope of my work, but you say 
> yourself that it is clean. My point is that the last interface, editor/stree, 
> must also be kept as clean as possible.

Yes, but it already is clean, because all methods for modifications in
the source tree are done by a well determined submodule of the editor,
namely edit_modify. This module also keeps track of related issues,
such as undoing changes. If the number of related issues gets too large,
I can always create a new submodule to encapsulate this.

I really want to see source trees as (t)trees and not trees
with additional structure. In the *context* of an editor,
we have special routines for operating on them,
which are encapsulated by a dedicated module.
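
To make the split concrete, here is a minimal sketch, not the actual
TeXmacs code. The names tree, path, subtree and modify_module are
hypothetical; modify_module merely stands in for a module like
edit_modify. The tree stays plain data, and the one module that
changes it also keeps the undo log.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// A "dumb" tree: a label plus children, with no editing behaviour of its own.
struct tree {
  std::string label;
  std::vector<tree> children;
};

// A path selects a node by child indices, starting from the root.
using path = std::vector<std::size_t>;

// Follow a path down from the root and return the node it designates.
tree& subtree(tree& t, const path& p) {
  tree* cur = &t;
  for (std::size_t i : p) {
    assert(i < cur->children.size());
    cur = &cur->children[i];
  }
  return *cur;
}

// Hypothetical stand-in for a module like edit_modify: every change to the
// source tree goes through it, so undo bookkeeping lives in one place.
class modify_module {
public:
  explicit modify_module(tree& source) : source_(source) {}

  // Insert child `t` at position `pos` under the node at `parent`.
  void insert(const path& parent, std::size_t pos, tree t) {
    tree& node = subtree(source_, parent);
    assert(pos <= node.children.size());
    node.children.insert(
        node.children.begin() + static_cast<std::ptrdiff_t>(pos), std::move(t));
    undo_log_.push_back(record{parent, pos});
  }

  // Revert the most recent insertion; returns false if there is nothing to undo.
  bool undo() {
    if (undo_log_.empty()) return false;
    record r = undo_log_.back();
    undo_log_.pop_back();
    tree& node = subtree(source_, r.parent);
    node.children.erase(
        node.children.begin() + static_cast<std::ptrdiff_t>(r.pos));
    return true;
  }

private:
  struct record { path parent; std::size_t pos; };
  tree& source_;
  std::vector<record> undo_log_;
};
```

Only insertion is covered here; removal and assignment would be logged
in the same way. The tree type itself never learns about undo.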

I think this discussion is getting a bit boring and
not really relevant for what you are going to do.
Also, both approaches are more or less equivalent and
it just depends on where to put the psychological accent.
I understand your point, but I just tend to see raw data
less as objects which operate on themselves than you do.

>      One specific implementation I have in mind is the real time concurrent 
> edition of a stree by several editors. That could be implemented by replacing 
> stree by a proxy object handling a remote object protocol. In that 
> implementation, there is no concept of a "current editor" which is in charge 
> of performing concrete structure operation on the stree and sending 
> notifications to rewriters.

Well, each edit tree is also part of the tm_buffer object;
a tm_buffer is really a source tree with some additional information.
If necessary, I could implement the thing you want on that level.
It is true that I could create a new data type for the edit tree,
but if we see the edit tree more as data, then it suffices to
write the necessary routines to operate on them.
I am starting to repeat myself...

>      Anyway, rewriters are conceptually associated to a stree, not to an 
> editor, so according to the "classes implement concepts" principle, rewriters 
> must be notified by stree.

No, the rewriters are conceptually associated to ttrees,
because we want to have the possibility to compose several rewriters.
Only the ttree fed in to the first rewriter in the chain is an stree.
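
A rough illustration of why composition wants a single tree type, with
hypothetical names throughout (ttree here is a bare label-and-children
struct, and run_chain and make_renamer are invented for the example):
each rewriter consumes and produces the same type, so any number of
them can be chained, and only the first one ever sees the source tree.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a ttree: just a label and children here.
struct ttree {
  std::string label;
  std::vector<ttree> children;
};

// A rewriter maps one ttree to another, so rewriters compose freely.
using rewriter = std::function<ttree(const ttree&)>;

// Run the rewriters in order; only the first one sees the source tree.
ttree run_chain(const std::vector<rewriter>& chain, const ttree& source) {
  ttree current = source;
  for (const rewriter& r : chain) current = r(current);
  return current;
}

// Helper: copy `t`, renaming every node labelled `from` to `to`.
ttree rename_nodes(const ttree& t, const std::string& from,
                   const std::string& to) {
  ttree out{t.label == from ? to : t.label, {}};
  for (const ttree& c : t.children)
    out.children.push_back(rename_nodes(c, from, to));
  return out;
}

// Example rewriter factory built on the helper above.
rewriter make_renamer(std::string from, std::string to) {
  return [from, to](const ttree& t) { return rename_nodes(t, from, to); };
}
```

With this shape, a chain such as {make_renamer("em", "strong"),
make_renamer("strong", "bold")} feeds the output of the first rewriter
to the second, and the source tree is left untouched.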

>   4. There is no overhead. If you want a dumb data structure (with associated 
> elementary operations) you can use ttree. The methods of stree and ttree will 
> not be virtual, so there will be zero overhead induced by subclassing.

Yes! I want a dumb data structure!

>      The real-time concurrent edition system would indeed require some kind 
> of abstraction, but that can be made with no time overhead by making the 
> editor class a template class (though I think that would not give a 
> significant performance gain over the other approach, which is defining 
> virtual methods in stree and redefining them in remote_stree)

No! Give me my dumb data structure!

> > > > > Observers
> > > > > ---------
> > > >
> > > > Like in the case of tree and stree, there is no real difference between
> > > > the rewriter, tree_observer and typesetter classes: they all refer to
> > > > the same abstract class.
> > > >
> > > > In the case of rewriter, tree_observer and typesetter,
> > > > I nevertheless feel that the difference in purpose is greater,
> > > > so you may use typedefs to let the different classes become synonyms.
> > > > (I also prefer "observer" to "tree_observer", since we will never
> > > > observe other types of objects). However, I am against doing
> > > > the same thing for "stree".
> > >
> > > I really do not understand what you are thinking of when you are talking
> > > of using "typedefs to let the different classes become synonyms". Why not
> > > simply use inheritance with an abstract "observer" class which is
> > > implemented by "rewriter" and "typesetter"???
> >
> > I want typedefs for the moment, because this is faster.
> > I repeat: efficiency is a *major* design goal, whether you like that or
> > not. In any case, in the foreseeable future, these classes will be perfect
> > synonyms, which is better expressed using a typedef. If they ever turn out
> > to be different, then this can always be changed later.
> If I understand what you mean, you want to avoid the virtual method invocation 
> overhead, right?


> But you just cannot avoid it.

Not completely, but I can try not to increase it unnecessarily.

> But maybe you have some incremental implementation plan in mind that I just 
> do not know, in which case that would make a lot more sense.

Yes, as I said before, I can not see yet in what respect "typesetter" and
"rewriter" would have more structure than the abstract "observer".
So let's just use a typedef for the moment. If these classes really get
different for some reason later on, then we can always switch to
a more complex inheritance scheme without much implementation cost.
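
As a sketch of the typedef approach (the observer interface and the
names notify_insert, logging_observer and feed are assumptions made up
for illustration, not the real TeXmacs declarations):

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical abstract interface: something notified of tree changes.
class observer {
public:
  virtual ~observer() = default;
  virtual void notify_insert(const std::string& what) = 0;
};

// For now "rewriter" and "typesetter" add nothing to "observer",
// so they are declared as plain synonyms rather than as subclasses.
typedef observer rewriter;
typedef observer typesetter;

// A concrete observer used only for demonstration: it logs notifications.
class logging_observer : public observer {
public:
  void notify_insert(const std::string& what) override { log.push_back(what); }
  std::vector<std::string> log;
};

// Code written against "rewriter" accepts any observer, since the names
// denote one and the same class.
void feed(rewriter& r, const std::string& what) { r.notify_insert(what); }

static_assert(std::is_same<rewriter, typesetter>::value,
              "the typedefs name one and the same class");
```

If the classes diverge later, each typedef can be replaced by a
declaration such as "class rewriter : public observer { ... };" and
call sites like feed keep compiling unchanged.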

> > > Edit tree
> > > ---------
> > >
> > > I think there is a need for more terminology here. Since the current edit
> > > tree is very overloaded and we are going to separate it in several parts,
> > > we need more words.
> >
> > You do not have to care about the edit tree anyway:
> > you care about the source tree.
> Today, as far as I know, the two are the same.


> The rewriters will separate 
> the source and edit trees, so we must decide which responsibility goes 
> where. So I care about the edit tree because it is the semantic complement of 
> the source tree relative to the current editor's "et" data member.

In a first stage, the edit tree will remain a "tree" and
the source tree will become a "ttree".
So you will have to care about the source tree.

> > > I see TeXmacs as an interactive
> > > structured typesetting system. The typesetting system is made of a
> > > typesetter tightly coupled with an editor. That means that the
> > > typesetting language not only has primitives for layout, but also for
> > > controlling the editor.
> >
> > Wrong: the typesetting language has no primitives whatsoever for
> > controlling the editor and this independence is a major advantage.
> > This does not prevent the editor from *communicating* with typeset boxes,
> > in order to associate logical paths to physical positions or
> > to find a hyperlink associated to a region of text.
> > In other words, there is a clear separation between structure and
> > rendering on the implementation level.
> >
> > > For example, it could be useful to decouple the region concept (as
> > > implemented by varexpand) from the environment concept (as implemented by
> > > most structures). A region affects only the typesetter, while an
> > > environment also affects the editor (infinitesimal positions).
> >
> > Yes, this will be a result of what you are doing.
> > We are working towards a clean separation of
> >
> > 1. Structure
> > 2. Rewriting and/or scripting
> > 3. Rendering
> So you do not agree with my generic formulation, but you do agree with my 
> example... I guess that is a consequence of a divergence or a misunderstanding 
> somewhere else...

I just do not agree with the statement that the typesetting language also
has primitives for controlling the editor. For me, the editor *only*
operates on the edit tree (or source tree, which is almost the same),
although it may *communicate* with the box which is induced by it
after rewriting and typesetting.

> [from your earlier mail]
> > > I think that the behaviour of the editor must not be directly affected by
> > > the source tree, but only by the object tree.
> >
> > No: the source tree contains the structure; the object tree is only
> > obtained after rewriting and the boxes after typesetting.
> > The only sensible structure the editor operates on is the source tree.
> [end of moved part]


> > > Since the source tree structure is completely independent from the object
> > > tree structure, we cannot rely on the source tree for controlling the
> > > editor. Instead, the editor will only send edition notifications to the
> > > stree.
> >
> > No, we can not rely on the object tree for controlling the editor,
> > because the structure of the object tree does not directly correspond to
> > the structure of the source tree, which we are editing.
> I will try to make my point clear because I think there is an essential 
> misunderstanding here.
>   1. The behaviour of the editor must only be controlled by the object tree.

As stated, this is completely impossible: the object tree
will be an auxiliary object, between rewriting and typesetting.
You probably mean the "object box", obtained after rewriting
*and* typesetting.

>   2. Any edition operation is directly applied to the source tree, at the 
> location provided by the inverse path tag of the innermost object tree, that 
> is the source tree position corresponding to the cursor position, or the 
> source tree position corresponding to the innermost object tree containing 
> the whole selection.

So you agree that the edition operation is applied to the source tree,
i.e. that the editor operates on the source tree. That was exactly my point.
In order to find out *where* to operate, it may be necessary to
*communicate* with the object box (let us continue to use that name).
For instance, if you click somewhere on the screen, we ask the object
box to return a logical path (which is computed from the inverse path).
But at the moment that you want to insert something afterwards,
we do no longer need the object box: we know at which logical path
we want to insert something, we know what to insert, so we just insert
it in the source tree. The changes will next be notified to
the rewriting/typesetting process.
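
The sequence above can be sketched as follows. This is a toy model
under stated assumptions: box_item, object_box, find_path, editor and
insert_after are hypothetical names, the screen is reduced to one
horizontal coordinate, and a callback stands in for the whole
rewriting/typesetting chain.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

using path = std::vector<std::size_t>;

struct tree {
  std::string label;
  std::vector<tree> children;
};

// One typeset item: its horizontal extent on screen and the logical path
// (derived from the inverse path) of the source node it came from.
struct box_item {
  int x0, x1;
  path logical;
};

// Hypothetical object box: answers "which logical path lies under x?".
struct object_box {
  std::vector<box_item> items;

  bool find_path(int x, path& out) const {
    for (const box_item& it : items)
      if (it.x0 <= x && x < it.x1) { out = it.logical; return true; }
    return false;
  }
};

// The editor modifies the source tree only; the box is not involved.
struct editor {
  tree& source;
  std::function<void(const path&)> notify;  // stands in for rewriting/typesetting

  // Insert `t` as the next sibling of the node at logical path `where`.
  void insert_after(const path& where, tree t) {
    assert(!where.empty());
    tree* parent = &source;
    for (std::size_t i = 0; i + 1 < where.size(); ++i) {
      assert(where[i] < parent->children.size());
      parent = &parent->children[where[i]];
    }
    const std::size_t pos = where.back() + 1;
    assert(pos <= parent->children.size());
    parent->children.insert(
        parent->children.begin() + static_cast<std::ptrdiff_t>(pos),
        std::move(t));
    if (notify) notify(where);  // changes are reported downstream afterwards
  }
};
```

The box is queried once, to turn a click into a logical path. The
insertion itself touches nothing but the source tree.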

> Point 1 is required to support generic transformations. For example, suppose 
> you are editing a source document which has the following structure:
> <addbk|\
>   <persons|\
>     <person|<first|Joris><last|Hoeven><part|van der><job|psud>>\
>     <person|<first|Ralph><last|Treinen><job|psud>>\
>     <person|<first|David><last|Allouche><job|eisti>>>\
>   <jobs|\
>     <job|psud|<name|Universite de Paris Sud><city|Orsay>>\
>     <job|eisti|<name|EISTI|Ecole Internationale...><city|Cery>>>>
> Yes, that is essentially a relational database, and yes it is XML oriented.
> And that your first rewriter gives a document which looks like:
> <body\>
>   <section|Jobs>
>   <description\>
>     <item*|<concat|Joris|van der|Hoeven>:><concat|Universite de Paris Sud>
>     <item*|<concat|Ralph|Treinen>:><concat|Universite de Paris Sud>
>     <item*|<concat|David|Allouche>:><concat|Ecole Internationale...>
>   <description/>
> <body/>
> Yes, I know concat is not a feature of TeXmacs externalisation style, that is 
> just an example.
> As you see, the structure of the second tree (which is what the user sees) is 
> completely different from the structure of the source tree.
> I want to know how you intend to control the editor from the source tree in a way 
> that makes sense for the user, who sees the second tree. It only makes sense 
> to control the editor from the object tree (which is essentially the same as the 
> second tree), and to give feedback to the user after modifications to the 
> source tree have made their way to the object tree.

Well the point about *tagged* trees is precisely this.
During the rewriting procedure each node in the second tree comes
with a pointer to a precise location in the first tree.
In order to get this working really well, this may necessitate
the insertion of invisible markers though. Also, certain subtrees
of the output may correspond to nothing in the source tree,
in which case they are marked as inaccessible.

In other words, the tagged tree paradigm allows us to build
an object box with complete logical position information.
The editor communicates with the object box in order to
obtain logical information from graphical information and
vice versa.
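
A minimal sketch of the tagging idea, loosely modelled on the address
book example but much simplified. All names here (tagged_tree,
rewrite_persons, resolve) are hypothetical, and each person node is
assumed to hold its name parts directly as child labels.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using path = std::vector<std::size_t>;

struct tree {
  std::string label;
  std::vector<tree> children;
};

// Output node carrying a tag that says where it came from in the source.
struct tagged_tree {
  std::string label;
  bool accessible;  // false: generated, corresponds to nothing in the source
  path origin;      // meaningful only when accessible
  std::vector<tagged_tree> children;
};

// Hypothetical rewriter: turn a list of persons into a description list,
// tagging every output node with the source location it was built from.
tagged_tree rewrite_persons(const tree& persons) {
  tagged_tree out{"description", true, path{}, {}};
  // A generated heading has no counterpart in the source tree.
  out.children.push_back(tagged_tree{"heading", false, path{}, {}});
  for (std::size_t i = 0; i < persons.children.size(); ++i) {
    const tree& person = persons.children[i];
    tagged_tree item{"item", true, path{i}, {}};
    for (std::size_t j = 0; j < person.children.size(); ++j)
      item.children.push_back(
          tagged_tree{person.children[j].label, true, path{i, j}, {}});
    out.children.push_back(item);
  }
  return out;
}

// Resolve a tag back to the source node it points at.
const tree& resolve(const tree& source, const tagged_tree& node) {
  assert(node.accessible);
  const tree* cur = &source;
  for (std::size_t k : node.origin) {
    assert(k < cur->children.size());
    cur = &cur->children[k];
  }
  return *cur;
}
```

Although the output has a different shape from the source, every
accessible node can be mapped back to a precise source location, and
the generated heading is flagged so the editor can skip it.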

> Dynamic validator
> -----------------
> > Yes, that is the next step, which I already mentioned in a discussion:
> > I plan to incorporate DTD support in TeXmacs. At that point we will have
> > four levels:
> >
> > 1. DTD and validated logical trees
> > 2. Brute non validated trees (source trees for rewriting)
> > 3. Rewriting and/or scripting
> > 4. Rendering
> >
> > Most editor routines in the extension language will operate on level 1.
> > Some routines will operate directly on level 2 for efficiency reasons.
> That is another discussion, but I do not think we should do it that way. I 
> have not yet really thought of it, but, at first, I think that things should 
> be controlled by another feedback loop to allow the use of plugin validators 
> (DTD, various schemas, scripting languages).
> The feedback loop would look like:
>               +------- pointer path --------+       
>               v                             |
>   stree -> validator -> user interface -> editor -> validator -> stree
> The idea is to only use the validator where validation can be broken, that 
> is during edition operations. That approach makes it unnecessary to make a 
> distinction between validated and non-validated trees.
> Again, a simple protocol will have to be designed between all the components.

OK, let's talk about this later. I will remember that you have some
ideas about it. I did not fix my ideas yet anyway.

> > > Did you ever use different views of the same document using different
> > > stylesheets? That is very useful, especially when one of the stylesheets
> > > exposes otherwise invisible data. So we need that the complete
> > > transformation chain as well as the object tree be local to an editor.
> > > Things could indeed be optimised by using more independent abstractions
> > > of editors and views so that the same rewriting would not be redone for
> > > two views displaying the same document with the same transformation.
> >
> > Yes, I sometimes use this feature (although not very often).
> > But I agree, we need to make this feature more and more powerful.
> So, we agree that there should be one object tree for each editor. The 
> disagreement seems only to be on what is really the edit tree. As I 
> previously said, I think the edit tree must be the object tree.

And you hopefully understand now why it has to be the source tree.

> Force method
> ------------
> It looks like we are eventually coming to an agreement on that issue :)
> > >
> > > So my prototype "box force(tree, path context)" is the right one, because
> > > it does not mandate anything but the existence of a rewriter for the
> > > whole tree.
> >
> > I am still not completely convinced.
> >
> > Mainly, I do not see the purpose of the tree argument.
> > If it is used to modify the rendering (like putting it in bold),
> > then I feel that this should really be done in the tree structure itself,
> > because we might rewrite that structure in a way that it does what you
> > want anyway. The tree argument is only needed as a query,
> > but this, in its turn, is only needed if we do not return a box,
> > since the box already contains all information.
> >
> > Also, the path argument is not a context but a path to a subtree.
> In the RFC, I said:
> The tree will be typeset as if it was located at the given <var|context> path
> What I mean is that the returned box is the result of the typesetting of the tree 
> parameter after rewriting. We need to pass the tree to typeset as a parameter 
> because that information cannot be inferred from the identity of the notified 
> rewriter object, because rewriter is a Facade.

Well, I think that it necessarily should be inferrable from that
information because it would be nice to preserve the locality
of the present context. I think that this can be achieved
in all reasonable situations.

> To rewrite and typeset the parameter properly, we need to specify a context. 
> The path parameter specifies a position in the notional output ttree of the 
> rewriter issuing the force message. The tree parameter will be rewritten and 
> typeset as if it was inserted at that position.

But in that case, the tree that you want to pass as a parameter
may be directly included in the rewritten tree (in an invisible way),
so that you only need to specify a path to find it.
This allows you to benefit from the full power of the rewriting scheme.

> That allows the forcing of a document fragment which is not actually present 
> in the tree. Maybe you do not agree with that, but I think that approach is 
> sounder from a functional programming perspective.

The previous point contains the remedy: we force the typesetting of
the ttree *after* and not before rewriting. Of course we have to be
careful with synchronizing all this appropriately.

> First pass typesetting
> ----------------------
> > > I am one of those people who feel comfortable when they know a bit more
> > > than they strictly have to, so I assume the same for my readers. I know I
> > > have not yet a full understanding of how the typesetter works, but I
> > > think the little I said was correct. If it is not, I would really like
> > > the correct version.
> >
> > Yes, I will give this to you, but I have no time for that right now,
> > so I want to concentrate on the information that you really need most.
> Thanks in advance. I will leave that section in place until I have more 
> precise information. Since I had quite a look at the typesetting system 
> before sending my last patch (about moving the caret out of multicols), I 
> think it is not too wrong.

Unfortunately, it is not really adequate; I prefer that you leave it out
so that other readers will not unnecessarily get confused.

> RFC layout
> ----------
> > > I will distribute the document in A5 papyrus, since I think that A4
> > > papyrus is too wide for comfortable on-screen reading. Given the
> > > typesetting time of the document, automatic page type is not an option.
> >
> > That is fine. Please use 600 dpi fonts by the way.
> If you really want me to, I will do it.
> But I want to point out that the optimal anti-aliasing quality is obtained when 
> using shrinking=3 (so, dpi=360). With shrinking=2 (dpi=240), the text is 
> still ragged, but with shrinking=5 (dpi=600), the text has much less 
> contrast. If you want to be convinced, just use xmag on the same text with 
> different dpi and shrinking settings.

I know; I did many experiments like that. It also depends on the font size
(10 point, 12 point, etc.). But 600 dpi is the printing standard and
most people have their fonts generated for that resolution.

> I agree that more antialiasing makes the page overall more beautiful, but 
> more contrast makes the text less tiring to work with, since the eye can 
> accommodate on the screen more easily.

Yes, but the text is also smaller. I tend to work with 600dpi sf=4 11pt
on a 1024x768 screen. Hopefully, I will be able to switch to sf=3 once
on a larger screen.

Yours, Joris