Re: Future Direction of GNU Hurd?

On Sun, Mar 14, 2021 at 11:23 AM Olaf Buddenhagen <olafbuddenhagen@gmx.net> wrote:

> > One of the critical notions in capabilities is
> > that the capability you wield names the object you manipulate. If the
> > receive port can be transferred, this intuition is violated. In
> > consequence, few capability-based systems have implemented receive
> > ports.

> Interesting... Didn't realise that this is something capability designs
> frown upon.

I just realized that I need to clarify this, because I wasn't using "object" in the way that I think it is being understood. When I say that a capability names a specific object, two things are true:

1. The term "object", in this context, means "state + behavior", not necessarily the specifically executing implementation.
2. When objects are "active" (that is: implemented by services), it is possible for the API or interface of the object to change dynamically. From an application perspective, it may be useful to think of the result as a new object, but for the purposes of understanding capabilities it isn't. The capacity to "morph" its operation set was part of the specification of the original behavior of the object. Because of this, the new behavior of the object is conceptually included in the original behavior.

In the current discussion, this comes up as follows:

The Coyotos "Endpoint" object contains a process capability to the receiving process (note: not an entry capability!). It's a scheduler activation design, so the effect of message arrival is (a) mark in the shared page that a message is waiting and (b) if the process is sleeping, wake it up so that it notices. The tricky part in scheduler activations on a multiprocessor is that these two things can be in a race. Anyway, the receiving process typically holds the "master" capability to the endpoint, so it is in a position to change the process capability. If it does so, the recipient process changes. This is very similar to the notion of a receive port or receive capability.

The reason this is OK is that the original recipient process could equally well implement this by forwarding the message to the new recipient process. That is: changing the process capability in the endpoint is logically equivalent to forwarding the message.
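
A toy sketch of that equivalence, in Python rather than anything resembling Coyotos code. The names Process, Endpoint, retarget, and deliver are invented for the illustration, and the model ignores scheduling, blocking, and the kernel entirely. It only shows that a sender cannot distinguish "the old recipient forwards" from "the endpoint now points at the new recipient":

```python
class Process:
    """Toy receiver: records what it is handed, or forwards it onward."""
    def __init__(self, name, forward_to=None):
        self.name = name
        self.forward_to = forward_to   # hypothetical next hop, if any
        self.handled = []

    def deliver(self, msg):
        if self.forward_to is not None:
            self.forward_to.deliver(msg)   # pass the message along unchanged
        else:
            self.handled.append(msg)


class Endpoint:
    """Toy endpoint: holds a reference to the current recipient process."""
    def __init__(self, target):
        self._target = target

    def retarget(self, new_target):
        # Stands in for the holder of the "master" capability replacing
        # the process capability stored in the endpoint.
        self._target = new_target

    def send(self, msg):
        self._target.deliver(msg)


# Variant A: the original recipient stays in place and forwards.
new_a = Process("new")
old_a = Process("old", forward_to=new_a)
ep_a = Endpoint(old_a)
ep_a.send("hello")

# Variant B: the endpoint is retargeted to the new recipient.
new_b = Process("new")
old_b = Process("old")
ep_b = Endpoint(old_b)
ep_b.retarget(new_b)
ep_b.send("hello")

# Same message, same final handler in both variants.
same_outcome = (new_a.handled == new_b.handled == ["hello"])
```

In both variants the sender holds the same endpoint reference throughout, which is the sense in which the capability still names the same object.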

Note that this would not be possible in the Mach reply capability design, because that capability cannot be forwarded. It requires an explicit reply capability that can be forwarded. Ironically, the inability to forward the reply capability means that forwarding the receive capability needs some care.

If I remember correctly (hey, it's only been 38 years), Mach is even weirder, because a reply port is part of the process state rather than the thread state. A message received by one thread can be replied to by a different thread in the same process, but cannot be replied to by a different process. This creates a strange asymmetry.

> FWIW, I was personally never able to conclude whether the ability to
> transfer receivers is a useful feature in general or not.

The ability to transfer the authority to reply is fairly essential. This was a pretty fundamental design mistake in Mach IPC. The ability to transfer receive ports/capabilities is less so, but there is no semantic or security problem with it - it's equivalent to receiving and forwarding all messages to the new receiver.
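
To make "transferring the authority to reply" concrete, here is a minimal Python sketch. Everything in it is hypothetical (ReplyCap, front_end, worker), and the one-shot behavior is an assumption made for the example, not a description of any particular kernel. The point is only that the reply authority travels with the request, so whoever ends up doing the work can answer the client directly:

```python
class ReplyCap:
    """Toy reply capability: whoever holds it may answer exactly once."""
    def __init__(self):
        self.result = None
        self._used = False

    def reply(self, value):
        if self._used:
            raise RuntimeError("reply capability already consumed")
        self._used = True
        self.result = value


def worker(request, reply_cap):
    # The worker answers the client directly; the front end is not involved.
    reply_cap.reply(request.upper())


def front_end(request, reply_cap):
    # Forward both the request and the authority to reply to it.
    worker(request, reply_cap)


cap = ReplyCap()          # the client creates the capability with its request
front_end("ping", cap)
answer = cap.result
```

If the reply authority were pinned to the process that first received the request, front_end would have to stay in the path and relay the answer itself.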

> > No member of the KeyKOS family implemented such a notion. Coyotos
> > comes closest. "Entry" capabilities actually point to Endpoint
> > objects, which in turn contain a Process capability to the
> > implementing process. A scheduler activation is performed within this
> > process. This is comparable to a receive port capability because the
> > process capability within the Endpoint object can be updated.

> Will have to think about whether such a design would work for what I'm
> trying to do.

Given my explanation above, I think it will, because you can implement something equivalent to transferring the receive port.

> I totally agree that it's probably not useful to have multiple active
> listeners... It's not what I'm looking for :-)

It isn't obvious. The problem in a multiprocessor is that two different receive threads on the same endpoint may have message receive times that are different by several orders of magnitude. There is no place outside the kernel where choosing the receiver can be done well.

> > This also has the advantage that all of the "pointers" (the object
> > references) point from the invoker to the invokee. That turns out to
> > be essential if you want to implement transparent orthogonal
> > persistence. It rules out receive port capabilities.

> That's funny: the thing that (I think) I need receiver capabilities for,
> is actually for implementing a (not quite orthogonal) persistence
> mechanism :-)

Feel free to steal what we did. The EROS version is pretty thoroughly written up. The Coyotos version was never implemented, but the way we modified the "range" architecture, the migration to a more conventional, multi-generational write-ahead log for checkpoint, and the fact that no object has multiple "interpretations" make it a whole lot simpler. The hardest part of persistence in EROS was that the "node" object had so many possible interpretations that needed to be taken into account. The WAL implementation was completed in later versions of EROS, so it should be possible to borrow that.

One problem with orthogonal persistence is that it doesn't actually simplify much in networked systems. Two processes running on the same machine will be restored in a mutually consistent way, but processes running on different machines will not. This tends to mean that communications across the machine perimeter behave very differently, and a lot of processes need to know about it. The truth is that these two cases have always behaved differently, but UNIX and Windows go to extreme lengths to hide this from applications (unsuccessfully, because it can't be done in principle when messages cross failure domains).
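
A small Python model of the mismatch, under heavy simplification. Machine, send, and the counters are invented for the example, and each "machine" is just a dictionary with a snapshot. Two machines checkpoint, exchange more messages, and then only one of them fails:

```python
import copy


class Machine:
    """Toy machine: all state is restored from the last local checkpoint."""
    def __init__(self):
        self.state = {"sent": 0, "received": 0}
        self._snapshot = copy.deepcopy(self.state)

    def checkpoint(self):
        self._snapshot = copy.deepcopy(self.state)

    def crash_and_restore(self):
        self.state = copy.deepcopy(self._snapshot)


def send(src, dst):
    src.state["sent"] += 1
    dst.state["received"] += 1


a, b = Machine(), Machine()
for _ in range(2):
    send(a, b)
a.checkpoint()
b.checkpoint()
for _ in range(3):
    send(a, b)

# Independent failure domains: only A fails and rolls back.
a.crash_and_restore()
sent_by_a = a.state["sent"]          # A believes it sent 2 messages
received_by_b = b.state["received"]  # B has seen 5

# Single failure domain: both roll back to the same checkpoint.
b.crash_and_restore()
consistent_after_joint_restore = (a.state["sent"] == b.state["received"])
```

After the partial failure, A and B disagree about history, and no amount of local checkpointing on A repairs that. When both roll back together, the disagreement disappears.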

The other problem is that you sometimes need to violate orthogonality. For example, you don't want to lose a committed banking transaction if the system has to restart before the next checkpoint. KeyKOS, EROS, and Coyotos all have ways to bypass the checkpoint rules for this kind of situation.
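
One way to picture such a bypass is a write-through journal that sits alongside the checkpoint. This is a hypothetical Python sketch, not how KeyKOS, EROS, or Coyotos actually do it. PersistentSystem, the durable flag, and the journal layout are all assumptions made for the illustration:

```python
import copy


class PersistentSystem:
    """Toy checkpointed store with a write-through journal for durable updates."""
    def __init__(self):
        self.balances = {}
        self._checkpoint = {}
        self._journal = []   # hypothetical log that survives a restart

    def checkpoint(self):
        self._checkpoint = copy.deepcopy(self.balances)
        self._journal.clear()   # journaled updates are now in the checkpoint

    def deposit(self, account, amount, durable=False):
        self.balances[account] = self.balances.get(account, 0) + amount
        if durable:
            # Bypass the checkpoint rules: record the update immediately.
            self._journal.append((account, amount))

    def restart(self):
        # Roll back to the checkpoint, then replay the durable updates.
        self.balances = copy.deepcopy(self._checkpoint)
        for account, amount in self._journal:
            self.balances[account] = self.balances.get(account, 0) + amount


system = PersistentSystem()
system.deposit("alice", 100)
system.checkpoint()
system.deposit("alice", 50, durable=True)   # committed transaction
system.deposit("bob", 10)                   # ordinary, checkpoint-only update
system.restart()                            # failure before the next checkpoint
```

After the restart, the committed deposit to alice survives, while the ordinary update to bob is rolled back along with everything else since the checkpoint.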

In the abstract, we know how to build a multi-machine cluster that acts as if it were a single failure domain - I wrote a paper about it decades ago, but never published it because it was never implemented. The hard part is bounded rollback using local checkpoint. If anybody cares I can say more about it. But even if you do this, there will still be "foreign" systems you need to talk to. The problem of independent failure domains isn't going to go away, and once you have to deal with it anywhere the incentive to expand individual failure domains is greatly reduced.

> > I have not looked at Viengoos, but this sounds functionally similar to
> > what Coyotos does. In Coyotos, the receiving process designates a
> > scheduler activation block that says where the incoming data should
> > go.

> Although I don't know the full history,
> the Viengoos approach is quite likely inspired by the Coyotos one...

I didn't know anything about Viengoos until Neal arrived at Johns Hopkins. I don't know what may have happened afterwards, but in my interactions with Neal the Viengoos design seemed pretty well decided. So far as I know, Coyotos did not borrow from Viengoos. Coyotos was leveraging almost 35 years of concrete experience with a particular type of system architecture, and attempting to merge what we had learned in our verification efforts. Initially, it started because I wanted to look at how the "unit of operation" intuition (which was SO critical) would work in a multiprocessor variant. Viengoos, at that time, was a young design, and Neal was still exploring.

Jonathan

From:	Jonathan S. Shapiro
Subject:	Re: Future Direction of GNU Hurd?
Date:	Tue, 16 Mar 2021 11:39:39 -0700