
Re: Comments on the hurd-on-l4 document

From: Marcus Brinkmann
Subject: Re: Comments on the hurd-on-l4 document
Date: Wed, 08 Jun 2005 14:36:04 +0200
User-agent: Wanderlust/2.10.1 (Watching The Wheels) SEMI/1.14.6 (Maruoka) FLIM/1.14.6 (Marutamachi) APEL/10.6 Emacs/21.4 (i386-pc-linux-gnu) MULE/5.0 (SAKAKI)

Hi Niels,

great to see that you are still interested; your input is, as always,
appreciated.  Before diving into the details, I want to give you a
heads-up that under the surface some radical changes are going on that
are not reflected anywhere: not in the docs, not in the source, and
not on the mailing list.  So you couldn't have known about them (sorry
about that).

The most important changes are related to the capability system.  I am
by now convinced (and I think Neal agrees) that we simply cannot
feasibly implement a capability system without support from a central
authority: either the kernel, via its IPC system, or a trusted
capability server.

I cannot pin this on a single killer argument.  There are a couple of
reasons, among them:

* We are violating many security aspects of a traditional capability
  system, so we cannot rely on existing literature and security
  analyses.  (For example, we leak too much internal information.)
* In upcoming L4 designs, global thread IDs will be _gone_, and our
  design will not carry over without some fundamental changes anyway.
* Capability transfer requires too much trust.  We should be able to
  accept capabilities even if they come from untrusted sources.  The
  capability transfer mechanism I designed is a pure nightmare (I have
  a race-free design, but it is horribly complicated and can hardly
  be optimized at all).
* Task info capabilities are just an insane concept to go with in the
  first place.  They are a sad excuse for a real capability and/or
  notification server.
* Our design doesn't support transparent interposition, for example by
  proxy servers or debugging servers.
* And our design didn't even go far enough.  To support proxy and
  debugging servers, which must be done explicitly, we need to
  consult the proxy server for each capability transfer.  Imagine
  something like unionfs and it gets pretty expensive.  Alternatively,
  we could refuse to accept the capability in the first place and use
  cap containers in place of capabilities when making upcalls to the
  proxy.  Either way, our design would have to be extended a lot,
  increasing code complexity and decreasing performance.

So, instead, we are now looking at capability server designs and at
what type of kernel extensions would be necessary.  It seems that only
a very small extension to upcoming L4 designs may be needed, but it
depends a lot on the exact details, so we are trying to talk to
everyone about it.

About notifications: my current stance is that they are fundamental
and have to be done right.  Instead of minimizing their use, I tried
to imagine what we could do if we had good notification support.  It
turns out this has good consequences: we can get rid of blocking
calls and cancellation.  If you look at the libhurd-cap-server code I
have written, a _lot_ of the complexity comes from the fact that
clients can asynchronously cancel an RPC.  This feature is only needed
because we expect some RPCs, like select(), to block for an
unexpectedly long time.  There are very few such calls.  For all other
calls, a cancellation request wouldn't actually do anything but wait
until the call completed.  In addition, cancelling an RPC on the
client side is a slow and complicated process, due to hairy
interactions with the signal thread.  When I looked at the whole thing
and tried to imagine it without any blocking calls, it became a lot
simpler.

So, with no blocking calls, the cost is that we now need to implement
select() differently.  A bit of extra cost in the "blocking" case
seems OK to me, though.  A select() call should instead just set up a
continuation in the server and then return immediately.  The client
passes a "notification object" to the server that the server can use
to deliver exactly one notification to the client.  Notification
objects are provided by trusted system servers (for example, the
capability server itself!).  The client will then just wait for the
notification (i.e., it will block on its own notification handler
thread).  If it wants to abort the operation, it just kills the
notification object, revoking the server's right to send a
notification (eventually the server will clean up, because it will
get a "cap death" notification from the capability server).
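To make the shape of this protocol concrete, here is a minimal Python
model of the scheme described above.  It is only a sketch of the idea,
not Hurd code: the names (NotificationObject, SelectServer,
fd_became_ready) are all made up for illustration, and Python threading
primitives stand in for the real IPC and the trusted notification
server.

```python
import threading

class NotificationObject:
    """One-shot notification channel; kill() revokes the server's
    right to deliver, modeling the abort path described above."""
    def __init__(self):
        self._event = threading.Event()
        self._lock = threading.Lock()
        self._dead = False
        self._delivered = False

    def deliver(self):
        # Server side: deliver exactly one notification, unless the
        # client has already killed the object.
        with self._lock:
            if self._dead or self._delivered:
                return False
            self._delivered = True
        self._event.set()
        return True

    def kill(self):
        # Client side: abort the operation by revoking delivery rights.
        with self._lock:
            self._dead = True
        self._event.set()  # wake any waiter so it can observe the abort

    def wait(self):
        # The client blocks on its own notification handler, not on
        # an RPC into the server.
        self._event.wait()
        return self._delivered

class SelectServer:
    """The 'select' RPC just records a continuation and returns
    immediately; no call into this server ever blocks."""
    def __init__(self):
        self._continuations = []

    def select(self, notify_obj):
        self._continuations.append(notify_obj)  # no blocking here

    def fd_became_ready(self):
        # Later event: fire the recorded continuations.
        for n in self._continuations:
            n.deliver()
        self._continuations.clear()
```

Note how cancellation needs no special path in the server: the client
just kills its notification object, and a later deliver() simply
becomes a no-op.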

The only remote call that will still block, and needs to be treated
specially, is the call the client makes to receive the next
notification from the notification server.  As this is a central
system service, a dedicated thread will be fine for that.  And because
it is only one interface, we can define some very specific semantics
for it if necessary.  For example, cancellation could be allowed by
simply breaking out of the IPC and having the notification server
recover from a failed reply.  Under such circumstances, polling for
notifications could maybe even be done by the signal thread.
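As a sketch of that one remaining blocking call, here is a hypothetical
model of the dedicated client thread.  Again, none of these names come
from any real interface; a Python Queue stands in for the IPC path, and
a None sentinel stands in for "breaking out of the IPC" to cancel.

```python
import queue
import threading

class NotificationClient:
    """A dedicated thread that repeatedly makes the one blocking call:
    'receive the next notification'."""
    def __init__(self):
        self._incoming = queue.Queue()  # stands in for the IPC path
        self.received = []
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        while True:
            msg = self._incoming.get()  # the only blocking "RPC"
            if msg is None:             # cancelled: broke out of the IPC
                break
            self.received.append(msg)

    def post(self, msg):
        # Notification-server side: hand the client its next notification.
        self._incoming.put(msg)

    def cancel(self):
        # Break out of the blocking call; the server would recover
        # from the failed reply on its side.
        self._incoming.put(None)
        self._thread.join()
```

Because this is the only interface with blocking semantics, all the
cancellation machinery is concentrated in this one loop instead of
being spread across every RPC stub.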

Neal is also working on more radical designs for exchanging memory
between a driver and a client (using the server as an intermediary).
I can't write much about that yet, though.

