
Re: notifications


From: Marcus Brinkmann
Subject: Re: notifications
Date: Thu, 07 Oct 2004 17:11:40 +0200
User-agent: Wanderlust/2.10.1 (Watching The Wheels) SEMI/1.14.6 (Maruoka) FLIM/1.14.6 (Marutamachi) APEL/10.6 Emacs/21.3 (i386-pc-linux-gnu) MULE/5.0 (SAKAKI)

At Thu, 07 Oct 2004 15:53:51 +0200,
Bas Wijnen <address@hidden> wrote:
> 
> Marcus Brinkmann wrote:
> >>8 - Task sends a message about it to A.
> > 
> > Specifically, it replies to the IPC started in 1.
> 
> That's how A looks at it, however if I understood the L4 specs then 
> there's no technical difference.  The only difference between doing send 
> and receive in one or two calls is that doing it in one is an atomic 
> operation (while two calls could result in a "too fast" reply which 
> cannot be handled yet.)

Well, that's really how the Hurd looks at it.  We are really using
the capability system here, and that adds a lot of semantics to the
primitive IPC operations in L4, including certain guarantees and
rules.  You are right, of course, but in the context of this
discussion, not only can we take the cap system's semantics as a
given, they are also necessary.

Unless you want to propose a different communication mechanism than
cap RPCs for notifications.  Which does not work, from my experience.
Which is my whole point ;)

> >>9 - C dies.
> >>10- Task waits (and does _not_ notify A yet.)
> > 
> > Be precise.  We want to define what "notify" means, so you can't use
> > that term.
> 
> Oh, I didn't think I was defining it.  Notify means for me to make the 
> information available in any way.

Sorry then, but we associate something very specific with it.

A normal RPC that queries some data in the server we would not call a
notification, and yet it also makes information available.  So,
what's the difference?  Conventionally, a notification is information
that is sent to you without you explicitly requesting it.  You might
request to receive a certain type of notification at some point, but
you are not asking for any specific notification, mainly because
notifications are usually used for events you can't predict.  I.e.,
you want to know when a task from a group of tasks dies, without
knowing which task might die when.  Or you might want to know when a
file in a certain directory changes, but you have no idea when and
why that might happen.

Ideally, you would just declare your interest in receiving
notifications, and then the notifications would magically appear at
your doorstep.  What does that mean?  Well, take for example signals
in POSIX.  You install a signal handler, and when a signal arrives,
the program execution (by the receiving thread) is suspended, and the
thread that receives the signal jumps to the signal handler.  This is
a notification, and specifically it's asynchronous delivery.
Synchronous delivery would be blocking signals for a while and then
unblocking them, so that pending signals are delivered at a point you
choose.
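
To make that concrete, here is a minimal POSIX sketch of both
delivery styles (error checking omitted):

#include <signal.h>

static volatile sig_atomic_t got_sigchld;

static void
handler (int sig)
{
  /* Asynchronous delivery: the receiving thread is interrupted
     and jumps here.  */
  got_sigchld = 1;
}

int
main (void)
{
  struct sigaction sa;
  sa.sa_handler = handler;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = 0;
  sigaction (SIGCHLD, &sa, NULL);

  /* Synchronous variant: block the signal, do some work, then
     unblock it; a pending SIGCHLD is delivered at the unblock.  */
  sigset_t set;
  sigemptyset (&set);
  sigaddset (&set, SIGCHLD);
  sigprocmask (SIG_BLOCK, &set, NULL);
  /* ... critical work ... */
  sigprocmask (SIG_UNBLOCK, &set, NULL);
  return 0;
}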

Now, with any such scheme, there is an important issue, and that is
how to deal with notifications that are not readily accepted.  For
synchronous signals, the answer is that they are flagged, and
delivered the next time this is possible (there is a similar setup for
signals that arrive while running a signal handler, and for interrupts
and interrupt handlers respectively).  However, POSIX says (loosely)
that if another signal arrives of the same type as one that is
blocked, that signal will be lost.  Your signal handler will only run
once.

This is fine if a notification only conveys a certain vague type of
information like: "there is new data, look at it".  Both signals,
the unprocessed one and the newly arriving one, convey the same
information, so there is no harm in dropping the second signal.  It's
bad if your notifications contain more information that is unique to
each individual notification.

So, what you usually do is to have some sort of a buffer with all your
information, and then have a notification mean that you look into the
buffer for more details.  If you receive a SIGCHLD, then you can run
waitpid several times to learn about all children that died.  The
kernel buffers the information about dead children.
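
In code, that pattern looks roughly like this; one SIGCHLD may stand
for several dead children, so you drain the kernel's buffer in a
loop:

#include <sys/wait.h>

/* Called after a SIGCHLD was noticed.  */
static void
reap_children (void)
{
  pid_t pid;
  int status;

  /* WNOHANG: collect all buffered deaths without blocking.  */
  while ((pid = waitpid (-1, &status, WNOHANG)) > 0)
    {
      /* ... record that PID died, with STATUS ... */
    }
}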

In the Hurd console, the stream of data that arrives is potentially
infinite, so not all changes can possibly be buffered.  The solution
to this is that the console does a full screen refresh if it notices
that buffer data is lost (the buffer data is only used to optimize
screen updates, to avoid flickering and wasting resources in general).

Now, let's talk about RPCs.  In Mach, we use ports.  There are certain
notifications that are provided by Mach itself, and the port system.
Those are always delivered, and there is special room in the kernel
structures for them.  In the case of the generic notifications used
for file changes and the console, for example, we use a port in the
client for which a send right is handed to the server.  The server
will then send notification messages to the client.  These messages
are buffered implicitly by Mach's port system.  However, in the
console I set the buffer size to 1, as I want it to have signal-like
semantics.  All of this is fine as is.
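
For illustration, the client-side setup might look like the
following sketch (standard Mach port calls, error checking omitted;
how the send right reaches the server is only hinted at):

#include <mach.h>

static mach_port_t
setup_notify_port (void)
{
  mach_port_t notify_port;

  /* Create a receive right in the client for notification
     messages.  */
  mach_port_allocate (mach_task_self (), MACH_PORT_RIGHT_RECEIVE,
                      &notify_port);

  /* As in the console: limit the queue to one pending message,
     for signal-like semantics.  */
  mach_port_set_qlimit (mach_task_self (), notify_port, 1);

  /* A send right for NOTIFY_PORT is then handed to the server
     (via dir_notice_changes or a similar RPC), and the server's
     notification messages are buffered implicitly in the port's
     queue.  */
  return notify_port;
}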

In L4, the situation is slightly different.  It is totally clear that
the server must never block on sending a notification to a client.
This essentially means that the send timeout must be zero, as for
normal RPC reply messages.  The client must be ready to receive the
notification.  But what if the client is currently in the process of
handling the last notification the server sent?  There is no way to
avoid a small window in which the client is not in a receive state,
right after one receive and before the next.  It's impossible.  So,
the server must buffer at least one notification to the client in its
own space.  Which is not so good.  OTOH, Mach was no better: it
buffered the notification in kernel space.  However, logically, the
kernel space in Mach dedicated as a buffer was attributed to the
client, not to the server.
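
A sketch of what that per-client buffering in the server could look
like; try_send is a hypothetical stand-in for an IPC send with a
zero send timeout, which fails instead of blocking:

struct notification { int event; };

struct client
{
  int pending_valid;            /* Undelivered notification?  */
  struct notification pending;  /* The one buffered notification.  */
};

/* Hypothetical: send with a zero send timeout, return nonzero on
   success.  */
extern int try_send (struct client *c, struct notification *n);

static void
notify_client (struct client *c, struct notification *n)
{
  if (try_send (c, n))
    return;                     /* Client was in a receive state.  */

  /* Client was busy: buffer (at most) one notification in server
     space, to be delivered when the client listens again.  */
  c->pending_valid = 1;
  c->pending = *n;
}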

If you believe in paradigms like: The client should pay for resources
needed to implement services for it, then buffering data for the
client is bad style.  It's not a serious problem, as long as it is
just a small amount of data, but it's semantically dirty.  It just
feels wrong.

There is no solution to this within L4; we have to make the best of
it.  The task server will have to cache task deaths until the
interested parties have received them.  This feature is necessary for
security reasons, and to clean up server resources.  Luckily, there
are few other such notification interfaces, and every one has its own
"profile" and can be implemented using different work-arounds.

Bottom line: notifications are difficult to get right, so we have to
avoid them, and live with the resulting limitations in architectural
cleanliness.

Please note that this is not a criticism of L4, or any other
architecture: The above problems are an inherent property of
notifications.  There is no solution which can fulfill all desired
properties, you can only come as close as you want.  Ultimately, if
the client does not provide the space, or CPU time, or state to
receive a notification at the time it occurs, the notification must
be dropped eventually.  You can turn the knobs to reduce the
likelihood and consequences of this happening, but you can not avoid
it if the number and timing of notifications is unpredictable.

> > Better: It can not send a message to A because there is no
> > request to reply to.
> 
> I don't like to use the word "reply", because it suggests (to me at 
> least) that the IPC is started as a direct result of the request.  For A 
> it looks like a reply of course, but for task it's just an IPC.

This is not true.  A server will never do IPC to random tasks just for
the fun of it.  We are talking about an RPC context here.  At least
this is my underlying assumption.  If you want a different IPC
context, you need to define it.

An RPC context implies that there is a valid client thread making a
valid RPC on a valid object the client has a capability for.  It also
implies that the client blocks in the receive state from the time it
sends the message until the server replies (a paranoid client might
want a timeout, but if that happens, things become undefined, and the
server or capability might become unusable).  An RPC consists of two
phases, the request and reply phase.  We might allow RPCs with only
a request phase (no receive), but that can break things (in
particular, it does not guarantee a delivery, so it is not really
useful).
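
On the raw L4 level, the client side of such an RPC is a combined
send-and-receive.  Assuming the L4 X.2 convenience interface
(message setup elided), it is roughly:

#include <l4/ipc.h>

/* L4_Call sends the request and atomically enters the receive
   state, so the reply can not arrive "too fast".  */
L4_MsgTag_t
do_rpc (L4_ThreadId_t server)
{
  /* ... load the request into the message registers ... */
  return L4_Call (server);      /* Request phase, then reply phase.  */
}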

The server is free to break up the request and reply phase, stash away
the information from the RPC and reply at a later time.  However, this
is currently unsupported (it requires a bit of extra work because the
L4 kernel needs to know about it when the replying thread changes via
IPC propagation).

Talking about any of these issues discussed here in the sole context
of primitive L4 IPC operations is not useful, as this tells us too
little about the semantics and constraints.

The RPC context described above (for capabilities) is just one
possible IPC context.  There are others possible, and I have tried
hard to define at least one other (for notifications), but failed.
The Hurd's requirement that no mutual trust exist between
communicating tasks rules out a lot of possible variations for IPC.

If you have two cooperative, trusting tasks, you can do a lot more
things, which are still useful.  In a Hurd server-client context,
however, it's much more restrictive.

> > Note that you can also cancel
> > the IPC from 1 and let that thread handle the request for more
> > notifications in its next request (this makes things more
> > synchronous, and may be desirable or not).
> 
> If the IPC is cancelled while waiting (so task is not in an IPC), would 
> task get notified?  Or would its reply IPC return immediately with error
> set?  That doesn't make sense to me.  I'm probably misunderstanding you, 
> could you please clarify?

Yes, actually, cancellation is a complex operation.  And there are
several cancellations involved.

I was thinking of the following: You have a single thread managing
task death stuff.  This thread will communicate with task, and usually
sit in a blocking RPC.  When a task death notification is received, it
will use callbacks to inform other subsystems like the cap system.

As you proposed, you can use RPCs to task to ask for more task death
notifications.  Or you could let the single task-death-manager-thread
do it (this is what I meant - it's just an option, not necessarily the
better one).  You would then do some magic (add your task to a list or
whatever) and unblock the manager thread to have it take a look at it.
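
As a sketch (task_death_wait and run_death_callbacks are made-up
names for the task server RPC stub and the callback dispatch):

#include <pthread.h>

/* Hypothetical stubs.  */
extern int task_death_wait (unsigned int *dead_task);
extern void run_death_callbacks (unsigned int dead_task);

static void *
task_death_manager (void *arg)
{
  unsigned int dead_task;

  for (;;)
    {
      /* Sit in a blocking RPC to task, asking for the next task
         death.  */
      if (task_death_wait (&dead_task))
        continue;               /* E.g. ECANCELED: we were unblocked.  */

      /* Use callbacks to inform other subsystems, like the cap
         system.  */
      run_death_callbacks (dead_task);
    }
  return NULL;
}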

Unblocking means canceling, specifically: pthread_cancel.  Everything
I say now is generic, and in fact much more interesting than the
details of the implementation of a task death notification manager
thread.

When a thread in an on-going IPC operation is cancelled, the IPC needs
to be canceled.  This is a completely different thing than cancelling
a thread.  It's an issue of cooperation: The server can not force you
to cancel the IPC.  But if you don't, you let the server think that
you are still blocking on the IPC, and this can lead to unexpected
results (spurious replies at a later time, blocked resources, in
particular the inability to make another RPC to the same server with
this thread!).  And of course, once you cancel the thread and leave
receive state, you will not know if the server sent the reply already
or not.  In short, you are getting into a lot of indeterminacy, and in
fact a server that notices you are not in receive state when it sends
the reply might choose to punish you (by revoking all caps that your
task has access to, for example).

So, it's a good idea to cancel the IPC.  What happens is that instead
of cancelling the thread, pthread_cancel will notice that you are in
an RPC (glibc magic is needed here, in fact, a lot of hairy, dark
voodoo magic).  pthread_cancel will then send an RPC to the server
itself, and ask the server to cancel the IPC operation of that other
thread (see libhurd-cap-server/bucket-manage-mt.c,
HURD_CAP_MSG_LABEL_CANCEL).  The server will then call (in
manage_mt_worker(), line 423 ff.) pthread_cancel on the RPC worker
thread in the server that is currently processing the RPC of that
other client thread!  So, the pthread_cancel request was propagated
quite literally from the client to the server.
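
Stripped to a skeleton, the client side of that propagation does
something like this (all names here are illustrative, not the real
interface):

#include <pthread.h>

struct hurd_thread
{
  int blocked_in_rpc;           /* Currently inside an RPC?  */
  void *rpc_server;             /* Whom the RPC went to.  */
  pthread_t tid;
};

/* Illustrative: send a cancel request for TID's pending RPC (cf.
   HURD_CAP_MSG_LABEL_CANCEL).  */
extern void rpc_send_cancel (void *server, pthread_t tid);

void
cancel_rpc_or_thread (struct hurd_thread *t)
{
  if (t->blocked_in_rpc)
    /* The server will pthread_cancel the worker thread that is
       processing T's RPC.  */
    rpc_send_cancel (t->rpc_server, t->tid);
  else
    pthread_cancel (t->tid);
}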

The whole libhurd-cap-server mess was carefully written in a way that
should (modulo bugs) allow you to call pthread_cancel on a worker
thread that is processing a pending RPC at any time without breaking
anything.  The same care has to be taken when implementing server
stubs, btw!  The canceled worker thread will not actually stop running
and be killed.  We delay cancellation, and make sure that when we
hit cancellation points (i.e., we are blocking or about to block), we
catch the cancellation request and act on it by returning ECANCELED.
This is similar to how it is on Mach right now (and the whole purpose
of hurd_condition_wait in cthreads).  This is stuff that is not yet
fully realized in pthread (and in fact, is quite problematic to
implement on L4, but that's another story).

When the worker thread notices the cancel flag, there are two
options.  Either it is about to block: then it does not block, and
returns ECANCELED instead, which is sent as an error message to the
original RPC invoker (the client thread currently blocking).

Or it is too late to do anything, because the RPC is already fully
processed, or it is not a blocking RPC anyway, so what comes next will
not block.  Then the RPC is just processed normally, and no error is
sent.  (See also the comment in line 603).
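
The worker-side pattern, again only as an illustration in the spirit
of hurd_condition_wait (the canceler sets the flag under the lock
and broadcasts the condition):

#include <errno.h>
#include <pthread.h>

struct worker
{
  pthread_mutex_t lock;
  pthread_cond_t cond;
  int ready;                    /* The event we are waiting for.  */
  int canceled;                 /* Set by the cancel request.  */
};

/* Check the cancel flag at the points where we would block.  */
static int
worker_wait (struct worker *w)
{
  pthread_mutex_lock (&w->lock);
  while (!w->ready)
    {
      if (w->canceled)
        {
          pthread_mutex_unlock (&w->lock);
          return ECANCELED;     /* Reported to the RPC invoker.  */
        }
      pthread_cond_wait (&w->cond, &w->lock);
    }
  pthread_mutex_unlock (&w->lock);
  return 0;
}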

If you read the code, don't get confused.  There is another
cancellation possible in that code, and that is cancelling the
manager in the server (i.e., to stop serving RPCs).  This is related
to the bucket state (GREEN, RED, BLACK, etc.) and stuff like that.
Don't mix them up.

In either case, the RPC will _unblock_, and return to the user (with
an ECANCELED error, or with the normal reply).  Now the client thread
wakes up, and receives the reply, which can be ECANCELED, or the
normal reply.  In the latter case, the cancellation flag of your
thread will still be set (this must be done by pthread_cancel), so
the information that there was a pthread_cancel request is not lost.

This assumes of course that the canceled thread does not use
asynchronous cancellation, which is evil anyway, but deferred
cancellation.  But that is a given anyway, if you want
pthread_cancel to do anything useful at all.

The same RPC cancellation is required to deliver signals, btw.

Again, there is a lot more to say, but this mail is long enough
already :)  If you have more questions, or anything is unclear, ask.

BTW, not much in this mail is Hurd-on-L4 specific.  The cancellation
setup I describe above is basically what is already implemented in
the Hurd on Mach, but with cthreads instead of pthreads.  It also
reflects how to do it with pthread under Mach.

Thanks,
Marcus
