Re: Questions


From: Martin Schaffner
Subject: Re: Questions
Date: Sat, 30 Oct 2004 02:16:28 +0200


On 29.10.2004, at 22:31, Marcus Brinkmann wrote:

Martin Schaffner <address@hidden> wrote:
What happens if a task that has to return extra pages does not get the
chance to do so, because it does not get any time slice?

You could have physmem interact with the scheduler at a pretty low
level to guarantee that the task had a fair chance and amount of cpu
cycles.

I don't like this, because:
* It seems more complicated than the following ideas
* The application/library developer would still have to design an algorithm that executes within some arbitrary time/CPU-cycle limit on every architecture
* The following reasoning:

The policy decisions about which page will be given back will be based on some data. The decision will only change if this data changes. The task will immediately be aware of any changes to this data. Given these assumptions, a task would want to keep a list of non-essential pages that it updates whenever this data changes, so that it can quickly tell physmem which page it wants to give back. Now, instead of telling physmem which page it wants to give back, the task could share the list directly, which removes the need to schedule the task when physmem wants to reclaim extra pages.
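
For concreteness, here is a rough sketch in C of the kind of per-task list I have in mind (all names and sizes are invented; the real physmem interface could look entirely different):

    #include <stdint.h>

    /* One page's worth of "extra" (non-essential) pages, ordered by the
       task's own reclaim priority: the page it is most willing to lose
       comes first.  The task updates this whenever its policy data
       changes; physmem only ever reads it.  */

    #define EXTRA_LIST_SLOTS 500

    struct extra_page_list
    {
      uint32_t count;                    /* number of valid entries */
      uintptr_t page[EXTRA_LIST_SLOTS];  /* page addresses, most
                                            expendable first */
    };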

Let me suggest a model to you: Imagine every task would have a
dedicated page on which it records all pages it considers to be extra.
The page would be logically shared with physmem (ie, physmem would
know which page it is), and could access and read it, and know which
pages it lists.  On the page is a shared mutex that must be locked
when updating the page.

So, when the user wants to modify the list of extra pages (resort them
to vary the priority, or add/remove/exchange entries), it takes the
lock and performs the operation.  If physmem wants to read the list,
it takes the lock with a timeout.  If the timeout expires, the task is
in violation of the protocol.

Disadvantage: it is still (too) easy for the task to get itself killed if it violates the protocol by modifying this page too slowly. But at least we can be sure that the task has a time slice when it starts the critical operation.
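
To make sure I understand the model, here is how I picture physmem's side of it. I'm using a POSIX process-shared mutex and pthread_mutex_timedlock purely as a stand-in for whatever lock primitive the real protocol would put on the shared page; the names and sizes are invented as well:

    #include <pthread.h>
    #include <stdint.h>
    #include <time.h>

    #define EXTRA_LIST_SLOTS 500

    /* The dedicated page, logically shared between the task and physmem.
       The mutex lives on the page itself and would have to be initialized
       with the PTHREAD_PROCESS_SHARED attribute.  */
    struct extra_page_sheet
    {
      pthread_mutex_t lock;
      uint32_t count;
      uintptr_t page[EXTRA_LIST_SLOTS];
    };

    /* physmem side: read up to MAX entries, but never wait forever.  If
       the task sits on the lock past the deadline, it is in violation of
       the protocol and can be punished.  */
    static int
    physmem_read_extra_pages (struct extra_page_sheet *sheet,
                              uintptr_t *out, uint32_t max, long timeout_ms)
    {
      struct timespec deadline;
      clock_gettime (CLOCK_REALTIME, &deadline);
      deadline.tv_sec += timeout_ms / 1000;
      deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
      if (deadline.tv_nsec >= 1000000000L)
        {
          deadline.tv_sec += 1;
          deadline.tv_nsec -= 1000000000L;
        }

      if (pthread_mutex_timedlock (&sheet->lock, &deadline) != 0)
        return -1;                       /* timeout: protocol violation */

      uint32_t n = sheet->count < max ? sheet->count : max;
      for (uint32_t i = 0; i < n; i++)
        out[i] = sheet->page[i];

      pthread_mutex_unlock (&sheet->lock);
      return (int) n;
    }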

Or, another model: Use some shared memory protocol to update the data.
Then physmem and the task can access it concurrently without locking.
When physmem does not find enough extra pages on the page, the task is
in violation of the protocol.  In this case, no locking takes place,
so no timeout is needed.  The task would have to make sure that the
list always contains at least as many pages as its extra page count.

I guess this could be done as long as the task's modification of a single list entry on the shared page is atomic (otherwise physmem could read half-updated entries and decide that the task is violating the protocol by not providing pages it owns).
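
With C11-style atomics, such a single-entry update would be just one atomic store, and physmem could scan the list without any lock, checking only the invariant that enough pages are listed. A sketch, using a lock-free variant of the invented layout above (no mutex this time):

    #include <stdatomic.h>
    #include <stdint.h>

    #define EXTRA_LIST_SLOTS 500

    /* Lock-free variant: every slot and the count are written with
       single atomic stores, so physmem can never read a torn entry.  */
    struct extra_page_sheet
    {
      _Atomic uint32_t count;            /* never less than the task's
                                            extra page count */
      _Atomic uintptr_t page[EXTRA_LIST_SLOTS];
    };

    /* Task side: replacing one entry in place is a single atomic store;
       physmem sees either the old page or the new one, never half of
       each.  */
    static void
    task_replace_entry (struct extra_page_sheet *sheet, uint32_t slot,
                        uintptr_t new_page)
    {
      atomic_store (&sheet->page[slot], new_page);
    }

    /* physmem side: no lock, no timeout.  If fewer entries are listed
       than the task's extra page count, the task violates the
       protocol.  */
    static int
    physmem_scan (struct extra_page_sheet *sheet, uint32_t extra_count,
                  uintptr_t *out)
    {
      if (atomic_load (&sheet->count) < extra_count)
        return -1;                       /* protocol violation */
      for (uint32_t i = 0; i < extra_count; i++)
        out[i] = atomic_load (&sheet->page[i]);
      return 0;
    }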


I haven't thought about such a shared memory protocol yet, but my gut
feeling is that it is theoretically possible.  To actually update the
list, it may be necessary for the task to first add the new item, then
remove the old one (so that the total count never drops below the
extra page count).  But that is an acceptable restriction in my
opinion, especially if you first reorder things

The reordering needs to be atomic as well...

 so that the old page which
you want to remove comes last, and add the new one at a higher
priority: Then physmem will never pick that one, as it will only pick
as many as your total extra page count.

I guess you'd have to ensure that both pages are non-essential at the time you start the operation. If both pages stay non-essential for the duration of the operation, it should not matter how it is done. I'm assuming that the operation is fairly short, so this is not much of a restriction on the task.
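
To spell out that ordering with the same invented lock-free layout: the task first parks the old page at the end of the list (where physmem never looks, since it only picks as many entries as the extra page count), then puts the new page into the old slot, and only then shrinks the list, so the count never drops below the extra page count at any point:

    #include <stdatomic.h>
    #include <stdint.h>

    #define EXTRA_LIST_SLOTS 500

    struct extra_page_sheet
    {
      _Atomic uint32_t count;
      _Atomic uintptr_t page[EXTRA_LIST_SLOTS];
    };

    /* Swap OLD_SLOT for NEW_PAGE without the listed-page count ever
       dropping below its starting value N.  Assumes there is still a
       free slot at the end of the array.  */
    static void
    task_swap_extra_page (struct extra_page_sheet *sheet, uint32_t old_slot,
                          uintptr_t new_page)
    {
      uint32_t n = atomic_load (&sheet->count);

      /* 1. Duplicate the page we want to retire at the very end, past
         the first N entries that physmem may pick from.  */
      atomic_store (&sheet->page[n], atomic_load (&sheet->page[old_slot]));
      atomic_store (&sheet->count, n + 1);

      /* 2. Put the new page into the now-redundant slot: the first N
         entries stay distinct and valid throughout.  */
      atomic_store (&sheet->page[old_slot], new_page);

      /* 3. Drop the parked old page; the count returns to N.  */
      atomic_store (&sheet->count, n);
    }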

This is a new idea by me, so we have not thought about it yet.  But
it's one possible approach, that has the right semantics and doesn't
require a timeout.  So, we are still looking at solutions to this.

Could the same protocol be used as for Unix shared memory?
Will containers be used for Unix shared memory?


Which task owns the version specifier of a container? What makes sure
that a container's version is incremented on container_copy_in? Is this
done by physmem, or by another trusted server?

I am not sure the version specifier is still needed or even part of
the design.

What replaces it?

For lock_physical, a device driver has to be privileged. Is it still
possible to have device drivers started by users? This could be useful
for example for USB devices.

No.  We don't have any particular ideas about untrusted drivers.  You
could write a trusted driver that allows communication with untrusted
drivers, though, if you have ideas on how to do that safely.

Similar concepts could be used as those behind user-space USB "drivers" in existing OSes, where not all USB devices have a kernel-level driver: some are driven directly by an application.

If no, is it therefore impossible to directly store a linked
list in a container?

If you mean whether it is possible to store a linked list in a container
and _share_ it, then only if you always map the container at the same
virtual address.

Is this virtual address available without asking the owning task for it?

But containers are not intended to be used in this way in the first place.

Why not? The non-essential-pages page could for example use this structure. Couldn't we just do what we have to do with Unix shared memory, i.e. make the pointers relative to the start of the shared memory/container?
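
For illustration, such relative pointers could look like this in C (byte offsets from the start of the mapping instead of absolute addresses; all names invented):

    #include <stddef.h>
    #include <stdint.h>

    #define NIL_OFFSET ((uint32_t) -1)

    /* A list node inside the shared region.  Links are byte offsets
       from the start of the region, so they stay valid no matter at
       which virtual address each task maps the container.  */
    struct shm_node
    {
      uint32_t next;                   /* offset of next node, or NIL_OFFSET */
      uint32_t value;
    };

    /* Turn an offset into a pointer within this task's own mapping.  */
    static struct shm_node *
    node_at (void *base, uint32_t offset)
    {
      return offset == NIL_OFFSET
        ? NULL : (struct shm_node *) ((char *) base + offset);
    }

    /* Example traversal: sum all values, starting from HEAD_OFFSET.  */
    static uint32_t
    sum_list (void *base, uint32_t head_offset)
    {
      uint32_t sum = 0;
      for (struct shm_node *n = node_at (base, head_offset); n != NULL;
           n = node_at (base, n->next))
        sum += n->value;
      return sum;
    }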

Can Hurd tasks use L4 IPCs without restrictions (for example for a
single notification), or are they limited to hurdish communication?

Technically, they can use direct L4 IPC.  In general, there are good
reasons to stick to Hurdish RPCs, because those have certain
properties (like, they can be canceled), and it is difficult enough to
implement one IPC mechanism correctly.  Hopefully, the Hurdish RPCs
are generic and fast enough to be usable for a variety of things.
However, nothing stops you from having private protocols and
implementations that are not hurdish, and there may be good reasons
for that (like, if you have two tasks which trust each other, you can
use direct IPC for better performance, if normal shared memory and
hurdish RPC are not fast enough for your application).

Good!

Thanks,
Martin




