Re: New developer

l4-hurd

From:	Marcus Brinkmann
Subject:	Re: New developer
Date:	Wed, 10 Sep 2003 04:43:45 +0200
User-agent:	Mutt/1.5.4i
On Wed, Sep 10, 2003 at 02:52:39AM +0200, Bas Wijnen wrote:
> Right.  What kind of things need to be done at the moment then?  I
> understood from Marco (on IRC) that the design was about finished.

This is an exaggeration.  We have the core ideas of the design worked out, but
the details, which matter a lot, are often either missing or there are
alternatives to choose from.

We are currently documenting the core ideas and a lot of the detail issues.
(See the doc directory of the hurd-l4 module in the Hurd CVS tree).

> Does
> that mean it's time to start coding, or am I missing some steps?

The design needs to be refined until it compiles ;)

> I
> haven't seen any document with the proposed interfaces of various servers 
> (L4-hurd specific ones I mean), which is something I would expect is 
> needed as part of the design.  Did I miss them, or do they still need to 
> be written?

An interface definition is a very specific piece of information.  To pin
down the details at that level, you really must know how everything is going
to fit together.  This is something that will automatically evolve once we
get confidence in our ideas.

> > I doubt that much can come out of it in terms of real productivity in 
> > these early stages if you don't become a jack of all trades.
> 
> Good :-)  I thought it would be better to start with a smaller part, and 
> work into it,

Well, you can take this approach to get your mind into the issues, but
surely you cannot expect to write code in its final form without
understanding how it will all fit together eventually.

So, understanding the grand design, and then applying it to the various
issues of the operating system, is much more important than knowing exactly
what the task interface is going to be down to the bit level.  Every day
that I think about more issues of POSIX compliance, Hurd specific features
and whatever else you expect from an operating system, I discover more
details I hadn't thought about before, and some of these details really
have the potential to change the overall design a _lot_.

Core issues might be, for example: How are asynchronous operations performed
(asynchronous on the client side?  On the server side?  What does this mean
for resource allocation?  What does it mean for POSIX compliance?  If msync()
is asynchronous on the server, this could affect our memory model in a
fundamental and negative way.  If it is asynchronous on the client side,
what happens to the operation, with respect to POSIX compliance, if the
client dies abruptly?)  What happens to a shared mmap()'ed region when the
task dies?
How can we make sure that advanced features like network transparency and
persistence are not completely impossible to add later?  What do we have to
take care of to make real-time implementations of various interfaces
possible?  What do we have to avoid to allow a high level of performance for
the common operations?  What do we have to avoid to get a high degree of
scalability on SMP systems?  How can we do all this while still allowing the
user to have maximum control over what happens?  How can we give the user
maximum control without compromising security?  How to implement
notifications from the server to the client?  How to implement a somewhat
scalable select/poll?  How can we simplify the protocols in acceptable ways
to reduce code complexity?  How does the current Hurd design need to be
changed to be able to implement it under the new security and robustness
paradigms?
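
To make one of these questions concrete, here is a minimal sketch of what
plain POSIX does today with a shared anonymous mapping when one of the tasks
sharing it is SIGKILLed.  It says nothing about how the Hurd on L4 will
behave; it only pins down the behaviour the design has to reproduce or
improve on.  Python is used for brevity (its mmap and os modules wrap the
corresponding system calls), and the function name is made up for this
illustration.

```python
import mmap
import os
import signal


def shared_region_survives_kill():
    """Fork a child that dirties a shared mapping, SIGKILL it, inspect the page.

    Hypothetical helper for illustration only; assumes a Unix system.
    """
    # fileno -1 requests anonymous memory; on Unix the default flag is MAP_SHARED,
    # so parent and child see the same physical page after fork().
    region = mmap.mmap(-1, mmap.PAGESIZE)
    ready_r, ready_w = os.pipe()

    child = os.fork()
    if child == 0:
        region[:5] = b"dirty"      # write into the shared page
        os.write(ready_w, b"x")    # tell the parent the write has happened
        signal.pause()             # wait here until we are killed
        os._exit(0)                # not reached in practice

    os.close(ready_w)
    os.read(ready_r, 1)            # wait for the child's write
    os.kill(child, signal.SIGKILL)
    _, status = os.waitpid(child, 0)

    data = bytes(region[:5])       # the page is still mapped in the parent
    region.close()
    os.close(ready_r)
    return os.WIFSIGNALED(status), data
```

On a Unix system the call should return (True, b"dirty"): the child gets no
chance to clean up, yet the data it wrote stays visible to the surviving
task.  A user-level VMM has to provide the same guarantee without trusting
the task that died.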

There are more questions like this; the list goes on and on and on.  And all
these questions are much more important than the question of whether
task_create() gets the max number of threads or the utcb area location as
the first argument.  We cannot hope to address all issues in the first
implementation, but we should try to avoid knowingly implementing inferior
solutions.

So, once you have a good idea of what the plan is, the time has come not to
jump to its implementation, but to perform a reality check.  I am personally
in a mixed phase of on-going design, reality check, and some first steps in
getting code out that runs on L4 (to make myself familiar with it, and to
test out the ideas we are already pretty confident about).

It would be good to have more people work on the design and reality check. 
The reality check is actually something rather simple:  Take a copy of
POSIX, one of Richard Stevens' books, the GNU C library manual,
the current Hurd interfaces, or whatever other topic interests you, and pick
a function or topic randomly.  Then try to project this function or topic
onto the GNU Hurd on L4 design, and try to envision how the implementation,
following the current design, would perform in a real world situation.  Try
to pay particular attention to border cases (what happens if a task is
SIGKILLed?  What if the untrusted party in the communication is malicious? 
What happens to system resources?  What about side effects?)
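
As one small instance of such a border case, the sketch below shows what a
server on an existing Unix system observes when a client is SIGKILLed in the
middle of a request sent over a local socket.  It is only meant to record
the baseline behaviour that any new design is measured against; the
function name and the request bytes are invented for the illustration.

```python
import os
import signal
import socket


def server_view_of_client_death():
    """Return (bytes received before the kill, bytes received after it).

    Hypothetical helper for illustration only; assumes a Unix system.
    """
    server_end, client_end = socket.socketpair()
    request = b"half a requ"           # deliberately incomplete request

    child = os.fork()
    if child == 0:
        server_end.close()
        client_end.sendall(request)    # client sends part of a request...
        signal.pause()                 # ...and then hangs until it is killed
        os._exit(0)

    client_end.close()                 # the server must not hold the client end
    received = b""
    while len(received) < len(request):
        chunk = server_end.recv(64)
        if not chunk:
            break
        received += chunk

    os.kill(child, signal.SIGKILL)
    os.waitpid(child, 0)

    # The kernel closed the dead client's socket, so the server sees EOF.
    after_death = server_end.recv(64)
    server_end.close()
    return received, after_death
```

Here the kernel tells the server about the death by closing the connection.
The server is left with half a request and has to decide what to do with it,
which is exactly the kind of question to ask of each interface.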

Once you stumble upon an interesting topic, problems will pop out of nowhere
quickly.  Then the interesting part of the work should happen: Re-designing
to solve problems.  Sometimes problems can be discussed away, simply by
giving a new interpretation to what you see (for example, what a client's
responsibilities are).  At other times, you need to change the design in
radical and new ways.

An example: I didn't think much about pipes at all initially, assuming they
were going to be implemented as socketpair() over a Unix domain socket, using
a pflocal server as in the Hurd now.  However, I then realized that we
cannot have a global pflocal server, because of descriptor passing: A server
cannot accept capabilities (which are behind file descriptors) from
untrusted clients.  In fact, the ability to "hide" file descriptors in a
local socket is one of Unix's design flaws.  So, can we do better than Unix,
and better than the current Hurd on Mach?  In fact, we _must_ do better as
the old solution is not desired on L4.  So, I changed the design to let
every user have their own pflocal server.  glibc can even start a pflocal
server just for the task, if the user doesn't happen to have one installed
(in ~/.hurd-servers/socket/2 for example).  This is a sane solution, because
now the pflocal server can trust the user and accept capabilities on its
behalf.  And it is also secure from the system's point of view, as the
pflocal server is a normal user task.  Whether this is also secure once you
start to consider communication over the socket or pipe with _other_ tasks,
possibly owned by a different user, is left as an exercise for the reader :)
And the other exercise is: Can we possibly optimize the pipe implementation
by not using a pflocal server but shared memory?
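
For readers who have not met descriptor passing before, here is a minimal
sketch of the Unix feature in question, using SCM_RIGHTS ancillary data over
a socketpair().  It shows the existing Unix mechanism only, not the pflocal
server or any Hurd on L4 interface.  The helper names are made up, and both
ends live in one process purely to keep the example short.

```python
import array
import os
import socket


def send_fd(sock, fd):
    # One byte of ordinary data carries the descriptor as ancillary data.
    sock.sendmsg([b"F"],
                 [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", [fd]))])


def recv_fd(sock):
    fds = array.array("i")
    _, ancdata, _, _ = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any truncated trailing bytes.
            fds.frombytes(data[:len(data) - (len(data) % fds.itemsize)])
    return fds[0]


def demo_descriptor_passing():
    """Hypothetical demo: pass the read end of a pipe through a local socket."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()

    send_fd(left, read_end)
    # The sender drops its copy; the descriptor now lives only "inside" the
    # socket until somebody receives it.
    os.close(read_end)

    received = recv_fd(right)
    os.write(write_end, b"hello")
    data = os.read(received, 5)

    for fd in (write_end, received):
        os.close(fd)
    left.close()
    right.close()
    return data
```

Between send_fd() and recv_fd() the pipe is kept alive only by the message
queued in the socket.  Whoever implements the socket has to hold that
reference on behalf of the sender, which is why it matters whether the
socket server can trust its clients.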

The whole issue of pflocal came to my attention when I noticed the
descriptor passing feature while browsing one of Richard Stevens' excellent
books.  I knew about descriptor passing long before that; it just hadn't
occurred to me that we need to pay attention because the old solutions
wouldn't work.  A lot more work can be done by simply becoming aware of such
issues and pondering them.

Here are a couple of interesting functions that we will need to give special
attention:

mmap(), with all its parameters (a change in the parameters dramatically
changes the semantics for our user-level VMM system!)
munmap(), for all these cases
msync(), again, for all cases
fork(), I haven't thought about that one at all yet, and it is such an
 important function!
exec functions are pretty much solved, and very interesting
read(), write() and buffer exchange semantics: can we use string items? 
 mappings? for which buffer sizes does it pay off to use direct IPC instead
 of physmem containers?  this will require benchmarks of course, but before
 that we need to know what our options are.
stream I/O: can we make our design so that the stream I/O functions can be
 optimized intelligently?
pthread: simple things like suspending and resuming all threads in a task
 are an interesting challenge on L4, because only local threads can stop
 other threads.  So we will need to extend pthread to be able to glue in
 a pthread_suspend_all_threads_but_this_one with thread creation and
 destruction.
kill(): signal handling is a whole issue in itself, in particular the
 cancellation of RPCs.  This has a huge potential impact on the IPC design.
record locks: you can get the PID of the lock holder from the filesystem
 (one of the more exotic POSIX features).  on the Hurd on L4: Should the
 filesystem return the PID, or better the task ID?  And why?  This leads you
 straight into the synchronization issues of task ID protection and reuse.
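
To illustrate the record lock item, here is a small sketch of the POSIX
feature as it exists today: fcntl() with F_GETLK reports the PID of the
process holding a conflicting lock.  Note the loudly stated assumption: the
struct flock layout below is the one used by 64-bit Linux with glibc, and
other systems order the fields differently.  The helper names are invented
for this example.

```python
import fcntl
import os
import struct
import tempfile

# ASSUMPTION: struct flock as laid out on 64-bit Linux/glibc:
#   short l_type; short l_whence; off_t l_start; off_t l_len; pid_t l_pid
FLOCK_FORMAT = "hhqqi"


def lock_holder_pid(fd):
    """Return the PID holding a conflicting lock on fd, or None if unlocked."""
    query = struct.pack(FLOCK_FORMAT, fcntl.F_WRLCK, os.SEEK_SET, 0, 0, 0)
    reply = fcntl.fcntl(fd, fcntl.F_GETLK, query)
    l_type, _, _, _, l_pid = struct.unpack(FLOCK_FORMAT, reply)
    return None if l_type == fcntl.F_UNLCK else l_pid


def demo_record_lock():
    """Hypothetical demo: a child locks a file, the parent asks who holds it."""
    with tempfile.TemporaryFile() as f:
        ready_r, ready_w = os.pipe()
        done_r, done_w = os.pipe()

        child = os.fork()
        if child == 0:
            os.close(ready_r)
            os.close(done_w)
            fcntl.lockf(f.fileno(), fcntl.LOCK_EX)  # lock the whole file
            os.write(ready_w, b"L")                 # tell the parent
            os.read(done_r, 1)                      # hold the lock until EOF
            os._exit(0)

        os.close(ready_w)
        os.close(done_r)
        os.read(ready_r, 1)                         # wait until the lock is held
        holder = lock_holder_pid(f.fileno())
        os.close(done_w)                            # let the child exit
        os.waitpid(child, 0)
        os.close(ready_r)
        return child, holder
```

Under the stated layout assumption, the two returned values should be equal.
The filesystem is expected to name the lock holder by PID, so on the Hurd on
L4 something has to map between whatever the filesystem sees (a task ID,
say) and the PID that POSIX callers expect.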

Well, this is only a selection.  There is much more; I just wrote down
whatever crossed my mind first, for example because it is something I
thought about recently.  POSIX has 1000 pages, and many of them pose some
interesting questions in the context of this port.

> > Well, you should read and understand the L4 specification, and the glibc 
> > and the Hurd source code.
> 
> Right.  I'm almost through the L4 specification, you'll hear from me again 
> in 5 years, when I'm done with all the source code ;-)

Lol, that's the spirit :)  With any luck, we'll still be around.

Thanks,
Marcus

-- 
`Rhubarb is no Egyptian god.' GNU      http://www.gnu.org    address@hidden
Marcus Brinkmann              The Hurd http://www.gnu.org/software/hurd/
address@hidden
http://www.marcus-brinkmann.de/