Re: [ANNOUNCE] Introducing Codezero

From: Bahadir Balban
Subject: Re: [ANNOUNCE] Introducing Codezero
Date: Tue, 07 Jul 2009 11:19:09 +0300
User-agent: Thunderbird (X11/20090608)

Bas Wijnen wrote:

On Sat, Jul 04, 2009 at 01:40:03AM +0200, Ludovic Courtès wrote:
One could have "ctl" and "data" files as in Plan 9, to implement
driver control and data flow (or use extensible file attributes)

File-based API should work for most services such as communication
protocols, drivers, console etc. Please elaborate on why you oppose it.
I suppose Bas is referring to the fact that Plan 9 ends up doing a lot
of possibly costly marshalling/unmarshalling on `ctl' nodes (see [0] for
an example).

I don't really know much about Plan 9, so this isn't what I meant.
However, I had heard this and don't like that either.  But it's not an
essential part of the "all communication is done on file-style objects"
idea.

What I meant is that there are at least two major modes of
communication.  One is stream-based and is used for files and tcp/ip,
for example.  When writing or reading, a sent packet may end up as two
received packets and vice versa.

The other mode is with packets.  Udp/ip uses this, and so do most device
drivers.  For files, something similar is possible when using mmap.  For
general communication between threads, I would expect this to be the
main mode, in fact.  As a comparison, in the usb protocol, there is:
- interrupt and control traffic: packet-based, small chunks.
- bulk traffic: packet-based, but often interpreted as a stream.  Larger
  chunks.
- isochronous traffic: stream based.  (Like tcp/ip, the underlying
  protocol uses packets, but packet boundaries need not be preserved.)
  Relatively huge throughput.

Using these terms for ipc, I think most communication channels would be
control traffic, sending each other command messages and replies.
When using a stream where you cannot be sure that an entire command is
transmitted in one packet, you will need to do buffering and checking,
only executing the command when it is completely received.  Given the
amount of communication in a microkernel-design, I expect this to be a
major cost.  I would therefore always require commands to be sent in one
go, and provide a guarantee that packet boundaries are preserved (or at
least never created by the transport layer)[0].  Of course, this does not
require a redesign of the system, and you may well have these
requirements already in your definition of file.
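To make the "buffering and checking" cost concrete, here is a minimal
Python sketch of what a receiver has to do when commands arrive over a
stream that does not preserve packet boundaries. The 4-byte length prefix
and the names CommandFramer and frame are assumptions made for the
illustration; nothing in the thread specifies a framing format.

```python
import struct


class CommandFramer:
    """Reassembles length-prefixed commands from an arbitrary byte stream."""

    # Assumed framing: 4-byte big-endian length, then the command bytes.
    HEADER = struct.Struct("!I")

    def __init__(self):
        self._buf = bytearray()

    def feed(self, chunk):
        """Add received bytes; return every command that is now complete."""
        self._buf += chunk
        commands = []
        while len(self._buf) >= self.HEADER.size:
            (length,) = self.HEADER.unpack_from(self._buf)
            end = self.HEADER.size + length
            if len(self._buf) < end:
                break  # command still incomplete; keep buffering
            commands.append(bytes(self._buf[self.HEADER.size:end]))
            del self._buf[:end]
        return commands


def frame(command):
    """Prefix a command with its length so the receiver can find its end."""
    return CommandFramer.HEADER.pack(len(command)) + command
```

Every chunk goes through the buffer and the length check before a command
can be executed. With a transport that delivers each command whole, this
class would not be needed at all.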

Hi there,

Leaving file-based IO aside, what I think you are suggesting here is
that data buffering may result in costs in general. I agree with this.
But when I say file-based IO or any other OS-design related concept,
I always mean something implemented in userspace. Currently with
Codezero you may pass up to 2KB of data during a single IPC, which is not
buffered by the kernel. But when implementing calls for larger data
transfers such as read/write (which are stream-based by your
definition), the pager maps the client buffers into its address space and
fills them in. If data is not available, the client blocks; if it's a
non-blocking call, the server may return EAGAIN, etc. The microkernel is
not involved in any of this.
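As a rough illustration of the non-blocking case, the following Python
sketch models a userspace server that fills a client buffer it has access
to. StreamServer, supply and read are hypothetical names, not Codezero
APIs, and the blocking path (suspending the client until data arrives) is
left out.

```python
import errno


class StreamServer:
    """Hypothetical userspace server holding data for a client to read."""

    def __init__(self):
        self._data = bytearray()

    def supply(self, data):
        """Make more data available, e.g. after a device delivered it."""
        self._data += data

    def read(self, client_buf):
        """Non-blocking read: fill client_buf, return the byte count.

        Returns -errno.EAGAIN when no data is available, following the
        unix-like convention mentioned above.
        """
        if not self._data:
            return -errno.EAGAIN
        n = min(len(client_buf), len(self._data))
        client_buf[:n] = self._data[:n]  # write directly into the client buffer
        del self._data[:n]
        return n
```

The server writes straight into the buffer supplied by the client, so no
intermediate copy is held on the client's behalf.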

On packet-based communication such as commands, data transfer is
trivial, i.e. during the ipc rendezvous there is usually enough space to
also pass along the packet. So the packet boundary is met naturally
during a single ipc.
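A small Python sketch of that property, under loud assumptions: Endpoint
is a hypothetical name, a queue stands in for the synchronous rendezvous,
and 2048 bytes is only my reading of "up to 2KB".

```python
from collections import deque

MAX_IPC_PAYLOAD = 2048  # "up to 2KB"; the exact byte count is an assumption


class Endpoint:
    """Hypothetical message endpoint; a deque stands in for the rendezvous."""

    def __init__(self):
        self._pending = deque()

    def send(self, packet):
        # A packet either fits in one IPC or is rejected; it is never split.
        if len(packet) > MAX_IPC_PAYLOAD:
            raise ValueError("packet too large for a single IPC")
        self._pending.append(bytes(packet))

    def receive(self):
        # Each receive returns exactly one packet, as it was sent.
        return self._pending.popleft()
```

Because one send corresponds to one receive, the receiver never has to
search for the end of a command.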

On the other hand, using packet-based communication for everything isn't
such a good idea either.  While most device driver and other "control"
traffic is normally in packets, much data traffic is not.  So you would
need to add all the buffering stuff for every stream in the system.  Now
I think there's not much you can do about that.  A stream does not have
a limited size, and new data can keep coming in.  If the kernel would
support it as a stream, it would need to do the buffering itself.  I
don't think I've ever seen an idea to organize streams, which was not
based on packets underneath.  However, in streams it is acceptable to do
partial operations ("half of the data was successfully written, please
retry the rest"), or merge two packets into one.  As I wrote above,
those operations cost a lot when both sides of the communication really
want to be looking at packets.

I think my above explanation is relevant here, too. In Codezero packets
may be passed in one go by a single IPC. Any bulk streaming transfer may
be done in partial operations in userspace, based on a unix-like system
call protocol between the client and the server.
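On the client side, partial operations could look roughly like the Python
sketch below. write_all and ChunkedSink are hypothetical names; the only
convention borrowed from unix is that a write call returns how many bytes
were actually accepted.

```python
def write_all(write, data):
    """Call write() repeatedly until all of data has been accepted."""
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        n = write(view[sent:])  # may accept fewer bytes than offered
        if n <= 0:
            raise OSError("server accepted no data")
        sent += n
    return sent


class ChunkedSink:
    """Hypothetical server that accepts at most `limit` bytes per call."""

    def __init__(self, limit):
        self.limit = limit
        self.received = bytearray()
        self.calls = 0

    def write(self, buf):
        self.calls += 1
        n = min(self.limit, len(buf))
        self.received += buf[:n]
        return n
```

For example, write_all(ChunkedSink(3).write, b"hello world") needs four
calls to move the eleven bytes, and the retry loop lives entirely in the
client.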

I conclude therefore, that packets and streams are two separate forms of
communication, and that both have their own uses.  Using one system for
both types cannot give you maximum performance.  And with a microkernel
system, you need maximum performance when it comes to IPC, because it
happens so much.

Both methods are provided separately in Codezero, but I agree that
generally, context switches will cause some performance degradation,
compared to a monolithic implementation.

Finally, a note about networking: with ip, there is a stream and a
packet-based protocol (tcp and udp).  However, udp also implies
"unreliable": the order of packets may change, and packets may be lost.
This has great advantages for the routing hardware.  I don't think it's
useful to have anything like that for communication within one computer,
but it can be considered.


Well, anything is possible :) You could well implement this behaviour in
the userspace server that deals with this kind of communication.

Bahadir Balban
