Re: bug in task server startup code


From: Bas Wijnen
Subject: Re: bug in task server startup code
Date: Thu, 07 Oct 2004 21:49:46 +0200
User-agent: Mozilla Thunderbird 0.8 (X11/20040926)

Now that I read this again, I remember why the error needed to be handled. In my previous mail I had forgotten about the map items, and thought the error could just as well be ignored.

Marcus Brinkmann wrote:
> At Tue, 17 Aug 2004 13:11:30 +0200,
> Bas Wijnen <address@hidden> wrote:

>> I did some more testing, added output code to wortel/startup.c (which is
>> the startup code of all tasks started by wortel except physmem, so only
>> the task server at the moment) and tried to see if the mappings it
>> requests from physmem (its startup and memory container) arrive with
>> their correct data (I added some print statements to physmem as well.)
>>
>> Well, they don't.  startup.c pagefaulted on the check, so I checked the
>> result of the IPC which received the mapping.  It failed with error code
>> 9, meaning "message overflow in the receive phase.  A message overflow
>> can occur [string related stuff] and if a map/grant of an fpage fails
>> because the system has not enough page-table space available."
>> (L4 Reference Manual X.2 page 62)


> Well, first of all, you should check if the mappings are at least
> correct (or reasonable).  That's an important sanity check because
> there might just be a bug in determining the fpages and their load
> addresses.
>
> However, an error 9 is interesting indeed.

It was some time ago, but I remember all kinds of weird things happening, such as different behaviour when debugging output was added to code paths that were never called. So I'm not really sure what it was actually doing, and perhaps the error code was changed somewhere between occurring and being reported.
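
For what it's worth, the 9 decodes neatly: bit 0 of the ErrorCode TCR gives the phase that failed (1 = receive) and bits 1-3 give the error class (4 = message overflow), so 9 = (4 << 1) | 1 is exactly a receive-phase message overflow. A minimal sketch of the check, assuming Pistachio's libl4 convenience interface (the server thread id is just a placeholder):

    #include <l4/ipc.h>

    /* Return nonzero if a call to SERVER failed with a receive-phase
       message overflow (ErrorCode 9).  */
    static int
    receive_overflowed (L4_ThreadId_t server)
    {
      L4_MsgTag_t tag = L4_Call (server);
      if (L4_IpcSucceeded (tag))
        return 0;
      L4_Word_t err = L4_ErrorCode ();
      /* Bit 0: failing phase (1 = receive).  Bits 1-3: error class
         (4 = message overflow).  */
      return (err & 1) && ((err >> 1) & 0x7) == 4;
    }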

> The memory is mapped in fpages, maybe mapping one fpage, the one with
> the addresses you accessed, worked, and another one failed?  Still,
> you would not expect to see wrong data, and in your other mail you say
> the offset was actually 0x1000 off or so, which would indicate to me a
> bug in the ELF loader/startup mapping stuff.

I thought of that, too. The code is at its correct position in wortel, but there seems to be something wrong with the mapping. However, I have also seen it work as expected, which to me points to a buffer overflow (which is what brought me to valgrind ;-) ).
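
One thing I keep in mind while checking this: the receiver only gets what fits into the acceptor's receive window, so a too-small or misaligned window makes the kernel clip the mapping, which could also show up as data at the wrong offset. A rough sketch of the receiving side, again assuming Pistachio's libl4 (physmem, WINDOW_BASE and WINDOW_SIZE are placeholders):

    #include <l4/ipc.h>
    #include <l4/message.h>

    /* Accept map/grant items into a fixed window, make the call, and
       pull the first typed item out of the reply.  */
    L4_Accept (L4_MapGrantItems (L4_Fpage (WINDOW_BASE, WINDOW_SIZE)));
    L4_MsgTag_t tag = L4_Call (physmem);
    if (L4_IpcSucceeded (tag))
      {
        L4_Msg_t msg;
        L4_MapItem_t map;
        L4_MsgStore (tag, &msg);
        L4_MsgGetMapItem (&msg, 0, &map);
      }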

> The best way to track such things down is to track them down, line by
> line, instruction by instruction.  It's slow work, but you can learn
> something about the kernel debugger along the way :)


>> Failing IPCs may corrupt the database of a capability server.  In this
>> case, physmem thinks task has received the pages, because it is not
>> notified of the failed IPC.


> This needs some consideration.  I think you have found one of the few
> cases (maybe even the only one) where sending an IPC can actually
> fail without either the sender or receiver being at fault (in a broad
> sense).  OTOH, if there is no room for page tables anymore, you are in
> deep shit.  Might as well panic and reboot at that point.
>
> Mapping memory is to be considered a restricted operation.  We can
> enforce that by using redirectors.

Eh, I don't see the problem. What's wrong with threads mapping their memory to other threads? If they want other threads to be able to access their memory, they can simply map it for them. When the pages are unmapped, they are unmapped recursively, so a thread cannot keep a mapping by giving it away. Also, if physmem keeps quotas, mapping memory to other threads (or even granting it) yourself does not give you the right to an extra page, so that is not a problem either.
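
To illustrate the recursive unmap: flushing the fpage at physmem revokes it from every address space it was mapped or re-mapped into further down the chain. A one-line sketch, assuming Pistachio's libl4 (addr and size are placeholders):

    #include <l4/space.h>

    /* Revoke all rights; the unmap recurses through every space the
       page was handed on to.  */
    L4_Flush (L4_FpageAddRights (L4_Fpage (addr, size),
                                 L4_FullyAccessible));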

> For all other IPC failures I can think of, the story is actually quite
> simple: It's either a programming bug in the server (fix the bug then)
> or the fault of the client.

Agreed.

> If it is not clear to you why it is always the client's fault, I can
> explain further.  Let me know which case you are interested in (simple
> IPC, string items, map items).

One thing about string items: I recall you saying (in an e-mail or in comments in the code, I don't remember which) that they may be supported at some point. That surprised me, because the reference manual specifically states that page faults during a string IPC can lock up both sender and receiver (with a malicious pager on either side). I expected that to be enough reason not to use them in the Hurd, because there is no mutual trust (other servers can use them of course, just not Hurd servers).
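
As far as I can see, the only defence the API offers there are the xfer timeouts: with a zero timeout, any page fault during the string copy aborts the IPC instead of blocking on an untrusted pager. A sketch, assuming Pistachio's libl4:

    #include <l4/ipc.h>

    /* Abort the string copy on any page fault rather than wait for a
       (possibly malicious) pager on either side.  */
    L4_Set_XferTimeouts (L4_Timeouts (L4_ZeroTime, L4_ZeroTime));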

Thanks,
Bas
