Neal H. Walfield
Tue, 28 Dec 2004 13:59:39 +0000
The Hurd's physical memory server (physmem) starts by grabbing all of
the memory from sigma0 in as large mappings as is possible. For
instance, if the system has 192 MB of ram (ignoring that L4 and sigma0
reserve some memory for themselves and that there may be holes in the
physical memory map), physmem would end up with 2 fpages: one 128 MB
in size starting at address 0 and one 64 MB in size starting at
address 128MB. (Although the real layout is not this simple, the number
of mappings is still relatively small even when the above exclusions
are taken into account.) Clearly, physmem will only allocate pieces of these large
fpages to clients by "separat[ing] the fpage and map[ping] fpages
(objects) of smaller size" (L4 reference manual section 4.1).
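To make the 192 MB example concrete, here is a rough sketch of how a region can be carved into naturally aligned power-of-two pieces, taking the largest possible piece at each step. The struct and function names are made up for illustration and are not physmem's or L4's; the sketch also ignores the memory that L4 and sigma0 reserve and any holes in the memory map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A hypothetical stand-in for an fpage: a power-of-two sized,
   naturally aligned region.  This is not the L4 fpage type.  */
struct region
{
  uint64_t base;
  uint64_t size;
};

/* Split [START, END) into naturally aligned power-of-two regions,
   taking the largest possible region at each step.  Stores at most
   MAX regions in OUT and returns the number of regions needed.  */
static size_t
split_into_fpages (uint64_t start, uint64_t end,
                   struct region *out, size_t max)
{
  size_t count = 0;

  assert (start <= end);
  while (start < end)
    {
      /* Largest power of two permitted by the alignment of START
         (address 0 is aligned to everything)...  */
      uint64_t size = start ? (start & -start) : (UINT64_C (1) << 63);

      /* ...shrunk until it fits in what remains.  */
      while (size > end - start)
        size >>= 1;

      if (count < max)
        out[count] = (struct region) { start, size };
      count++;
      start += size;
    }
  return count;
}
```

Calling split_into_fpages (0, 192 MB, ...) yields the two regions described above: 128 MB at address 0 and 64 MB at address 128 MB.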
On the Hurd, we allow clients to request mappings of size larger than
just the base page size. For instance, a client may allocate 16kb of
physical memory and request a map of it all at once. Assuming the
alignment requirement is satisfied and there is a 16kb block of memory
available, physmem will provide a single 16kb mapitem rather than four 4kb
mapitems.
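As a sketch of what "the alignment requirement is satisfied" could mean in practice, the function below picks the largest fpage size usable for a request. It assumes that both the address of the memory on the physmem side and the address at which the client wants it must be multiples of the fpage size; that reading, and all the names, are mine and not taken from physmem.

```c
#include <assert.h>
#include <stdint.h>

#define BASE_PAGE_SIZE UINT64_C (4096)

/* Return the largest power-of-two fpage size, at least one base page
   and at most LEN, to which both PHYS (where the memory lives on the
   server side) and VIRT (where the client wants it) are aligned.
   Hypothetical helper, for illustration only.  */
static uint64_t
largest_fpage_size (uint64_t phys, uint64_t virt, uint64_t len)
{
  uint64_t size = BASE_PAGE_SIZE;

  assert (len >= BASE_PAGE_SIZE);
  assert (phys % BASE_PAGE_SIZE == 0 && virt % BASE_PAGE_SIZE == 0);

  /* Keep doubling while the next size still fits and both addresses
     remain aligned to it.  */
  while (size * 2 <= len
         && phys % (size * 2) == 0
         && virt % (size * 2) == 0)
    size *= 2;

  return size;
}
```

With a 16kb block at a 16kb-aligned address on both sides this gives 16kb, so a single mapitem suffices. If the client asks for the same block at an address that is only 4kb aligned, it gives 4kb and four mapitems are needed.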
We are running into a problem when the client deallocates the physical
memory. physmem needs to make sure that the client no longer has an
extant mapping, and it cannot trust (most) clients to do an l4_unmap.
Ideally, physmem would just flush the mapping that it gave to that
client. As far as I can tell from my reading of the specification,
that is not possible, for two reasons.

The first problem is that physmem cannot flush mappings given to a
specific task. Thus, if the mapping in question is shared among many
tasks (as shared libraries, for instance, may be and very likely will
be in the case of the C library), then all tasks using the mapping in
question will need to fault it in. This isn't fatal: the tasks will
eventually fault and request a new mapping. A malicious task could,
however, potentially cause a DoS by simply mapping and unmapping the C
library repeatedly, forcing all tasks on the system to fault and
request the mappings again.

The other problem, and the one which is far worse in my analysis, is
that physmem cannot actually flush the 16kb fpage that it gave to the
client: it must flush the fpage that it itself holds, because flushing
only the 16kb piece would be "[p]artially unmapping an fpage [which]
might or might not work" (idem). Assuming that physmem flushes the
containing fpage (either the 128MB or the 64MB fpage in the above
example), a huge percentage of the potential mappings in the system
will be invalidated and most clients
will have to fault them back in again. I expect that memory will be
allocated and deallocated on a regular basis and that this would
result in a huge performance hit.

I understand the restriction that a client cannot partially unmap an
fpage, as it would require splitting the fpage, which can be
complicated. However, the inability to flush an fpage that the server
has given to the client appears to me to be a real problem. This is
not simply a request for a convenience function; I see a lack of a

I was discussing this problem with Marcus and he came up with a
workaround: physmem doesn't need to provide the mappings to the clients
directly. Instead, a proxy address space running code which physmem
can trust can be introduced. When a client requests a mapping,
physmem maps it to the proxy task which in turn maps the memory to the
client. When physmem needs to make sure that the maps are removed, it
need only ask the proxy task to unmap them.
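A rough sketch of the bookkeeping such a proxy might keep is below. It assumes that every mapping handed to a client has its own fpage on the proxy side, so that revoking one never touches another client's mapping. The table, the names and the unmap callback are all hypothetical; none of this is an existing L4 or Hurd interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_GRANTS 64

/* One fpage that the proxy has mapped to a client.  */
struct grant
{
  int in_use;
  uint32_t task;        /* Client task the fpage was mapped to.  */
  uint64_t proxy_addr;  /* Where the fpage lives in the proxy.  */
  uint64_t size;        /* Power-of-two size of the fpage.  */
};

static struct grant grants[MAX_GRANTS];

/* Record that the fpage of SIZE bytes at PROXY_ADDR was mapped to
   TASK.  Returns 0 on success, -1 if the table is full.  */
static int
grant_record (uint32_t task, uint64_t proxy_addr, uint64_t size)
{
  assert (size != 0 && (size & (size - 1)) == 0);
  assert (proxy_addr % size == 0);

  for (size_t i = 0; i < MAX_GRANTS; i++)
    if (!grants[i].in_use)
      {
        grants[i] = (struct grant) { 1, task, proxy_addr, size };
        return 0;
      }
  return -1;
}

/* Revoke every fpage handed to TASK that overlaps [START, START + LEN)
   in the proxy.  UNMAP stands in for whatever the proxy would really
   do (presumably an l4_unmap of exactly the fpage it mapped); it may
   be NULL.  Returns the number of fpages revoked.  */
static size_t
grant_revoke (uint32_t task, uint64_t start, uint64_t len,
              void (*unmap) (uint64_t addr, uint64_t size))
{
  size_t revoked = 0;

  for (size_t i = 0; i < MAX_GRANTS; i++)
    {
      struct grant *g = &grants[i];

      if (!g->in_use || g->task != task)
        continue;
      if (g->proxy_addr + g->size <= start || start + len <= g->proxy_addr)
        continue;

      if (unmap)
        unmap (g->proxy_addr, g->size);
      g->in_use = 0;
      revoked++;
    }
  return revoked;
}
```

The point of the sketch is only that the proxy unmaps whole fpages that it mapped itself, so it never needs a partial unmap, and that other tasks' entries are left alone.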
Ignoring the added complication of having a client wait for a reply
from possibly two different threads in different address spaces, this
remains a problematic solution: clients may request different sized
memory maps or impose different alignment restrictions. One client
might request, for instance, a 16kb fpage of a given memory block
while another requires only a single 4kb fpage and a third requires the
same 16kb but mapped at address = 4k (mod 16k). In this case, the
proxy task may not be able to reuse the map (especially if we wish to
preserve the ability to flush fpages on a per-task basis). This
limits the number of extant maps to what fits in the proxy task's
address space.

We could impose the requirement that memory be mapped to the proxy
task at most once. Thus, if 4kb of a block of memory is mapped and
a later request (either from the same task or from a different task)
asks for a 16kb map of the same block of memory which includes the 4kb
area, and the 4kb area is not properly aligned in the proxy task, then
we don't offer the 16kb map but four 4kb maps. This, however, seems
like a gratuitous limitation and moreover would mean
that we could not flush mappings on a per-task basis. If we just
hope that a single address space is enough and allow multiple
mappings, then we open ourselves up to another DoS: a task (or a small
group of tasks working together) could allocate all permutations of a
medium sized fpage with different alignments thereby exhausting the
proxy task's address space.

The way around this limitation and potential DoS is to introduce a
per-task proxy. Thus, every task gets a proxied address space. Then
we can securely flush mappings on a per-task basis and the DoS is no
longer possible.

What is the right approach? Am I missing something obvious? Using
proxy tasks seems to me like a huge amount of overhead for
functionality that seems to me to be a straightforward requirement.
The mechanisms that we require could be added to the current API. Is
there any reason not to do this?
- Unmapping fpages,
Neal H. Walfield <=