l4-hurd
[Top][All Lists]

## Mechanism to request physical memory with certain properties, new algori

 From: Matthieu Lemerre Subject: Mechanism to request physical memory with certain properties, new algorithm Date: Sat, 21 May 2005 01:13:03 +0200 User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.50 (gnu/linux)

```Hi,

My previous previously proposed algorithm had the problem that I
didn't took into consideration superpages, which are groups of
power-of-two pages aligned on a power-of-two boundary.

My new proposal looks like the buddy system in the way allocation is
done; you have zones of size 2^n that you split into two equal parts
when needed.

The difference is that instead of keeping lists of zones of each size,
zones are arranged into a tree, which enable to search for zone with
constraints more efficiently.

The node of a tree is the following struct:

struct tree_node
{
struct tree_node *parent;
struct tree_node *left;
struct tree_node *right;
word_t present_sizes;
};

Each node represent a region in memory. Root node represents the whole memory
(considering that memory is of size 2^M, if that's not the case, you can act
as
if memory was of size 2^M, and some of the memory was already allocated).

When a node has sons, it means that it has ben cut in two: the left son is the
lower region, the right son is the upper region.

A node has no sons if it has been allocated as a whole, or if it's free.

present_size is word so that (present_size & (1 << i)) if the subtree contains
at least one zone of size 2^i.

(Presence of a zone of size 2^i doesn't imply presence of a zone of size
2^i-1)

The goal of this present_sizes word is to find as quickly as possible
if an allocation of a certain size can't be made, and thus to explore as
little
of the tree as  possible.

Normally we have the relation:
tree.present_sizes == tree.left.present_sizes | tree.right.present_sizes

Thus, when modifying this field, you have to go up through the list of parents
to modify the parent's present_sizes list consequently.

Note: all of the following functions are tail-recursive.

Allocation :
============

What allocation do is to explore the tree (choosing the branches according
to the constraints on the address).  If there is no possible allocation for
the requested zone size, then it search a larger zone that it splits in halfs,
and use for its allocation. It's quite similar in spirit to the buddy system.

struct allocation
{
struct tree_node *node;
word_t position;
};

/* Find a node of the wanted size.
MASK_SIZE is (1 << i) where 2^i is the number of wanted frames. */
bool_t
find_node(word_t constraint1, word_t constraint0, word_t mask,
word_t mask_size, tree_node_t node, allocation_t result,
word_t current_pos)
{
{
/* Did we find a zone with the good size.  */
return node;

result.node = find_node (constraint1, constraint0, mask>>1,
current_pos);
if (result.node)
{
result.position = current_pos;
return true;
}

result.node = find_node (constraint1, constraint0, mask>>1,
if (result.node)
{
result.position = current_pos;
return true;
}
}

return false;
}

/* Update the present_sizes information in each node.  */
void
update_nodes (tree_node_t node)
{
new_sizes = node.left.present_sizes | node.right.present_sizes;
if (node.present_sizes == new_sizes)
return;

node.present_sizes = new_sizes;
if (node.parent)
update_nodes (parent);
}

l4_word_t
zalloc (word_t constraint1, word_t constraint0, size_t size)
{

assert (constraint1 & constraint0 == 0)

struct allocation result;
word_t initial_mask_size = (1 << size);
word_t mask = (1 << 31); /* (1 << 63 on 64 bits processors) */

root_node, result, 0))
{
/* Try with a larger zone.  */

/* Allocation was impossible. */
return 0;

}

tree_node_t current_node = allocation.node;

/* Allocate the node according to the constraints.  */
{
tree_node_t node1, node2;

/* Allocate 2 struct nodes.  */
allocate_nodes (&node1, &node2);

node1->parent = current_node;
node1->left = NULL;
node1->right = NULL;

node2->parent = current_node;
node2->left = NULL;
node2->right = NULL;

current_node->left = node1;
current_node->right = node2;

/* New allocations are created.  */
update_nodes (current_node);

current_node = left;
else
{
current_node = right;
}
}

/* Here, current node is a correct node of the good size.  */
current_node.present_sizes = 0;
update_nodes (current_node);

return allocation.position;
}

/* We could instead store current_pos in each node, and allocation
would be slightly faster, but would consume more memory */

Deallocation (general idea)
============================

void
{
/* Find the node by following the bits in the address.  */

/* Look if we can merge the node with its buddy.  Do this recursively.  */

/* Call update_node on the last un-mergeable node.  */
}

This show the general idea.  Optimisations are possible;
(for instance, I just saw that it would be better to change tree_node
to store the address of the buddy node, and of the first child.
This would be better for zfree.)

We could indicate that low memory should be allocated less easily for instance
by exploring the "right" buddy before the left one. We could also have seperate
trees for low memory and normal memory (like Linux does, I believe)

Comparaison with the buddy allocator
====================================

We can do a quick comparaison with the buddy allocator, so in the case where
there is no constraints.

In that case, this zfree takes as many operation as the buddy allocator: it's
a very similar algorithm.

zalloc is different: assuming the size (2^t) you want is present
(which you can know by just looking at the present_sizes word of the root node),
in the worst case you have to look to two nodes for each level to find the good
node.

So this take 2*(N-t) operations to find the node.
Then, also for the worst case, you have (N-t) operations to update the node.

So you have 3*(N-t) "operations" maximum to do an allocation, so if N=20 and t=0
it's 60 operations. (In the worst case where you have only 1 single frame
located at position 0, which is a theorical case.)

In average, if you take t=5, you would have 1.5*(N-t)+ 3 ~= 25 operations to do
an allocation with no constraints. So I thing using this algorithm instead of
the
buddy allocator would not be too slow (or it would not be visible).

Thanks,
Matthieu

```