l4-hurd
[Top][All Lists]

## Re: Mechanism to request physical memory with certain properties

 From: Michael D. Adams Subject: Re: Mechanism to request physical memory with certain properties Date: Sat, 11 Jun 2005 15:06:20 -0500

```You've got me very curious.  What is this type of mechanism intended
to be used for?

(Sorry this is a month after the origonal e-mail.  I've been working
though a large back log.  If an e-mail send after mid-May already has
the answer to my question I won't have seen it yet.)

On 5/16/05, Matthieu Lemerre <address@hidden> wrote:
>
> Hi,
>
> Neal asked me to find a mechanism to request physical memory with
> constraints on location, so here's my proposal.
>
> things.  Maybe some of my explanations are also unclear.  Maybe
> there's faster ways to achieve this. Please tell me!
>
>
> Mechanism to request physical memory with certain properties
> ============================================================
>
> Description:
> ------------
>
> We need a mechanism to request physical memory with constraints on location.
> Constraints are requirements on the bits on the address:
>
> 01X0X10X11X00X means that the bytes you require on the allocation must have 1
> and 0 where specified, and can be either 1 or 0 when there is a 'X'. To do
> this,
> we will pass a pair of two words like this:
>
> *bits in the first word will indicate bits which have to be on in
>  the address of the physical memory eventually choosen
>
> *bits on in the second word will indicate bits which have to be off
>  in the address of the physical memory eventually choosen
>
> So with the example above, 01X0X10X11X00X is represented by
> (01000100110000, 10010010000110).
>
> The allocation unit is the frame, wich is 2^p bytes (p=12 on IA32)
>
>
> Possible solution:
> ------------------
>
> Here's my proposal to solve this problem.
>
> Memory is represented by a table of bits, in which a bit is set to 1
> if the corresponding frame is free, and 0 if not (or not existing).
>
> The table is the following:
>
>        |2^(N-z)-1 | 2^(N-z)-2 | ... | 1 | 0 |
> --------|------------------------------------|
> 0       |    0           1             0   0 |
> 1       |    1           1             0   1 |
> 2       |    0           1 (A)         0   1 |
> ...     |                                    |
> (2^z)-1 |    1           0             1   1 |
>
>
> Frames are numbered from 0 to (2^N)-1. The number of a frame in the table is:
> col_number * 2^z + row_number.
>
>
> For instance, frame (A) is free (its bit is set to 1), and its frame number is
> (2^(N-z)-2) * 2^z + 2.
>
>
> So, any frame number can be decomposed like this: 01001
> 101001010010010001001001
>                                                  ^^^^^
> ^^^^^^^^^^^^^^^^^^^^^^^^
>                                            (N-z) bits          z bits
>
> Similarly, we decompose the constraint word like this:
>
> 01XX1 10X10XX11X00X1 XXXXXXXXXXXX
> ^^^^^ ^^^^^^^^^^^^^^ ^^^^^^^^^^^^
> (N-z)bits  z bits    p undefined bits
>
> (since the allocation unit is the frame, you can't impose constraints on the
> 12
> last bits)
>
> The general idea is:
> 1/To generate a mask of possible cols (made with the N-z bits of the
> constraint word)
> 2/To iterate over the possible rows (with respect of the z bits of the
> constraint words)
> 3/In each iteration, we apply a logical AND on the mask and the row.  This
> give
> which frames were allocated. Then we clear the corresponding bytes.
>
> It's simpler for the algorithm if 2^(N-z) is the size of a word ( N-z=5 on
> IA32).
>
>
> Iteration over the possible rows:
> ---------------------------------
>
> Here's the algorithm to iterate over the possible rows: (Pseudo-C code)
>
> void
> iterate_rows (word_t constraint1, word_t constraint0, function_t function)
> {
>  iterate_rows_rec (constraint1 >> p, constraint0 >> p,
>                    2^z, 0, function);
> }
>
> void
> iterate_rows_rec (word_t constraint1, word_t constraint0,
>                  word_t mask, word_t beginning_of_word, function)
> {
>    function(beginning_of_word);
>  else
>    {
>                          beginning_of_word, function);
>
>    }
> }
>
> Note: the problem with this algorithm is that allocation is not distributed,
> so future allocations may take more time.  But there exist workarounds, like
> the following modification to the algorithm:
>
> void
> iterate_rows_rec (word_t constraint1, word_t constraint0,
>                  word_t pseudo_random_word, function)
> {
>    function(beginning_of_word);
>  else
>    {
>        {
>                               beginning_of_word, function);
>
>        }
>      else
>        {
>
>                               beginning_of_word, function);
>        }
>    }
> }
>
> Pseudo-random word could be set so that we can indicate that some memory
> is more desirable than other. (Like memory for DMA transfers should be
> allocated
> less easily).
>
> Generation of the column mask
> -----------------------------
>
> We have to generate a column mask which will make us able to check one row at
> one time.
> If there is no constraints on the upper N-z cols, we can check 2^(N-z) frame
> at one time!
>
>
> If N-z = 5:
>
>                       0b00110011001100110011001100110011,
>                       0b00001111000011110000111100001111,
>                       0b00000000111111110000000011111111,
>                       0b00000000000000001111111111111111};
>
>                       0b11001100110011001100110011001100,
>                       0b11110000111100001111000011110000,
>                       0b11111111000000001111111100000000,
>                       0b11111111111111110000000000000000};
>
>
> /* Only the first N-z bits are interresting.  */
> constraint1 = constraint1>>(p+z);
> constraint0 = constraint0>>(p+z);
>
> for(i=4; i>=0; i--)
>  {
>    test_mask = 1 << i;
>  }
>
>
> Test of the row and allocation
> ------------------------------
>
> Returns the number of allocated frames in the row.
>
> int
> allocate_frames(word_t row_number, word_t col_mask, allocation_t result)
> {
>   word_t allocated_frames = memory[row_number] & col_mask;
>
>   /* Deallocates the allocated frames*/
>
>   result.row_number = row_number;
>   result.allocated_frames = allocated_frames;
>
>   return NUMBER_OF_BITS (allocated_frames);
> }
>
>
> An allocation is a list of (row_numbers X allocated_frames):
>
> struct allocation
> {
>  word_t row_number;
>  word_t allocated_frames;
>  struct allocation_unit *next;
> };
>
>
> Deallocation of frames
> ----------------------
>
> Thus, deallocation is trivial: We just have to re-set the right bits
> at the right place.
>
> void
> deallocate (allocation_t *allocation)
> {
>  allocation_t unit = allocation;
>  do
>   {
>     word_t row_number = unit.row_number;
>     memory[row_number] |= unit.allocated_frames;
>    } while(unit = unit.next);
> }
>
>
> Conclusion
> ==========
>
> Allocating memory should not be too long using this algorithm, provided that
> memory
> is distributed over the different rows.  Deallocation is very fast.
>
> One possible variant would be not to have an array of bits, but to
> have an array (of size 2^z) of lists. But that would consume more
> memory and would be slowest, I think.
>
>
> Thanks,
>
> Matthieu
>
>
> _______________________________________________
> L4-hurd mailing list