poke-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2] Add io space MMAP for memory mapping devices and files


From: Andreas Klinger
Subject: Re: [PATCH v2] Add io space MMAP for memory mapping devices and files
Date: Mon, 8 Jan 2024 17:11:41 +0100

Hi Jose,

"Jose E. Marchesi" <jemarch@gnu.org> schrieb am Mo, 08. Jan 15:08:
> 
> > Hi Jose,
> >
> > "Jose E. Marchesi" <jemarch@gnu.org> schrieb am Mo, 08. Jan 00:03:
> >> 
> >> > Hi again,
> >> >
> >> > Andreas Klinger <ak@it-klinger.de> schrieb am Sa, 06. Jan 20:42:
> >> >> Hi Jose,
> >> >> 
> >> >> "Jose E. Marchesi" <jemarch@gnu.org> schrieb am Sa, 06. Jan 20:14:
> >> >> > > I think that the approach of patch v2 could be optimized by a .set 
> >> >> > > alignment
> >> >> > > variable for those guys who are aware of such hardware constraints 
> >> >> > > on their
> >> >> > > architecture and want to optimize this. But I would set the default 
> >> >> > > to a value
> >> >> > > which works for all, which is the size of the address bus.
> >> >> > 
> >> >> > I think it is perfectly ok to have IOD specific configuration
> >> >> > parameters, sure.
> >> >> > 
> >> >> > Let's make it a Poke variable that the user can customize, plus a
> >> >> > setting in the registry so the .set dot-command works on it.  I 
> >> >> > suppose
> >> >> > it shall have the name mmap in it, like pk_mmap_alignment.  WDYT?
> >> >> 
> >> >> great. It makes sense to name it mmap as it only applies to mmap io 
> >> >> spaces.
> >> > [...]
> >> >
> >> > After thinking about the problem again I came to a different solution 
> >> > which is
> >> > much simpler, doesn't need an additional .set variable and avoids the 
> >> > signal
> >> > handler as well as it is copying with the address bus size if possible, 
> >> > thus
> >> > avoiding performance bottlenecks. The read function is as follows:
> >> >
> >> > static int
> >> > ios_dev_mmap_pread (void *iod, void *buf, size_t count, ios_dev_off 
> >> > offset)
> >> > {
> >> >   struct ios_dev_mmap *dev_map = iod;
> >> >   int align = sizeof(void*);
> >> >   char *m = buf;
> >> >
> >> >   if (offset > dev_map->size || count > dev_map->size - offset)
> >> >     return IOD_EOF;
> >> >
> >> >   /* copy unaligned bytes */
> >> >   while (count && offset % align)
> >> >     {
> >> >       memcpy (m, dev_map->addr + offset, 1);
> >> >       count--;
> >> >       offset++;
> >> >       m++;
> >> >     }
> >> >
> >> >   /* copy with the address bus size */
> >> >   while (count >= align)
> >> >     {
> >> >       memcpy (m, dev_map->addr + offset, align);
> >> 
> >> This is still copying a byte at a time, isn't it?
> >
> > There is no general answer to this question. It depends on what the C 
> > library
> > and the compiler are making out of it. Thus it's also dependent on the
> > architecture and the value of align.
> 
> Actually, memcpy is guaranteed to work regardless of any alignment
> requirement restrictions imposed by the architecture targetted by the
> compiler, so the general answer is that you can always assume that
> memcpy is equivalent to copying bytes by bytes.
> 
> The compiler will only optimize a memcpy (foo, bar, 8) into a long word
> move if it knows that foo and bar are properly aligned according to the
> rules of the target.

You're right.

But I'm still wondering why I get the SIGBUS (on zynqmp) when doing a memcpy of
2 bytes from an odd address which is mmaped, like:

memcpy(some-variable, odd-address, 2);


> > But yes I need to confess that when looking into the glibc for example this 
> > code
> > snippet could be optimized to allow larger amounts of data to be passed to 
> > the
> > memcpy function so that it's up to the C library and compiler to optimize 
> > it the
> > best.
> >
> >
> >> 
> >> >       count -= align;
> >> >       offset += align;
> >> >       m += align;
> >> >     }
> >> >
> >> >   /* copy remaining unaligned bytes */
> >> >   while (count)
> >> >     {
> >> >       memcpy (m, dev_map->addr + offset, 1);
> >> >       count--;
> >> >       offset++;
> >> >       m++;
> >> >     }
> >> >
> >> >   return IOD_OK;
> >> > }
> >> >
> >> > The write function is quite similar implemented in the same manner.
> >> >
> >> > WDYT?
> >> >
> >> > Andreas
> >
> > Best regards,
> >
> > Andreas

-- 
Andreas Klinger
Grabenreith 27
84508 Burgkirchen
+49 8623 919966
ak@it-klinger.de
www.it-klinger.de
www.grabenreith.de

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]