ac-archive: ac_sys_largefile and libraries...


From: Guido Draheim
Subject: ac-archive: ac_sys_largefile and libraries...
Date: Mon, 06 Jan 2003 03:18:09 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; de-AT; rv:1.1) Gecko/20020826

In a recent discussion I observed something that may be
interesting to many but went by unnoticed - the problems
of largefile support on modern unix systems.

First, ALL unix98 systems are REQUIRED to support largefile access
in their base installation:
       http://unix.org/version2/unix98fm.html
       http://unix.org/version2/whatsnew/lfs.html

However, the support scheme is quite different - linux has
adopted the scheme from solaris and sticks to the old 32bit
off_t size, and only specific -Defines will make your
program use the 64bit off_t largefile support of the
system.

On the other hand, freebsd and darwin have done better and
simply changed over to use 64bit off_t by default. In either
case, an application writer can take advantage of the
prepackaged autoconf macro AC_SYS_LARGEFILE, which ensures
that his/her program gets compiled with largefile support on
either unix98 flavour.
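
A quick way to see which flavour a given build ended up with is to
look at the width of off_t. The following is only a minimal sketch:
the function names are made up for illustration, and a real project
would include its generated config.h before any system header so that
the defines from AC_SYS_LARGEFILE take effect.

```c
#include <assert.h>
#include <limits.h>     /* CHAR_BIT */
#include <stdio.h>
#include <sys/types.h>  /* off_t */

/* Width of off_t in bits, as seen by this compilation unit. */
int off_t_bits(void)
{
    return (int) (sizeof(off_t) * CHAR_BIT);
}

/* Print the width, e.g. from main() of a scratch program. */
void report_off_t(void)
{
    printf("off_t is %d bits wide\n", off_t_bits());
}

/* Abort early if the build did not get largefile support. */
void require_largefile(void)
{
    assert(off_t_bits() >= 64);
}
```

Compiling the same file with and without the largefile defines on a
32bit linux or solaris box should show the difference; on freebsd and
darwin the result is expected to be the same either way.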

If you are currently up on a solaris or linux system (and
most likely you are), you can see the different results
by grepping into the symbol table of your binaries - the
unix98 standard mandates that all base utilities must be
compiled as largefile binaries:

$ objdump -T /bin/cp | grep seek
080490c4      DF *UND*  00000088  GLIBC_2.1   lseek64

$ objdump -T /usr/bin/gdb | grep seek
08161e00 g    DF .text  000001e9  Base        bfd_seek
08070944      DF *UND*  000000d1  GLIBC_2.0   fseek
08070b84      DF *UND*  0000003d  GLIBC_2.0   lseek

* The library problem ...

As you see, the AC_SYS_LARGEFILE call makes these
binaries link against different symbols compiled
into the base libraries (glibc here). That is easily
achieved by specific forms in the header files of the
C library which boil down to

/* simplified - glibc really keys this on _FILE_OFFSET_BITS == 64 */
#ifdef _LARGEFILE_SOURCE
#define lseek lseek64
#endif

and the glibc maintainers have ensured that any call
to the (32bit type) lseek is converted into a call
to the actual lseek64. This works simply because the
underlying unix kernel uses 64bit file offsets natively;
only some arguments have to be checked. In
effect, the library exports two lseek symbols:

$ objdump -T /lib/libc.so.6 | grep " lseek"
000bf9a0  w   DF .text  0000003d  GLIBC_2.0   lseek
000cba90  w   DF .text  00000088  GLIBC_2.1   lseek64
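
To watch the diversion at work, one can request the 64bit interface in
a single source file and seek beyond what a 32bit off_t can express.
This is a sketch under assumptions: seek_past_4g is a made-up name,
and the file defines _FILE_OFFSET_BITS by hand, which is the macro
that AC_SYS_LARGEFILE sets on systems that need it.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() under strict -std= modes */
#define _FILE_OFFSET_BITS 64     /* must precede every system header */

#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Try to position a scratch file at the 4 GiB mark.
   Returns 1 on success, 0 if the seek failed, and -1 if no
   scratch file could be created. */
int seek_past_4g(void)
{
    FILE *fp;
    off_t target, got;

    /* With a 32bit off_t the shift below would overflow. */
    assert(sizeof(off_t) >= 8);

    fp = tmpfile();
    if (fp == NULL)
        return -1;

    target = (off_t) 1 << 32;
    got = lseek(fileno(fp), target, SEEK_SET);

    fclose(fp);
    return got == target;
}
```

On a 32bit linux system, running objdump -T on the resulting object
should then list lseek64 instead of lseek, just as with /bin/cp above.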

* Third party libraries....

While this sounds logical, how would you guide a
developer of a third-party library to support this
dual-mode off_t field? And how can autoconf actually
support it?

First of all, the AC_SYS_LARGEFILE macro is a bit
short-sighted - it simply enables 64bit off_t,
but it does so both on freebsd systems where this
is native and on linux systems where it is optional.
On linux systems, however, we want the 64bit variants
of library calls to be named slightly differently.

http://ac-archive.sf.net/guidod/ac_sys_largefile_sensitive.html
has been built on top of it to AC_DEFINE an extra
symbol called LARGEFILE_SENSITIVE which tells the
library maker that this system is actually
off_t/off64_t sensitive. In the header file of the
library, he/she may now write:

#if defined _LARGEFILE_SOURCE && defined LARGEFILE_SENSITIVE
#define my_seek my_seek64
#define my_open my_open64
#endif

An application linking to the library will now be
made to link with the *64 variants, and the library
itself will provide them if it is compiled as a
largefile variant. If the application and library
are both in 32bit then they will also link correctly,
but when they mismatch the linkage will fail,
i.e.
   ld: my_open not found
and a grep on the symbol table of the library
will reveal a symbol my_open64.
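
The renaming can be checked without a linker by letting the
preprocessor spell out the name it will emit. A small sketch: the two
defines at the top stand in for what configure would provide, and the
MY_STR helpers and the *_symbol functions are made up for this
illustration only.

```c
/* Stand-ins for what configure would put into config.h or onto the
   command line; defined here only to keep the sketch self-contained. */
#define _LARGEFILE_SOURCE 1
#define LARGEFILE_SENSITIVE 1

#include <assert.h>
#include <string.h>

/* --- what the library header would contain --- */
#if defined _LARGEFILE_SOURCE && defined LARGEFILE_SENSITIVE
#define my_seek my_seek64
#define my_open my_open64
#endif

/* Two-step stringification, so that the argument is macro-expanded
   before it is turned into a string. */
#define MY_STR2(x) #x
#define MY_STR(x)  MY_STR2(x)

/* The symbol name the linker gets to see for a call written as
   my_open(...) or my_seek(...) in application code. */
const char *my_open_symbol(void) { return MY_STR(my_open); }
const char *my_seek_symbol(void) { return MY_STR(my_seek); }

/* Self-check for this sketch: with both macros defined, the calls
   are diverted to the *64 names. */
void my_check_diverted(void)
{
    assert(strcmp(my_open_symbol(), "my_open64") == 0);
    assert(strcmp(my_seek_symbol(), "my_seek64") == 0);
}
```

Removing either define at the top makes the functions return the plain
names again, which is the 32bit case described above.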

* Combined exports...

This scheme can be extended, however - when the library
sources detect a LARGEFILE_SENSITIVE system and that they
are being compiled as _LARGEFILE_SOURCE, they can
choose to export a catch-all symbol as well.

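/* exported as my_lseek64 when the header has diverted the name */
off_t my_lseek (int fd, off_t offs, int whence)
{ ..... }

#if defined _LARGEFILE_SOURCE && defined LARGEFILE_SENSITIVE
#undef my_lseek /* has been diverted to my_lseek64 */
/* also export the plain name for callers built with the 32bit off_t */
long my_lseek (int fd, long offs, int whence)
{
    off_t off = my_lseek64 (fd, offs, whence);
    offs = off;
    /* the result does not fit into the narrow type (needs <errno.h>) */
    if (offs != off) { errno = EOVERFLOW; return -1; }
    return offs;
}
#endif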
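
The narrowing step in that wrapper can be tried in isolation. The
sketch below is an assumption-laden variant, not the code from any
real library: narrow_offset32 is a made-up name, it uses the fixed
width types from <stdint.h> so that it behaves the same on 32bit and
64bit hosts, and it hands the value back through a pointer so that a
legitimate -1 is not mistaken for an error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Narrow a 64bit file offset to 32 bits.
   Returns 0 and stores the value in *out if it fits; otherwise
   returns -1, sets errno to EOVERFLOW and leaves *out untouched. */
int narrow_offset32(int64_t off, int32_t *out)
{
    assert(out != NULL);

    /* Explicit range check instead of convert-and-compare, since
       an out-of-range signed conversion is implementation-defined. */
    if (off < INT32_MIN || off > INT32_MAX) {
        errno = EOVERFLOW;
        return -1;
    }
    *out = (int32_t) off;
    return 0;
}
```

A catch-all my_lseek could call the 64bit variant first and then pass
the result through such a helper before returning it to a 32bit caller.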


* Conclusions....

Well, what do you want - the 32bit off_t in a largefile unix98
was definitely a bad choice from the beginning. It is always
questionable whether it is a good choice to extend the lifetime
of this nuisance. On the other hand, there is sometimes the
need to create and maintain a library that may be regarded as
a "base library" for many programs and which should therefore
support the ability to -Define an off_t type and size.

hints? comments? objections?

-- cheers, guido        example project: http://zziplib.sf.net




