groff

Re: z/OS porting issues, UTF-8 support, and the groff man(1) page


From: G. Branden Robinson
Subject: Re: z/OS porting issues, UTF-8 support, and the groff man(1) page
Date: Sat, 1 Apr 2023 20:48:48 -0500

At 2023-04-01T16:47:25-0700, Mike Fulton wrote:
> On Fri, Mar 31, 2023 at 2:55 PM G. Branden Robinson <g.branden.robinson@gmail.com> wrote:
> > When you're ready to make that shift, be sure to read the
> > "INSTALL.REPO" file in the root of the repository or distribution
> > archive.
> >
> 
> Bruno Haible has provided an enhancement to gnu libiconv that now
> 'falls back' to < and > from the mathematical angled brackets.  The
> net of that change is that 'man groff' now works for me, which is
> great!

Glad to hear it!  I've got a change stashed to add fallbacks on the
groff side too.  There's not much to it.

$ git stash show -p 0
diff --git a/tmac/tty.tmac b/tmac/tty.tmac
index 35a527c32..2a28a7dd2 100644
--- a/tmac/tty.tmac
+++ b/tmac/tty.tmac
@@ -51,6 +51,8 @@
 .fchar \[lA] <=
 .fchar \[rA] =>
 .fchar \[hA] <=>
+.fchar \[la] <
+.fchar \[ra] >
 .fchar \[rg] (R)
 .fchar \[OE] OE
 .fchar \[oe] oe

> I am going to take a crack at getting the 'git build' going. I will
> reach out once I have made progress with that. Hopefully it won't be
> too hard - depends on how many other tools are required for
> bootstrap/configure. It sounds like that may also help with my 'sed'
> problems (see below).

The build dependencies for groff 1.23.0.rc2 and later distribution
archives are, on net, _lighter_ than for groff 1.22.4; you no longer
need a TeX installation.  (You do need an m4 program, though.)

Some background on this can be found at
<https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1011666>.

> > > Yes - that is the change. No - it's not because of sed. We have
> > > ported sed and could rely on it as a dependency. The issue we hit
> > > is a bit ugly.  Because z/OS is a 'multi-tenant' operating system,
> > > we want people to be able to install into a particular location of
> > > their choice (either as developer _or_ as a consumer of the
> > > binary).
> >
> > ...without a recompile, I assume?
> >
> Correct. Without a recompile.

Without a recompile means without re-./configure-ing, so I think you'll
need my build-tree alteration trick.  If your multi-tenancy arrangement
keeps a copy of, say, a generic groff build which can then be copied to
some staging area for user customization, I reckon you could _either_
re-configure and rebuild, or do the simpler sed trick I suggested.

> > For groff 1.23, I revised our man pages to be much more careful
> > about documenting full file specifications to groff-installed files
> > and to compute their values based on the build's configuration
> > parameters--stuff like "./configure --prefix=/home/foobar".
> >
> I will check this out - maybe the problem 'goes away' in 1.23.

I don't _expect_ it to unless you re-configure and build for each user's
scenario...at least if groff gets installed to a location that
identifies the user, as '/home/branden' does, for example.  If, instead,
you have a more tightly compartmentalized user-specific view of the
system that also uses fixed directory names for the software packages
(e.g., groff is always in /opt/groff, but only users that have selected
it will see it), you might indeed benefit from the groff 1.23
improvements here.  :)

> I will try the 'git build' first and see what that looks like.

I'm eager to hear your experience.

> > And thanks to makevarescape.sed, if the file names wind up being
> > long, they'll break in pleasant locations and won't be hyphenated.

I need to correct myself here.  My suggestion bypasses
makevarescape.sed, so the post-build rewritten @BINDIR@ trick will not
be protected from automatic hyphenation or have good hyphenless break
points in it.  However, if you do end up going back to a sed solution,
it's possible to put those in yourself in an automated way.

The idea is to do what the last line of makevarescape.sed is already
doing: prefix the file specification with '\%' and place '\:' after
every sequence of forward slash characters.  (I've written those two
escape sequences as they appear in the generated man page document.)
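
Here is a minimal sketch of that transformation as a shell function
wrapped around sed.  The function name protect_path is hypothetical, and
this is not a copy of makevarescape.sed; it only illustrates the two
substitutions described above, assuming the file specification arrives
as a single argument.

```shell
# Hypothetical helper, not part of groff: make a file name safe against
# automatic hyphenation and give it hyphenless break points.
protect_path() {
    # First expression: append \: after every run of one or more slashes.
    # Second expression: prefix the whole specification with \%.
    printf '%s\n' "$1" | sed -e 's|//*|&\\:|g' -e 's|^|\\%|'
}
```

For example, "protect_path /opt/groff/bin" prints \%/\:opt/\:groff/\:bin.
Note that the result contains backslashes, so if it is fed onward as the
replacement text of another sed 's' command, those backslashes would
need to be doubled first.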

> Over the years, the operating system has evolved from MVS

When I first got on the Internet, I encountered this initialism at the
same time as VMS and VM/CMS, which just about broke me.

> to OS/390 to z/OS.  What is shipped with the operating system has
> evolved too. Up until the 80's, there was no POSIX environment
> available.

Would have been difficult; there was no POSIX before 1988.  :)

> That was added in the early 90's as 'Open Edition'. Back in the 90's
> it was optional, but now, it's always available on the z/OS system
> (although you can still restrict users to not be able to _use_ the
> POSIX environment if you want).  So, we now have lots of names for the
> same thing: OS/390 Unix, Unix System Services, Open Edition. Some
> services still spit out the old names (so that tools don't get broken)
> so you will see comparisons to 'OS/390' and sometimes to 'z/OS'.  It's
> important to note that the hardware (e.g., a Z16) runs a variety of
> operating systems including Linux, z/OS, z/VM, z/TPF, z/VSE. The Z
> hardware family is typically referenced as 's390x'.
> 
> That was a very long background to say 'Yes - OS/390 Unix can be
> thought of as 'the same' as z/OS' although z/OS has a lot more stuff
> in it than just the POSIX environment that we now refer to as z/OS
> Unix System Services.

Thanks for clearing this up for me.

> > It looks like what's going on here is that z/OS has metadata available
> > for any file of interest to a Unix-like environment that tags a given
> > file as ISO 8859-1- or EBCDIC-encoded (if it has to be interpreted as a
> > character stream encoded using a single byte).
> >
> Correct. We can 'tag' a file (via the chtag command) in the
> hierarchical file system with the CCSID and we have some nice services
> for 'autoconversion' between ASCII and EBCDIC that can be used.

CCSID.  I remembered there being a term for the code page name space but
I could not summon it--thanks!

> > I do not yet assume it would be wise to kill off grotty(1)'s support
> > for generating code page 1047 _output_...but maybe we can.  Is it
> > possible to configure the environment on z/OS such that that is the
> > case?  How do you spell the standard C locale variables for this
> > scenario?
> >
> > "LC_ALL=en_US.EBCDIC"?
> >
> I'm no locale expert, but I think it's the other way around where it's
> assumed to be EBCDIC, e.g. LC_ALL=fr_FR.UTF-8

Okay.  groff's nroff(1) command relies specifically on locale settings
to decide what character encoding to use for output, so this will be
worth validating at some point not too far down the road.

You can find the current logic here.

https://git.savannah.gnu.org/cgit/groff.git/tree/src/roff/nroff/nroff.sh?id=f720813c5f512627a246115a255989ad68dff395#n118
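
To give a feel for the kind of decision involved, here is a simplified
sketch.  It is not the logic in nroff.sh; the function name
device_for_locale is hypothetical, and the suffix patterns (in
particular the spelling of an IBM-1047 locale name on z/OS) are
assumptions that would need checking on a real system.  The device names
ascii, latin1, utf8, and cp1047 are the ones grotty(1) supports.

```shell
# Hypothetical helper: map a locale name to a groff terminal device.
# A simplified illustration only; the real nroff script does more.
device_for_locale() {
    case $1 in
        *.UTF-8 | *.utf8)           echo utf8 ;;
        *.ISO-8859-1 | *.ISO8859-1) echo latin1 ;;
        *.IBM-1047)                 echo cp1047 ;;
        *)                          echo ascii ;;   # conservative default
    esac
}
```

A caller could pass the effective locale using the usual POSIX
precedence, as in
device_for_locale "${LC_ALL:-${LC_CTYPE:-${LANG:-}}}", and hand the
result to groff's -T option.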

Thanks for throwing all this light on the z/OS environment!

Regards,
Branden
