bug-groff

From: G. Branden Robinson
Subject: [bug #64484] [troff] \X escape sequence should read its argument in (something like) copy mode
Date: Mon, 26 Aug 2024 23:49:43 -0400 (EDT)

Follow-up Comment #17, bug #64484 (group groff):

Hi Deri,

[comment #16:]
> Hi Branden,
> 
> Thanks for this. I appreciate you explaining what you are trying to do. ...
> I'd love to see the sboxes issue.

Sure--you'll find it below.
 
> Running your for-deri.man example on current git I also get exactly the same
> results, so I am not sure what effect your changes are making. If you add
> "-fU-T" to the command you may see proper greek glyphs rather than slanted
> stuff from the symbol font, it depends, not all URW fonts are the same, some
> have proper greek glyphs, some don't. We ought to install the SS font for
> devpdf, part of the deri-gropdf-ng branch.
> 
> I have a couple of one liners to illustrate current \X and .device
> behaviour, ...
> 
> printf  ".ft U-TR\n\X'ps: αβγabc'\nαβγabc"|test-groff -Tpdf -Kutf8 -Z
> 
> printf  ".ft U-TR\n.device ps: αβγabc\nαβγabc"|test-groff -Tpdf -Kutf8 -Z


With my pending changes, including bleeding-edge hacked-up working copy stuff,
I do get a difference between the two.  That's bad, but it's not where you
were worried it would be.


$ printf  ".ft U-TR\n\X'ps: αβγabc'\nαβγabc"|./build/test-groff -Tpdf \
    -Kutf8 -Z > deri-64484-escape.grout
$ printf  ".ft U-TR\n.device ps: αβγabc\nαβγabc"|./build/test-groff -Tpdf \
    -Kutf8 -Z > deri-64484-request.grout
$ diff -u deri-64484-*
--- deri-64484-escape.grout     2024-08-26 21:37:21.629078727 -0500
+++ deri-64484-request.grout    2024-08-26 21:37:46.965034227 -0500
@@ -5,15 +5,14 @@
 V12000
 H72000
 x X ps: \[u03B1]\[u03B2]\[u03B3]abc
-V12000
-H72000
-DFd
-wx font 11 S
+x font 11 S
 f11
 s10000
 x Slant 16
-h2500
+V12000
+H72000
 md
+DFd
 C*a
 h6310
 C*b
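
The `x X ps:` line in the diff above carries the Greek letters as
`\[u03B1]\[u03B2]\[u03B3]`.  With `-Kutf8`, groff runs preconv(1) ahead of the
formatter, and preconv spells non-ASCII input that way.  A minimal sketch of
that spelling in Python follows; it is not preconv's actual code, and the
function name `to_groff_escapes` is made up for illustration.

```python
def to_groff_escapes(text):
    """Spell every non-ASCII character as a groff \\[uXXXX] escape."""
    parts = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            parts.append(ch)  # ASCII passes through unchanged
        else:
            # At least four uppercase hex digits, as in \[u03B1].
            parts.append("\\[u%04X]" % cp)
    return "".join(parts)


print(to_groff_escapes("ps: αβγabc"))
# prints: ps: \[u03B1]\[u03B2]\[u03B3]abc
```

The result matches the argument of the `x X` command shown in both grout
files, which is why that particular line does not appear as a difference.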


I screwed up earlier and shared the same patch twice--the rewritten
`device_request()` function.

I meant to share, in the first verbatim block, a different patch that changed
the way a "special node" works.


commit 90a7146a91391a257a323895a2bb5b75fb561eb4
Author:     G. Branden Robinson <g.branden.robinson@gmail.com>
AuthorDate: Mon Aug 26 13:08:39 2024 -0500
Commit:     G. Branden Robinson <g.branden.robinson@gmail.com>
CommitDate: Mon Aug 26 17:05:20 2024 -0500

    XXX Fix Savannah #63074 (1/x).

diff --git a/src/roff/troff/node.cpp b/src/roff/troff/node.cpp
index 0b3cbaaea..9e6d962c4 100644
--- a/src/roff/troff/node.cpp
+++ b/src/roff/troff/node.cpp
@@ -886,19 +886,23 @@ inline void troff_output_file::put(unsigned int i)
 void troff_output_file::start_special(tfont *tf, color *gcol, color *fcol,
                                      int no_init_string)
 {
+#if 0
   set_font(tf);
   glyph_color(gcol);
   fill_color(fcol);
   flush_tbuf();
   do_motion();
+#endif
   if (!no_init_string)
     put("x X ");
 }
 
 void troff_output_file::start_special()
 {
+#if 0
   flush_tbuf();
   do_motion();
+#endif
   put("x X ");
 }


So I think I know why I have the diff above.

But I also suspect this is where I'm running into trouble with sboxes.

Here's a diff of "gropdf -d" output when I feed it each of the foregoing
grouts.  (This is gropdf from Savannah Git HEAD; I haven't made any changes to
it in my working copy.)


--- esc.pdf     2024-08-26 21:43:17.248345518 -0500
+++ req.pdf     2024-08-26 21:43:08.052366869 -0500
@@ -8,7 +8,7 @@
 /Type /Page
 >>
 endobj
-4 0 obj << /Length 1133
+4 0 obj << /Length 1217
 >>
 stream
 q 1 0 0 1 0 0 cm
@@ -18,25 +18,31 @@
 % V12000
 % H72000
 % x X ps: \[u03B1]\[u03B2]\[u03B3]abc
-%% V12000
-% V12000
-% H72000
-% DFd
-0 g
-% wx font 11 S
+%% x font 11 S
+% x font 11 S
 % f11
 q BT
 1 0 0 1 72.000 780.000 Tm
-
 0 Tc
 % s10000
 % x Slant 16
-% h2500
+% V12000
+1 0 0.287 1 72.000 780.000 Tm
+0 Tc
+% H72000
+0.000 0 Td
 % md
-1 0 0.287 1 74.500 780.000 Tm
+1 0 0.287 1 72.000 780.000 Tm
 0 Tc
 0 g
+% DFd
+ET Q
+0 G
+0 g
 % C*a
+q BT
+1 0 0.287 1 72.000 780.000 Tm
+0 Tc
 % Assign: /alpha to 0/97
 /F11 10 Tf
 % h6310
@@ -48,14 +54,14 @@
 % x font 39 U-TR
 % f39
 %! wht0sz=2.500, wt=--
-%! PutLine: XPOS=74.5, CHR=0/97(a)[/alpha], CWID=6.31, HWID=6.31, NOMV=1
-%! PutLine: XPOS=80.81, CHR=0/98(b)[/beta], CWID=5.49, HWID=5.49, NOMV=1
-%! PutLine: XPOS=86.3, CHR=0/103(g)[/gamma], CWID=4.11, HWID=0, NOMV=1
+%! PutLine: XPOS=72, CHR=0/97(a)[/alpha], CWID=6.31, HWID=6.31, NOMV=1
+%! PutLine: XPOS=78.31, CHR=0/98(b)[/beta], CWID=5.49, HWID=5.49, NOMV=1
+%! PutLine: XPOS=83.8, CHR=0/103(g)[/gamma], CWID=4.11, HWID=0, NOMV=1
 0.000 Tw [ (abg) 411.000 ] TJ
 % x Slant 0
 % h4110
 % tabc
-1 0 0 1 90.410 780.000 Tm
+1 0 0 1 87.910 780.000 Tm
 0 Tc
 % Assign: /a to 0/97
 /F39 10 Tf
@@ -63,23 +69,22 @@
 % Assign: /c to 0/99
 % n12000 0
 %! wht0sz=2.500, wt=--
-%! PutLine: XPOS=90.41, CHR=0/97(a)[/a], CWID=4.44, HWID=4.44, NOMV=0
-%! PutLine: XPOS=94.85, CHR=0/98(b)[/b], CWID=5, HWID=5, NOMV=0
-%! PutLine: XPOS=99.85, CHR=0/99(c)[/c], CWID=4.44, HWID=4.44, NOMV=0
+%! PutLine: XPOS=87.91, CHR=0/97(a)[/a], CWID=4.44, HWID=4.44, NOMV=0
+%! PutLine: XPOS=92.35, CHR=0/98(b)[/b], CWID=5, HWID=5, NOMV=0
+%! PutLine: XPOS=97.35, CHR=0/99(c)[/c], CWID=4.44, HWID=4.44, NOMV=0
 0.000 Tw [ (abc)] TJ
 % x trailer
 % V792000
 % x stop
-1 0 0 1 104.290 0.000 Tm
+1 0 0 1 101.790 0.000 Tm
 0 Tc
 ET Q
-0 G
 Q
 endstream
 endobj
-6 0 obj << /CreationDate (D:20240826214317-05'00')
+6 0 obj << /CreationDate (D:20240826214308-05'00')
 /Creator (groff version 1.23.0.1813-52e3b)
-/ModDate (D:20240826214317-05'00')
+/ModDate (D:20240826214308-05'00')
 /Producer (gropdf version 1.23.0.1813-52e3b)
 >>
 endobj
@@ -230,7 +235,7 @@
 /StemV 0
 /Type /FontDescriptor
 >>
-<< /BaseFont /ANFMBC+NimbusRomNo9L-Regu
+<< /BaseFont /LNRAZN+NimbusRomNo9L-Regu
 /Encoding 12 0 R 
 /FirstChar 32
 /FontDescriptor 13 0 R 
@@ -249,7 +254,7 @@
 /Flags 32
 /FontBBox [0 -281 1053 924 ]
 /FontFile 10 0 R 
-/FontName /ANFMBC+NimbusRomNo9L-Regu
+/FontName /LNRAZN+NimbusRomNo9L-Regu
 /ItalicAngle 0
 /StemV 0
 /Type /FontDescriptor
@@ -264,7 +269,7 @@
 /W [1 4 1 ]
 >>
 stream
-[binary stream data omitted] endstream
+[binary stream data omitted] endstream
 endobj
 trailer
 <<
@@ -272,5 +277,5 @@
 /Size 16
 >>
 startxref
-12179
+12263
 %%EOF


In okular(1), the two PDFs look the same, but only to casual inspection.  I
understand little of PDF, but what I see suggests to me that there are
differences in glyph placement here, which are a showstopper.
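
To put a number on the placement difference, the `PutLine` debugging comments
in the diff above can be paired up and subtracted.  This is a rough sketch in
Python, not part of gropdf; `xpos_deltas` is a hypothetical helper, and the
figure of 1000 device units per point is an assumption read off `H72000`
sitting next to `72.000` in the same output.

```python
import re
from decimal import Decimal

# Matches the debugging comments that "gropdf -d" writes, as seen in a
# unified diff: a leading "-" or "+", then "%! PutLine: XPOS=<number>,".
PUTLINE = re.compile(r"^([-+])%! PutLine: XPOS=([0-9.]+),")

# Assumption: 1000 device units per point (H72000 <-> 72.000 above).
UNITS_PER_POINT = 1000


def xpos_deltas(diff_lines):
    """Return old-minus-new XPOS for PutLine lines paired in order."""
    old, new = [], []
    for line in diff_lines:
        m = PUTLINE.match(line)
        if m:
            (old if m.group(1) == "-" else new).append(Decimal(m.group(2)))
    return [a - b for a, b in zip(old, new)]


hunk = [
    "-%! PutLine: XPOS=74.5, CHR=0/97(a)[/alpha], CWID=6.31, HWID=6.31, NOMV=1",
    "-%! PutLine: XPOS=80.81, CHR=0/98(b)[/beta], CWID=5.49, HWID=5.49, NOMV=1",
    "-%! PutLine: XPOS=86.3, CHR=0/103(g)[/gamma], CWID=4.11, HWID=0, NOMV=1",
    "+%! PutLine: XPOS=72, CHR=0/97(a)[/alpha], CWID=6.31, HWID=6.31, NOMV=1",
    "+%! PutLine: XPOS=78.31, CHR=0/98(b)[/beta], CWID=5.49, HWID=5.49, NOMV=1",
    "+%! PutLine: XPOS=83.8, CHR=0/103(g)[/gamma], CWID=4.11, HWID=0, NOMV=1",
]

deltas = xpos_deltas(hunk)
h2500_in_points = Decimal(2500) / UNITS_PER_POINT
print(deltas, h2500_in_points)
```

Every pair differs by 2.5 points, which under the stated assumption is the
size of the `h2500` motion that appears only in the escape-sequence output.
That locates the discrepancy; it does not say which of the two results is the
right one.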

I suspect I will be grubbing around in node construction and node `tprint`ing
code of GNU _troff_ to try and figure out where the different motions are
coming from.

> I wonder if it would be possible for you to setup a branch on Savannah with
> your code changes so we can both investigate for any problems.

The only objection I have to this is a permission or procedural one; Savannah
Git is not set up to let me force-push branches.  Or at least it wasn't the
last time I did any work on a branch.

My working copy churns like hell and I spend considerable time, once I've
cracked a problem, rebasing and massaging my hackery into something resembling
coherent change sets.

That's also what I would need to do to rebase the changes onto the master
branch.  We want the master branch's history to look not like a chaotic wreck
but like the product of at least somewhat considered changes, without too many
false starts or trips down the rabbit hole.

But I don't have any problem sharing my work as it stands right now.  I'll
generate some patches and attach them.

With all of these patches applied, all tests pass.

However, there is still a problem to be solved and a decision to be made.

1.  The corruption of the page background in msboxes.pdf.  We don't have an
automated test to detect this sort of wrongness.  When we've root-caused it,
it would be good to write a minimal case to provoke it since this is a
somewhat startling breakage.

2.  Producing groff-man-pages.pdf screams to holy hell over all the `\%` and
`\:` tokens appearing in device control commands.  This is of course a variant
of the problem you wrote `pdfclean` to solve.
Right now, I'm thinking I will just suppress the diagnostic from being thrown
for these known harmless cases.  If I get around to writing my long-promised
string iterator (not for 1.24), we can reactivate the diagnostics while giving
users a tool to sanitize grout-bound strings.
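
One possible shape for a sanitizer of grout-bound strings, as a hedged sketch
in Python: it is neither `pdfclean` nor the string iterator mentioned above,
and `strip_break_escapes` is a hypothetical name.  It removes only `\%`
(hyphenation control) and `\:` (non-printing break point) and leaves every
other escape sequence untouched.  It scans left to right so that a doubled
backslash followed by `%` is not mistaken for `\%`.

```python
def strip_break_escapes(text):
    """Drop the two-character escapes \\% and \\: from a string."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            if nxt not in "%:":
                out.append(ch + nxt)  # keep any other escape as written
            i += 2  # consume the backslash and the character after it
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(strip_break_escapes(r"pdf: mark \%foo\:bar"))
# prints: pdf: mark foobar
```

A real tool would have to cope with whatever else can end up in a device
control command; this only covers the two tokens named in item 2.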

> which is why I get palpitations when you write (after talking about \X) that
> "I've just about got the `device` request converted over to the same thing"!!
> I realise you are talking about the "third way" (which could be a very good
> thing) but it points to bug #63074 which in turn points to bug #65108 where
> rule 5d restricts unicode to \[u00XX] 00-1f and 80-FF, a minute subset of
> unicode.

Sorry, I should have addressed this when you first raised it.

I have no intention of imposing such a code point range restriction on device
control commands.  In the context of bug #65108 I was thinking of two other
loci of external representation: (a) bytes that get stuffed into
`fprintf(stderr, "%s", whatever, ...)` messages, and (b) bytes that get
stuffed into `fopen(filename, mode)` calls.  Apart from the basic matter of
these things taking arguments of type `const char *` (wide characters go
home!) I don't want to add support for POSIX locales to GNU _troff_ for these
auxiliary aspects of the formatter.  (And besides, your operating
environment's POSIX locale's character encoding and that of the file system
you're using might not be the same, a whole 'nother class of headache.)  So for
`tm` and `so` requests and similar, my intention is that you (the generic
user) can specify any character encoding you like as long as you express it in
GNU _troff_'s terms as single bytes.  If those form UTF-8, or some ISO 2022
monstrosity, that's between you and your OS.
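
To make "single bytes that happen to form UTF-8" concrete, here is a small
Python sketch.  The byte values are plain UTF-8 and not in doubt; the
`\[u00XX]` spelling of each byte is an assumption, taken from the 00-1F and
80-FF ranges quoted above from rule 5d, and `spell_bytes` is a hypothetical
helper, not anything GNU troff provides.

```python
def spell_bytes(name, encoding="utf-8"):
    r"""Spell a string byte by byte; the \[u00XX] notation is assumed."""
    parts = []
    for b in name.encode(encoding):
        if b <= 0x1F or b >= 0x80:
            # Assumed spelling of one byte outside the printable range.
            parts.append("\\[u00%02X]" % b)
        else:
            parts.append(chr(b))  # other bytes stay literal
    return "".join(parts)


print(list("α".encode("utf-8")))
# prints: [206, 177]
print(spell_bytes("α.ms"))
# prints: \[u00CE]\[u00B1].ms
```

Whether the file system then treats the bytes CE B1 as the letter alpha is,
as the paragraph above says, outside the formatter's concern.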

I have 12 commits in my working tree, but 9 of them aren't material to this
discussion:


$ git log --oneline origin..HEAD
5d2401c84 (HEAD -> master) XXX [troff]: Fix Savannah #63074 (3/x).
fdea8aff7 [troff]: Fix Savannah #63074 (2/x).
90a7146a9 XXX Fix Savannah #63074 (1/x).
7a504b6e2 XXX honor_vertical_position_traps
577fdf0c8 [troff]: Trivially refactor (bool `get_location`).
60999fcb0 [troff]: Trivially refactor (`do_underline`).
e2faabad6 [troff]: Migrate class backing `.T` register.
c5d22791b [troff]: Trivially refactor (boolify main() vars).
14aa1ac11 [troff]: Make usage message more helpful.
6b1ff2677 [docs]: Tweak "mac" warning category description.
9a50259a7 refer(1): Drop extraneous word.
aff143e1b HACKING: Fix omitted commands in examples.


I'll attach the last 3 (the first 3 in that reverse-chronological list).

Finally, here's a diff of the grout produced for msboxes.pdf by Savannah Git
HEAD and my working copy's HEAD.


$ diff -u msboxes-pdf.grout*
--- msboxes-pdf.grout1  2024-08-26 15:25:09.692875149 -0500
+++ msboxes-pdf.grout2  2024-08-26 22:28:26.429669799 -0500
@@ -281,10 +281,9 @@
 n13000 0
 V250792
 H91692
+x X pdf: background fillbox 80692z 48692z 514581z 757308z 1p
 V250792
 H91692
-mr 42405 10794 10794
-x X pdf: background fillbox 80692z 48692z 514581z 757308z 1p
 n13000 0
 V261792
 H56692
@@ -293,13 +292,10 @@
 n13000 0
 V271792
 H91692
-V271792
-H91692
-md
-DFr 65535 64250 61680
 x X pdf: background pagefill
 f20
-DFd
+V271792
+H91692
 Crs
 h3058
 tX
@@ -1453,9 +1449,9 @@
 n13000 0
 V694292
 H135692
+x X pdf: background off
 V694292
 H135692
-x X pdf: background off
 n13000 0
 V711192
 H56692
@@ -1636,10 +1632,9 @@
 n13000 0
 V64492
 H91692
+x X pdf: background fillbox 80692z 48692z 514581z 757308z 1p
 V64492
 H91692
-mr 42405 10794 10794
-x X pdf: background fillbox 80692z 48692z 514581z 757308z 1p
 n13000 0
 V75492
 H56692
@@ -1653,10 +1648,9 @@
 n13000 0
 V72492
 H91692
+x X pdf: background fill 89692z 56692z 505581z 748308z 0
 V72492
 H91692
-md
-x X pdf: background fill 89692z 56692z 505581z 748308z 0
 n13000 0
 V74492
 H56692
@@ -1706,11 +1700,11 @@
 n13000 0
 V89492
 H91692
-f23
+x X pdf: background off
 V89492
 H91692
-x X pdf: background off
 n13000 0
+f23
 V106392
 H91692
 tbegins
@@ -2014,9 +2008,9 @@
 n13000 0
 V218192
 H91692
+x X pdf: background off
 V218192
 H91692
-x X pdf: background off
 n13000 0
 V235092
 H56692
@@ -2272,10 +2266,9 @@
 n13000 0
 V298792
 H91692
+x X pdf: background fillbox 80692z 48692z 514581z 757308z 1p
 V298792
 H91692
-mr 42405 10794 10794
-x X pdf: background fillbox 80692z 48692z 514581z 757308z 1p
 n13000 0
 V309792
 H56692
@@ -2289,10 +2282,9 @@
 n13000 0
 V306792
 H91692
+x X pdf: background fill 89692z 56692z 505581z 748308z 0
 V306792
 H91692
-md
-x X pdf: background fill 89692z 56692z 505581z 748308z 0
 n13000 0
 V308792
 H56692
@@ -2310,11 +2302,11 @@
 n13000 0
 V323792
 H91692
-f23
+x X pdf: background off
 V323792
 H91692
-x X pdf: background off
 n13000 0
+f23
 V340692
 H91692
 ttak
@@ -2377,9 +2369,9 @@
 n13000 0
 V366692
 H91692
+x X pdf: background off
 V366692
 H91692
-x X pdf: background off
 n13000 0
 V383592
 H56692
@@ -2702,21 +2694,17 @@
 n13000 0
 V481092
 H56692
-x font 6 CR
-f6
+x X pdf: background fillbox 45692z 48692z 549581z 757308z 1p
 V481092
 H56692
-mr 42405 10794 10794
-DFr 65535 65535 65535
-x X pdf: background fillbox 45692z 48692z 549581z 757308z 1p
 n13000 0
 V492092
 H56692
 n13000 0
+x font 6 CR
+f6
 V502092
 H56692
-md
-DFd
 t.
 Crs
 h6600
@@ -5915,9 +5903,9 @@
 n13000 0
 V641692
 H56692
+x X pdf: background off
 V641692
 H56692
-x X pdf: background off
 n13000 0
 V776654
 H538581


That 'mr' command configuring one of the colors (cornsilk, maybe) going
missing distresses me.  I suspect it's a side effect of my machete-whack
change to `troff_output_file::start_special()` in the first diff above.  But
that chop solves other problems--it brings us the "miles better" result in
comment #15, which is why I'd like to make "specials" ('x X' commands) more
orthogonal.  Maybe there is another way to get the "grout" output to recognize
the color selection.

Come to think of it, I was startled to see that there's no `defcolor` request
anywhere in "msboxes.ms.in" or "sboxes.tmac".  I guess "ps.tmac" takes care of
that, but (after my change) nothing ever "pushes" that color to the grout
unless a glyph gets written with it (or it's used as the fill color in a
geometric object).  A background shading or box drawn without _roff_ drawing
commands, but by _gropdf_(1), is not visible to the formatter.
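
The suspected mechanism can be shown with a toy model in Python.  It is a
deliberately simplified sketch, not GNU troff's real logic: the class
`GroutModel`, its methods, and the `flush_first` switch are all hypothetical,
and the `x X` argument is shortened.  The only idea it borrows is that a
color change is recorded first and written to the grout later, either just
before a special (the old `start_special()` behavior) or just before a glyph.

```python
class GroutModel:
    """Toy model of deferred state output; not GNU troff's real logic."""

    def __init__(self):
        self.emitted_color = "md"  # what the grout stream has been told
        self.wanted_color = "md"   # what the formatter currently wants
        self.out = []

    def set_color(self, command):
        # Changing the color only records the wish; nothing is written yet.
        self.wanted_color = command

    def _flush(self):
        if self.wanted_color != self.emitted_color:
            self.out.append(self.wanted_color)
            self.emitted_color = self.wanted_color

    def special(self, text, flush_first):
        if flush_first:
            self._flush()
        self.out.append("x X " + text)

    def glyph(self, name):
        self._flush()  # glyphs always push pending state first
        self.out.append("C" + name)


def run(flush_first):
    g = GroutModel()
    g.set_color("mr 42405 10794 10794")
    g.special("pdf: background fillbox", flush_first)
    g.set_color("md")  # color restored before any glyph is set
    g.glyph("rs")
    return g.out


print(run(True))
print(run(False))
```

With `flush_first` set, the model writes the `mr` command ahead of the
special and `md` ahead of the glyph.  Without it, the color is changed and
changed back before anything forces it out, so neither command reaches the
output, which is the same pattern as the `mr` lines vanishing from the
msboxes grout.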

This is the sort of thing I'm gesturing at when I say that maybe the `fl`
request should be extended to permit us to express the "dirtying" or usage of
some property of the _troff_ environment that is otherwise invisible to the
formatter since the grout language has no concept of environments.

If something leaps out at you, I'm keen to hear it.

Regards,
Branden

(file #56386, file #56387, file #56388)

    _______________________________________________________

Additional Item Attachment:

File name: 0010-XXX-Fix-Savannah-63074-1-x.patch Size: 4KiB
   
<https://file.savannah.gnu.org/file/0010-XXX-Fix-Savannah-63074-1-x.patch?file_id=56386>

File name: 0012-XXX-troff-Fix-Savannah-63074-3-x.patch Size: 3KiB
   
<https://file.savannah.gnu.org/file/0012-XXX-troff-Fix-Savannah-63074-3-x.patch?file_id=56387>

File name: 0011-troff-Fix-Savannah-63074-2-x.patch Size: 7KiB
   
<https://file.savannah.gnu.org/file/0011-troff-Fix-Savannah-63074-2-x.patch?file_id=56388>


    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?64484>
