
Re: [Groff] U+0027, U+002D, and U+0060 in code examples?

From: Tadziu Hoffmann
Subject: Re: [Groff] U+0027, U+002D, and U+0060 in code examples?
Date: Tue, 8 May 2012 20:08:37 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

> On the PostScript side, it should be theoretically possible to
> use the `GlyphNames2Unicode' dictionary (an undocumented Adobe
> Distiller extension) so that PS->PDF software can provide
> non-standard mappings.  Right now, I haven't found a full
> example code for that.

An interesting point.  I've played around a bit with this, and
the difficulty has been getting Ghostscript to actually emit
a ToUnicode map.  I've managed it only by hacking the
Ghostscript source (making pdf_simple_font_needs_ToUnicode()
always return true).  Additionally, I've added a
GlyphNames2Unicode dictionary to Courier's FontInfo, like this:

  /quoteright 16#0027
  /quoteleft  16#0060
  /minus      16#002d

(FontInfo is in the "visible" part of the font file, so
no disassembly is required.  grops could insert this while
reencoding, but I'm against doing this unconditionally.)
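
For anyone who wants to experiment, here is a rough sketch of how
such a dictionary could be generated for pasting into a font's
FontInfo.  The helper name glyphnames2unicode_dict is made up for
illustration, and the "N dict dup begin ... end def" wrapper is my
assumption about how the entries are embedded; the post above shows
only the three entries themselves.

```python
def glyphnames2unicode_dict(mapping):
    """Return PostScript source for a GlyphNames2Unicode dictionary.

    `mapping` maps glyph names (str) to single Unicode code points (int).
    The surrounding dictionary syntax is an assumption; only the
    name/value entries come from the original post.
    """
    lines = ["/GlyphNames2Unicode %d dict dup begin" % len(mapping)]
    for glyph, codepoint in mapping.items():
        if not 0 <= codepoint <= 0x10FFFF:
            raise ValueError("not a Unicode code point: %r" % (codepoint,))
        # PostScript radix notation: 16#0027 is hexadecimal 27.
        lines.append("  /%s 16#%04x def" % (glyph, codepoint))
    lines.append("end def")
    return "\n".join(lines)


if __name__ == "__main__":
    # The three glyphs remapped to ASCII in the post above.
    print(glyphnames2unicode_dict({
        "quoteright": 0x0027,
        "quoteleft": 0x0060,
        "minus": 0x002D,
    }))
```

The output would go inside the FontInfo dictionary, between its
"begin" and "end".  Whether FontInfo's declared size also needs to
be increased depends on the interpreter, so treat that as something
to check.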

With these changes, selecting the Courier text in the resulting
PDF in acroread and copying it yields ASCII code points.

Another point: GlyphNames2Unicode appears to support only
single Unicode code points, so we can't map one glyph to a
sequence of characters, as would be desirable for uncommon
ligatures.  But for copy-and-pasting command lines it should
be enough.

Attachment: encoding.pdf
Description: Adobe PDF document
