Bitmapped Images (Was: The PDF back end)

From: Reimer Behrends
Subject: Bitmapped Images (Was: The PDF back end)
Date: Mon, 21 May 2001 14:15:06 -0400
User-agent: Mutt/1.2.5i

On Fri, May 18, 2001 at 04:44:56AM +0400, Valeriy E. Ushakov wrote:
> . extend @IncludeGraphics to support other image formats, introduce,
>   say, @FileFormat parameter so that one can say:
>     @IncludeGraphics @FileFormat { JPEG } { logo.jpg }
>   That's a bigger project, but not too much tied to Lout internals 

Actually, it is easy to do that entirely without internal lout support,
just by using a set of external programs (at least that's what I've been
doing whenever the need arose--not that I like using bitmapped images in
Postscript/PDF, but sometimes you don't have a choice).

Warning: This is going to be lengthy, but there's code for an
@IncludeBitmap operator at the end. Since knowing what you're doing
when embedding bitmap images in Postscript can severely affect the size
and quality of the output, there's this lengthy treatise leading up to
it.
To begin with, we need some form of compression, since otherwise most
images at 300+ dpi will consume an enormous amount of space. It is
important to understand that there are three versions or levels of
Postscript: Level 1 is the most basic set of operations, supported even
by very old Postscript devices. Level 2 is pretty much the current
de-facto standard, having an enhanced set of operators, colorspace
extensions, and the colorimage operator, and being supported by the
majority of devices and interpreters. Level 3 is still not as widely
supported, but has added operators to keep it on par with the new
features in PDF (as far as the imaging model is concerned).

For our purposes, it is important to know that Postscript Level 2 has a
number of built-in compression filters, namely:

  * /RunLengthEncode & /RunLengthDecode -- basic run length encoding
  * /LZWEncode & /LZWDecode -- Using the (patented) LZW algorithm
  * /DCTEncode & /DCTDecode -- The JPEG algorithm, "Discrete Cosine Transform"
  * /CCITTFaxEncode & /CCITTFaxDecode -- The fax format, using Huffman coding

With Postscript Level 3, we also have /FlateEncode & /FlateDecode, the
compression algorithm used by [g]zip. PDF also uses the /LZWEncode
format (and /FlateEncode for PDF 1.2 and later) to compress page
content, which is why PDF files are typically smaller than Postscript
files. For the same reason, applying gzip to Postscript files tends to
make them even smaller than PDF, while using gzip on PDF files usually
has little effect.
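
If you want to check the gzip claim against your own files, a rough
sketch like the following should do. The function name ps_gz_report is
made up for this example; it only assumes that gzip and wc are on your
path.

```shell
# Hypothetical helper: print a file's size in bytes before and after gzip -9.
ps_gz_report() {
  f=$1
  [ -r "$f" ] || { echo >&2 "ps_gz_report: cannot read $f"; return 1; }
  orig=$(wc -c <"$f")
  packed=$(gzip -9 -c "$f" | wc -c)
  # $(( )) also strips the padding that some wc implementations add.
  echo "$f: $((orig)) bytes, $((packed)) bytes gzipped"
}
```

Running "ps_gz_report doc.ps" and then "ps_gz_report doc.pdf" on the
same document gives a quick side-by-side comparison.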

Now, let's get back to compressing pictures. If you have access to
ImageMagick's "convert" program (see <URL:http://www.imagemagick.org/>,
where you will find binaries for almost any OS), you can simply convert
just about any image format to Postscript level 1 or level 2 (or to
other bitmapped formats). Level 1 output usually means that run-length
encoding is applied, which isn't all that great, but helps somewhat in
compressing pictures (simpler and more specialized programs are
"tiff2ps" and "pnmtops"). You will notice that at level 1 pictures are
still quite large. "convert" can also convert to Postscript level 2, and
if you live in a country where the LZW patent doesn't apply, you can
also compile "convert" with LZW support, and (for example) run

        convert -compress LZW image.png eps2:image.eps

There's a problem with using convert with eps2 as the target format;
namely, it will inherit the compression scheme from the source image.
Normally, this would be a good idea, since the creator of the image
usually has a good idea about how the image compresses well, except in
the case of PNG files, which use zip compression, which in turn
results in using the /FlateEncode filter. But as we've seen above,
/FlateEncode & /FlateDecode are level 3 features.
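
A quick way to see which filter ended up in a generated file is to
search it for the decode filter names. The two function names below are
made up for this example, and the check is only a heuristic: it looks
for the names anywhere in the file, so a name that merely appears in a
comment will also be reported.

```shell
# Hypothetical helper: list the decode filters named in an EPS file.
eps_filters() {
  # -a treats the file as text even if it holds binary image data;
  # -o prints only the matching part of each line.
  grep -a -o -E '/(RunLength|LZW|DCT|CCITTFax|Flate)Decode' "$1" | sort -u
}

# Returns 0 if no level 3 filter was found, 1 otherwise.
eps_is_level2_safe() {
  if eps_filters "$1" | grep -q '/FlateDecode'; then
    echo "$1: uses /FlateDecode (level 3 only)"
    return 1
  fi
  return 0
}
```

For instance, "eps_is_level2_safe image.eps" after the convert call
above tells you whether the result will need a level 3 interpreter.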

For better compression than /RunLengthEncode, our options are
unfortunately limited. LZW is patented, the CCITT Fax format is useful
only for B/W images, and /FlateEncode is level 3 only, as mentioned
above. However, /DCTEncode _will_ work with Postscript level 2, with the
only drawback being that it's not a lossless compression format.

Now, if you're happy with lossy compression, save your pictures in JPEG
format and grab jpeg2ps from <URL:http://www.pdflib.com/jpeg2ps/> or the
nearest CTAN archive. Jpeg2ps has the nice benefit that it will NOT
decompress and compress the picture (which would result in further
information loss), but will simply generate an EPS wrapper around the
image data (optionally applying a 7-bit safe ASCII encoding), since
Postscript can already handle the format.
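
If you have a whole directory of JPEGs, a small loop saves some typing.
This is only a sketch: the function name wrap_jpegs is made up, it
handles only files ending in ".jpg", and it calls jpeg2ps with no
options, writing the EPS next to the original.

```shell
# Hypothetical helper: wrap each *.jpg in a directory as <name>.eps.
wrap_jpegs() {
  dir=${1:-.}
  command -v jpeg2ps >/dev/null || { echo >&2 "jpeg2ps not found"; return 1; }
  for f in "$dir"/*.jpg; do
    [ -e "$f" ] || continue            # the glob matched nothing
    # jpeg2ps writes the EPS wrapper to standard output.
    jpeg2ps "$f" >"${f%.jpg}.eps" || return 1
  done
  return 0
}
```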

After that, on to lossless compression. As we discussed, /FlateEncode
or /LZWEncode would be ideal, but are either unsupported by Postscript
level 2 (i.e. the majority of printers) or hampered by patent problems.
However, JPEG encoding has this mysterious "quality" parameter. Setting
it to 100 (by using "convert -quality 100 input.tif output.jpg", for
instance), you will get an image that is for all practical purposes
(such as viewing and printing) all but lossless and still results in a
pretty decent compression factor (for 24-bit color depth often
comparable to what PNG gives you). You can then use jpeg2ps to get
fairly compact Postscript files from that. Do not use it as a storage
format, though--in most cases, there _is_ still information loss,
however imperceptible.

Appended are files to do all this automatically under UNIX. You can
encode images as postscript level 1, 2, or 3. Level 1 is the most
portable, but only gives you run length encoding. Level 2 uses the
method described above and works best for images with a 24-bit color
depth. Level 3 relies on what "convert" does, which usually means
/FlateEncode compression. There are many printers where the result is
unlikely to work, but it tends to give the best compression for
grayscale images and images with a color map up to 256 colors (such as
GIFs and indexed PNGs). Also, the latest versions of ps2pdf should have
no problems whatsoever with level 3 PS, provided all you need is an
intermediate format to eventually generate PDF (though you may want to
set ps2pdf's parameters for image processing manually).
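
The choice between the three levels boils down to a small decision
table, sketched below. The function name pick_strategy and the strategy
labels it prints are made up for illustration; they only summarize the
description above.

```shell
# Hypothetical helper: name the encoding strategy for a PostScript
# level (1, 2 or 3) and a source type ("jpeg" or anything else).
pick_strategy() {
  level=$1
  kind=$2
  if [ "$level" = "1" ]; then
    echo "runlength"      # plain convert output, most portable
  elif [ "$kind" = "jpeg" ]; then
    echo "wrap-jpeg"      # jpeg2ps wrapper, no recompression
  elif [ "$level" = "3" ]; then
    echo "flate"          # convert with /FlateEncode
  else
    echo "jpeg-q100"      # convert -quality 100, then jpeg2ps
  fi
}
```

So "pick_strategy 2 png" prints "jpeg-q100", while a JPEG source at
level 2 or 3 is simply wrapped as it is.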

I have tested the code on the systems that I have access to, but there
are of course no guarantees that the code will always work as
advertised, though I've done my best to ensure that it does. Note also
that the code is intended for bitmapped images. For vector formats such
as WMF, programs such as sketch <URL:http://sketch.sourceforge.net/> and
StarOffice will usually work better when converting to EPS. (Note that
StarOffice produces unnecessarily huge postscript files when working
with gradients.)

                                Reimer Behrends

---- @IncludeBitmap ----
# Usage: @IncludeBitmap { myfile.png }, @IncludeBitmap { myfile.jpg }, etc.
# Include in mydefs.lt or another place where definitions can go.
def @IncludeBitmap
  named @PostScriptLevel { 2 }
  body x
{
  def @Filter { "lout-bitmap2eps.sh" @FilterIn @FilterOut @FilterErr @PostScriptLevel }

  @IncludeGraphic { x }
}
---- lout-bitmap2eps.sh ----
#!/bin/bash
# To install:
# Fix the reference to bash in the first line if necessary
# Copy to a directory within your path
# chmod 755 lout-bitmap2eps.sh to make it executable
# Requires: identify, convert from http://www.imagemagick.org/
#           jpeg2ps from http://www.pdflib.com/jpeg2ps/
# Note that 'convert' may not support GIF files. In this case, storing them
# as indexed PNG files is usually a good alternative.
# This script generates *.eps and *.tmp files in the same directory
# as the original.
# This file is released under the Gnu General Public License.

if [ $# != 4 ]; then
  echo >&2 "Call '$0' as a lout @Filter, not from the command line."
  exit 1
fi

# Arguments, in the order given on the @Filter line of @IncludeBitmap.
INFILE=$1
OUTFILE=$2
ERRFILE=$3
PSLEVEL=$4

# The filter input holds the body of @IncludeBitmap, i.e. the file name.
if read file <"$INFILE" && test -n "$file"; then
  if [ ! -e "$file" ] ; then
    echo >"$ERRFILE" "@IncludeBitmap: $file not found."
    exit 1
  fi
  base=`echo "$file" | sed -e 's/[.][^.]*$//'`
  # Skip the conversion if the EPS file is newer than the image, but
  # still tell lout which file to include.
  if [ -e "$base.eps" -a "$base.eps" -nt "$file" ]; then
    echo "$base.eps" >"$OUTFILE"
    exit 0
  fi
  if [ "$PSLEVEL" = "1" ] ; then
    convert "$file" "$base.eps"
  elif identify "$file" | egrep 'format: JPEG' >/dev/null; then
    jpeg2ps "$file" >"$base.eps"
  else
    if [ "$PSLEVEL" = "3" ]; then
      # 'convert' doesn't directly support a level 3 EPS target format,
      # so we just use the EPS2 output filter and force /FlateEncode
      # compression. Also, the straight ps3 output appears to be broken,
      # anyway, therefore this workaround.
      convert -compress Zip "$file" "eps2:$base.eps"
    else
      # convert to the best possible JPEG quality and live with the increase
      # in file size we get.
      convert -quality 100 "$file" "jpeg:$base.tmp"
      jpeg2ps "$base.tmp" >"$base.eps"
      rm "$base.tmp"
    fi
  fi
  echo "$base.eps" >"$OUTFILE"
  exit 0
else
  echo >"$ERRFILE" "@IncludeBitmap has no parameter."
  exit 1
fi
