[h5md-user] [Provenance] Attaching metadata to PDF files
From: Peter Colberg
Subject: [h5md-user] [Provenance] Attaching metadata to PDF files
Date: Thu, 23 May 2013 17:19:40 -0400
User-agent: Mutt/1.5.21 (2010-09-15)
Hi all,
Sorry for the following off-topic post, but I have to share this tidbit.
Recently, I was looking for a way to attach metadata to a plot file in
PDF format, such as a list of files used to generate the plot, a set
of plot parameters, or even the plot script itself.
PDF has two different ways of storing metadata: an info dictionary
with a predefined set of attributes, or additional streams in XMP
format. The former can be modified using the PDF backend of
matplotlib [1], but does not allow arbitrary fields. The latter,
free-form metadata, is inaccessible from matplotlib.
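As a point of comparison, the info-dictionary route could look roughly like the sketch below. It assumes matplotlib's PdfPages class and its infodict() method [1]; the function name save_plot_with_info and the example values are hypothetical and only serve as illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no display needed
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages


def save_plot_with_info(filename, info):
    """Save a trivial figure to a PDF and set entries of its info dictionary.

    `info` should only use the predefined keys of the PDF info dictionary,
    such as "Title", "Author", "Subject" or "Keywords".
    """
    with PdfPages(filename) as pdf:
        fig, ax = plt.subplots()
        ax.plot([0, 1], [0, 1])  # placeholder data, stands in for the real plot
        pdf.savefig(fig)
        plt.close(fig)
        # infodict() returns a modifiable dict that is written on close
        pdf.infodict().update(info)
```

For example, save_plot_with_info("plot.pdf", {"Title": "Velocity histogram", "Keywords": "nbin=50"}) would store the plot parameters as a keyword string, which illustrates the limitation: everything has to be squeezed into the predefined fields.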
But it turns out that PDF supports file attachments [2].
[1] http://matplotlib.org/api/backend_pdf_api.html#matplotlib.backends.backend_pdf.PdfPages.infodict
[2] http://blogs.adobe.com/insidepdf/2010/11/pdf-file-attachments.html
So I appended this small snippet to my plot script:
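import os
import shutil
import subprocess
import tempfile
import h5py as h5  # assumption: "h5" refers to the h5py package

# args, fig, mean, var, T, bins and hist are defined earlier in the plot script
path = tempfile.mkdtemp()
try:
    fn = os.path.join(path, os.path.basename(args.output))
    # save the plot itself to a temporary PDF file
    fig.savefig(fn + ".pdf")
    # store plot parameters as attributes and numerical data as datasets
    with h5.File(fn + ".h5", "w") as f:
        f.attrs["input"] = args.input
        f.attrs["nbin"] = args.nbin
        f.attrs["range"] = args.range
        f.attrs["moment"] = (mean, var)
        f.attrs["temperature"] = T
        f.create_dataset("bins", data=bins)
        f.create_dataset("hist", data=hist)
    # keep a copy of the plot script alongside the data
    shutil.copy(__file__, fn + ".py")
    # attach data file and script to the PDF, writing the final output file
    subprocess.check_call(["pdftk", fn + ".pdf", "attach_files",
                           fn + ".h5", fn + ".py", "output", args.output])
finally:
    # remove the temporary directory even if one of the steps fails
    shutil.rmtree(path)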
With the pdftk tool, the resulting PDF file contains not only the
plot itself, but also the numerical data, the input filenames, plot
parameters, and the source code. When needed, the metadata can later
be extracted using `pdftk <filename> unpack_files'.
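The extraction step could be wrapped in a small helper, sketched here under the assumption that pdftk is installed and on the PATH. The function names unpack_command and unpack_attachments are hypothetical; the optional "output <directory>" argument to unpack_files is taken from the pdftk manual.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def unpack_command(pdf_path, out_dir):
    """Build the pdftk command line that extracts all attachments of a PDF."""
    return ["pdftk", str(pdf_path), "unpack_files", "output", str(out_dir)]


def unpack_attachments(pdf_path):
    """Extract attachments into a new temporary directory and list them."""
    if shutil.which("pdftk") is None:
        raise RuntimeError("pdftk not found on PATH")
    out_dir = Path(tempfile.mkdtemp())
    subprocess.check_call(unpack_command(pdf_path, out_dir))
    # the caller is responsible for removing out_dir when done
    return sorted(out_dir.iterdir())
```

Calling unpack_attachments("plot.pdf") on a file produced as above should return the paths of the attached .h5 and .py files, which can then be opened with the usual tools.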
Regards,
Peter