From: James White
Subject: Re: [Swftools-common] Problem converting a pdf to swf
Date: Tue, 29 Jun 2010 16:40:50 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.10) Gecko/20100512 Lightning/1.0b1 Thunderbird/3.0.5 ThunderBrowse/3.2.8.1

Greetings,

Let me try one more time, and I'll try to be clearer. Below is the output from a set of commands.

If I convert the file "1-C.pdf" using the command below, the resulting swf file contains the text from the OCR layer in the PDF.

    pdf2swf 1-C.pdf 1-C.swf -T 9 -f

I can extract the strings in the resulting swf, and it contains all the OCR'ed strings.

    swfstrings 1-C.swf > 1-C.txt

However, the image portion of the resulting swf file is not correct.

    swfrender 1-C.swf -o 1-C.png

I can create a swf file with a correct image using the command below:

    pdf2swf 1-C.pdf 1-C-O1.swf -T 9 -f -O1

However, the resulting swf file doesn't have any strings in it.

    swfstrings 1-C-O1.swf > 1-C-O1.txt

I'm looking for a way to get the correct image portion from "1-C-O1.swf" while keeping the strings that are in "1-C.swf".

I have put all the files in:
http://dl.dropbox.com/u/5051173/cmsd.zip

Thanks again for your time and effort.

Jim White
206-234-4832

Copyright 2010, Sage Research and Consulting. All rights reserved. Unauthorized disclosure prohibited.

Notice: This communication may contain privileged or other confidential information. If you are not the intended recipient, or believe that you have received this communication in error, please do not print, copy, retransmit, disseminate, or otherwise use the information. Also, please indicate to the sender that you have received this email in error, and delete the copy you received. Thank you.

On 6/28/2010 12:54 PM, Chris Pugh wrote:
> 2010/6/28 James White <address@hidden>:
>> Greetings,
>>
>> I have a similar problem (I think) and I'm hoping someone can give me
>> some insight into fixing it. I have a number of PDF documents that I
>> need to convert (on demand) to swf and view. I'm using FlexPaper
>> (http://flexpaper.devaldi.com/) to view swf files. An important
>> feature is to be able to search the document. A number of these PDF
>> documents are scanned documents with OCR'ed text in them. If I use
>> "pdf2swf input.pdf -o output.swf" all is well. I can view the embedded
>> OCR results using "swfstrings" and the resulting swf file is
>> searchable in the FlexPaper viewer. However, there are a number of PDF
>> files that don't render properly without using the "-O1" or
>> "-s poly2bitmap" switch.
>
> Take a step back. Are these PDFs themselves properly searchable? If
> whatever you are using to create the PDFs can't make sense of the
> OCR'd fonts, then they'll simply be rendered as graphics in the PDF.
>
> Test the PDF with some of the command-line utilities from the xpdf
> distribution, such as pdftotext. See what happens. You may also have
> fonts that xpdf can't make sense of. Using the -vvv verbose option
> during conversion, and examining the output, should give you a clue
> as to what is happening.
>
> HTH
>
> Regards,
>
> Chris.
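
The check Chris suggests (running pdftotext on the PDF before converting it) could be scripted roughly as follows. This is only a minimal sketch: it assumes the xpdf `pdftotext` utility is on the PATH, and the function names and the 20-character threshold are illustrative assumptions, not part of swftools or xpdf.

```python
import subprocess
import sys


def extract_text(pdf_path):
    """Return the text pdftotext finds in pdf_path.

    Passing "-" as the output file asks pdftotext to write to stdout.
    Assumes the pdftotext binary is installed and on the PATH.
    """
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def looks_searchable(text, min_chars=20):
    """Heuristic: the PDF probably has a usable text layer if enough
    non-whitespace characters were extracted.

    min_chars is an arbitrary, hypothetical threshold.
    """
    visible = "".join(text.split())  # drop spaces, newlines, form feeds
    return len(visible) >= min_chars


if __name__ == "__main__":
    # Usage: python check_pdf.py 1-C.pdf [more.pdf ...]
    for path in sys.argv[1:]:
        verdict = "text layer found" if looks_searchable(extract_text(path)) else "no usable text"
        print(f"{path}: {verdict}")
```

If a file reports no usable text here, the missing strings are a property of the PDF itself and not of the pdf2swf options.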