swftools-common

Re: [Swftools-common] Problem converting a pdf to swf


From: James White
Subject: Re: [Swftools-common] Problem converting a pdf to swf
Date: Tue, 29 Jun 2010 16:40:50 -0700

Greetings,
    Let me try once more and be clearer.  Below is the output from a set of commands.

    If I convert the file "1-C.pdf" using the command below, the resulting swf file contains the text from the OCR layer in the PDF:
pdf2swf 1-C.pdf 1-C.swf -T 9 -f
NOTICE  File contains pbm pictures
.......
NOTICE  Writing SWF file 1-C.swf

    I can extract the strings from the resulting swf, and they include all the OCR'ed strings:
swfstrings 1-C.swf > 1-C.txt
wc 1-C.txt
 2236  2236 14033 1-C.txt
    However, the image portion of the resulting swf file is not correct:
swfrender 1-C.swf -o 1-C.png

    I can create a swf file with a correct image using the command below:

pdf2swf 1-C.pdf 1-C-O1.swf -T 9 -f -O1
NOTICE  processing PDF page 1 (612x780:0:0) (move:0:0)
.......
NOTICE  Writing SWF file 1-C-O1.swf
    However, the resulting swf file doesn't have any strings in it:
swfstrings 1-C-O1.swf > 1-C-O1.txt
wc 1-C-O1.txt
0 0 0 1-C-O1.txt
    I'm looking for a way to get the correct image portion from "1-C-O1.swf" while keeping the strings that are in "1-C.swf".
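
    For anyone who wants to reproduce the comparison, here is a rough sketch that runs both conversions and reports how many strings each swf retains. The helper names (count_strings, convert_and_count) and the guard around missing tools are only illustrative and not part of swftools; the pdf2swf flags are the same ones used in the commands above.

```shell
#!/bin/sh
# Sketch: convert a PDF with and without -O1, then compare how many
# text strings each SWF retains. Helper names are hypothetical.

# Print the number of non-empty lines in a swfstrings dump.
# grep -c exits non-zero when the count is 0, hence the "|| true".
count_strings() {
    grep -c . "$1" || true
}

# Convert $1 (PDF) to $2 (SWF), passing any remaining arguments to
# pdf2swf, then dump the strings to "$2.txt" and report the count.
convert_and_count() {
    pdf="$1"; swf="$2"; shift 2
    pdf2swf "$pdf" "$swf" -T 9 -f "$@" || return 1
    swfstrings "$swf" > "$swf.txt"
    echo "$swf: $(count_strings "$swf.txt") strings"
}

# Only run the conversions when the tools and the input file exist.
if command -v pdf2swf >/dev/null 2>&1 &&
   command -v swfstrings >/dev/null 2>&1 &&
   [ -f 1-C.pdf ]; then
    convert_and_count 1-C.pdf 1-C.swf
    convert_and_count 1-C.pdf 1-C-O1.swf -O1
fi
```

    A count of zero for the -O1 output next to a non-zero count for the plain conversion reproduces the problem described above.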

    I have put all the files in : http://dl.dropbox.com/u/5051173/cmsd.zip


   Thanks again for your time and effort.

Jim White
206-234-4832


On 6/28/2010 12:54 PM, Chris Pugh wrote:
2010/6/28 James White <address@hidden>:
  
Greetings,
     I have a similar problem (I think) and I'm hoping someone can give me
some insight into fixing it.  I have a number of PDF documents that I need to
convert (on demand) to swf and view.  I'm using FlexPaper
(http://flexpaper.devaldi.com/) to view swf files.  An important feature is
to be able to search the document.  A number of these PDF documents are
scanned documents with OCR'ed text in them.  If I use "pdf2swf input.pdf
-o output.swf " all is well.  I can view the embedded OCR results using
"swfstrings" and the resulting swf file is searchable in the FlexPaper
viewer.   However, there are a number of PDF files that don't render properly
without using the "-O1" or "-s poly2bitmap" switch.
    
Take a step back.  Are these PDFs themselves properly searchable?   If whatever
you are using to create the PDFs can't make sense of the OCR'd fonts, then
they'll simply be rendered as graphics in the PDF.   Test the PDF with some of
the command-line utilities from the xpdf distribution, such as pdftotext.  See
what happens.

You may also have fonts that xpdf can't make sense of.  Using the -vvv (verbose)
option during conversion and examining the output should give you a clue as to
what is happening.
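
As a rough sketch of both checks, something along these lines could be used. The has_text helper, the convert.log file name, and the grep for "font" are just illustrative choices, not anything swftools or xpdf provide; input.pdf and output.swf are the names from the command quoted above.

```shell
#!/bin/sh
# Sketch: check a PDF for an extractable text layer, then capture
# verbose pdf2swf output for inspection. Helper name is hypothetical.

# Succeeds if stdin contains at least one non-whitespace character.
has_text() {
    grep -q '[^[:space:]]'
}

# pdftotext writes to stdout when the output file name is "-".
if command -v pdftotext >/dev/null 2>&1 && [ -f input.pdf ]; then
    if pdftotext input.pdf - | has_text; then
        echo "input.pdf: text layer found"
    else
        echo "input.pdf: no extractable text"
    fi
fi

# Save the verbose conversion log, then show the lines mentioning fonts.
if command -v pdf2swf >/dev/null 2>&1 && [ -f input.pdf ]; then
    pdf2swf input.pdf -o output.swf -vvv > convert.log 2>&1
    grep -i font convert.log || true
fi
```

If pdftotext finds no text, the problem is in the PDF itself rather than in the conversion.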

HTH

Regards,


Chris.
