bug-gnulib

Re: a good pdf-to-text converter? [Re: POSIX 2008 available


From: Bruno Haible
Subject: Re: a good pdf-to-text converter? [Re: POSIX 2008 available
Date: Wed, 10 Dec 2008 12:12:18 +0100
User-agent: KMail/1.9.9

Hi Jim,

> Do any of you know of a pdf-to-text converter that is better than
> pdftotext?  pdftotext does not preserve line breaks, table formatting,
> displayed code, etc.  Even the official .txt version of the previous
> release of POSIX had many conversion-artifact errors.

I would use a good html-to-text converter. The best I know of is 'w3m'.
Example:
  $ w3m -dump http://www.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html
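
To convert more than one chapter, the same command could be wrapped in a
small shell function. This is only a minimal sketch: the helper names
(chapter_url, text_name, dump_chapter) and the W3M and WIDTH variables are
hypothetical, the base URL is taken from the example above, and -cols is
w3m's option for setting the rendering width.

```shell
#!/bin/sh
# Base URL taken from the w3m example above.
BASE_URL="http://www.opengroup.org/onlinepubs/9699919799/basedefs"

# chapter_url FILE: print the full URL of one HTML chapter file.
chapter_url() {
  printf '%s/%s\n' "$BASE_URL" "$1"
}

# text_name FILE: print the output name, e.g. V1_chap07.html -> V1_chap07.txt
text_name() {
  printf '%s.txt\n' "${1%.html}"
}

# dump_chapter FILE: render one chapter to plain text in the current directory.
# W3M (hypothetical override) selects the program to run; WIDTH (hypothetical)
# sets the number of columns passed to -cols, defaulting to 80.
dump_chapter() {
  ${W3M:-w3m} -dump -cols "${WIDTH:-80}" "$(chapter_url "$1")" > "$(text_name "$1")"
}
```

After sourcing these definitions, `dump_chapter V1_chap07.html` would write
V1_chap07.txt; other chapter file names would have to be looked up on the
site, since only this one appears here.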

Bruno




