Re: a good pdf-to-text converter? [Re: POSIX 2008 available
From: Bruno Haible
Subject: Re: a good pdf-to-text converter? [Re: POSIX 2008 available
Date: Wed, 10 Dec 2008 12:12:18 +0100
User-agent: KMail/1.9.9

Hi Jim,
> Do any of you know of a pdf-to-text converter that is better than
> pdftotext? pdftotext does not preserve line breaks, table formatting,
> displayed code, etc. Even the official .txt version of the previous
> release of POSIX had many conversion-artifact errors.
I would use a good html-to-text converter. The best I know of is 'w3m'.
Example:
$ w3m -dump http://www.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html
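If you want to keep the result in a file, something like the following could work. This is only a sketch: the helper names txt_name and dump_page are made up for illustration, and the 80-column width passed via w3m's -cols option is an assumption, not something required by the example above.

```shell
# Hypothetical helper: derive a .txt file name from a page URL,
# e.g. .../V1_chap07.html becomes V1_chap07.txt.
txt_name() {
  base=${1##*/}                      # drop everything up to the last slash
  printf '%s\n' "${base%.html}.txt"  # swap the .html suffix for .txt
}

# Hypothetical helper: render one HTML page as plain text with w3m
# and save it in the current directory. The width of 80 is an assumption.
dump_page() {
  w3m -dump -cols 80 "$1" > "$(txt_name "$1")"
}
```

Calling dump_page with the URL above would then write the rendered chapter to V1_chap07.txt.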
Bruno