Re: [AUCTeX] Re: tex2text
From: Andreas Roehler
Subject: Re: [AUCTeX] Re: tex2text
Date: Tue, 14 Nov 2006 11:26:03 +0100
User-agent: Thunderbird 1.5.0.4 (X11/20060516)
Ralf Angeli wrote:
> * Andreas Roehler (2006-11-11) writes:
>> Couldn't find a convenient way to transfer tex-files
>> into plain text.
> http://www.tex.ac.uk/cgi-bin/texfaq2html?label=toascii
OK, thanks.
Unfortunately:
- dvi2tty
does not rejoin words split by the TeX formatter,
i.e. soft (syllable) hyphens are still visible as
Char: - (45, #o55, #x2d)
and long lines stay split.
Also, the underline of the backaddress isn't removed.
- catdvi
fails with: bytesex.c:59: u_readbigendiannumber: Assertion `count <= 4' failed.
- crudetype
seems difficult to install; it requires `Tangle'.
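The hyphenation problem with dvi2tty output could, as a rough sketch, be patched up in Emacs Lisp after the conversion. The function name `my-dvi2tty-join-hyphens` is hypothetical, and the heuristic is an assumption: a hyphen at the end of a line counts as a soft hyphen only when a lowercase letter stands directly before it and a lowercase letter starts the next line. Genuine compounds that happen to break at their hyphen would be joined too, so the result still needs proofreading.

```elisp
(defun my-dvi2tty-join-hyphens (text)
  "Return TEXT with words split by a line-end hyphen joined again.
Heuristic sketch: joins only when a lowercase letter stands on both
sides of the hyphen and the line break."
  (let ((case-fold-search nil))         ; make [[:lower:]] case-sensitive
    (replace-regexp-in-string
     "\\([[:lower:]]\\)-[ \t]*\n[ \t]*\\([[:lower:]]\\)"
     "\\1\\2"                           ; keep both letters, drop hyphen and break
     text
     t)))                               ; FIXEDCASE: do not adapt letter case

;; Example call:
;; (my-dvi2tty-join-hyphens "for-\n  matter")
```

Lines that end in a hyphen followed by an uppercase letter on the next line are left alone on purpose.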
Since a solution, at least for TeX documents built with
dinbrief, did not seem far away, I have got this far:
;;; tex2text.el --- Delete TeX formatting elements
;; Preserve address, signature, phone etc. at a convenient place.
;; Copyright (C) 2006 Andreas Roehler
;; Author: Andreas Roehler <address@hidden>
;; Keywords: tex, wp
;; This file is free software; you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 2, or (at your option)
;; any later version.
;; This file is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;; GNU General Public License for more details.
;; You should have received a copy of the GNU General Public License
;; along with GNU Emacs; see the file COPYING. If not, write to
;; the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
;; Boston, MA 02110-1301, USA.
;;; Commentary:
;; Delete TeX formatting elements.  Preserve address, signature,
;; phone etc. at a convenient place.  Works with code employing
;; `dinbrief' at the moment.  Judging from how it runs, it should
;; be easy to adapt the script to other packages in use.
;; The following example letter should make it easy to check the result:
;; \documentclass[12pt]{dinbrief}
;; \usepackage{geometry}
;; \geometry{a4paper,bottom=2cm,top=2cm,left=35mm}
;; \usepackage[utf8x]{inputenc}
;; \usepackage[T1]{fontenc}
;; \usepackage{german}
;; \nowindowrules
;; \nowindowtics
;; % \stdaddress{JOE ANYONE}
;; \address{JOE ANYONE\\
;; ANYSTREET. 0 \\
;; 10407 Berlin\\}
;; \signature{JOE ANYONE}
;; % \place{XY-Stadt}
;; \begin{document}
;; \phone{+4930}{60000003}
;; \begin{letter}{An das\\
;; Finanzgericht Berlin\\
;; 32. Senat\\
;; Postfach 7300\\[\medskipamount]
;; {\bf 10863 Berlin}}
;; \backaddress{JOE ANYONE, ANYSTREET. 0, 11067 Berlin}
;; \yourmail{7 XY 544/93 u.a.}
;; % \sign{123456}
;; \subject{JOE ANYONE./. Finanzamt Friedrichshain/Prenzlauer Berg}
;; \opening{Sehr geehrter Herr Vorsitzender Richter K.,}
;;
;; bitte gestatten Sie festzustellen
;;
;; \begin{quote}
;; Der Senat hatte den Antrag des
;; Erschienenen auf Aussetzung der Vollziehung
;; mit der Begründung zurückgewiesen, er sei unzulässig, da
;; er nicht unterschrieben worden ist.
;;
;; Dies ist ausweislich der heute stattgefundenen
;; Akteneinsicht falsch. Die Unterschrift befindet sich
;; deutlich sichtbar auf Bl. 61 der Klage und
;; Antragsschrift.
;;
;; ...
;; \end{quote}
;;
;; Wie aber ist das Geschehen zu erklären?
;;
;; Ich habe das Finanzamt über die
;; Unrichtigkeit der Steuerschätzung informiert, u. a. mit
;; Schreiben vom 27. Oktober 1999 an Frau M. Dort
;; heißt es:
;;
;; \glqq
;; ..., ist die irrtümliche Steuerschätzung nach den Unterlagen des
;; Finanzamtes zu korrigieren.
;; \grqq{}
;;
;; Es kann nicht angehen, daß fehlerhafte Amtshandlungen
;; dem Opfer angelastet werden.
;;
;;
;; \closing{Mit freundlichen Grüßen}
;; % \ps{Wir bitten um schnelle Erledigung.}
;; % \cc{Kopie an:}
;; % \encl{Abschrift der Urkunde}
;; \end{letter}
;; \end{document}
;;; Code:
(defun tex2text ()
"Delete tex-formatters, but preserve address, signature, phone etc."
(interactive "*")
(let ((backadress-flag nil))
(goto-char (point-min))
(while (re-search-forward "^%+.*\n" nil t 1)
(replace-match ""))
(goto-char (point-min))
;; ;; {\bf 10863 Berlin}}
(while (re-search-forward "^{\\\\[[:graph:]]+ \\([^}]+\\)}" nil t 1)
(replace-match (match-string 1)))
(goto-char (point-min))
;; [\medskipamount]
(while (re-search-forward "\\[\\\\[^\]]+]" nil t 1)
(replace-match ""))
(goto-char (point-min))
;; ^\\\\begin{letter}
(while (re-search-forward "^\\\\begin{letter}" nil t 1)
(replace-match "\n")
;; `kill-list-atpt' is not part of stock Emacs; it has to be provided
;; by a separate library.
(kill-list-atpt)
(insert (replace-regexp-in-string "[{}\\]" "" (format "%s" (car kill-ring)))))
(while (re-search-forward "^\\\\\\(end\\){quote}" nil t 1)
(replace-match "\"")
(forward-char -1)
(delete-region (point) (progn (skip-chars-backward " \t\n\f")
(point))))
(goto-char (point-min))
(while (re-search-forward "^\\\\\\(begin\\){quote}" nil t 1)
(replace-match "\"")
(delete-region (point) (progn (skip-chars-forward " \t\n\f")
(point))))
(goto-char (point-min))
(while (re-search-forward "^\\\\\\(glqq\\)" nil t 1)
(if (in-string-p)
(replace-match "\'")
(replace-match "\""))
(delete-region (point) (progn (skip-chars-forward " \t\n\f")
(point))))
(goto-char (point-min))
(while (re-search-forward "^\\\\\\(grqq\\){}" nil t 1)
(if (in-string-p)
(replace-match "\'")
(replace-match "\""))
(forward-char -1)
(delete-region (point) (progn (skip-chars-backward " \t\n\f")
(point))))
(goto-char (point-min))
(when (re-search-forward "^\\\\backaddress{\\([^}]+\\)}" nil t 1)
(setq backadress-flag t)
(save-excursion
(goto-char (point-min))
(insert (replace-regexp-in-string "[{}]" "" (match-string 1)))
(newline))
(kill-region (point) (line-beginning-position))
(newline))
(goto-char (point-min))
(when (re-search-forward "^\\\\address" nil t 1)
(if backadress-flag
(progn
(save-excursion
(kill-region (point) (progn (forward-list) (point))))
(kill-region (point) (line-beginning-position)))
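;; The pattern above has no subexpression 2, so `match-string' would
;; return nil and `replace-match' would signal an error; just drop
;; the command name instead.
(replace-match "")))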
(goto-char (point-min))
(while (re-search-forward "^\\\\[^[{ ]+$" nil t 1)
(replace-match ""))
(goto-char (point-min))
(while
(re-search-forward
"\\(^\\\\[^\[{]+\\)[\[{]\\{1\\}\\([^]}\n]+[\]}]\\{1\\}.*\\)$" nil t 1)
(cond
((or (string= (match-string 1) "\\phone")
(string= (match-string 1) "\\yourmail")
(string= (match-string 1) "\\subject")
(string= (match-string 1) "\\closing"))
(replace-match (replace-regexp-in-string "[{}]" "" (match-string 2)))
(newline))
((string= (match-string 1) "\\opening")
(replace-match (replace-regexp-in-string "[{}]" "" (match-string 2))))
((string= (match-string 1) "\\signature")
(save-excursion
(goto-char (point-max))
(insert (replace-regexp-in-string "[{}]" "" (match-string 2))))
(kill-region (point) (line-beginning-position)))
(t
(replace-match "\n"))))
(goto-char (point-min))
(while (re-search-forward "[\\]+" nil t 1)
(replace-match ""))))
(provide 'tex2text)
;;; tex2text.el ends here
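To check a buffer-modifying command such as `tex2text` against the example letter without touching the original file, one option is to run it on a copy of the text in a temporary buffer. The following is only a minimal sketch: `my-call-on-string` and `my-strip-tex-comments` are hypothetical names, and the second function is a stand-in that mirrors just the first loop of the script so that the example stays self-contained.

```elisp
(defun my-call-on-string (fn string)
  "Insert STRING into a temporary buffer, call FN there, return the text."
  (with-temp-buffer
    (insert string)
    (funcall fn)
    (buffer-string)))

(defun my-strip-tex-comments ()
  "Delete lines starting with `%' in the current buffer.
Stand-in that mirrors the first loop of `tex2text'."
  (goto-char (point-min))
  (while (re-search-forward "^%+.*\n" nil t)
    ;; Literal replacement, so backslashes are never interpreted.
    (replace-match "" t t)))

;; Example call:
;; (my-call-on-string #'my-strip-tex-comments
;;                    "% \\place{XY-Stadt}\n\\begin{document}\n")
```

Once `tex2text` and the library providing `kill-list-atpt` are loaded, passing `#'tex2text` as the function argument should work the same way, with the caveat that the kill commands it uses will still add entries to the kill ring.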
Comments welcome.
Cheers
Andreas