Re: [AUCTeX] Re: tex2text
From: Andreas Roehler
Subject: Re: [AUCTeX] Re: tex2text
Date: Mon, 27 Nov 2006 21:00:26 +0100
User-agent: Thunderbird 1.5.0.4 (X11/20060516)
Couldn't find a convenient way to convert TeX files
into plain text.
http://www.tex.ac.uk/cgi-bin/texfaq2html?label=toascii
AR> OK, thanks.
AR> Unfortunately:
AR> - dvi2tty
AR> doesn't handle words split by the TeX formatter,
AR> i.e. soft or syllable hyphens are still visible as
AR> Char: - (45, #o55, #x2d)
DVI doesn't distinguish between soft and hard hyphens unless you have
a special font.
So it seems a common problem.
Here is `dinbrief2text' as I use it. How far is it
from a more general solution?
__
Andreas Roehler
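To illustrate why the hyphenation problem discussed above is hard to solve after the fact, here is a minimal heuristic sketch that rejoins words split at a line end in plain-text output. The function name `my-join-line-end-hyphens' is hypothetical and not part of the original message. Because the output carries no information about whether a hyphen was soft or hard, the heuristic joins both kinds, which is exactly the limitation described.

```elisp
;; Heuristic sketch only; `my-join-line-end-hyphens' is a hypothetical name.
(defun my-join-line-end-hyphens (text)
  "Return TEXT with words rejoined where a hyphen ends a line.
A letter, a hyphen, a newline, optional indentation and another
letter are collapsed into the two letters.  Soft and hard hyphens
cannot be told apart, so a genuine compound is joined as well."
  (replace-regexp-in-string
   "\\([[:alpha:]]\\)-\n[ \t]*\\([[:alpha:]]\\)"
   "\\1\\2"
   text
   t))  ; FIXEDCASE: do not adjust the case of the replacement

;; Usage: both the syllable hyphen and the hard hyphen disappear.
(my-join-line-end-hyphens "for-\nmatter and well-\nknown")
```

The second word shows the cost of the heuristic: "well-known" loses a hyphen that belonged to the text.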
;;; Commentary: Delete TeX formatting
;;; elements. Preserve address, signature, phone
;;; etc. at a convenient place. Works with code
;;; using `dinbrief' at the moment. After watching
;;; it run, it should be easy to adapt the script
;;; to other packages in use.
;; The following example document should make it easy to check:
;; \documentclass[12pt]{dinbrief}
;; \usepackage{geometry}
;; \geometry{a4paper,bottom=2cm,top=2cm,left=35mm}
;; \usepackage[utf8x]{inputenc}
;; \usepackage[T1]{fontenc}
;; \usepackage{german}
;; \nowindowrules
;; \nowindowtics
;; % \stdaddress{JOE ANYONE}
;; \address{JOE ANYONE\\
;; ANYSTREET. 0 \\
;; 10407 Berlin\\}
;; \signature{JOE ANYONE}
;; % \place{XY-Stadt}
;; \begin{document}
;; \phone{+4930}{60000003}
;; \begin{letter}{An das\\
;; Final Judgement\\
;; 32. Senat\\
;; Postfach 7300\\[\medskipamount]
;; {\bf 10863 Berlin}}
;; \backaddress{JOE ANYONE, ANYSTREET. 0, 11067 Berlin}
;; \yourmail{7 XY 544/93 u.a.}
;; % \sign{123456}
;; \subject{Free Speech in Europe}
;; \opening{Sehr geehrter Herr Vorsitzender Richter K.,}
;;
;; It's not just § 130 German Legal Code, there are a lot of laws
;; made against free speech in Europe nowadays.
;;
;; \begin{quote}
;;
;; As philosopher Ludwig Wittgenstein put it:
;;
;; \glqq What we cannot speak about we must pass over
;; in silence\grqq{} . ... \end{quote}
;;
;; Is this the intended result?
;;
;; \closing{Regards}
;; % \ps{Wir bitten um schnelle Erledigung.}
;; % \cc{Kopie an:}
;; % \encl{Abschrift der Urkunde}
;; \end{letter}
;; \end{document}
;;; Code:
;; Note: `kill-list-atpt' and `nur-eine-leerzeile' are called below
;; but are not defined in this message.
(defun tex2text ()
  "Delete tex-formatters, but preserve address, signature, phone etc."
  (interactive)
  (let ((oldbuf (buffer-name))
        (newbuf (concat (substring (buffer-name) 0
                                   (string-match "\\." (buffer-name)))
                        ".txt"))
        (backadress-flag nil))
    ;; Work on a copy of the source in a buffer named after it.
    (set-buffer (get-buffer-create newbuf))
    (switch-to-buffer (current-buffer))
    (erase-buffer)
    (insert-buffer-substring oldbuf)
    (goto-char (point-min))
    (while (re-search-forward "\\\\dots{}" nil t 1)
      (replace-match "..."))
    ;; Drop comment lines.
    (goto-char (point-min))
    (while (re-search-forward "^%+.*\n" nil t 1)
      (replace-match ""))
    (goto-char (point-min))
    ;; {\bf 10863 Berlin}}
    (while (re-search-forward "^{\\\\[[:graph:]]+ \\([^}]+\\)}" nil t 1)
      (replace-match (match-string 1)))
    (goto-char (point-min))
    ;; [\medskipamount]
    (while (re-search-forward "\\[\\\\[^\]]+]" nil t 1)
      (replace-match ""))
    (goto-char (point-min))
    ;; \begin{letter} and \subject: keep the argument, minus braces
    ;; and backslashes, on lines of its own.
    (while (re-search-forward "^[ \t]*\\\\begin{letter}\\|\\\\subject"
                              nil t 1)
      (replace-match "\n")
      (kill-list-atpt)
      (insert (replace-regexp-in-string
               "[{}\\]" "" (format "%s" (car kill-ring))))
      (insert "\n"))
    ;; Restart from the top so that a quote preceding the last
    ;; \subject is not skipped.
    (goto-char (point-min))
    (while (re-search-forward "^[ \t]*\\\\\\(end\\){quote}" nil t 1)
      (replace-match "\"")
      (forward-char -1)
      (delete-region (point) (progn (skip-chars-backward " \t\n\f")
                                    (point))))
    (goto-char (point-min))
    (while (re-search-forward "^[ \t]*\\\\\\(begin\\){quote}" nil t 1)
      (replace-match "\"")
      (delete-region (point) (progn (skip-chars-forward " \t\n\f")
                                    (point))))
    (goto-char (point-min))
    ;; \glqq ... \grqq{}: single quotes inside a string, double
    ;; quotes otherwise.
    (while (re-search-forward "\\\\\\(glqq\\)" nil t 1)
      (if (save-match-data (in-string-p))
          (progn
            (replace-match "\'")
            (delete-region (point) (progn (skip-chars-forward " \t\n\f")
                                          (point)))
            (re-search-forward "\\\\\\(grqq\\){}" nil t 1)
            (replace-match "\'")
            (forward-char -1)
            (delete-region (point) (progn (skip-chars-backward " \t\n\f")
                                          (point))))
        (replace-match "\"")
        (delete-region (point) (progn (skip-chars-forward " \t\n\f") (point)))
        (re-search-forward "\\\\\\(grqq\\){}" nil t 1)
        (replace-match "\"")
        (forward-char -1)
        (delete-region (point) (progn (skip-chars-backward " \t\n\f")
                                      (point)))))
    (goto-char (point-min))
    ;; Move the back address to the top of the buffer.
    (when (re-search-forward "^\\\\backaddress{\\([^}]+\\)}" nil t 1)
      (setq backadress-flag t)
      (save-excursion
        (goto-char (point-min))
        (insert (replace-regexp-in-string "[{}]" "" (match-string 1)))
        (newline))
      (kill-region (point) (line-beginning-position))
      (newline))
    (goto-char (point-min))
    (when (re-search-forward "^\\\\address" nil t 1)
      (if backadress-flag
          (progn
            (save-excursion
              (kill-region (point) (progn (forward-list) (point))))
            (kill-region (point) (line-beginning-position)))
        ;; The regexp has no subgroups, so (match-string 2) was always
        ;; nil here; drop only the command name and keep its argument.
        (replace-match "")))
    (goto-char (point-min))
    ;; Commands without arguments on a line of their own.
    (while (re-search-forward "^\\\\[^[{ ]+$" nil t 1)
      (replace-match ""))
    (goto-char (point-min))
    ;; Commands with an argument: group 1 is the command name,
    ;; group 2 the rest of the line.
    (while
        (re-search-forward
         "\\(^\\\\[^\[{]+\\)[\[{]\\{1\\}\\([^]}\n]+[\]}]\\{1\\}.*\\)$" nil t 1)
      (cond
       ((string= (match-string 1) "\\ps")
        (save-excursion
          (goto-char (point-max))
          (newline 2)
          (insert "PS ")
          (insert (replace-regexp-in-string "[{}]" "" (match-string 2))))
        (kill-region (point) (line-beginning-position)))
       ((or (string= (match-string 1) "\\phone")
            (string= (match-string 1) "\\yourmail")
            (string= (match-string 1) "\\textbf")
            (string= (match-string 1) "\\closing"))
        (replace-match (replace-regexp-in-string "[{}]" "" (match-string 2)))
        (newline))
       ((string= (match-string 1) "\\opening")
        (replace-match (replace-regexp-in-string "[{}]" "" (match-string 2))))
       ((string= (match-string 1) "\\signature")
        (save-excursion
          (goto-char (point-max))
          (insert (replace-regexp-in-string "[{}]" "" (match-string 2))))
        (kill-region (point) (line-beginning-position)))
       (t
        (replace-match "\n"))))
    (goto-char (point-min))
    ;; Remove any remaining backslashes.
    (while (re-search-forward "[\\]+" nil t 1)
      (replace-match "")))
  (nur-eine-leerzeile))
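
The function above calls `kill-list-atpt' and `nur-eine-leerzeile', neither of which is defined in the message. To try the code without them, something like the following stand-ins might do. Both names and both behaviours are assumptions inferred from how the helpers are used (and from the German name, roughly "only one blank line"); the author's real definitions may differ.

```elisp
;; Hypothetical stand-ins; the real helpers are not shown in the message.

(defun my-kill-list-at-point ()
  "Kill the balanced group, e.g. {...}, that starts at point.
Assumed replacement for `kill-list-atpt'.  The killed text ends up
at the head of `kill-ring', where `tex2text' reads it back."
  (let ((start (point)))
    (forward-list)                 ; move over the balanced group
    (kill-region start (point))))

(defun my-only-one-blank-line ()
  "Collapse each run of blank lines in the current buffer into one.
Assumed replacement for `nur-eine-leerzeile'."
  (save-excursion
    (goto-char (point-min))
    ;; A newline followed by two or more whitespace-only lines.
    (while (re-search-forward "\n\\(?:[ \t]*\n\\)\\{2,\\}" nil t)
      (replace-match "\n\n"))))

;; To use them under the names `tex2text' expects:
;; (defalias 'kill-list-atpt 'my-kill-list-at-point)
;; (defalias 'nur-eine-leerzeile 'my-only-one-blank-line)
```

`forward-list' relies on the buffer's syntax table treating braces as paired delimiters, which holds for the standard syntax table but may not hold in every mode.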