From: SciFi
Subject: [Pan-devel] Another problem I need to add to my list: word-wrapping. (Re: My list of present problems with Pan.)
Date: Sat, 22 Oct 2011 22:41:40 +0000 (UTC)
User-agent: Pan/0.135 (Tomorrow I'll Wake Up and Scald Myself with Tea; GIT d8bfcda (github.com/judgefudge/pan2/master); x86_64-apple-darwin10.8.0; gcc-4.2.1 (build 5666 (dot 3)); 32-bit mode)


Hi,

I guess this would be #7 in my list, but the "priority" should be raised
somewhat, please.



A text article can be reformatted with the View->BodyPane->Wrap function
(or simply hit 'w' on the keyboard while viewing a text message).

The judgefudge repo is trying to make the text look nice this way,
but I see it is sometimes wrapping the text too "strongly".

This can be seen in some of the text posts in binary groups.
Many people will type text in a "table" format, for the technical info
of a movie for example, but the description paragraphs will go-on-&-on
with long lines that need rewrapping to be read properly inside a Pan
viewing window.

The judgefudge criteria seem to treat _most_ text as needing rewrapping,
even the "table" lines.  Somehow the "reverse-fill" wrapping is a bit
faulty, ISTM, but I cannot put my finger on why (not knowing enough
about C++ is probably hindering me).

Let me give a bit of my background (from before I became disabled and retired).
I was our department's pseudo-expert on word-processing and printing.
Before PCs became ubiquitous, this was with IBM mainframes and their
related distributed systems.
We had a lot of testing & bug-fixing with their systems, to put it mildly.
Even post-PC, we talked directly with the engineers of still-very-famous
companies in this area.  ;)
[It's one reason I like to use “typographic” characters in my text in
 various venues.  ;)  ]

I guess the best way to demonstrate is to copy one of the nfo posts from
the a.b.dvd.classics group, almost any made by TPKA 'JB'.  Let's pick
a recent post in the world-wide Usenet with a Message-ID of:
"<address@hidden>" -- it's a nfo file
that's been yEnc'd, but Pan can decode and show it directly:

-* begin *-

BUCK (2011)

http://www.imdb.com/title/tt1753549/
http://www.amazon.com/dp/B005E7SEMU/
http://www.sundanceselects.com/films/buck

DVD Studio: Sundance Selects / MPI Home Video
DVD Release Date: October 4, 2011

Director: Cindy Meehl
Stars: Buck Brannaman and Robert Redford 

Description
----------------
BUCK, a richly textured and visually stunning film, follows Buck Brannaman from 
his abusive childhood to his phenomenally successful approach to horses. A 
real-life horse-whisperer , he eschews the violence of his upbringing and 
teaches people to communicate with their horses through leadership and 
sensitivity, not punishment. Buck possesses near magical abilities as he 
dramatically transforms horses - and people - with his understanding, 
compassion and respect. A truly American story about an unsung hero and one of 
the most successful documentaries of the year, BUCK is about an ordinary man 
who has made an extraordinary life despite tremendous odds.

Format: NTSC
DVD Size: 6.63 Gb -- Exact Untouched Copy
Runtime (main feature): 88 minutes
Type: Color
Aspect Ratio: 1.85:1
Sound (main feature): English DD2.0
Subtitles: optional English SDH | Spanish

Disc Features
----------------
# Trailers.
# Deleted Scenes.
# Commentary with Filmmakers and Buck Brannaman.

Posted in:
a.b.dvd.classics

-* end *-

[Let's not worry whether the binary post is "legal",
 I only want to talk about this textual material as an example.]

Here, the only part that needs word-wrapping is the long single-line
paragraph under the 'Description'.

But the judgefudge repo will word-wrap almost every other line also,
with "reverse-wrap" to boot.
For example, the word 'Description' ends up followed by a single
space-byte and then the line of dashes on the same line, instead of the
dashes sitting on the next line as the original text shows.  But the
long single-line paragraph would still begin on the next line anyway.
(Give this message a try, see what it does when 'w' is hit.)
Let's see if I can copy-&-paste what this looks like wrapped:

-* begin *-

BUCK (2011)

http://www.imdb.com/title/tt1753549/
http://www.amazon.com/dp/B005E7SEMU/
http://www.sundanceselects.com/films/buck

DVD Studio: Sundance Selects / MPI Home Video DVD Release Date: October 4,
2011

Director: Cindy Meehl Stars: Buck Brannaman and Robert Redford

Description ----------------
BUCK, a richly textured and visually stunning film, follows Buck Brannaman
from his abusive childhood to his phenomenally successful approach to
horses. A real-life horse-whisperer , he eschews the violence of his
upbringing and teaches people to communicate with their horses through
leadership and sensitivity, not punishment. Buck possesses near magical
abilities as he dramatically transforms horses - and people - with his
understanding, compassion and respect. A truly American story about an
unsung hero and one of the most successful documentaries of the year, BUCK
is about an ordinary man who has made an extraordinary life despite
tremendous odds.

Format: NTSC DVD Size: 6.63 Gb -- Exact Untouched Copy Runtime (main
feature): 88 minutes Type: Color Aspect Ratio: 1.85:1 Sound (main
feature): English DD2.0 Subtitles: optional English SDH | Spanish

Disc Features ----------------
# Trailers.
# Deleted Scenes.
# Commentary with Filmmakers and Buck Brannaman.

Posted in:
a.b.dvd.classics

-* end *-

The lines that are wrongly wrapped together:
1)  DVD Studio + DVD Release Date
2)  Director + Stars
3)  Description + dashes
4)  Format + Runtime + Type + Aspect Ratio + Sound + Subtitles
5)  Disc Features + dashes

Other textual posts are much more onerous.  ;)
All just to get the long lines within the viewing window.

A.
What we need is a setting to let us choose a "simple word-wrap" instead
of the current (more "complex") method.

The "simple" method would only wrap long lines,
and leave every LF (CR+LF) alone as-is, no matter what.
Don't squeeze anything out (but see my next "B" section below).
This should "fix" the "table" formatting, too.
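
To make the idea concrete, here is a minimal sketch of the "simple" method.
It is in Python, not Pan's C++, and it is not taken from the Pan sources:
the name simple_wrap and the 72-column default are made up for illustration,
and it assumes the line endings have already been normalized to LF.

```python
import textwrap


def simple_wrap(text: str, width: int = 72) -> str:
    """Wrap only the lines longer than `width`; keep every existing line break."""
    out = []
    for line in text.split("\n"):
        if len(line) <= width:
            # Short enough already ("table" rows, headings, blank lines): untouched.
            out.append(line)
            continue
        # Over-long line: break it at word boundaries only.
        wrapped = textwrap.wrap(
            line, width=width, break_long_words=False, break_on_hyphens=False
        )
        # textwrap.wrap() returns [] for a whitespace-only line; keep a blank line.
        out.extend(wrapped or [""])
    return "\n".join(out)
```

Run over the nfo text above, a function like this would leave 'Description'
and its line of dashes on separate lines, and only the long paragraph would
be split.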


B.
Another point is to treat multiple blanks/tabs (all "white space") as if
they are "invisible" when they fall at the point of wrapping.  That is, do
not count the space-byte (tab, etc.) as a byte to be put on the next line
if that is where the wrap-point is 'right now', so-to-speak.  The first
byte put on the next line should be a "visible" character, see.
White-space-bytes are "non-significant" only at the wrap-point;
"visible" bytes _are_ "significant", always.
Too many times I see this "mistake" in a word-processor.  ;)
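
A rough Python sketch of that wrap-point rule follows.  Again this is
illustration only, not Pan code: wrap_line is a hypothetical name, and
"width" is counted in characters.  The white space sitting at the break
is swallowed, while runs of blanks elsewhere in the line are kept.

```python
def wrap_line(line: str, width: int) -> list:
    """Split one line into pieces; white space at each break point is dropped."""
    pieces = []
    rest = line
    while len(rest) > width:
        # Leading indent is never a break candidate.
        indent = len(rest) - len(rest.lstrip())
        cut = -1
        # Prefer the last white-space character at or before column `width`.
        for i in range(width, indent, -1):
            if rest[i].isspace():
                cut = i
                break
        if cut == -1:
            # An over-long word: break at the first white space after it instead.
            for i in range(max(width + 1, indent), len(rest)):
                if rest[i].isspace():
                    cut = i
                    break
        if cut == -1:
            break  # no usable break point left
        pieces.append(rest[:cut].rstrip())  # no trailing blanks before the break
        rest = rest[cut:].lstrip()          # next piece starts on a visible character
    if rest or not pieces:
        pieces.append(rest)
    return pieces
```

For instance, wrap_line("alpha beta    gamma", 10) gives the two pieces
"alpha beta" and "gamma": the four blanks at the wrap-point do not spill
onto the second line.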


C.
The judgefudge repo has another problem:
Some texts are formatted with a leading "indent" for the paragraph
and the rest of that paragraph is, again, a super-long line.
This type of message can be seen more often on Gmane, in mailing-lists
that go all over the world, and come from people who are writing in a
"typewriter" or "book" mode of sorts (I guess) and are likely not
native English writers.
This Pan will wrap the lines, sure, but it will "indent" _ALL_ of the
wrapped lines for that paragraph; it doesn't notice that these lines
should be wrapped to the left-margin as for a printed book, see.
The writer will usually not double-space his text, so his next
paragraph(s) are all indented with super-long line(s) each.
This Pan ends-up wrapping _all_ of those paragraphs all-together with
them all indented as well.  It does not notice that the separated indented
lines should begin new paragraphs.
I believe, again, that a "simple word-wrap" method will "fix" this
situation as well:
a1)  the leading indents count, they are _not_ at the wrap-point,
b1)  the following long line will get wrapped, to the left-margin,
c1)  the line ends with LF (or CR+LF),
a2)  the next white-space indents count for the next formatted line,
     they are _not_ at the wrap-point,
b2)  that next long line will get wrapped, to the left-margin,
c2)  that next line ends with LF (CR+LF),
etc. etc. etc. etc. etc.
Perfectly (re)formatted.
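
The a/b/c steps above can be sketched roughly like this in Python (not
Pan's C++; wrap_one and wrap_indented are hypothetical names).  One
simplification to note: inside a line that does get wrapped, runs of
blanks between words are collapsed to a single space, which is fine for
prose paragraphs; lines that already fit are left exactly as they are.

```python
def wrap_one(line: str, width: int) -> list:
    """Wrap one long line: indent stays on the first piece, the rest go flush left."""
    body = line.lstrip()
    indent = line[: len(line) - len(body)]
    out = []
    current = indent   # a) the leading indent counts; it is not at a wrap-point
    fresh = True       # no word placed on the current output line yet
    for word in body.split():
        if fresh:
            current += word
            fresh = False
        elif len(current) + 1 + len(word) <= width:
            current += " " + word
        else:
            out.append(current)
            current = word  # b) continuation lines start at the left margin
    out.append(current)
    return out


def wrap_indented(text: str, width: int = 72) -> str:
    """c) every LF in the source still ends a paragraph; only long lines are wrapped."""
    out = []
    for line in text.split("\n"):
        out.extend(wrap_one(line, width) if len(line) > width else [line])
    return "\n".join(out)
```

Because each source line is handled by itself, two indented paragraphs
that follow one another without a blank line between them stay two
paragraphs, each with only its first line indented.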

For an example of this,
hook-up to news.Gmane.org,
find the group 'gmane.comp.time.tz',
get a few days' worth of headers,
find this article:
"Message-ID: <address@hidden>"
"Date: Fri, 21 Oct 2011 10:05:20 -0300"
"Subject: Official time zone rule sources"
"From: Glenn Eychaner <address@hidden>"
[Let's not worry about the actual discussion there
 {O.T. unless you are worried about the lawsuit against the TZ database}
 but it is a prime example of this problem with Pan wrapping text.]


-*-

FWIW I don't ever remember a version of Pan, old or new,
that did this wrapping quite right, all told.

I know what needs to be done,
just not in the current computer language.  ;(
If I could ever understand C++ enough,
I'd probably be able to figure-out the present rigmarole
and try to rewrite it.
But right now I can't.  ;(

ISTM a simple thing, tho, y'know?
;)





