texinfo-devel


"special" spaces in Texinfo parsing and output


From: Karl Berry
Subject: "special" spaces in Texinfo parsing and output
Date: Tue, 26 Mar 2013 21:19:02 GMT

(Switching to texinfo-devel)

    I would expect any unicode space to be
    treated as a space with respect to word and paragraph breaking.

Apparently Unicode agrees with you -- search for "breaking space" in
<http://www.unicode.org/reports/tr14>; all the Unicode space chars are
deemed breakpoints.  That seems quite wrong to me -- as an author, I
would certainly not want a line break at, say, a thin space -- but
Unicode is what it is.  Fine.

    Yet, considering [\r\n\t ] only to be space characters and everything
    else to be non-space, treated as letters would simplify my life.

I think that is actually better, because makeinfo is not a display
engine.  In practice, makeinfo has never tried to implement full (or
even most) Unicode semantics, and I don't see any users wanting it, so I
see no problem with just saying "Unicode chars stay as is in UTF-8
encoding, all else is undefined".
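
To make the difference concrete, here is a rough sketch of the two
policies.  It is in Python purely for illustration (makeinfo itself is
Perl, and this is not its code); the function names and the sample
string are made up for the example.  The first function treats only
[\r\n\t ] as space, as proposed in the quoted text; the second uses
Python's own notion of whitespace, which includes characters such as
U+2009 THIN SPACE and U+00A0 NO-BREAK SPACE.

```python
import re

# The four characters named in the quoted text; everything else,
# including every non-ASCII space, is treated like a letter.
ASCII_SPACE_RUN = re.compile(r"[\r\n\t ]+")

def split_words_ascii(text):
    """Split on runs of CR, LF, TAB and ASCII space only."""
    return [w for w in ASCII_SPACE_RUN.split(text) if w]

def split_words_unicode(text):
    """Split on any character Python's str.split() considers whitespace."""
    return text.split()

# Hypothetical sample: thin space inside a number, no-break space before "hour".
sample = "1\u2009000 km\tper\u00a0hour"

print(split_words_ascii(sample))    # 3 words; the special spaces stay inside them
print(split_words_unicode(sample))  # 5 words; every space is a break candidate
```

Under the ASCII-only policy the thin space and the no-break space pass
through untouched and can never become a line break, which is the
"stay as is" behavior described above.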

    Suppose we have a text with a '* SPACE' what should be done at the
    end of a line, could it be replaced by a new line?  

I'm sorry, but I don't understand what you mean by '* SPACE'.
Do you mean three characters: an asterisk, a normal ASCII space, and
then an unusual Unicode space character?  From the rest of
what you write, I don't think so, but I can't figure it out.  

    Not necessarily, there is already some special handling of fullwidth
    east asian characters, 

Sure, I know.  But there's a lot more to Unicode line breaking than East
Asian character widths; see the TR cited above.  I would prefer that we
*not* implement it.  No one is expecting it, and I foresee it causing
only trouble.
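
For reference, the kind of width handling mentioned in the quoted text
can be sketched roughly as below.  This is an illustration in Python
under my own assumptions, not makeinfo's actual Perl code: the function
name is hypothetical, wide (W) and fullwidth (F) characters are counted
as two columns, combining marks as zero, and everything else as one.

```python
import unicodedata

def display_width(text):
    """Approximate column count of text on a fixed-width display."""
    width = 0
    for ch in text:
        # Combining marks attach to the previous character: zero columns.
        if unicodedata.combining(ch):
            continue
        # East Asian Wide and Fullwidth characters take two columns.
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width

print(display_width("abc"))                 # ASCII letters
print(display_width("\u65e5\u672c\u8a9e"))  # three wide CJK characters
```

This covers only character widths when filling a line; it says nothing
about where breaks are allowed, which is the much larger part of the TR.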

    I'd say that we let perl have its way.  

Fine.

k


