lynx-dev

lynx-dev patch: no substitution of ' ' for '\n' when Chinese/Japanese


From: Nelson Henry Eric
Subject: lynx-dev patch: no substitution of ' ' for '\n' when Chinese/Japanese
Date: Sat, 22 Aug 1998 18:30:56 +0900 (JST)

When we were on the topic of "Re: lynx-dev <BR> does not accumulate",
Dave Eaton provided us with a URL on how "Paragraphs, Lines, and Phrases"
should be handled, "http://www.w3.org/TR/REC-html40/struct/text.html".

In that document, I found a reference to something which has been
bugging me about Lynx for a long, long time: Lynx substitutes ' ' in the
HTML stream for '\n' encountered in the source document.  This action
is fine for English and many other scripts which need inter-word space.
The problem is, quoting from the above document, "In Japanese and
Chinese, inter-word space is not typically rendered at all."
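
The distinction can be sketched in isolation as follows.  This is only a
minimal illustration, not the Lynx code: script_mode and fold_newlines()
are hypothetical names, and the real logic lives in HTML.c and tests HTCJK.
It treats the text as opaque bytes and handles only '\n' and '\r'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for Lynx's HTCJK setting; not the real type. */
enum script_mode { MODE_SPACED, MODE_CJK };

/*
 * Copy src to dst, treating source line breaks according to the script:
 *   MODE_SPACED: '\n' becomes one ' ', unless dst is still empty or
 *                already ends in ' ' (so spaces are never doubled);
 *   MODE_CJK:    '\n' is dropped, so no inter-word space appears.
 * '\r' is always dropped.  dst must hold at least strlen(src) + 1 bytes.
 * Returns the number of bytes written, excluding the terminating NUL.
 */
size_t fold_newlines(const char *src, char *dst, enum script_mode mode)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);

    for (; *src != '\0'; src++) {
        char c = *src;

        if (c == '\r')
            continue;                   /* carriage returns are ignored */
        if (c == '\n') {
            if (mode == MODE_CJK)
                continue;               /* no space between CJK lines */
            if (n == 0 || dst[n - 1] == ' ')
                continue;               /* do not emit a redundant space */
            c = ' ';
        }
        dst[n++] = c;
    }
    dst[n] = '\0';
    return n;
}
```

With MODE_SPACED, "foo\nbar" comes out as "foo bar"; with MODE_CJK the two
lines are joined with nothing in between.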

The appended patch to HTML.c should, I believe, give Lynx better Japanese
and Chinese support.  It also corrects the spelling of "separator" in
several comments.

__Henry

*** lynx2-8-1/src/HTML.c.orig   Fri Aug 21 22:30:14 1998
--- lynx2-8-1/src/HTML.c        Sat Aug 22 17:32:34 1998
***************
*** 223,230 ****
            return;
        if (c != '\n' && c != '\t' && c != '\r')
            HTChunkPutc(&me->title, c);
!       else
!           HTChunkPutc(&me->title, ' ');
        return;
  
      case HTML_STYLE:
--- 223,234 ----
            return;
        if (c != '\n' && c != '\t' && c != '\r')
            HTChunkPutc(&me->title, c);
!       else if (HTCJK == CHINESE || HTCJK == JAPANESE || HTCJK == TAIPEI) {
!           /* don't replace '\n' with ' ' if Chinese or Japanese - HN */
!           if (c == '\t')
!               HTChunkPutc(&me->title, ' ');
!       } else
!           HTChunkPutc(&me->title, ' ');
        return;
  
      case HTML_STYLE:
***************
*** 336,348 ****
                UPDATE_STYLE;
            }
            if (c == '\n') {
!               if (me->in_word) {
!                   if (HText_getLastChar(me->text) != ' ') {
!                       me->inP = TRUE;
!                       me->inLABEL = FALSE;
!                       HText_appendCharacter(me->text, ' ');
                    }
-                   me->in_word = NO;
                }
  
            } else if (c == ' ' || c == '\t') {
--- 340,358 ----
                UPDATE_STYLE;
            }
            if (c == '\n') {
!               if (HTCJK == CHINESE || HTCJK == JAPANESE ||
!                   HTCJK == TAIPEI) {
!                   /* don't replace '\n' with ' ' if Chinese or Japanese - HN
!                    */
!               } else {
!                   if (me->in_word) {
!                       if (HText_getLastChar(me->text) != ' ') {
!                           me->inP = TRUE;
!                           me->inLABEL = FALSE;
!                           HText_appendCharacter(me->text, ' ');
!                       }
!                       me->in_word = NO;
                    }
                }
  
            } else if (c == ' ' || c == '\t') {
***************
*** 365,379 ****
      } /* end second switch */
  
      if (c == '\n' || c == '\t') {
!       HText_setLastChar(me->text, ' '); /* set it to a generic seperater */
  
        /*
         *  \r's are ignored.  In order to keep collapsing spaces
         *  correctly we must default back to the previous
!        *  seperater if there was one
         */
      } else if (c == '\r' && HText_getLastChar(me->text) == ' ') {
!       HText_setLastChar(me->text, ' '); /* set it to a generic seperater */
      } else {
        HText_setLastChar(me->text, c);
      }
--- 375,389 ----
      } /* end second switch */
  
      if (c == '\n' || c == '\t') {
!       HText_setLastChar(me->text, ' '); /* set it to a generic separator */
  
        /*
         *  \r's are ignored.  In order to keep collapsing spaces
         *  correctly we must default back to the previous
!        *  separator if there was one
         */
      } else if (c == '\r' && HText_getLastChar(me->text) == ' ') {
!       HText_setLastChar(me->text, ' '); /* set it to a generic separator */
      } else {
        HText_setLastChar(me->text, c);
      }
***************
*** 471,480 ****
                    UPDATE_STYLE;
                }
                if (c == '\n') {
!                   if (me->in_word) {
!                       if (HText_getLastChar(me->text) != ' ')
!                           HText_appendCharacter(me->text, ' ');
!                       me->in_word = NO;
                    }
  
                } else if (c == ' ' || c == '\t') {
--- 481,497 ----
                    UPDATE_STYLE;
                }
                if (c == '\n') {
!                   if (HTCJK == CHINESE || HTCJK == JAPANESE ||
!                       HTCJK == TAIPEI) {
!                       /* don't replace '\n' with ' '
!                        * if Chinese or Japanese - HN
!                        */
!                   } else {
!                       if (me->in_word) {
!                           if (HText_getLastChar(me->text) != ' ')
!                               HText_appendCharacter(me->text, ' ');
!                           me->in_word = NO;
!                       }
                    }
  
                } else if (c == ' ' || c == '\t') {
***************
*** 490,496 ****
  
                /* set the Last Character */
                if (c == '\n' || c == '\t') {
!                   /* set it to a generic seperater */
                    HText_setLastChar(me->text, ' ');
                } else if (c == '\r' &&
                           HText_getLastChar(me->text) == ' ') {
--- 507,513 ----
  
                /* set the Last Character */
                if (c == '\n' || c == '\t') {
!                   /* set it to a generic separator */
                    HText_setLastChar(me->text, ' ');
                } else if (c == '\r' &&
                           HText_getLastChar(me->text) == ' ') {
***************
*** 497,504 ****
                    /*
                     *  \r's are ignored.  In order to keep collapsing
                     *  spaces correctly, we must default back to the
!                    *  previous seperator, if there was one.  So we
!                    *  set LastChar to a generic seperater.
                     */
                    HText_setLastChar(me->text, ' ');
                } else {
--- 514,521 ----
                    /*
                     *  \r's are ignored.  In order to keep collapsing
                     *  spaces correctly, we must default back to the
!                    *  previous separator, if there was one.  So we
!                    *  set LastChar to a generic separator.
                     */
                    HText_setLastChar(me->text, ' ');
                } else {
