emacs-devel

Re: Coding system conversion error


From: Jan D.
Subject: Re: Coding system conversion error
Date: Thu, 10 Feb 2005 22:30:58 +0100
User-agent: Mozilla Thunderbird 1.0 (X11/20041206)

Stefan Monnier wrote:

> Well, maybe we can help, if you tell us what you know, ;-)


The mail you replied to was all I knew at the time. But here is a distilled description of the problem (I've omitted the 1025-character string):

ELISP> (setq str (string-to-multibyte <1025 ASCII character string>))
...
ELISP> (multibyte-string-p str)
t
ELISP> (multibyte-string-p
        (encode-coding-string str 'compound-text-with-extensions))
t     <---- BUG, should be nil
ELISP> (multibyte-string-p (encode-coding-string str 'utf-8))
nil

Most applications don't ask for 'compound-text, so most of the time the xassert doesn't abort.


The compound-text case exits at the second return in encode_coding_string in coding.c, in the code fragment below (the if (from == to_byte) case near the bottom):

 if (! CODING_REQUIRE_ENCODING (coding))
   {
     coding->consumed = SBYTES (str);
     coding->consumed_char = SCHARS (str);
     if (STRING_MULTIBYTE (str))
      {
        str = Fstring_as_unibyte (str);
        nocopy = 1;
      }
     coding->produced = SBYTES (str);
     coding->produced_char = SCHARS (str);
     return (nocopy ? str : Fcopy_sequence (str));
   }

 if (coding->composing != COMPOSITION_DISABLED)
   coding_save_composition (coding, from, to, str);

 /* Try to skip the heading and tailing ASCIIs.  We can't skip them
    if we must run CCL program or there are compositions to
    encode.  */
 if (coding->type != coding_type_ccl
     && (! coding->cmp_data || coding->cmp_data->used == 0))
   {
     SHRINK_CONVERSION_REGION (&from, &to_byte, coding, SDATA (str),
                               1);
     if (from == to_byte)
      {
        coding_free_composition_data (coding);
        return (nocopy ? str : Fcopy_sequence (str));
      }
     shrinked_bytes = from + (SBYTES (str) - to_byte);
   }

So if str is multibyte when it enters this function, the return value is multibyte too. I suspect this is where Fstring_as_unibyte should be called, as it is in the previous early return.
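
To make the suspicion concrete, here is a toy model of the two behaviours. It is only a sketch, not the real coding.c change: toy_string, toy_as_unibyte, all_ascii and toy_encode are hypothetical stand-ins invented for illustration, and the "string" is just a byte pointer plus a multibyte flag. It assumes the all-ASCII case corresponds to from == to_byte after shrinking the conversion region.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a Lisp string: bytes plus a multibyte flag.
   (Hypothetical type, not Emacs's Lisp_Object.)  */
struct toy_string
{
  const char *data;
  bool multibyte;
};

/* Hypothetical stand-in for Fstring_as_unibyte: same bytes, flag cleared.  */
static struct toy_string
toy_as_unibyte (struct toy_string s)
{
  s.multibyte = false;
  return s;
}

/* True when every byte is ASCII, i.e. nothing would be left to encode
   once the leading and trailing ASCII is skipped.  */
static bool
all_ascii (const char *p)
{
  for (; *p; p++)
    if ((unsigned char) *p >= 0x80)
      return false;
  return true;
}

/* Models the second early return.  With FIX false the string comes
   back with its multibyte flag untouched (the reported behaviour);
   with FIX true it is converted to unibyte first (the suggested
   behaviour).  */
static struct toy_string
toy_encode (struct toy_string s, bool fix)
{
  assert (s.data != NULL);

  if (all_ascii (s.data))
    {
      if (fix && s.multibyte)
        s = toy_as_unibyte (s);
      return s;
    }

  /* Real encoding is omitted in this sketch; the model only records
     that an encoded result is unibyte.  */
  return toy_as_unibyte (s);
}
```

Calling toy_encode on an all-ASCII string whose multibyte flag is set returns a multibyte string when fix is false and a unibyte one when fix is true, which mirrors the compound-text versus utf-8 results in the ELISP session above.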

   Jan D.





