[Pan-devel] [PATCH] 8 bit characters in header

From: Sam Solon
Subject: [Pan-devel] [PATCH] 8 bit characters in header
Date: 20 Jul 2002 20:27:01 -0400

Although the proper answer is "they're wrong according to RFC 977", there
seem to be a number of postings that use 8-bit characters in the header
-- particularly in the subject. This seems most common in binary
newsgroups and is probably an attempt to disguise a copyright violation.

Since Pan uses the current locale to convert to UTF-8, the conversion
can fail, leaving the subject blank. At least, that's what it does on my
system, with the default "C" locale.

I think it's better to get something rather than nothing, so I propose
the following patch.

If the conversion using the default locale fails, it is tried again with
"ISO-8859-1" explicitly specified (maybe there's something better?). If
that fails the beginning of the string up to the conversion failure
point is used.
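
On the ISO-8859-1 fallback: every byte value from 0x00 to 0xFF is a
defined ISO-8859-1 character and maps directly to U+0000..U+00FF, so
that second conversion should accept any input as long as the converter
itself is available. The truncation step is then only a last resort. As
a rough sketch of what the second step amounts to, independent of GLib
(latin1_to_utf8 is a made-up name for illustration, not a Pan or GLib
function, and this is not what the patch calls):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of Pan or GLib: convert `len` bytes of
 * ISO-8859-1 text to a newly allocated, NUL-terminated UTF-8 string.
 * Returns NULL only if allocation fails; the caller frees with free(). */
char *latin1_to_utf8(const char *str, size_t len)
{
    /* Worst case: every input byte expands to two output bytes. */
    char *out = malloc(2 * len + 1);
    char *p = out;
    size_t i;

    if (out == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char) str[i];
        if (c < 0x80) {
            *p++ = (char) c;                    /* ASCII: copied as-is */
        } else {
            *p++ = (char) (0xC0 | (c >> 6));    /* lead byte: 0xC2 or 0xC3 */
            *p++ = (char) (0x80 | (c & 0x3F));  /* continuation byte */
        }
    }
    *p = '\0';
    return out;
}
```

The result is readable only if the poster really used ISO-8859-1 (or
something close to it); for other 8-bit charsets it yields valid UTF-8
with the wrong glyphs, which is still better than a blank subject.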

I find it disconcerting to have lots of blank subject lines in the
article-list -- not that *I* would ever download a file that violates a
copyright. ;-)

Index: pan-glib-extensions.c
RCS file: /cvs/gnome/pan/pan/base/pan-glib-extensions.c,v
retrieving revision 1.25
diff -u -u -r1.25 pan-glib-extensions.c
--- pan-glib-extensions.c       23 Jun 2002 11:28:11 -0000      1.25
+++ pan-glib-extensions.c       21 Jul 2002 00:14:01 -0000
@@ -855,7 +855,25 @@
              gssize          len,
              char         ** g_freeme)
 {
-       return pan_g_convert_to_utf8 (str, g_freeme, len, NULL, NULL, NULL);
+       const char * retval
+               = pan_g_convert_to_utf8 (str, g_freeme, len, NULL, NULL, NULL);
+       if (!retval) {
+               gsize bytes_read;
+               gsize bytes_written;
+               retval = *g_freeme = g_convert(str,
+                                              len,
+                                              "UTF-8",
+                                              "ISO-8859-1",
+                                              &bytes_read,
+                                              &bytes_written,
+                                              NULL);
+               if (!retval)
+                       retval = *g_freeme = g_strndup(str, bytes_read);
+       }
+       return retval;
 }
 
 const char*
