Re: [Nmh-workers] nmh internals: full MIME integration

From: Ken Hornstein
Subject: Re: [Nmh-workers] nmh internals: full MIME integration
Date: Tue, 29 Jul 2014 22:41:34 -0400

>Bio's formatted I/O routines are fully utf-8 aware.  E.g. when looking
>to consume or output a character they know how many octets are required
>to form any given unicode character.  The upside is you never have to
>think at all about processing utf-8 -- it just happens.

So here's the thing ... right now we (mostly) don't have to think about
processing UTF-8.  We get bytes in from decoding and squirt them out.
There's no processing; we leave it up to the terminal to handle.
We're essentially UTF-8 ignorant, the same way we're ISO-8859 ignorant.

Now it does matter when we're doing stuff like scan(1); we need to know
how many bytes have been consumed, and how many column positions we've
moved so we can format things correctly.  But ... I'm looking at Bio(2)
again, and I don't see how that helps us (we cannot assume 1 rune =
1 column position).  For this we use the POSIX wcwidth() routine, and
I don't see a Plan 9 equivalent.
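
To make the scan(1) problem concrete, here is a rough sketch of the kind of bookkeeping involved: it decodes one character at a time with mbrtowc(), asks wcwidth() how many columns it takes, and tracks bytes and columns separately. The name fit_columns() and its fallbacks (one column for invalid bytes and for non-printable characters) are assumptions made up for this illustration; this is not the code nmh actually uses.

```c
#define _XOPEN_SOURCE 700   /* exposes wcwidth() on glibc */
#include <assert.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not nmh code): walk the multibyte string s and stop
 * before the first character that would overflow maxcols columns.
 * Returns the columns used and stores the bytes consumed in *bytes.
 * The caller is assumed to have called setlocale(LC_CTYPE, "") already.
 */
int
fit_columns(const char *s, int maxcols, size_t *bytes)
{
    mbstate_t st;
    size_t left, used = 0;
    int cols = 0;

    assert(s != NULL && bytes != NULL);
    left = strlen(s);
    memset(&st, 0, sizeof st);

    while (left > 0) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, s + used, left, &st);
        int w;

        if (n == (size_t)-1 || n == (size_t)-2) {
            /* Invalid or truncated sequence: assume 1 byte = 1 column. */
            memset(&st, 0, sizeof st);
            n = 1;
            w = 1;
        } else {
            w = wcwidth(wc);
            if (w < 0)
                w = 1;      /* non-printable: assume one column */
        }
        if (cols + w > maxcols)
            break;          /* this character would not fit */
        cols += w;
        used += n;
        left -= n;
    }
    *bytes = used;
    return cols;
}
```

In a UTF-8 locale a CJK character typically advances the byte count by 3 and the column count by 2, and a combining accent advances bytes but not columns, which is why neither bytes nor runes can stand in for columns.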

>Another benefit is that print() and friends let you install custom
>formatting verbs.  So you can do things like:
>  int to_qp(Fmt *fmt) {/* convert data to quoted-printable */}
>  fmtinstall('Q', to_qp);
>and then
>  char *text = "some string with non-ascii text"; print("%Q\n", text);
>and the %Q conversion formats its output as quoted-printable on the
>fly.  Similarly, you could define verbs that know how to encode stuff
>in headers, escape and quote addresses as necessary, etc.  In some
>situations this can really help improve the code's readability.

That might be interesting ... although I found out the hard way that
when it comes to RFC-2047 encoding, you need to keep a lot of state
around (see sbr/encode_rfc2047.c).  It's interesting, but I don't see
it as a good enough feature to switch to Bio(2) on its own.
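
As a rough illustration of the state involved, here is a deliberately simplified Q-encoder for UTF-8 text. It has to remember how long the currently open encoded-word is (RFC 2047 caps each one at 75 characters) and it must never split the bytes of one UTF-8 character across two encoded-words. The struct qstate, q_encode() and the helpers are hypothetical names invented for this sketch; it is not what sbr/encode_rfc2047.c does, and it ignores further state a real encoder needs, such as the length of the header name already on the line and the choice between Q and B encoding.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define EW_MAX   75             /* RFC 2047 limit on one encoded-word */
#define EW_OPEN  "=?UTF-8?Q?"
#define EW_CLOSE "?="

/* Hypothetical encoder state; not the structure nmh uses. */
struct qstate {
    char  *out;      /* output buffer */
    size_t cap;      /* its capacity, including the NUL */
    size_t len;      /* bytes written so far */
    size_t wordlen;  /* length of the open encoded-word, 0 if none */
};

/* Append n bytes to the output; returns -1 if they do not fit. */
static int
put(struct qstate *q, const char *s, size_t n)
{
    if (q->len + n + 1 > q->cap)
        return -1;
    memcpy(q->out + q->len, s, n);
    q->len += n;
    q->out[q->len] = '\0';
    return 0;
}

/* Number of bytes in the UTF-8 character that starts with lead byte c. */
static size_t
utf8_len(unsigned char c)
{
    if (c >= 0xF0) return 4;
    if (c >= 0xE0) return 3;
    if (c >= 0xC0) return 2;
    return 1;
}

/* Q-encode UTF-8 string in into out; returns 0, or -1 if out is too small. */
int
q_encode(const char *in, char *out, size_t cap)
{
    static const char hex[] = "0123456789ABCDEF";
    struct qstate q = { out, cap, 0, 0 };
    size_t inlen = strlen(in), i = 0;

    assert(cap > 0);
    out[0] = '\0';
    while (i < inlen) {
        size_t clen = utf8_len((unsigned char)in[i]), need = 0, j;
        char enc[4 * 3];        /* at most 4 bytes, 3 output chars each */

        if (clen > inlen - i)
            clen = inlen - i;   /* truncated final character */
        /* Encode the whole character first so it is never split. */
        for (j = 0; j < clen; j++) {
            unsigned char c = (unsigned char)in[i + j];
            if (c < 0x80 && isalnum(c)) {
                enc[need++] = (char)c;
            } else if (c == ' ') {
                enc[need++] = '_';
            } else {
                enc[need++] = '=';
                enc[need++] = hex[c >> 4];
                enc[need++] = hex[c & 15];
            }
        }
        /* No room left in this encoded-word: close it and fold the line. */
        if (q.wordlen > 0 && q.wordlen + need + strlen(EW_CLOSE) > EW_MAX) {
            if (put(&q, EW_CLOSE "\n ", strlen(EW_CLOSE) + 2) < 0)
                return -1;
            q.wordlen = 0;
        }
        if (q.wordlen == 0) {
            if (put(&q, EW_OPEN, strlen(EW_OPEN)) < 0)
                return -1;
            q.wordlen = strlen(EW_OPEN);
        }
        if (put(&q, enc, need) < 0)
            return -1;
        q.wordlen += need;
        i += clen;
    }
    if (q.wordlen > 0 && put(&q, EW_CLOSE, strlen(EW_CLOSE)) < 0)
        return -1;
    return 0;
}
```

A stateless per-call formatting verb has nowhere obvious to keep wordlen between calls, which is the sort of friction I ran into.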

