bug-gnulib


Re: [PATCH] mbswidth: add new functions to handle tabs


From: Bruno Haible
Subject: Re: [PATCH] mbswidth: add new functions to handle tabs
Date: Thu, 14 Jan 2010 01:39:45 +0100
User-agent: KMail/1.9.9

Hi Joel,

> a wrapper around mbsnwidth to compute screen columns while accounting
> for tabs.
> ...
> Does this duplicate any functionality already offered in 
> gnulib?  Is it general enough for gnulib?

This functionality is not yet in gnulib. But I don't think it's general
enough: Today you want to support tabs. Tomorrow you'll want to support
line numbers and '\v' characters. Next week someone will want to support
paragraph separator characters.

Instead of adding more and more variants of mbswidth, I think we should
make a bigger step and offer a customizable variant. It will take a
function pointer as argument, that gets passed a control character.
People will not want to handle many encodings within this function;
therefore its argument should be a Unicode character. This leads to a
function like this:

  /* Compute and return the current column at the end of the given
     STRING, assuming it starts at START_COLUMN. FUNC handles control
     characters.  */
  extern int mbs_update_column (const char *string, int start_column,
                                void (*func) (ucs4_t uc, int *column_p));

> I also considered providing a means to compute line numbers at the same 
> time.

With a little change of the interface, it can accommodate this use-case too:

  /* Compute and store in *COLUMN_P the current column at the end of the
     given STRING, assuming it starts at the initial value of *COLUMN_P.
     FUNC handles control characters.  */
  extern void mbs_update_column (const char *string, int *column_p,
                                 void (*func) (ucs4_t uc, int *column_p));

For computing line numbers, one would pass a pointer to an int[2]
(column and line number) as column_p, and FUNC would update both
elements.
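
A minimal sketch of how such a callback could look, assuming an
ASCII-only input and a tab width of 8. The names handle_control and
mbs_update_column_ascii are hypothetical, the ucs4_t typedef is a local
stand-in for the one from <unitypes.h>, and the two-element layout
{ column, line } is an assumption made for illustration only:

```c
#include <stdint.h>

/* Assumption: local stand-in for ucs4_t from <unitypes.h>.  */
typedef uint32_t ucs4_t;

/* Hypothetical handler.  COLUMN_P points to { column, line },
   both with origin 0.  */
static void
handle_control (ucs4_t uc, int *column_p)
{
  if (uc == '\t')
    /* Advance to the next multiple of 8.  */
    column_p[0] += 8 - column_p[0] % 8;
  else if (uc == '\n')
    {
      column_p[0] = 0;
      column_p[1]++;
    }
  /* Other control characters occupy no columns in this sketch.  */
}

/* Simplified ASCII-only stand-in for the proposed function.
   It does not handle multibyte or wide characters.  */
static void
mbs_update_column_ascii (const char *string, int *column_p,
                         void (*func) (ucs4_t uc, int *column_p))
{
  for (; *string != '\0'; string++)
    {
      unsigned char c = (unsigned char) *string;
      if (c < 0x20 || c == 0x7f)
        func (c, column_p);     /* ASCII control character */
      else
        column_p[0]++;          /* printable ASCII: one column */
    }
}
```

Starting from { 0, 0 }, the string "ab\tc\nxy" leaves the column at 2
and the line at 1 in this sketch.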

The implementation of this function should walk across the string until
it finds the first non-ASCII control character. At this point it
converts to Unicode using the u32_conv_from_encoding function, so that
it gets a correspondence between multibyte characters and Unicode
characters.
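
The first step could be sketched as follows, under one reading of the
paragraph above: scan the leading run of plain ASCII bytes, and hand
only the remainder to the conversion function. The helper name
ascii_prefix_length is hypothetical, the sketch assumes an
ASCII-compatible encoding, and the conversion call itself is left out:

```c
#include <stddef.h>

/* Return the number of leading bytes of STRING that are ASCII
   (less than 0x80).  The bytes from that offset on would be the
   part that needs conversion to Unicode.  */
static size_t
ascii_prefix_length (const char *string)
{
  size_t n = 0;
  while (string[n] != '\0' && (unsigned char) string[n] < 0x80)
    n++;
  return n;
}
```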

> +   BUF[0] is assumed to appear at screen column COLUMN_INIT (origin 1).

In an API, column numbers should start with 0. Origin-1 column numbers
can be implemented by adding 1 just before printing the column number.
Rationale: Half of the editors use origin-0 column numbers and half of
the software uses origin-1 column numbers. Therefore you need to
accommodate both conventions.
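
As a small illustration of this convention, the origin can be applied
only at the point where the number is formatted. The function name
format_column and its ORIGIN parameter are hypothetical:

```c
#include <stdio.h>

/* Format the origin-0 column COLUMN0 into BUF, shifted by ORIGIN
   (0 or 1) for display.  Returns what snprintf returns.  */
static int
format_column (char *buf, size_t size, int column0, int origin)
{
  return snprintf (buf, size, "%d", column0 + origin);
}
```

The stored column stays origin-0 throughout. Only the caller that
prints it decides which convention the user sees.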

Bruno



