lynx-dev

Re: lynx-dev patch: Textarea editing and charset


From: Klaus Weide
Subject: Re: lynx-dev patch: Textarea editing and charset
Date: Fri, 15 Oct 1999 11:44:54 -0500 (CDT)

On Fri, 15 Oct 1999, Kim DeVaughn wrote:
> On Tue, Oct 12, 1999, Klaus Weide (address@hidden) said:
> |
> | You are only cloning the meta information here, but not any content, at
> | that point.
> 
> Correct.  Content is never cloned, BTW (unless your definition of "content"
> is different than mine).

So we agree. :)

The clone is just an empty vessel, but we have to stick a label on it now.
Which (label) of course doesn't say what's in it (since it's empty) but
what we're expecting to fill it with later.


> Consider:
> 
> User is on a page with a TEXTAREA.  Decides to change the d.c.s.  

AND does it in a way that does not completely reload the page with the
textarea (which would also lose any editing changes made).
Which isn't quite so easy to do, I think.

> Then decides to enter some text into the TEXTAREA.
> 
> Initial (original) TEXTAREA lines will still be using the original char
> set, but when/if the user adds some *new* TEXTAREA lines, they (and only
> they) will use the newly specified d.c.s.
> 
> If the additional lines are the result of using an external edit (and/or
> an insert-file operation that "overlaps" the old and the new lines), I'm
> not at all sure what the result will be (since presumably the editor's
> buffer and/or the inserted file are using a single char set, which now
> gets mapped onto a multiple-char-set TEXTAREA), except that it will probably
> NOT be what the user wants/expects.

Hey, I'm also not sure what the result would be, or whether we do support
a 'multiple-char-set TEXTAREA' in any meaningful way...

> Only in the case where the user has entered text in the original TEXTAREA
> lines with the original d.c.s, then changes it, and then enters new text
> in new TEXTAREA lines (using AUTOGROW lines, or the insert-file operation),
> being careful not to touch the original lines, does the end-result have a
> chance at being correctly rendered with the two/multiple-d.c.s's.  Note
> that the external editor operation (probably) cannot be used, and still
> come out with the "correct" result, since the entire TEXTAREA that exists
> at the time of editor invocation is sent to the editor.
> 
> I think it is at least *consistent* to use the d.c.s. of the previously
> existing "pattern" TEXTAREA line, when using it to clone new lines.  Yes,
> it is incorrect if the char set gets changed, and the page not reloaded,
> but that isn't any worse than the situation with the originally existing
> lines, which are then incorrect, also.  And at least using my methodology,
> *all* the lines are *consistent*.

There is a case that you haven't considered at all, but it is the
case I find (probably) most relevant: when the first-used charset
is indeed incompatible with the one used later, but it doesn't matter
because of the data.  In particular assume all the data, up to the
point of changing d.c.s - in the page itself and in the form fields -
has been only 7-bit ASCII.  No problem if it's labelled with some exotic
charset - ASCII is a subset of them all.
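
To make the ASCII point concrete, here is a small Python sketch (not Lynx
code). The charset list is an arbitrary sample of ASCII-compatible
encodings chosen for illustration; encodings such as UTF-16 or EBCDIC are
not ASCII supersets and would not behave this way.

```python
# 7-bit ASCII bytes decode to the same text under any ASCII-compatible
# charset, so the label attached to ASCII-only data does not change its meaning.
# SAMPLE_CHARSETS is an illustrative selection, not a list taken from Lynx.
SAMPLE_CHARSETS = ["iso-8859-1", "iso-8859-2", "koi8-r", "cp1252", "utf-8"]


def is_seven_bit(data: bytes) -> bool:
    """True if every byte is in the 7-bit ASCII range."""
    return all(b < 0x80 for b in data)


def decodes_identically(data: bytes, charsets=SAMPLE_CHARSETS) -> bool:
    """True if the bytes decode to the same string under every listed charset."""
    return len({data.decode(cs) for cs in charsets}) == 1


text = b"plain ASCII textarea line"
print(is_seven_bit(text), decodes_identically(text))
```

As long as `is_seven_bit` holds for everything entered so far, relabelling
the data with a different ASCII-compatible charset is harmless.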

But *now* the user decides to change d.c.s. - maybe just to enter some
specific character.  AND (as above) manages to somehow do that without
getting the page and its associated form structures completely reloaded/
refreshed/updated.  It should be obvious that in this case there is
no advantage in clinging to the old charset label.

In *other* cases - the ones you are considering - there are data
(characters) in the page, at the time of the d.c.s. change, whose
meaning changes by the re-labelling.  Well, I'd expect the user
to notice this in most cases.  Or at least, the user normally has
*a chance* to notice this, and can force a complete reload before
any really weird confusion happens.
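
For contrast, a short Python sketch (again not Lynx code) of what
re-labelling does once a non-ASCII byte is present. The two charsets are
picked only as an example of an incompatible pair.

```python
# The same bytes, interpreted under two different charset labels.
raw = b"caf\xe9"

as_latin1 = raw.decode("iso-8859-1")  # 0xE9 is e-acute here
as_koi8r = raw.decode("koi8-r")       # 0xE9 is a Cyrillic letter here

print(as_latin1)
print(as_koi8r)
print(as_latin1 == as_koi8r)  # the ASCII prefix matches, the last character does not
```

This is the kind of visible change a user has a chance to notice on screen.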

> The *real* problem is (I suppose) the lack of a reload following a change
> of char sets, but I wouldn't want to impose the mandatory reloading of
> the page, just to correctly deal with this rather unlikely/infrequent
> situation.

We agree.  (in that that would be the "real problem", and also in
that it's not really a problem that needs solving.  It may not even exist,
for http, if you use SOURCE_CACHE.)

> Also, I think of a TEXTAREA as a monolithic entity, which implies that
> its constituent lines are all using a single char set.  Perhaps that is
> an overly restrictive point-of-view.

Well that's the simplistic (common-sense?) view that most everybody has
about textareas except lynx users, I guess.

The Lynx Users Guide says explicitly somewhere that lynx handles a
TEXTAREA as a series of TEXT input fields.
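
A rough Python model of that view, only to pin down the two labelling
policies under discussion. All names here (`TextField`, `clone_empty_line`,
`label_from_pattern`) are hypothetical and do not correspond to Lynx's
actual C structures.

```python
from dataclasses import dataclass


@dataclass
class TextField:
    """One TEXTAREA line, modelled as a TEXT input field with its own label."""
    value: str
    charset: str  # how `value` is expected to be interpreted


def clone_empty_line(pattern: TextField, current_charset: str,
                     label_from_pattern: bool = False) -> TextField:
    """Create a new, empty line after `pattern`.

    Only meta information is cloned, never content. The label is taken either
    from the pattern line or from the current display character set.
    """
    charset = pattern.charset if label_from_pattern else current_charset
    return TextField(value="", charset=charset)


old_line = TextField("only ASCII so far", "iso-8859-1")
new_line = clone_empty_line(old_line, current_charset="koi8-r")
print(new_line)
```

With `label_from_pattern=True` all lines keep one label; with the default,
the new line is labelled with what it is expected to be filled with.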

> I see that in -dev.11/12 Tom has decided to use  current_char_set  in both
> places.  I don't agree with that for the above reasons, but since it's more
> of a philosophical issue, than a substantive one (and one that is unlikely
> to come up very frequently, if at all), I shan't object any further.

Maybe consideration of the so-far-only-ASCII case can remove even
your philosophical disagreement?

> | Oh, and INSERT_FILE should probably be labelled according to
> | ASSUME_LOCAL_CHARSET, not d.c.s., when the cloned lines are first
> | filled in.
> 
> I haven't looked at your "megapatch" yet, but I did see that you did
> something for the insert-file case, based on file extensions (though I am
> not sure how that can/should determine what char set should be used, ATT).

It's untested, but I try to use the same rules as if we were loading
the file.

You could set something like
  SUFFIX:.uhtml:text/html;charset=UTF-8
and then if you insert a file with that suffix into the
textarea it will be labelled thusly.  Then you either
submit the form without doing anything - and lynx will
charset-convert the included data, I hope, on submission -
or you go through the textarea line-by-line to correct
all garbled characters, retyping them so they are right in the
current d.c.s.

(But normally either there will be only ASCII in
included files, or I hope the user will have set ASSUME_LOCAL_CHARSET
which should apply.)
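
The labelling rule described above can be sketched roughly like this in
Python. This is not the patch itself: the function names are made up for
illustration, the parser only handles the simple SUFFIX form shown above,
and the fallback order (SUFFIX charset first, then ASSUME_LOCAL_CHARSET) is
my reading of the intent rather than tested behaviour.

```python
import os


def parse_suffix_rule(line: str):
    """Split 'SUFFIX:.ext:type/subtype;charset=X' into (ext, mime, charset).

    Any further colon-separated fields after the media type are ignored.
    """
    fields = line.split(":")
    if len(fields) < 3 or fields[0] != "SUFFIX":
        raise ValueError("not a SUFFIX rule: %r" % line)
    ext = fields[1]
    mime, _, params = fields[2].partition(";")
    charset = None
    for param in params.split(";"):
        name, _, value = param.strip().partition("=")
        if name.lower() == "charset" and value:
            charset = value
    return ext, mime.strip(), charset


def charset_for_inserted_file(filename, rules, assume_local_charset):
    """Pick the label for inserted lines: a SUFFIX charset if one matches,
    otherwise the ASSUME_LOCAL_CHARSET value."""
    ext = os.path.splitext(filename)[1]
    for rule_ext, _mime, charset in rules:
        if rule_ext == ext and charset:
            return charset
    return assume_local_charset


rules = [parse_suffix_rule("SUFFIX:.uhtml:text/html;charset=UTF-8")]
print(charset_for_inserted_file("notes.uhtml", rules, "iso-8859-1"))
print(charset_for_inserted_file("notes.txt", rules, "iso-8859-1"))
```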

   Klaus

