lynx-dev

Re: lynx-dev stopping when viewing a site


From: Klaus Weide
Subject: Re: lynx-dev stopping when viewing a site
Date: Sat, 21 Aug 1999 19:59:56 -0500 (CDT)

On Fri, 20 Aug 1999, Leonid Pauzner wrote:

> 19-Aug-99 07:27 Klaus Weide wrote:
> 
> > Can you remind me what the kludge was good for?  I know that it allows
> > to add the tableless charsets in arbitrary order (without pre-filling
> > the LYCharSet_UC[MAXCHARSETS] in LYCharSets.c).  Were there any other
> > reasons?
> 
> Main reason - to clean up the code (so many special cases
> will be covered by UCDomap.c only - we could strip LYCharSets.c,
> SGML.c, LYCharUtils.c [...] from old entities, PassEightBitRaw etc.
> That would be more inlined with current style IMHO).

Yes, much of the stuff in LYCharSets.c was unnecessary after you
practically eliminated the "old method" (and you deserve credit for
doing a lot of cleanup).

It still makes sense to have a separation of character set meta-info bits
between stuff visible to all of Lynx, and stuff internal to the table lookup
implementation.  The first ("type A") are in LYCharSets.c (LYchar_set_names[],
LYCharSet_UC[], LYlowest_eightbit[], ...), the latter ("type B") are kept
in UCInfo[].  For the table-less character sets - those for which you
introduced macros in UCdomap.h - the purpose is really to get type A info
into the relevant arrays.  We don't really need any of the type B info,
since the table lookup implementation should never even _try_ to find a
translation table.  The only reason for filling in a UCInfo[i] for the
table-less charsets (with lots of dummies and zeroes) seems to be because
the existing UC_Charset_Setup() does this (before filling in / updating
the type A structures), but for the table-less sets it is just an unneeded
side effect.  The only exception seems to be the 'trydefault' (and that
probably should become type A info).  So overall, _this_ change (= use
macros in UCdomap.h instead of hardwired info in LYCharSets.c) doesn't
look like cleanup to me.
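
To illustrate the separation I mean, here is a minimal sketch in C.  All
names, sizes and values in it are made up for illustration; it is not the
actual code in LYCharSets.c or UCdomap.c.  The assumption is that a
table-less charset only needs the globally visible ("type A") entries,
while the lookup-internal ("type B") slot is simply never touched:

```c
#include <stddef.h>

#define MAX_SETS 8                      /* hypothetical limit */

/* "Type A": visible to the whole program (hypothetical arrays). */
static const char *set_names[MAX_SETS];
static int lowest_eightbit[MAX_SETS];
static int num_sets = 0;

/* "Type B": private to the table lookup code (hypothetical array).
 * Static storage, so every slot starts out as NULL. */
static const unsigned short *lookup_table[MAX_SETS];

/* Register the globally visible info; returns a handle, or -1 if full. */
int add_visible_info(const char *name, int lowest)
{
    if (num_sets >= MAX_SETS)
        return -1;
    set_names[num_sets] = name;
    lowest_eightbit[num_sets] = lowest;
    return num_sets++;
}

/* Only charsets that really have a translation table call this. */
int attach_table(int handle, const unsigned short *table)
{
    if (handle < 0 || handle >= num_sets || table == NULL)
        return -1;
    lookup_table[handle] = table;
    return 0;
}
```

A table-less charset would call only add_visible_info(), so nothing has
to be filled in with dummies on the lookup side.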

> >> >         ut = UCInfo[UChndl_out].unitable;
> >> >         if (ut != UC_current_unitable) {
> >>
> >> If we change (!isdefault) with (!isdefault && UCInfo[...].unitable != NULL)
> >> we will probably bypass all calls to dedicated charset's tables
> >> `conv_uni_to_pc(unicode, 0)' so initialization may not be necessary
> >> since we only use default table - calls to `conv_uni_to_pc(unicode, 1)'
> 
> > That gets ugly, several places to change and to check whether all
> > calls to conv_uni_to_pc(unicode, 0) _and_ to UC_con_set_unimap are
> > then always skipped...
> 
> Well, the only external calls to UCDomap.c stuff are UCTrans* (see UCMap.h)
> so it wouldn't be hard to check that places only
> (perhaps, an initialization ~UCInit() also).

Yes, it is certainly possible to protect against doing lookups when there
is no valid table in the UCTrans* functions... I just think it should
never get that far into those functions.  Or it should be rejected at
the top of those functions, not further down where you also have to
protect all the 0xfffd and other fallbacks...
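
A rough sketch of "rejected at the top", again with invented names and
data (this is not the real UCTrans* code, and the NO_TABLE return value is
my assumption for the example):

```c
#include <stddef.h>

#define NO_TABLE    (-1L)      /* hypothetical "no table, not my job" result */
#define REPLACEMENT 0xfffdL    /* fallback for characters not in the table */

struct charset_info {          /* hypothetical stand-in for a charset entry */
    const char *name;
    const unsigned short *table;   /* NULL for table-less charsets */
    size_t table_len;
};

static const unsigned short demo_table[] = { 0x00a5, 0x00e9 };

static const struct charset_info charsets[] = {
    { "demo-8bit",     demo_table, 2 },
    { "x-transparent", NULL,       0 },
};
static const int num_charsets = (int)(sizeof charsets / sizeof charsets[0]);

/* Translate a Unicode value to a code in the charset given by handle. */
long translate_to_charset(long unicode, int handle)
{
    const struct charset_info *cs;
    size_t i;

    /* Reject invalid or table-less charsets before any lookup happens. */
    if (handle < 0 || handle >= num_charsets)
        return NO_TABLE;
    cs = &charsets[handle];
    if (cs->table == NULL)
        return NO_TABLE;

    /* From here on a valid table is guaranteed, so the fallbacks
     * below need no extra protection. */
    if (unicode >= 0 && unicode < 0x80)
        return unicode;
    for (i = 0; i < cs->table_len; i++)
        if ((long)cs->table[i] == unicode)
            return 0x80 + (long)i;
    return REPLACEMENT;
}
```

With the check in one place at the top, the 0xfffd path and the other
fallbacks do not each need their own test for a missing table.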

> > [...]            I don't know whether "Transparent" needs to be handled
> > specially wrt 'trydefault', or whether all specially handled charsets
> > can/should fall back if they get so far (to the UCTrans* functions).
> 
> My conclusion about "Transparent" vs 'trydefault' was based on that fact:
> let we have html page with lots of entities like &yen;&and;&something;
> then in transparent mode these entities will have no translation
> so we saw them verbatim, instead old behaviour (before my changes)
> was based on using 7-bit approximations. IMO CJK behave the same way.

So you _want_ "Transparent" to translate entities and NCRs.  Do I understand
that right?

I am not sure what _should_ be the behavior - I can accept both choices.
Especially if it's easy to change in the source code...

> I mean translation "TO charset" e.g. to x-transparent, to any_CJK yes?
> (not from ...)

Understood.

> (snipping of quotes just a matter of taste based on many factors).

Well I get tired of scrolling down through pages and pages of quoted
text only to find that nothing new has been added.

   Klaus

