lynx-dev

Re: lynx-dev EBCDIC, HTStreamStack, etc.


From: Klaus Weide
Subject: Re: lynx-dev EBCDIC, HTStreamStack, etc.
Date: Mon, 22 Mar 1999 21:47:52 -0600 (CST)

On Mon, 22 Mar 1999 address@hidden wrote:

> I'm seeking to fix a little deficiency in EBCDIC support.
> At present, all data coming from the Web is translated from
> ASCII to EBCDIC (for EBCDIC hosts), and all data going to the
> web is translated from EBCDIC.  I believe this was the right
> way to do it; it works well, and seemed to require the fewest
> code changes.

I assume this means you want to keep it this way: translate the
bytes immediately to EBCDIC, before looking at them in any way,
even for binary files.  Indeed that's the only way that makes
sense for HTTP, since you don't know whether it's binary or not
before you have looked at it... (but FTP for example could be
handled differently).

> Downloaded text files are suitably received in EBCDIC.  But
> the problem is that downloaded binary files are perverted by
> this translation.  So far, I've been dealing with it by
> defining a downloader in lynx.cfg that undoes the translation,
> or using "lynx -source URL | iconv -f IBM-1047 -t ISO8859-1".
> 
> It would be better to retranslate binary files internally.  But
> I'm lost.  I'm staring at HTFormat, which seems to be assembling
> pseudo-OO objects with (int (* ())) fields that get assigned
> to and moved around.  

What parts of HTFormat.c are you staring at?

The stuff in the second half (basically HTCopy and below) is quite
separate from the first half that deals with HTPresentation and
StreamStack etc.  They shouldn't really be even in the same source
file...

The bottom functions deal with reading input; they come 'early' in
the processing of the data stream.  But you already know that, since
that's where most of the FROMASCII is done.

The top functions provide a general mechanism for plugging 'HTStreams'
together, controlled by MIME types.  Did you read the (sparse) comments
in HTFormat.h?  See also <http://www.w3.org/Library/User/Using/Streams.html>
from the newer libwww (which just happens to be still valid for what Lynx
uses, except for the 'return an integer status' part).

But you don't want to look at the general mechanism, you want the real
meat... but you should understand the sequence of events in which an
HTStream object is used: first it is created (often indirectly by
HTStreamStack), then its Foo_put_*() methods are called, finally
Foo_free() or Foo_abort().
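
To make that sequence concrete, here is a rough, self-contained sketch.
It is not Lynx code: DemoStream, DemoStreamClass and Counter_new are
made-up names, and the method table is only loosely modeled on what an
HTStream looks like.  The toy stream just counts the bytes it is given
and reports the total when it is freed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an HTStream (not Lynx's real types). */
typedef struct DemoStream DemoStream;

typedef struct {
    void (*put_character)(DemoStream *me, char c);
    void (*put_string)(DemoStream *me, const char *s);
    void (*write_block)(DemoStream *me, const char *buf, int len);
    void (*free_stream)(DemoStream *me);   /* normal end of data */
    void (*abort_stream)(DemoStream *me);  /* error or interrupt */
} DemoStreamClass;

struct DemoStream {
    const DemoStreamClass *isa;  /* method table */
    long count;                  /* bytes seen so far */
    long *result;                /* where the total is reported */
};

static void Counter_put_character(DemoStream *me, char c)
{
    (void)c;
    me->count++;
}

static void Counter_put_string(DemoStream *me, const char *s)
{
    me->count += (long)strlen(s);
}

static void Counter_write(DemoStream *me, const char *buf, int len)
{
    (void)buf;
    assert(len >= 0);
    me->count += len;
}

static void Counter_free(DemoStream *me)
{
    *me->result = me->count;  /* report the total, then destroy itself */
    free(me);
}

static void Counter_abort(DemoStream *me)
{
    *me->result = -1;         /* signal that the transfer was cut short */
    free(me);
}

static const DemoStreamClass CounterClass = {
    Counter_put_character, Counter_put_string, Counter_write,
    Counter_free, Counter_abort
};

/* Step 1: creation.  Returns NULL if memory cannot be allocated. */
DemoStream *Counter_new(long *result)
{
    DemoStream *me;

    assert(result != NULL);
    me = malloc(sizeof *me);
    if (me == NULL)
        return NULL;
    me->isa = &CounterClass;
    me->count = 0;
    me->result = result;
    return me;
}
```

A caller would create the stream, push data through
s->isa->put_character() and friends, and finish with exactly one of
s->isa->free_stream(s) or s->isa->abort_stream(s).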

See where these things are registered: mostly HTSetPresentation()
and HTSetConversion() in HTInit.c.

> I haven't the faintest idea where the
> actual code is.  I suppose I want to create a method (or modify
> an existing one) to undo the EBCDIC translation for binary
> files, even as some method, somewhere, must convert newlines for
> text files.
> 
> Can someone point me in the right direction?

I think you want to convert TOASCII as late as possible (after
MIME header parsing has happened, all in EBCDIC representation).
So how about right before data is written to the temp file -
that would be in HTFWriter.c.

Yeah, it's a bit ugly to understand...  but you probably
want to modify HTSaveToFile and the HTFWriter_put_character,
HTFWriter_put_string, HTFWriter_write methods.

HTSaveToFile() creates the HTStream object.  You could
provide an alternative version, or (probably better) modify
the existing one.  Your modified version could set a flag that
is checked in the abovementioned methods, to determine whether
or not to TOASCII() before writing.
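
A minimal sketch of the flag idea, under loud assumptions: DemoWriter,
its translate_back field and demo_toascii() are invented for
illustration and are not the real HTFWriter structures or the real
TOASCII() macro.  The toy translation only maps the EBCDIC digits
0xF0..0xF9 back to ASCII 0x30..0x39 and leaves every other byte alone;
a real version would use the full translation table.

```c
#include <assert.h>
#include <stdio.h>

/* Toy stand-in for TOASCII(): handles EBCDIC digits only. */
unsigned char demo_toascii(unsigned char c)
{
    if (c >= 0xF0 && c <= 0xF9)
        return (unsigned char)(0x30 + (c - 0xF0));
    return c;
}

/* Hypothetical writer state: the output file plus the flag that the
 * creating function would set once, based on the content type. */
typedef struct {
    FILE *fp;
    int translate_back;  /* nonzero: undo the EBCDIC translation */
} DemoWriter;

void DemoWriter_put_character(DemoWriter *me, char c)
{
    unsigned char u = (unsigned char)c;

    if (me->translate_back)
        u = demo_toascii(u);
    putc(u, me->fp);
}

void DemoWriter_put_string(DemoWriter *me, const char *s)
{
    while (*s != '\0')
        DemoWriter_put_character(me, *s++);
}

void DemoWriter_write(DemoWriter *me, const char *buf, int len)
{
    int i;

    assert(len >= 0);
    for (i = 0; i < len; i++)
        DemoWriter_put_character(me, buf[i]);
}
```

Routing put_string and write through put_character keeps the flag check
in one place; a real implementation might instead translate a whole
buffer and call fwrite() once.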

Note that the existing HTSaveToFile() already does a binary-or-not
determination based on MIME type, currently only used for VMS.
That may not be always the right thing for you, but it's a starting
point.   (Lynx also keeps information (unused?) on what types are
binary, but unfortunately on a per-filesuffix, not on a per-MIMEtype
basis; see HTFile.c, HTInit.c.)
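
For illustration only, a MIME-type test could look roughly like the
sketch below.  The rule it applies (anything that is not text/* counts
as binary) is an assumption made for this example, not a description of
what HTSaveToFile() actually checks, and demo_is_binary_type() is a
made-up name.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns 0 for types starting with "text/" (case-insensitive),
 * 1 for everything else.  Assumed rule, for illustration only. */
int demo_is_binary_type(const char *mime)
{
    static const char prefix[] = "text/";
    size_t i;

    assert(mime != NULL);
    for (i = 0; prefix[i] != '\0'; i++) {
        /* A shorter string hits its terminating NUL here and fails
         * the comparison, so we never read past its end. */
        if (tolower((unsigned char)mime[i]) != prefix[i])
            return 1;
    }
    return 0;
}
```

The result of such a test is what would be stored in the flag when the
stream is created.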

An alternative approach to messing with HTSaveToFile() would be
to plug a little TOASCII()-or-not HTStream *before* HTSaveToFile().
So data would flow like this (for example):

   HTTP.c+HTCopy() -> HTMIME.c -> HTMaybeSaveToASCII() -> HTSaveToFile()

[ or rename HTSaveToFile to HTReallySaveToFile and rename
  HTMaybeSaveToASCII to HTSaveToFile #ifdef NOT_ASCII, which avoids
  having to change all places where HTSaveToFile occurs. ]

[ oh and HTMaybeSaveToASCII may be not the best name for this... ]

HTMaybeSaveToASCII() would determine 'do we want to translate back
or not' based on MIME type, set a flag, then call the existing
HTSaveToFile() and store a pointer to the target stream created by
HTSaveToFile(), then return itself ('me') if everything went ok
like other HTStream implementations do.

The 'MaybeSaveToASCII' stream's other methods would just pass the
call and data (after possible translation) on to its target stream's
methods.  And destroy itself after free() and abort() calls.
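
The pass-through stream described above could be sketched like this.
Everything here is hypothetical: DemoStream, BufferSink (a stand-in for
the stream that HTSaveToFile() would create), MaybeToAscii_new() and
the digits-only demo_toascii() are illustration names, and only
put_character is shown; put_string and write would forward the same
way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct DemoStream DemoStream;
typedef struct {
    void (*put_character)(DemoStream *me, char c);
    void (*free_stream)(DemoStream *me);
    void (*abort_stream)(DemoStream *me);
} DemoStreamClass;
struct DemoStream { const DemoStreamClass *isa; };

/* Toy stand-in for TOASCII(): EBCDIC digits 0xF0..0xF9 only. */
static unsigned char demo_toascii(unsigned char c)
{
    return (c >= 0xF0 && c <= 0xF9) ? (unsigned char)(0x30 + (c - 0xF0)) : c;
}

/* Target stream: stores bytes in a caller-supplied buffer. */
typedef struct {
    DemoStream base;
    unsigned char *buf;
    size_t cap;
    size_t *len;
} BufferSink;

static void Sink_put_character(DemoStream *me, char c)
{
    BufferSink *s = (BufferSink *)me;
    if (*s->len < s->cap)
        s->buf[(*s->len)++] = (unsigned char)c;
}
static void Sink_free(DemoStream *me) { free(me); }
static const DemoStreamClass SinkClass = {
    Sink_put_character, Sink_free, Sink_free
};

DemoStream *BufferSink_new(unsigned char *buf, size_t cap, size_t *len)
{
    BufferSink *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->base.isa = &SinkClass;
    s->buf = buf;
    s->cap = cap;
    s->len = len;
    *len = 0;
    return &s->base;
}

/* The filter: remembers its target and a translate-or-not flag. */
typedef struct {
    DemoStream base;
    DemoStream *target;
    int translate_back;
} MaybeToAscii;

static void Filter_put_character(DemoStream *me, char c)
{
    MaybeToAscii *f = (MaybeToAscii *)me;
    unsigned char u = (unsigned char)c;
    if (f->translate_back)
        u = demo_toascii(u);
    f->target->isa->put_character(f->target, (char)u);
}
static void Filter_free(DemoStream *me)
{
    MaybeToAscii *f = (MaybeToAscii *)me;
    f->target->isa->free_stream(f->target);  /* pass the call on... */
    free(f);                                 /* ...then destroy itself */
}
static void Filter_abort(DemoStream *me)
{
    MaybeToAscii *f = (MaybeToAscii *)me;
    f->target->isa->abort_stream(f->target);
    free(f);
}
static const DemoStreamClass FilterClass = {
    Filter_put_character, Filter_free, Filter_abort
};

/* Takes ownership of target; returns 'me' on success, NULL on failure. */
DemoStream *MaybeToAscii_new(int is_binary, DemoStream *target)
{
    MaybeToAscii *f;

    assert(target != NULL);
    f = malloc(sizeof *f);
    if (f == NULL)
        return NULL;
    f->base.isa = &FilterClass;
    f->target = target;
    f->translate_back = is_binary;
    return &f->base;
}
```

The caller only ever sees the filter; freeing or aborting it also
finishes the target stream, so nothing downstream has to change.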

You should find more or less usable templates for this in nearly
all .c files that have a 'PUBLIC HTStream* <name> ARGS3'.


   Klaus
