[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: The state I'm in - rfc 822 parsing.
From: |
Alain Magloire |
Subject: |
Re: The state I'm in - rfc 822 parsing. |
Date: |
Wed, 14 Mar 2001 22:18:17 -0500 (EST) |
>
> > > If I find the awful comment, I put it where I would have
> > > put a phrase.
> >
> > Seems you actually remove comments totaly.
> > Some clients may want them right. Especially in the "Received:"
>
> Uh, GetMailBox() grabs the comment and puts in the personal,
> doesn't it? I don't have the source in front of me, but
> if what I sent you doesn't do that I sent the wrong diff.
# ./addr
alain(the squirrel fighter)@qnx.com
alain(the squirrel fighter)@qnx.com=> pcount 1
1 email <address@hidden>
local-part <alain> domain <qnx.com>
# ./addr -1
alain(the squirrel fighter)@qnx.com
alain(the squirrel fighter)@qnx.com=> pcount 1
1 email <address@hidden>
comments <the squirrel fighter>
> The underlying GetComment() collects the contents of the comment,
> so if (when) I parse Received fields, I will get the comment
> contents and return them.
Actually, I'm off topic with the Received field, since I do
not think anyone will use it in address_create().
> > Ok, suggestion; how about to build on top(not now, but much ... much later)
> > something that would permit a user to build and save aliases:
> > alias_create (&alias, "canadian_hackers");
> > alias_add (alias, "address@hidden");
> > alias_add (alias, "address@hidden");
>
> Sure, but I'm not a fan of calling it alias. Alias is used as a nickname
> feature in mail clients, and the alias can be to a single route-addr,
> and usually is. Alias is used in sendmail, I think, to expand to
> a list of addresses, #(mailbox), not to a group. In general, I like
> to call something by the RFC name, i.e. "group".
Ok.
> > 8-), If I remember djb, he was not easy to deal with.
>
> Careful there, some people say that about RMS! ;-) Actually, his
..
8-)
> > > - <>, valid in SMTP "rcpt from:", should I parse and ignore it?
> >
> > I vote strip. Since I want to reuse the email part to build the "From "
> > string separator for Unix Mbox.
>
> By strip, you mean parse and ignore? If I parse and ignore, this
> would be a list with two addresses: address@hidden, <>, address@hidden
Yes, which sounds reasonnable, no?
for exemple:
To: Alain M. <address@hidden>
I would expect address_get_email () --> "address@hidden".
> > Yes, you could use the "poison" mallocs (xmalloc(), xrealloc, xcalloc)
> > but I do know if that is the proper way for a library to behave.
>
> Our test suites at work do amazing things, we do a loop around
> our test cases, and before entering the loop set the n'th malloc()
> to fail. Then we call the test, it fails with ENOMEM, we then
> check that all allocated memory was free. Then we increment n.
> It's stunning what exhaustive checking turns up... I'll bet you
> a case of beer the same approach to libmailbox would turn up
> bugs.
I would __love__ to run this type of test on the library and programs.
You might be suprise, for the POP and MBOX(unix) client, mailutils
could pass we good grades 8-).
So far there is no serious test case for the mailutils, maybe later.
> > The realloc-by-1 is hard. Why not construct a "string class" and realloc
> > by a fixed size, say 16/32, since it's only short strings.
> >
> > struct _string {
> > char *buffer;
> > size_t buflen;
> > };
> > and pass around the struct _string *;
>
> Ok. How about I use my favorite optimization, double memory on realloc(),
> half when usage drops below a quarter of allocated. This leads to O(log n)
> performance, or something, rather than O(n/16). I'll do this.
>
> I was also thinking about using
> struct _strspan {
> char* begin;
> char* end;
> };
Sounds good.
> So I could parse through stuff, return what I found, and the caller could
> decide whether they wanted to allocate space for it. Have to look and see
> if that's useful.
How would you do this in C? for example,
status = address_create (&address, big_complex_address_string);
if status == ENOMEM, I expect the address object to not be constructed and
any memory acquire in the parsing process release. Actually this goes
for any status != 0 i.e status != 0 means an error occured.
> > Or using GNU obstack object.
>
> How easy would it be to cut GNU obstack code into mailutils? Not
> all libraries are GNU...
It's a generic container.
> > - (*)Proper prototypes is a good thing.
>
> parse822_ as a function prefix ok? _ as word seperator, with
> all lower case function names ok? The current names are a
> result of ':%s/MParse822:://g'...
If you intend to extern them, yes to reserve a prefix for your
namespace is a good idea.
> > - (*)A more consitent use of "const char *" versus "char *"
> > for example StrAppend*() should be :
> > int StrAppend (char **to, const char *from);
>
> I didn't do this initially, because the C spec has a bug:
>
> char* acp[];
>
> void f(const char* const a[]);
>
> f(acp); /* this causes warnings in C, though not C++, C is
> being lazy, its a safe conversion. */
>
> It's a pain in the ass. Since I use double indirection a lot,
> I'll have to cast all over the place.
Ok, but this is not our case since you have "const char *" and for
double indirection "const char **". Are you missing C++ issues,
Remember this is C ;-)
> > - (*)"new = (char*) realloc()", cast not needed
>
> realloc() returns void*, same as malloc, doesn't it? I'll check.
Yes, realloc()/malloc()/calloc() returns a the generic pointer (void *)
so an explicit cast is not needed, You only will get a warning
if you did not include <stdlib.h>. I do not know for C++.
> > - (**)GNU coding style: this not a high priority, I probably will not do
> > nothing
> > now, since lots of othe code have different style, for example, mime.c.
> > It's up to you, some people can not stand it.
>
> The GNU coding style was invented as a demonstration of how cool
> emacs is, IMO. There's not enough time in my life to convince my
> editor to support GNU conventions. I could run it through indent,
> but it would probably make changes all over the place. How do
> people who don't use emacs usually produce GNU code? If somebody
> has a vim cfg statement that will do it, I'll use it.
8-), sure, have pointed this to shaleh too, this is done for consistency.
There is a GNU std which is much more then just coding style. So far having
a thight std as been very good in the harmony of the GNU project, this
goes for things like long_options which are consistent etc ..
If you browse the latest packages on alpha.gnu.org/gnu/{tar,grep,fetish}
(fetish is {file,shell,text}utils) you will notice an overall consistency
even in the function names ;-).
But that said, program in any coding style, you want, we can kosher
after, other things are more important like not having array limits.
> > - (***)Some memory leaks, when the parsing failed, since it's a library
> > we should release, (see append patch, ok to commit?).
>
> Some? I don't recall checking the return value of a single StrAppend()!
8-) that's another issue, for example if the parsing failed, there were
some leaks. For example :
# ./addr
alain me @me
alain me @me=> pcount 0
Would leak. But very obvious to spot.
> I'll look at the patch tonight.
..
> In general, I'm not adverse to fixing bugs in code I wrote! 8-)
8-), well sometimes the authors may have a better patch and
since parse822 is new, you may have already change things in your
repository. For minor stuff, I usually commit straight.
> I'll integrate your patch with a bunch of the other things.
I just noted, some of diff are just trailing differences.
> Btw, my intention was to make the parsing functions public,
> not static, that's what I meant by "proper" prototypes.
>
> Should parse822.h be in mailbox/include, mailbox, or
> include/mailbox? The component functions can be quickly used
> by others, for instance if you want'ed to parse the
> route, you would build your own loop, but use the parse822
> functions to grab the bits, skip the comments, etc. Or if
> you were to use it to parse the header fields, which I was,
> you might want to use its functions from somewhere else.
Yes good idea, parse822 stand on its own, I would recommand
the prefix you propose above (parse822_*()) and static anything else
internal. Also to uncomment out the GetFieldName() GetFieldBody().
--
au revoir, alain
----
Aussi haut que l'on soit assis, on est toujours assis que sur son cul !!!
- mailutils (rfc822 parser), (continued)
- mailutils (rfc822 parser), Alain Magloire, 2001/03/13
- Virtual User/Alternative Authorization Methods, Jeff Breitner, 2001/03/14
- Re: Virtual User/Alternative Authorization Methods, Jeremy C. Reed, 2001/03/15
- Re: Virtual User/Alternative Authorization Methods, Alain Magloire, 2001/03/15
- Re: Virtual User/Alternative Authorization Methods, Jeremy C. Reed, 2001/03/15
- Re: Virtual User/Alternative Authorization Methods, Alain Magloire, 2001/03/15
Re: The state I'm in - rfc 822 parsing., Sam Roberts, 2001/03/14
Message not available