Weird bugs
From: Jason Kim
Subject: Weird bugs
Date: Wed, 28 Sep 2005 23:09:28 -0400
User-agent: KMail/1.8.1
Ok, I'm going to ramble a bit here, but I've been staring down cfengine code
all day, so please forgive me. I've just gone insane trying to figure this
out. If anyone actually has the patience to read the whole thing and grok it,
please let me know if I'm anywhere near on track (and if you can help me get
my sanity back).
======
I had originally set up my systems to use an unqualified hostname. So for
example, the output of 'hostname' would be 'testy', and 'hostname -f' would
be 'testy.mgmt.advance.net'. This worked without any problems for the
majority of things; only a few things bugged me:
-When cfexecd sent me emails, they would arrive with a subject of
'(testy./10.1.9.9)' and from 'address@hidden'. Note the lack of a domain, and
the extra '.' in the subject.
-The emails would also have a reply-to field set to 'address@hidden', which
was the mailto field.
-I would get an extra runlog in /var/cfengine named 'cfengine..runlog' (the
normal runlog would be named 'cfengine.testy.runlog'). The entries in that
file are from standalone cfengine scripts; it seems they couldn't figure out
the hostname to generate a proper runlog name.
So, just to see what would happen, I changed my hostname to use an FQDN, i.e.
the output of 'hostname' became 'testy.mgmt.advance.net'. This seems to have
changed the behavior of a couple of things:
-The emails now come with the subject '(testy.mgmt.advance.net/10.1.9.9)' and
from 'address@hidden'.
-The reply-to field was still set as 'address@hidden'.
-A new runlog would appear: 'cfengine.testy.mgmt.advance.net.runlog', which
came from cfexecd. Cfagent would write to 'cfengine.zerg.runlog', and the
standalones would write to 'cfengine..runlog'.
After hours of poking around I have the following observations and theories
(this is the long confusing part and I apologize):
The emailing section of cfexecd does set a reply-to field (via the 'MAIL
FROM:' SMTP command) with a) the 'EmailFrom' if it was specified, or b) the
address 'address@hidden' where '[domain]' is everything after the '@' in
the 'EmailTo' variable, or c) the 'EmailTo' variable itself. I don't set the
'EmailFrom' variable, so I should be getting a reply-to of
'address@hidden', not 'address@hidden'. I believe the problem is a
sscanf call which is supposed to grab the domain from the 'EmailTo' variable;
I've attached a simple patch.
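To make the a/b/c fallback order concrete, here is a rough Python sketch of the logic as described above. It is not the cfexecd C code: the function name, the parameter names, and the `local_part` default are all hypothetical (the archive hides the real address), and it only illustrates which branch should win when 'EmailFrom' is unset.

```python
def mail_from_address(email_to, email_from="", local_part="cfengine"):
    """Pick the sender address using the a/b/c order described in the post.

    local_part is a hypothetical placeholder; the real value is not
    visible in the archived message.
    """
    if email_from:
        # a) an explicit EmailFrom always wins
        return email_from
    _, sep, domain = email_to.partition("@")
    if sep and domain:
        # b) build an address in the same domain as EmailTo
        return f"{local_part}@{domain}"
    # c) no usable domain: fall back to EmailTo itself
    return email_to
```

With 'EmailFrom' unset and an 'EmailTo' containing an '@', branch b) should be the one taken; ending up with 'EmailTo' itself means the domain extraction failed and the code fell through to c).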
The emailing section is supposed to use the fqdn (the VFQNAME variable) for
the subject and from fields, so there is apparently a problem in acquiring
the fqdn when the hostname is unqualified. So here's what I think is going
on...
The main() function of cfexecd calls CheckOptsAndInit() which then calls
GetNameInfo() which then calls uname() and then calls SetDomainName() with
the resulting nodename. Now oddly enough, SetDomainName() then ignores the
given name and calls hostname() directly.
If a '.' appears in the resulting hostname it's assumed to be fully qualified:
VFQNAME is set to it and the domain (VDOMAIN) variable is set to everything
after the first '.' (and the emails appear correctly).
If there is no '.' and VDOMAIN is not set to 'undefined.domain', VDOMAIN is
appended to the hostname to create the VFQNAME. The only problem is that at
this point VDOMAIN is empty, so we end up with a VFQNAME of 'testy.'.
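The two branches above can be sketched in Python to show where the trailing '.' comes from. This is only a model of the behaviour as I read it, not the real SetDomainName(): the function name and signature are made up, and the final branch (what happens when VDOMAIN is 'undefined.domain') is an assumption, since I haven't traced that case.

```python
def set_domain_name(hostname, vdomain=""):
    """Return (VFQNAME, VDOMAIN) following the logic described in the post."""
    if "." in hostname:
        # Assumed fully qualified: the domain is everything after the first dot.
        return hostname, hostname.split(".", 1)[1]
    if vdomain != "undefined.domain":
        # Unqualified: append VDOMAIN, even if it is still an empty string.
        return hostname + "." + vdomain, vdomain
    # Assumption: leave the name untouched when the domain is 'undefined.domain'.
    return hostname, vdomain


print(set_domain_name("testy.mgmt.advance.net"))
print(set_domain_name("testy"))  # VDOMAIN still empty at this point
```

The second call is the unqualified case: an empty VDOMAIN passes the "not 'undefined.domain'" check, so the hostname gets a bare '.' appended.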
Now it would appear that the fix is to always use a fully qualified hostname,
but that brings up the runlog problems...
From what I can deduce, runlogs a) are created every time something needs a
lock, b) are always named with the unqualified name (the VUQNAME variable),
and c) are used throughout cfexecd, cfagent, and cfenvd. So theoretically I
should always have only one runlog, 'cfengine.testy.runlog', no matter whether
my hostname is fully qualified or not. Now the problem is that cfexecd assigns
the nodename of a machine to VUQNAME flat out, regardless of whether it
is fully qualified or not, resulting in differing runlog names depending on
the hostname.
Cfexecd then spawns cfagent, which _doesn't_ make assumptions about the
hostname; it calls GetNameInfo() as described above. Now regardless of
whether the hostname is fully qualified or not, the GetNameInfo() call only
sets VFQNAME and VDOMAIN, so cfagent has to parse a 'domain' variable in
order to figure out the VUQNAME. As long as that is set, it appears that
cfagent's runlogs always have the correct form. However, I'm not in the habit
of setting the domain var in standalone cfagent scripts, which results in
cfagent never setting the VUQNAME and creating 'cfengine..runlog' files.
Oh, and for the record, it seems that cfenvd always uses the name 'localhost'
when acquiring a lock, so it always generates its own file,
'cfengine.localhost.runlog'.
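Putting the runlog observations side by side, here is a small Python sketch. The 'cfengine.<VUQNAME>.runlog' pattern is inferred from the filenames I'm seeing rather than taken from the source, and `runlog_name` is a made-up helper.

```python
def runlog_name(vuqname):
    """Build a runlog filename from whatever VUQNAME happens to hold."""
    return f"cfengine.{vuqname}.runlog"


# VUQNAME values as each program appears to end up with them (per the post).
cases = {
    "cfexecd, unqualified hostname": "testy",
    "cfexecd, fully qualified hostname": "testy.mgmt.advance.net",
    "standalone cfagent, no domain set": "",
    "cfenvd": "localhost",
}

for who, vuqname in cases.items():
    print(f"{who}: {runlog_name(vuqname)}")
```

Four different VUQNAME values give four different files, where a consistently derived unqualified name would give just one.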
Sooo...
The question is, WTF is the correct way to name a host?? The docs seem to
disparage using fully qualified names, but also claim that they shouldn't be
an issue. I don't think these bugs actually affect the way cfexecd/cfagent
run, except perhaps via some odd lock name or class definition mangling... So
should I just ignore these problems for the sake of my sanity? I had set out
to see if I could squash a couple of simple bugs, but I don't have the
slightest idea of where one would even start. I'd venture to say that the
whole set of functions that set the qualified/unqualified hostnames and
domainname need to be pulled out and made a separate, consistent module for
cfexecd, cfagent, and (possibly) cfenvd. But that seems like waaay too much
work for what is basically such a small set of problems that no one but me
has even cared about (that I know of anyway).
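For what it's worth, the kind of single, shared routine I have in mind could look roughly like this Python sketch. Everything here is hypothetical (the `HostNames` type, `split_host_names`, and the `default_domain` parameter); it is only meant to show all three names being derived in one place from one input, not a proposal for the actual C code.

```python
from typing import NamedTuple


class HostNames(NamedTuple):
    uqname: str   # unqualified name, e.g. for runlog and lock names
    fqname: str   # fully qualified name, e.g. for email subject/from
    domain: str   # domain part, possibly empty


def split_host_names(hostname, default_domain=""):
    """Derive all three names from one hostname, qualified or not."""
    uqname, _, domain = hostname.partition(".")
    domain = domain or default_domain
    # Only add the dot when there really is a domain to append.
    fqname = f"{uqname}.{domain}" if domain else uqname
    return HostNames(uqname, fqname, domain)
```

With something like this, 'testy' and 'testy.mgmt.advance.net' would both yield an unqualified name of 'testy', and an unknown domain would never produce a trailing '.'.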
Congrats if you got this far. I'm going to go cry now...
-JayKim
Attachment: cfexecd_EmailFrom_patch (Description: Text Data)