bug-cfengine

Weird bugs


From: Jason Kim
Subject: Weird bugs
Date: Wed, 28 Sep 2005 23:09:28 -0400
User-agent: KMail/1.8.1

Ok, I'm going to ramble a bit here, but I've been staring down cfengine code 
all day, so please forgive me. I've just gone insane trying to figure this 
out. If anyone actually has the patience to read the whole thing and grok it, 
please let me know if I'm anywhere near on track (and if you can help me get 
my sanity back).
======

I had originally set up my systems to use an unqualified hostname. So for 
example, the output of 'hostname' would be 'testy', and 'hostname -f' would 
be 'testy.mgmt.advance.net'. This worked without any problems for the 
majority of things; only a few things bugged me:
-When cfexecd sent me emails, they came with a subject of 
'(testy./10.1.9.9)' and from 'address@hidden'. Note the lack of a domain, and 
the extra '.' in the subject.
-The emails also had a reply-to field set to 'address@hidden', which 
was the mailto field.
-I would get an extra runlog in /var/cfengine named 'cfengine..runlog' (the 
normal runlog would be named 'cfengine.testy.runlog'). The entries in that 
file are from standalone cfengine scripts; it seemed like they couldn't 
figure out the hostname to generate a proper runlog.

So, just to see what would happen, I changed my hostname to use an FQDN, i.e. 
the output of 'hostname' would be 'testy.mgmt.advance.net'. This seems to have 
changed the behavior of a couple of things:
-The emails now come with the subject '(testy.mgmt.advance.net/10.1.9.9)' and 
from 'address@hidden'.
-The reply-to field was still set as 'address@hidden'.
-A new runlog would appear: 'cfengine.testy.mgmt.advance.net.runlog', which 
came from cfexecd. Cfagent would write to 'cfengine.zerg.runlog', and the 
standalones would write to 'cfengine..runlog'.

After hours of poking around I have the following observations and theories 
(this is the long confusing part and I apologize):

The emailing section of cfexecd does set a reply-to field (via the 'MAIL 
FROM:' SMTP command) with a) the 'EmailFrom' if it was specified, or b) the 
address 'address@hidden' where '[domain]' is everything after the '@' in 
the 'EmailTo' variable, or c) the 'EmailTo' variable itself. I don't set the 
'EmailFrom' variable, so I should be getting a reply-to of 
'address@hidden', not 'address@hidden'. I believe the problem is a 
sscanf call which is supposed to grab the domain from the 'EmailTo' variable; 
I've attached a simple patch.
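
To make the intended fallback order concrete, here is a rough sketch of pulling 
the domain out of a mailto address with sscanf. This is not the attached patch 
and not cfengine's actual code; the function name build_sender() and the 
local_part parameter are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the cfengine source): build a sender
 * address "<local_part>@<domain>", where <domain> is everything after
 * the '@' in mailto. Returns 1 if a domain was found; otherwise copies
 * mailto itself into out and returns 0. */
int build_sender(const char *local_part, const char *mailto,
                 char *out, size_t outlen)
{
    char domain[256] = "";

    /* %*[^@] skips the text before '@' without storing it, the literal
     * '@' must match, and %255s reads the domain into a bounded buffer. */
    if (sscanf(mailto, "%*[^@]@%255s", domain) != 1) {
        snprintf(out, outlen, "%s", mailto);  /* case c): use mailto as-is */
        return 0;
    }

    snprintf(out, outlen, "%s@%s", local_part, domain);  /* case b) */
    return 1;
}
```

Checking the sscanf return value is the important part: if the domain is never 
assigned, the code falls through to using the mailto address unchanged, which 
matches the symptom described above.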

The emailing section is supposed to use the fqdn (the VFQNAME variable) for 
the subject and from fields, so there apparently is a problem in acquiring 
the fqdn when the hostname is unqualified. So here's what I think is going 
on... 
The main() function of cfexecd calls CheckOptsAndInit() which then calls 
GetNameInfo() which then calls uname() and then calls SetDomainName() with 
the resulting nodename. Now oddly enough, SetDomainName() then ignores the 
given name and calls hostname() directly. 
If a '.' appears in the resulting hostname, it's assumed to be fully 
qualified: VFQNAME is set to it and the domain variable (VDOMAIN) is set to 
everything after the first '.' (and the emails appear correctly). 
If there is no '.' and VDOMAIN is not set to 'undefined.domain', VDOMAIN is 
appended to the hostname to create the VFQNAME. The only problem is that at 
this point VDOMAIN is empty, so we end up with a VFQNAME of 'testy.'.
Now it would appear that the fix is to always use a fully qualified hostname, 
but that brings up the runlog problems...
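
The branching described above can be sketched roughly as follows. This is a 
simplified model of my reading of the logic, not the real SetDomainName(); the 
struct and function names are made up, and the final else branch (what happens 
when VDOMAIN is 'undefined.domain') is an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the VFQNAME and VDOMAIN globals. */
struct names {
    char fqname[512];
    char domain[256];
};

/* Simplified model of the described logic (not the cfengine source). */
void resolve_names(const char *hostname, const char *initial_domain,
                   struct names *n)
{
    const char *dot = strchr(hostname, '.');

    snprintf(n->domain, sizeof n->domain, "%s", initial_domain);

    if (dot != NULL) {
        /* A '.' is present: treat the hostname as fully qualified. */
        snprintf(n->fqname, sizeof n->fqname, "%s", hostname);
        snprintf(n->domain, sizeof n->domain, "%s", dot + 1);
    } else if (strcmp(n->domain, "undefined.domain") != 0) {
        /* No '.': append the domain, even if it is still empty. */
        snprintf(n->fqname, sizeof n->fqname, "%s.%s", hostname, n->domain);
    } else {
        /* Assumption: otherwise the bare hostname is used. */
        snprintf(n->fqname, sizeof n->fqname, "%s", hostname);
    }
}
```

Calling resolve_names("testy", "", &n) leaves n.fqname as "testy." because the 
empty domain passes the 'undefined.domain' check, which would explain the 
trailing '.' in the email subject.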

From what I can deduce, runlogs are a) created every time something needs a 
lock, b) are always called with the unqualified name (the VUQNAME variable), 
and c) are used throughout cfexecd, cfagent, and cfenvd. So theoretically I 
should always have only one runlog, 'cfengine.testy.runlog', no matter if my 
hostname is fully qualified or not. Now the problem is that cfexecd assigns 
the nodename of a machine to VUQNAME flat out, regardless of whether it is 
fully qualified or not, resulting in differing runlog names depending on 
the hostname.
Cfexecd then spawns cfagent, which _doesn't_ make assumptions about the 
hostname; it calls GetNameInfo() as described above. Now regardless of 
whether the hostname is fully qualified or not, the GetNameInfo() call only 
sets the VFQNAME and VDOMAIN, so cfagent has to parse a 'domain' variable in 
order to figure out the VUQNAME. As long as that is set, it appears that 
cfagent's runlogs are always the correct form. However, I'm not in the habit 
of setting the domain var in standalone cfagent scripts, which results in 
cfagent never setting the VUQNAME and creating 'cfengine..runlog' files.
Oh, and for the record, it seems that cfenvd always uses the name 'localhost' 
when acquiring a lock, so it always generates its own file, 
'cfengine.localhost.runlog'.
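
For comparison, here is a rough sketch of how a consistent runlog name could be 
derived from whatever nodename comes back, qualified or not. It only 
illustrates the behavior I expected; runlog_name() is a made-up helper, not a 
function in cfengine.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the cfengine source): build
 * "cfengine.<unqualified name>.runlog" from a nodename that may or may
 * not be fully qualified, by cutting the name at the first '.'. */
void runlog_name(const char *nodename, char *out, size_t outlen)
{
    /* Length of the leading part that contains no '.'. */
    size_t len = strcspn(nodename, ".");

    /* %.*s prints only the first len characters of nodename. */
    snprintf(out, outlen, "cfengine.%.*s.runlog", (int)len, nodename);
}
```

With this, both "testy" and "testy.mgmt.advance.net" map to 
'cfengine.testy.runlog', while an empty name still produces 
'cfengine..runlog', the same filename the standalone scripts end up with when 
VUQNAME is never set.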

Sooo...
The question is, WTF is the correct way to name a host?? The docs seem to 
disparage using fully qualified names, but also claim that they shouldn't be 
an issue. I don't think these bugs actually affect the way cfexecd/cfagent 
run, except perhaps via some odd lock name or class definition mangling... So 
should I just ignore these problems for the sake of my sanity? I had set out 
to see if I could squash a couple of simple bugs, but I don't have the 
slightest idea of where one would even start. I'd venture to say that the 
whole set of functions that set the qualified/unqualified hostnames and 
domainname need to be pulled out and made a separate, consistent module for 
cfexecd, cfagent, and (possibly) cfenvd. But that seems like waaay too much 
work for what is basically such a small set of problems that no one but me 
has even cared about them (that I know of anyway).

Congrats if you got this far. I'm going to go cry now...
-JayKim

Attachment: cfexecd_EmailFrom_patch
Description: Text Data

