Re: Problems with netgroups and ldap.
From: Mark Burgess
Subject: Re: Problems with netgroups and ldap.
Date: Fri, 08 Apr 2005 20:10:58 +0200
I could not find the function on my new Linux machine. It looks like a thread
reentrancy problem, which is very strange. Is this a multiprocessor machine?
It might take me a little while to look at this as I am about to travel
to the US for a week.
M
On Fri, 2005-04-08 at 10:12 -0700, Mark Keller wrote:
> > Mark, this sounds odd. Nothing has changed here in the cfengine code for
> > many years. I don't think the innetgr function is portable. It looks
> > solarissy. Have you tried doing an strace (truss) to see exactly what it
> > gets stuck on?
> >
>
> The innetgr function seems to be on Solaris, Redhat EL3, Fedora, HPUX 11,
> FreeBSD, and Mac OS X in netdb.h (just doing a quick google search anyhow). I
> suppose it might not be on all OS's though. Am-utils uses it exclusively for
> netgroup calls for all OS's it supports and so does pam_access, which are
> working for me.
>
> So I added a couple of debug statements to the original code:
> case netgroup:
>     setnetgrent(ebuff);
>     Debug1("set netgroup to %s\n",ebuff);
>
>     while (getnetgrent(&machine,&user,&domain))
>        {
>        Debug1("getnetgrent m = %s, u = %s, d = %s in netgroup %s\n",machine,user,domain,ebuff);
>        if (strcmp(machine,VDEFAULTBINSERVER.name) == 0)
>           {
>           Debug1("Matched %s in netgroup %s\n",machine,ebuff);
>           AddClassToHeap(GROUPBUFF);
>           break;
>           }
>        ...truncated...
>        }
>
>     endnetgrent();
>     break;
>
> Running "cfagent -Bvq -d1", I get the following:
>
> ....truncated.....
> ==============================BEGIN NEW ACTION Groups:=============
>
>
> Resetting CLASS to ANY
>
> LVALUE betahosts
> HandleLVALUE(betahosts) in action Groups:
> EQUALS =
> LEFTBRACK
> RVAL-VAROBJ address@hidden
>
> HandleGroupRvalue(address@hidden)
> Netgroup rval, lookup NIS group (ops-beta-sos5)
> ExpandVarstring(ops-beta-sos5)
> HandleGroupRVal(ops-beta-sos5) group (betahosts), type=1
> set netgroup to ops-beta-sos5
> cfengine:: Time out
>
> It takes a couple of minutes before I get the "Time out" error and then
> cfagent just sits there for an extremely long time.
>
>
> Now Running "truss -f cfagent -Bvq -d1", I get the following:
>
> ....truncated.......
> 8328: write(5, "160301\08610\0\082\0809F".., 182) = 182
> 8328: read(5, "140301\001", 5) = 5
> 8328: read(5, "01", 1) = 1
> 8328: read(5, "160301\0 ", 5) = 5
> 8328: read(5, " O 09D12 G1EF981 kF59D b".., 32) = 32
> 8328: time() = 1112978413
> 8328: write(5, "170301\01ED8 ]BE 3 =B0 ]".., 35) = 35
> 8328: time() = 1112978413
> 8328: poll(0xFFBF9708, 1, 30000) = 1
> 8328: read(5, "170301\01E", 5) = 5
> 8328: read(5, "C8C5A8C1EB N $1A19 g H05".., 30) = 30
> 8328: time() = 1112978413
> 8328: setsockopt(5, SOL_SOCKET, SO_KEEPALIVE, 0xFFBFB8D0, 4, 1) = 0
> 8328: fcntl(5, F_SETFD, 0x00000001) = 0
> 8328: getsockname(5, 0xFEFF00C0, 0xFFBFB8CC, 1) = 0
> 8328: getpeername(5, 0xFEFF00D0, 0xFFBFB8C8, 1) = 0
> 8328: time() = 1112978413
> 8328: getpid() = 8328 [8327]
> 8328: getuid() = 0 [0]
> 8328: time() = 1112978413
> 8328: write(5, "170301\095D58BA9\r FAE S".., 154) = 154
> 8328: poll(0xFFBF96B0, 1, -1) = 1
> 8328: read(5, "17030101 .", 5) = 5
> 8328: read(5, "\tF20215D380\0B3A41BDD83".., 302) = 302
> 8328: poll(0xFFBF96B0, 1, -1) = 1
> 8328: read(5, "170301\01E", 5) = 5
> 8328: read(5, "9CBC\b G02 [0691B2\nFBF0".., 30) = 30
> 8328: time() = 1112978413
> 8328: time() = 1112978413
> 8328: sigaction(SIGPIPE, 0xFFBFC0C0, 0x00000000) = 0
> set netgroup to ops-beta-sos5
> 8328: write(1, " s e t n e t g r o u p".., 30) = 30
> 8328: sigaction(SIGPIPE, 0xFFBFC118, 0xFEFF10E0) = 0
> 8328: mmap(0x00000000, 110592, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFEDA0000
> 8328: lwp_park(0x00000000, 0) (sleeping...)
> 8328: Received signal #14, SIGALRM, in lwp_park() [caught]
> 8328: lwp_park(0x00000000, 0) Err#4 EINTR
> 8328: sigprocmask(SIG_SETMASK, 0xFFBFBC54, 0x00000000) = 0
> 8328: alarm(0) = 0
> cfengine:: Time out
> 8328: write(1, " c f e n g i n e : : T".., 20) = 20
> 8328: sigprocmask(SIG_SETMASK, 0xFF07A074, 0xFFBFBA08) = 0
> 8328: lwp_unpark(1, 1) = 0
> 8328: setcontext(0xFFBFBA18)
> 8328: lwp_park(0x00000000, 0) = 0
> 8328: lwp_park(0x00000000, 0) (sleeping...)
>
> Here it takes a couple of minutes between the first lwp_park and the "Time
> out" error, and then it stays on the last lwp_park for an extremely long time.
> It might be infinite, but it is at least hours. If you need more trussing,
> please let me know.
>
> So to me it seems like something is going wrong around the getnetgrent
> function. I can't tell what the problem might be, as my C coding/debugging is
> rusty. My small test code using getnetgrent seems to work just fine.
>
> Thanks,
>
> Mark Keller
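
For anyone trying to reproduce this outside cfagent, a standalone check along the following lines may help. It is only a sketch, not the cfengine code or the poster's test program: the helper names (triple_matches_host, host_in_netgroup_enum, host_in_netgroup) are made up for illustration, and the _DEFAULT_SOURCE define assumes glibc. It relies on the usual netgroup convention that an empty field in a triple acts as a wildcard and is handed back by getnetgrent as a NULL pointer, which the strcmp in the snippet above would not survive. Treating a wildcard machine field as a match is also an assumption about the desired behaviour.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; exposes the netgroup calls in <netdb.h> */
#include <assert.h>
#include <netdb.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: compare one triple's machine field with a host name.
 * A NULL machine field is treated as a wildcard that matches any host. */
int triple_matches_host(const char *machine, const char *host)
{
    assert(host != NULL);
    if (machine == NULL)
        return 1;
    return strcmp(machine, host) == 0;
}

/* Hypothetical helper: walk every triple of the netgroup and report
 * whether any of them matches the host (1 = match, 0 = no match). */
int host_in_netgroup_enum(const char *group, const char *host)
{
    char *machine = NULL, *user = NULL, *domain = NULL;
    int found = 0;

    (void) setnetgrent(group);
    while (getnetgrent(&machine, &user, &domain)) {
        if (triple_matches_host(machine, host)) {
            found = 1;
            break;
        }
    }
    endnetgrent();   /* always release the enumeration state */
    return found;
}

/* Hypothetical helper: the same question asked through innetgr, which
 * leaves the iteration to the C library. NULL user/domain = "any". */
int host_in_netgroup(const char *group, const char *host)
{
    return innetgr(group, host, NULL, NULL);
}
```

Calling both helpers from a small main() with the same arguments, for example the netgroup ops-beta-sos5 and the local host name, should show whether the hang also happens outside cfagent and whether only the enumeration path is affected.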
>