coreutils

From: Glenn Golden
Subject: Re: Unexpected behavior of 'tail --follow=name' on special file via symlink
Date: Sun, 5 Feb 2023 09:56:27 -0700

Glenn Golden <gdg@zplane.com> [2023-02-02 15:55:51 -0700]:
> Pádraig Brady <P@draigBrady.com> [2023-02-01 20:37:47 +0000]:
> > 
> > That was informative thanks.
> >
> > What I think is happening is that to support --follow=name tail(1) operates
> > in non blocking mode so that it doesn't block when reading a file, and has
> > the opportunity to recheck.
> > 
> > Now it determines when a read() shouldn't block by doing stat() and if size
> > or mtime haven't changed, then it doesn't perform the read. In fact for non
> > regular files it only goes on the mtime, which I guess is not changing for
> > your device.
> > 
> 
> Yes, that appears to be the case: The mtime is evidently updated only when
> the USB host writes to the device; data arriving from the device to the host
> does not update mtime.
> 
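The stat()-based gating described above can be illustrated with a minimal sketch. This is not the code in tail.c; it only mirrors the description in this thread (regular files keyed on size or mtime, everything else on mtime alone). The names `Snapshot`, `take_snapshot` and `should_read` are hypothetical.

```python
import os
import stat
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """The stat() members the check looks at (hypothetical container)."""
    mode: int
    size: int
    mtime_ns: int


def take_snapshot(fd: int) -> Snapshot:
    """Capture the relevant stat() members for an open descriptor."""
    st = os.fstat(fd)
    return Snapshot(st.st_mode, st.st_size, st.st_mtime_ns)


def should_read(prev: Snapshot, cur: Snapshot) -> bool:
    """Decide whether to attempt a read, per the description in this thread."""
    mtime_changed = cur.mtime_ns != prev.mtime_ns
    if stat.S_ISREG(cur.mode):
        # Regular file: a change in either size or mtime triggers a read.
        return mtime_changed or cur.size != prev.size
    # Non-regular file (device, FIFO, ...): only mtime is consulted.
    return mtime_changed
```

With a device whose mtime never moves when data arrives, `should_read` stays false indefinitely, which matches the symptom reported here.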
> >
> > Note if you were just doing --follow=descriptor on a single file tail(1)
> > would operate in a simpler blocking manner and would work fine with your
> > device I expect (not through the changing symlink of course).
> > 
> 
> I think following by descriptor can't really work reliably for this case,
> because the tail process, even though it operates in blocking mode as you say,
> nevertheless kind of heisenbergs the situation because it holds the original
> fd open, which in turn causes the USB driver to assign a new minor device
> number to the device upon its reboot. And that effectively dissociates the
> original fd from the device. So tail thereafter just sees EOF forever.
> 
> Example: Suppose the device node that the symlink dev/ttyPSLOG points to
> is /dev/ttyACM0, and tail has been successfully following that via
> --follow=descriptor in blocking mode.  Then the device reboots.  Upon
> detecting the reincarnated device, the driver sees that /dev/ttyACM0 is
> still open (because tail hasn't explicitly close()d it, despite the fact
> that there's no longer a device behind it and it keeps getting EOF on
> every read()).  So because /dev/ttyACM0 is still marked as "open", the
> driver winds up assigning the reincarnated device a new interface name based
> on the next available minor device number, e.g. /dev/ttyACM1.  As a result,
> the original fd is left pointing to /dev/ttyACM0, which no longer exists
> in the filesystem, and read() returns EOF forever. (Why doesn't it instead
> return an error upon disappearance of the original device node? I don't know.
> But "EOF forever" is what is observed with strace.)
> 
> In contrast, if tail -- or any other process for that matter -- did not have
> the original fd open when the reboot occurred, only then would the driver
> (probably) wind up reassigning /dev/ttyACM0 to the device.
> 
> So, perversely, it is the tail process itself, by holding the fd open,
> that prevents the behavior we'd like with --follow=descriptor.
> 
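The situation above suggests why following by name has to notice that the name and the open descriptor have parted ways, and then let go of the stale descriptor. A minimal sketch of that idea, assuming a device/inode comparison is an adequate test; `name_moved` and `refresh_fd` are hypothetical helpers, not functions from tail.c:

```python
import os


def name_moved(path: str, fd: int) -> bool:
    """True if `path` no longer resolves to the file that `fd` has open."""
    try:
        by_name = os.stat(path)  # follows symlinks, as opening the name would
    except FileNotFoundError:
        return True  # name is gone, or the symlink dangles
    by_fd = os.fstat(fd)
    return (by_name.st_dev, by_name.st_ino) != (by_fd.st_dev, by_fd.st_ino)


def refresh_fd(path: str, fd: int):
    """Return a descriptor for whatever `path` names now.

    Returns `fd` unchanged if the name still matches it. Otherwise the
    stale descriptor is closed first, and None is returned while the
    name does not resolve to anything.
    """
    if not name_moved(path, fd):
        return fd
    os.close(fd)  # release the old node before looking at the name again
    try:
        return os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    except FileNotFoundError:
        return None
```

Closing the stale descriptor promptly is the relevant design choice here. Whether that alone lets the driver hand the original name back to the rebooted device is, as noted above, only probable.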
> >
> > I wonder could we try the read() anyway when operating on a single
> > non-regular file that has successfully been set in non-blocking mode?
> > In that case we shouldn't block and read() would return 0 if no data.
> > I haven't thought about that much, but the diff below should do it.
> > At least it shows the part of the code involved,
> > which compares the various stat() members to determine whether to read.
> > Perhaps you could add debugging there
> > (or strace -v -e fstat,newfstatat ...).
> > For example to see if st_size changes for you
> > (I don't think there is any other stat member we could key on).
> > 
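The "just try the read()" idea can be sketched as below. This is an illustration only, not the patch under discussion, and `try_read` is a hypothetical name. One detail worth keeping in view: on a descriptor in non-blocking mode, having no data pending generally shows up as EAGAIN (in Python, a BlockingIOError) rather than as a zero-length read, while a zero-length read means EOF.

```python
import os


def try_read(fd: int, max_bytes: int = 4096):
    """Attempt a read on a non-blocking descriptor without checking stat() first.

    Returns the bytes read, b"" on EOF, or None if nothing is available yet.
    """
    try:
        return os.read(fd, max_bytes)
    except BlockingIOError:
        # EAGAIN/EWOULDBLOCK: the descriptor is open but has no data right now.
        return None


# Usage with a pipe standing in for the device (an assumption made only so
# the sketch is self-contained):
r, w = os.pipe()
os.set_blocking(r, False)
print(try_read(r))           # nothing written yet
os.write(w, b"log line\n")
print(try_read(r))           # the pending data
os.close(w)
print(try_read(r))           # writer closed, so EOF
os.close(r)
```

A caller can distinguish the three outcomes and only treat b"" as a reason to recheck the name.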
> 
> Ok, thanks. Let me fiddle around with your updated patch (the one named
> "tail-F-dev.patch") and I'll let you know what I find.
> 

Your two patches (the one mentioned above and the earlier one avoiding the
xlseek() on a non-regular file) almost fix the problem, but not quite.
I added one further modification, and with that the issue now seems to be
fully addressed. (It also doesn't seem to break any "make check" tests.)

See what you think.  Summary comments up top.

Thanks,

- Glenn

Attachment: tail_9.1.139-b5904_vs_tail_p2g1.patch
Description: Text document

