From: Pádraig Brady
Subject: Re: Unexpected behavior of 'tail --follow=name' on special file via symlink
Date: Sun, 5 Feb 2023 19:13:15 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Thunderbird/109.0

On 05/02/2023 16:56, Glenn Golden wrote:
Glenn Golden <gdg@zplane.com> [2023-02-02 15:55:51 -0700]:
Pádraig Brady <P@draigBrady.com> [2023-02-01 20:37:47 +0000]:

That was informative thanks.

What I think is happening is that, to support --follow=name, tail(1) operates
in non-blocking mode so that it doesn't block when reading a file and has
the opportunity to recheck.

It decides whether a read() is worth attempting by doing a stat(): if neither
size nor mtime has changed, it doesn't perform the read. In fact, for
non-regular files it only goes on the mtime, which I guess is not changing for
your device.
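
The check described above can be sketched roughly as follows. This is a simplified illustration, not tail's actual code: the function name worth_reading() is hypothetical, and it assumes a platform that exposes the POSIX.1-2008 st_mtim member.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical helper (not taken from tail.c): given the stat data from the
   previous poll and from the current one, decide whether a read() is worth
   attempting.  */
static bool
worth_reading (const struct stat *prev, const struct stat *cur)
{
  assert (prev && cur);

  bool mtime_changed = (prev->st_mtim.tv_sec != cur->st_mtim.tv_sec
                        || prev->st_mtim.tv_nsec != cur->st_mtim.tv_nsec);

  /* For regular files a change in either size or mtime counts.  */
  if (S_ISREG (cur->st_mode))
    return mtime_changed || prev->st_size != cur->st_size;

  /* For devices, FIFOs and the like, mtime is the only signal.  A device
     whose mtime does not advance when data arrives is never read.  */
  return mtime_changed;
}
```

With a device whose mtime only moves on host writes, worth_reading() keeps returning false even though data is pending, which matches the symptom reported in this thread.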


Yes, that appears to be the case: The mtime is evidently updated only when
the USB host writes to the device; data arriving from the device to the host
does not update mtime.


Note that if you were just doing --follow=descriptor on a single file, tail(1)
would operate in a simpler blocking manner and would, I expect, work fine with
your device (though not through the changing symlink, of course).


I think following by descriptor can't really work reliably for this case,
because the tail process, even though it operates in blocking mode as you say,
nevertheless kind of heisenbergs the situation because it holds the original
fd open, which in turn causes the USB driver to assign a new minor device
number to the device upon its reboot. And that effectively dissociates the
original fd from the device. So tail thereafter just sees EOF forever.

Example: Suppose the device node that /dev/ttyPSLOG points to is
/dev/ttyACM0, and tail has been successfully following that via
--follow=descriptor in blocking mode.  Then the device reboots.  Upon
detecting the reincarnated device, the driver sees that /dev/ttyACM0 is
still open (because tail hasn't explicitly close()d it, despite the fact
that there's no longer a device behind it and he keeps getting EOF on
every read()).  So because /dev/ttyACM0 is still marked as "open", the
driver winds up assigning the reincarnated device a new interface name based
on the next available minor device number, e.g. /dev/ttyACM1.  As a result,
the original fd is left pointing to /dev/ttyACM0, which no longer exists
in the filesystem, and read() returns EOF forever. (Why doesn't it instead
return an error upon disappearance of the original device node? I don't know.
But "EOF forever" is what is observed with strace.)

In contrast, if tail -- or any other process for that matter -- did not have
the original fd open when the reboot occurred, only then would the driver
(probably) wind up reassigning /dev/ttyACM0 to the device.

So, perversely, it is the tail process itself, by holding the fd open,
that prevents the behavior we'd like with --follow=descriptor.
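
The dissociation described above can be detected by comparing what the name currently resolves to with what the open descriptor still refers to. The following is a minimal sketch under stated assumptions: name_diverged() and reopen_if_diverged() are hypothetical names, not functions from tail, and the idea that closing the stale descriptor lets the driver reuse the old minor number is the behavior conjectured in this thread, not something the code can guarantee.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: true if NAME no longer resolves to the file that the
   open descriptor FD refers to, or no longer resolves at all.  */
static bool
name_diverged (const char *name, int fd)
{
  assert (name && fd >= 0);
  struct stat by_name, by_fd;

  if (fstat (fd, &by_fd) != 0)
    return true;
  if (stat (name, &by_name) != 0)       /* stat() follows symlinks */
    return true;

  if (by_name.st_dev != by_fd.st_dev || by_name.st_ino != by_fd.st_ino)
    return true;

  /* For device nodes, also compare the major/minor pair.  */
  if ((S_ISCHR (by_fd.st_mode) || S_ISBLK (by_fd.st_mode))
      && by_name.st_rdev != by_fd.st_rdev)
    return true;

  return false;
}

/* Hypothetical helper: if NAME has diverged from *FD, close the stale
   descriptor and reopen NAME.  Returns true if *FD now holds a freshly
   opened descriptor; *FD is -1 if NAME could not be opened.  */
static bool
reopen_if_diverged (const char *name, int *fd)
{
  assert (name && fd);
  if (*fd >= 0)
    {
      if (!name_diverged (name, *fd))
        return false;
      close (*fd);                      /* release the old device node */
    }
  *fd = open (name, O_RDONLY | O_NONBLOCK | O_NOCTTY);
  return *fd >= 0;
}
```

Following by name amounts to calling something like reopen_if_diverged() on each polling pass, whereas following by descriptor never lets go of the original fd.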


I wonder whether we could try the read() anyway when operating on a single
non-regular file that has successfully been set in non-blocking mode.
In that case we shouldn't block, and read() would return 0 if no data.
I haven't thought about that much, but the diff below should do it.
At least it shows the part of the code involved,
which compares the various stat() members to determine to read or not.
Perhaps you could add debugging there (or strace -v -e fstat,newfstatat ...).
For example to see if st_size changes for you
(I don't think there is any other stat member we could key on).
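
The "try the read() anyway" idea can be illustrated with a small sketch. It is not the attached patch: set_nonblocking() and try_read() are hypothetical helpers. One assumption worth stating: on pipes and ttys a non-blocking read() with no data pending typically fails with EAGAIN rather than returning 0, so the sketch treats both outcomes as "nothing to read right now".

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Put FD into non-blocking mode.  Returns 0 on success, -1 on failure.  */
static int
set_nonblocking (int fd)
{
  int flags = fcntl (fd, F_GETFL);
  if (flags < 0)
    return -1;
  return fcntl (fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: attempt a read regardless of what stat() reports.
   Returns the number of bytes read (> 0), 0 if nothing is available right
   now (EOF or EAGAIN), or -1 on any other error.  */
static ssize_t
try_read (int fd, char *buf, size_t len)
{
  assert (fd >= 0 && buf);
  ssize_t n;

  do
    n = read (fd, buf, len);
  while (n < 0 && errno == EINTR);      /* retry if interrupted by a signal */

  if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
    return 0;
  return n;
}
```

Because the descriptor is non-blocking, calling try_read() on every polling pass costs one system call and cannot stall the loop, so the mtime comparison is no longer needed to guard the read for such files.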


Ok, thanks. Let me fiddle around with your updated patch (the one named
"tail-F-dev.patch") and I'll let you know what I find.


Your two patches (the one mentioned above and the earlier one avoiding the
xlseek() on a non-regular file) do almost fix the problem, but not quite.
I added one additional mod, and with that, it now seems to address the
issue fully. (And doesn't seem to break any "make check" tests.)

Oh right I see. Sorry for missing that.
After an fd change we need to go around the loop again
to reinit the local fd and also guard against fd==-1 etc.
Previously we always went around the loop again after a recheck()
so any previous use of fd was fine.

Attaching the final two patches I hope to apply for this.
They should apply to coreutils-9.1.

cheers,
Pádraig

Attachment: 0001-tail-fix-support-for-F-with-non-seekable-files.patch
Description: Text Data

Attachment: 0002-tail-improve-follow-name-with-single-non-regular-fil.patch
Description: Text Data

