coreutils

From: Lian, George (Nokia - CN/Hangzhou)
Subject: RE: some concern about the fix of " tail: consistently output all data for truncated files"
Date: Wed, 9 Nov 2016 08:08:20 +0000

Hi,

>yes I'm sure one can always find some artificial case, but can you think of
>any real usecase? Because I could not think for any kind of real use case.

>Moreover what may happen is that in case of file rotation with old design
>that part of the data will be missing in tail output. And that is real usecase.

Both the truncate command in Linux and the "ftruncate" system call take two
parameters: the first is the file (or fd), the second is the length.
I suppose the length is meaningful to truncate and ftruncate, so in real cases
we can't assume the length is always zero.

If the length is not 0, the new version certainly has issues there.
And in any case, deciding that a truncate operation happened just because the
size became smaller also needs more discussion.
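
To make the non-zero-length case concrete, here is a minimal Python sketch. It is only a simplified model of the two behaviours described in this thread, not the actual coreutils tail code; the names follow_step, demo and reread_on_shrink are made up for illustration. It writes a 10 KiB file with a 1 KiB header, truncates it back to 1 KiB, and shows what each design would print on the next poll.

```python
import os
import tempfile


def follow_step(path, last_size, reread_on_shrink):
    """One polling step of a simplified follower (hypothetical model).

    Returns (data_to_print, new_last_size).
    """
    size = os.stat(path).st_size
    if size < last_size:
        # A smaller size is the only signal that a truncate happened.
        if not reread_on_shrink:
            # Old design as described in the thread: stay at the new end
            # and wait for new content.
            return b"", size
        # New design as described in the thread: print from 0 to the
        # reduced size.
        start = 0
    else:
        start = last_size
    with open(path, "rb") as f:
        f.seek(start)
        data = f.read(size - start)
    return data, size


def demo(reread_on_shrink):
    """A log that keeps a 1 KiB header and is cut back to it at 10 KiB."""
    header = b"H" * 1024
    body = b"B" * (9 * 1024)
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(header + body)
        # Initial read consumes the whole 10 KiB file.
        _, last = follow_step(path, 0, reread_on_shrink)
        # Truncate to a non-zero length, as ftruncate(fd, 1024) would.
        os.truncate(path, 1024)
        out, _ = follow_step(path, last, reread_on_shrink)
        return out
    finally:
        os.remove(path)
```

With reread_on_shrink=True (modelling the new version) the 1 KiB header is printed again after the truncate; with reread_on_shrink=False (modelling the old version) nothing is printed until new data is appended.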

Best Regards,
George

-----Original Message-----
From: Zizka, Jan (Nokia - CZ/Prague) 
Sent: Wednesday, November 09, 2016 4:00 PM
To: Zhang, Bingxuan (Nokia - CN/Hangzhou) <address@hidden>;
Lian, George (Nokia - CN/Hangzhou) <address@hidden>;
Pádraig Brady <address@hidden>; address@hidden
Cc: Li, Deqian (Nokia - CN/Hangzhou) <address@hidden>;
Bao, Xiaohui (Nokia - CN/Hangzhou) <address@hidden>
Subject: RE: some concern about the fix of " tail: consistently output all
data for truncated files"

> -----Original Message-----
> From: Zhang, Bingxuan (Nokia - CN/Hangzhou)
> Sent: Wednesday, November 09, 2016 8:51 AM
> To: Zizka, Jan (Nokia - CZ/Prague) <address@hidden>; Lian, George
> (Nokia - CN/Hangzhou) <address@hidden>; Pádraig Brady
> <address@hidden>; address@hidden
> Cc: Li, Deqian (Nokia - CN/Hangzhou) <address@hidden>; Bao, Xiaohui
> (Nokia - CN/Hangzhou) <address@hidden>
> Subject: RE: some concern about the fix of " tail: consistently output all
> data for truncated files"
> 
> >Can you tell any real use case where the changed tail behaviour would fail
> >and print old content as you describe? I mean some real use case not the
> >behaviour caused by GlusterFS bug.
> 
> Not found in a real environment, but we can design a program to do this:
>       A program writes a log file, and it wants to always keep its first
> 1K bytes.
>       When the file reaches its limit (e.g. 10K bytes), it truncates its
> content to 1KB, then starts to write content again.
> 
> In this case, with the new version, the beginning 1KB of data will always be
> printed by tail when the truncate happens.

yes I'm sure one can always find some artificial case, but can you think of any
real usecase? Because I could not think for any kind of real use case.

Moreover what may happen is that in case of file rotation with old design that
part of the data will be missing in tail output. And that is real usecase.

Jan

> 
> 
> Br, Jimmy
> 
> -----Original Message-----
> From: Zizka, Jan (Nokia - CZ/Prague)
> Sent: Wednesday, November 09, 2016 3:41 PM
> To: Zhang, Bingxuan (Nokia - CN/Hangzhou) <address@hidden>;
> Lian, George (Nokia - CN/Hangzhou) <address@hidden>; Pádraig
> Brady <address@hidden>; address@hidden
> Cc: Li, Deqian (Nokia - CN/Hangzhou) <address@hidden>; Bao, Xiaohui
> (Nokia - CN/Hangzhou) <address@hidden>
> Subject: RE: some concern about the fix of " tail: consistently output all
> data for truncated files"
> 
> > -----Original Message-----
> > From: Zhang, Bingxuan (Nokia - CN/Hangzhou)
> > Sent: Wednesday, November 09, 2016 8:19 AM
> > To: Zizka, Jan (Nokia - CZ/Prague) <address@hidden>; Lian, George
> > (Nokia - CN/Hangzhou) <address@hidden>; Pádraig Brady
> > <address@hidden>; address@hidden
> > Cc: Li, Deqian (Nokia - CN/Hangzhou) <address@hidden>; Bao, Xiaohui
> > (Nokia - CN/Hangzhou) <address@hidden>
> > Subject: RE: some concern about the fix of " tail: consistently output all
> > data for truncated files"
> >
> > Hi,
> >
> > Let's not mix 2 problems here.
> 
> yes and I was not mixing the two :)
> 
> >
> > 1. glusterfs problem  => We'll continue the investigation.
> >
> > 2. tail problem: let's discuss it separately from the glusterfs bug, just
> > from its own design.
> >     New version: when it finds the file size reduced, print content from 0
> > to the reduced_size.
> >     Old version: when it finds the file size reduced, stay at the end of the
> > reduced size and wait for new content.
> > Both ways have their limitations; neither of them is perfect or precise.
> > Here I just want to say the older version is better than the new version in
> > my understanding.
> > According to the man page, the '-f' option is designed to print a file that
> > is being appended to, not a file that might have a truncate happen on it.
> > "tail" should focus on what is added, not on the data from the originally
> > printed part of the file.
> 
> yes exactly. And in case the file is truncated or replaced, tail has to assume
> it is with new content which was added.
> 
> Can you tell any real use case where the changed tail behaviour would fail
> and print old content as you describe? I mean some real use case, not the
> behaviour caused by GlusterFS bug.
> 
> Jan
> 
> > =============================
> > # man tail
> > TAIL(1)                          User Commands                         TAIL(1)
> >
> >
> > NAME
> >        tail - output the last part of files
> > ...
> >        -f, --follow[={name|descriptor}]
> >               output appended data as the file grows;
> > ...
> > =============================
> >
> > Br, Jimmy
> >
> > -----Original Message-----
> > From: Zizka, Jan (Nokia - CZ/Prague)
> > Sent: Wednesday, November 09, 2016 3:08 PM
> > To: Zhang, Bingxuan (Nokia - CN/Hangzhou) <address@hidden>;
> > Lian, George (Nokia - CN/Hangzhou) <address@hidden>; Pádraig
> > Brady <address@hidden>; address@hidden
> > Cc: Li, Deqian (Nokia - CN/Hangzhou) <address@hidden>; Bao, Xiaohui
> > (Nokia - CN/Hangzhou) <address@hidden>
> > Subject: RE: some concern about the fix of " tail: consistently output all
> > data for truncated files"
> >
> > > -----Original Message-----
> > > From: Zhang, Bingxuan (Nokia - CN/Hangzhou)
> > > Sent: Wednesday, November 09, 2016 6:36 AM
> > > To: Lian, George (Nokia - CN/Hangzhou) <address@hidden>; Pádraig
> > > Brady <address@hidden>; address@hidden
> > > Cc: Li, Deqian (Nokia - CN/Hangzhou) <address@hidden>; Zizka, Jan
> > > (Nokia - CZ/Prague) <address@hidden>; Bao, Xiaohui (Nokia -
> > > CN/Hangzhou) <address@hidden>
> > > Subject: RE: some concern about the fix of " tail: consistently output all
> > > data for truncated files"
> > >
> > > Hi,
> > >
> > > I wonder the original requirement of "tail", what is the purpose of this
> > > tool?
> > > Referred to:
> > >   tail - output the last part of files
> > >
> > > Here when "tail" found the some file length become small, is it really
> > > need to print old content?
> >
> > but tail cannot know if that is old content. The truncate detection was
> > added there to overcome the problem when someone overwrites the file being
> > tailed, in which case it should indeed start dumping the file from the
> > beginning.
> >
> > > My opinion is that ignore those old content is better alternative.
> >
> > OK but how would you do that as tail doesn't know that it is old content ...
> >
> > >
> > > It is possible those "old content" is written newly (e.g. truncate to 0,
> > > then write small content).
> > > It is also possible those "old content" is really old (e.g. truncate to
> > > small size).
> > >
> > > So "tail" can do perfect design here to trace every piece of data write to
> > > the file.
> > > But it should focus on only the data to the last with current reality.
> > >
> > > So my opinion is "revert to previous design" is better choice then
> > > currently.
> > > What you think?
> > > What you think?
> >
> > If the change is reverted then you will get regressions on the cases for
> > which this was added so that is definitely not an option.
> >
> > What should be fixed is GlusterFS instead of trying to make workarounds
> > for its misbehaviour. As Pádraig also noted:
> >
> > > This stale st_size behavior, giving a smaller value _after_ a read,
> > > seems quite problematic to lots of apps though, not just tail(1).
> >
> > this will affect other applications and tools not only tail. If you make
> > some kind of workaround in tail for this and GlusterFS is not fixed then
> > this problem will stay hidden and will hit some other application sooner
> > or later.
> >
> > Jan
> >
> >
> > >
> > >
> > > Br, Jimmy
> > >
> > > -----Original Message-----
> > > From: Lian, George (Nokia - CN/Hangzhou)
> > > Sent: Wednesday, November 09, 2016 9:36 AM
> > > To: Pádraig Brady <address@hidden>; address@hidden
> > > Cc: Zhang, Bingxuan (Nokia - CN/Hangzhou) <address@hidden>;
> > > Li, Deqian (Nokia - CN/Hangzhou) <address@hidden>; Zizka, Jan
> > > (Nokia - CZ/Prague) <address@hidden>; Bao, Xiaohui (Nokia -
> > > CN/Hangzhou) <address@hidden>
> > > Subject: RE: some concern about the fix of " tail: consistently output all
> > > data for truncated files"
> > >
> > > Hi,
> > > >What network file system type is this?
> > >
> > > The file system is GlusterFS from Red Hat.
> > >
> > > >This stale st_size behavior, giving a smaller value _after_ a read, seems
> > > >quite problematic to lots of apps though, not just tail(1).
> > > I agree, but I still suppose more applications will get st_size first, then
> > > seek and read, which will not go beyond the size of the file.
> > >
> > > We have also submitted the issue to the GlusterFS community, but so far
> > > they can't find the root cause in glusterfs.
> > >
> > > I still complain about the "tail" application: even if there is some issue
> > > in glusterfs, "tail" eats all the space of the disk (through continuous
> > > pseudo-truncates of a large syslog file), so I suggest "tail" could make
> > > some change to prevent it.
> > >
> > > Thanks & Best Regards,
> > > George
> > >
> > > -----Original Message-----
> > > From: Pádraig Brady [mailto:address@hidden]
> > > Sent: Tuesday, November 08, 2016 7:29 PM
> > > To: Lian, George (Nokia - CN/Hangzhou) <address@hidden>;
> > > address@hidden
> > > Cc: Zhang, Bingxuan (Nokia - CN/Hangzhou) <address@hidden>;
> > > Li, Deqian (Nokia - CN/Hangzhou) <address@hidden>; Zizka, Jan
> > > (Nokia - CZ/Prague) <address@hidden>; Bao, Xiaohui (Nokia -
> > > CN/Hangzhou) <address@hidden>
> > > Subject: Re: some concern about the fix of " tail: consistently output all
> > > data for truncated files"
> > >
> > > On 08/11/16 02:50, Lian, George (Nokia - CN/Hangzhou) wrote:
> > > > Hi,
> > > >>> Add one more suggestion, if we have not a perfect solution to consider
> > > >>> all the case of truncate, could we add an option to tail, such like
> > > >>> tail -no-truncate
> > > >>> If tail run with this option, than application not consider any
> > > >>> truncate case.
> > > >>>
> > > >>> For example, I suppose syslog output file will not have any truncate
> > > >>> case in our environment, then the tail could use the option to avoid
> > > >>> the mis-truncated case?
> > > >
> > > >> Note for case 2) above, we only update fspec->size _after_ the read,
> > > >> so I'm not sure how practical the race with reading a _smaller_ st_size
> > > >> after that is?
> > > >> I.E. the heuristic is fairly good I think,
> > > >> so an option may be overkill.
> > > >> We'd have to see a demonstratable issue to consider such an option.
> > > >
> > > > We have an issue now for tail a syslog file which stored in a
> > > > network-based file system. A automated cased need tail the syslog about
> > > > one hour to get the syslog of that period,
> > > > in that period of one hour , happen 6 times of  un-expected file
> > > > truncated issue, so the output of tail has 6 times full syslog file, so
> > > > the output file is so huge and eat all of the disks.
> > > > The network-based file system maybe not so easy to change to meet the
> > > > current implement of "tail" application.
> > > > So I need helps from yours :)
> > > >
> > > > And which your mean for demonstratable?  The issue we encounter could
> > > > be easy to reproduce, maybe the file-system is not so strict like ext4
> > > > file system,
> > > > but I still suggest "tail" application could do some change to adapt
> > > > this kinds network-based file system?
> > >
> > > It's important info that you have seen the issue.
> > > What network file system type is this?
> > > We might just revert this change if the issue is widespread enough.
> > >
> > > This stale st_size behavior, giving a smaller value _after_ a read,
> > > seems quite problematic to lots of apps though, not just tail(1).
> > >
> > > thanks,
> > > Pádraig.


