Re: [Discuss-gnuradio] comments on stream tags and metadata storage


From: Nowlan, Sean
Subject: Re: [Discuss-gnuradio] comments on stream tags and metadata storage
Date: Fri, 18 Jul 2014 03:04:28 +0000

> ________________________________________
> From: address@hidden <address@hidden> on behalf of Peter A. Bigot 
> <address@hidden>
> Sent: Thursday, July 17, 2014 6:11 PM
> To: address@hidden
> Subject: [Discuss-gnuradio] comments on stream tags and metadata storage
> 
> Some comments after playing with stream tags and metadata this
> afternoon.
>

I can't speak to all of these issues because I haven't worked much with the 
file_meta_sink and tagged_file_sink blocks, but I can respond to some of your 
comments and questions.
 
> (1) Although the discussion of stream tag insertion hints that this
> should be done within the scheduler's call to work() it could be more
> clear that doing it in any other context can result in race conditions.
> (I did think I saw it stated more clearly somewhere, but can't find
> that now, so maybe this point has been addressed.)
> 
> (2) In the current implementation it's further necessary that tags be
> added to an output in monotonic non-decreasing offset order.
> file_meta_sink does not sort the return value from get_tags_in_range(),
> and emits all data up to the timestamp of the next tag, so a subsequent
> tag with an earlier offset is dropped from the archive.
> 
> (I note that tagged_file_sink() does sort the tags it receives in one
> case, but not in others.)
> 
> I don't see this requirement on ordered generation documented.  In some
> cases, it may be inconvenient to do this, e.g. when a block's analysis
> discovers after-the-fact that something interesting can be associated
> with a past sample.  Similarly, a user might want a block to associate
> a tag with a sample that has not yet arrived, to notify a downstream
> block that will need to process the event.
>

I don't think ordered generation is required per se; rather, some blocks sort 
the tags they receive and others don't. For instance, the tag_work function in 
usrp_sink_impl.cc does sort, precisely because get_tags_in_range doesn't 
return them sorted.
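
As a rough sketch of what a consuming block can do for itself, the snippet 
below sorts a batch of tags by offset before acting on them. The Tag class 
and sort_tags_by_offset are hypothetical stand-ins I made up for illustration 
(this is not the real tag_t or any GNU Radio call), so it runs without GNU 
Radio installed.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Tag:
    """Hypothetical stand-in for a stream tag (NOT the real gr tag_t)."""
    offset: int
    key: str
    value: Any


def sort_tags_by_offset(tags: List[Tag]) -> List[Tag]:
    """Return the tags in non-decreasing offset order.

    sorted() is stable, so tags that share an offset keep the order in
    which they were received.
    """
    return sorted(tags, key=lambda t: t.offset)


# Tags as they might come back from a range query, out of offset order.
tags = [
    Tag(120, "rx_freq", 915e6),
    Tag(100, "rx_time", 0.5),
    Tag(120, "rx_rate", 1e6),
]
ordered = sort_tags_by_offset(tags)
```

Sorting on the consumer side keeps the block correct regardless of how the 
upstream block ordered its tags, at the cost of one sort per call to work.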
 
> A simple solution for the infrastructure is to require that tags only be
> generated from within work(), with offsets corresponding to samples
> generated in that call to work(), and in non-decreasing offset order
> (though this last requirement could be handled by add_item_tag()).  The
> developer must then handle the too-late/too-early tag associations
> through some other mechanism, such as carrying the effective offset as
> part of the tag value.
> 

As far as I'm aware, adding tags from within work is the only safe way to add 
tags to a stream. The offsets must also fall within the valid range spanning 
the buffer of input items passed to work; the scheduler prunes tags that fall 
outside this range. It's also worth noting that although the history mechanism 
allows viewing past samples (filters use this, for example), attempting to add 
tags to samples in history will not work; those tags will be pruned.

If tags need to be stored for future processing in subsequent calls to work, 
it's up to the programmer to push them onto a stack/queue/whatever inside the 
block. The scheduler won't handle this.
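
Here is a minimal sketch of that kind of bookkeeping, assuming the block 
knows the absolute offset of the first item and the item count for the 
current call to work. PendingTags, push and take_window are hypothetical 
names of my own, not GNU Radio API; a real block would call add_item_tag on 
each ready tag from inside work.

```python
import heapq
import itertools
from typing import Any, List, Tuple

TagTuple = Tuple[int, str, Any]  # (absolute offset, key, value)


class PendingTags:
    """Hypothetical per-block store for tags that cannot be emitted yet."""

    def __init__(self) -> None:
        self._heap: list = []
        # Tie-breaker so equal offsets pop in insertion order and the
        # heap never has to compare keys or values.
        self._seq = itertools.count()

    def push(self, offset: int, key: str, value: Any) -> None:
        heapq.heappush(self._heap, (offset, next(self._seq), key, value))

    def take_window(self, start: int, count: int) -> Tuple[List[TagTuple], List[TagTuple]]:
        """Pop every tag whose offset is below start + count.

        Returns (ready, stale): ready tags lie inside [start, start + count)
        and can be attached in this call; stale tags are older than start
        and need some other handling (for example, carrying the intended
        offset in the tag value). Later tags stay queued.
        """
        end = start + count
        ready: List[TagTuple] = []
        stale: List[TagTuple] = []
        while self._heap and self._heap[0][0] < end:
            offset, _, key, value = heapq.heappop(self._heap)
            (ready if offset >= start else stale).append((offset, key, value))
        return ready, stale

    def __len__(self) -> int:
        return len(self._heap)


pending = PendingTags()
pending.push(1500, "burst_start", True)   # for a sample not yet produced
pending.push(200, "late_event", 1)        # discovered after the fact
pending.push(1100, "freq", 2.4e9)

# First call to work covers items 1000..1255.
ready, stale = pending.take_window(start=1000, count=256)
# Next call covers items 1256..1511.
ready_next, stale_next = pending.take_window(start=1256, count=256)
```

Since the callback that queues a tag usually runs on a different thread than 
work, a real block would also need a lock around push and take_window; I left 
that out to keep the sketch short.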

> (3) Qt GUI Range with widget Counter + Slider invokes callbacks twice,
> even if the value itself was set exactly once through the counter text
> entry.  If the callback records the change by queuing a stream tag for
> addition to the output, multiple tags with the same offset/key/value
> will be generated.
> 
> There are ugly solutions to this but it's probably sufficient to note
> somewhere that it can happen.  It's really not specific to tags, but is
> clearly visible in that case.
> 
> (4) The in-memory stream of tags can produce multiple settings of the
> same key at the same offset.  However, when stored to a file only the
> last setting of the key is recorded.
> 
> I believe this last behavior is incorrect and that it's a mistake to use
> a map instead of a multimap or simple list for the metadata record of
> stream tags associated with a sample.
> 
> One argument is that it's critical that a stream archive of a processing
> session faithfully record the contents of the stream so that re-running
> the application using playback reproduces that stream and thus the
> original behavior (absent non-determinism due to asynchrony). This
> faithful reproduction is what would allow a maintainer to diagnose an
> operational failure caused by a block with a runtime failure when the
> same tag is processed twice at the same offset.  This is true even if
> the same key is set to the same value at the same sample offset multiple
> times, which some might otherwise want to argue is redundant.
> 
> A corollary argument is that the sample number at which an event like a
> tuner configuration change occurs usually can't be exactly associated
> with a sample; the best estimate is likely to be the index of the first
> sample generated by the next call to work.  But depending on processing
> speed an application might change an attribute of a data source multiple
> times before work was invoked.  The effect of those intermediate changes
> may be visible in the signal, and to lose the fact they occurred by
> discarding all but the last change affects both reproducibility and
> interpretation of the signal itself.
> 

I agree this is a problem, but I don't see a workaround as the data plane 
(work, streams, etc.) is asynchronous to the control logic. On the bright side, 
I believe the USRP source block does associate tuner, sample rate, etc. changes 
with an absolute sample in the stream, but this set of features doesn't 
necessarily extend to other hardware data sources. As for other asynchronous 
events generating stream tags, I think the user is stuck dealing with the 
inevitable latency unless the data source can produce metadata that is tightly 
coupled in time and pass that information along to GNU Radio.
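
To make the information loss in point (4) concrete, here is a small sketch 
comparing a map-style record with a plain list when the same key is set 
several times at one offset. The events and the container names are made up 
for illustration, and a Python dict only stands in for whatever map the 
metadata header actually uses.

```python
# Three retunes that all landed on the same sample offset because they
# happened between two calls to work (illustrative values only).
events = [
    (4096, "rx_freq", 915.0e6),
    (4096, "rx_freq", 920.0e6),
    (4096, "rx_freq", 925.0e6),
]

# Map-style record: one slot per (offset, key), so later writes overwrite
# earlier ones and only the last setting survives.
as_map = {}
for offset, key, value in events:
    as_map[(offset, key)] = value

# List-style record: every setting is kept, in the order it occurred.
as_list = list(events)
```

With the list, playback can reproduce all three tags; with the map, the two 
intermediate changes are gone.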

> (5) All stream tags are placed in the extras block, and when a segment
> is completed file_meta_sink will generate a new header.  The new header
> contains copies of the unique tags, but updates their offsets to be the
> start of the new segment.
> 
> This is incorrect as the original stream did not have those tags
> associated with those samples, so re-playing will introduce a behavioral
> difference.  For example, a tag that is meant to be associated with the
> start of a packet will be duplicated at an offset that is probably not
> the start of a packet.
> 
> Solutions include (a) leave the original offset setting for tags in the
> extras section when they're reproduced in a new segment, even though
> that offset is not present in the segment; (b) treat stream tags as
> ephemeral and do not persist them in the extras section when generating
> a new segment; (c) extend the add_item_tag API to record whether the
> tag is ephemeral or persistent.  Offhand I can see no argument
> supporting persisting a tag and updating its offset, and only rare cases
> where it's appropriate to replicate outdated information in a new
> segment, so (b) seems to be the right move.
> 
> All the above is based on my understanding and expectations of how
> stream tags are/should be used.  If my understanding is mistaken,
> please let me know.
> 
> Peter
> 
> 
> _______________________________________________
> Discuss-gnuradio mailing list
> address@hidden
> https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
> 

Sean

