Re: [Discuss-gnuradio] Heterogeneous Computing Workgroup at GRCon

From:	CEL
Subject:	Re: [Discuss-gnuradio] Heterogeneous Computing Workgroup at GRCon
Date:	Sat, 15 Sep 2018 14:45:25 +0000

Hi heterogeneous computation crowd,

regarding familiarizing yourselves with GNU Radio's buffer architecture,
here's the rundown from a relatively technical point of view. I've also
written a blog post with a higher-level overview[1].

TL;DR: GNU Radio emulates ring buffers using the MMU, via mmap. The
block API is built on this: each call to a block sees one contiguous
stretch of memory holding the previously unconsumed items followed by
the new ones. Each GNU Radio block runs in its own thread, but the
whole flow graph is one process.

GNU Radio exchanges samples between blocks in buffers; a connection
between two blocks actually consists of the output buffer (writer) of
the upstream block and one or more buffer readers on the downstream
block(s).

GNU Radio doesn't do the usual fixed-chunk-size DSP that e.g. audio
systems tend to do. Instead, it lets a block process as much input as
there is, or as much as it is able to process at that point. The
remainder of the input isn't "consumed" and appears at the beginning of
"input_items[i]" in the next call to work. This is done in a zero-copy
manner by actually having memory that looks like a proper ring buffer.
Since GNU Radio typically runs on architectures that don't have an AGU
capable of simple modulus address generation (as e.g. the infamous
DSP56k does), one needs to emulate a ring buffer using the MMU of the
CPU.
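
To make the problem concrete, here is a small Python sketch of what a
plain ring buffer with software modulo indexing forces on you. It is
not GNU Radio code; the function name `contiguous_span` is made up for
illustration. A span that crosses the end of the ring has to be
stitched together from two pieces, which is exactly the copy the
MMU-based approach avoids.

```python
def contiguous_span(ring: bytearray, start: int, n: int) -> bytes:
    """Return n items starting at ring position `start` as one contiguous object.

    Hypothetical helper: without modulo address generation in hardware or
    an MMU trick, a span that wraps around the end of the ring must be
    assembled from two slices, i.e. copied.
    """
    size = len(ring)
    if n > size:
        raise ValueError("cannot read more items than the ring holds")
    start %= size
    first = min(n, size - start)          # items available before the wrap
    head = bytes(ring[start:start + first])
    tail = bytes(ring[:n - first])        # items after wrapping to index 0
    return head + tail                    # concatenation = a copy
```

For an 8-byte ring holding `b"ABCDEFGH"`, a 4-item read starting at
position 6 has to glue `b"GH"` and `b"AB"` together.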

GNU Radio uses mmap (or the Windows equivalent) on shared memory or
file-backed memory to map the same pages twice back-to-back; that way,
a block can always be presented a full memory buffer that looks linear,
no matter where in the emulated ring the view actually starts.
As a corollary of using the MMU, buffer sizes need to be multiples of
the page size. (This works without elevated privileges or kernel
drivers.)
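
The double mapping can be sketched in a few lines of Python via ctypes.
This is an illustration of the technique, not GNU Radio's actual
implementation, and it rests on several assumptions: a 64-bit
Linux-like system, a 64-bit `off_t`, and `MAP_FIXED == 0x10` (Python's
mmap module does not export that constant). The names
`double_mapped_buffer` and `_mmap` are made up, and the mapping is
never unmapped.

```python
import ctypes
import mmap
import os
import tempfile

MAP_FIXED = 0x10  # assumption: value on Linux x86-64/AArch64
PROT_NONE = 0

_libc = ctypes.CDLL(None, use_errno=True)
_libc.mmap.restype = ctypes.c_void_p
_libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                       ctypes.c_int, ctypes.c_int, ctypes.c_int64]
_MAP_FAILED = ctypes.c_void_p(-1).value


def _mmap(addr, length, prot, flags, fd, offset=0):
    """Thin wrapper around libc's mmap that raises OSError on failure."""
    ptr = _libc.mmap(addr, length, prot, flags, fd, offset)
    if ptr is None or ptr == _MAP_FAILED:
        raise OSError(ctypes.get_errno(), "mmap failed")
    return ptr


def double_mapped_buffer(size):
    """Return a 2*size byte array whose two halves alias the same pages."""
    if size <= 0 or size % mmap.PAGESIZE:
        raise ValueError("size must be a positive multiple of the page size")
    with tempfile.TemporaryFile() as backing:
        fd = backing.fileno()
        os.ftruncate(fd, size)
        # Reserve 2*size of contiguous address space ...
        base = _mmap(None, 2 * size, PROT_NONE,
                     mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS, -1)
        # ... then map the same file pages over both halves.
        rw = mmap.PROT_READ | mmap.PROT_WRITE
        _mmap(base, size, rw, mmap.MAP_SHARED | MAP_FIXED, fd)
        _mmap(base + size, size, rw, mmap.MAP_SHARED | MAP_FIXED, fd)
    # The mappings stay valid after the file descriptor is closed.
    return (ctypes.c_ubyte * (2 * size)).from_address(base)
```

With `buf = double_mapped_buffer(mmap.PAGESIZE)`, a write to `buf[0]`
is visible at `buf[mmap.PAGESIZE]`, so a view of up to one ring length
starting anywhere in the first half is linear in memory.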

This is kind of the central point of the GNU Radio block API: you write
a (general_)work function, and it gets called by the scheduler with a
pointer to the beginning of your input items, their number, a pointer
to the beginning of the unused space in your output ring buffer, and
the amount of space there.
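
As a toy model of that contract (deliberately not the real GNU Radio
signature), the Python sketch below has a hypothetical block `sum4_work`
that only consumes whole groups of four items, and a stand-in
"scheduler" `run` that offers it input and output space and carries the
unconsumed remainder over to the next call. The real runtime advances a
read pointer instead of copying the remainder as this list-based model
does.

```python
def sum4_work(input_items, output_items):
    """Hypothetical block: each output item is the sum of 4 consecutive inputs.

    Processes only as much as both the available input and the free
    output space allow, and reports what it used.
    """
    n_out = min(len(input_items) // 4, len(output_items))
    for i in range(n_out):
        output_items[i] = sum(input_items[4 * i:4 * i + 4])
    return 4 * n_out, n_out  # (items consumed, items produced)


def run(work, chunks, out_space):
    """Toy stand-in for the scheduler: feed chunks, carry over leftovers."""
    pending, results = [], []
    for chunk in chunks:
        pending = pending + list(chunk)  # new data follows the unconsumed items
        out = [0] * out_space            # free output space offered to the block
        consumed, produced = work(pending, out)
        results += out[:produced]
        pending = pending[consumed:]     # remainder heads the next call's input
    return results, pending
```

Feeding `[1, 2, 3]` and then `[4, 5, 6, 7, 8, 9]` with plenty of output
space produces nothing on the first call, two sums on the second, and
leaves one item pending.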

Now, GNU Radio's current scheduler, the thread-per-block (TPB)
scheduler, uses simple condition variables, and whatever
hardware/kernel sync mechanism these are implemented with on the
individual platform, to inform the block executors of adjacent blocks
of changes to available input items and output space.
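
The notification pattern looks roughly like the following Python
sketch. It is a simplified model, not the TPB scheduler's code: the
`Edge` class and `run_two_blocks` are made-up names, and a single
condition variable per connection is an assumption made for brevity.
One thread per "block" waits on the condition until there is output
space or input available, and notifies its neighbour after every
change.

```python
import threading
from collections import deque


class Edge:
    """Toy connection between two blocks: bounded buffer plus a condition variable."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.done = False
        self.cond = threading.Condition()

    def write(self, item):
        with self.cond:
            # Block until the downstream side has freed output space.
            self.cond.wait_for(lambda: len(self.items) < self.capacity)
            self.items.append(item)
            self.cond.notify_all()  # input became available

    def close(self):
        with self.cond:
            self.done = True
            self.cond.notify_all()

    def read_all(self):
        """Take everything available; returns [] once closed and drained."""
        with self.cond:
            self.cond.wait_for(lambda: self.items or self.done)
            taken = list(self.items)
            self.items.clear()
            self.cond.notify_all()  # output space became available
            return taken


def run_two_blocks(n_items, capacity=4):
    """Run a source and a sink, each in its own thread, over one Edge."""
    edge, received = Edge(capacity), []

    def source():
        for i in range(n_items):
            edge.write(i)
        edge.close()

    def sink():
        while True:
            got = edge.read_all()
            if not got:
                return
            received.extend(got)

    threads = [threading.Thread(target=source), threading.Thread(target=sink)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Every `wait_for`/`notify_all` pair here is a potential trip through the
kernel, which is the per-notification cost raised in the list of
challenges.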

This yields, from my perspective, a few challenges we could be
discussing:

· how to implement buffers that exchange data between host CPU and e.g.
DMA PCIe devices (GPUs?), or more specifically buffer ring devices
(mainly, network cards) 
· how to increase cache locality/avoid memory congestion without
abandoning the multithreaded TPB approach in general
· how to implement the above notifications in high-rate scenarios, as at
least in classical OS theory, every block-on-condition-variable/notify
incurs the cost of at least two syscalls (this is a lower-priority
topic, as most OSes, as far as I can tell, have gotten really good at
reducing the cost of that; Linux' `futex` implementation could be worse)
· how to avoid shared state between blocks so that this can work out
without major pitfalls.

I know there are multiple people at the conference who have actually
done very solid work on heterogeneous compute architectures, and I'd
love to see
their knowledge come together to identify the architecture we'll be
taking. Let's make this an exciting discussion – I'd rather have
experts fight (friendly) about the advantages of their individual
solutions than have hours of general agreement that we need to do
something vague.

So, see you at the conference,
Marcus

[1] It seems I've messed up the images when we overhauled our website,
so here's a working version from the internet archive:

https://web.archive.org/web/20170804013547/https://www.gnuradio.org/blog/buffers/

On Fri, 2018-09-14 at 16:54 -0700, Martin Braun wrote:
> Hi all,
> 
> here's a quick reminder that we'll have a heterogeneous computing
> workgroup at GRCon this year. The workgroup will be active on Friday,
> 8:45-12:15, in Sierra A (there'll be maps of the venue when you get
> there). Here's the original abstract:
> 
> ```
>    In this breakout session, we intend to revisit the problem of
> implementing heterogeneous computing as a first-class citizen of GNU
> Radio, a topic that has kept boiling up throughout the years. Since
> the inception of GNU Radio, it has been ported to various platforms,
> and RFNoC support was added via gr-ettus using the new GRC
> block-domain concept.
> 
>     However, all those efforts are very platform-dependent, and it is
> still not clear how one would write blocks that straddle multiple
> different processing domains, such as GPUs, FPGAs, dedicated DSPs, as
> well as GPPs. We attempt to revive this discussion at GRCon and come
> up with a plan on how to proceed to make heterogeneous computing
> something that works generically for various underlying hardware
> platforms.
> 
>     This breakout session will be very technical. We will be
> analysing sections of the GNU Radio runtime to identify where we need
> to add hooks or provide additional APIs and/or functionalities.
> 
>     We invite all experts on processing and computer architectures,
> as well as anyone who's interested in improving the GNU Radio runtime
> in this domain to participate in this breakout session.
> ```
> 
> I would like to add a couple of comments to this for people who are
> planning to attend:
> 
> - As mentioned above, this will be a very technical workgroup. A
> strong familiarity with GNU Radio will be useful to follow the
> conversations, as will prior experience with heterogeneous computing
> or comparable technical experience.
> 
> - If you want to attend, please try and familiarize yourself with the
> buffer architecture in GNU Radio, which is the foundation of how we
> pass data from one block to another.
> 
> - One action item for this group will be creating a list of
> architectures that we want to target (FPGAs? GPUs? DSPs? etc.)
> 
> - The intention of this working group is to identify people who are
> willing to commit time and energy to improving the state of
> heterogeneous computing in GNU Radio. Consider this a first meeting,
> not something that in itself is complete.
> 
> I'm looking forward to seeing you in this workgroup!
> 
> Cheers,
> Martin
> 
> 
> _______________________________________________
> Discuss-gnuradio mailing list
> address@hidden
> https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
