the implementation provides the PIM coroutine procedures TRANSFER, IOTRANSFER, NEWPROCESS and LISTEN, they are all non-preemptive.
If I understand you correctly, you are saying that your implementation of these primitives is non-preemptive, but that it does call some primitives in the Pthreads library, yes?
In fact I thought this was a light weakness of the implementation in that IOTRANSFER will only return to the original callee when another process
calls LISTEN or calls a procedure which is in a module with a lower interrupt
priority ...
It depends on what you want to do. There are scenarios where preemption may be desirable and the associated management overhead is not a serious concern. Workstation operating systems and GUIs are generally considered to fall into this category. However, there are also scenarios where preemption is entirely undesirable because it has a significant management overhead and it introduces non-determinism.
Your implementation would not be the first one to implement coroutine semantics on top of a library meant for preemptive threads. This approach will (or at least should) avoid the accidental introduction of non-deterministic data models into one's code, but it does not usually avoid the impact of the management overhead. If coroutines are chosen specifically for their efficiency, an implementation on top of the Pthreads library usually does not pay the desired dividends.
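To illustrate what "coroutine semantics on top of Pthreads" can look like, here is a minimal sketch in C. It is not GM2's code, and every name in it (`baton_t`, `transfer`, `hand_over`, `wait_for_turn`, `run_baton_demo`) is hypothetical. Each coroutine is a full thread, but a shared "baton" lets only one of them run at a time, so the behaviour is deterministic. Note that every transfer still goes through a mutex and a condition variable, which is where the management overhead comes from.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical sketch: coroutine-style TRANSFER emulated with Pthreads. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    int             current;   /* id of the one coroutine allowed to run */
} baton_t;

static baton_t baton = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 };
static int trace[16];          /* order in which the coroutines ran */
static int trace_len;

static void record(int id)
{
    assert(trace_len < 16);
    trace[trace_len++] = id;
}

/* Block the calling thread until it holds the baton. */
static void wait_for_turn(int self)
{
    pthread_mutex_lock(&baton.lock);
    while (baton.current != self)
        pthread_cond_wait(&baton.changed, &baton.lock);
    pthread_mutex_unlock(&baton.lock);
}

/* Give the baton away without waiting for it to come back. */
static void hand_over(int to)
{
    pthread_mutex_lock(&baton.lock);
    baton.current = to;
    pthread_cond_broadcast(&baton.changed);
    pthread_mutex_unlock(&baton.lock);
}

/* Rough analogue of TRANSFER(from, to): suspend `from`, resume `to`. */
static void transfer(int from, int to)
{
    hand_over(to);
    wait_for_turn(from);
}

static void *worker(void *arg)
{
    (void)arg;
    wait_for_turn(1);                 /* do nothing until first resumed */
    for (int i = 0; i < 3; i++) {
        record(1);
        if (i < 2)
            transfer(1, 0);
    }
    hand_over(0);                     /* finished: resume main for good */
    return NULL;
}

int run_baton_demo(void)
{
    pthread_t t;
    trace_len = 0;
    baton.current = 0;                /* main is coroutine 0 */
    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;
    for (int i = 0; i < 3; i++) {
        record(0);
        transfer(0, 1);
    }
    pthread_join(t, NULL);
    return trace_len;
}
```

Calling `run_baton_demo()` makes the two coroutines strictly alternate, which is the deterministic part. The price is that each hand-over is a lock, a broadcast and a blocking wait rather than a plain jump.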
The reason for this appears to be that non-preemptive task switching incurs about the same overhead as a function call, while any call to a preemptive scheduling system, task switching or otherwise, incurs the cost of a kernel context switch, which is far more expensive.
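For comparison, a user-space transfer can be sketched with the `ucontext` routines found in glibc and some other C libraries. This is only an illustration under assumptions: the names `transfer`, `co_body` and `run_transfer_demo` are made up, and `ucontext` is used here because it is widely available, not because GM2 or any library mentioned in this thread uses it. As far as I know, glibc's `swapcontext` also saves and restores the signal mask, so it is not quite as cheap as a hand-written switch routine, but the switch itself happens without involving the thread scheduler.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

static ucontext_t main_ctx, co_ctx;
static int transfers;     /* total number of transfers performed */
static int rounds_left;   /* how often the coroutine still hands back */

/* Rough analogue of TRANSFER(from, to): save `from`, continue in `to`. */
static void transfer(ucontext_t *from, ucontext_t *to)
{
    transfers++;
    swapcontext(from, to);
}

/* Body of the coroutine: hand control straight back each time it runs. */
static void co_body(void)
{
    while (rounds_left > 0) {
        rounds_left--;
        transfer(&co_ctx, &main_ctx);
    }
    /* Falling off the end continues in uc_link (main_ctx). */
}

int run_transfer_demo(int rounds)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    transfers = 0;
    rounds_left = rounds;

    /* Rough analogue of NEWPROCESS: give co_body its own stack. */
    getcontext(&co_ctx);
    co_ctx.uc_stack.ss_sp = stack;
    co_ctx.uc_stack.ss_size = STACK_SIZE;
    co_ctx.uc_link = &main_ctx;
    makecontext(&co_ctx, co_body, 0);

    for (int i = 0; i < rounds; i++)
        transfer(&main_ctx, &co_ctx);   /* returns when co_body transfers back */

    assert(rounds_left == 0);
    free(stack);                        /* the coroutine is never resumed again */
    return transfers;
}
```

Each round consists of one transfer into the coroutine and one transfer back, so `run_transfer_demo(n)` reports `2 * n` transfers. No mutex, condition variable or second thread is involved.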
In fact, even a coroutine implementation written from scratch is not guaranteed to be lightweight. The GNU Pth library was written to implement coroutines, but with a POSIX threads compatible API. This compatibility came at the cost of a high management overhead.
A side project of the Apache project once aimed to improve the scalability of the Apache web server by replacing Pthreads with Pth, and thus preemptive threads with coroutines. Unfortunately, the difference between the resulting non-preempted Apache and the preempted Apache was negligible.
By contrast, a study undertaken by the Swedish Institute of Computer Science (SICS), which evaluated Apache against YAWS, a web server written in Erlang and based on a truly lightweight coroutine implementation (which is native to Erlang), shows that YAWS scales better by an order of magnitude.
I don't have a link at hand, but I have come across reports with performance data by Ericsson which showed that telecommunication switches with non-preemptive scheduling systems can have scalability advantages of several orders of magnitude over preemptive ones. Our own experiments with software-switched telephone calls confirm this.
As for the PIM-specific features and semantics of IOTRANSFER, I admit that their usefulness may not be obvious to everyone, but in the telephony industry there are usage scenarios where they are extremely handy. Certain tasks are of higher priority than others, and if they do not touch the data of other tasks, then it is safe for them to interrupt those tasks, while for other tasks it is not. For example, the writing of call detail records by a telephone exchange, which is necessary for billing and also to fulfill legal obligations, has a higher priority than the actual handling of audio. It can safely interrupt audio handling because it does not modify the data used by the audio handling tasks.
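The call detail record scenario can be sketched as a small simulation in plain C. This is not PIM Modula-2 and not a real exchange: `listen_point` merely stands in for LISTEN, the "interrupt" is a counter, and all names are hypothetical. The point it tries to show is that the higher-priority record writer only gets to run at the points where the audio task explicitly allows it, and that the two tasks own disjoint data.

```c
#include <assert.h>
#include <string.h>

static int  cdr_pending;    /* simulated interrupt requests not yet serviced */
static int  cdr_written;    /* data owned by the CDR task only */
static int  frames_mixed;   /* data owned by the audio task only */
static char order[32];      /* 'A' = audio frame handled, 'C' = CDR written */
static int  order_len;

static void log_step(char c)
{
    assert(order_len < (int)sizeof(order) - 1);
    order[order_len++] = c;
}

/* Higher-priority task: writes one call detail record. */
static void cdr_task(void)
{
    cdr_written++;
    log_step('C');
}

/* Stand-in for LISTEN: the only place where pending requests are served. */
static void listen_point(void)
{
    while (cdr_pending > 0) {
        cdr_pending--;
        cdr_task();
    }
}

/* Lower-priority task: handles audio frames, listening after each one. */
static void audio_task(int frames, int hangup_at)
{
    for (int f = 0; f < frames; f++) {
        if (f == hangup_at)
            cdr_pending++;      /* simulated: a call ends during frame f */
        frames_mixed++;
        log_step('A');
        listen_point();
    }
}

const char *run_listen_demo(void)
{
    memset(order, 0, sizeof(order));
    order_len = 0;
    cdr_pending = cdr_written = frames_mixed = 0;
    audio_task(4, 1);
    return order;
}
```

In this run the request arrives while the second frame is being handled, the record is written as soon as that frame is done, and audio handling then continues with its own data untouched.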
I don't want to bore anybody here with the details, but for those who are curious, I will link to a page on coroutines on our website which describes a usage scenario for coroutines in telephony and, perhaps more interestingly, has a number of links to papers by scholars who promote the use of coroutines over Pthreads and events:
This page also has a link to the aforementioned Pth library as well as a user-space implementation of a coroutine library for C which is called LibTask.
LibTask is very efficient and reliable, and I would suggest looking into the possibility of using it instead of Pthreads for the implementation of coroutines in GM2, at the very least as an alternative.
Note that one of the reasons why Modula-2 was once a preferred environment for embedded development was the presence of coroutines with low overhead. These days Erlang has outshone Modula-2 with respect to asynchronous concurrency performance, but unlike Erlang, Modula-2 is relatively small and thus far more suitable for embedded development. It would be nice to see GM2 eventually emerge as a platform which could be used for cross development for embedded target platforms (not just a shrunk-down Linux box). For that to happen, a user-space coroutine implementation would seem to me to be a necessary requirement.