Re: Proposal: Making datamash extendable

From: Tim Rice
Subject: Re: Proposal: Making datamash extendable
Date: Wed, 18 May 2022 21:25:32 +0000

Hey Shawn,

I like the idea of making datamash more easily extendable. On the other hand, I 
have concerns about the performance hit of moving core functionality out to any 
scripting language.

An idea that comes to mind is using something like Bash's dynamically-loadable 
builtins. We could have it so that datamash is able to read extra object files 
from a particular directory. Since they are dynamically linked after being 
compiled, I believe (correct me if I'm wrong) they would or could be 
language-agnostic. People could then write extensions with C, Fortran, or 
whatever. Even assembly if that's the way they like to party :)
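A minimal sketch of the loadable-object idea, using POSIX `dlopen`/`dlsym`. Everything datamash-specific here is an assumption made for illustration: the `dm_scalar_fn` signature, the `<dir>/<op>.so` naming scheme, and the exported symbol name `dm_op` are hypothetical, not anything datamash defines today. Any language that can export a symbol with the C calling convention could produce such an object.

```c
#include <assert.h>
#include <dlfcn.h>   /* dlopen, dlsym, dlerror, dlclose; older glibc needs -ldl */
#include <stdio.h>
#include <string.h>

/* Hypothetical signature for a scalar operation exported by an extension. */
typedef double (*dm_scalar_fn)(double);

/* Build "<dir>/<op>.so" into buf.
   Returns 0 on success, -1 if op contains '/' or the result does not fit. */
int dm_plugin_path(const char *dir, const char *op, char *buf, size_t size)
{
    assert(dir != NULL && op != NULL && buf != NULL);
    if (strchr(op, '/') != NULL)      /* keep lookups inside the extension dir */
        return -1;
    int n = snprintf(buf, size, "%s/%s.so", dir, op);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Load the object for `op` and look up the (assumed) symbol "dm_op".
   On success returns the function and stores the handle in *handle so the
   caller can dlclose() it later; on failure returns NULL and *handle is NULL. */
dm_scalar_fn dm_load_scalar(const char *dir, const char *op, void **handle)
{
    char path[4096];
    assert(handle != NULL);
    *handle = NULL;
    if (dm_plugin_path(dir, op, path, sizeof path) != 0)
        return NULL;

    void *h = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (h == NULL) {
        fprintf(stderr, "datamash: %s\n", dlerror());
        return NULL;
    }

    /* Conversion idiom from the POSIX dlsym documentation. */
    dm_scalar_fn fn;
    *(void **)(&fn) = dlsym(h, "dm_op");
    if (fn == NULL) {
        dlclose(h);
        return NULL;
    }
    *handle = h;
    return fn;
}
```

An extension under these assumptions would be a shared object built with something like `cc -shared -fPIC add1.c -o add1.so` that exports `double dm_op(double)`.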

Another option would be to do what Git does: a "core" program which basically 
just searches PATH for another program named `git-<subcommand>` and farms out the rest 
of the arguments to that subprogram. This would make datamash very easy to extend, with 
the main problem being it would certainly destroy backwards compatibility in heavy-handed 
fashion.
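The Git-style dispatch could be sketched roughly as follows. The `datamash-<op>` naming convention is an assumption made by analogy with `git-`, and both function names are hypothetical. `execvp` is the standard POSIX call that searches PATH for the program.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>  /* execvp */

/* Build the external program name "datamash-<op>" into buf.
   Returns 0 on success, -1 if op is empty, contains '/', or does not fit. */
int dm_subcommand_name(const char *op, char *buf, size_t size)
{
    assert(op != NULL && buf != NULL);
    if (op[0] == '\0' || strchr(op, '/') != NULL)
        return -1;
    int n = snprintf(buf, size, "datamash-%s", op);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Replace the current process with "datamash-<op>".
   argv must be NULL-terminated; argv[0] is the operation name and the
   remaining entries are passed through unchanged.
   Returns -1 only if the name was invalid or the exec failed. */
int dm_dispatch_external(char *argv[])
{
    char name[256];
    if (dm_subcommand_name(argv[0], name, sizeof name) != 0)
        return -1;
    argv[0] = name;
    execvp(name, argv);   /* searches PATH; returns only on error */
    perror(name);
    return -1;
}
```

The core program would call `dm_dispatch_external` only after failing to find a built-in operation of that name, so existing operations keep their current behaviour.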

If people do want to use scripting languages with datamash, our refactoring work for v2.0 
could aim to establish a "libdatamash" which people could then create language 
bindings for. Then datamash could be scripted not only with Guile or Tcl but also Python, 
Perl, Ruby, Lua, etc., depending on who wants to create the bindings for their favorite 
language.
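To make the "libdatamash" idea concrete, here is a small sketch of the kind of C surface that language bindings could wrap. No such library exists yet, so every name below (`dm_op`, `dm_find_op`, `dm_apply`) is hypothetical. It reuses the `add1` operation from the quoted message as the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical libdatamash-style API: all names here are illustrative. */
typedef double (*dm_scalar_fn)(double);

struct dm_op {
    const char  *name;
    const char  *help;
    dm_scalar_fn fn;
};

static double add1(double n)   { return n + 1.0; }
static double negate(double n) { return -n; }

/* A single table is the only place a new operation has to be registered. */
static const struct dm_op dm_ops[] = {
    { "add1",   "Add 1 to the value", add1   },
    { "negate", "Negate the value",   negate },
};

/* Look up an operation by name; returns NULL if it is unknown. */
const struct dm_op *dm_find_op(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof dm_ops / sizeof dm_ops[0]; i++)
        if (strcmp(dm_ops[i].name, name) == 0)
            return &dm_ops[i];
    return NULL;
}

/* Apply a named operation to each value in place.
   Returns 0 on success, -1 if the operation is unknown. */
int dm_apply(const char *name, double *values, size_t count)
{
    const struct dm_op *op = dm_find_op(name);
    if (op == NULL)
        return -1;
    for (size_t i = 0; i < count; i++)
        values[i] = op->fn(values[i]);
    return 0;
}
```

A binding for any of the languages above would then only need to expose `dm_find_op` and `dm_apply` through that language's foreign-function interface.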


~ Tim

On Wed, May 18, 2022 at 05:52:46AM -0700, Shawn Wagner wrote:
(This is a datamash 2.0 idea)

Currently, adding a new operation is an annoying pain - you have to
touch 3 or 4 different source files, making sure the order of
different things all match up, etc.

I want to embed a scripting language in it so that if an unknown
operation is encountered, it can just load a source file that
implements it - and maybe rewrite some/all of the existing operations
to use this framework. It'll make for easier additions of new
features, and allow user-contributed ones without needing to patch and
recompile.

My preference for a language to use is Guile, since it's GNU's
official extension language and I'm quite fond of Scheme, with Tcl a
close second. There are some who like Lua for an embedded scripting
language, but they're silly people who should be treated kindly.

A simple example of what defining a new operation might look like:

(define-scalar add1 #:type 'numeric #:help "Add 1 to the value"
   (lambda (n) (+ n 1)))
