[Myexperiment-discuss] Improved RDF Data for Taverna Workflow Components

Hi All,

Over the last few days I have been composing the RDF to expose the extra information provided by the new Taverna (v1 and v2) GEMs. It is now possible to query this improved RDF data for the components of public workflows at http://rdf.myexperiment.org/sparql. The ontology for workflow components can be found at http://rdf.myexperiment.org/ontologies/components.
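For anyone who wants to try the endpoint from a script, here is a minimal sketch in Python of how a request URL could be put together. It assumes the endpoint accepts the query in a `query` parameter on a GET request, as the standard SPARQL protocol describes; I have not confirmed that here. The query itself is deliberately generic (it only lists distinct rdf:type values) so that it does not guess at any class or property names from the components ontology. The names `ENDPOINT`, `QUERY` and `build_query_url` are made up for this example.

```python
from urllib.parse import urlencode

# Endpoint URL as given in the message above.
ENDPOINT = "http://rdf.myexperiment.org/sparql"

# Generic query: list some of the distinct types present in the data.
# It relies only on rdf:type ("a"), not on any term from the components ontology.
QUERY = """
SELECT DISTINCT ?type
WHERE { ?s a ?type }
LIMIT 50
"""


def build_query_url(endpoint, query):
    """Return a GET URL carrying the query in a 'query' parameter.

    Assumption: the endpoint follows the standard SPARQL protocol
    parameter name; adjust if the service expects something else.
    """
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

The printed URL can be pasted into a browser or fetched with any HTTP client; the result format will depend on what the endpoint returns by default.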

I have run a few consistency checks over it to make sure the data has been generated correctly, but there may still be errors, so if anyone using the SPARQL endpoint spots anything awry then please do tell me. One of the consistency checks was to look at the breakdown of processor types and make sure the counts tallied. So far I have broken processors down into 4 categories, based on the value of the type property I got from the REST API that uses the Taverna GEMs:

WSDL processor: 1594
  arbitrarywsdl: 1026
  biomobywsdl: 105
  soaplabwsdl: 287
  wsdl: 176

Dataflow / nested workflow processor: 706
  workflow: 706

Beanshell processor: 1595
  beanshell: 1595

Other Processor: 5491
  apiconsumer: 17
  biomart: 89
  biomoby: 53
  biomobyobject: 223
  biomobyparser: 85
  local: 2672
  localworker: 301
  rshell: 49
  seqhound: 8
  soaplab: 18
  spreadsheet: 1
  stringconstant: 1622
  xmlsplitter: 230
  xmpp: 10
  UNDEFINED: 113

TOTAL: 9386
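The tally check on the breakdown above can be reproduced with a short script. This is only a sketch of the arithmetic in Python, not the check I actually ran; the counts are copied from the list, UNDEFINED is grouped under Other Processor because the reported 5491 only adds up when it is included, and the variable names are made up for the example.

```python
# Per-type counts copied from the breakdown above, grouped by category.
breakdown = {
    "WSDL processor": {
        "arbitrarywsdl": 1026, "biomobywsdl": 105,
        "soaplabwsdl": 287, "wsdl": 176,
    },
    "Dataflow / nested workflow processor": {"workflow": 706},
    "Beanshell processor": {"beanshell": 1595},
    "Other Processor": {
        "apiconsumer": 17, "biomart": 89, "biomoby": 53,
        "biomobyobject": 223, "biomobyparser": 85, "local": 2672,
        "localworker": 301, "rshell": 49, "seqhound": 8, "soaplab": 18,
        "spreadsheet": 1, "stringconstant": 1622, "xmlsplitter": 230,
        "xmpp": 10, "UNDEFINED": 113,
    },
}

# Category totals and grand total as reported in the message.
reported = {
    "WSDL processor": 1594,
    "Dataflow / nested workflow processor": 706,
    "Beanshell processor": 1595,
    "Other Processor": 5491,
}
REPORTED_TOTAL = 9386

# Recompute each category total from its processor types.
totals = {name: sum(types.values()) for name, types in breakdown.items()}

# Collect any category whose recomputed total differs from the reported one.
mismatches = {
    name: (totals[name], reported[name])
    for name in reported
    if totals[name] != reported[name]
}
grand_total = sum(totals.values())

print("mismatches:", mismatches)
print("grand total matches:", grand_total == REPORTED_TOTAL)
```

Moving a processor type to a different category only means moving its entry in `breakdown`; the grand total should stay the same as long as no type appears twice.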

If anyone has suggestions for additional categories I could add, or for which processor types belong in each category, that would be really useful, as my understanding of Taverna processors is somewhat limited. Ideally I would prefer not to have a processor type in more than one category, but this could be done if it is appropriate.

Regards

David Newman

From:	David R Newman
Date:	Tue, 24 Nov 2009 19:23:07 -0000