Friday 19 March 2010

The role of workflows

These are some notes from a discussion with my colleague, Jun Zhao, who has been asked about using our research partners as use-case studies for workflow sharing.

Our immediate response to this request was one of skepticism. We believed that none of our research partners would be willing to try workflow-based tools, because we couldn't see that they would gain sufficient benefits to justify the "activation energy" of deploying and learning to use such tools. In the past, our partners have been dismissive of using even very simple systems for which they could not perceive immediate benefits.

This was a somewhat surprising conclusion given the enthusiasm for workflow sharing among other bioinformatics researchers, and also researchers in other disciplines, and we wondered why this might be.

We considered each of our research group partners, covering Drosophila genomics, evolutionary development, animal behaviour, mechanical properties and evolutionary factors affecting silk, and elephant conservation in Africa. We noticed that:

  • each research group used quite manually intensive experimental procedures and data analysis, of which the mechanized data analysis portions were quite a small proportion,
  • the nature of the procedures and analysis techniques used in the different groups was very diverse, with very little opportunity for sharing between them.

This seems to stand in contrast to genetic studies that screen large numbers of samples for varying levels of gene products, or high-throughput sequencing looking for significant similarities or differences in the gene sequences of different sample populations. The closest our research partners come to this is the evolutionary development group, who use shotgun gene sequencing approaches to look for interesting gene products, but even here the particular workflows used appear to be highly dependent on the experimental hypothesis being tested.

What conclusions can we draw from this? Mainly, we think, that it would be a mistake for computer scientists and software tool developers to assume that a set of tools that has been found useful by one group of researchers is useful to all groups studying similar natural phenomena. Details of experiment design would appear to be a dominant indicator for the suitability of a particular type of tool.
