Posted in Education and Outreach on Sep 07, 2025

A team led from the University of Canberra last week won the prestigious Eureka Prize for Excellence in Research Software. The prize was awarded for an innovative software package called dartR. Following the publicity, some people have been asking: what is dartR?

For those who do not know, dartR is a collection of scripts written in the R programming language for undertaking routine analyses on datasets comprising Single Nucleotide Polymorphisms, or SNPs. These are single-point differences in DNA between individuals, and they allow a large number of individuals to be genotyped at low cost. The individuals can be koalas, platypus, crocodiles, birds, fish or indeed any organism, so SNPs are of great value in studying the population genetics of wild and captive populations. SNP datasets are also of immense value in providing a genetic foundation for decisions governing plant and animal breeding, or decisions on which individuals to pair in captive insurance colonies to maximize retention of genetic diversity. The application of these markers is very extensive.

Commercial services, like Diversity Arrays Technology (DArT), co-located at the University of Canberra, bring the power of genomics to biologists with little or no experience in wet lab work. Indeed, one need only place a small snippet of tissue or even a buccal swab in a tube of ethanol and pass the tubes over to DArT. Eight weeks later a SNP dataset arrives.

With the sequencing accessible to all (even from their kitchen, no lab required), the challenge for many has been the analysis. What we have done in developing dartR is make the foundational analysis routine, with a range of pre-packaged scripts to read the data into the package, add metadata to the SNP genotypes, interrogate the data, visualize it with tools like Principal Components Analysis, filter the data, and prepare it for subsequent, more sophisticated analyses using other third-party software.
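To make the idea of "filtering" a SNP dataset concrete, here is a minimal sketch in plain Python. It is not dartR code and does not use the dartR API. It assumes genotypes are stored as counts of the alternate allele (0, 1 or 2) with `None` for a missing call, and the function names (`call_rate`, `filter_loci_by_call_rate`, `alt_allele_frequency`) are hypothetical names chosen for illustration. It shows one common filter: dropping loci that were successfully genotyped in too few individuals.

```python
# Illustrative sketch only: not dartR, and not its data format.
# Assumption: rows are individuals, columns are SNP loci, and each genotype
# is the count of the alternate allele (0, 1, 2) or None if the call is missing.

def call_rate(column):
    """Proportion of individuals with a non-missing genotype at one locus."""
    return sum(g is not None for g in column) / len(column)


def filter_loci_by_call_rate(genotypes, threshold):
    """Keep loci whose call rate is at least `threshold`.

    Returns the indices of the retained loci and the filtered matrix.
    """
    n_loci = len(genotypes[0])
    keep = [
        j for j in range(n_loci)
        if call_rate([row[j] for row in genotypes]) >= threshold
    ]
    return keep, [[row[j] for j in keep] for row in genotypes]


def alt_allele_frequency(column):
    """Alternate-allele frequency at one locus, ignoring missing calls."""
    called = [g for g in column if g is not None]
    return sum(called) / (2 * len(called))


# Four individuals genotyped at three loci (made-up values).
genotypes = [
    [0, 1, None],
    [2, None, None],
    [1, 1, 0],
    [0, 2, None],
]

kept, filtered = filter_loci_by_call_rate(genotypes, threshold=0.75)
print(kept)      # [0, 1]  (locus 2 has a call rate of only 0.25)
print(filtered)  # [[0, 1], [2, None], [1, 1], [0, 2]]
print(alt_allele_frequency([row[0] for row in genotypes]))  # 0.375
```

The threshold of 0.75 is arbitrary here; in practice the cut-off is a judgment call that depends on the dataset and the downstream analysis.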

Basically, dartR brings the power of genomics and genetic analysis to the fingertips of the ecologist and general biologist, allowing them to explore and analyse their data to answer questions of substance. Minimal programming experience is required.

The SNPs are generated by what is called representational sequencing. A pair of restriction enzymes (DNA cutters) is used to digest the DNA into fragments. Only those fragments that are of manageable length are sequenced. This is representational because only a relatively small portion of the genome of the target species is sequenced, say 5%, effectively at random. DArT run these sequence tags, as they are called, through a proprietary pipeline to generate a set of reproducible and reliable SNP markers. If you want to read more, the process is described in the methods section of our paper in Molecular Ecology. Our GitHub site is also available.
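The digestion and size-selection step can be sketched as a toy "in silico" double digest. This is an illustration of the general idea only, not DArT's proprietary pipeline. The assumptions: the two recognition sequences used (`CTGCAG` and `TTAA`) are examples chosen for the demonstration, each enzyme is treated as cutting at the start of its recognition site (a simplification), only fragments flanked by one site of each enzyme are kept, and the size window is arbitrary. The function name `double_digest` is hypothetical.

```python
import re

# Toy model of representational sequencing (assumptions labeled in the text above).


def double_digest(seq, site_a, site_b, min_len, max_len):
    """Return fragments bounded by one site of each enzyme, within a size window."""
    # Record every recognition site as (position, enzyme label).
    # Simplification: the cut is placed at the start of the recognition site.
    cuts = [(m.start(), "A") for m in re.finditer(re.escape(site_a), seq)]
    cuts += [(m.start(), "B") for m in re.finditer(re.escape(site_b), seq)]
    cuts.sort()

    kept = []
    # Walk over neighbouring cut sites; each adjacent pair bounds one fragment.
    for (start, enzyme_1), (end, enzyme_2) in zip(cuts, cuts[1:]):
        length = end - start
        if enzyme_1 != enzyme_2 and min_len <= length <= max_len:
            kept.append(seq[start:end])
    return kept


# A made-up 79-base "genome" with two sites for each enzyme.
genome = (
    "CTGCAG" + "A" * 10 +   # site A at position 0
    "TTAA" + "C" * 40 +     # site B at position 16
    "CTGCAG" + "G" * 5 +    # site A at position 60
    "TTAA" + "ACGT"         # site B at position 71
)

fragments = double_digest(genome, "CTGCAG", "TTAA", min_len=15, max_len=50)
print([len(f) for f in fragments])  # [16, 44]  (the 11-base fragment is too short)

# Fraction of the toy genome represented by the retained fragments.
print(sum(len(f) for f in fragments) / len(genome))
```

In a real genome the retained fraction is small because suitable pairs of sites at a manageable distance are comparatively rare; the toy sequence above is far too short to reflect that.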

Bernd Gruber and Arthur Georges, foundational developers of the package, are extremely grateful for the support received from dartR users and those who provided testimonials in support of the prize. We are astonished at the uptake of the dartR package globally, and the manner in which it has grown through the efforts of other developers in our team. Luis Mijangos was first employed to work on the package with funds from the ACT Priority Investment Program, secured in collaboration with DArT, and from the CSIRO Environomics Future Science Platform. Luis is now with DArT. Carlo Pacioni from the Arthur Rylah Institute joined the development team, as did Diana Robledo Ruiz from Monash University, each bringing new capacity to the dartR package. The development team now includes Emily Stringer, Peter Unmack and Ching Ching Lau, along with Oliver Berry and Floriaan Devloo-Delva (CSIRO), Renee Catullo (University of Western Australia), Eric Archer (National Oceanic and Atmospheric Administration, US) and Jesus Castrejon-Figueroa (UNSW).

In addition to further development of the package, we are now working on an eBook to provide guidance on the use of dartR, especially now that it is moving to greater functional capacity for complex analyses. The eBook will draw on and expand existing tutorials and will provide basic support for our regular workshops.

Provided we can keep ahead of this rapidly developing field, the future for dartR seems bright.

Further reading:

The package dartR was announced in Molecular Ecology Resources; Version 2 announcement also in Molecular Ecology Resources.

