Posted in Uncategorized on Jan 14, 2018

An exciting new R package was announced today in Molecular Ecology Resources: dartR, which facilitates analysis of SNP data generated by reduced-representation genome sequencing.

The University of Canberra has a special relationship with a company called Diversity Arrays Technology (DArT) that specialises in genotyping by sequencing, particularly for agriculture and plant breeding. DArT, co-located with the Institute for Applied Ecology (IAE), has recently begun applying its services in the animal domain, and has transformed the capacity of the IAE and its genetics and genomics team.

In the interests of painless exploratory analysis, Bernd Gruber, Arthur Georges, Peter Unmack and Olly Berry have constructed an R package {dartR} for

  1. loading DArT™ SNP and SilicoDArT data generated from the commercial service provided by Diversity Arrays Technology Pty Ltd;
  2. applying filters to those data based on locus metadata such as call rate, information content or reproducibility;
  3. assigning individuals to populations and selecting subsets of individuals or populations;
  4. visualization using Principal Coordinates Analysis (PCoA); and
  5. providing a conduit to a range of standard data formats and R packages for analysis.
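To make the filtering step concrete, here is a minimal base-R sketch of a call-rate filter. It deliberately does not use {dartR} functions, whose names and arguments are not given in this post; the toy matrix `geno`, the 0/1/2 genotype coding and the 0.75 threshold are all assumptions made for illustration.

```r
# Toy genotype matrix (hypothetical data): rows = individuals, columns = loci.
# Genotypes are coded 0, 1, 2 (count of the alternate allele); NA = missing call.
geno <- matrix(
  c(0,  1, NA, 2,
    1,  1, NA, 0,
    2, NA, NA, 0,
    0,  1,  2, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("ind", 1:4), paste0("loc", 1:4))
)

# Call rate per locus = proportion of individuals with a non-missing genotype.
call_rate <- colMeans(!is.na(geno))

# Keep only loci whose call rate meets the threshold (0.75 is arbitrary here).
threshold <- 0.75
geno_filtered <- geno[, call_rate >= threshold, drop = FALSE]

print(round(call_rate, 2))
print(colnames(geno_filtered))
```

The same pattern applies to any per-locus statistic: compute one value per column, then subset the columns with a logical vector.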

In most cases, the scripts in {dartR} are simple wrappers for scripts included in other already available packages, to provide transparent access to these packages for analyzing DArT data, and to provide some enhanced output diagnostics. Relatively few scripts provide novel analyses. We make no apologies for this, as the objective of {dartR} is to provide fundamental tools for accessing and manipulating DArT datafiles in preparation for analysis by the vast suite of packages available in R through the CRAN repository.

A summary of the capabilities of {dartR} is as follows:

  • Intelligent interpretation and input of DArT comma-delimited files to a compact genlight form of the R {adegenet} package.
  • Filtering loci and individuals on criteria drawn from the DArT locus metadata (such as repAvg, AvgPIC) or on computed statistics (such as call rate).
  • Relabelling individuals and recoding populations into new aggregations, and deleting selected individuals or populations.
  • Visualization using Principal Coordinates Analysis (PCoA) and Neighbour-joining trees.
  • Translation to other R packages (e.g. NewHybrids), to other {adegenet} objects (e.g. genind), and to standard data formats (e.g. FASTA).
  • A few specific analyses not available elsewhere (e.g. fixed difference analysis, assignment analysis).
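As a rough illustration of the PCoA step listed above, the sketch below runs a classical Principal Coordinates Analysis with base R's `dist()` and `cmdscale()`. This is not {dartR}'s own implementation; the toy matrix `geno`, the individual labels and the choice of Euclidean distance are assumptions for the example.

```r
# Toy genotype matrix (hypothetical data): two individuals from each of two populations.
geno <- matrix(
  c(0, 0, 1, 2,
    0, 1, 1, 2,
    2, 2, 0, 0,
    2, 1, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("popA_1", "popA_2", "popB_1", "popB_2"),
                  paste0("loc", 1:4))
)

# Pairwise Euclidean distances between individuals.
d <- dist(geno)

# Classical PCoA (metric multidimensional scaling) on the distance matrix.
pcoa <- cmdscale(d, k = 2, eig = TRUE)
scores <- pcoa$points  # one row per individual, one column per axis

# Share of variation captured by each axis with a positive eigenvalue.
pos_eig <- pcoa$eig[pcoa$eig > 1e-8]
var_explained <- pos_eig / sum(pos_eig)

print(scores)
print(var_explained)
```

In this toy data the first axis separates the two populations; passing `scores` to `plot()` gives the usual ordination scatter.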

The package is under active development, and a beta release is available at https://github.com/green-striped-gecko/dartR, where you will find installation instructions and other guidance. The package dartR (version 0.80) is now also available on CRAN (install.packages("dartR")).

