atena: an R/Bioconductor package for the analysis of transposable elements


Author(s): Beatriz Calvo-Serra, Robert Castelo

Affiliation(s): Universitat Pompeu Fabra



The quantification of RNA expression of transposable elements (TEs) requires specialized software and annotations that fall outside the standardised pipelines and data sources employed in the analysis of RNA sequencing (RNA-seq) data. This often puts a burden on the users of such software, who must first pull and combine input annotations from heterogeneous sources and formats and then parse the output quantifications before they can be fed into the next tool for a downstream analysis, such as a differential expression analysis. We present atena, a Bioconductor package that, first, provides efficient and accurate re-implementations in R of three of the most popular methods for TE expression quantification: TEtranscripts (Jin et al., 2015), ERVmap (Tokuyama et al., 2018) and Telescope (Bendall et al., 2019). Second, atena provides a single interface to download all the RepeatMasker track data available at the UCSC Genome Browser and flexibly parse it into TE annotations using different algorithms, including a re-implementation in R of the one by Bailly-Bechet et al. (2014). Third, it provides a fourth expression quantification method, also called atena, which builds upon the other three to address some of their shortcomings. We have used atena to investigate the contribution of TEs to the postnatal changes in expression following a fetal inflammatory response in extremely preterm neonates, using the newborn screening RNA-seq data produced by Costa et al. (2021). The atena package is publicly available at https://bioconductor.org/packages/atena.