ResearchHub | Open Science Community

Frank Hartung

Author with expertise in RNA Sequencing Data Analysis

Achievements

Cited Author

Open Access Advocate

Key Stats

Upvotes received: 0

Publications: 4 (50% Open Access)

Cited by: 1,013

h-index: 30 / i10-index: 39

Reputation

Biology: < 1%

Chemistry: < 1%

Economics: < 1%

Publications

Using intron position conservation for homology-based gene prediction

Jens Keilwagen et al., Feb 17, 2016

Annotation of protein-coding genes is very important in bioinformatics and biology and has a decisive influence on many downstream analyses. Homology-based gene prediction programs allow for transferring knowledge about protein-coding genes from an annotated organism to an organism of interest.

Molecular Biology

Paper

GeMoMa: Homology-Based Gene Prediction Utilizing Intron Position Conservation and RNA-seq Data

Jens Keilwagen et al., Jan 1, 2019

GeMoMa is a homology-based gene prediction program that predicts gene models in target species based on gene models in evolutionarily related reference species. GeMoMa utilizes amino acid sequence conservation, intron position conservation, and RNA-seq data to accurately predict protein-coding transcripts. Furthermore, GeMoMa supports combining predictions based on several reference species, which allows high-quality annotations of different reference species to be transferred to a target species. Here, we present a detailed description of the GeMoMa modules and the GeMoMa pipeline, and how they can be used on the command line to address particular biological problems.

Molecular Biology

Paper

Combining RNA-seq data and homology-based gene prediction for plants, animals and fungi

Jens Keilwagen et al., May 29, 2018

Genome annotation is of key importance in many research questions. The identification of protein-coding genes is often based on transcriptome sequencing data, ab-initio or homology-based prediction. Recently, it was demonstrated that intron position conservation improves homology-based gene prediction, and that experimental data improves ab-initio gene prediction. Here, we present an extension of the gene prediction program GeMoMa that utilizes amino acid sequence conservation, intron position conservation and optionally RNA-seq data for homology-based gene prediction. We show on published benchmark data for plants, animals and fungi that GeMoMa performs better than the gene prediction programs BRAKER1, MAKER2, and CodingQuarry, and purely RNA-seq-based pipelines for transcript identification. In addition, we demonstrate that using multiple reference organisms may help to further improve the performance of GeMoMa. Finally, we apply GeMoMa to four nematode species and to the recently published barley reference genome, indicating that current annotations of protein-coding genes may be refined using GeMoMa predictions. GeMoMa might be of great utility for annotating newly sequenced genomes but also for finding homologs of a specific gene or gene family. GeMoMa has been published under GNU GPL3 and is freely available at http://www.jstacs.de/index.php/GeMoMa .

Molecular Biology

Paper

Combining RNA-seq data and homology-based gene prediction for plants, animals and fungi

Jens Keilwagen et al., Nov 14, 2017

Motivation: Genome annotation is of key importance in many research questions. The identification of protein-coding genes is often based on transcriptome sequencing data, ab-initio or homology-based prediction. Recently, it was demonstrated that intron position conservation improves homology-based gene prediction, and that experimental data improves ab-initio gene prediction. Results: Here, we present an extension of the gene prediction tool GeMoMa that utilizes amino acid sequence conservation, intron position conservation and optionally RNA-seq data for homology-based gene prediction. We show on published benchmark data for plants, animals and fungi that GeMoMa performs better than the gene prediction programs BRAKER1, MAKER2, and CodingQuarry, and purely RNA-seq-based pipelines for transcript identification. In addition, we demonstrate that using multiple reference organisms may help to further improve the performance of GeMoMa. Finally, we apply GeMoMa to four nematode species and to the recently published barley reference genome indicating that current annotations of protein-coding genes may be refined using GeMoMa predictions. Availability: GeMoMa has been published under GNU GPL3 and is freely available at http://www.jstacs.de/index.php/GeMoMa.

Molecular Biology

Paper
