ResearchHub | Open Science Community

WormBase ParaSite - a comprehensive resource for helminth genomics

Kevin Howe et al. Nov 27, 2016

The number of publicly available parasitic worm genome sequences has increased dramatically in the past three years, and research interest in helminth functional genomics is now quickly gathering pace in response to the foundation that has been laid by these collective efforts. A systematic approach to the organisation, curation, analysis and presentation of these data is clearly vital for maximising the utility of these data to researchers. We have developed a portal called WormBase ParaSite (http://parasite.wormbase.org) for interrogating helminth genomes on a large scale. Data from over 100 nematode and platyhelminth species are integrated, adding value by way of systematic and consistent functional annotation (e.g. protein domains and Gene Ontology terms), gene expression analysis (e.g. alignment of life-stage specific transcriptome data sets), and comparative analysis (e.g. orthologues and paralogues). We provide several ways of exploring the data, including genome browsers, genome and gene summary pages, text search, sequence search, a query wizard, bulk downloads, and programmatic interfaces. In this review, we provide an overview of the back-end infrastructure and analysis behind WormBase ParaSite, and the displays and tools available to users for interrogating helminth genomic data.

Genetics

Ecology

Ensembl Genomes 2016: more genomes, more complexity

Paul Kersey et al. Nov 17, 2015

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments. These include the development of new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar); and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and make it available through the Ensembl user interfaces.

Genetics

Paleontology

Ensembl Genomes 2018: an integrated omics infrastructure for non-vertebrate species

Paul Kersey et al. Oct 24, 2017

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including genome sequence, gene models, transcript sequence, genetic variation, and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments and expansions. These include the incorporation of almost 20 000 additional genome sequences and over 35 000 tracks of RNA-Seq data, which have been aligned to genomic sequence and made available for visualization. Other advances since 2015 include the release of the database in Resource Description Framework (RDF) format, a large increase in community-derived curation, a new high-performance protein sequence search, additional cross-references, improved annotation of non-protein-coding genes, and the launch of pre-release and archival sites. Collectively, these changes are part of a continuing response to the increasing quantity of publicly-available genome-scale data, and the consequent need to archive, integrate, annotate and disseminate these using automated, scalable methods.

Genetics

Ecology

WormBase 2016: expanding to enable helminth genomic research

Kevin Howe et al. Nov 17, 2015

WormBase (www.wormbase.org) is a central repository for research data on the biology, genetics and genomics of Caenorhabditis elegans and other nematodes. The project has evolved from its original remit to collect and integrate all data for a single species, and now extends to numerous nematodes, ranging from evolutionary comparators of C. elegans to parasitic species that threaten plant, animal and human health. Research activity using C. elegans as a model system is as vibrant as ever, and we have created new tools for community curation in response to the ever-increasing volume and complexity of data. To better allow users to navigate their way through these data, we have made a number of improvements to our main website, including new tools for browsing genomic features and ontology annotations. Finally, we have developed a new portal for parasitic worm genomes. WormBase ParaSite (parasite.wormbase.org) contains all publicly available nematode and platyhelminth annotated genome sequences, and is designed specifically to support helminth genomic research.

Genetics

Philosophy

Tracking Subclonal Mutation Frequencies Throughout Lymphomagenesis Identifies Cancer Drivers in Mouse Models of Lymphoma.

Philip Webster et al. Jun 29, 2017

Determining whether recurrent but rare cancer mutations are bona fide driver mutations remains a bottleneck in cancer research. Here we present the most comprehensive analysis of retrovirus-driven lymphomagenesis produced to date, sequencing 700,000 mutations from >500 malignancies collected at time points throughout tumor development. This enabled identification of positively selected events, and the first demonstration of negative selection of mutations that may be deleterious to tumor development, indicating novel avenues for therapy. Customized sequencing and bioinformatics methodologies were developed to quantify subclonal mutations in both premalignant and malignant tissue, greatly expanding the statistical power for identifying driver mutations and yielding a high-resolution, genome-wide map of the selective forces surrounding cancer gene loci. Screening two BCL2 transgenic models confirms known drivers of human B-cell non-Hodgkin lymphoma, and implicates novel candidates including modifiers of immunosurveillance such as co-stimulatory molecules and MHC loci. Correlating mutations with genotypic and phenotypic features also gives robust identification of known cancer genes independently of local variance in mutation density. An online resource (http://mulv.lms.mrc.ac.uk) allows customized queries of the entire dataset.

Genetics

Oncology
