ResearchHub | Open Science Community

The Cell Ontology 2016: enhanced content, modularization, and ontology interoperability

Alexander Diehl et al.Jul 4, 2016

The Cell Ontology (CL) is an OBO Foundry candidate ontology covering the domain of canonical, natural biological cell types. Since its inception in 2005, the CL has undergone multiple rounds of revision and expansion, most notably in its representation of hematopoietic cells. For in vivo cells, the CL focuses on vertebrates but provides general classes that can be used for other metazoans, which can be subtyped in species-specific ontologies. Recent work on the CL has focused on extending the representation of various cell types, and developing new modules in the CL itself, and in related ontologies in coordination with the CL. For example, the Kidney and Urinary Pathway Ontology was used as a template to populate the CL with additional cell types. In addition, subtypes of the class 'cell in vitro' have received improved definitions and labels to provide for modularity with the representation of cells in the Cell Line Ontology and Reagent Ontology. Recent changes in the ontology development methodology for CL include a switch from OBO to OWL for the primary encoding of the ontology, and an increasing reliance on logical definitions for improved reasoning. The CL is now mandated as a metadata standard for large functional genomics and transcriptomics projects, and is used extensively for annotation, querying, and analyses of cell type specific data in sequencing consortia such as FANTOM5 and ENCODE, as well as for the NIAID ImmPort database and the Cell Image Library. The CL is also a vital component used in the modular construction of other biomedical ontologies—for example, the Gene Ontology and the cross-species anatomy ontology, Uberon, use CL to support the consistent representation of cell types across different levels of anatomical granularity, such as tissues and organs. The ongoing improvements to the CL make it a valuable resource to both the OBO Foundry community and the wider scientific community, and we continue to experience increased interest in the CL both among developers and within the user community.

Philosophy

Artificial Intelligence

0

Paper

Save

Finding Our Way through Phenotypes

Andrew Deans et al.Jan 6, 2015

Despite a large and multifaceted effort to understand the vast landscape of phenotypic data, their current form inhibits productive data analysis. The lack of a community-wide, consensus-based, human- and machine-interpretable language for describing phenotypes and their genomic and environmental contexts is perhaps the most pressing scientific bottleneck to integration across many key fields in biology, including genomics, systems biology, development, medicine, evolution, ecology, and systematics. Here we survey the current phenomics landscape, including data resources and handling, and the progress that has been made to accurately capture relevant data descriptions for phenotypes. We present an example of the kind of integration across domains that computable phenotypes would enable, and we call upon the broader biology community, publishers, and relevant funding agencies to support efforts to surmount today's data barriers and facilitate analytical reproducibility.

Genetics

Ecology

0

Paper

Save

Assessing Bayesian Phylogenetic Information Content of Morphological Data Using Knowledge from Anatomy Ontologies

Diego Porto et al.Jan 6, 2022

Abstract Morphology remains a primary source of phylogenetic information for many groups of organisms, and the only one for most fossil taxa. Organismal anatomy is not a collection of randomly assembled and independent ‘parts’, but instead a set of dependent and hierarchically nested entities resulting from ontogeny and phylogeny. How do we make sense of these dependent and at times redundant characters? One promising approach is using ontologies—structured controlled vocabularies that summarize knowledge about different properties of anatomical entities, including developmental and structural dependencies. Here we assess whether the proximity of ontology-annotated characters within an ontology predicts evolutionary patterns. To do so, we measure phylogenetic information across characters and evaluate if it is hierarchically structured by ontological knowledge—in much the same way as phylogeny structures across-species diversity. We implement an approach to evaluate the Bayesian phylogenetic information (BPI) content and phylogenetic dissonance among ontology-annotated anatomical data subsets. We applied this to datasets representing two disparate animal groups: bees (Hexapoda: Hymenoptera: Apoidea, 209 chars) and characiform fishes (Actinopterygii: Ostariophysi: Characiformes, 463 chars). For bees, we find that BPI is not substantially structured by anatomy since dissonance is often high among morphologically related anatomical entities. For fishes, we find substantial information for two clusters of anatomical entities instantiating concepts from the jaws and branchial arch bones, but among-subset information decreases and dissonance increases substantially moving to higher level subsets in the ontology. We further applied our approach to address particular evolutionary hypotheses with an example of morphological evolution in miniature fishes. While we show that ontology does indeed structure phylogenetic information, additional relationships and processes, such as convergence, likely play a substantial role in explaining BPI and dissonance, and merit future investigation. Our work demonstrates how complex morphological datasets can be interrogated with ontologies by allowing one to access how information is spread hierarchically across anatomical concepts, how congruent this information is, and what sorts of processes may structure it: phylogeny, development, or convergence.

Philosophy

Artificial Intelligence

35

Paper

Save

rphenoscate: An R package for semantic-aware evolutionary analyses of anatomical traits

Diego Porto et al.Feb 21, 2023

Abstract Organismal anatomy is a complex hierarchical system of interconnected anatomical entities often producing dependencies among multiple morphological characters. Ontologies provide a formalized and computable framework for representing and incorporating prior biological knowledge about anatomical dependencies in models of trait evolution. Further, ontologies offer new opportunities for assembling and working with semantic representations of morphological data. In this work we present a new R package— rphenoscate —that enables incorporating ontological knowledge in evolutionary analyses and exploring semantic patterns of morphological data. In conjunction with rphenoscape it also allows for assembling synthetic phylogenetic character matrices from semantic phenotypes of morphological data. We showcase the new package functionalities with three data sets from bees and fishes. We demonstrate that ontology knowledge can be employed to automatically set up ontologyinformed evolutionary models that account for trait dependencies in the context of stochastic character mapping. We also demonstrate how ontology annotations can be explored to interrogate patterns of morphological evolution. Finally, we demonstrate that synthetic character matrices assembled from semantic phenotypes retain most of the phylogenetic information of the original data set. Ontologies will become an increasingly important tool not only for enabling prior anatomical knowledge to be integrated into phylogenetic methods but also to make morphological data FAIR compliant—a critical component of the ongoing ‘phenomics’ revolution. Our new package offers key advancements toward this goal.

Genetics

Philosophy

13

Paper

Save

A logical model of homology for comparative biology

Paula Mabee et al.Mar 26, 2019

There is a growing body of research on the evolution of anatomy in a wide variety of organisms. Discoveries in this field could be greatly accelerated by computational methods and resources that enable these findings to be compared across different studies and different organisms and linked with the genes responsible for anatomical modifications. Homology is a key concept in comparative anatomy; two important types are historical homology (the similarity of organisms due to common ancestry) and serial homology (the similarity of repeated structures within an organism). We explored how to most effectively represent historical and serial homology across anatomical structures to facilitate computational reasoning. We assembled a collection of homology assertions from the literature with a set of taxon phenotypes for vertebrate fins and limbs from the Phenoscape Knowledgebase (KB). Using six competency questions, we evaluated the reasoning ramifications of two logical models: the Reciprocal Existential Axioms Homology Model (REA) and the Ancestral Value Axioms Homology Model (AVA). Both models returned the user-expected results for all but one historical homology query and all serial homology queries. Additionally, for each competency question, the AVA model returns the search term and any subtypes. We identify some challenges of implementing complete homology queries due to limitations of OWL reasoning. This work lays the foundation for homology reasoning to be incorporated into other ontology-based tools, such as those that enable synthetic supermatrix construction and candidate gene discovery.

Genetics

Molecular Biology

0

Paper

Save

Hierarchy-guided Neural Networks for Species Classification

Mohannad Elhamod et al.Jan 18, 2021

Abstract Species classification is an important task that is the foundation of industrial, commercial, ecological, and scientific applications involving the study of species distributions, dynamics, and evolution. While conventional approaches for this task use off-the-shelf machine learning (ML) methods such as existing Convolutional Neural Network (ConvNet) architectures, there is an opportunity to inform the ConvNet architecture using our knowledge of biological hierarchies among taxonomic classes. In this work, we propose a new approach for species classification termed Hierarchy-Guided Neural Network ( HGNN ), which infuses hierarchical taxonomic information into the neural network’s training to guide the structure and relationships among the extracted features. We perform extensive experiments on an illustrative use-case of classifying fish species to demonstrate that HGNN outperforms conventional ConvNet models in terms of classification accuracy, especially under scarce training data conditions. We also observe that HGNN shows better resilience to adversarial occlusions, when some of the most informative patch regions of the image are intentionally blocked and their effect on classification accuracy is studied.

Artificial Intelligence

Molecular Biology

9

Paper

Artificial Intelligence

Molecular Biology

0

Save

0

Annotation of phenotypes using ontologies: a Gold Standard for the training and evaluation of natural language processing systems

Wasila Dahdul et al.May 15, 2018

Natural language descriptions of organismal phenotypes, a principal object of study in biology, are abundant in biological literature. Expressing these phenotypes as logical statements using formal ontologies would enable large-scale analysis on phenotypic information from diverse systems. However, considerable human effort is required to make the semantics of phenotype descriptions amenable to machine reasoning by (a) recognizing appropriate ontological terms for entities in text and (b) stringing these terms into logical statements. Most existing Natural Language Processing tools stop at entity recognition, leaving a need for tools that can assist with both aspects of the task. The recently described Semantic CharaParser aims to meet this need. We describe the first expert-curated Gold Standard corpus for ontology-based annotation of phenotypes from the systematics literature. We use it to evaluate Semantic CharaParser's annotations and explore differences in performance between humans and machine. We use four annotation accuracy metrics that can account for both semantically identical and similar matches. We found that machine human consistency was significantly lower than intercurator (human human) consistency. Surprisingly, allowing curators access to external information that was not available to Semantic CharaParser did not significantly increase the similarity of their annotations to the Gold Standard nor have a significant effect on inter-curator consistency. We found that the similarity of machine annotations to the Gold Standard increased after new ontology terms relevant to the input text had been added. Evaluation by the original authors of the character descriptions indicated that the Gold Standard annotations came closer to representing their intended meaning than did either the curator or machine annotations. These findings point toward ways to better design of software to augment human curators, and the Gold Standard corpus will allow training and assessment of new tools to improve phenotype annotation accuracy at scale.

Philosophy

Artificial Intelligence

0

Paper

Philosophy

Artificial Intelligence

0

Save