ResearchHub | Open Science Community

Phylogenomics resolves the timing and pattern of insect evolution

Bernhard Misof et al.Nov 6, 2014

Toward an insect evolution resolution Insects are the most diverse group of animals, with the largest number of species. However, many of the evolutionary relationships between insect species have been controversial and difficult to resolve. Misof et al. performed a phylogenomic analysis of protein-coding genes from all major insect orders and close relatives, resolving the placement of taxa. The authors used this resolved phylogenetic tree together with fossil analysis to date the origin of insects to ~479 million years ago and to resolve long-controversial subjects in insect phylogeny. Science , this issue p. 763

Genetics

Ecology

0

Paper

Save

The oyster genome reveals stress adaptation and complexity of shell formation

Guofan Zhang et al.Sep 19, 2012

The Pacific oyster Crassostrea gigas belongs to one of the most species-rich but genomically poorly explored phyla, the Mollusca. Here we report the sequencing and assembly of the oyster genome using short reads and a fosmid-pooling strategy, along with transcriptomes of development and stress response and the proteome of the shell. The oyster genome is highly polymorphic and rich in repetitive sequences, with some transposable elements still actively shaping variation. Transcriptome studies reveal an extensive set of genes responding to environmental stress. The expansion of genes coding for heat shock protein 70 and inhibitors of apoptosis is probably central to the oyster’s adaptation to sessile life in the highly stressful intertidal zone. Our analyses also show that shell formation in molluscs is more complex than currently understood and involves extensive participation of cells and their exosomes. The oyster genome sequence fills a void in our understanding of the Lophotrochozoa. The sequencing and assembly of the highly polymorphic oyster genome through a combination of short reads and fosmid pooling, complemented with extensive transcriptome analysis of development and stress response and proteome analysis of the shell, provides new insight into oyster biology and adaptation to a highly changeable environment. Oysters are keystone species in estuarine ecology and among the most important aquaculture species worldwide. The sequencing and assembly of the genome of the Pacific oyster, Crassostrea gigas, are now reported. Comparisons with other genomes reveal an expansion of defence genes as an adaptation to life as a sessile species in the intertidal zone, a surprisingly complex pathway for shell formation and dramatic evolution of genes related to larval development, highlighting their adaptive significance for marine invertebrates.

Genetics

Ecology

0

Paper

Save

PKSolver: An add-in program for pharmacokinetic and pharmacodynamic data analysis in Microsoft Excel

Yong Zhang et al.Feb 22, 2010

This study presents PKSolver, a freely available menu-driven add-in program for Microsoft Excel written in Visual Basic for Applications (VBA), for solving basic problems in pharmacokinetic (PK) and pharmacodynamic (PD) data analysis. The program provides a range of modules for PK and PD analysis including noncompartmental analysis (NCA), compartmental analysis (CA), and pharmacodynamic modeling. Two special built-in modules, multiple absorption sites (MAS) and enterohepatic circulation (EHC), were developed for fitting the double-peak concentration–time profile based on the classical one-compartment model. In addition, twenty frequently used pharmacokinetic functions were encoded as a macro and can be directly accessed in an Excel spreadsheet. To evaluate the program, a detailed comparison of modeling PK data using PKSolver and professional PK/PD software package WinNonlin and Scientist was performed. The results showed that the parameters estimated with PKSolver were satisfactory. In conclusion, the PKSolver simplified the PK and PD data analysis process and its output could be generated in Microsoft Word in the form of an integrated report. The program provides pharmacokinetic researchers with a fast and easy-to-use tool for routine and basic PK and PD data analysis with a more user-friendly interface.

Pharmacology

Analytical Chemistry

0

Paper

Save

SOAPnuke: a MapReduce acceleration-supported software for integrated quality control and preprocessing of high-throughput sequencing data

Yuxin Chen et al.Dec 4, 2017

Quality control (QC) and preprocessing are essential steps for sequencing data analysis to ensure the accuracy of results. However, existing tools cannot provide a satisfying solution with integrated comprehensive functions, proper architectures, and highly scalable acceleration. In this article, we demonstrate SOAPnuke as a tool with abundant functions for a “QC-Preprocess-QC” workflow and MapReduce acceleration framework. Four modules with different preprocessing functions are designed for processing datasets from genomic, small RNA, Digital Gene Expression, and metagenomic experiments, respectively. As a workflow-like tool, SOAPnuke centralizes processing functions into 1 executable and predefines their order to avoid the necessity of reformatting different files when switching tools. Furthermore, the MapReduce framework enables large scalability to distribute all the processing works to an entire compute cluster. We conducted a benchmarking where SOAPnuke and other tools are used to preprocess a ∼30× NA12878 dataset published by GIAB. The standalone operation of SOAPnuke struck a balance between resource occupancy and performance. When accelerated on 16 working nodes with MapReduce, SOAPnuke achieved ∼5.7 times the fastest speed of other tools.

Ecology

Artificial Intelligence

0

Paper

Save

Sequencing of 50 Human Exomes Reveals Adaptation to High Altitude

Xin Yi et al.Jul 1, 2010

No Genetic Vertigo Peoples living in high altitudes have adapted to their situation (see the Perspective by Storz ). To identify gene regions that might have contributed to high-altitude adaptation in Tibetans, Simonson et al. (p. 72 , published online 13 May) conducted a genome scan of nucleotide polymorphism comparing Tibetans, Han Chinese, and Japanese, while Yi et al. (p. 75 ) performed comparable analyses on the coding regions of all genes—their exomes. Both studies converged on a gene, endothelial Per-Arnt-Sim domain protein 1 (also known as hypoxia-inducible factor 2 α), which has been linked to the regulation of red blood cell production. Other genes identified that were potentially under selection included adult and fetal hemoglobin and two functional candidate loci that were correlated with low hemoglobin concentration in Tibetans. Future detailed functional studies will now be required to examine the mechanistic underpinnings of physiological adaptation to high altitudes.

Genetics

Demography

0

Paper

Save

The genome of the cucumber, Cucumis sativus L.

Sanwen Huang et al.Nov 1, 2009

Jun Wang and colleagues report the genome sequence of the cucumber. The cucumber genome is the seventh plant genome sequence to be reported and was assembled with a combination of traditional Sanger and next-generation sequencing methods. Cucumber is an economically important crop as well as a model system for sex determination studies and plant vascular biology. Here we report the draft genome sequence of Cucumis sativus var. sativus L., assembled using a novel combination of traditional Sanger and next-generation Illumina GA sequencing technologies to obtain 72.2-fold genome coverage. The absence of recent whole-genome duplication, along with the presence of few tandem duplications, explains the small number of genes in the cucumber. Our study establishes that five of the cucumber's seven chromosomes arose from fusions of ten ancestral chromosomes after divergence from Cucumis melo. The sequenced cucumber genome affords insight into traits such as its sex expression, disease resistance, biosynthesis of cucurbitacin and 'fresh green' odor. We also identify 686 gene clusters related to phloem function. The cucumber genome provides a valuable resource for developing elite cultivars and for studying the evolution and function of the plant vascular system.

Genetics

Molecular Biology

0

Paper

Save

A Draft Sequence for the Genome of the Domesticated Silkworm ( Bombyx mori )

Qingyou Xia et al.Dec 10, 2004

We report a draft sequence for the genome of the domesticated silkworm (Bombyx mori), covering 90.9% of all known silkworm genes. Our estimated gene count is 18,510, which exceeds the 13,379 genes reported for Drosophila melanogaster. Comparative analyses to fruitfly, mosquito, spider, and butterfly reveal both similarities and differences in gene content.

Genetics

Immunology

0

Paper

Save

The Genomes of Oryza sativa: A History of Duplications

Jun Yu et al.Jan 21, 2005

We report improved whole-genome shotgun sequences for the genomes of indica and japonica rice, both with multimegabase contiguity, or almost 1,000-fold improvement over the drafts of 2002. Tested against a nonredundant collection of 19,079 full-length cDNAs, 97.7% of the genes are aligned, without fragmentation, to the mapped super-scaffolds of one or the other genome. We introduce a gene identification procedure for plants that does not rely on similarity to known genes to remove erroneous predictions resulting from transposable elements. Using the available EST data to adjust for residual errors in the predictions, the estimated gene count is at least 38,000–40,000. Only 2%–3% of the genes are unique to any one subspecies, comparable to the amount of sequence that might still be missing. Despite this lack of variation in gene content, there is enormous variation in the intergenic regions. At least a quarter of the two sequences could not be aligned, and where they could be aligned, single nucleotide polymorphism (SNP) rates varied from as little as 3.0 SNP/kb in the coding regions to 27.6 SNP/kb in the transposable elements. A more inclusive new approach for analyzing duplication history is introduced here. It reveals an ancient whole-genome duplication, a recent segmental duplication on Chromosomes 11 and 12, and massive ongoing individual gene duplications. We find 18 distinct pairs of duplicated segments that cover 65.7% of the genome; 17 of these pairs date back to a common time before the divergence of the grasses. More important, ongoing individual gene duplications provide a never-ending source of raw material for gene genesis and are major contributors to the differences between members of the grass family.

Genetics

Molecular Biology

0

Paper

Save

Identification of rice diseases using deep convolutional neural networks

Yang Lü et al.Jun 30, 2017

The automatic identification and diagnosis of rice diseases are highly desired in the field of agricultural information. Deep learning is a hot research topic in pattern recognition and machine learning at present, it can effectively solve these problems in vegetable pathology. In this study, we propose a novel rice diseases identification method based on deep convolutional neural networks (CNNs) techniques. Using a dataset of 500 natural images of diseased and healthy rice leaves and stems captured from rice experimental field, CNNs are trained to identify 10 common rice diseases. Under the 10-fold cross-validation strategy, the proposed CNNs-based model achieves an accuracy of 95.48%. This accuracy is much higher than conventional machine learning model. The simulation results for the identification of rice diseases show the feasibility and effectiveness of the proposed method.

Genetics

Artificial Intelligence

0

Paper

Save

WEGO 2.0: a web tool for analyzing and plotting GO annotations, 2018 update

Jia Ye et al.May 10, 2018

WEGO (Web Gene Ontology Annotation Plot), created in 2006, is a simple but useful tool for visualizing, comparing and plotting GO (Gene Ontology) annotation results. Owing largely to the rapid development of high-throughput sequencing and the increasing acceptance of GO, WEGO has benefitted from outstanding performance regarding the number of users and citations in recent years, which motivated us to update to version 2.0. WEGO uses the GO annotation results as input. Based on GO’s standardized DAG (Directed Acyclic Graph) structured vocabulary system, the number of genes corresponding to each GO ID is calculated and shown in a graphical format. WEGO 2.0 updates have targeted four aspects, aiming to provide a more efficient and up-to-date approach for comparative genomic analyses. First, the number of input files, previously limited to three, is now unlimited, allowing WEGO to analyze multiple datasets. Also added in this version are the reference datasets of nine model species that can be adopted as baselines in genomic comparative analyses. Furthermore, in the analyzing processes each Chi-square test is carried out for multiple datasets instead of every two samples. At last, WEGO 2.0 provides an additional output graph along with the traditional WEGO histogram, displaying the sorted P-values of GO terms and indicating their significant differences. At the same time, WEGO 2.0 features an entirely new user interface. WEGO is available for free at http://wego.genomics.org.cn.

Genetics

Philosophy

0

Paper

Genetics

494

0

Save