ResearchHub | Open Science Community

An integrated semiconductor device enabling non-optical genome sequencing

Jonathan Rothberg et al., Jul 1, 2011

The seminal importance of DNA sequencing to the life sciences, biotechnology and medicine has driven the search for more scalable and lower-cost solutions. Here we describe a DNA sequencing technology in which scalable, low-cost semiconductor manufacturing techniques are used to make an integrated circuit able to directly perform non-optical DNA sequencing of genomes. Sequence data are obtained by directly sensing the ions produced by template-directed DNA polymerase synthesis using all-natural nucleotides on this massively parallel semiconductor-sensing device or ion chip. The ion chip contains ion-sensitive, field-effect transistor-based sensors in perfect register with 1.2 million wells, which provide confinement and allow parallel, simultaneous detection of independent sequencing reactions. Use of the most widely used technology for constructing integrated circuits, the complementary metal-oxide semiconductor (CMOS) process, allows for low-cost, large-scale production and scaling of the device to higher densities and larger array sizes. We show the performance of the system by sequencing three bacterial genomes, its robustness and scalability by producing ion chips with up to 10 times as many sensors and sequencing a human genome.

Progress towards cheaper and more compact DNA sequencing devices is limited by a number of factors, including the need for imaging technology. A new DNA sequencing technology that does away with optical readout, instead gathering sequence data by directly sensing hydrogen ions produced by template-directed DNA synthesis, offers a route to low cost and scalable sequencing on a massively parallel semiconductor-sensing device or ion chip. The reactions are performed using all natural nucleotides, and the individual ion-sensitive chips are disposable and inexpensive. The system has been used to sequence three bacterial genomes and a human genome: that of Gordon Moore of Moore's law fame.

Genetics

Molecular Biology

MIBiG 2.0: a repository for biosynthetic gene clusters of known function

Satria Kautsar et al., Oct 1, 2019

Abstract: Fueled by the explosion of (meta)genomic data, genome mining of specialized metabolites has become a major technology for drug discovery and studying microbiome ecology. In these efforts, computational tools like antiSMASH have played a central role through the analysis of Biosynthetic Gene Clusters (BGCs). Thousands of candidate BGCs from microbial genomes have been identified and stored in public databases. Interpreting the function and novelty of these predicted BGCs requires comparison with a well-documented set of BGCs of known function. The MIBiG (Minimum Information about a Biosynthetic Gene Cluster) Data Standard and Repository was established in 2015 to enable curation and storage of known BGCs. Here, we present MIBiG 2.0, which encompasses major updates to the schema, the data, and the online repository itself. Over the past five years, 851 new BGCs have been added. Additionally, we performed extensive manual data curation of all entries to improve the annotation quality of our repository. We also redesigned the data schema to ensure the compliance of future annotations. Finally, we improved the user experience by adding new features such as query searches and a statistics page, and enabled direct link-outs to chemical structure databases. The repository is accessible online at https://mibig.secondarymetabolites.org/.

Genetics

Philosophy

AutoMLST: an automated web server for generating multi-locus species trees highlighting natural product potential

Mohammad Alanjary et al., Apr 11, 2019

Abstract: Understanding the evolutionary background of a bacterial isolate has applications for a wide range of research. However, generating an accurate species phylogeny remains challenging. Reliance on 16S rDNA for species identification remains popular. Unfortunately, this widespread method suffers from low resolution at the species level due to high sequence conservation. There is now a wealth of genomic data that can be used to yield more accurate species designations via modern phylogenetic methods and multiple genetic loci. However, these methods often require extensive expertise and time. The Automated Multi-Locus Species Tree (autoMLST) was thus developed to provide a rapid ‘one-click’ pipeline to simplify this workflow at: https://automlst.ziemertlab.com. This server utilizes Multi-Locus Sequence Analysis (MLSA) to produce high-resolution species trees; it does not perform multi-locus sequence typing (MLST), a related classification method. The resulting phylogenetic tree also includes helpful annotations, such as species clade designations and secondary metabolite counts, to aid natural product prospecting. Distinct from currently available web interfaces, autoMLST can automate selection of reference genomes and out-group organisms based on one or more query genomes. This enables a wide range of researchers to perform rigorous phylogenetic analyses more rapidly compared to manual MLSA workflows.

Genetics

Pharmacology

MIBiG 3.0: a community-driven effort to annotate experimentally validated biosynthetic gene clusters

Barbara Terlouw et al., Nov 18, 2022

Abstract: With an ever-increasing amount of (meta)genomic data being deposited in sequence databases, (meta)genome mining for natural product biosynthetic pathways occupies a critical role in the discovery of novel pharmaceutical drugs, crop protection agents and biomaterials. The genes that encode these pathways are often organised into biosynthetic gene clusters (BGCs). In 2015, we defined the Minimum Information about a Biosynthetic Gene cluster (MIBiG): a standardised data format that describes the minimally required information to uniquely characterise a BGC. We simultaneously constructed an accompanying online database of BGCs, which has since been widely used by the community as a reference dataset for BGCs and was expanded to 2,021 entries in 2019 (MIBiG 2.0). Here, we describe MIBiG 3.0, a database update comprising large-scale validation and re-annotation of existing entries and 661 new entries. Particular attention was paid to the annotation of compound structures and biological activities, as well as protein domain selectivities. Together, these new features keep the database up-to-date, and will provide new opportunities for the scientific community to use its freely available data, e.g. for the training of new machine learning models to predict sequence-structure-function relationships for diverse natural products. MIBiG 3.0 is accessible online at https://mibig.secondarymetabolites.org/.

Genetics

Pharmacology

The Chemical Structure of Widespread Microbial Aryl Polyene Lipids

G.L.C. Grammbitter et al., Dec 20, 2020

Abstract: Biosynthetic gene clusters (BGCs) involved in aryl polyene (APE) biosynthesis are supposed to represent the most widespread BGC in the bacterial world [1–3]. Still, only hydrolysis products [4–8] and not the full-length product(s) have been identified, hindering studies on their biosynthesis and natural function. Here, we apply subsequent chromatographic separations to purify the aryl polyene-containing lipids (APELs) from the entomopathogenic bacterium Xenorhabdus doucetiae. Structure elucidation using a combination of isotope labeling, nuclear magnetic resonance techniques, and tandem mass spectrometry reveals an array of APELs featuring an all-trans C26:5 conjugated fatty acyl and a galactosamine-phosphate-glycerol moiety. In combination with extensive genetic studies, this research broadens the bacterial natural product repertoire and paves the way for future functional characterization of this almost universal microbial compound class. Due to their protective function against reactive oxygen species [5,9], APELs might be important for virulence or symbiosis, mediating organismic interactions in several ecological niches.

Genetics

Biochemistry

A biaryl-linked tripeptide from Planomonospora leads to widespread class of minimal RiPP gene clusters

Mitja Zdouc et al., Jul 21, 2020

Abstract: Microbial natural products impress with their bioactivity, structural diversity and ingenious biosynthesis. While screening the rare actinobacterial genus Planomonospora, cyclopeptides 1A and 1B were discovered, featuring an unusual Tyr-His biaryl bridge across a tripeptide scaffold, with the sequences N-acetyl-Tyr-Tyr-His (1A) and N-acetyl-Tyr-Phe-His (1B). Genome analysis of the 1A-producing strain pointed towards a ribosomal synthesis of 1A from a pentapeptide precursor encoded by the tiny 18-nucleotide gene bycA, to our knowledge the smallest gene ever reported. Further, biaryl instalment is performed by the closely linked gene bycB, encoding a cytochrome P450 monooxygenase. Biosynthesis of 1A was confirmed by heterologous production in Streptomyces, yielding the mature product. Bioinformatic analysis of related cytochrome P450 monooxygenases indicated that they constitute a widespread family of pathways, associated with 5-aa coding sequences in approximately 200 (actino)bacterial genomes, all with potential for a biaryl linkage between amino acids 1 and 3. We propose the name biarylicins for this newly discovered family of RiPPs.

Genetics

Biochemistry

Identification, structure and function of the methyltransferase involved in the biosynthesis of the dithiolopyrrolone antibiotic xenorhabdin

Li Su et al., Jan 12, 2024

Xenorhabdins (XRDs) are produced by Xenorhabdus species and are members of the dithiolopyrrolone (DTP) class of natural products that have potent antibacterial, antifungal and anticancer activity. The amide moiety of their DTP core may or may not be methylated, which fine-tunes their bioactivity. However, the enzyme responsible for the amide N-methylation remained elusive. Here, we identified and characterized the amide methyltransferase XrdM, which is encoded nearly 600 kb away from the XRD gene cluster, using proteomic analysis, methyltransferase candidate screening, gene deletion, and allied approaches. In addition, crystallographic analysis and site-directed mutagenesis proved that XrdM is completely distinct from the recently reported DTP methyltransferase DtpM, and that both have been tailored in a species-specific manner for DTP biosynthesis in Gram-negative/positive organisms. Our study expands the limited knowledge of post-NRPS amide methylation in DTP biosynthesis and reveals the evolution of two structurally completely different enzymes for the same reaction in different organisms.

Biochemistry

Plant Science

CAGECAT: The CompArative GEne Cluster Analysis Toolbox for rapid search and visualisation of homologous gene clusters

Matthias Belt et al., Feb 10, 2023

Abstract

Background: Co-localized sets of genes that encode specialized functions are common across microbial genomes and occur in genomes of larger eukaryotes as well. Important examples include Biosynthetic Gene Clusters (BGCs) that produce specialized metabolites with medicinal, agricultural, and industrial value (e.g. antimicrobials). Comparative analysis of BGCs can aid in the discovery of novel metabolites by highlighting distribution and identifying variants in public genomes. Unfortunately, gene-cluster-level homology detection remains inaccessible, time-consuming and difficult to interpret.

Results: The comparative gene cluster analysis toolbox (CAGECAT) is a rapid and user-friendly platform to mitigate difficulties in comparative analysis of whole gene clusters. The software provides homology searches and downstream analyses without the need for command-line or programming expertise. By leveraging remote BLAST databases, which always provide up-to-date results, CAGECAT can yield relevant matches that aid in the comparison, taxonomic distribution, or evolution of an unknown query. The service is extensible and interoperable and implements the cblaster and clinker pipelines to perform homology search, filtering, gene neighbourhood estimation, and dynamic visualisation of resulting variant BGCs. With the visualisation module, publication-quality figures can be customized directly from a web browser, which greatly accelerates their interpretation via informative overlays to identify conserved genes in a BGC query.

Conclusion: Overall, CAGECAT is an extensible software that can be interfaced via a standard web browser for whole region homology searches and comparison on continually updated genomes from NCBI. The public web server and installable docker image are open source and freely available without registration at: https://cagecat.bioinformatics.nl

Genetics

Pharmacology
