ResearchHub | Open Science Community

Initial sequencing and comparative analysis of the mouse genome

R Waterston et al., Dec 1, 2002

Genetics

Molecular Biology


Base-Calling of Automated Sequencer Traces Using Phred. I. Accuracy Assessment

Brent Ewing et al., Mar 1, 1998

The availability of massive amounts of DNA sequence information has begun to revolutionize the practice of biology. As a result, current large-scale sequencing output, while impressive, is not adequate to keep pace with growing demand and, in particular, is far short of what will be required to obtain the 3-billion-base human genome sequence by the target date of 2005. To reach this goal, improved automation will be essential, and it is particularly important that human involvement in sequence data processing be significantly reduced or eliminated. Progress in this respect will require both improved accuracy of the data processing software and reliable accuracy measures to reduce the need for human involvement in error correction and make human review more efficient. Here, we describe one step toward that goal: a base-calling program for automated sequencer traces, phred, with improved accuracy. phred appears to be the first base-calling program to achieve a lower error rate than the ABI software, averaging 40%–50% fewer errors in the data sets examined independent of position in read, machine running conditions, or sequencing chemistry.

Genetics

Artificial Intelligence


2.2 Mb of contiguous nucleotide sequence from chromosome III of C. elegans

Richard Wilson et al., Mar 1, 1994

Genetics

Molecular Biology


DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome

Timothy Ley et al., Nov 1, 2008

Acute myeloid leukaemia is a highly malignant haematopoietic tumour that affects about 13,000 adults in the United States each year. The treatment of this disease has changed little in the past two decades, because most of the genetic events that initiate the disease remain undiscovered. Whole-genome sequencing is now possible at a reasonable cost and timeframe to use this approach for the unbiased discovery of tumour-specific somatic mutations that alter the protein-coding genes. Here we present the results obtained from sequencing a typical acute myeloid leukaemia genome, and its matched normal counterpart obtained from the same patient's skin. We discovered ten genes with acquired mutations; two were previously described mutations that are thought to contribute to tumour progression, and eight were new mutations present in virtually all tumour cells at presentation and relapse, the function of which is not yet known. Our study establishes whole-genome sequencing as an unbiased method for discovering cancer-initiating mutations in previously unidentified genes that may respond to targeted therapies.

Genetics

Hematology


The Genome Sequence of Caenorhabditis briggsae: A Platform for Comparative Genomics

Lincoln Stein et al., Nov 14, 2003

The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome to a high-quality draft stage and compared it to the finished C. elegans sequence. We predict approximately 19,500 protein-coding genes in the C. briggsae genome, roughly the same as in C. elegans. Of these, 12,200 have clear C. elegans orthologs, a further 6,500 have one or more clearly detectable C. elegans homologs, and approximately 800 C. briggsae genes have no detectable matches in C. elegans. Almost all of the noncoding RNAs (ncRNAs) known are shared between the two species. The two genomes exhibit extensive colinearity, and the rate of divergence appears to be higher in the chromosomal arms than in the centers. Operons, a distinctive feature of C. elegans, are highly conserved in C. briggsae, with the arrangement of genes being preserved in 96% of cases. The difference in size between the C. briggsae (estimated at approximately 104 Mbp) and C. elegans (100.3 Mbp) genomes is almost entirely due to repetitive sequence, which accounts for 22.4% of the C. briggsae genome in contrast to 16.5% of the C. elegans genome. Few, if any, repeat families are shared, suggesting that most were acquired after the two species diverged or are undergoing rapid evolution. Coclustering the C. elegans and C. briggsae proteins reveals 2,169 protein families of two or more members. Most of these are shared between the two species, but some appear to be expanding or contracting, and there seem to be as many as several hundred novel C. briggsae gene families. The C. briggsae draft sequence will greatly improve the annotation of the C. elegans genome. Based on similarity to C. briggsae, we found strong evidence for 1,300 new C. elegans genes. In addition, comparisons of the two genomes will help to understand the evolutionary forces that mold nematode genomes.

Genetics

Molecular Biology


Elephant shark genome provides unique insights into gnathostome evolution

Byrappa Venkatesh et al., Jan 7, 2014

The emergence of jawed vertebrates (gnathostomes) from jawless vertebrates was accompanied by major morphological and physiological innovations, such as hinged jaws, paired fins and immunoglobulin-based adaptive immunity. Gnathostomes subsequently diverged into two groups, the cartilaginous fishes and the bony vertebrates. Here we report the whole-genome analysis of a cartilaginous fish, the elephant shark (Callorhinchus milii). We find that the C. milii genome is the slowest evolving of all known vertebrates, including the ‘living fossil’ coelacanth, and features extensive synteny conservation with tetrapod genomes, making it a good model for comparative analyses of gnathostome genomes. Our functional studies suggest that the lack of genes encoding secreted calcium-binding phosphoproteins in cartilaginous fishes explains the absence of bone in their endoskeleton. Furthermore, the adaptive immune system of cartilaginous fishes is unusual: it lacks the canonical CD4 co-receptor and most transcription factors, cytokines and cytokine receptors related to the CD4 lineage, despite the presence of polymorphic major histocompatibility complex class II molecules. It thus presents a new model for understanding the origin of adaptive immunity.

Whole-genome analysis of the elephant shark, a cartilaginous fish, shows that it is the slowest evolving of all known vertebrates, lacks critical bone formation genes and has an unusual adaptive immune system.

The elephant shark (Callorhinchus milii) is a cartilaginous fish native to the temperate waters off southern Australia and New Zealand, living at depths of 200 to 500 metres and migrating into shallow waters during spring for breeding. The genome sequence is published in this issue of Nature. Comparison with other vertebrate genomes shows that it is the slowest evolving genome of all known vertebrates, coelacanth included. Genome analysis points to an unusual adaptive immune system lacking the CD4 receptor and some associated cytokines, indicating that cartilaginous fishes possess a primordial gnathostome adaptive immune system. Also absent are genes encoding secreted calcium-binding phosphoproteins, in line with the absence of bone in cartilaginous fish.

Genetics

Immunology


The million mutation project: A new approach to genetics in Caenorhabditis elegans

Owen Thompson et al., Jun 25, 2013

We have created a library of 2007 mutagenized Caenorhabditis elegans strains, each sequenced to a target depth of 15-fold coverage, to provide the research community with mutant alleles for each of the worm's more than 20,000 genes. The library contains over 800,000 unique single nucleotide variants (SNVs) with an average of eight nonsynonymous changes per gene and more than 16,000 insertion/deletion (indel) and copy number changes, providing an unprecedented genetic resource for this multicellular organism. To supplement this collection, we also sequenced 40 wild isolates, identifying more than 630,000 unique SNVs and 220,000 indels. Comparison of the two sets demonstrates that the mutant collection has a much richer array of both nonsense and missense mutations than the wild isolate set. We also find a wide range of rDNA and telomere repeat copy number in both sets. Scanning the mutant collection for molecular phenotypes reveals a nonsense suppressor as well as strains with higher levels of indels that harbor mutations in DNA repair genes and strains with abundant males associated with him mutations. All the strains are available through the Caenorhabditis Genetics Center and all the sequence changes have been deposited in WormBase and are available through an interactive website.

Genetics

Molecular Biology


High Throughput Fingerprint Analysis of Large-Insert Clones

Marco Marra et al., Nov 1, 1997

As part of the Human Genome Project, the Washington University Genome Sequencing Center has commenced systematic sequencing of human chromosome 7. To organize and supply the effort, we have undertaken the construction of sequence-ready physical maps for defined chromosomal intervals. Map construction is a serial process composed of three main activities. First, candidate STS-positive large-insert PAC and BAC clones are identified. Next, these candidate clones are subjected to fingerprint analysis. Finally, the fingerprint data are used to assemble sequence-ready maps. The fingerprinting method we have devised is key to the success of the overall approach. We present here the details of the method and show that the fingerprints are of sufficient quality to permit the construction of megabase-size contigs in defined regions of the human genome. We anticipate that the high throughput and precision characteristic of our fingerprinting method will make it of general utility.

Genetics

Artificial Intelligence


Long-read sequence assembly of the gorilla genome

David Gordon et al., Mar 31, 2016

Improving on the gorilla genome

Access to complete, high-quality genomes of nonhuman primates will also help us understand human biology. Gordon et al. used long-read sequencing technology to improve genome data on our close relative the gorilla. Sequencing from a single individual decreased assembly fragmentation and recovered previously missed genes and noncoding loci. Mapping short-read sequences from additional gorillas helped reconstruct a “pan” gorilla sequence documenting genetic variation. Comparison with human genomes revealed species-specific differences ranging in size from one to thousands of bases in length, including some that are likely to affect gene regulation. Science, this issue p. 10.1126/science.aae0344

Genetics

Paleontology


A survey of expressed genes in Caenorhabditis elegans

R Waterston et al., May 1, 1992

Genetics

Molecular Biology
